Non-invasive detection of malignancy by identification of unusual CD44 gene activity in exfoliated cancer cells

BMJ 1994; 308 doi: (Published 05 March 1994) Cite this as: BMJ 1994;308:619
  1. Y Matsumura,
  2. D Hanbury,
  3. J Smith,
  4. D Tarin
  1. Nuffield Department of Pathology, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DU
  2. Department of Urology, Churchill Hospital, Oxford OX3 8LJ
  1. Correspondence to: Dr Tarin.
  • Accepted 11 January 1994


Objective: To investigate non-invasive detection of cancer by testing for unusual CD44 gene activity in a clinical sample as an indicator of exfoliated tumour cells.

Design: Case-control study.

Subjects: 44 unselected, consecutive patients with bladder cancer and 46 people with no evidence of neoplasia.

Main outcome measure: Presence or absence of large CD44 gene products containing exon 6 derivatives in urine samples.

Results: Novel abnormalities in the pattern of expression of this gene, seen specifically in tumour tissue, led to cloning of a newly recognised coding region in it (exon 6). This was tested as a probe for detection of exfoliated malignant cells in naturally voided urine. CD44 gene products extracted from the urine and amplified with the polymerase chain reaction contained the predicted electrophoretic band of 735 base pairs in 40 of the 44 patients with bladder cancer (sensitivity 91%). Products from 38 of the 46 people with no evidence of neoplasia showed no such band (specificity 83%).

Conclusions: Unusual activity of the CD44 locus in neoplasia and malignancy is confirmed, and techniques for the analysis of such activity can enable non-invasive investigation of patients for primary or recurrent bladder cancer or for other tumours that shed neoplastic cells into body fluids.

Clinical implications

  • Early stage bladder cancer can be effectively treated by resection, but it is often asymptomatic, so affected patients who could benefit are difficult to identify

  • With gene amplification techniques abnormal activity of the CD44 gene has been found in tumour tissue

  • In this study abnormal CD44 activity was confirmed with improved techniques in tumours from various tissues including breast, colon, and bladder

  • Abnormal CD44 gene activity was also found in exfoliated cells in urine samples from 91% (40/44) of patients with bladder cancer, while no such abnormality was found in samples from 83% (38/46) of non-neoplastic controls

  • Technical improvements to the assay should substantially increase specificity of detection, and this non-invasive technique may be suitable for early detection of primary bladder tumours and monitoring for recurrent disease


Early detection of cancer is desirable since many cancers can be cured by surgical resection if diagnosed at an early stage,1,2 and this has led to the introduction of cancer screening programmes in many countries. Therefore, there is considerable incentive to find ways of detecting cancers early in their life history when they might still be localised and easily resected or destroyed.

Bladder cancer is a common form of malignancy, and treatment would be greatly helped by early detection of primary and recurrent tumours. Current methods of screening by looking for microscopic haematuria are non-specific, and urine cytology, though specific, does not have the sensitivity to be used as a routine test for exclusion of bladder neoplasia.3 Radioimmunoassay for tumour markers such as carcinoembryonic antigen is not suitable for detecting cancer in its early stages though it can sometimes be used to monitor patients for increasing tumour burden or tumour recurrence.4 Diagnoses by urography and cystoscopy are effective but time consuming, expensive, invasive, and uncomfortable. They are the main methods for investigating symptomatic disease but are not suitable for routine screening for asymptomatic primary or recurrent tumours. Also, the workload of follow up cystoscopies after diagnosis and treatment to enable early detection of any recurrence is a considerable burden to urology departments.

There is therefore a need for a sensitive, reliable, and non-invasive method of detecting bladder cancer. We recently reported abnormal patterns of activity of the CD44 gene in tumours,5 and we have now cloned and sequenced a piece of complementary DNA corresponding to a new exon of this gene from a fresh tumour. In the present study we used these sequence data to confirm our previous findings of overexpression of the new exon in tumours and investigated the method's potential for detecting bladder cancer. With improved assay methods we could detect abnormal activity of the gene, and hence the likely presence of small numbers of exfoliated cancer cells, in naturally voided urine from patients with bladder cancer, provided that the sample contained viable cells.

Subjects and methods

Complementary DNA sequencing

In our earlier study we noted that when complementary DNA from tumour samples was amplified with primers P1 and P4 one of the polymerase chain reaction products consistently obtained was 1650 base pairs in size,5 larger than the maximum predicted size of 1500 base pairs that would result from inclusion of transcripts from all then known exons at the insertion site.6-9 We concluded that there was an extra, unidentified exon in the variably expressed region of the gene that is overexpressed in tumour tissue. We reasoned that this segment of complementary DNA might provide a sensitive and specific probe for tumour diagnosis and therefore isolated and characterised it. By using suitably chosen combinations of primers (data not shown), we found this exon to lie 5' to all the other variably expressed ones and thus to be between them and the 5' section (that is, upstream of the insertion site) of the standard portion of the gene (see fig 1).


Map of CD44 gene products together with positions of probes and primers. Newly identified exon labelled exon 6 in accordance with nomenclature of Screaton et al12

We then designed primers P3 and E4 to amplify between the 5' end of the standard portion and exon 11, using four tissue samples of breast cancer and three samples of normal breast tissue. As expected, we observed clear bands of 615 base pairs and 744 base pairs in the gels of the polymerase chain reaction products from cancer samples but not from normal samples (fig 2, upper panel). These were truncated versions of the bands of 1500 base pairs and 1650 base pairs seen in tumours and not in normal tissues in our earlier study with primers P1 and P4.5 The larger band was therefore cut out from the gel, and the constituent complementary DNA eluted. This was cloned into the TA vector (Invitrogen), and transformed bacteria were hybridised with the E4 probe. Positive clones were analysed by the dideoxy method10 to obtain the base sequence of the new exon. Amplification with primers P1 and P4 produced a band of 482 base pairs, the standard form of CD44 transcript, in all the samples (fig 2, lower panel), confirming that they contained satisfactory messenger RNA. Further studies with primers E3 and P4 indicated that no further exons lay downstream of E3 in the variably spliced region of the gene.


Ethidium bromide stained gel (1.2% agarose) after 30 amplification cycles of complementary DNA from four samples of breast cancer (lanes 1-4) and three samples of normal breast tissue (lanes 5-7). With primers P3 and E4 (upper panel) only cancer samples showed clear bands of 615 base pairs and 744 base pairs; with primers P1 and P4 (lower panel) all samples showed bands of 482 base pairs, the standard form of CD44 transcript

Analysis of solid tissue samples

We examined solid tissue samples from normal tissue and from tumours from patients with cancers of the breast, colon, and bladder. The procedures were as described earlier,5 and reaction products were probed with oligonucleotides E2, E4, and P2 to analyse expression of the new exon, expression of exon 11, and to check the quality of the complementary DNA, respectively.

Urine analysis

We refined the experimental protocol with an initial series of urine samples from eight patients with known bladder cancer and eight with no evidence of the disease and then applied it to a larger series of subjects.

Subjects - We examined urine samples from 90 subjects: 44 patients with bladder carcinoma proved by biopsy, 12 patients with inflammation of the bladder (cystitis), and 34 asymptomatic volunteers. Of the 46 people without neoplasia, 30 were males aged 18-77 and 16 were females aged 7-76.

Initial analysis - About 50 ml of naturally voided urine was obtained from each subject and transported to our laboratory within about two hours. Firstly, 1ml of each sample was centrifuged, and the viability and quantity of cells in the sample were assessed by fluorescence microscopy after staining with fluorescein diacetate and ethidium bromide.11 The remainder of the sample was centrifuged to pellet suspended cells for extraction of messenger RNA.

Molecular analysis - The activity of the CD44 gene was analysed by reverse transcription of messenger RNA to produce a complementary DNA copy of any recently functioning genes followed by specific amplification of the complementary DNA molecules that correspond to genes of interest with the polymerase chain reaction. This method allows study of the amounts and sizes of the products assembled by a given gene and therefore provides a snapshot of its activity at the time of sampling. Details of the method are given in the appendix.


Details of newly identified CD44 exon

The overexpressed exon we cloned and sequenced from fresh human tumour tissue was 129 base pairs long; 34% of the residues it encodes are serine or threonine, and it contains two potential O-glycosylation sites (fig 3). Its sequence is identical to that recently isolated from a genomic library of the HT-29 colon cancer cell line by Screaton et al.12 In accordance with their system of numbering of CD44 exons, based on analysis of genomic DNA, we shall refer to it as exon 6.


Sequence of new exon identified in CD44 gene. Underlining indicates potential O-glycosylation sites
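As a quick consistency check on the figures quoted for exon 6, the sketch below confirms that a 129 base pair exon preserves the reading frame and that its length matches the difference between the 744 and 615 base pair products obtained with primers P3 and E4. The variable names are illustrative only, and the serine plus threonine count is an estimate derived from the quoted 34%, because the sequence in figure 3 is not reproduced here.

```python
# Figures taken from the text; names are hypothetical and chosen for clarity.
EXON6_BP = 129               # reported length of the new exon
BAND_WITH_EXON6_BP = 744     # P3/E4 product that includes exon 6
BAND_WITHOUT_EXON6_BP = 615  # P3/E4 product that lacks exon 6
SER_THR_FRACTION = 0.34      # reported share of serine and threonine residues

# A whole number of codons means that inserting the exon keeps the reading frame.
codons, remainder = divmod(EXON6_BP, 3)
in_frame = remainder == 0

# Estimate only: the actual count must be read from the sequence itself.
ser_thr_estimate = round(SER_THR_FRACTION * codons)

# The two P3/E4 bands should differ by exactly the length of the exon.
band_difference = BAND_WITH_EXON6_BP - BAND_WITHOUT_EXON6_BP

print(f"codons encoded: {codons}, in frame: {in_frame}")
print(f"estimated serine + threonine residues: about {ser_thr_estimate}")
print(f"band difference: {band_difference} bp, matches exon: {band_difference == EXON6_BP}")
```

The agreement between the band difference and the exon length supports the interpretation that the larger P3/E4 product differs from the smaller one only by inclusion of exon 6.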

Analysis of solid tissues for expression of tumour related exons

When the polymerase chain reaction products were hybridised with radiolabelled E2 or E4 (see fig 1) all samples from carcinomas overexpressed several splice variants (fig 4, rows I and II), but the pattern of bands seen with each probe was different. Hence the oligonucleotide probe for products of exon 6 was effective in distinguishing neoplastic and non-neoplastic samples but not significantly more sensitive than the E4 probe used previously, at least on samples from solid tissues. Subsequently, the same filters were stripped and hybridised with probe P2 to show that all samples, including normal tissues, produced the standard portion of CD44 (fig 4, row III). This confirmed that the differences observed between control and tumour samples when probed with E2 or E4 were not due to unequal loading of reaction products on these gels.


Autoradiographs of filters probed with oligonucleotides E2 (exposed for one day), E4 (exposed for eight hours), and P2 (exposed for two hours) in rows I, II, and III respectively after amplification with primers P1 and P4. Panel A shows results for normal breast tissue (lanes 1-4), peripheral blood leucocytes from healthy volunteers (lanes 5-7), sternal bone marrow from patients with heart disease (lanes 8-10), and breast cancer tissue (lanes 11-15); panel B shows non-neoplastic colon tissue (lanes 1-4: normal tissue in lanes 1 and 3, inflamed colonic mucosa from Crohn's disease in lane 2 and from ulcerative colitis in lane 4) and tissue from primary colonic cancer (lanes 5-8); panel C shows normal bladder tissue (lanes 1-4) and primary bladder cancer tissue (lanes 5-8)

The cumulative results of our studies on solid tissue samples are that cancers of the breast, colon, bladder, stomach, prostate, and thyroid in 47 different patients had significant overexpression of various exons of CD44 but corresponding normal tissues from 39 people did not (partially shown in fig 4 and in our previous report5). Figure 4 shows that a few bands, all less than 1100 base pairs in size, were present in the lanes from the normal samples though the intensity of these bands was very weak. Bands larger than 1500 base pairs in size were never seen in amplified gene products from normal tissue samples. We consider, therefore, that the higher the molecular size the more specific it is for the diagnosis of cancer.

Urine analysis for malignancy with CD44 probes

Table I shows details of the 44 patients with bladder cancer. If tumour cells in urine were expressing all the exons from 6 to 11, we predicted that our protocol would produce a 735 base pair band with primers E1 and E5 (see appendix). Figure 5 shows that there were clear differences between most tumour and non-tumour samples in the presence or absence of the 735 base pair product. The discrimination was not absolute, however, and we occasionally observed a 735 base pair band in the amplification products from subjects with no clinical evidence of neoplasia (such as lanes 1, 10, 13, 32, and 35 in fig 5) and sometimes saw none in a patient with bladder cancer. The presence of the 482 base pair band showed that all the samples contained satisfactory messenger RNA.


Autoradiographs of filters carrying products amplified with primers E1 and E5 and probed with oligonucleotide E4 (upper half of each panel) and products amplified with P1 and P4 and probed with P2 (lower half). Tracks 15-28 and 38-46 show results with urine samples from bladder cancer patients. Tracks 1-14 and 29-37 show results with urine samples from non-neoplastic controls. Tracks 47-53 are serial dilution study of plasmid DNA containing exons 6-11 (see appendix for details). Upper half of each panel shows only largest amplification products, including 735 base pair band; lower half of panel shows standard 482 base pair band to confirm presence of satisfactory messenger RNA

The serial dilution study shown in lanes 47-53 was included in this experiment to monitor the reliability of amplification of the target segment by the polymerase chain reaction under these conditions. It was possible to detect a signal down to 10^-22 mole of the plasmid, which was used as a standard. We therefore adopted this as the threshold for designating a sample as positive or negative for the presence of the 735 base pair template in the clinical samples. The overall results of the study showed that samples from 40 (91%) of the 44 patients with cancer were positive and that 38 (83%) of the 46 from people without neoplasia were negative. The method therefore has a sensitivity of 91%, specificity of 83%, a positive predictive value of 83% (40/48), and a negative predictive value of 90% (38/42) for bladder cancer. Among the subjects receiving the test the prevalence of the disease was 49% (44/90).
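The four performance figures quoted above follow directly from the two by two table of test result against diagnosis. The following minimal sketch recomputes them from the counts given in the text; the function name and dictionary keys are illustrative and not part of the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return each metric as a (numerator, denominator) pair of counts.

    tp: cancer patients with a positive test    fn: cancer patients with a negative test
    tn: controls with a negative test           fp: controls with a positive test
    """
    return {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "positive predictive value": (tp, tp + fp),
        "negative predictive value": (tn, tn + fn),
    }

# Counts reported in the text: 40 of 44 patients positive, 38 of 46 controls negative.
metrics = diagnostic_metrics(tp=40, fn=4, tn=38, fp=8)

for name, (numerator, denominator) in metrics.items():
    print(f"{name}: {numerator}/{denominator} = {numerator / denominator:.0%}")
```

Sensitivity and specificity are properties of the test, whereas the predictive values depend on the proportion of cases among those tested, so they would differ in a screening population in which bladder cancer is much rarer than in this series.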


Details of 44 patients with bladder cancer

The signal in individual samples can be roughly quantified by comparison with the serially diluted standard. For evaluation of the importance of a signal the results of molecular analysis should be related to the viable cell concentration in the urinary sediment after centrifugation. Table II summarises the results of such an analysis. Some samples from cancer patients showed a strongly staining band of 735 base pairs after amplification despite containing less than one cell per 10^-4 ml of 10-fold concentrated urine (for example, lanes 16, 19, and 38 in fig 5). Conversely, samples from subjects with no clinical evidence of neoplasia did not produce a strong band of 735 base pairs even when they contained more than 10 cells per 10^-4 ml. Hence the presence and intensity of the band is not simply related to the total number of cells present.
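The cell counts above are expressed per 10^-4 ml of 10-fold concentrated urine, which can be hard to relate to the voided sample. The sketch below converts such a count to cells per ml of voided urine. It assumes that "10-fold concentrated" means the urine volume was reduced tenfold with all cells retained; the function and constant names are hypothetical.

```python
COUNTING_VOLUMES_PER_ML = 10_000  # 1 ml divided by the 10^-4 ml counting volume
CONCENTRATION_FACTOR = 10         # assumption: tenfold volume reduction, no cell loss

def cells_per_ml_voided(cells_per_counting_volume):
    """Convert a count per 10^-4 ml of concentrate to cells per ml of voided urine."""
    cells_per_ml_concentrate = cells_per_counting_volume * COUNTING_VOLUMES_PER_ML
    return cells_per_ml_concentrate / CONCENTRATION_FACTOR

# The two boundary values mentioned in the text.
for count in (1, 10):
    print(f"{count} cell(s) per 10^-4 ml -> {cells_per_ml_voided(count):,.0f} cells per ml of voided urine")
```

On this reading, the lower boundary corresponds to about 1000 cells per ml of voided urine and the upper boundary to about 10 000 cells per ml.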


Results of molecular analysis of urine samples from 35 patients with bladder cancer and 26 controls* by cell number in urine. Values are numbers of subjects unless stated otherwise


The products of CD44 gene activity appear to exercise many diverse functions in the cell and at the cell surface,13 and disturbances in the gene activity seen in tumour cells probably stem from derangement of the mechanisms controlling alternative splicing of this gene. From the present work and from our earlier studies5 we believe that tumour cells produce orders of magnitude more of the unusual transcripts from the CD44 locus compared with non-neoplastic cells. Even a sprinkling of tumour cells in a sample can therefore be detected under optimal conditions.

Effect of ribonucleases

Such a highly sensitive method could be of great clinical benefit, but ribonucleases are often present in high concentration in biological samples that contain many bacteria or dead or dying eukaryotic cells. Thus messenger RNA from a sample can be rapidly degraded during extraction if the quantity of inhibitor added is insufficient, giving a spurious negative or a weak positive result. In our study two of the four urine samples from patients with cancer that were negative for the 735 base pair band contained numerous cells because of coexisting haematuria or cystitis. The RNA extraction kit which we used cannot cope with more than 5x10^6 cells simultaneously because of the amount of ribonuclease released during the extraction procedure. For example, lanes 27 and 28 in figure 5 are different aliquots from a single sample from a patient with severe haematuria, but one aliquot reproducibly gave a positive signal and the other did not. The only difference was the starting urine volume used for cell pelleting before extraction: 20 ml for lane 27 and 1 ml for lane 28. The positive result was obtained with the smaller starting volume, which fits with the suggestion that the RNA was degraded by ribonucleases released by blood cells in the sample. It is also possible that inhibitors of the polymerase chain reaction, known to be present in blood, could be present in greater concentration in gross haematuria.

The most important requirement for this method is to obtain high quality, intact RNA. In our experience RNA is degraded gradually even at -70°C. RNA should therefore be extracted and transcribed to complementary DNA as soon as urine samples arrive, and the samples should not be frozen, because freezing releases ribonucleases from the cells. Concurrent cystitis should be treated before urine samples are taken for examination. People with persistent frank haematuria should be referred directly to a urologist for cystoscopy without having such a test.

Reliability of method

Most of the control urine samples (83%) were completely negative for the 735 base pair band. It is unlikely that these results were caused by technical failure such as RNA degradation because the standard form CD44 transcript constitutively expressed in all cell types (482 base pairs) was obtained with each sample (fig 5, lower panel). Conversely, eight of the 46 control samples were positive for the 735 base pair band. We do not have a definitive explanation for these cases, but, as they gave similar results on repeat testing of subsequent samples, cross contamination with RNA or cells from tumour samples can reasonably be excluded. Possibly, in normal samples there are slight differences related to cell type in the expression of CD44 variants which are usually below the threshold of detection but which can be seen in some subjects. In our study the threshold for detection was set at maximum sensitivity on the basis of the results of the pilot study. Results from further studies to refine our method indicate that adjustment of the positions of the primers and probes used for amplification and checking of gels for an abnormal pattern of CD44 activity can give even greater specificity of diagnosis than that already achieved (unpublished observations). It should also be remembered that in an asymptomatic population there may be some people who have early, reversible, preneoplastic changes such as dysplasia or cellular atypia that could be detected by a highly sensitive method. Of the 40 patients with bladder cancer in our study who had positive results, two had preinvasive carcinoma in situ and 28 had histologically early stage invasive cancer (pTisG1 and pT1G1 respectively in table I). These results indicate that disorder in CD44 gene expression begins early in the process of neoplasia.

The CD44 method achieved reasonable specificity as well as good positive and negative predictive value in this sample of people. We have reason to believe that improvements in urine sampling, transport, and laboratory technique will further improve its sensitivity and specificity of detection (unpublished data). Even so, the method does not directly visualise the presence of malignant cells or tissue, and a policy of confirmation of a presumptive diagnosis with the aid of microscopical techniques would be advisable.


from other studies

Studies on CD44 gene activity by ourselves and others have contributed to understanding the fundamental mechanisms of neoplasia and the clinical detection of primary and recurrent malignancy.5,14-18 Other genes have also been studied to investigate the feasibility of diagnosing tumours by means of molecular genetics. For example, Sidransky et al reported identification of p53 gene mutations in exfoliated cancer cells in urine in three patients.19 In parallel studies on solid tissue samples they found that mutations in this gene were present in 61% of bladder tumours. Haliassos et al sought evidence of H-ras gene mutations in the urine of 21 patients with bladder cancer and reported mutations at codon 12 in 10 (47%) of them.20 In the same patients the prevalence of H-ras mutations in their bladder tissues was 66%. Such findings, together with our results, provide evidence that molecular genetic methods have the potential to provide powerful new tools for the early detection and prognostic evaluation of human tumours.

Comparison with other methods

DNA based techniques are of undisputed importance for the detection of oncogene related mutations in the analysis of cancer aetiology and of familial predisposition to certain types of neoplasia. Our present results, however, indicate that RNA based diagnostic techniques, which give direct evidence of abnormal gene activity at the time of sampling, may have advantages in the diagnosis of existing neoplastic disease. Although the incidence of mutations in known oncogenes is quite high in certain types of cancer, in most malignancies it is insufficient to be of practical benefit in routine clinical practice. RNA based methods therefore deserve attention in further research. The method described here, analysing expression of CD44, has the advantages that it can be performed in a single day and that it reliably ascertained the presence of abnormal splice variants in exfoliated cells in urine samples from 91% of patients with bladder cancer. From data already published5,14-18 we know that there is a strong association between such abnormal CD44 expression and the presence of malignant cells in tissue samples. The detection of excessive and abnormal CD44 activity in a clinical sample should therefore raise strong suspicion of the presence of neoplasia.

The sensitivity of detection obtained with this new method in this study of an unselected series of bladder tumours of all types and stages, including early papillary tumours, was greater than the values recently published for studies on similar unselected groups using urine cytology3,21,22 or microhaematuria.23 For accurate comparison, however, it would be necessary to study the results of all these methods on the same set of samples.


We suggest that this method could become a useful partner to cytology and histopathology in the clinical investigation of bladder malignancy. It is also likely to have wider applications because we have recently successfully used it to detect abnormal CD44 expression in exfoliated cells in stools from patients with colorectal cancer but not in ones from normal subjects (Matsumura and Tarin, unpublished observations). The technique is not difficult for trained personnel, but it needs to be performed with care because messenger RNA is vulnerable to degradation and the reaction can amplify trace contaminants to detectable levels. It is non-invasive, and, as samples can be batch tested, it would be considerably cheaper than cystoscopy for initial investigation for primary or recurrent bladder neoplasia. We therefore present it as a potentially convenient and practical method for cancer screening and for investigation of patients with suspected neoplastic disease. In order to make a confident assessment of the predictive value of the test in the general population, we now need more extensive clinical evaluation by means of double blind trials.

We thank the following for help with this work: L Bao, D Cranston, D L Darling, J Davies, M Evans, G J Fellows, L Kaklamanis, D Lo, S Matsumura, J O'D McGee, T Shirakawa, E M Southern, L Summerville, and E Yap. We thank other colleagues in our departments for giving us samples and for general cooperation and encouragement.


Urine samples were centrifuged at 2000 rpm for 10 minutes, and messenger RNA was immediately extracted from the cell pellet using Micro-Fast Tract (Invitrogen). Complementary DNA was synthesised from the messenger RNA template using the complementary DNA Cycle Kit (Invitrogen), and amplification was performed with appropriate primers and parameters using 2.5 units of Taq polymerase in 50 μl reaction mixture. In this study of urine samples amplification was performed across a shorter section of the CD44 complementary DNA in each sample to increase sensitivity and specificity. Primers were designed to amplify across segments which included the new exon on the grounds that this is present in the largest transcripts typical of neoplasia.

The primers used in the study, at 1 pmol/μl in a 50 μl reaction, ... and E5 (5'-TCCTGCTTGATGACCTCGTCCCAT). Figure 1 shows the positions of the primers and probes. The predicted sizes of certain amplification products obtained with different primer combinations were 482 base pairs (primers P1 and P4, standard part only), 744 base pairs (P3 and E4 with exon 6), 615 base pairs (P3 and E4 without exon 6), and 735 base pairs (E1 and E5, cancer related band consisting of exons 6, 7, 8, 9, 10, and 11).

The complete complementary DNA solution from each urine sample was divided equally into two tubes: one for reaction with primers E1 and E5, to amplify the complementary DNA transcript that was of diagnostic value (735 base pairs); and one for reaction with primers P1 and P4, to amplify the standard form of CD44 (482 base pairs) as an internal control of messenger RNA quality and complementary DNA synthesis. Both tubes underwent 35 cycles of amplification. The cycle conditions were: 94°C for one minute, 55°C for one minute, and 72°C for two minutes. After amplification the products were electrophoresed, blotted, and probed with phosphorus-32 labelled probe E4 to visualise the 735 base pair band or with P2 to view standard form CD44.
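To give a sense of scale for the cycling programme, the sketch below adds up the programmed hold time for 35 cycles and the theoretical maximum amplification. The step labels (denaturation, annealing, extension) are the conventional interpretation of the three temperatures and are an assumption, as the text does not name them. Ramp times between temperatures are ignored, and perfect doubling in every cycle is an upper bound that real reactions do not reach.

```python
CYCLES = 35

# (assumed step label, temperature in degrees C, hold time in minutes)
STEPS = [
    ("denaturation", 94, 1),
    ("annealing", 55, 1),
    ("extension", 72, 2),
]

minutes_per_cycle = sum(minutes for _, _, minutes in STEPS)
hold_minutes = CYCLES * minutes_per_cycle

# Upper bound: each template copy doubles once per cycle.
ideal_fold_amplification = 2 ** CYCLES

print(f"hold time per cycle: {minutes_per_cycle} min")
print(f"total programmed hold time: {hold_minutes} min ({hold_minutes / 60:.1f} h)")
print(f"theoretical maximum amplification: {ideal_fold_amplification:.1e}-fold")
```

The programmed hold time of a little over two hours leaves room for extraction, reverse transcription, and blotting within a working day.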

The TA plasmid DNA containing a complementary DNA insert of the same composition as the one of diagnostic interest (containing transcripts from all exons from 6 to 11 inclusive) was used as an internal standard for quantification purposes. This calibration series consisted of serial dilutions containing 10^-17, 10^-18, 10^-19, 10^-20, 10^-21, 10^-22, and 10^-23 mole of the plasmid DNA in 50 μl of reaction solution (primers E1 and E5, probe E4). Conditions for hybridisation, high stringency washing, and autoradiography with x ray film (Kodak) were according to standard protocols.
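The molar amounts in the calibration series are easier to interpret as numbers of template molecules. The following sketch converts each dilution by using the Avogadro constant; the function name is illustrative, and the conversion assumes that every plasmid molecule carries exactly one copy of the insert.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (SI defined value)

def plasmid_copies(moles):
    """Number of plasmid molecules in the given amount of substance."""
    return moles * AVOGADRO

# The seven dilutions of the calibration series, 10^-17 to 10^-23 mole.
dilutions = [10.0 ** -exponent for exponent in range(17, 24)]

for moles in dilutions:
    print(f"{moles:.0e} mol is about {plasmid_copies(moles):,.0f} copies")
```

On this calculation 10^-22 mole corresponds to about 60 plasmid molecules in the 50 μl reaction, and 10^-23 mole to about six.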

