Patent application title: IDENTIFICATION OF TUMORS
Inventors:
IPC8 Class: AC12Q16886FI
USPC Class:
1 1
Class name:
Publication date: 2020-06-18
Patent application number: 20200190597
Abstract:
The invention provides methods for the use of gene expression
measurements to classify or identify tumors in samples obtained from a
subject in a clinical setting, such as in cases of formalin fixed,
paraffin embedded (FFPE) samples.
Claims:
1. A method of classifying a cell containing sample containing tumor
cells of a type of tissue, said method comprising determining the
expression levels of 50 to 350 transcribed sequences from cells in a cell
containing sample obtained from a human subject, and classifying the
sample as containing tumor cells of a type of tissue from a plurality of
tumor types based on the expression levels of said sequences.
2. A method of classifying a cell containing sample as containing tumor cells of a type of tissue, said method comprising determining the expression levels of 50 or more transcribed sequences from cells in a cell containing sample obtained from a human subject, and classifying the sample as containing tumor cells of a type of tissue from a plurality of tumor types based on the expression levels of said sequences, wherein a) the expression of more than 50% of said transcribed sequences are not correlated with expression of another one of said transcribed sequences, or b) the 50 or more transcribed sequences are not selected based upon supervised learning using known tumor samples, on the level of correlation between their expression and said plurality of tumor types, or on their rank in a correlation between their expression and said plurality of tumor types.
3. The method of claim 1, further comprising determining the expression levels of an excess number of transcribed sequences, which expression levels are not used in said classifying.
4. The method of claim 3 wherein said expression levels are determined by use of a microarray.
5. The method of claim 1 wherein said classifying is with an accuracy of 60% or higher.
6. The method of claim 1 wherein said 50 or more transcribed sequences comprise one or more selected from the set of 74 gene sequences or the set of 90 gene sequences.
7. The method of claim 6 wherein said 50 or more transcribed sequences comprise five or more selected from the set of 74 gene sequences or the set of 90 gene sequences.
8. The method of claim 1 wherein said determining comprises measurement in comparison to one or more reference transcribed sequences.
9. The method of claim 1 wherein said determining comprises measuring the expression of all or part of the transcribed sequences.
10. The method of claim 1 wherein said determining comprises amplification of all or part of the transcribed sequences, or reverse transcription and labeling RNA corresponding to said transcribed sequences.
11. The method of claim 10 wherein said amplification comprises linear RNA amplification or quantitative PCR.
12. The method of claim 10 wherein said amplification is of sequences present within 600 nucleotides of the polyadenylation sites of the transcripts.
13. The method of claim 10 wherein said amplification is quantitative PCR amplification of at least 50 nucleotides of the transcripts.
14. A microarray comprising oligonucleotide probes to detect the amplification products of claim 10.
15. The method of claim 1 wherein said transcribed sequences are selected to be non-redundant.
16. The method of claim 15, further comprising determining the expression levels of an excess number of transcribed sequences which are redundant to those used for said classifying.
17. The method of claim 1, wherein said sample is a clinical sample from a human patient.
18. The method of claim 17, wherein said sample is a formalin fixed, paraffin embedded (FFPE) sample.
19. The method of claim 1, further comprising, before said determining of the expression levels of 50 or more transcribed sequences, diagnosis of a human subject as in need of said determining; or obtaining of a cell containing sample from a human subject; or receipt of a cell containing sample; or sectioning a cell containing sample; or isolating cells from a cell containing sample; or obtaining RNA from cells of a cell containing sample.
20. The method of claim 1, further comprising, after said determining of the expression levels of 50 or more transcribed sequences and said classifying of the sample, processing reimbursement or payment for said determining or classifying by indicating that 1) payment has been received, or 2) payment will be made by other payer, or 3) payment remains unpaid; or receiving reimbursement for said determining or said classifying; or forwarding or having forwarded a reimbursement request to an insurance company, health maintenance organization, federal health agency, or to said patient for said determining or classifying; or receiving indication of approval for payment or denial of payment for said determining or classifying; or sending a request for reimbursement for said determining or classifying; or indicating the need for reimbursement or payment on a form or into a database for said determining or classifying; or indicating the performance of said determining or classifying on a form or into a database; or reporting the results of said determining or classifying, optionally to a health care facility, a health care provider, a doctor, a nurse, or said patient; or receiving a payment from said patient for the performance of said determining or classifying.
Description:
RELATED APPLICATIONS
[0001] This application claims benefit of priority from U.S. Provisional Patent Application 60/577,084, filed Jun. 4, 2004.
FIELD OF THE INVENTION
[0002] This invention relates to the use of gene expression to classify human tumors. The classification is performed by use of gene expression profiles, or patterns, of 50 or more expressed sequences that are correlated with tumors arising from certain tissues as well as being correlated with certain tumor types. The invention also provides for the use of 50 or more specific gene sequences, the expression of which is correlated with tissue source and tumor type in various cancers. The gene expression profiles, whether embodied in nucleic acid expression, protein expression, or other expression formats, may be used to determine a cell containing sample as containing tumor cells of a tissue type or from a tissue origin to permit a more accurate identification of the cancer and thus treatment thereof as well as the prognosis of the subject from whom the sample was obtained.
SUMMARY OF THE INVENTION
[0003] This invention relates to the use of gene expression measurements to classify or identify tumors in cell containing samples obtained from a subject in a clinical setting, such as in cases of formalin fixed, paraffin embedded (FFPE) samples. The invention provides the ability to classify tumors in the real-world conditions faced by hospital and other laboratories which have to conduct testing on clinical FFPE samples. The invention may also be applied to other samples, such as fresh samples that have undergone little or no treatment (such as simply storage at a reduced, non-freezing, temperature), and frozen samples. The samples may be of a primary tumor or of a tumor that has resulted from a metastasis of another tumor. Alternatively, the sample may be a cytological sample, such as, but not limited to, cells in a blood sample. In some cases of a tumor sample, the tumors may not have undergone classification by traditional pathology techniques, may have been initially classified but confirmation is desired, or may have been classified as a "carcinoma of unknown primary" (CUP) or "tumor of unknown origin" (TUO) or "unknown primary tumor". The need for confirmation is particularly relevant in light of the estimates of 5 to 10% misclassification using standard techniques. Thus the invention may be viewed as providing means for cancer identification, or CID.
[0004] In a first aspect of the invention, the classification is performed by use of gene expression profiles, or patterns, of 50 or more expressed sequences. The gene expression profiles, whether embodied in nucleic acid expression, protein expression, or other markers of gene expression, may be used to determine a cell containing sample as containing tumor cells of a tissue type or from a tissue origin to permit a more accurate identification of the cancer and thus treatment thereof as well as the prognosis of the subject from whom the sample was obtained.
[0005] In some embodiments, the invention is used to classify among at least 34 or at least 39 tumor types with significant accuracy in a clinical setting. The invention is based in part on the surprising and unexpected discovery that 50 or more expressed sequences in the human genome are capable of classifying among at least 34, or at least 39, tumor types, as well as subsets of those tumor types, in a meaningful manner. Stated differently, the invention is based in part on the discovery that it is not necessary to use supervised learning to identify gene sequences which are expressed in correlation with different tumor types. Thus the invention is based in part on the recognition that any 50 or more expressed sequences, even a random collection of expressed sequences, have the capability to classify, and so may be used to classify, a cell as being a tumor cell of a tissue or tissue origin.
[0006] In another aspect, the invention provides for the classifying of a cell containing sample as containing a tumor cell of a tissue type or origin by determining the expression levels of 50 or more transcribed sequences and then classifying the cell containing sample as containing a tumor cell of a plurality (two or more) of tumor types. To classify among at least 34 to at least 39 tumor types, and subsets thereof, as few as any 50 expressed sequences may be used to provide classification in a meaningful manner. The invention is also based in part on the observation that the expressed sequences need not be those the expression levels of which are evidently or highly correlated (directly, or indirectly through correlation with another expressed sequence) with any of the tumor types. Thus the invention provides, in a further embodiment, for the use of the expression levels of genes, the expression levels of which are not strongly correlated with the actual classification of the particular tumor sample, as one of the 50 or more transcribed sequences. All of the genes selected may be such non-correlates, or only a portion of the genes may be non-correlates, typically at least 90%, 85%, 75%, 50% or 25%, as well as portions falling within the ranges created by using any two of the foregoing point examples as endpoints of a range.
[0007] The invention is practiced by determining the expression levels of gene sequences where the sequences need not have been selected based on a correlation of their expression levels with the tumor types to be classified. Thus as a non-limiting example, the gene sequences need not be selected based on their correlation values with tumor types or a ranking based on the correlation values. Additionally, the invention may be practiced with use of gene expression levels which are not necessarily correlated to one or more other gene expression level(s) used for classification. Thus in additional embodiments, the ability for the expression level of one expressed sequence to function in classification is not redundant with (is independent of) the ability of at least one other gene expression level used for classification.
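As a minimal sketch of the non-redundancy idea in [0007], the fraction of gene sequences whose expression is not correlated with that of any other sequence in a panel could be estimated as shown below. The use of the Pearson coefficient, the cutoff of 0.9, and the function names are assumptions made for illustration only; the text above does not specify a correlation measure or a numerical threshold.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant expression level carries no correlation signal
    return cov / (sx * sy)


def uncorrelated_fraction(expression, threshold=0.9):
    """Fraction of genes whose |r| with every other gene is below threshold.

    expression: dict mapping gene name -> expression levels across samples.
    threshold: hypothetical cutoff, not taken from the source text.
    """
    genes = list(expression)
    independent = 0
    for g in genes:
        if all(abs(pearson(expression[g], expression[h])) < threshold
               for h in genes if h != g):
            independent += 1
    return independent / len(genes)


# Made-up expression levels for four hypothetical genes across four samples.
panel = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [2.0, 4.0, 6.0, 8.0],   # redundant with gene1
    "gene3": [1.0, -1.0, 1.0, -1.0],
    "gene4": [3.0, 1.0, 1.0, 4.0],
}
print(uncorrelated_fraction(panel))
```

In this toy panel, gene1 and gene2 track each other exactly, so only two of the four genes count as non-redundant.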
[0008] The invention may be applied to identify the origin of a cancer in a patient in a wide variety of cases including, but not limited to, identification of the origin of a cancer in a clinical setting. In some embodiments, the identification is made by classification of a cell containing sample known to contain cancer cells, but the origin of those cells is unknown. In other embodiments, the identification is made by classification of a cell containing sample as containing one or more cancer cells followed by identification of the origin(s) of those cancer cell(s). In further embodiments, the invention is practiced with a sample from a subject with a previous history of cancer, and identification is made by classification of a cell as either being cancer from a previous origin of cancer or a new origin. Additional embodiments include those where multiple cancers are found in the same organ or tissue and the invention is used to determine the origin of each cancer, as well as whether the cancers are of the same origin.
[0009] The invention is also based in part on the discovery that the expression levels of particular gene sequences can be used to classify among tumor types with greater accuracy than the expression levels of a random group of gene sequences. In one embodiment, the invention provides for the use of expression levels of 50 to 74 expressed sequences of a first set in the human genome to classify among at least 34 or at least 39 tumor types with significant accuracy. The invention thus provides for the identification and use of gene expression patterns (or profiles or "signatures") based on the 50 to 74 expressed sequences as correlated with at least the 34 or 39 tumor types. The invention also provides for the use of 50 to 74 of these expressed sequences to classify among subsets of the 34 or 39 tumor types. Depending on the number of tumor types, accuracies ranging from over 80% to 100% may be achieved.
[0010] In another embodiment, the invention provides for the use of expression levels of 50 to 90 expressed sequences of a second set in the human genome to classify among at least 34 or at least 39 tumor types with significant accuracy. 38 of the sequences in the second set are present in the first set of 74 sequences. The expression levels of the 50 to 90 sequences in the second set may be used in the same manner as described for the first set of 74 sequences. Depending on the number of tumor types, accuracies ranging from about 75% to about 95% may be achieved.
[0011] The invention is also based in part upon the discovery that use of 50 or more expressed sequences to classify among 53 tumor types, which include (but are not limited to) the 34 and 39 types described herein, was limited by the number of available samples of some tumor types. As noted hereinbelow, accuracy is linked to the number of available samples of each tumor type such that the ability to classify additional tumor types is readily achieved by the application of increased numbers of each tumor type. Thus while the invention is exemplified by use in classifying among 34 or 39 tumor types as well as subsets of the 34 or 39, 50 or more expressed sequences can also be used to classify among all tumor types with the inclusion of samples of the additional tumor types. Thus the invention also provides for the classification of a tumor as being a type beyond the 34 or 39 types described herein.
[0012] The invention is based upon the expression levels of the gene sequences in a set of known tumor cells from different tissues and of different tumor types. These gene expression profiles (of gene sequences in the different known tumor cells/types), whether embodied in nucleic acid expression, protein expression, or other expression formats, may be compared to the expression levels of the same sequences in an unknown tumor sample to identify the sample as containing a tumor of a particular type and/or a particular origin or cell type. The invention provides, such as in a clinical setting, the advantages of a more accurate identification of a cancer and thus the treatment thereof as well as the prognosis, including survival and/or likelihood of cancer recurrence following treatment, of the subject from whom the sample was obtained.
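The comparison described in [0012], in which the expression levels of an unknown sample are compared to profiles from known tumor types, can be illustrated with a simple nearest-centroid rule. This is a sketch under stated assumptions: the nearest-centroid rule, the Euclidean distance, the function names, and the expression values are all hypothetical and are not taken from this section, which does not name a particular comparison algorithm.

```python
import math


def centroid(profiles):
    """Mean expression level per gene across equal-length profiles."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(sample, reference):
    """Return the tumor type whose mean profile is closest to the sample.

    reference: dict mapping tumor type -> list of known expression profiles,
    each profile listing the same genes in the same order as the sample.
    """
    centroids = {t: centroid(p) for t, p in reference.items()}
    return min(centroids, key=lambda t: euclidean(sample, centroids[t]))


# Made-up expression levels for three genes in two hypothetical tumor types.
reference = {
    "type_A": [[1.0, 5.0, 2.0], [1.2, 4.8, 2.2]],
    "type_B": [[4.0, 1.0, 3.0], [4.2, 0.8, 3.1]],
}
unknown = [1.1, 4.9, 2.0]
print(classify(unknown, reference))
```

Any distance-based or probabilistic classifier could take the place of the centroid rule; the point of the sketch is only that the same gene sequences are measured in the known and unknown samples.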
[0013] The invention is further based in part on the discovery that use of 50 or more expressed sequences as described herein as capable of classifying among two or more tumor types necessarily and effectively eliminates one or more tumor types from consideration during classification. This reflects the lack of a need to select genes with expression levels that are highly correlated with all tumor types within the range of the classification system. Stated differently, the invention may be practiced with a plurality of genes the expression levels of which are not highly correlated with any of the individual tumor types or multiple types in the group of tumor types being classified. This is in contrast to other approaches based upon the selection and use of highly correlated genes, which likely do not "rule out" other tumor types as opposed to "rule in" a tumor type based on the positive correlation.
[0014] The classification of a tumor sample as being one of the possible tumor types described herein to the exclusion of other tumor types is of course made based upon a level of confidence as described below. Where the level of confidence is low, or an increase in the level of confidence is preferred, the classification can simply be made at the level of a particular tissue origin or cell type for the tumor in the sample. Alternatively, and where a tumor sample is not readily classified as a single tumor type, the invention permits the classification of the sample as one of a few possible tumor types described herein. This advantageously provides for the ability to reduce the number of possible tissue types, cell types, and tumor types from which to consider for selection and administration of therapy to the patient from whom the sample was obtained.
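The fallback behavior described in [0014], in which a sample that cannot be confidently assigned to a single tumor type is instead reported as one of a few possible types, might be sketched as follows. The confidence measure (share of the total score), the threshold of 0.8, the limit of three candidates, and the function name are assumptions for illustration; this section does not give numerical values.

```python
def report_classification(scores, min_confidence=0.8, max_candidates=3):
    """Report one tumor type when confident, otherwise a short candidate list.

    scores: dict mapping tumor type -> non-negative score, for example the
    number of votes a classifier gave that type (hypothetical input format).
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    ranked = sorted(scores, key=scores.get, reverse=True)
    best = ranked[0]
    if scores[best] / total >= min_confidence:
        return [best]
    # Low confidence: narrow the field rather than commit to one type.
    return ranked[:max_candidates]


print(report_classification({"lung": 9, "breast": 1}))
print(report_classification({"lung": 5, "breast": 4, "colon": 1, "ovary": 0}))
```

The first call returns a single type because it holds 90% of the score; the second returns the three highest-ranked candidates.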
[0015] The invention thus provides a non-subjective means for the identification of the tissue source and/or tumor type of one or more cancers of an afflicted subject. Where subjective interpretation may have been previously used to determine the tissue source and/or tumor type, as well as the prognosis and/or treatment of the cancer based on that determination, the present invention provides objective gene expression patterns, which may be used alone or in combination with subjective criteria to provide a more accurate identification of cancer classification. The invention is particularly advantageously applied to samples of secondary or metastasized tumors, but any cell containing sample (including a primary tumor sample) for which the tissue source and/or tumor type is preferably determined by objective criteria may also be used with the invention. Of course the ultimate determination of class may be made based upon a combination of objective and non-objective (or subjective/partially subjective) criteria.
[0016] The invention includes its use as part of the clinical or medical care of a patient. Thus in addition to using an expression profile of genes as described herein to assay a cell containing sample from a subject afflicted with cancer to determine the tissue source and/or tumor type of the cancer, the profile may also be used as part of a method to determine the prognosis of the cancer in the subject. The classification of the tumor/cancer and/or the prognosis may be used to select or determine or alter the therapeutic treatment for said subject. Thus the classification methods of the invention may be directed toward the treatment of disease, which is diagnosed in whole or in part based upon the classification. Given the diagnosis, administration of an appropriate anti-tumor agent or therapy, or the withholding or alteration of an anti-tumor agent or therapy, may be used to treat the cancer.
[0017] Other clinical methods include those involved in the providing of medical care to a patient based on a classification as described herein. In some embodiments, the methods relate to providing diagnostic services based on expression levels of gene sequences, with or without inclusion of an interpretation of levels for classifying cells of a sample. In some embodiments, the method of providing a diagnostic service of the invention is preceded by a determination of a need for the service. In other embodiments, the method includes acts in the monitoring of the performance of the service as well as acts in the request or receipt of reimbursement for the performance of the service.
[0018] The details of one or more embodiments of the invention are set forth in the accompanying drawing and the description below. Other features and advantages of the invention will be apparent from the drawing and detailed description, and from the claims.
DEFINITIONS
[0019] As used herein, a "gene" is a polynucleotide that encodes a discrete product, whether RNA or proteinaceous in nature. It is appreciated that more than one polynucleotide may be capable of encoding a discrete product. The term includes alleles and polymorphisms of a gene that encodes the same product, or a functionally associated (including gain, loss, or modulation of function) analog thereof, based upon chromosomal location and ability to recombine during normal mitosis.
[0020] A "sequence" or "gene sequence" as used herein is a nucleic acid molecule or polynucleotide composed of a discrete order of nucleotide bases. The term includes the ordering of bases that encodes a discrete product (i.e. "coding region"), whether RNA or proteinaceous in nature. It is appreciated that more than one polynucleotide may be capable of encoding a discrete product. It is also appreciated that alleles and polymorphisms of the human gene sequences may exist and may be used in the practice of the invention to identify the expression level(s) of the gene sequences or an allele or polymorphism thereof. Identification of an allele or polymorphism depends in part upon chromosomal location and ability to recombine during mitosis.
[0021] The terms "correlate" or "correlation" or equivalents thereof refer to an association between expression of one or more genes and another event, such as, but not limited to, physiological phenotype or characteristic, such as tumor type.
[0022] A "polynucleotide" is a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. This term refers only to the primary structure of the molecule. Thus, this term includes double- and single-stranded DNA, and RNA. It also includes known types of modifications including labels known in the art, methylation, "caps", substitution of one or more of the naturally occurring nucleotides with an analog, and internucleotide modifications such as uncharged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), as well as unmodified forms of the polynucleotide.
[0023] The term "amplify" is used in the broad sense to mean creating an amplification product, which can be made enzymatically with DNA or RNA polymerases. "Amplification," as used herein, generally refers to the process of producing multiple copies of a desired sequence, particularly those of a sample. "Multiple copies" mean at least 2 copies. A "copy" does not necessarily mean perfect sequence complementarity or identity to the template sequence. Methods for amplifying mRNA are generally known in the art, and include reverse transcription PCR (RT-PCR) and quantitative PCR (or Q-PCR) or real time PCR. Alternatively, RNA may be directly labeled as the corresponding cDNA by methods known in the art.
[0024] By "corresponding", it is meant that a nucleic acid molecule shares a substantial amount of sequence identity with another nucleic acid molecule. Substantial amount means at least 95%, usually at least 98% and more usually at least 99%, and sequence identity is determined using the BLAST algorithm, as described in Altschul et al. (1990), J. Mol. Biol. 215:403-410 (using the published default setting, i.e. parameters w=4, t=17).
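The identity thresholds in [0024] can be illustrated with a simple position-by-position comparison. This sketch is not the BLAST algorithm named above; it assumes the two sequences have already been aligned to equal length without gaps, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent of matching positions between two pre-aligned sequences.

    Assumes equal-length, gap-free sequences; a real comparison would
    first align them, for example with BLAST as stated in the text.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def corresponds(a, b, minimum=95.0):
    """True when identity meets the 95% floor given for 'corresponding'."""
    return percent_identity(a, b) >= minimum


template = "ACGT" * 10                 # 40 nucleotides, made-up sequence
one_mismatch = "T" + template[1:]      # 39 of 40 positions match
three_mismatches = "TTT" + template[3:]  # 37 of 40 positions match
print(percent_identity(template, one_mismatch), corresponds(template, one_mismatch))
print(percent_identity(template, three_mismatches), corresponds(template, three_mismatches))
```

Passing `minimum=98.0` or `minimum=99.0` applies the stricter "usually" and "more usually" levels from the same paragraph.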
[0025] A "microarray" is a linear or two-dimensional or three dimensional (and solid phase) array of discrete regions, each having a defined area, formed on the surface of a solid support such as, but not limited to, glass, plastic, or synthetic membrane. The density of the discrete regions on a microarray is determined by the total numbers of immobilized polynucleotides to be detected on the surface of a single solid phase support, such as of at least about 50/cm², at least about 100/cm², or at least about 500/cm², up to about 1,000/cm² or higher. The arrays may contain less than about 500, about 1000, about 1500, about 2000, about 2500, or about 3000 immobilized polynucleotides in total. As used herein, a DNA microarray is an array of oligonucleotide or polynucleotide probes placed on a chip or other surfaces used to hybridize to amplified or cloned polynucleotides from a sample. Since the position of each particular group of probes in the array is known, the identities of a sample's polynucleotides can be determined based on their binding to a particular position in the microarray. As an alternative to the use of a microarray, an array of any size may be used in the practice of the invention, including an arrangement of one or more positions of a two-dimensional or three dimensional arrangement in a solid phase to detect expression of a single gene sequence. In some embodiments, a microarray for use with the present invention may be prepared by photolithographic techniques (such as synthesis of nucleic acid probes on the surface from the 3' end) or by nucleic acid synthesis followed by deposition on a solid surface.
[0026] Because the invention relies upon the identification of gene expression, some embodiments of the invention determine expression by hybridization of mRNA, or an amplified or cloned version thereof, of a sample cell to a polynucleotide that is unique to a particular gene sequence. Polynucleotides of this type contain at least about 16, at least about 18, at least about 20, at least about 22, at least about 24, at least about 26, at least about 28, at least about 30, or at least about 32 consecutive basepairs of a gene sequence that is not found in other gene sequences. The term "about" as used in the previous sentence refers to an increase or decrease of 1 from the stated numerical value. Other embodiments are polynucleotides of at least or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, at least or about 400, at least or about 450, or at least or about 500 consecutive bases of a sequence that is not found in other gene sequences. The term "about" as used in the preceding sentence refers to an increase or decrease of 10% from the stated numerical value. Longer polynucleotides may of course contain minor mismatches (e.g. via the presence of mutations) which do not affect hybridization to the nucleic acids of a sample. Such polynucleotides may also be referred to as polynucleotide probes that are capable of hybridizing to sequences of the genes, or unique portions thereof, described herein. Such polynucleotides may be labeled to assist in their detection. The sequences may be those of mRNA encoded by the genes, the corresponding cDNA to such mRNAs, and/or amplified versions of such sequences. In some embodiments of the invention, the polynucleotide probes are immobilized on an array, other solid support devices, or in individual spots that localize the probes.
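The two meanings of "about" in [0026] can be made concrete with a short worked example. This is an illustrative reading only: it assumes the plus or minus 1 rule applies to the listed lengths of 16 to 32 and the plus or minus 10% rule to the listed lengths of 50 to 500, and the function name is hypothetical.

```python
def about_range(length):
    """Interval implied by "about" for a stated probe length.

    Assumption: lengths up to 32 use +/-1 and lengths of 50 or more use
    +/-10%, matching the two lists in the text. No rule is stated for
    lengths between 33 and 49, so those raise an error.
    """
    if length <= 32:
        return (length - 1, length + 1)
    if length >= 50:
        return (length * 9 / 10, length * 11 / 10)
    raise ValueError("no 'about' definition is given for this length")


for stated in (16, 32, 50, 500):
    print(stated, about_range(stated))
```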
[0027] In other embodiments of the invention, all or part of a gene sequence may be amplified and detected by methods such as the polymerase chain reaction (PCR) and variations thereof, such as, but not limited to, quantitative PCR (Q-PCR), reverse transcription PCR (RT-PCR), and real-time PCR (including as a means of measuring the initial amounts of mRNA copies for each sequence in a sample), optionally real-time RT-PCR or real-time Q-PCR. Such methods would utilize one or two primers that are complementary to portions of a gene sequence, where the primers are used to prime nucleic acid synthesis. The newly synthesized nucleic acids are optionally labeled and may be detected directly or by hybridization to a polynucleotide of the invention. The newly synthesized nucleic acids may be contacted with polynucleotides (containing sequences) of the invention under conditions which allow for their hybridization. Additional methods to detect the expression of expressed nucleic acids include RNAse protection assays, including liquid phase hybridizations, and in situ hybridization of cells.
[0028] Alternatively, and in further embodiments of the invention, gene expression may be determined by analysis of expressed protein in a cell sample of interest by use of one or more antibodies specific for one or more epitopes of individual gene products (proteins), or proteolytic fragments thereof, in said cell sample or in a bodily fluid of a subject. The cell sample may be one of breast cancer epithelial cells enriched from the blood of a subject, such as by use of labeled antibodies against cell surface markers followed by fluorescence activated cell sorting (FACS). Such antibodies may be labeled to permit their detection after binding to the gene product. Detection methodologies suitable for use in the practice of the invention include, but are not limited to, immunohistochemistry of cell containing samples or tissue, enzyme linked immunosorbent assays (ELISAs) including antibody sandwich assays of cell containing tissues or blood samples, mass spectroscopy, and immuno-PCR.
[0029] The terms "label" or "labeled" refer to a composition capable of producing a detectable signal indicative of the presence of the labeled molecule. Suitable labels include radioisotopes, nucleotide chromophores, enzymes, substrates, fluorescent molecules, chemiluminescent moieties, magnetic particles, bioluminescent moieties, and the like. As such, a label is any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.
[0030] The term "support" refers to conventional supports such as beads, particles, dipsticks, fibers, filters, membranes and silane or silicate supports such as glass slides.
[0031] "Expression" and "gene expression" include transcription and/or translation of nucleic acid material.
[0032] As used herein, the term "comprising" and its cognates are used in their inclusive sense; that is, equivalent to the term "including" and its corresponding cognates.
[0033] Conditions that "allow" an event to occur or conditions that are "suitable" for an event to occur, such as hybridization, strand extension, and the like, or "suitable" conditions are conditions that do not prevent such events from occurring. Thus, these conditions permit, enhance, facilitate, and/or are conducive to the event. Such conditions, known in the art and described herein, depend upon, for example, the nature of the nucleotide sequence, temperature, and buffer conditions. These conditions also depend on what event is desired, such as hybridization, cleavage, strand extension or transcription.
[0034] Sequence "mutation," as used herein, refers to any sequence alteration in the sequence of a gene of interest disclosed herein in comparison to a reference sequence. A sequence mutation includes single nucleotide changes, or alterations of more than one nucleotide in a sequence, due to mechanisms such as substitution, deletion or insertion. Single nucleotide polymorphism (SNP) is also a sequence mutation as used herein. Because the present invention is based on the relative level of gene expression, mutations in non-coding regions of genes as disclosed herein may also be assayed in the practice of the invention.
[0035] "Detection" or "detecting" includes any means of detecting, including direct and indirect determination of the level of gene expression and changes therein.
[0036] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
BRIEF DESCRIPTION OF THE DRAWINGS
[0037] FIG. 1 shows a capacity plot for the ability to use the expression levels of subsets of a set of 100 expressed gene sequences to classify among 39 tumor types and subsets thereof. Expression levels of random combinations of 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 and 100 (each sampled 10 times) of the 100 sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to 39 types. A plot of numbers of tumor types versus prediction accuracies for results using from 50 to 100 genes is shown as a non-limiting example. Generally, accuracy improves with higher numbers of gene sequences, where 50 gene sequences results in a more noticeable reduction in accuracy when used with about 20 or more tumor types.
[0038] FIG. 2 shows an alternative presentation of the data used with respect to FIG. 1. A plot of numbers of gene sequences used, ranging from 50-100, versus prediction accuracies for various representative numbers of tumor types is shown. The plotted lines, from top to bottom, are of the results from 2, 10, 20, 30, and 39 tumor types, respectively.
[0039] FIG. 3 shows the performance of using all genes from a first set of 74 gene sequences and a second set of 90 gene sequences to classify various numbers of tumor types. Generally, the accuracies of the two sets are very similar, with the set of 74 displaying a noticeably higher accuracy with about 28 or more (up to 39) tumor types.
[0040] FIG. 4 shows a capacity plot for the ability to use the expression levels of all or portions of a first set of 74 expressed gene sequences to classify among 39 tumor types and subsets thereof. Expression levels of random combinations of 50, 55, 60, 65, and 70 (each sampled 10 times) as well as all 74 of the sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to 39 types. A plot of numbers of tumor types versus prediction accuracies for results using from 50 to 74 genes is shown as a non-limiting example. Generally, accuracy improves with higher numbers of gene sequences, with the use of 74 genes most noticeably providing the highest accuracies, and 50 gene sequences producing the lowest accuracies, when used with about 20 or more tumor types.
[0041] FIG. 5 shows an alternative presentation of the data used with respect to FIG. 4. A plot of numbers of gene sequences used, ranging from 50-74, versus prediction accuracies for various representative numbers of tumor types is shown. The plotted lines, from top to bottom, are of the results from 2, 10, 20, 30, and 39 tumor types, respectively.
[0042] FIG. 6 shows a capacity plot for the ability to use the expression levels of subsets of a set of 90 expressed gene sequences to classify among 39 tumor types and subsets thereof. Expression levels of random combinations of 50, 55, 60, 65, 70, 75, 80, and 85 (each sampled 10 times) as well as all 90 of the sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to 39 types. A plot of numbers of tumor types versus prediction accuracies for results using from 50 to 90 genes is shown as a non-limiting example. Generally, accuracy improves with higher numbers of gene sequences, where 50 gene sequences results in noticeably reduced accuracy when used with about 20 or more tumor types.
[0043] FIG. 7 shows an alternative presentation of the data used with respect to FIG. 6. A plot of numbers of gene sequences used, ranging from 50-90, versus prediction accuracies for various representative numbers of tumor types is shown. The plotted lines, from top to bottom, are of the results from 2, 10, 20, 30, and 39 tumor types, respectively.
[0044] FIGS. 8A-8D show a "tree" that classifies tumor types covered herein as well as additional known tumor types. It was constructed mainly according to "Cancer, Principles and Practice of Oncology, (DeVita, Hellman and Rosenberg), 6th edition". Thus beginning with a "tumor of unknown origin" (or "tuo"), the first possibilities are that it is either of a germ cell or non-germ cell origin. If it is the former, then it may be of ovary or testes origin. Within those of testes origin, the tumor may be of seminoma origin or an "other" origin.
[0045] If the tumor is of a non-germ cell origin, then it is either of an epithelial or non-epithelial origin. If it is the former, then it is of either squamous or non-squamous origin. Squamous origin tumors are of cervix, esophagus, larynx, lung, or skin in origin. Non-squamous origin tumors are of urinary bladder, breast, carcinoid-intestine, cholangiocarcinoma, digestive, kidney, liver, lung, prostate, reproductive system, skin-basal cell, or thyroid-follicular-papillary origin. Among those of digestive origin, the tumors are of small and large bowel, stomach-adenocarcinoma, bile duct, esophagus, gall bladder, and pancreas in origin. The esophagus origin tumors may be of either Barrett's esophagus or adenocarcinoma types. Of the reproductive system origin tumors, they may be of cervix adenocarcinoma type, endometrial tumor, or ovarian origin. Ovarian origin tumors are of the clear, serous, mucinous, and endometrioid types.
[0046] If the tumor is of non-epithelial origin, then it is of adrenal gland, brain, GIST (gastrointestinal stromal tumor), lymphoma, meningioma, mesothelioma, sarcoma, skin melanoma, or thyroid-medullary origin. Of the lymphomas, they are B cell, Hodgkin's, or T cell type. Of the sarcomas, they are leiomyosarcoma, osteosarcoma, soft-tissue sarcoma, soft tissue MFH (malignant fibrous histiocytoma), soft tissue sarcoma synovial, soft tissue Ewing's sarcoma, soft tissue fibrosarcoma, and soft tissue rhabdomyosarcoma types.
DETAILED DESCRIPTION OF MODES OF PRACTICING THE INVENTION
[0047] This invention provides methods for the use of gene expression information to classify tumors in a more objective manner than possible with conventional pathology techniques. The invention is based in part on the results of randomly reducing the number of gene sequences used to classify a tumor sample as one of a plurality of tumor types, such as the 34 tumor types described below and in U.S. Provisional Application 60/577,084, filed Jun. 4, 2004. A total of 16,948 genes, filtered down from a larger set by removal of genes that display low or constant signals in the samples used, was used to determine both cross-validation and prediction accuracies as described in the examples below. One hundred random selections of 50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400, 2600, 2800, 3000 and more genes were drawn from the total and used for classification as described herein.
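As a non-limiting illustration of the random selection step described above, the following minimal sketch draws repeated random subsets of gene indices from a filtered gene set. The function name `random_gene_subsets`, the use of a fixed seed, and the representation of genes as integer indices are assumptions made for illustration only and are not taken from the disclosure.

```python
import random

def random_gene_subsets(total_genes, subset_size, n_draws, seed=0):
    """Draw n_draws random subsets of subset_size gene indices.

    Hypothetical helper: genes are represented as integer indices
    0..total_genes-1; each subset is sampled without replacement.
    """
    rng = random.Random(seed)  # fixed seed (assumption) so draws are repeatable
    return [
        sorted(rng.sample(range(total_genes), subset_size))
        for _ in range(n_draws)
    ]

# Example: 100 random selections of 50 genes from 16,948 filtered genes.
subsets = random_gene_subsets(total_genes=16948, subset_size=50, n_draws=100)
```

Each subset could then be passed to a classifier to estimate how accuracy varies with the number of gene sequences used.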
[0048] Thus in a first aspect, the invention provides a method of classifying a cell containing sample as including a tumor cell of (or from) a type of tissue or a tissue origin. The method comprises determining or measuring the expression levels of 50 or more transcribed sequences from cells in a cell containing sample obtained from a subject, and classifying the sample as containing tumor cells of a type of tissue from a plurality of tumor types based on the expression levels of said sequences. As used herein, "a plurality" refers to the state of two or more.
[0049] In some embodiments of the invention, the expression of more than 50% of said transcribed sequences is not correlated with expression of another one of said transcribed sequences; and/or the 50 or more transcribed sequences are not selected based upon supervised learning using known tumor samples, on the level of correlation between their expression and said plurality of tumor types, or on their rank in a correlation between their expression and said plurality of tumor types.
[0050] The classifying is based upon a comparison of the expression levels of the 50 or more transcribed sequences in the cells of the sample to their expression levels in known tumor samples and/or known non-tumor samples. Alternatively, the classifying is based upon a comparison of the expression levels of the 50 or more transcribed sequences to the expression of reference sequences in the same samples, relative to, or based on, the same comparison in known tumor samples and/or known non-tumor samples. Thus as a non-limiting example, the expression levels of the gene sequences may be determined in a set of known tumor samples to provide a database against which the expression levels detected or determined in a cell containing sample from a subject are compared. The expression level(s) of gene sequence(s) in a sample also may be compared to the expression level(s) of said sequence(s) in normal, or non-cancerous cells, preferably from the same sample or subject. As described below and in embodiments of the invention utilizing Q-PCR or real time Q-PCR, the expression levels may be compared to expression levels of reference genes in the same sample or a ratio of expression levels may be used.
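As a non-limiting illustration of comparing expression levels to reference genes in the same sample, the following sketch uses the widely known delta-Ct convention for Q-PCR data. The delta-Ct formulation, the assumption of perfect doubling per cycle in the ratio, and the names `delta_ct`, `relative_ratio`, and `GENE_A` are illustrative assumptions and are not stated in the disclosure.

```python
def delta_ct(target_cts, reference_cts):
    """Normalize each target gene's Ct value to the mean reference-gene Ct.

    target_cts: dict mapping gene name -> Ct value in one sample.
    reference_cts: list of Ct values of reference genes in the same sample.
    """
    ref_mean = sum(reference_cts) / len(reference_cts)
    return {gene: ct - ref_mean for gene, ct in target_cts.items()}

def relative_ratio(delta_cts):
    """Convert delta-Ct values to expression ratios versus the reference.

    Assumes (hypothetically) ideal amplification, i.e. doubling per cycle.
    """
    return {gene: 2.0 ** (-d) for gene, d in delta_cts.items()}

# Hypothetical sample: one target gene and two reference genes.
normalized = delta_ct({"GENE_A": 24.0}, [20.0, 22.0])
ratios = relative_ratio(normalized)
```

In this hypothetical sample the target gene has a delta-Ct of 3.0 cycles, corresponding to a ratio of 0.125 relative to the reference genes.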
[0051] The selection of 50 or more gene sequences to use may be random, or by selection based on various criteria. As one non-limiting example, the gene sequences may be selected based upon unsupervised learning, including clustering techniques. As another non-limiting example, selection may be to reduce or remove redundancy with respect to their ability to classify tumor type. For example, gene sequences are selected based upon the lack of correlation between their expression and the expression of one or more other gene sequences used for classifying. This is accomplished by assessing the expression level of each gene sequence in the expression data set for correlation, across the plurality of samples, with the expression level of each other gene in the data set to produce a correlation matrix of correlation coefficients. These correlation determinations may be performed directly, between expression of each pair of gene sequences, or indirectly, without direct comparison between the expression values of each pair of gene sequences.
[0052] A variety of correlation methodologies may be used in the correlation of expression data of individual gene sequences within the data set. Non-limiting examples include parametric and non-parametric methods as well as methodologies based on mutual information and non-linear approaches. Non-limiting examples of parametric approaches include Pearson correlation (or Pearson r, also referred to as linear or product-moment correlation) and cosine correlation. Non-limiting examples of non-parametric methods include Spearman's R (or rank-order) correlation, Kendall's Tau correlation, and the Gamma statistic. Each correlation methodology can be used to determine the level of correlation between the expressions of individual gene sequences in the data set. The correlation of all sequences with all other sequences is most readily considered as a matrix. Using Pearson's correlation as a non-limiting example, the correlation coefficient r in the method is used as the indicator of the level of correlation. When other correlation methods are used, the correlation coefficient analogous to r may be used, along with the recognition of equivalent levels of correlation corresponding to r being at or about 0.25 to being at or about 0.5.
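As a non-limiting illustration of a parametric and a non-parametric methodology named above, the following sketch computes Pearson's r and Spearman's rank-order correlation for the expression of two gene sequences across samples. The helper names and the toy expression values are hypothetical; Spearman's coefficient is computed here as Pearson's r on average ranks.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant signal: treated as uncorrelated (assumption)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_r(x, y):
    """Spearman rank-order correlation: Pearson's r applied to ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical expression of two genes across five samples:
# monotonically related, but not linearly.
gene_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_2 = [1.0, 4.0, 9.0, 16.0, 100.0]
r_pearson = pearson_r(gene_1, gene_2)
r_spearman = spearman_r(gene_1, gene_2)
```

For a monotonic but non-linear relationship such as this one, the rank-based coefficient is higher than the linear one, which is one reason equivalent threshold levels may differ between methodologies.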
[0053] The correlation coefficient may be selected as desired to reduce the number of correlated gene sequences to various numbers. In some embodiments of the invention using r, the selected coefficient value may be of about 0.25 or higher, about 0.3 or higher, about 0.35 or higher, about 0.4 or higher, about 0.45 or higher, or about 0.5 or higher. The selection of a coefficient value means that where expression between gene sequences in the data set is correlated at that value or higher, they are possibly not included in a subset of the invention. Thus in some embodiments, the method comprises excluding or removing (not using for classification) one or more gene sequences that are expressed in correlation, above a desired correlation coefficient, with another gene sequence in the tumor type data set. It is pointed out, however, that there can be situations of gene sequences that are not correlated with any other gene sequences, in which case they are not necessarily removed from use in classification.
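As a non-limiting illustration of excluding gene sequences whose expression is correlated above a selected coefficient, the following sketch applies a simple greedy filter using Pearson's r. The greedy, order-dependent strategy, the function names, and the toy data are assumptions made for illustration; the disclosure does not specify a particular removal procedure.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant signal: treated as uncorrelated (assumption)
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(expression, threshold=0.5):
    """Keep a gene only if its r with every already-kept gene is below threshold.

    expression: dict mapping gene name -> expression values across samples.
    The result depends on gene order (a hypothetical design choice).
    """
    kept = []
    for gene, values in expression.items():
        if all(pearson_r(values, expression[k]) < threshold for k in kept):
            kept.append(gene)
    return kept

# Hypothetical data: g2 closely tracks g1, while g3 does not.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
subset = drop_correlated(expression, threshold=0.5)
```

Here `g2` is excluded because it is correlated with `g1` above the selected value, while `g3` is retained. A practitioner might instead compare the absolute value of r so that strongly anti-correlated sequences are also treated as redundant.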
[0054] Thus the expression levels of gene sequences, where more than about 10%, more than about 20%, more than about 30%, more than about 40%, more than about 50%, more than about 60%, more than about 70%, more than about 80%, or more than about 90% of the levels are not correlated with that of another one of the gene sequences used, may be used in the practice of the invention. Correlation between expression levels may be based upon a value below about 0.9, about 0.8, about 0.7, about 0.6, about 0.5, about 0.4, about 0.3, or about 0.2. The ability to classify among classes with exclusion of the expression levels of some gene sequences is present because expression of the gene sequences in the subset is correlated with expression of the gene sequences excluded from the subset. Thus no information is lost, because information based on the expression of the excluded gene sequences is still represented by sequences retained in the subset. Therefore, expression of the gene sequences of the subset has information content relevant to properties and/or characteristics (or phenotype) of a cell. This has application and relevance to the classification of additional tumor type classes not included as part of the original gene expression data set, which can be classified by use of a subset of the invention based on the redundancy of information between expression of sequences in the subset and sequences expressed in those additional classes. Thus the invention may be used to classify cells as being a tumor type beyond the plurality of known classes used to generate the original gene expression data set.
[0055] Selection of gene sequences based upon reducing correlation of expression to a particular tumor type may also be used. This also reflects a discovery of the present invention, based upon the observation that expression levels that were most highly correlated with one or more tumor types were not necessarily of greatest value in classification among different tumor types. This is reflected both by the ability to use randomly selected gene sequences for classification as well as the use of particular sequences, as described herein, which are not expressed with the most significant correlation with one or more tumor types. Thus the invention may be practiced without selection of gene sequences based upon the most significant P values or a ranking based upon correlation of gene expression and one or more tumor types. Thus the invention may be practiced without the use of ranking based methodologies, such as the Kruskal-Wallis H-test.
[0056] The gene sequences used in the practice of the invention may include those which have been observed to be expressed in correlation with particular tumor types, such as expression of the estrogen receptor, which has been observed to be expressed in correlation with some breast and ovarian cancers. In some embodiments of the invention, however, the invention is practiced with use of the expression level of at least one gene sequence that has not been previously identified as being associated with any of the tumor types being classified. Thus the invention may be practiced without all of the gene sequences having previously been associated or correlated with expression in the 2 or more (up to 39 or more) tumor types to which a cell containing sample may be classified.
[0057] While the invention is described mainly with respect to human subjects, samples from other subjects may also be used. All that is necessary is the ability to assess the expression levels of gene sequences in a plurality of known tumor samples such that the expression levels in an unknown or test sample may be compared. Thus the invention may be applied to samples from any organism for which a plurality of expressed sequences, and a plurality of known tumor samples, are available. One non-limiting example is application of the invention to mouse samples, based upon the availability of the mouse genome to permit detection of expressed murine sequences and the availability of known mouse tumor samples or the ability to obtain known samples. Thus, the invention is contemplated for use with other samples, including those of mammals, primates, and animals used in clinical testing (such as rats, mice, rabbits, dogs, cats, and chimpanzees) as non-limiting examples.
[0058] While the invention is readily practiced with the use of cell containing samples, any nucleic acid containing sample which may be assayed for gene expression levels may be used in the practice of the invention. Without limiting the invention, a sample of the invention may be one that is suspected, or known to contain tumor cells. Alternatively, a sample of the invention may be a "tumor sample" or "tumor containing sample" or "tumor cell containing sample" of tissue or fluid isolated from an individual suspected of being afflicted with, or at risk of developing, cancer. Non-limiting examples of samples for use with the invention include a clinical sample, such as, but not limited to, a fixed sample, a fresh sample, or a frozen sample. The sample may be an aspirate, a cytological sample (including blood or other bodily fluid), or a tissue specimen, which includes at least some information regarding the in situ context of cells in the specimen, so long as appropriate cells or nucleic acids are available for determination of gene expression levels. The invention is based in part on the discovery that results obtained with frozen tissue sections can be validly applied to the situation with fixed tissue or cell samples and extended to fresh samples.
[0059] Non-limiting examples of fixed samples include those that are fixed with formalin or formaldehyde (including FFPE samples), with Bouin's, glutaraldehyde, acetone, alcohols, or any other fixative, such as those used to fix cell or tissue samples for immunohistochemistry (IHC). Other examples include fixatives that precipitate cell associated nucleic acids and proteins. Given possible complications in handling frozen tissue specimens, such as the need to maintain their frozen state, the invention may be practiced with non-frozen samples, such as fixed samples, fresh samples, including cells from blood or other bodily fluid or tissue, and minimally treated samples. In some applications of the invention, the sample has not been classified using standard pathology techniques, such as, but not limited to, immunohistochemistry based assays.
[0060] In some embodiments of the invention, the sample is classified as containing a tumor cell of a type selected from the following 53, and subsets thereof: Adenocarcinoma of Breast, Adenocarcinoma of Cervix, Adenocarcinoma of Esophagus, Adenocarcinoma of Gall Bladder, Adenocarcinoma of Lung, Adenocarcinoma of Pancreas, Adenocarcinoma of Small-Large Bowel, Adenocarcinoma of Stomach, Astrocytoma, Basal Cell Carcinoma of Skin, Cholangiocarcinoma of Liver, Clear Cell Adenocarcinoma of Ovary, Diffuse Large B-Cell Lymphoma, Embryonal Carcinoma of Testes, Endometrioid Carcinoma of Uterus, Ewing's Sarcoma, Follicular Carcinoma of Thyroid, Gastrointestinal Stromal Tumor, Germ Cell Tumor of Ovary, Germ Cell Tumor of Testes, Glioblastoma Multiforme, Hepatocellular Carcinoma of Liver, Hodgkin's Lymphoma, Large Cell Carcinoma of Lung, Leiomyosarcoma, Liposarcoma, Lobular Carcinoma of Breast, Malignant Fibrous Histiocytoma, Medullary Carcinoma of Thyroid, Melanoma, Meningioma, Mesothelioma of Lung, Mucinous Adenocarcinoma of Ovary, Myofibrosarcoma, Neuroendocrine Tumor of Bowel, Oligodendroglioma, Osteosarcoma, Papillary Carcinoma of Thyroid, Pheochromocytoma, Renal Cell Carcinoma of Kidney, Rhabdomyosarcoma, Seminoma of Testes, Serous Adenocarcinoma of Ovary, Small Cell Carcinoma of Lung, Squamous Cell Carcinoma of Cervix, Squamous Cell Carcinoma of Esophagus, Squamous Cell Carcinoma of Larynx, Squamous Cell Carcinoma of Lung, Squamous Cell Carcinoma of Skin, Synovial Sarcoma, T-Cell Lymphoma, and Transitional Cell Carcinoma of Bladder.
[0061] In other embodiments of the invention, the sample is classified as containing a tumor cell of a type selected from the following 34, and subsets thereof: adrenal, brain, breast, carcinoid-intestine, cervix (squamous cell), cholangiocarcinoma, endometrium, germ-cell, GIST (gastrointestinal stromal tumor), kidney, leiomyosarcoma, liver, lung (adenocarcinoma, large cell), lung (small cell), lung (squamous), lymphoma (B cell), lymphoma (Hodgkin's), meningioma, mesothelioma, osteosarcoma, ovary (clear cell), ovary (serous cell), pancreas, prostate, skin (basal cell), skin (melanoma), small and large bowel, soft tissue (liposarcoma), soft tissue (MFH or Malignant Fibrous Histiocytoma), soft tissue (Sarcoma-synovial), testis (seminoma), thyroid (follicular-papillary), thyroid (medullary carcinoma), and urinary bladder.
[0062] In further embodiments of the invention, the sample is classified as containing a tumor cell of a type selected from the following 39, and subsets thereof: adrenal gland, brain, breast, carcinoid-intestine, cervix-adenocarcinoma, cervix-squamous, endometrium, gall bladder, germ cell-ovary, GIST, kidney, leiomyosarcoma, liver, lung-adenocarcinoma-large cell, lung-small cell, lung-squamous, lymphoma-B cell, lymphoma-Hodgkin's, lymphoma-T cell, meningioma, mesothelioma, osteosarcoma, ovary-clear cell, ovary-serous, pancreas, prostate, skin-basal cell, skin-melanoma, skin-squamous, small and large bowel, soft tissue-liposarcoma, soft tissue-MFH, soft tissue-sarcoma-synovial, stomach-adenocarcinoma, testis-other (or non-seminoma), testis-seminoma, thyroid-follicular-papillary, thyroid-medullary, and urinary bladder.
[0063] The methods of the invention may also be applied to classify a cell containing sample as containing a tumor cell of a tumor of a subset of any of the above sets. The size of the subset will usually be small, composed of two, three, four, five, six, seven, eight, nine, or ten of the tumor types described above. Alternatively, the size of the subset may be any integral number up to the full size of the set. Thus embodiments of the invention include classification among 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, or 52 of the above types. In some embodiments, the subset will be composed of tumor types that are of the same tissue or organ type. Alternatively, the subset will be composed of tumor types of different tissues or organs. In some embodiments, the subset will include one or more types selected from adrenal gland, brain, carcinoid intestine, cervix-adenocarcinoma, cervix-squamous, gall bladder, germ cell-ovary, GIST, leiomyosarcoma, liver, meningioma, osteosarcoma, skin-basal cell, skin-squamous, soft tissue-liposarcoma, soft tissue-MFH, soft tissue-sarcoma-synovial, testis-other (or non-seminoma), testis-seminoma, thyroid-follicular-papillary, and thyroid-medullary.
[0064] Classification among subsets of the above tumor types is demonstrated by the results shown in FIGS. 1 and 2, where the expression levels of as few as 50 gene sequences can be used to classify among random samples of 2 tumor types among those in the set of 39 listed above. Expression levels of 50-100 gene sequences (that were randomly selected) can be used to classify among 2 to 39 tumor types with varying degrees of accuracy. The invention may be practiced with the expression levels of 50 or more, about 55 or more, about 60 or more, about 65 or more, about 70 or more, about 75 or more, about 80 or more, about 85 or more, about 90 or more, about 100 or more, about 110 or more, about 120 or more, about 130 or more, about 140 or more, about 150 or more, about 200 or more, about 250 or more, about 300 or more, about 350 or more, or about 400 or more transcribed sequences as found in the human "transcriptome" (transcribed portion of the genome). The invention may also be practiced with expression levels of 50-60 or more, about 60-70 or more, about 70-80 or more, about 80-90 or more, about 90-100 or more, about 100-110 or more, about 110-120 or more, about 120-130 or more, or about 130-140 or more transcribed sequences. In some embodiments of the invention, the transcribed genes may be randomly picked or include all or some of the specific gene sequences disclosed herein. As demonstrated herein, classification with accuracies of about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, or about 95% or higher can be performed by use of the instant invention.
[0065] In other embodiments, the gene expression levels of other gene sequences may be determined along with the above described determinations of expression levels for use in classification. One non-limiting example of this is seen in the case of a microarray based platform to determine gene expression, where the expression of other gene sequences is also measured. Where those other expression levels are not used in classification, they may be considered the results of "excess" transcribed sequences and not critical to the practice of the invention. Alternatively, and where those other expression levels are used in classification, they are within the scope of the invention, where the description of using particular numbers of sequences does not necessarily exclude the use of expression levels of additional sequences. In some embodiments, the invention includes the use of expression level(s) from one or more "excess" gene sequences, such as those which may provide information redundant to one or more other gene sequences used in a method of the invention.
[0066] Because classification of a sample as containing cells of one of the above tumor types inherently also classifies the tissue or organ site origin of the sample, the methods of the invention may be applied to classification of a tumor sample as being of a particular tissue or organ site of the patient. This application of the invention is particularly useful in cases where the sample is of a tumor that is the result of metastasis by another tumor. In some embodiments of the invention, the tumor sample is classified as being one of the following 24: Adrenal, Bladder, Bone, Brain, Breast, Cervix, Endometrium, Esophagus, Gall Bladder, Kidney, Larynx, Liver, Lung, Lymph Node, Ovary, Pancreas, Prostate, Skin, Soft Tissue, Small/Large Bowel, Stomach, Testes, Thyroid, and Uterus.
[0067] While the invention also provides for classification as one of the above tumor types based upon comparisons to the expression levels of sequences in the 39 tumor types, it is possible that a higher level of confidence in the classification is desired. If an increase in the confidence of the classification is preferred, the classification can be adjusted to identify the tumor sample as being of a particular origin or cell type as shown in FIG. 8. Thus an increase in confidence can be made in exchange for a decrease in specificity as to tumor type by identification of origin or cell type.
[0068] The classification of a cell containing sample as having a tumor cell of one of the 39 tumor types above inherently also classifies the tissue or organ site origin of the sample. For example, the identification of a sample as being cervix-squamous necessarily classifies the tumor as being of cervical origin, squamous cell type (and thus epithelial rather than non-epithelial in origin) as shown in FIG. 8. It also means that the tumor was necessarily not germ cell in origin. Thus, the methods of the invention may be applied to classification of a tumor sample as being of a particular tissue or organ site of a subject or patient. This application of the invention is particularly useful in cases where the sample is of a tumor that is the result of metastasis by another tumor.
[0069] The practice of the invention to classify a cell containing sample as having a tumor cell of one of the above types is by use of an appropriate classification algorithm that utilizes supervised learning to accept 1) the levels of expression of the gene sequences in a plurality of known tumor types as a training set and 2) the levels of expression of the same genes in one or more cells of a sample to classify the sample as having cells of one of the tumor types. Further discussion of this is provided in the Example section herein. The levels of expression may be provided based upon the signals in any format, including nucleic acid expression or protein expression as described herein.
[0070] As would be evident to the skilled practitioner, the range of classification is affected by the number of tumor types as well as the number of samples for each tumor type. But given adequate samples of the full range of human tumors as provided herein, the invention is readily applied to the classification of those tumor types as well as additional types.
[0071] Non-limiting examples of classification algorithms that may be used in the practice of the invention include supervised learning algorithms, machine learning algorithms, linear discriminant analysis, attribute selection algorithms, and artificial neural networks (ANN). In preferred embodiments of the invention, a distance-based classification algorithm, such as the k-nearest neighbor (KNN) algorithm, or a support vector machine (SVM) is used.
[0072] The use of KNN is in some embodiments of the invention and is discussed further as a non-limiting representative example. KNN can be used to analyze the expression data of the genes in a "training set" of known tumor samples including all 39 of the tumor types described herein. The training data set can then be compared to the expression data for the same genes in a cell containing sample. The expression levels of the genes in the sample are then compared to the training data set via KNN to identify those tumor samples with the most similar expression patterns. As a non-limiting example, the five "nearest neighbors" may be identified and the tumor types thereof used to classify the unknown tumor sample. Of course other numbers of "nearest neighbors" may be used. Non-limiting examples include less than 5, about 7, about 9, or about 11 or more "nearest neighbors".
[0073] As a hypothetical example, if the five "nearest neighbors" of an unknown sample are four B cell lymphomas and one T cell lymphoma, then the classification of the sample as being of a B cell lymphoma can be made with great accuracy. This has been used with 84% or greater accuracy, such as 90%, as described in the Examples.
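As a non-limiting illustration of the KNN approach discussed above, the following minimal sketch classifies an unknown sample by majority vote among its five nearest training samples. The use of Euclidean distance, the function name `knn_classify`, and the two-gene toy training set are assumptions made for illustration; the disclosure does not fix a distance measure.

```python
import math
from collections import Counter

def knn_classify(training, sample, k=5):
    """Classify a sample by majority vote among its k nearest neighbors.

    training: list of (expression_vector, tumor_type) pairs.
    sample: expression vector for the same genes, in the same order.
    Distance is Euclidean (an assumption, not taken from the disclosure).
    """
    ranked = sorted(training, key=lambda item: math.dist(item[0], sample))
    votes = Counter(label for _, label in ranked[:k])
    label, _count = votes.most_common(1)[0]
    return label, votes

# Hypothetical two-gene training set of known tumor samples.
training = [
    ([0.1, 0.0], "lymphoma-B cell"),
    ([0.0, 0.2], "lymphoma-B cell"),
    ([-0.1, 0.1], "lymphoma-B cell"),
    ([0.2, 0.2], "lymphoma-B cell"),
    ([0.3, 0.1], "lymphoma-T cell"),
    ([5.0, 5.0], "lymphoma-T cell"),
    ([6.0, 5.0], "kidney"),
    ([5.0, 6.0], "kidney"),
]
predicted, votes = knn_classify(training, [0.0, 0.0], k=5)
```

In this toy case the five nearest neighbors are four B cell lymphomas and one T cell lymphoma, so the sample is classified as a B cell lymphoma, mirroring the hypothetical example above. The returned vote counts may also be inspected when the neighbors disagree.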
[0074] The classification ability may be combined with the inherent nature of the classification scheme to provide a means to increase the confidence of tumor classification in certain situations. For example, if the five "nearest neighbors" of a sample are three ovary clear cell and two ovary serous tumors, confidence can be improved by simply treating the tumors as being of ovarian origin and treating the subject or patient (from whom the sample was obtained) accordingly. See FIG. 8. This is an example of trading off specificity in favor of increased confidence. This provides the added benefit of addressing the possibility that the unknown sample was a mucinous or endometrioid tumor. Of course the skilled practitioner is free to treat the tumor as one or both of these two most likely possibilities and to proceed in accordance with that determination.
[0075] Because the developmental lineage of tumor cells in certain tumor types (e.g., germ cells) can be complex and involve multiple cell types, FIG. 8 may appear to be oversimplified. However, it serves as a good basis to relate known histopathology and to serve as a "guide tree" for analyzing and relating tumor-associated gene expression signatures.
[0076] The inherent nature of the classification scheme also provides a means to increase the confidence of tumor classification in cases wherein the "nearest neighbors" are ambiguous. For example, if the five "nearest neighbors" were one urinary bladder, one breast, one kidney, one liver, and one prostate, the classification can simply be that of a non squamous cell tumor. Such a determination can be made with significant confidence and the subject or patient from whom the sample was obtained can be treated accordingly. Without being bound by theory, and offered solely to improve the understanding of the invention, the last two examples reflect the similarities in gene expression of cells of a similar cell type and/or tissue origin.
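As a non-limiting illustration of the two preceding examples, the following Python sketch reports the most specific class shared by all "nearest neighbors" in a "guide tree". The `PARENT` map is a small hypothetical tree written for this sketch; it is not a reproduction of FIG. 8, and the function names are likewise assumptions.

```python
# Hypothetical child -> parent relationships (NOT the tree of FIG. 8).
PARENT = {
    "ovary clear cell": "ovary",
    "ovary serous": "ovary",
    "ovary mucinous": "ovary",
    "ovary endometrioid": "ovary",
    "ovary": "non squamous cell",
    "urinary bladder": "non squamous cell",
    "breast": "non squamous cell",
    "kidney": "non squamous cell",
    "liver": "non squamous cell",
    "prostate": "non squamous cell",
    "non squamous cell": "tumor",
    "lung squamous": "squamous cell",
    "squamous cell": "tumor",
}

def path_to_root(label, parent):
    """List of nodes from `label` up to the root of the tree."""
    path = [label]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

def common_class(labels, parent):
    """Most specific node shared by every label's path to the root."""
    paths = [path_to_root(label, parent) for label in labels]
    shared = set(paths[0]).intersection(*map(set, paths[1:]))
    # Walking one path from leaf to root, the first shared node is the
    # most specific class that covers all of the neighbors.
    return next(node for node in paths[0] if node in shared)

ovarian = common_class(
    ["ovary clear cell"] * 3 + ["ovary serous"] * 2, PARENT)
mixed = common_class(
    ["urinary bladder", "breast", "kidney", "liver", "prostate"], PARENT)
```

Here the ovarian neighbors resolve to "ovary" and the five dissimilar neighbors resolve to "non squamous cell", which corresponds to trading specificity for confidence as described above.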
[0077] Embodiments of the invention include use of the methods and materials described herein to identify the origin of a cancer from a patient. Thus given a sample containing tumor cells, the tissue origin of the tumor cells is identified by use of the present invention. One non-limiting example is in the case of a subject with an inflamed lymph node containing cancer cells. The cells may be from a tissue or organ that drains into the lymph node or they may be from another tissue source. The present invention may be used to classify the cells as being of a particular tumor or tissue type (or origin) which allows the identification of the source of the cancer cells. In an alternative non-limiting example, the sample (such as that from a lymph node) contains cells, which are first assayed by use of the invention to classify at least one cell as being a tumor cell of a tissue type or origin. This is then used to identify the source of the cancer cells in the sample. Both of these are examples of the advantageous use of the invention to save time, effort, and cost in the use of other cancer diagnostic tests.
[0078] In further embodiments, the invention is practiced with a sample from a subject with a previous history of cancer. As a non-limiting example, a cell containing sample (from the lymph node or elsewhere) of the subject may be found to contain cancer cells such that the present invention may be used to determine whether the cells are from the same or a different tissue from that of the previous cancer. This application of the invention may also be used to identify a new primary tumor, such as the case where new cancer cells are found in the liver of a subject who previously had breast cancer. The invention may be used to identify the new cancer cells as being the result of metastasis from the previous breast cancer (or from another tumor type, whether previously identified or not) or as a new primary occurrence of liver cancer. The invention may also be applied to samples of a tissue or organ where multiple cancers are found to determine the origin of each cancer, as well as whether the cancers are of the same origin.
[0079] While the invention may be practiced with the use of expression levels of a random group of expressed gene sequences, the invention also provides exemplary gene sequences for use in the practice of the invention. The invention includes a first group of 74 gene sequences from which 50 or more may be used in the practice of the invention. The 50 to 74 gene sequences may be used along with the determination of expression levels of additional sequences so long as the expression levels of gene sequences from the set of 74 are used in classifying. A non-limiting example of such embodiments of the invention is where the expression of the 74 gene sequences, or at least 50 (or 50 to about 90) members thereof, is measured along with the expression levels of a plurality of other sequences, such as by use of a microarray based platform used to perform the invention. Where those other expression levels are not used in classification, they may be considered the results of "excess" transcribed sequences and not critical to the practice of the invention. Alternatively, and where those other expression levels are used in classification, they are within the scope of the invention, where the use of the above described sequences does not necessarily exclude the use of expression levels of additional sequences.
[0080] mRNA sequences corresponding to a set of 74 gene sequences for use in the practice of the invention are provided in the attached Sequence Listing. A listing of the SEQ ID NOs, with corresponding identifying information, including accession numbers and other information, is provided by the following.
TABLE-US-00001
(SEQ ID NO: 1) >Hs.73995_mRNA_1 gi|190403|gb|M60502.1|HUMPROFILE Human profilaggrin mRNA, 3' end polyA = 1
(SEQ ID NO: 2) >Hs.75236_mRNA_4 gi|14280328|gb|AY033998.1|Homo sapiens polyA = 3
(SEQ ID NO: 3) >Hs.299867_mRNA_1 gi|4758533|ref|NM_004496.1|Homo sapiens hepatocyte nuclear factor 3, alpha (HNF3A), mRNA polyA = 3
(SEQ ID NO: 4) >Hs.285401_contig1 AI147926|AI880620|AA768316|AA761543|AA279147|AI216016|AI738663|N79248|AI684489|AA960845|AI718599|AI379138|N29366|BF002507|AW044269|R34339|R66326|H04648|R67467|AI523112|BF941500 polyA = 2 polyA = 3
(SEQ ID NO: 5) >Hs.182507_mRNA_1 gi|15431324|ref|NM_002283.2|Homo sapiens keratin, hair, basic, 5 (KRTHB5), mRNA polyA = 3
(SEQ ID NO: 6) >Hs.292653_contig1 AI200660|AW014007|AI341199|AI692279|AI393765|AI378686|AI695373|AW292108|T10352|R44346|AW470408|AI380925|BF938983|AW003704|H08077|F03856|H08075|F08895|AW468398|AI865976|H22568|AI858374|AI216499 polyA = 2 polyA = 3
(SEQ ID NO: 7) >Hs.97616_mRNA_3 gi|12654852|gb|BC001270.1|BC001270 Homo sapiens clone MGC: 5069 IMAGE: 3458016 polyA = 3
(SEQ ID NO: 8) >Hs.123078_mRNA_3 gi|14328043|gb|BC009237.1|BC009237 Homo sapiens clone MGC: 2216 IMAGE: 2989823 polyA = 3
(SEQ ID NO: 9) >Hs.285508_contig1 AW194680|BF939744|BF516467 polyA = 1 polyA = 1
(SEQ ID NO: 10) >Hs.183274_contig1 BF437393|BF064008|BF509951|AW134603|AI277015|AI803254|AA887915|BF054958|AI004413|AI393911|AI278517|AW612644|AI492162|AI309926|AI863671|AA448864|AI640165|AA479926|AA461188|AA780161|BF591180|AI918020|AI758226|AI291375|BF001845|BF003064|AI337393|AI522206|BE856784|BF001760|AI280300 FLAG = 1 polyA = 2 WARN polyA = 3
(SEQ ID NO: 11) >Hs.334841_mRNA_3 gi|14290606|gb|BC009084.1|BC009084 Homo sapiens clone MGC: 9270 IMAGE: 3853674 polyA = 3
(SEQ ID NO: 12) >Hs.3321_contig1 AI804745|AI492375|AA594799|BE672611|AA814147|AA722404|AW170088|D11718|BG153444|AI680648|AA063561|BE219054|AI590287|R55185|AI479167|AI796872|AI018324|AI701122|BE218203|AA905336|AI681917|BI084742|AI480008|AI217994|AI401468 polyA = 2 polyA = 3
(SEQ ID NO: 13) >Hs.306216_singlet1 AW083022 polyA = 1 polyA = 2
(SEQ ID NO: 14) >Hs.99235_contig1 AA456140|AI167259|AA450056 polyA = 2 polyA = 3
(SEQ ID NO: 15) >Hs.169172_mRNA_2 gi|2274961|emb|AJ000388.1|HSCANPX Homo sapiens mRNA for calpain-like protease CANPX polyA = 3
(SEQ ID NO: 16) >Hs.351486_mRNA_1 gi|16549178|dbj|AK054605.1|AK054605 Homo sapiens cDNA FLJ30043 fis, clone 3NB692001548 polyA = 0
(SEQ ID NO: 17) >Hs.153504_contig2 BE962007|AW016349|AW016358|AW139144|AA932969|AI025620|AI688744|AI865632|AA854291|AA932970|AU156702|AI634439|AI152496|AI539557|AI123490|AI613215|AI318363|AW105672|AA843483|AI366889|AW181938|AI813801|AI433695|AA934772|N72230|AI760632|BE858965|AW058302|AI760087|AI682077|AA886672|AI350384|AW243848|AW300574|BE466359|AI859529|AI921588|BF062899|BE855597|BE617708 polyA = 2 polyA = 3
(SEQ ID NO: 18) >Hs.1994534_singlet1 AI669760 polyA = 1 polyA = 2
(SEQ ID NO: 19) >Hs.162020_contig1 AW291189|AA505872 polyA = 2 polyA = 3
(SEQ ID NO: 20) >Hs.30743_mRNA_3 gi|18201906|ref|NM_006115.2|Homo sapiens preferentially expressed antigen in melanoma (PRAME), mRNA polyA = 3
(SEQ ID NO: 21) >Hs.271580_contig1 AI632869|AW338882|AW338875|AW613773|AI982899|AW193151|BE206353|BE208200|AI811548|AW264021 polyA = 2 polyA = 3
(SEQ ID NO: 22) >Hs.69360_mRNA_2 gi|14250609|gb|BC008764.1|BC008764 Homo sapiens clone MGC: 1266 IMAGE: 3347571 polyA = 3
(SEQ ID NO: 23) >Hs.30827_contig1 H07885|N39347|W85913|AA583408|W86449| polyA = 2 polyA = 3
(SEQ ID NO: 24) >Hs.211593_contig2 BF592799|AI570478|AA234440|R40214|BE501078|AW593784|AI184050|AI284161|W72149|AW780437|AI247981|AW241273|H60824 polyA = 2 polyA = 3
(SEQ ID NO: 25) >Hs.155097_mRNA_1 gi|15080385|gb|BC011949.1|BC011949 Homo sapiens clone MGC: 9006 IMAGE: 3863603 polyA = 3
(SEQ ID NO: 26) >Hs.5163_mRNA_1 gi|15990433|gb|BC015582.1|BC015582 Homo sapiens clone MGC: 23280 IMAGE: 4637504 polyA = 3
(SEQ ID NO: 27) >Hs.55150_mRNA_1 gi|17068414|gb|BC017586.1|BC017586 Homo sapiens clone MGC: 26610 IMAGE: 4837506 polyA = 3
(SEQ ID NO: 28) >Hs.170177_contig3 AI620495|AW291989|AA780896|AA976262|AI298326|BF111862|AW591523|AI922518|AI480280|BF589437|AA600354|AI886238|AA035599|H90049|BF112011|N52601|AI570965|AI565367|AW768847|H90073|BE504361|N45292|AI632075|AA679729|AW168052|AI978827|AI968410|AI669255|N45300|AI651256|AI698970|AI521256|AW078614|AI802070|AI885947|AI342534|AI653624|AW243936|T16586|R15989|AI289789|AI871636|AI718785|AW148847 polyA = 2 polyA = 3
(SEQ ID NO: 29) >Hs.184601_mRNA_5 gi|4426639|gb|AF104032.1|AF104032 Homo sapiens polyA = 2
(SEQ ID NO: 30) >Hs.351972_singlet1 AA865917 polyA = 2 polyA = 3
(SEQ ID NO: 31) >Hs.5366_mRNA_2 gi|15277845|gb|BC012926.1|BC012926 Homo sapiens clone MGC: 16817 IMAGE: 3853503 polyA = 3
(SEQ ID NO: 32) >Hs.18140_contig1 AI685931|AA410954|T97707|AA706873|AI911572|AW614616|AA548520|AW027764|BF511251|AI914294|AW151688 polyA = 1 polyA = 1
(SEQ ID NO: 33) >Hs.133196_contig2 BF224381|BE467991|AW137689|AI695045|AW207361|BF445141|AA405473 polyA = 2 WARN polyA = 3
(SEQ ID NO: 34) >Hs.63325_mRNA_5 gi|15451939|ref|NM_019894.1|Homo sapiens transmembrane protease, serine 4 (TMPRSS4), mRNA polyA = 3
(SEQ ID NO: 35) >Hs.250692_mRNA_2 gi|184223|gb|M95585.1|HUMHLF Human hepatic leukemia factor (HLF) mRNA, complete cds polyA = 3
(SEQ ID NO: 36) >Hs.250726_singlet4 AW298545 polyA = 2 polyA = 3
(SEQ ID NO: 37) >Hs.79217_mRNA_2 gi|16306657|gb|BC001504.1|BC001504 Homo sapiens clone MGC: 2273 IMAGE: 3505512 polyA = 3
(SEQ ID NO: 38) >Hs.47986_mRNA_1 gi|13279253|gb|BC004331.1|BC004331 Homo sapiens clone MGC: 10940 IMAGE: 3630835 polyA = 3
(SEQ ID NO: 39) >Hs.94367_mRNA_1 gi|10440200|dbj|AK027147.1|AK027147 Homo sapiens cDNA: FLJ23494 fis, clone LNG01885 polyA = 3
(SEQ ID NO: 40) >Hs.49215_contig1 BI493248|N66529|AA452255|BI492877|AW196683|AI963900|BF478125|AI421654|BE466675 polyA = 1 polyA = 1
(SEQ ID NO: 41) >Hs.281586_contig2 R61469|R15891|AA007214|R61471|AI014624|N69765|AW592075|H09780|AA709038|AI335898|AI559229|F09750|R49594|H11055|T72573|AA935558|AA988654|AA826438|AI002431|AI299721 polyA = 1 polyA = 2
(SEQ ID NO: 42) >Hs.79378_mRNA_1 gi|16306528|ref|NM_003914.2|Homo sapiens cyclin A1 (CCNA1), mRNA polyA = 3
(SEQ ID NO: 43) >Hs.156469_contig2 AI341378|AI670817|AI701687|AI3In set 22|AW235883|AI948598|AA446356 polyA = 2 polyA = 3
(SEQ ID NO: 44) >Hs.6631_mRNA_1 gi|7020430|dbj|AK000380.1|AK000380 Homo sapiens cDNA FLJ20373 fis, clone HEP19740 polyA = 3
(SEQ ID NO: 45) >Hs.155977_contig1 AI309080|AI313045 polyA = 1 WARN polyA = 1
(SEQ ID NO: 46) >Hs.95197_mRNA_4 gi|5817138|emb|AL110274.1|HSM800829 Homo sapiens mRNA; cDNA DKFZp564I0272 (from clone DKFZp564I0272) polyA = 3
(SEQ ID NO: 47) >Hs.48956_contig1 N64339|AI569513|AI694073 polyA = 1 polyA = 1
(SEQ ID NO: 48) >Hs.118825_mRNA_10 gi|1495484|emb|X96757.1|HSSAPKK3 H. sapiens mRNA for MAP kinase kinase polyA = 3
(SEQ ID NO: 49) >Hs.135118_contig3 AI683181|AI082848|AW770198|AI333188|AI873435|AW169942|AI806302|AW340718|BF196955|AA909720 polyA = 1 polyA = 2
(SEQ ID NO: 50) >Hs.171857_mRNA_1 gi|13161080|gb|AF332224.1|AF332224 Homo sapiens testis protein mRNA, partial cds polyA = 3
(SEQ ID NO: 51) >Hs.18910_mRNA_3 gi|12804464|gb|BC001639.1|BC001639 Homo sapiens clone MGC: 1944 IMAGE: 2959372 polyA = 3
(SEQ ID NO: 52) >Hs.194774_mRNA_1 gi|16306633|gb|BC001492.1|BC001492 Homo sapiens clone MGC: 1774 IMAGE: 3510004 polyA = 3
(SEQ ID NO: 53) >Hs.127428_mRNA_2 gi|16306818|gb|BC006537|BC006537 Homo sapiens clone MGC: 1934 IMAGE: 2987903 polyA = 3
(SEQ ID NO: 54) >Hs.126852_contig1 AI802118|BF197404|BF224434|AA931964|AW236083|AI253119|AW614335|AI671372|AI793240|AW006851|AI953604|AI640505|AI633982|AI195809|AI493069|AW058576|AW293622 polyA = 2 polyA = 3
(SEQ ID NO: 55) >Hs.28149_mRNA_1 gi|14714936|gb|BC010626.1|BC010626 Homo sapiens clone MGC: 17687 IMAGE: 3865868 polyA = 3
(SEQ ID NO: 56) >Hs.35453_mRNA_3 gi|7018494|emb|AL157475.1|HSM802461 Homo sapiens mRNA; cDNA DKFZp761G151 (from clone DKFZp761G151); partial cds polyA = 3
(SEQ ID NO: 57) >Hs.180570_contig1 R08175|AA707224|AA699986|R11209|W89099|T98002|AA494546 polyA = 2 polyA = 3
(SEQ ID NO: 58) >Hs.196270_mRNA_1 gi|11545416|gb|AF283645.1|AF283645 Homo sapiens chromosome 8 map 8q21 polyA = 3
(SEQ ID NO: 59) >Hs.9030_mRNA_3 gi|12652600|gb|BC000045.1|BC000045 Homo sapiens clone MGC: 2032 IMAGE: 3504527 polyA = 3
(SEQ ID NO: 60) >Hs.1282_mRNA_3 gi|4559405|ref|NM_000065.1| Homo sapiens complement component 6 (C6), mRNA polyA = 1
(SEQ ID NO: 61) >Hs.268562_mRNA_2 gi|15341874|gb|BC013117.1|BC013117 Homo sapiens clone MGC: 8711 IMAGE: 3882749 polyA = 3
(SEQ ID NO: 62) >Hs.151301_mRNA_3 gi|16041747|gb|BC015754.1|BC015754 Homo sapiens clone MGC: 23085 IMAGE: 4862492 polyA = 3
(SEQ ID NO: 63) >Hs.111_contig1 AA946776|AW242338|H24724|AI078616 polyA = 1 polyA = 2
(SEQ ID NO: 64) >Hs.150753_contig1 AI123582|AI288234 polyA = 0 polyA = 0
(SEQ ID NO: 65) >Hs.82109_mRNA_1 gi|14250611|gb|BC008765.1|BC008765 Homo sapiens clone MGC: 1622 IMAGE: 3347793 polyA = 3
(SEQ ID NO: 66) >Hs.44276_mRNA_2 gi|12654896|gb|BC001293.1|BC001293 Homo sapiens clone MGC: 5259 IMAGE: 3458115 polyA = 3
(SEQ ID NO: 67) >Hs.2142_mRNA_4 gi|13325274|gb|BC004453.1|BC004453 Homo sapiens clone MGC: 4303 IMAGE: 2819400 polyA = 3
(SEQ ID NO: 68) >Hs.180908_contig1 AA846824|AW611680|AA846182|AA846342|AA846360 polyA = 2 polyA = 3
(SEQ ID NO: 69) >Hs.89436_mRNA_1 gi|16507959|ref|NM_004063.2| Homo sapiens cadherin 17, LI cadherin (liver-intestine) (CDH17), mRNA polyA = 1
(SEQ ID NO: 70) >Hs.151544_mRNA_8 gi|3153107|emb|AL023657.1|HSDSHP Homo sapiens SH3D1A cDNA, formerly known as DSHP polyA = 3
(SEQ ID NO: 71) >Hs.1657_contig4 AW473119|AA164586|AI540656|AI758480|AI810941|AI978964|AI675862|AI784397|AW591562|AW514102|AI888116|AI983175|AI634735|AI669577|AI202659|AI910598|AI961352|AI565481|AI886254|AI538838|AA291749|AW571455|AI370308|AI274727|AW473925|AW514787|AI273871|AW470552|AI524356|AI888281|AW089672|AI952766|AW440601|AI654044|AW438839|AI972926 polyA = 2 polyA = 3
(SEQ ID NO: 72) >Hs.35894_mRNA_1 gi|6049161|gb|AF133587.1|AF133587 Homo sapiens chromosome 22 map 22q11.2 polyA = 3
(SEQ ID NO: 73) >Hs.334534_mRNA_2 gi|17389403|gb|BC017742.1|BC017742 Homo sapiens, clone IMAGE: 4391536, mRNA polyA = 3
(SEQ ID NO: 74) >Hs.60162_mRNA_1 gi|10437644|dbj|AK025181.1|AK025181 Homo sapiens cDNA: FLJ21528 fis, clone COL05977 polyA = 3
[0081] As would be understood by the skilled person, detection of expression of any of the above identified sequences, as well as sequences of the set of 90 below, or the sequences provided in the attached Sequence Listing may be performed by the detection of expression of any appropriate portion or fragment of these sequences. Preferably, the portions are sufficiently large to contain unique sequences relative to other sequences expressed in a cell containing sample. Moreover, the skilled person would recognize that the disclosed sequences represent one strand of a double stranded molecule and that either strand may be detected as an indicator of expression of the disclosed sequences. This follows because the disclosed sequences are expressed as RNA molecules in cells which are preferably converted to cDNA molecules for ease of manipulation and detection. The resultant cDNA molecules may have the sequences of the expressed RNA as well as those of the complementary strand thereto. Thus either the RNA sequence strand or the complementary strand may be detected. Of course it is also possible to detect the expressed RNA without conversion to cDNA.
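As a non-limiting illustration of detecting either strand, the following Python sketch computes the complementary strand of a DNA sequence and tests whether a fragment matches a target on either strand. The function names and the exact-match criterion are assumptions of this sketch; actual hybridization tolerates mismatches and depends on assay conditions.

```python
# Translation table for DNA bases (N is kept as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def detects(fragment, target):
    """True if `fragment` occurs in `target` on either strand.

    Exact substring matching is a simplification used only to
    illustrate that both strands carry the same information.
    """
    return fragment in target or reverse_complement(fragment) in target

rc = reverse_complement("ATGC")
hit = detects("GCAT", "TTATGCTT")   # matches via the complementary strand
miss = detects("GGGG", "TTATGCTT")
```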
[0082] In some embodiments of the invention, the expression levels of gene sequences are measured by detection of expressed sequences in a cell containing sample as hybridizing to the following oligonucleotides, which correspond to the above sequences as indicated by the accession numbers provided.
TABLE-US-00002
>AF133587 (SEQ ID NO: 75)
CCCGGATCGCCATCAGTGTCATCGAGTTCAAACCCTGAGCCCTTCATTCACCTCTGTGAG
>BC017742 (SEQ ID NO: 76)
TGCCCTTGCTCTGTGTCATCTCAGTCATTTGACTTAGAAAGTGCCCTTCAAAAGGACCCT
>BF437393 (SEQ ID NO: 77)
GGAGGGAGGGCTAATTATATATTTTGTTGTTCCTCTATACTTTGTTCTGTTGTCTGCGCC
>AI620495 (SEQ ID NO: 78)
CAGTTTGGATTGTATAATAACGCCAAGCCCAGTTGTAGTCGTTTGAGTGCAGTAATGAAA
>AK000380 (SEQ ID NO: 79)
AAATCAGAGTAACCCTTTCTGTATTGAGTGCAGTGTTTTTTACTCTTTTCTCATGCAGAT
>BC009237 (SEQ ID NO: 80)
TGCCTGGCACAAAGAAGGAAGAATATAAATGATAGTTCCACTCGTCTGTGGAAGAACTTA
>BC008765 (SEQ ID NO: 81)
AGTCTTTTGCTTTTGGCAAAACTCTACTTAATCCAATGGGTTTTTCCCTGTACAGTAGAT
>BC001504 (SEQ ID NO: 82)
GGTTACTGTGGGTGGAATAGTGGAGGCCTTCAACTGATTAGACAAGGCCCGCCCACATCT
>NM_019894 (SEQ ID NO: 83)
TAAAATGCACTGCCCTACTGTTGGTATGACTACCGTTACCTACTGTTGTCATTGTTATTA
>BF224381 (SEQ ID NO: 84)
TTCTCTTTTGGGGGCAAACACTATGTCCTTTTCTTTTTCTAGATACAGTTAATTCCTGGA
>AL157475 (SEQ ID NO: 85)
AAGACCCACACCCTGTAGCAATACCAAGTGCTATTACATAATCAATGGACGATTTATACT
>AY033998 (SEQ ID NO: 86)
AGTGTTGCAAGTTTCCTTTAAAACCAACAAAGCCCACAAGTCCTGAATTTCCCATTCTTA
>H07885 (SEQ ID NO: 87)
GTCACTGTCATAGGAGCTGTGATTTCACAAGGAAGGGTGCTGCAGGGGGACCTGGTTGAT
>NM_004496 (SEQ ID NO: 88)
TTTCATCCAGTGTTATGCACTTTCCACAGTTGGTGTTAGTATAGCCAGAGGGTTTCATTA
>AA846824 (SEQ ID NO: 89)
GGGAAGTAGGGATTATTCGTTTAAATTCAATCGCGAGCACCAAGTCGGACTGGCCGGGGA
>BC017586 (SEQ ID NO: 90)
GGGACCAGGCCCTGGGACAGCCATGTGGCTCCAAATGACTAAATGTCAGCTCAAAAACCA
>AA456140 (SEQ ID NO: 91)
TCCGTTTATGGAGGCAATTCCATATCCTTTCTTGAACGCACATTCAGCTTACCCCAGAGA
>NM_002283 (SEQ ID NO: 92)
AGAGTTAAGCCACTTCCTGGGTCTCCTTCTTATGACTGTCTATGGGTGCATTGCCTTCTG
>AL023697 (SEQ ID NO: 93)
GTGGCCTGAGTAATGCATTATGGGTGGTTTACCATTTCTTGAGGTAAAAGCATCACATGA
>BC001639 (SEQ ID NO: 94)
ACACATGCATGTGTCTGTGTATGTGTGAATGTGAGAGAGACACAGCCCTCCTTTCAGAAG
>BC015754 (SEQ ID NO: 95)
TCTGTAACTGCACAACCCTGGGGTTTGCTGCAGAGCTATTTCTTTCCATGTAAAGTAGTG
>AF332224 (SEQ ID NO: 96)
AAACACTCTTTCCGACTCCAGAGGAGAAGCTGGCAGCTCTCTGTAAGAAATATGCTGATC
>BC001270 (SEQ ID NO: 97)
GCTTCCTCTATCGCCCAATGCAAAATCGATGAAATGGGGAGTTCTCTGGGCCAGGCCACA
>AI147926 (SEQ ID NO: 98)
GTAGAATCCTCTGTTCATAATGAACAAGATGAACCAATGTGGATTAGAAAGAAGTCCGAG
>AW298545 (SEQ ID NO: 99)
CTGTTTTAAAACTGAATGGCACGAAATTGTTTTCCTCAACTCGGAGATTCCTGTATGGAG
>AI802118 (SEQ ID NO: 100)
AATAAATAGTAGCTCTGCTGATGATGACGTTGATAACCAAACTGTTCTGTGGTCTTAAGT
>AI683181 (SEQ ID NO: 101)
CAAACAGCCCGGTCTTGATGCAGGAGAGTCTGGAAAAGGAAGAAAATGGTTTCAGTTTCA
>M95585 (SEQ ID NO: 102)
AACATGGACCATCCAAATTTATGGCCGTATCAAATGGTAGCTGAAAAAACTATATTTGAG
>AK027147 (SEQ ID NO: 103)
TTGTAATCATGCCAATTCCAGATCAATAACTGCATGTCTGTTCTTTGGTAGAAATAGCTT
>AW291189 (SEQ ID NO: 104)
AAAGATTATTAACCCAAATCACCTTTCTTGCTTACTCCAGATGCCTCAGCCTCTGATATA
>AI632869 (SEQ ID NO: 105)
GACTTCCTTTAGGATCTCAGGCTTCTGCAGTTCTCATGACTCCTACTTTTCATCCTAGTC
>BC006537 (SEQ ID NO: 106)
CTGTATATTTTGCAATAGTTACCTCAAGGCCTACTGACCAAATTGTTGTGTTGAGATGAT
>R61469 (SEQ ID NO: 107)
TGTTCAAACAGACTTTAACCTCTGCATCATACTTAACCCTGCGACATGCGTACAGTATGC
>BC009084 (SEQ ID NO: 108)
TGAGTCATATACATTTACTGACCACTGTTGCTTGTTGCTCACTGTGCTGCTTTTCCATGA
>N64339 (SEQ ID NO: 109)
CTGAAATGTGGATGTGATTGCCTCAATAAAGCTCGTCCCCATTGCTTAAGCCTTCAAAAA
>AI200660 (SEQ ID NO: 110)
ATCAAGAAAACCTAATCTTCTGACTCCCAGGCCAGGATGTTTTATTTCTCACATCATGTC
>AK054605 (SEQ ID NO: 111)
TTCATTTCCAAACATCATCTTTAAGACTCCAAGGATTTTTCCAGGCACAGTGGCTCATAC
>NM_006115 (SEQ ID NO: 112)
AGTTAGAAATAGAATCTGAATTTCTAAAGGGAGATTCTGGCTTGGGAAGTACATGTAGGA
>X96757 (SEQ ID NO: 113)
CAATTTTCTTTTTACTCCCCCTCTTAAGGGGGCCTTGGAATCTATAGTATAGAATGAACT
>AI804745 (SEQ ID NO: 114)
GGGTGGAGTTTCAGTGAGAATAAACGTGTCTGCCTTTGTGTGTGTGTATATATACAGAGA
>AJ000388 (SEQ ID NO: 115)
CTCGCTCATTTTTTACCATGTTTTCCAGTCTGTTTAACTTCTGCAGTGCCTTCACTACAC
>BC008764 (SEQ ID NO: 116)
CTTTGGGCCGAGCACTGAATGTCTTGTACTTTAAAAAAATGTTTCTGAGACCTCTTTCTA
>AI309080 (SEQ ID NO: 117)
CTGGACCCTTGGAGCAGTGTTGTGTGAACTTGCCTAGAACTCTGCCTTCTCCGTTGTCAA
>AA845917 (SEQ ID NO: 118)
CCACCTCCTTCGACCTCCACTGCGCCCCACCTCCCTGCCTGTGTGTGTTATTTCAAAGGA
>AA946776 (SEQ ID NO: 119)
TCTGGCTGGTGGCCTGCGCGAGGGTGCAGTCTTACTTAAAAGACTTTCAGTTAATTCTCA
>AF104032 (SEQ ID NO: 120)
AGATGCTGTCGGCACCATGTTTATTTATTTCCAGTGGTCATGCTCAGCCTTGCTGCTCTG
>AW194680 (SEQ ID NO: 121)
TCCTTCCTCTTCGGTGAATGCAGGTTATTTAAACTTTGGGAAATGTACTTTTAGTCTGTC
>BC001293 (SEQ ID NO: 122)
GTCCTGTCCCTGTCTGGGAGTTGTGTTATTTAAAGATATTCTGTATGTTGTATCTTTTGC
>BE962007 (SEQ ID NO: 123)
ATTATATTTCAGGTGTCCTGAACAGGTCACTAGACTCTACATTGGGCAGCCTTTAAATAT
>BI493248 (SEQ ID NO: 124)
AGGAATGGTACTACCGTTCCAGATTTTCTGTAATTGCTTCTGCAAAGTAATAGGCTTCTT
>AF283645 (SEQ ID NO: 125)
CTGTACCCAAAGGATGCCAGAATACTAGTATTTTTATTTATCGTAAACATCCACGAGTGC
>AI669760 (SEQ ID NO: 126)
ATTGCCCCCCTAACCAATCATGCAAACTTTTCCCCCCCTGGGGTAATTCACCAGTTAAAA
>BC001492 (SEQ ID NO: 127)
CCCACAGTATTTAATGCCCTGTCAGTCCCTTCTAGTCTGACTCAATGGTAACTTGCTGTA
>BC004453 (SEQ ID NO: 128)
AAAACCAACTCTCTACTACACAGGCCTGATAACTCTGTACGAGGCTTCTCTAACCCCTAG
>BC010626 (SEQ ID NO: 129)
CTCAGACTGGGCTCCACACTCTTGGGCTTCAGTCTGCCCATCTGCTGAATGGAGACAGCA
>BC013117 (SEQ ID NO: 130)
CCTAATGGGGATTCCTCTGGTTGTTCACTGCCAAAACTGTGGCATTTTCATTACAGGAGA
>BC011949 (SEQ ID NO: 131)
CACTCACAATTGTTGACTAAAATGCTGCCTTTAAAACATAGGAAAGTAGAATGGTTGAGT
>AW083022 (SEQ ID NO: 132)
CTTTGAAGGGCTGCTGCACATTGTTGAATCCATCGACCTTTAGCTGCAATGGGATCTCTA
>R08175 (SEQ ID NO: 133)
TGCCTCATCGATATTATAGGGGTCCATCACAACCCAACTGTGTGGCCGGATCCTGAGTCT
>NM_000065 (SEQ ID NO: 134)
AAAACAGACAAAAGCCTTTGCCTTCATGAAGCATACATTCATTCAGGGGTAGACACACAA
>AK025181 (SEQ ID NO: 135)
TAACAAACAAAGGCAGTAGCTCATCACTTGGGTAGCAGGTACCCATTTTAGGACCCTACA
>NM_003914 (SEQ ID NO: 136)
ATATCAGAAGTGCCAATAATCGTCATAGGCTTCTGCACGTTGGATCAACTAATGTTGTTT
>AI123582 (SEQ ID NO: 137)
ATCATAGCCCAACCATGTGAGAAGAAGGAGAAGGCCCCCCTTTCTTCATTAATCTGAAAA
>BC004331 (SEQ ID NO: 138)
GCAGACCATTCTATCATACCTGGCAGGGCTTCTGTTTTATTTTGTAGGCTGGATGCTACC
>AI341378 (SEQ ID NO: 139)
ACTACAAGCCTCTTGTTTTTCACCAAAACCCTACATCTCAGGCTTACTAATTTTTGTGAT
>NM_004063 (SEQ ID NO: 140)
GCCATGCATACATGCTGCGCATGTTTTCTTCATTCGTATGTTAGTAAAGTTTTGGTTATT
>BC012926 (SEQ ID NO: 141)
CACCTATTTATTTTACCTCTTTCCCAAACCTGGAGCATTTATGCCTAGGCTTGTCAAGAA
>AL110274 (SEQ ID NO: 142)
GTGGACATAGCCACTAACCAACTAGTTACCTTTGGACTGCAACAAAAAATGTGAAAATGA
>AW473119 (SEQ ID NO: 143)
ACTTGTAAACCTCTTTTGCACTTTGAAAAAGAATCCAGCGGGATGCTCGAGCACCTGTAA
>AI685931 (SEQ ID NO: 144)
AATTCTCTATAAACGGTTCACCAGCAAACCACCAATACATTCCATTGTTTGCCTAGAGAG
>BF592799 (SEQ ID NO: 145)
AATGGCCCATGCATGCTGTTTGCAGCAGTCAATTGAGTTGAATTAGAATTCCAACCATAC
>BC000045 (SEQ ID NO: 146)
GAGCTCAGTACTTGCCCTGTGAAAATCCCAGAAGCCCCCGCTGTCAATGTTCCCCATCCA
>BC015582 (SEQ ID NO: 147)
ATGAAGCGGAATTAGGCTCCCGAGCTAAGGGACTCGCCTAGGGTCTCACAGTGAGTAGGA
>M60502 (SEQ ID NO: 148)
AGTGGCTATATCAACATCAGGGCTAGCACATCTTTCTCTATTATCCTTCTATTGGAATTC
[0083] The invention also provides a second group of 90 gene sequences from which 50 or more may be used in the practice of the invention. The 50 to 90 gene sequences may be used along with the determination of expression levels of additional sequences so long as the expression levels of gene sequences from the set of 90 are used in classifying. A non-limiting example of such embodiments of the invention is where the expression of the 90 gene sequences, or at least 50 (or 50 to about 90) members thereof, is measured along with the expression levels of a plurality of other sequences, such as by use of a microarray based platform used to perform the invention. Where those other expression levels are not used in classification, they may be considered the results of "excess" transcribed sequences and not critical to the practice of the invention. Alternatively, and where those other expression levels are used in classification, they are within the scope of the invention, where the use of the above described sequences does not necessarily exclude the use of expression levels of additional sequences.
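As a non-limiting illustration of the handling of "excess" transcribed sequences, the following Python sketch keeps only those measured expression levels that belong to a chosen set and checks that at least 50 members of the set were measured. The function name `select_panel` and the placeholder accession names (`G0`, `X0`, and so on) are hypothetical and stand in for the accession numbers of either set.

```python
def select_panel(measured, panel, minimum=50):
    """Split measured expression levels into panel members and excess.

    measured -- dict mapping accession -> expression level
    panel    -- set of accessions forming the set of 74 or the set of 90
    minimum  -- fewest panel members required for classifying
    """
    used = {acc: level for acc, level in measured.items() if acc in panel}
    if len(used) < minimum:
        raise ValueError(
            f"only {len(used)} panel sequences measured; need {minimum}")
    # Everything else was measured but is not used in classifying.
    excess = sorted(set(measured) - set(used))
    return used, excess

# Placeholder data: a 90-member panel, 60 members measured on an array
# together with 10 sequences outside the panel.
panel = {f"G{i}" for i in range(90)}
measured = {f"G{i}": float(i) for i in range(60)}
measured.update({f"X{i}": 0.0 for i in range(10)})
used, excess = select_panel(measured, panel)
```

The `used` levels would then be passed to the classifier, while the `excess` levels may simply be ignored or, alternatively, used as additional inputs as described above.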
[0084] 38 members of the set of 90 are included in the first set of 74 described above. The accession numbers of these members in common between the two sets are AA456140, AA846824, AA946776, AF332224, AI620495, AI632869, AI802118, AI804745, AJ000388, AK025181, AK027147, AL157475, AW194680, AW291189, AW298545, AW473119, BC000045, BC001293, BC001504, BC004453, BC006537, BC008765, BC009084, BC011949, BC012926, BC013117, BC015754, BE962007, BF224381, BF437393, BI493248, M60502, NM_000065, NM_003914, NM_004063, NM_004496, NM_006115, and R61469. mRNA sequences corresponding to members of the set of 90 that are not present in the set of 74 gene sequences are also provided in the Sequence Listing and identified as SEQ ID NOS: 149-200. The listing of identifying information for these 52 unique members by accession numbers, as well as corresponding oligonucleotide sequences which may be used in the practice of the invention, is provided by the following.
TABLE-US-00003
>R15881 (SEQ ID NO: 201)
ACTTCTGGTGATGATAAAAATGGTTTTATCACCCAGATGTGAAAGAAGCTGCCTGTTTAC
>I041545 (SEQ ID NO: 202)
GTGGTTCTGTAAAAACGCAGAGGAAAAGAGCCAGAAGGTTTCTGTTTAATGCATCTTGCC
>NM_024423 (SEQ ID NO: 203)
TTTATAAGGAAGCAGCTGTCTAAAATGCAGTGGGGTTTGTTTTGCAATGTTTTAAACACA
>AB038160 (SEQ ID NO: 204)
CTTATGAAGCTGGCCGGGCCACTCACGTTCAATGGTACATCTGGGTCTCTATGTGGTTCT
>AB026790 (SEQ ID NO: 205)
GTGAGCCAGCATTTCCCATAGCTAACCCTATTCTCTTAGTCTTTCAAAATGTAGAATGGG
>BC012727 (SEQ ID NO: 206)
CTTTACACCTGATAAAATATTTTGCGAAGAGAGGTGTTCTTTTTCCTTACTGGTGCTGAA
>BC016451 (SEQ ID NO: 207)
GCATACATCTCATCCACAGGGGAAGATAAAGATGGTCACACAAACAGTTTCCATAAAGAT
>H09748 (SEQ ID NO: 208)
TGAGTTCAGCATGTGTCTGTCCATTTCATTTGTACGCTTGTTCAAAACCAAGTTTGTTCT
>NM_006142 (SEQ ID NO: 209)
AAGACCGAGACTGAGGGAAAGCATGTCTGCTGGGTGTGACCATGTTTCCTCTCAATAAAG
>AF191770 (SEQ ID NO: 210)
GGCATCTGGCCCCTGGTAGCCAGCTCTCCAGAATTACTTGTAGGTAATTCCTCTCTTCAT
>NM_006378 (SEQ ID NO: 211)
TGGATGTTTGTGCGCGTGTGTGGACAGTCTTATCTTCCAGCATGATAGGATTTGACCATT
>BC006819 (SEQ ID NO: 212)
TCCTGGCAGAGCCATGGTCCCAGGCTTCCCAAAAGTGTTTGTGGCAATTATTCCCCTAGG
>X79676 (SEQ ID NO: 213)
TTTGATGATAGCAGACATTGTTACAAGGACATGGTGAGTCTATTTTTAATGCACCAATCT
>BC006811 (SEQ ID NO: 214)
TTCTTCCAGTTGCACTATTCTGAGGGAAAATCTGACACCTAAGAAATTTACTGTGAAAAA
>NM_000198 (SEQ ID NO: 215)
GAACAATTGTGGTCTCTCTTAACTTGAGGTTCTCTTTTGACTAATAGAGCTCCATTTCCC
>AF301598 (SEQ ID NO: 216)
GTTAAGTGTGGCCAAGCGCACGGCGGCAAGTTTTCAAGCACTGAGTTTCTATTCCAAGAT
>NM_002847 (SEQ ID NO: 217)
CGGCCTACTGAGCGGACAGAATGATGCCAAAATATTGCTTATGTCTCTACATGGTATTGT
>NM_004062 (SEQ ID NO: 218)
CAGGGTGTTTGCCCAATAATAAAGCCCCAGAGAACTGGGCTGGGCCCTATGGGATTGGTA
>AW118445 (SEQ ID NO: 219)
TGTACAGTTTGGTTGTTGCTGTAAATATGGTAGCGTTTTGTTGTTGTTGTTTTTTCATGC
>BC002551 (SEQ ID NO: 220)
TACCAAACTGGGACTCACAGCTTTATTGGGCTTTCTTTGTGTCTTGTGTGTTTCTTTTAT
>AA765597 (SEQ ID NO: 221)
CATTGAGGTTTGGATGGTGGCAGGTAAAACAGAAAGGCAAGATGTCATCTGACATTAGGC
>AL137761 (SEQ ID NO: 222)
AGTTCAGCACTGTGGTTATCATTGGTGATGCCAGAAAACATTAGTAGACTTAGACAATTG
>X78202 (SEQ ID NO: 223)
TAAAATTTCTTGATTGTGACTATGTGGTCATATGCCCGTGTTTGTCACTTACAAAAATGT
>AK025615 (SEQ ID NO: 224)
AGCCATCTGGTGTGAAGAACTCTATATTTGTATGTTGAGAGGGCATGGAATAATTGTATT
>BC001665 (SEQ ID NO: 225)
CTTATTGTCACTGGTTAAGAACTTGGCGAGATTGAAGGGCTTTTGTTATTGTTGTTGGAT
>AI985118 (SEQ ID NO: 226)
CTTTCTAGTGAGCTAACCGTAACAGAGAGCCTACAGGATACACGTGAGATAATGTCACGT
>AL039118 (SEQ ID NO: 227)
TTGTCTTAAAATTTCTTGATTGTGATACTGTGGTCATATGCCCGTGTTTGTCACTTACAA
>AA782845 (SEQ ID NO: 228)
CCTGGGGGAAAGGGGCATTCATGACCTGAACTTTTTAGCAAATTATTATTCTCAGTTTCC
>BC016340 (SEQ ID NO: 229)
TTCATTAACAGTACTAAGTGGAAGGGATCTGCAGATTCCAAATTGGAATAAGCTCTATCA
>AA745593 (SEQ ID NO: 230)
CCAATGCAGAAGAGTATTAAGAAAGATGCTCAAGTCCCATGGCACAGAGCAAGGCGGGCA
>NM_004967 (SEQ ID NO: 231)
CAAGGCTACGATGGCTATGATGGTCAGAATTACTACCACCACCAGTGAAGCTCCAGCCTG
>BPS510316 (SEQ ID NO: 232)
AGCTCACAGCTGGACAGGTGTTGTATATAGAGTGGAATCTCTTGGATGCAGCTTCAAGAA
>A993639 (SEQ ID NO: 233)
TCCAAAGTAGAAAGGGTTCTTTTAGAAAACTTGAAGAATGTGCCTCCTCTTAGCATCTGT
>AV656862 (SEQ ID NO: 234)
GATGCATTTTTCAGTCCCTTTTCAGAGCAAATGCTTTTGCAATGGTAGTAATGTTTAGTT
>X69699 (SEQ ID NO: 235)
CCTGTGGGGCTTCTCTCCTTGATGCTTCTTTCTTTTTTTAAAGACAACCTGCCATTACCA
>BC013282 (SEQ ID NO: 236)
TTGCACTAAGTCATGCTGTTTCCTCAAAGAAGCTTTGTTTTTTGTTAACGTATTACTCAG
>AI457360 (SEQ ID NO: 237)
CTGGATCCCAGGCCCTGGCACCCCTCAGGAAATACAAGAAAAAGAATATTCACATCTGTT
>AW445220 (SEQ ID NO: 238)
TTAGAGGGGCCACCTATCAACTCATCAGTGTTCAAAGAATATGCTGGGAGCATGGGTGAG
>AF038191 (SEQ ID NO: 239)
GGCCCATTTATGTCCCTCATGTCTCTAGATTTTCTCGTCACCCAGCCTCAAAAATATATG
>X05615 (SEQ ID NO: 240)
TCCCCAAAAACCTCACCCGAGGCTGCCCACTATGGTCATCTTTTTCTCTAAAATAGTTAC
>BC005364 (SEQ ID NO: 241)
GAAATTCCTCACACCTTGCACCTTCCCTACTTTTCTGAATTGCTATGACTACTCCTTGTT
>AK025701 (SEQ ID NO: 242)
TGTCTGTCCACCACGAGATGGGAGGAGGAGAAAAAGCGGTACGATGCCTTCCTGACCTCA
>BF446419 (SEQ ID NO: 243)
GTCTTATCTCTCAGGGGGGGTTTAAGTGCCGTTTGCAATAATGTCGTCTTATTTATTTAG
>AK025470 (SEQ ID NO: 244)
CCGAGTAGTATGGGTCTCTGTGTGAGAAACCAGGAGATATTTTCATCTTGTTCGGAAATA
>BE552004 (SEQ ID NO: 245)
TTGTGCAAAAGTCCCACAACCTTTCTGGATTGATAGTTTGTGGTGAAATAAACAATTTTA
>H05388 (SEQ ID NO: 246)
TCCAGTATTCTGCAGGGCCAGTCAGTTGTACAGAAGTTGGAATATTCTGTTCCAGAATTA
>NM_033229 (SEQ ID NO: 247)
GTCTCGAACAGCGGTTGTTTTTACTTTATTTATCTTAGGCCCTCAGCTCCCTGACGTCCT
>BC010437 (SEQ ID NO: 248)
AGTGAATCTTTTCCTCTTGGTAGCATCAACACTGGGGATAAATCAGAACCATTCTGTGGA
>AI952953 (SEQ ID NO: 249)
TGAGAGCCCAGAACAAGAAGGAGCAGAAGGGCACTTTGACCTTCATTATTATGAAAATCA
>R45389 (SEQ ID NO: 250)
GGAAGAACTGATGCTTGCTGCTAACTAAAGTTTTGGATGTATCGATTTAGAGAACCAATT
>NM_001337 (SEQ ID NO: 251)
GAATGAGAGAATAAGTCATGTTCCTTCAAGATCATGTACCCCAATTTACTTGCCATTACT
>AI499593 (SEQ ID NO: 252)
TACGGAAAGGAAACAGGTTATACTCTTAGATTTAAAAAGTGAAAGAAACTGCAGGCGCCT
[0085] In some embodiments of the invention, the expression levels of gene sequences are measured by detection of expressed sequences in a cell containing sample as hybridizing to the above oligonucleotides, which correspond to sequences in the Sequence Listing as indicated by the accession numbers provided.
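As a non-limiting illustration, oligonucleotide listings of the form used above (an accession number preceded by ">", a SEQ ID NO label, and a nucleotide sequence) may be read into a lookup table with a short Python sketch such as the following. The function name `parse_oligos` and the two placeholder records are hypothetical and are not entries of the tables.

```python
import re

# One record: ">ACCESSION (SEQ ID NO: n) SEQUENCE", separated by any
# whitespace (spaces or line breaks).
RECORD = re.compile(r">(\S+)\s*\(SEQ ID NO:\s*(\d+)\)\s*([ACGT]+)")

def parse_oligos(text):
    """Map accession -> (SEQ ID NO, oligonucleotide sequence)."""
    return {acc: (int(num), seq) for acc, num, seq in RECORD.findall(text)}

# Placeholder records (hypothetical accessions and sequences).
example = ">ACC0001 (SEQ ID NO: 1) ACGTACGT >ACC0002 (SEQ ID NO: 2)\nTTGACCA"
oligos = parse_oligos(example)
```

Any record whose label does not match the expected form is silently skipped by this pattern, so comparing the number of parsed records against the expected number of oligonucleotides is advisable.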
[0086] In additional embodiments, the invention provides for use of any number of the gene sequences of the set of 74 or the set of 90 in the methods of the invention. Thus anywhere from 1 to all of the 50 or more gene sequences used in the invention may be from either or both of the above sets. So from one, two, three, four, five, six, seven, eight, nine, ten, or more of the 50 or more sequences may be from the set of 74 or the set of 90.
[0087] As used herein, a "tumor sample" or "tumor containing sample" or "tumor cell containing sample" or variations thereof, refers to a cell containing sample of tissue or fluid isolated from an individual suspected of being afflicted with, or at risk of developing, cancer. The samples may contain tumor cells which may be isolated by known methods or other appropriate methods as deemed desirable by the skilled practitioner. These include, but are not limited to, microdissection, laser capture microdissection (LCM), or laser microdissection (LMD) before use in the instant invention. Alternatively, undissected cells within a "section" of tissue may be used. Non-limiting examples of such samples include primary isolates (in contrast to cultured cells) and may be collected by any non-invasive or minimally invasive means, including, but not limited to, ductal lavage, fine needle aspiration, needle biopsy, the devices and methods described in U.S. Pat. No. 6,328,709, or any other suitable means recognized in the art. Alternatively, the sample may be collected by an invasive method, including, but not limited to, surgical biopsy.
[0088] The detection and measurement of transcribed sequences may be accomplished by a variety of means known in the art or as deemed appropriate by the skilled practitioner. Essentially, any assay method may be used as long as the assay reflects, quantitatively or qualitatively, expression of the transcribed sequence being detected.
[0089] The ability to classify tumor samples is provided by the recognition of the relevance of the level of expression of the gene sequences (whether randomly selected or specified) and not by the form of the assay used to determine the actual level of expression. An assay of the invention may utilize any identifying feature of an individual gene sequence as disclosed herein as long as the assay reflects, quantitatively or qualitatively, expression of the gene in the "transcriptome" (the transcribed fraction of genes in a genome) or the "proteome" (the translated fraction of expressed genes in a genome). Additional assays include those based on the detection of polypeptide fragments of the relevant member or members of the proteome. Non-limiting examples of the latter include detection of proteolytic fragments found in a biological fluid, such as blood or serum. Identifying features include, but are not limited to, unique nucleic acid sequences used to encode (DNA), or express (RNA), said gene or epitopes specific to, or activities of, a protein encoded by a gene sequence.
[0090] Additional means include detection of nucleic acid amplification as indicative of increased expression levels and nucleic acid inactivation, deletion, or methylation, as indicative of decreased expression levels. Stated differently, the invention may be practiced by assaying one or more aspect of the DNA template(s) underlying the expression of each gene sequence, of the RNA used as an intermediate to express the sequence, or of the proteinaceous product expressed by the sequence, as well as proteolytic fragments of such products. As such, the detection of the presence of, amount of, stability of, or degradation (including rate) of, such DNA, RNA and proteinaceous molecules may be used in the practice of the invention.
[0091] In some embodiments, all or part of a gene sequence may be amplified and detected by methods such as the polymerase chain reaction (PCR) and variations thereof, such as, but not limited to, quantitative PCR (Q-PCR), reverse transcription PCR (RT-PCR), and real-time PCR (including as a means of measuring the initial amounts of mRNA copies for each sequence in a sample), optionally real-time RT-PCR or real-time Q-PCR. Such methods would utilize one or two primers that are complementary to portions of a gene sequence, where the primers are used to prime nucleic acid synthesis. The newly synthesized nucleic acids are optionally labeled and may be detected directly or by hybridization to a polynucleotide of the invention. The newly synthesized nucleic acids may be contacted with polynucleotides (containing gene sequences) of the invention under conditions which allow for their hybridization. Additional methods to detect the expression of expressed nucleic acids include RNAse protection assays, including liquid phase hybridizations, and in situ hybridization of cells.
[0092] Alternatively, the expression of gene sequences in FFPE samples may be detected as disclosed in U.S. applications 60/504,087, filed Sep. 19, 2003, Ser. No. 10/727,100, filed Dec. 2, 2003, and Ser. No. 10/773,761, filed Feb. 6, 2004 (all three of which are hereby incorporated by reference as if fully set forth). Briefly, the expression of all or part of an expressed gene sequence or transcript may be detected by use of hybridization mediated detection (such as, but not limited to, microarray, bead, or particle based technology) or quantitative PCR mediated detection (such as, but not limited to, real time PCR and reverse transcriptase PCR) as non-limiting examples. The expression of all or part of an expressed polypeptide may be detected by use of immunohistochemistry techniques or other antibody mediated detection (such as, but not limited to, use of labeled antibodies that bind specifically to at least part of the polypeptide relative to other polypeptides) as non-limiting examples. Additional means for analysis of gene expression are available, including detection of expression within an assay for global, or near global, gene expression in a sample (e.g. as part of a gene expression profiling analysis such as on a microarray). Non-limiting examples include linear RNA amplification and those described in U.S. patent application Ser. No. 10/062,857 (filed on Oct. 25, 2001), as well as U.S. Provisional Patent Applications 60/298,847 (filed Jun. 15, 2001) and 60/257,801 (filed Dec. 22, 2000), all of which are hereby incorporated by reference in their entireties as if fully set forth.
[0093] Embodiments using a nucleic acid based assay to determine expression include immobilization of one or more gene sequences on a solid support, including, but not limited to, a solid substrate as an array or to beads or bead based technology as known in the art. Alternatively, solution based expression assays known in the art may also be used. The immobilized gene sequence(s) may be in the form of polynucleotides that are unique or otherwise specific to the gene(s) such that the polynucleotides would be capable of hybridizing to the DNA or RNA of said gene(s). These polynucleotides may be the full length of the gene(s) or be short sequences of the genes (up to one nucleotide shorter than the full length sequence known in the art by deletion from the 5' or 3' end of the sequence) that are optionally minimally interrupted (such as by mismatches or inserted non-complementary basepairs) such that hybridization with a DNA or RNA corresponding to the genes is not affected. In some embodiments, the polynucleotides used are from the 3' end of the gene, such as within about 350, about 300, about 250, about 200, about 150, about 100, or about 50 nucleotides from the polyadenylation signal or polyadenylation site of a gene or expressed sequence. Polynucleotides containing mutations relative to the sequences of the disclosed genes may also be used so long as the presence of the mutations still allows hybridization to produce a detectable signal. Thus the practice of the present invention is unaffected by the presence of minor mismatches between the disclosed sequences and those expressed by cells of a subject's sample. A non-limiting example of the existence of such mismatches is seen in cases of sequence polymorphisms between individuals of a species, such as individual human patients within Homo sapiens.
[0094] As will be appreciated by those skilled in the art, some gene sequences include 3' poly A (or poly T on the complementary strand) stretches that do not contribute to the uniqueness of the disclosed sequences. The invention may thus be practiced with gene sequences lacking the 3' poly A (or poly T) stretches. The uniqueness of the disclosed sequences refers to the portions or entireties of the sequences which are found only in nucleic acids, including unique sequences found at the 3' untranslated portion thereof. Some unique sequences for the practice of the invention are those which contribute to the consensus sequences for the genes such that the unique sequences will be useful in detecting expression in a variety of individuals rather than being specific for a polymorphism present in some individuals. Alternatively, sequences unique to an individual or a subpopulation may be used. The unique sequences may be the lengths of polynucleotides of the invention as described herein.
[0095] In additional embodiments of the invention, polynucleotides having sequences present in the 3' untranslated and/or non-coding regions of gene sequences are used to detect expression levels in cell containing samples of the invention. Such polynucleotides may optionally contain sequences found in the 3' portions of the coding regions of gene sequences. Polynucleotides containing a combination of sequences from the coding and 3' non-coding regions preferably have the sequences arranged contiguously, with no intervening heterologous sequence(s).
[0096] Alternatively, the invention may be practiced with polynucleotides having sequences present in the 5' untranslated and/or non-coding regions of gene sequences to detect the level of expression in cells and samples of the invention. Such polynucleotides may optionally contain sequences found in the 5' portions of the coding regions. Polynucleotides containing a combination of sequences from the coding and 5' non-coding regions may have the sequences arranged contiguously, with no intervening heterologous sequence(s). The invention may also be practiced with sequences present in the coding regions of gene sequences.
[0097] The polynucleotides of some embodiments contain sequences from 3' or 5' untranslated and/or non-coding regions of at least about 16, at least about 18, at least about 20, at least about 22, at least about 24, at least about 26, at least about 28, at least about 30, at least about 32, at least about 34, at least about 36, at least about 38, at least about 40, at least about 42, at least about 44, or at least about 46 consecutive nucleotides. The term "about" as used in the previous sentence refers to an increase or decrease of 1 from the stated numerical value. Other embodiments use polynucleotides containing sequences of at least or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, or at least or about 400 consecutive nucleotides. The term "about" as used in the preceding sentence refers to an increase or decrease of 10% from the stated numerical value.
[0098] Sequences from the 3' or 5' end of gene coding regions as found in polynucleotides of the invention are of the same lengths as those described above, except that they would naturally be limited by the length of the coding region. The 3' end of a coding region may include sequences up to the 3' half of the coding region. Conversely, the 5' end of a coding region may include sequences up to the 5' half of the coding region. Of course the above described sequences, or the coding regions and polynucleotides containing portions thereof, may be used in their entireties.
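The two meanings of "about" given in the preceding paragraph can be illustrated with a short sketch. The function names `about_range` and `is_about`, and the use of 50 nucleotides as the point where the rule switches from plus or minus 1 to plus or minus 10%, are assumptions made for illustration based on the values listed above; they are not terms used in this application.

```python
def about_range(value):
    """Return the (low, high) interval covered by "about <value>".

    Assumption for illustration: the +/- 1 rule applies to the listed
    lengths of 16 to 46 nucleotides, and the +/- 10% rule applies to the
    listed lengths of 50 to 400 nucleotides.
    """
    if value < 50:
        return (value - 1, value + 1)
    tolerance = value / 10  # 10% of the stated value
    return (value - tolerance, value + tolerance)


def is_about(length, value):
    """True if `length` falls within the "about <value>" interval."""
    low, high = about_range(value)
    return low <= length <= high
```

Under these assumptions, a 17-nucleotide sequence satisfies "about 16", and a 55-nucleotide sequence satisfies "about 50", while a 56-nucleotide sequence does not.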
[0099] In another embodiment of the invention, polynucleotides containing deletions of nucleotides from the 5' and/or 3' end of gene sequences may be used. The deletions are preferably of 1-5, 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-125, 125-150, 150-175, or 175-200 nucleotides from the 5' and/or 3' end, although the extent of the deletions would naturally be limited by the length of the sequences and the need to be able to use the polynucleotides for the detection of expression levels.
[0100] Other polynucleotides of the invention from the 3' end of gene sequences include those of primers and optional probes for quantitative PCR. Preferably, the primers and probes are those which amplify a region less than about 750, less than about 700, less than about 650, less than about 600, less than about 550, less than about 500, less than about 450, less than about 400, less than about 350, less than about 300, less than about 250, less than about 200, less than about 150, less than about 100, or less than about 50 nucleotides from the polyadenylation signal or polyadenylation site of a gene or expressed sequence. The size of a PCR amplicon of the invention may be of any size, including, at least or about 50, at least or about 100, at least or about 150, at least or about 200, at least or about 250, at least or about 300, at least or about 350, or at least or about 400 consecutive nucleotides, all with inclusion of the portion complementary to the PCR primers used.
[0101] Other polynucleotides for use in the practice of the invention include those that have sufficient homology to gene sequences to detect their expression by use of hybridization techniques. Such polynucleotides preferably have about or 95%, about or 96%, about or 97%, about or 98%, or about or 99% identity with the gene sequences to be used. Identity is determined using the BLAST algorithm, as described above. The other polynucleotides for use in the practice of the invention may also be described on the basis of the ability to hybridize to polynucleotides of the invention under stringent conditions of about 30% v/v to about 50% formamide and from about 0.01M to about 0.15M salt for hybridization and from about 0.01M to about 0.15M salt for wash conditions at about 55 to about 65° C. or higher, or conditions equivalent thereto.
[0102] In a further embodiment of the invention, a population of single stranded nucleic acid molecules comprising one or both strands of a human gene sequence is provided as a probe such that at least a portion of said population may be hybridized to one or both strands of a nucleic acid molecule quantitatively amplified from RNA of a cell or sample of the invention. The population may be only the antisense strand of a human gene sequence such that a sense strand of a molecule from, or amplified from, a cell may be hybridized to a portion of said population. The population preferably comprises a sufficiently excess amount of said one or both strands of a human gene sequence in comparison to the amount of expressed (or amplified) nucleic acid molecules containing a complementary gene sequence.
[0103] The invention further provides a method of classifying a human tumor sample by detecting the expression levels of 50 or more transcribed sequences in a nucleic acid or cell containing sample obtained from a human subject, and classifying the sample as containing a tumor cell of a tumor type found in humans to the exclusion of one or more other human tumor types. In some embodiments, the method may be used to classify a sample as being, or having cells of, one of the 53 tumor types listed above to the exclusion of one or more of the other 52. In other embodiments, the method is used to classify a sample as being, or having cells of, one of the 34 tumor types listed above to the exclusion of one or more of the other 33 tumor types. In further embodiments, the method is used to classify a sample as being, or having cells of, one of the 39 tumor types listed above to the exclusion of one or more of the other 38 tumor types.
[0104] The invention also provides a method for classifying tumor samples as being one of a subset of the possible tumor types described herein by detecting the expression levels of 50 or more transcribed sequences in a nucleic acid containing tumor sample obtained from a human subject, and classifying the sample as being one of a number of tumor types found in humans to the exclusion of one or more other human tumor types. In some embodiments of the invention, the number of other tumor types is from 1 to about 3, more preferably from 1 to about 5, from 1 to about 7, or from 1 to about 9 or about 10. In other embodiments, the number of tumor types are all of the organ origin such as those listed above. This aspect of the invention is related to the above discussion of FIG. 8 and of trading off specificity in favor of increased confidence, and may be advantageously applied to situations where the classification of a sample as a single tumor type is at a level of accuracy or performance that can be improved by classifying the sample as one of a subset of possible tumor types.
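The trade-off described above, in which a sample is reported as one of a small subset of tumor types rather than as a single type, can be sketched as follows. This is a minimal illustration only, assuming the classifier produces a non-negative score per tumor type (for example, the fraction of nearest neighbors of each type). The function name `candidate_types`, the `min_confidence` threshold, and the example scores are hypothetical and are not taken from this application.

```python
def candidate_types(scores, min_confidence=0.9, max_types=10):
    """Return the smallest set of top-scoring tumor types whose combined
    normalized score reaches `min_confidence` (capped at `max_types`),
    together with the combined score actually reached."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must contain at least one positive value")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for tumor_type, score in ranked[:max_types]:
        chosen.append(tumor_type)
        cumulative += score / total
        if cumulative >= min_confidence:
            break
    return chosen, cumulative


# Hypothetical scores for one sample (not data from this application).
example_scores = {"Lung-adeno": 0.55, "Lung-squamous": 0.30,
                  "Breast": 0.10, "Kidney": 0.05}
types, confidence = candidate_types(example_scores, min_confidence=0.8)
```

With these illustrative scores, no single type reaches the 0.8 threshold, so the sketch reports the two highest-scoring types together. Raising the threshold widens the reported subset, which mirrors the exchange of specificity for confidence.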
[0105] In additional embodiments, the invention may be practiced by analyzing gene expression from single cells or homogenous cell populations which have been dissected away from, or otherwise isolated or purified from, contaminating cells of a sample as present in a simple biopsy. One advantage provided by these embodiments is that contaminating, non-tumor cells (such as infiltrating lymphocytes or other immune system cells) may be removed and so be absent from affecting the genes identified or the subsequent analysis of gene expression levels as provided herein. Such contamination is present where a biopsy is used to generate gene expression profiles.
[0106] In further embodiments of the invention utilizing Q-PCR or reverse transcriptase Q-PCR as the assay platform, the expression levels of gene sequences of the invention may be compared to expression levels of reference genes in the same sample or a ratio of expression levels may be used. This provides a means to "normalize" the expression data for comparison of data on a plurality of known tumor types and a cell containing sample to be assayed. While a variety of reference genes may be used, the invention may also be practiced with the use of 8 particular reference gene sequences that were identified for use with the set of 39 tumor types. Moreover, the Q-PCR may be performed in whole or in part with use of a multiplex format.
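One common way to normalize Q-PCR data against reference genes is the delta-Ct calculation, sketched below. This application does not state a normalization formula, so the delta-Ct approach, the assumption of a doubling of product per cycle, the function names, and the Ct values are all illustrative assumptions.

```python
from statistics import mean


def normalize_to_references(target_ct, reference_ct):
    """Return delta-Ct values: each target gene's Ct minus the mean Ct of
    the reference genes measured in the same sample.

    A lower delta-Ct indicates higher expression relative to the references.
    """
    if not reference_ct:
        raise ValueError("at least one reference gene Ct is required")
    ref_mean = mean(reference_ct.values())
    return {gene: ct - ref_mean for gene, ct in target_ct.items()}


def relative_expression(delta_ct):
    """Convert delta-Ct values to expression ratios, assuming the amount of
    product doubles with every cycle (about 100% efficiency)."""
    return {gene: 2.0 ** (-d) for gene, d in delta_ct.items()}


# Hypothetical Ct values for one sample (not data from this application).
targets = {"gene_A": 24.0, "gene_B": 29.0}
references = {"ref_1": 25.0, "ref_2": 27.0}
delta = normalize_to_references(targets, references)
ratios = relative_expression(delta)
```

Because every sample is expressed relative to its own reference genes, values from known tumor types and from a sample to be assayed can be placed on a common scale before classification.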
[0107] mRNA sequences corresponding to the 8 reference sequences are provided in the attached Sequence Listing. A listing of the corresponding SEQ ID NOs, with corresponding identifying information, including accession numbers and other information, is provided by the following.
TABLE-US-00004
(SEQ ID NO: 253) >Hs.77031_mRNA_1 gi|16741772|gb|BC016680.1| BC016680 Homo sapiens clone MGC: 21349 IMAGE: 4338574 polyA = 3
(SEQ ID NO: 254) >Hs.77541_mRNA_1 gi|12804364|gb|BC003043.1| BC003043 Homo sapiens clone MGC: 4370 IMAGE: 2822973 polyA = 3
(SEQ ID NO: 255) >Hs.7001_mRNA_1 gi|6808256|emb|AL137727.1| HSM802274 Homo sapiens mRNA; cDNA DKFZp434M0519 (from clone DKFZp434M0519); partial cds polyA = 3
(SEQ ID NO: 256) >Hs.302144_mRNA_1 gi|11493400|gb|AF130047.1| AF130047 Homo sapiens clone FLB3020 polyA = 0
(SEQ ID NO: 257) >Hs.26510_mRNA_2 gi|11345385|gb|AF308803.1| AF308803 Homo sapiens chromosome 15 map 15q26 polyA = 3
(SEQ ID NO: 258) >Hs.324709_mRNA_2 gi|12655026|gb|BC001361.1| BC001361 Homo sapiens clone MGC: 2474 IMAGE: 3050694 polyA = 2
(SEQ ID NO: 259) >Hs.65756_mRNA_3 gi|3641494|gb|AF035154.1| AF035154 Homo sapiens chromosome 16 map 16p13.3 polyA = 3
(SEQ ID NO: 260) >Hs.165743_mRNA_2 gi|13543889|gb|BC006091.1| BC006091 Homo sapiens clone MGC: 12673 IMAGE: 3677524 polyA = 3
[0108] Detection of expression of any of the above reference sequences may be by the same or different methodology as for the other gene sequences described above. In some embodiments of the invention, the expression levels of gene sequences are measured by detection of expressed sequences in a cell containing sample as hybridizing to the following oligonucleotides, which correspond to the above sequences as indicated by the accession numbers provided.
TABLE-US-00005
>BC006091 (SEQ ID NO: 261)
TCATCTTCACCAAACCAGTCCGAGGGGTCGAAGCCAGACACGAGAGGAAGAGGGTCCTGG
>BC003043 (SEQ ID NO: 262)
CTCTGCTCCTGCTCCTGCCTGCATGTTCTCTCTGTTGTTGGAGCCTGGAGCCTTGCTCTC
>AF130047 (SEQ ID NO: 263)
TGCTCCCGGCTGTCCTCCTCTCCTCTTCCCTAGTGAGTGGTTAATGAGTGTTAATGCCTA
>AF035154 (SEQ ID NO: 264)
CCCCATCTCTAAAACCAGTAAATCAGCCAGCGAATACCCGGAAGCAAGATGCACAGGCGG
>BC001361 (SEQ ID NO: 265)
CCAGAAACAAGGAAGAGGAAAGACAAAGGGAAGGGACGGGAGCCCTGGAGAAGCCCGACC
>AF308803 (SEQ ID NO: 266)
AAGTACAACCCATGCTGCTAAGATGCGAGCAGGAAGAGGCATCCTTTGCTAAATCCTGTT
>BC016680 (SEQ ID NO: 267)
ACCTCACCCCTGCCCGGCCCAAGCTCTACTTGTGTACAGTGTATATTGTATAATAGACAA
>AL137727 (SEQ ID NO: 268)
TTCCCTTAATTCCTCCTCCCGACCTTTTTTACCCCCCCAGTTGCAGTATTTAACTGGGCT
[0109] In an additional aspect, the methods provided by the present invention may also be automated in whole or in part. This includes the embodiment of the invention in software. Non-limiting examples include processor executable instructions on one or more computer readable storage devices wherein said instructions direct the classification of tumor samples based upon gene expression levels as described herein. Additional processor executable instructions on one or more computer readable storage devices are contemplated wherein said instructions cause representation and/or manipulation, via a computer output device, of the process or results of a classification method.
[0110] The invention includes software and hardware embodiments wherein the gene expression data of a set of gene sequences in a plurality of known tumor types is embodied as a data set. In some embodiments, the gene expression data set is used for the practice of a method of the invention. The invention also provides computer related means and systems for performing the methods disclosed herein. In some embodiments, an apparatus for classifying a cell containing sample is provided. Such an apparatus may comprise a query input configured to receive a query; a storage configured to store a gene expression data set, as described herein, received from a query input; and a module for accessing and using data from the storage in a classification algorithm as described herein. The apparatus may further comprise a string storage for the results of the classification algorithm, optionally with a module for accessing and using data from the string storage in an output algorithm as described herein.
[0111] The steps of a method, process, or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The various steps or acts in a method or process may be performed in the order shown, or may be performed in another order. Additionally, one or more process or method steps may be omitted or one or more process or method steps may be added to the methods and processes. An additional step, block, or action may be added in the beginning, end, or intervening existing elements of the methods and processes.
[0112] A further aspect of the invention provides for the use of the present invention in relation to clinical activities. In some embodiments, the determination or measurement of gene expression as described herein is performed as part of providing medical care to a patient, including the providing of diagnostic services in support of providing medical care. Thus the invention includes a method in the medical care of a patient, the method comprising determining or measuring expression levels of gene sequences in a cell containing sample obtained from a patient as described herein. The method may further comprise the classifying of the sample, based on the determination/measurement, as including a tumor cell of a tumor type or tissue origin in a manner as described herein. The determination and/or classification may be for use in relation to any aspect or embodiment of the invention as described herein.
[0113] The determination or measurement of expression levels may be preceded by a variety of related actions. In some embodiments, the measurement is preceded by a determination or diagnosis of a human subject as in need of said measurement. The measurement may be preceded by a determination of a need for the measurement, such as that by a medical doctor, nurse or other health care provider or professional, or those working under their instruction, or personnel of a health insurance or maintenance organization in approving the performance of the measurement as a basis to request reimbursement or payment for the performance.
[0114] The measurement may also be preceded by preparatory acts necessary to the actual measuring. Non-limiting examples include the actual obtaining of a cell containing sample from a human subject; or receipt of a cell containing sample; or sectioning a cell containing sample; or isolating cells from a cell containing sample; or obtaining RNA from cells of a cell containing sample; or reverse transcribing RNA from cells of a cell containing sample. The sample may be any as described herein for the practice of the invention.
[0115] In additional embodiments, the invention provides for a method of ordering, or receiving an order for, the performance of a method in the medical care of a patient or other method of the invention. The ordering may be made by a medical doctor, a nurse, or other health care provider, or those working under their instruction, while the receiving, directly or indirectly, may be made by any person who performs the method(s). The ordering may be by any means of communication, including communication that is written, oral, electronic, digital, analog, telephonic, in person, by facsimile, by mail, or otherwise passes through a jurisdiction within the United States.
[0116] The invention further provides methods in the processing of reimbursement or payment for a test, such as the above method in the medical care of a patient or other method of the invention. A method in the processing of reimbursement or payment may comprise indicating that 1) payment has been received, or 2) payment will be made by another payer, or 3) payment remains unpaid on paper or in a database after performance of an expression level detection, determination or measurement method of the invention. The database may be in any form, with electronic forms such as a computer implemented database included within the scope of the invention. The indicating may be in the form of a code on paper or in the database. The "another payer" may be any person or entity beyond that to whom a previous request for reimbursement or payment was made.
[0117] Alternatively, the method may comprise receiving reimbursement or payment for the technical or actual performance of the above method in the medical care of a patient; for the interpretation of the results from said method; or for any other method of the invention. Of course the invention also includes embodiments comprising instructing another person or party to receive the reimbursement or payment. The ordering may be by any communication means, including those described above. The receipt may be from any entity, including an insurance company, health maintenance organization, governmental health agency, or a patient as non-limiting examples. The payment may be in whole or in part. In the case of a patient, the payment may be in the form of a partial payment known as a co-pay.
[0118] In yet another embodiment, the method may comprise forwarding or having forwarded a reimbursement or payment request to an insurance company, health maintenance organization, governmental health agency, or to a patient for the performance of the above method in the medical care of a patient or other method of the invention. The request may be by any communication means, including those described above.
[0119] In a further embodiment, the method may comprise receiving indication of approval for payment, or denial of payment, for performance of the above method in the medical care of a patient or other method of the invention. Such an indication may come from any person or party to whom a request for reimbursement or payment was made. Non-limiting examples include an insurance company, health maintenance organization, or a governmental health agency, like Medicare or Medicaid as non-limiting examples. The indication may be by any communication means, including those described above.
[0120] An additional embodiment is where the method comprises sending a request for reimbursement for performance of the above method in the medical care of a patient or other method of the invention. Such a request may be made by any communication means, including those described above. The request may have been made to an insurance company, health maintenance organization, federal health agency, or the patient for whom the method was performed.
[0121] A further method comprises indicating the need for reimbursement or payment on a form or into a database for performance of the above method in the medical care of a patient or other method of the invention. Alternatively, the method may simply indicate the performance of the method. The database may be in any form, with electronic forms such as a computer implemented database included within the scope of the invention. The indicating may be in the form of a code on paper or in the database.
[0122] In the above methods in the medical care of a patient or other method of the invention, the method may comprise reporting the results of the method, optionally to a health care facility, a health care provider or professional, a doctor, a nurse, or personnel working therefor. The reporting may also be directly or indirectly to the patient. The reporting may be by any means of communication, including those described above.
[0123] The invention further provides kits for the determination or measurement of gene expression levels in a cell containing sample as described herein. A kit will typically comprise one or more reagents to detect gene expression as described herein for the practice of the present invention. Non-limiting examples include polynucleotide probes or primers for the detection of expression levels, one or more enzymes used in the methods of the invention, and one or more tubes for use in the practice of the invention. In some embodiments, the kit will include an array, or solid media capable of being assembled into an array, for the detection of gene expression as described herein. In other embodiments, the kit may comprise one or more antibodies that are immunoreactive with epitopes present on a polypeptide which indicates expression of a gene sequence. In some embodiments, the antibody will be an antibody fragment.
[0124] A kit of the invention may also include instructional materials disclosing or describing the use of the kit or a primer or probe of the present invention in a method of the invention as provided herein. A kit may also include additional components to facilitate the particular application for which the kit is designed. Thus, for example, a kit may additionally contain means of detecting the label (e.g. enzyme substrates for enzymatic labels, filter sets to detect fluorescent labels, appropriate secondary labels such as a sheep anti-mouse-HRP, or the like). A kit may additionally include buffers and other reagents recognized for use in a method of the invention.
[0125] Having now generally described the invention, the same will be more readily understood through reference to the following examples which are provided by way of illustration, and are not intended to be limiting of the present invention, unless specified.
EXAMPLES
Example 1: Materials and Methods
[0126] The following table shows the types and number of samples of known tumors used in Example 2.
TABLE-US-00006
Tumor type                      Number of samples
Adrenal                         7
Brain-glial                     16
Brain-Meningioma                7
Breast                          43
Cervix-adeno                    8
Cervix-squamous                 13
Endometrium                     13
GallBladder                     5
Germ-cell                       22
GIST                            10
Kidney                          11
Leiomyosarcoma                  13
Liver                           14
Lung-adeno                      9
Lung-large                      9
Lung-small                      8
Lung-squamous                   10
Lymphoma-B                      7
Lymphoma-Hodgkins               9
Lymphoma-T                      5
Mesothelioma                    10
Osteosarcoma                    7
Ovary-clear                     14
Ovary-serous                    14
Pancreas                        24
Prostate                        11
Skin-basal-cell                 5
Skin-melanoma                   10
Skin-squamous                   6
Small-and-large-bowel           42
Soft-tissue-Liposarcoma         5
Soft-tissue-MFH                 11
Soft-tissue-Sarcoma-synovial    7
Stomach-adeno                   9
Testis-Seminoma                 10
Thyroid-follicular-papillary    12
Thyroid-medullary               7
UrinaryBladder                  25
Total                           468

Bile-Duct                       1
Cholangiocarcinoma              4
Esophagus                       2
Esophagus-Barretts              4
Esophagus-squamous              4
HN-squamous                     3
Ovary (unclassified)            1
Ovary-endometriod               1
Ovary-mucinous                  4
Ovary-stromal                   1
Soft-tissue-Ewings-sarcoma      2
Soft-tissue-Fibrosarcoma        2
Soft-tissue-Rhabdomyosarcoma    3
Total                           32
[0127] The 500 samples were fresh or frozen samples of tumor containing tissue. The 468 samples shown above were used for further experiments by taking 374 as the training set and the remaining 94 samples as the testing set. Tumor types of fewer than 5 samples were not used initially.
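The partition of the 468 samples into 374 training and 94 testing samples can be illustrated with a short sketch. This application does not say how the partition was drawn, so the seeded random shuffle, the function name `split_samples`, and the placeholder sample identifiers below are assumptions made for illustration.

```python
import random


def split_samples(sample_ids, n_train, seed=0):
    """Randomly partition sample IDs into a training and a testing set."""
    if not 0 < n_train < len(sample_ids):
        raise ValueError("n_train must be between 1 and len(sample_ids) - 1")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# 468 placeholder IDs standing in for the samples tabulated above.
all_ids = [f"sample_{i:03d}" for i in range(468)]
train_ids, test_ids = split_samples(all_ids, n_train=374)
```

A split drawn separately within each tumor type would keep every type represented in both sets; the simple shuffle shown here does not guarantee that for the smallest classes.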
[0128] The samples contained both primary and metastatic tumors with a confirmed diagnosis. A single 5 μm section was stained (H+E), and the tumor visualized. Pure tumor populations were obtained by either manual dissection or laser capture microdissection (Arcturus, Mountain View, Calif.).
[0129] RNA extraction and quality control were performed on each sample. Briefly, samples were processed using a silica spin column-based extraction method (Arcturus, Mountain View, Calif.). The total quantity of RNA extracted was assessed using quantitative PCR (Taqman, ABI), with primers specific for β-actin transcription. Only samples with greater than 10 ng of RNA were amplified.
[0130] Samples were amplified using a modified RNA polymerase 2-round amplification protocol (Arcturus, Mountain View, Calif.). Following amplification, the RNA product yield was quantitated by OD(260/280) spectroscopy, and the amplified product visualized by agarose (2%) denaturing gel electrophoresis.
[0131] The amplified product from each sample was then hybridized to a microarray to detect the level of transcript expression in the samples. Random gene selection was performed using random sampling function software. For each number of genes selected, random samples were selected 100 times and used to compute the cross-validation and predictive accuracies on both training and testing sets. Cross-validation was by dividing the training set into parts with one being used to train and another being used as a test.
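The repeated random gene selection and cross-validation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the k-nearest neighbor classifier is taken from Example 2, while the number of folds, the value of k, the squared Euclidean distance, all function names, and the synthetic expression data are hypothetical choices not specified in this application.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training
    samples (squared Euclidean distance)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(x, y, n_folds=5, k=3, seed=0):
    """Fraction of samples predicted correctly when each fold is held out
    in turn and the remaining folds are used for training."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    rng.shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(
            knn_predict(train_x, train_y, x[i], k) == y[i] for i in held_out
        )
    return correct / len(x)


def random_gene_accuracies(x, y, n_genes, n_draws=100, seed=0):
    """Draw `n_genes` gene columns at random `n_draws` times and return
    the cross-validation accuracy obtained with each draw."""
    rng = random.Random(seed)
    total_genes = len(x[0])
    accuracies = []
    for draw in range(n_draws):
        genes = rng.sample(range(total_genes), n_genes)
        sub_x = [[row[g] for g in genes] for row in x]
        accuracies.append(cross_val_accuracy(sub_x, y, seed=draw))
    return accuracies


# Synthetic stand-in data: 2 tumor types, 10 samples each, 20 genes.
data_rng = random.Random(1)
labels = ["type_A"] * 10 + ["type_B"] * 10
expression = [
    [data_rng.gauss(0.0 if label == "type_A" else 10.0, 1.0) for _ in range(20)]
    for label in labels
]
accuracies = random_gene_accuracies(expression, labels, n_genes=5, n_draws=10)
```

The same loop can be run for each gene-set size of interest, with a held-out testing set scored in the same way to obtain the predictive accuracy alongside the cross-validation accuracy.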
Example 2: Results
[0132] The mean of the accuracies from 100 samplings and the 95% confidence interval were calculated and plotted for each step from 50 to 16948 genes. The plots showed the cross-validation and predictive accuracies from the KNN (k-nearest neighbor) algorithm versus the number of genes selected by chance. Random gene selection used the random sampling function in R software.
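The summary statistics described above can be computed as in the following sketch. This application does not state how the 95% confidence interval was obtained, so the normal approximation (mean plus or minus 1.96 standard errors) and the function name `mean_and_ci95` are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_ci95(accuracies):
    """Return (mean, lower, upper) for a list of accuracies, using the
    normal approximation: mean +/- 1.96 * (sample std dev / sqrt(n))."""
    n = len(accuracies)
    if n < 2:
        raise ValueError("need at least two accuracy values")
    m = mean(accuracies)
    half_width = 1.96 * stdev(accuracies) / sqrt(n)
    return m, m - half_width, m + half_width
```

Applied to the 100 accuracies obtained for a given number of randomly selected genes, the function yields one point and one error bar for a plot of accuracy versus gene count. A percentile interval taken directly from the 100 values would be an alternative that avoids the normality assumption.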
[0133] 50 or more genes were capable of accurately classifying among the numerous tumor types in toto with a better than 50% accuracy. Similar results are observed with the use of the samples and KNN with known FFPE tumor specimens from which RNA was extracted and analyzed for gene expression.
[0134] It should be noted that while the accuracy stabilized with the use of additional genes, it is expected that there are particular sets of 50 or more genes that have significantly higher accuracies. Classification of additional tumor types, such as those totaling 32 samples in the table above, may be made with the inclusion of additional samples.
[0135] The accuracy level of a set of 100 randomly selected expressed gene sequences was determined to be 66% and was used as described in Example 3 to generate FIGS. 1 and 2.
Example 3: Information Capacity of Random Gene Sets
[0136] Subsets of the 100 randomly selected expressed gene sequences used to classify among 39 tumor types were tested for their ability to classify among subsets of the 39 tumor types. The expression levels of random combinations of 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 and all 100 (each combination sampled 10 times) of the 100 expressed sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to all 39 types. FIG. 1 shows the classification capability of various gene sets relative to the number of tumor types classified. As expected, a higher number of gene sequences is needed to classify tumor types with higher accuracies. FIG. 2 shows the classification performance for various numbers of tumor types relative to the number of gene sequences used.
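The sampling design described above can be sketched in Python as follows. This is an illustration only: the names `GENE_SET_SIZES`, `TYPE_COUNTS`, and `sample_design` are hypothetical, and pairing one random gene draw with one random tumor-type draw per repeat is an assumption, since the text does not say how the two samplings were combined.

```python
import random

GENE_SET_SIZES = list(range(50, 101, 5))  # 50, 55, ..., 95, 100
TYPE_COUNTS = list(range(2, 40))          # 2, 3, ..., 39 tumor types


def sample_design(n_genes=100, n_types=39, repeats=10, seed=0):
    """Yield (size, n, genes, types) tuples of 1-based random index sets.

    Assumption: each repeat pairs one gene draw with one tumor-type draw.
    """
    rng = random.Random(seed)
    for size in GENE_SET_SIZES:
        for n in TYPE_COUNTS:
            for _ in range(repeats):
                genes = sorted(rng.sample(range(1, n_genes + 1), size))
                types = sorted(rng.sample(range(1, n_types + 1), n))
                yield size, n, genes, types
```

Each yielded tuple identifies which of the 100 indexed sequences and which tumor types would enter one classification run; the accuracy of each run could then be aggregated by gene-set size or by number of tumor types.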
[0137] The GenBank accession numbers of the 100 gene sequences are AF269223, BC006286, AK025501, AJ002367, AI469140, AW013883, NM_001238, AI476350, BC006546, AI041212, BF724944, AI376951, R56211, BC006393, X13274, BC001133, N62397, BC000885, AK001588, AK057901, AF146760, AI951287, AK025604, BC007581, BC015025, R43102, AW449550, AI922539, AI684144, AI277662, BC015999, AW444656, BC011612, BC015401, BF447279, BC009956, AL050163, BC001248, BE672684, AL137353, BC001340, U45975, BE856598, BC009060, AL137728, AA713797, AL583913, AK054617, AI028262, AI753041, BG1939593, AL080179, AA814915, AF131798, AI961568, BC009849, AK021603, BC012561, AI570494, BC006973, AW294857, BC004952, AK026535, AI923614, AW082090, AI005513, AF339768, AK023167, AF169693, AF076249, BC007662, BC015520, AI814187, AI565381, AW271626, AK024120, AF139065, BC014075, AI887245, AF257081, AI767898, AF070634, AF155132, X69804, U65579, NM_004933, AI655104, AW131780, AI650407, AF131774, AA814057, AJ311123, BC009702, AF264036, AL161961, AJ010857, AF106912, AK023542, AF073518, and D83032. They were indexed from 1 to 100, and representative, and non-limiting, random sets used in the invention are as follows:
[0138] For 50 genes, set 1, genes 9, 52, 55, 24, 44, 58, 20, 79, 81, 86, 22, 84, 27, 32, 73, 70, 18, 41, 54, 38, 46, 78, 87, 49, 15, 95, 12, 23, 30, 13, 36, 98, 28, 56, 21, 19, 35, 51, 25, 43, 99, 34, 64, 66, 82, 72, 11, 92, 59, and 71 were used. In set 2, genes 72, 92, 27, 8, 14, 87, 42, 83, 65, 85, 40, 21, 74, 66, 6, 28, 13, 98, 91, 78, 49, 52, 33, 30, 97, 84, 2, 95, 88, 64, 93, 11, 1, 45, 61, 39, 12, 67, 53, 89, 43, 17, 54, 7, 55, 38, 3, 15, 70, and 31 were used. In set 3, genes 9, 35, 87, 52, 73, 74, 88, 22, 41, 28, 93, 15, 67, 20, 68, 17, 46, 43, 51, 24, 84, 79, 19, 100, 76, 6, 49, 97, 16, 59, 89, 66, 45, 63, 2, 27, 13, 98, 69, 60, 26, 86, 83, 58, 71, 54, 82, 32, 42, and 77 were used. In set 4, genes 34, 67, 48, 53, 24, 61, 6, 64, 89, 76, 35, 21, 86, 83, 68, 7, 25, 65, 58, 28, 97, 90, 31, 57, 3, 50, 2, 96, 84, 29, 42, 46, 82, 62, 19, 95, 44, 52, 33, 36, 15, 37, 70, 11, 43, 13, 8, 49, 16, and 99 were used. In set 5, genes 11, 22, 87, 25, 5, 38, 35, 68, 94, 51, 60, 53, 20, 42, 95, 92, 33, 15, 14, 24, 85, 37, 69, 17, 19, 93, 8, 97, 46, 83, 26, 86, 66, 89, 63, 16, 74, 28, 52, 2, 96, 99, 71, 10, 65, 90, 29, 34, 77, and 45 were used. In set 6, genes 62, 6, 69, 12, 19, 50, 51, 5, 1, 32, 41, 84, 27, 10, 93, 28, 79, 21, 88, 47, 58, 64, 74, 39, 33, 46, 17, 86, 87, 4, 60, 98, 97, 45, 26, 72, 40, 63, 30, 54, 52, 11, 15, 96, 14, 24, 73, 67, 59, and 38 were used. In set 7, genes 67, 21, 62, 15, 59, 6, 23, 30, 89, 94, 82, 74, 96, 17, 41, 38, 48, 100, 5, 71, 20, 55, 79, 28, 44, 64, 92, 65, 51, 37, 32, 22, 72, 98, 12, 34, 78, 50, 60, 76, 88, 3, 40, 80, 77, 16, 24, 42, 8, and 14 were used. In set 8, genes 43, 68, 8, 38, 82, 73, 12, 23, 77, 63, 56, 33, 66, 14, 47, 17, 53, 62, 42, 57, 30, 89, 44, 58, 34, 24, 81, 40, 45, 1, 99, 52, 37, 80, 96, 10, 71, 50, 20, 51, 18, 54, 31, 70, 84, 3, 83, 76, 59, and 91 were used.
In set 9, genes 36, 90, 34, 79, 29, 24, 44, 51, 27, 58, 52, 37, 68, 49, 89, 80, 57, 8, 22, 77, 54, 65, 26, 91, 21, 64, 59, 61, 13, 74, 87, 50, 63, 20, 78, 23, 96, 67, 30, 55, 81, 35, 72, 56, 95, 82, 39, 42, 88, and 92 were used. In set 10, genes 59, 94, 91, 88, 3, 45, 13, 96, 66, 58, 60, 69, 21, 95, 4, 7, 67, 83, 44, 2, 37, 24, 8, 12, 53, 47, 34, 9, 31, 46, 11, 68, 1, 6, 29, 14, 33, 54, 43, 80, 39, 18, 100, 10, 84, 65, 5, 76, 26, and 22 were used.
[0139] For 55 genes, set 1, genes 20, 76, 33, 73, 15, 83, 47, 2, 95, 67, 26, 49, 97, 25, 46, 13, 51, 42, 14, 11, 39, 94, 37, 100, 56, 63, 6, 66, 45, 75, 3, 78, 55, 7, 72, 44, 35, 48, 65, 38, 60, 90, 30, 36, 77, 23, 16, 32, 80, 89, 8, 91, 43, 50, and 28 were used. In set 2, genes 11, 63, 93, 79, 21, 57, 66, 10, 42, 83, 75, 94, 3, 38, 49, 91, 53, 90, 50, 52, 39, 99, 85, 48, 31, 18, 89, 25, 87, 56, 40, 5, 19, 88, 27, 92, 20, 100, 59, 43, 95, 80, 86, 44, 55, 68, 54, 33, 96, 45, 2, 9, 81, 73, and 37 were used. In set 3, genes 20, 73, 76, 29, 44, 33, 84, 98, 15, 69, 32, 14, 50, 70, 63, 41, 87, 74, 99, 34, 23, 36, 37, 68, 89, 43, 91, 18, 26, 45, 9, 90, 28, 92, 7, 30, 22, 54, 96, 72, 16, 38, 58, 52, 56, 79, 57, 47, 83, 17, 49, 2, 80, 51, and 46 were used. In set 4, genes 90, 63, 60, 82, 81, 50, 25, 24, 56, 9, 8, 89, 70, 55, 15, 4, 35, 75, 77, 46, 87, 6, 49, 85, 98, 58, 28, 27, 64, 47, 99, 51, 86, 21, 54, 80, 41, 74, 88, 14, 36, 2, 23, 32, 19, 30, 52, 84, 62, 37, 43, 53, 72, 39, and 92 were used. In set 5, genes 27, 43, 33, 84, 89, 31, 60, 97, 15, 45, 42, 73, 4, 6, 90, 61, 72, 56, 2, 38, 96, 74, 94, 14, 25, 77, 58, 86, 21, 32, 82, 3, 50, 17, 28, 48, 44, 7, 70, 20, 59, 83, 1, 71, 52, 95, 69, 54, 39, 46, 63, 51, 57, 34, and 22 were used. In set 6, genes 96, 12, 94, 27, 11, 33, 25, 22, 26, 50, 60, 70, 68, 30, 82, 34, 17, 32, 29, 19, 87, 76, 81, 7, 55, 35, 45, 56, 31, 99, 5, 24, 54, 97, 21, 92, 98, 36, 88, 23, 58, 77, 14, 95, 9, 73, 84, 61, 2, 38, 83, 65, 42, 74, and 48 were used. In set 7, genes 52, 11, 79, 27, 23, 64, 96, 33, 75, 12, 34, 94, 26, 78, 67, 51, 57, 70, 28, 89, 9, 98, 62, 91, 41, 65, 73, 74, 8, 16, 90, 37, 1, 10, 59, 81, 63, 30, 80, 18, 15, 48, 36, 19, 84, 14, 45, 38, 97, 99, 3, 82, 54, 22, and 5 were used. In set 8, genes 83, 57, 6, 37, 44, 76, 5, 59, 74, 62, 72, 23, 93, 75, 32, 100, 98, 29, 30, 65, 21, 17, 78, 46, 13, 82, 14, 50, 66, 63, 90, 49, 54, 68, 60, 10, 87, 94, 58, 91, 33, 31, 36, 8, 11, 92, 51, 38, 43, 52, 7, 86, 89, 84, and 70 were used. 
In set 9, genes 29, 100, 79, 21, 63, 12, 51, 2, 18, 77, 81, 33, 68, 69, 13, 23, 37, 39, 14, 3, 93, 36, 5, 35, 30, 40, 28, 61, 49, 71, 27, 99, 75, 96, 83, 97, 78, 54, 19, 89, 62, 38, 8, 53, 26, 43, 52, 25, 58, 9, 31, 86, 65, 6, and 60 were used. In set 10, genes 7, 37, 22, 39, 41, 89, 57, 75, 6, 23, 47, 51, 55, 93, 49, 5, 15, 79, 20, 11, 42, 87, 78, 33, 68, 76, 94, 77, 62, 16, 31, 54, 28, 99, 90, 61, 25, 21, 59, 73, 83, 95, 30, 91, 65, 24, 4, 17, 10, 72, 63, 98, 34, 69, and 1 were used.
[0140] For 60 genes, set 1, genes 67, 60, 53, 20, 3, 9, 87, 16, 1, 14, 96, 82, 79, 94, 35, 32, 44, 22, 17, 46, 59, 29, 40, 57, 68, 52, 48, 31, 34, 23, 91, 38, 92, 49, 51, 86, 88, 55, 50, 39, 83, 65, 11, 42, 4, 63, 47, 73, 84, 75, 77, 18, 74, 100, 26, 5, 72, 10, 90, and 76 were used. In set 2, genes 62, 67, 70, 82, 8, 10, 26, 45, 98, 38, 76, 14, 72, 36, 89, 95, 86, 96, 18, 91, 75, 74, 7, 46, 16, 83, 65, 33, 29, 57, 32, 42, 34, 37, 80, 100, 99, 9, 2, 22, 64, 11, 87, 15, 23, 55, 60, 61, 81, 49, 5, 58, 3, 40, 71, 54, 85, 94, 66, and 20 were used. In set 3, genes 49, 10, 76, 94, 83, 90, 42, 57, 38, 85, 29, 1, 60, 71, 65, 30, 64, 23, 72, 27, 70, 13, 100, 43, 20, 44, 4, 88, 79, 24, 84, 91, 87, 41, 21, 48, 54, 68, 16, 35, 6, 89, 2, 34, 96, 22, 99, 52, 28, 3, 15, 47, 7, 61, 63, 75, 19, 97, 56, and 39 were used. In set 4, genes 99, 94, 58, 51, 46, 87, 77, 23, 9, 74, 52, 4, 47, 42, 5, 62, 48, 14, 35, 32, 75, 98, 95, 18, 67, 76, 50, 8, 1, 19, 22, 72, 11, 83, 82, 89, 12, 24, 90, 80, 92, 85, 26, 66, 38, 78, 79, 60, 49, 59, 25, 84, 36, 29, 45, 55, 27, 70, 39, and 57 were used. In set 5, genes 39, 21, 70, 81, 88, 30, 2, 57, 45, 5, 47, 93, 1, 34, 51, 49, 3, 6, 65, 97, 41, 67, 95, 85, 98, 29, 82, 38, 17, 84, 72, 52, 20, 33, 53, 66, 7, 54, 25, 23, 80, 61, 76, 9, 14, 48, 26, 12, 32, 4, 64, 73, 56, 87, 59, 35, 31, 62, 13, and 15 were used. In set 6, genes 99, 80, 35, 87, 17, 27, 53, 43, 38, 45, 61, 34, 81, 3, 16, 42, 24, 37, 19, 39, 59, 6, 28, 74, 32, 92, 18, 31, 25, 66, 79, 41, 51, 97, 58, 7, 49, 70, 71, 33, 78, 85, 63, 72, 89, 15, 40, 29, 46, 1, 73, 68, 56, 54, 47, 5, 65, 100, 44, and 22 were used. In set 7, genes 15, 51, 66, 47, 4, 82, 78, 71, 72, 75, 61, 10, 34, 18, 12, 55, 32, 80, 45, 14, 3, 62, 20, 74, 96, 48, 94, 88, 69, 64, 86, 9, 24, 41, 8, 28, 81, 13, 37, 87, 53, 44, 57, 43, 30, 38, 67, 5, 100, 91, 50, 2, 42, 77, 7, 83, 73, 99, 68, and 6 were used. 
In set 8, genes 41, 21, 20, 62, 50, 86, 13, 23, 94, 45, 80, 51, 42, 52, 47, 76, 18, 72, 25, 8, 35, 58, 37, 32, 46, 71, 99, 33, 48, 77, 38, 19, 44, 66, 7, 53, 12, 10, 74, 96, 84, 28, 30, 15, 2, 81, 7, 26, 79, 88, 24, 49, 65, 17, 95, 63, 75, 11, 55, and 36 were used. In set 9, genes 14, 40, 30, 48, 37, 3, 28, 57, 58, 22, 70, 74, 91, 98, 46, 76, 81, 65, 54, 23, 11, 34, 17, 53, 26, 67, 80, 42, 86, 73, 25, 24, 9, 88, 38, 45, 13, 56, 83, 87, 31, 36, 43, 100, 35, 41, 16, 33, 61, 6, 49, 63, 71, 64, 96, 8, 19, 39, 68, and 84 were used. In set 10, genes 97, 39, 83, 8, 35, 74, 13, 96, 20, 19, 69, 10, 81, 57, 65, 17, 12, 48, 86, 4, 94, 25, 92, 22, 55, 43, 34, 45, 73, 18, 31, 15, 2, 61, 51, 91, 89, 82, 68, 46, 24, 77, 27, 88, 72, 16, 37, 70, 29, 60, 80, 14, 23, 44, 49, 66, 62, 32, 28, and 98 were used.
[0141] For 65 genes, set 1, genes 68, 57, 82, 75, 62, 43, 41, 76, 59, 34, 78, 95, 32, 79, 88, 46, 4, 89, 96, 84, 66, 10, 31, 23, 52, 16, 85, 98, 28, 25, 74, 69, 39, 63, 64, 58, 65, 30, 13, 19, 40, 50, 48, 6, 93, 2, 11, 51, 100, 26, 27, 24, 1, 87, 91, 38, 5, 21, 56, 35, 61, 17, 90, 94, and 83 were used. In set 2, genes 62, 33, 59, 65, 12, 97, 20, 99, 13, 64, 29, 23, 49, 35, 66, 74, 77, 46, 14, 11, 81, 32, 42, 34, 70, 17, 54, 44, 24, 53, 3, 8, 71, 47, 96, 80, 86, 40, 15, 37, 90, 67, 73, 50, 25, 51, 36, 75, 72, 92, 93, 4, 84, 18, 76, 21, 38, 88, 68, 9, 60, 52, 45, 7, and 41 were used. In set 3, genes 12, 80, 56, 70, 50, 95, 15, 85, 95, 53, 45, 47, 10, 99, 12, 76, 67, 89, 83, 35, 91, 62, 6, 84, 23, 52, 65, 9, 37, 4, 51, 42, 48, 49, 100, 21, 5, 43, 75, 92, 98, 36, 16, 27, 19, 22, 82, 73, 58, 63, 34, 74, 3, 71, 87, 72, 81, 1, 68, 46, 55, 88, 64, 11, and 33 were used. In set 4, genes 16, 41, 15, 40, 19, 47, 77, 96, 5, 21, 38, 84, 22, 27, 81, 46, 74, 36, 8, 52, 98, 87, 91, 54, 86, 80, 25, 39, 75, 42, 10, 83, 51, 90, 62, 78, 17, 9, 53, 68, 12, 100, 24, 89, 20, 58, 59, 11, 92, 32, 30, 95, 49, 55, 73, 82, 99, 70, 97, 13, 6, 93, 67, 29, and 45 were used. In set 5, genes 94, 3, 31, 85, 51, 80, 55, 22, 93, 97, 49, 14, 81, 67, 76, 77, 75, 19, 59, 5, 72, 34, 62, 58, 43, 7, 44, 35, 98, 24, 74, 41, 73, 63, 13, 87, 56, 15, 42, 12, 91, 50, 37, 29, 40, 53, 83, 2, 99, 100, 1, 10, 33, 16, 26, 9, 71, 39, 11, 46, 57, 66, 92, and 82 were used. In set 6, genes 86, 55, 15, 9, 13, 94, 33, 16, 14, 11, 32, 59, 88, 64, 90, 50, 45, 82, 7, 44, 48, 98, 21, 51, 62, 99, 75, 25, 19, 41, 24, 26, 17, 23, 6, 71, 72, 47, 42, 2, 85, 22, 56, 81, 78, 79, 43, 18, 100, 36, 34, 70, 39, 80, 66, 97, 58, 31, 30, 57, 35, 96, 12, 29, and 10 were used.
In set 7, genes 16, 50, 4, 18, 60, 65, 37, 94, 1, 88, 76, 71, 31, 2, 53, 59, 19, 26, 28, 89, 87, 77, 63, 57, 92, 55, 20, 93, 72, 38, 46, 62, 45, 11, 52, 95, 54, 14, 36, 42, 39, 64, 7, 99, 86, 78, 27, 43, 66, 58, 25, 81, 79, 41, 90, 13, 73, 67, 32, 44, 23, 34, 29, 6, and 35 were used. In set 8, genes 8, 53, 3, 33, 84, 61, 74, 98, 31, 9, 55, 62, 4, 88, 27, 50, 85, 34, 69, 83, 99, 17, 25, 19, 40, 90, 45, 30, 28, 92, 93, 75, 95, 37, 6, 24, 79, 96, 70, 60, 91, 52, 89, 49, 10, 100, 39, 77, 41, 23, 29, 20, 22, 5, 16, 59, 21, 46, 80, 32, 73, 72, 2, 26, and 48 were used. In set 9, genes 98, 82, 24, 35, 25, 93, 5, 56, 76, 96, 2, 78, 40, 13, 83, 86, 92, 77, 81, 29, 58, 99, 97, 80, 18, 27, 1, 65, 14, 16, 59, 20, 26, 67, 32, 22, 90, 37, 85, 7, 41, 34, 4, 68, 45, 12, 79, 62, 17, 75, 84, 91, 54, 72, 57, 10, 95, 44, 52, 9, 28, 89, 100, 33, and 21 were used. In set 10, genes 96, 40, 22, 50, 75, 38, 98, 89, 55, 60, 86, 18, 87, 85, 49, 2, 57, 73, 33, 29, 59, 42, 63, 68, 62, 92, 74, 53, 8, 7, 51, 71, 11, 30, 83, 56, 77, 81, 79, 16, 37, 69, 61, 64, 27, 67, 25, 100, 31, 3, 13, 4, 12, 21, 65, 99, 36, 66, 6, 94, 44, 35, 72, 95, and 90 were used.
[0142] For 70 genes, set 1, genes 36, 6, 100, 39, 37, 3, 27, 45, 93, 19, 89, 43, 68, 9, 60, 46, 51, 80, 32, 52, 62, 35, 58, 14, 10, 33, 85, 12, 64, 67, 75, 86, 17, 44, 83, 24, 87, 84, 23, 96, 79, 20, 13, 8, 11, 76, 88, 56, 38, 98, 29, 16, 99, 2, 66, 30, 48, 26, 5, 25, 78, 42, 47, 94, 15, 4, 55, 65, 97, and 71 were used. In set 2, genes 96, 98, 38, 32, 52, 25, 31, 14, 91, 53, 8, 94, 49, 27, 69, 20, 44, 4, 92, 56, 61, 97, 18, 65, 66, 54, 21, 3, 29, 79, 80, 70, 77, 50, 39, 99, 58, 23, 85, 51, 15, 72, 33, 19, 24, 68, 7, 41, 81, 64, 57, 73, 84, 46, 22, 74, 11, 45, 55, 82, 6, 47, 59, 42, 88, 9, 16, 34, 83, and 30 were used. In set 3, genes 27, 46, 30, 54, 47, 94, 26, 38, 73, 31, 43, 8, 50, 48, 6, 56, 59, 25, 89, 52, 78, 68, 49, 29, 83, 92, 97, 98, 4, 3, 95, 87, 23, 1, 51, 44, 34, 35, 85, 61, 22, 84, 42, 13, 75, 93, 45, 88, 19, 80, 39, 24, 77, 2, 55, 62, 11, 90, 18, 81, 57, 20, 96, 28, 7, 70, 86, 5, 63, and 69 were used. In set 4, genes 65, 29, 88, 19, 42, 30, 15, 16, 74, 53, 25, 8, 95, 5, 69, 99, 59, 67, 84, 14, 80, 12, 37, 13, 71, 39, 43, 100, 60, 79, 51, 11, 45, 82, 83, 61, 62, 90, 6, 20, 2, 18, 97, 1, 48, 81, 35, 87, 56, 36, 93, 41, 54, 46, 10, 27, 47, 33, 55, 64, 26, 57, 85, 89, 9, 96, 72, 68, 23, and 32 were used. In set 5, genes 25, 41, 56, 91, 19, 22, 63, 39, 59, 83, 7, 74, 20, 86, 84, 2, 43, 73, 69, 58, 35, 26, 23, 42, 29, 10, 13, 77, 16, 72, 71, 81, 40, 66, 80, 50, 12, 48, 64, 100, 24, 94, 97, 57, 98, 68, 78, 92, 53, 31, 45, 38, 61, 75, 5, 1, 44, 99, 3, 36, 88, 34, 21, 17, 15, 89, 37, 51, 85, and 79 were used. In set 6, genes 59, 78, 34, 83, 5, 11, 60, 97, 3, 9, 20, 90, 33, 8, 31, 10, 80, 7, 92, 15, 23, 72, 14, 86, 82, 18, 42, 88, 94, 48, 79, 73, 77, 52, 95, 16, 87, 28, 98, 71, 74, 21, 67, 6, 66, 35, 99, 29, 32, 75, 26, 39, 47, 45, 50, 41, 54, 1, 84, 85, 91, 100, 61, 12, 37, 4, 25, 55, 46, and 13 were used.
In set 7, genes 63, 14, 66, 75, 12, 2, 90, 81, 27, 72, 70, 89, 59, 46, 6, 53, 22, 80, 30, 79, 82, 71, 92, 19, 73, 83, 38, 40, 1, 68, 20, 8, 50, 74, 94, 26, 35, 28, 43, 34, 77, 18, 96, 16, 95, 15, 9, 11, 84, 39, 10, 54, 65, 57, 25, 60, 51, 55, 33, 17, 44, 29, 58, 93, 62, 21, 4, 7, and 78 were used. In set 8, genes 60, 76, 17, 29, 68, 24, 54, 87, 16, 66, 15, 8, 85, 92, 67, 100, 82, 74, 41, 33, 3, 35, 94, 78, 58, 75, 98, 63, 95, 12, 47, 81, 91, 9, 7, 83, 77, 22, 89, 56, 49, 31, 96, 2, 70, 23, 46, 6, 39, 90, 59, 71, 44, 10, 36, 52, 42, 86, 5, 64, 55, 69, 84, 28, 93, 53, 38, 27, 13, and 26 were used. In set 9, genes 21, 24, 41, 29, 92, 30, 51, 31, 83, 71, 37, 23, 11, 53, 14, 93, 45, 69, 52, 56, 70, 68, 3, 79, 26, 58, 66, 15, 50, 95, 16, 2, 4, 5, 28, 42, 34, 9, 82, 6, 63, 44, 87, 32, 59, 80, 55, 96, 54, 89, 22, 94, 36, 46, 40, 86, 98, 38, 67, 85, 35, 60, 25, 1, 78, 61, 17, 64, 7, and 91 were used. In set 10, genes 93, 44, 77, 3, 31, 64, 39, 89, 23, 51, 78, 85, 35, 81, 22, 74, 97, 14, 27, 13, 16, 88, 28, 61, 57, 79, 99, 37, 30, 36, 24, 11, 45, 34, 54, 50, 41, 1, 7, 48, 56, 63, 58, 49, 17, 26, 15, 69, 2, 53, 43, 62, 55, 100, 95, 52, 83, 29, 19, 38, 59, 76, 20, 87, 66, 25, 72, 70, 4, and 73 were used.
[0143] For 75 genes, set 1, genes 73, 40, 56, 32, 59, 42, 70, 12, 100, 6, 28, 11, 43, 55, 5, 64, 80, 99, 23, 57, 18, 82, 60, 61, 31, 81, 14, 3, 91, 76, 86, 19, 26, 83, 38, 29, 8, 36, 69, 85, 96, 27, 47, 10, 35, 39, 94, 24, 62, 34, 54, 65, 25, 90, 51, 67, 41, 46, 33, 1, 37, 49, 9, 71, 13, 21, 44, 2, 98, 52, 84, 20, 74, 93, and 88 were used. In set 2, genes 26, 21, 43, 56, 15, 55, 9, 34, 58, 12, 85, 44, 20, 99, 74, 35, 39, 88, 53, 8, 92, 67, 6, 48, 69, 28, 23, 87, 71, 5, 72, 89, 38, 100, 25, 1, 13, 3, 14, 29, 96, 62, 64, 90, 78, 63, 68, 66, 11, 41, 77, 42, 4, 60, 24, 98, 18, 17, 52, 46, 30, 32, 70, 33, 31, 83, 45, 36, 84, 95, 82, 80, 22, 50, and 73 were used. In set 3, genes 96, 11, 58, 14, 77, 32, 6, 28, 55, 12, 40, 72, 83, 7, 89, 67, 51, 63, 95, 15, 74, 99, 88, 81, 84, 38, 36, 13, 87, 5, 69, 62, 19, 86, 90, 76, 66, 33, 52, 4, 20, 78, 59, 27, 17, 2, 43, 75, 64, 79, 53, 26, 3, 42, 100, 48, 71, 85, 41, 25, 61, 57, 49, 70, 37, 80, 24, 94, 30, 54, 9, 35, 21, 16, and 22 were used. In set 4, genes 48, 31, 73, 90, 10, 100, 32, 56, 83, 38, 93, 7, 53, 8, 79, 15, 63, 5, 92, 76, 58, 59, 35, 67, 2, 98, 23, 37, 24, 94, 25, 9, 46, 36, 82, 40, 89, 27, 34, 71, 84, 97, 86, 6, 21, 54, 22, 72, 17, 44, 26, 57, 64, 11, 91, 75, 80, 95, 62, 88, 51, 39, 99, 69, 43, 68, 42, 52, 16, 4, 30, 77, 81, 60, and 50 were used. In set 5, genes 86, 46, 90, 79, 40, 99, 53, 67, 97, 82, 7, 15, 49, 71, 94, 48, 68, 80, 20, 51, 19, 96, 100, 38, 91, 83, 50, 33, 76, 66, 93, 22, 74, 85, 45, 31, 10, 62, 84, 25, 88, 77, 43, 78, 69, 24, 61, 57, 41, 56, 63, 32, 16, 59, 12, 4, 14, 28, 87, 44, 65, 55, 98, 35, 9, 64, 75, 47, 89, 18, 52, 36, 29, 54, and 81 were used. In set 6, genes 70, 47, 96, 46, 43, 2, 66, 39, 54, 40, 31, 84, 92, 30, 5, 75, 21, 9, 4, 24, 59, 90, 42, 44, 45, 97, 55, 69, 74, 79, 87, 86, 91, 56, 13, 98, 12, 64, 34, 99, 67, 83, 27, 68, 16, 10, 81, 61, 80, 7, 94, 82, 49, 71, 53, 15, 76, 36, 11, 19, 41, 65, 8, 28, 14, 95, 62, 51, 63, 88, 3, 60, 18, 58, and 52 were used. 
In set 7, genes 90, 80, 39, 46, 51, 91, 25, 16, 3, 36, 20, 30, 17, 99, 95, 44, 27, 89, 61, 9, 65, 19, 86, 13, 84, 14, 5, 10, 82, 67, 85, 45, 59, 81, 35, 41, 4, 71, 32, 24, 22, 6, 53, 98, 54, 66, 42, 18, 97, 94, 87, 49, 79, 56, 72, 57, 76, 69, 28, 43, 23, 11, 52, 92, 7, 93, 96, 75, 73, 8, 58, 83, 50, 29, and 68 were used. In set 8, genes 95, 93, 14, 43, 31, 32, 100, 6, 92, 28, 68, 99, 35, 60, 90, 70, 22, 49, 54, 94, 56, 4, 97, 85, 2, 46, 11, 50, 63, 30, 38, 76, 39, 58, 64, 67, 83, 33, 88, 79, 87, 40, 57, 27, 55, 18, 3, 29, 82, 53, 98, 91, 61, 80, 26, 84, 20, 77, 86, 51, 1, 74, 23, 19, 10, 21, 47, 69, 24, 66, 81, 96, 15, 36, and 41 were used. In set 9, genes 33, 41, 48, 68, 53, 45, 30, 79, 23, 70, 86, 13, 71, 92, 58, 1, 77, 26, 61, 81, 69, 14, 73, 88, 44, 87, 74, 9, 4, 12, 20, 75, 60, 57, 55, 82, 22, 94, 46, 65, 16, 19, 52, 40, 59, 66, 64, 28, 96, 91, 93, 39, 72, 5, 98, 6, 3, 62, 24, 36, 49, 31, 47, 90, 35, 89, 84, 99, 32, 11, 56, 17, 83, 51, and 97 were used. In set 10, genes 40, 10, 67, 9, 43, 13, 52, 73, 50, 41, 54, 56, 98, 100, 83, 85, 28, 32, 47, 66, 74, 65, 79, 81, 94, 36, 90, 69, 31, 64, 88, 99, 44, 18, 33, 75, 95, 42, 58, 92, 15, 53, 97, 34, 63, 30, 24, 3, 45, 29, 82, 48, 17, 14, 26, 49, 93, 27, 87, 6, 57, 39, 68, 12, 70, 4, 25, 91, 11, 89, 21, 23, 96, 84, and 46 were used.
[0144] For 80 genes, set 1, genes 75, 2, 91, 94, 19, 31, 43, 50, 96, 49, 29, 14, 93, 58, 69, 82, 28, 6, 65, 26, 66, 40, 64, 34, 33, 53, 13, 4, 37, 80, 57, 59, 1, 87, 11, 16, 83, 21, 35, 52, 25, 99, 45, 46, 36, 89, 88, 7, 39, 55, 90, 72, 17, 9, 85, 44, 22, 56, 8, 23, 18, 77, 12, 10, 48, 97, 61, 74, 92, 81, 95, 68, 47, 71, 62, 24, 70, 20, 79, and 32 were used. In set 2, genes 1, 34, 89, 27, 22, 77, 28, 35, 11, 7, 39, 21, 46, 49, 74, 43, 13, 75, 14, 65, 73, 92, 19, 66, 29, 81, 88, 78, 40, 32, 12, 71, 9, 44, 23, 70, 45, 10, 98, 48, 68, 55, 82, 5, 56, 59, 15, 95, 33, 99, 87, 85, 18, 97, 100, 83, 53, 63, 6, 2, 37, 17, 67, 62, 50, 42, 25, 94, 31, 69, 90, 84, 64, 16, 57, 51, 54, 80, 86, and 38 were used. In set 3, genes 63, 28, 35, 67, 96, 9, 12, 31, 1, 59, 22, 44, 11, 82, 6, 64, 87, 47, 21, 94, 42, 2, 72, 19, 20, 27, 89, 13, 77, 3, 16, 79, 38, 10, 80, 52, 50, 33, 25, 4, 30, 40, 32, 36, 8, 43, 26, 51, 18, 66, 61, 68, 56, 74, 53, 7, 73, 88, 49, 23, 46, 76, 92, 93, 83, 70, 24, 98, 97, 58, 65, 29, 55, 91, 95, 90, 5, 69, 86, and 78 were used. In set 4, genes 79, 72, 68, 31, 42, 95, 78, 36, 10, 34, 59, 91, 46, 40, 82, 1, 44, 4, 69, 3, 17, 43, 35, 63, 18, 13, 77, 81, 67, 26, 60, 86, 25, 61, 89, 76, 55, 27, 22, 29, 20, 11, 7, 30, 54, 39, 62, 8, 74, 28, 71, 12, 38, 65, 66, 64, 21, 9, 56, 16, 88, 99, 96, 32, 94, 51, 90, 37, 87, 92, 97, 70, 41, 57, 50, 45, 83, 24, 48, and 58 were used. 
In set 5, genes 100, 69, 33, 24, 83, 84, 97, 22, 40, 45, 17, 3, 43, 52, 50, 30, 8, 99, 9, 46, 7, 14, 35, 61, 15, 16, 64, 6, 23, 41, 60, 63, 96, 98, 38, 36, 49, 13, 76, 85, 87, 71, 66, 56, 80, 20, 34, 29, 57, 91, 81, 78, 27, 88, 37, 94, 51, 5, 1, 74, 44, 70, 58, 25, 19, 89, 39, 47, 65, 62, 68, 95, 18, 75, 79, 59, 2, 10, 73, and 53 were used. In set 6, genes 69, 100, 3, 35, 58, 56, 96, 43, 39, 50, 61, 36, 71, 95, 30, 18, 90, 63, 21, 31, 94, 46, 44, 23, 7, 10, 88, 49, 9, 53, 25, 54, 2, 97, 82, 75, 68, 48, 26, 91, 70, 65, 51, 19, 84, 29, 47, 12, 99, 85, 20, 16, 5, 22, 73, 93, 92, 89, 62, 81, 77, 41, 83, 1, 72, 27, 15, 79, 67, 37, 11, 64, 87, 86, 80, 74, 55, 8, 13, and 60 were used. In set 7, genes 67, 73, 85, 95, 92, 60, 29, 28, 24, 90, 72, 71, 37, 76, 27, 78, 53, 34, 98, 70, 87, 33, 5, 41, 42, 68, 62, 82, 100, 96, 69, 65, 6, 91, 21, 38, 3, 80, 25, 75, 31, 52, 79, 20, 84, 83, 19, 86, 57, 9, 77, 58, 64, 97, 14, 8, 50, 2, 51, 94, 56, 46, 35, 93, 7, 39, 1, 88, 59, 17, 48, 74, 32, 81, 99, 16, 11, 49, 13, and 30 were used. In set 8, genes 80, 52, 14, 42, 21, 76, 32, 69, 30, 60, 86, 61, 48, 24, 67, 92, 16, 75, 93, 2, 6, 99, 20, 73, 9, 97, 98, 56, 47, 12, 35, 26, 36, 41, 96, 55, 11, 84, 7, 87, 4, 70, 79, 88, 44, 17, 50, 27, 89, 28, 29, 43, 77, 39, 8, 15, 91, 65, 22, 71, 53, 37, 34, 95, 83, 45, 68, 1, 18, 13, 31, 85, 3, 90, 51, 49, 19, 66, 63, and 54 were used. In set 9, genes 91, 22, 68, 85, 53, 89, 10, 77, 97, 4, 7, 33, 46, 51, 14, 76, 82, 62, 17, 3, 65, 70, 84, 75, 31, 50, 73, 63, 19, 52, 42, 26, 23, 47, 96, 2, 64, 56, 9, 54, 38, 93, 13, 90, 86, 8, 59, 57, 79, 28, 21, 88, 5, 66, 1, 94, 55, 35, 15, 87, 74, 32, 27, 92, 72, 18, 69, 80, 37, 67, 71, 34, 95, 99, 40, 83, 30, 81, 48, and 39 were used.
In set 10, genes 92, 76, 86, 5, 20, 1, 48, 42, 62, 29, 12, 7, 37, 46, 47, 82, 32, 66, 97, 77, 56, 91, 30, 80, 36, 72, 17, 31, 2, 81, 23, 28, 51, 55, 98, 40, 95, 13, 10, 58, 33, 21, 14, 74, 85, 88, 22, 75, 94, 27, 43, 3, 100, 61, 67, 4, 25, 6, 44, 60, 24, 93, 63, 89, 70, 41, 15, 11, 53, 87, 16, 65, 52, 68, 57, 99, 50, 45, 71, and 38 were used.
[0145] For 85 genes, set 1, genes 38, 35, 85, 59, 17, 7, 31, 58, 96, 97, 16, 70, 82, 42, 21, 54, 88, 34, 63, 4, 27, 29, 3, 19, 69, 36, 9, 99, 74, 86, 76, 24, 15, 81, 73, 93, 40, 52, 26, 57, 37, 87, 55, 90, 41, 79, 45, 77, 91, 71, 61, 11, 94, 83, 25, 48, 1, 5, 8, 22, 33, 46, 60, 56, 20, 44, 89, 18, 10, 23, 78, 65, 50, 72, 75, 47, 98, 28, 66, 68, 32, 12, 51, 13, and 100 were used. In set 2, genes 32, 90, 94, 21, 77, 63, 17, 27, 62, 41, 35, 81, 100, 14, 45, 69, 3, 75, 34, 76, 65, 15, 95, 86, 39, 92, 89, 24, 57, 4, 54, 50, 58, 88, 5, 56, 22, 59, 6, 52, 28, 1, 9, 40, 98, 99, 91, 19, 8, 23, 96, 2, 73, 67, 7, 25, 53, 12, 44, 18, 13, 87, 60, 49, 93, 55, 20, 72, 42, 66, 30, 80, 33, 26, 64, 46, 84, 31, 70, 61, 71, 83, 38, 36, and 29 were used. In set 3, genes 88, 20, 1, 58, 53, 32, 65, 34, 50, 75, 71, 36, 59, 39, 30, 61, 8, 62, 14, 3, 94, 66, 35, 37, 17, 47, 77, 60, 4, 80, 74, 28, 97, 87, 93, 33, 64, 48, 29, 18, 49, 21, 56, 69, 22, 25, 43, 54, 91, 7, 81, 79, 12, 85, 96, 40, 63, 52, 82, 86, 41, 24, 44, 84, 70, 6, 15, 38, 57, 16, 55, 90, 76, 42, 51, 23, 11, 67, 45, 98, 19, 10, 27, 2, and 31 were used. In set 4, genes 64, 86, 54, 83, 47, 21, 67, 57, 73, 23, 71, 76, 56, 9, 44, 75, 82, 11, 8, 99, 72, 13, 79, 28, 92, 5, 27, 90, 24, 91, 33, 68, 51, 60, 94, 58, 78, 48, 18, 42, 53, 98, 70, 32, 41, 49, 45, 6, 30, 63, 95, 80, 36, 87, 97, 65, 77, 3, 26, 35, 59, 40, 84, 17, 61, 81, 39, 46, 22, 1, 2, 50, 25, 69, 4, 43, 15, 29, 20, 17, 88, 10, 38, 100, and 19 were used. In set 5, genes 11, 92, 15, 42, 33, 19, 6, 57, 23, 87, 31, 5, 30, 21, 54, 51, 14, 68, 97, 34, 59, 24, 20, 50, 29, 65, 13, 80, 16, 73, 8, 25, 47, 55, 27, 45, 100, 96, 85, 38, 37, 81, 44, 4, 9, 70, 98, 77, 48, 35, 28, 79, 41, 71, 86, 61, 2, 49, 60, 67, 66, 69, 72, 3, 83, 26, 1, 89, 17, 39, 52, 10, 32, 75, 82, 99, 40, 95, 90, 53, 22, 91, 62, 78, and 56 were used. 
In set 6, genes 87, 12, 4, 63, 15, 81, 92, 10, 74, 44, 7, 23, 89, 93, 28, 59, 50, 72, 30, 60, 54, 71, 39, 12, 21, 85, 40, 37, 68, 64, 97, 66, 52, 67, 98, 91, 1, 83, 61, 6, 24, 38, 86, 77, 26, 88, 43, 100, 48, 20, 14, 31, 82, 9, 13, 62, 55, 45, 57, 11, 27, 90, 25, 80, 17, 5, 94, 42, 53, 49, 29, 99, 78, 2, 84, 73, 58, 75, 18, 19, 65, 3, 47, 41, and 36 were used. In set 7, genes 56, 38, 23, 74, 34, 99, 93, 4, 13, 18, 61, 49, 20, 5, 76, 88, 91, 31, 78, 32, 1, 89, 12, 16, 51, 54, 81, 70, 86, 97, 66, 19, 59, 39, 8, 80, 73, 35, 71, 77, 24, 53, 68, 33, 62, 69, 43, 41, 15, 94, 44, 52, 29, 100, 55, 36, 27, 25, 67, 21, 96, 30, 42, 92, 11, 3, 45, 63, 72, 57, 47, 46, 75, 90, 2, 48, 14, 6, 9, 87, 22, 98, 95, 84, and 65 were used. In set 8, genes 79, 64, 71, 18, 37, 40, 54, 34, 26, 65, 39, 67, 14, 62, 95, 11, 49, 92, 59, 48, 6, 12, 57, 9, 20, 81, 16, 50, 38, 33, 100, 47, 63, 3, 84, 87, 35, 98, 56, 93, 66, 23, 2, 29, 90, 78, 85, 60, 19, 72, 97, 36, 13, 94, 25, 45, 41, 27, 69, 52, 8, 68, 46, 30, 1, 96, 7, 83, 80, 4, 99, 15, 76, 10, 58, 89, 88, 51, 55, 82, 53, 28, 44, 73, and 77 were used. In set 9, genes 35, 85, 81, 4, 20, 88, 66, 74, 13, 36, 6, 24, 95, 97, 2, 21, 90, 57, 89, 42, 73, 79, 64, 59, 46, 68, 92, 67, 82, 28, 56, 14, 65, 99, 39, 38, 8, 62, 61, 78, 11, 48, 93, 91, 29, 33, 76, 16, 69, 47, 84, 94, 7, 54, 30, 32, 23, 70, 52, 43, 51, 41, 60, 100, 27, 63, 75, 77, 80, 5, 3, 44, 10, 87, 40, 71, 37, 72, 1, 53, 22, 83, 49, 17, and 34 were used. In set 10, genes 23, 39, 86, 48, 65, 73, 24, 27, 61, 37, 99, 64, 58, 74, 3, 22, 57, 60, 13, 93, 44, 100, 66, 69, 38, 83, 6, 81, 59, 36, 68, 95, 71, 70, 84, 62, 96, 26, 30, 32, 20, 54, 80, 19, 97, 16, 4, 77, 12, 5, 35, 29, 18, 52, 53, 87, 98, 90, 10, 75, 72, 55, 50, 88, 28, 34, 41, 94, 11, 76, 7, 45, 31, 46, 49, 9, 82, 17, 79, 1, 25, 40, 67, 47, and 85 were used.
[0146] For 90 genes, set 1, genes 79, 27, 100, 96, 11, 32, 63, 42, 68, 13, 65, 88, 75, 64, 82, 72, 37, 45, 98, 2, 90, 94, 1, 87, 73, 86, 69, 92, 3, 25, 29, 84, 60, 50, 39, 4, 95, 47, 12, 10, 33, 22, 77, 71, 57, 97, 38, 89, 91, 53, 51, 9, 67, 44, 7, 78, 34, 85, 15, 41, 54, 49, 62, 76, 83, 46, 59, 23, 24, 8, 14, 26, 30, 52, 18, 6, 66, 31, 20, 93, 36, 16, 61, 28, 74, 43, 56, and 48 were used. In set 2, genes 95, 28, 46, 62, 91, 99, 53, 65, 66, 60, 22, 29, 50, 2, 93, 33, 54, 57, 92, 24, 9, 4, 69, 5, 8, 58, 88, 43, 6, 100, 51, 18, 16, 45, 81, 44, 68, 14, 59, 82, 63, 73, 30, 86, 98, 13, 84, 94, 1, 55, 38, 83, 3, 37, 11, 89, 77, 85, 26, 97, 12, 21, 40, 96, 56, 41, 10, 42, 64, 17, 76, 27, 49, 20, 87, 34, 75, 15, 74, 35, 19, 31, 39, 48, 23, 67, 78, 32, 7, and 80 were used. In set 3, genes 88, 89, 6, 94, 17, 60, 8, 76, 45, 90, 47, 80, 15, 85, 51, 5, 46, 36, 65, 4, 25, 67, 78, 77, 97, 23, 11, 40, 61, 53, 39, 12, 38, 21, 59, 55, 32, 34, 71, 69, 20, 50, 93, 3, 30, 29, 75, 73, 49, 98, 58, 43, 18, 95, 42, 82, 66, 16, 33, 37, 92, 52, 56, 41, 87, 99, 74, 24, 86, 48, 81, 57, 83, 26, 79, 68, 13, 63, 72, 9, 70, 14, 54, 100, 64, 19, 96, 7, 31, and 2 were used. In set 4, genes 19, 33, 41, 40, 70, 51, 14, 48, 42, 12, 90, 4, 32, 60, 89, 64, 45, 86, 73, 16, 50, 5, 9, 72, 81, 3, 27, 87, 76, 58, 29, 31, 13, 21, 55, 18, 6, 62, 56, 96, 47, 63, 37, 98, 28, 91, 36, 82, 39, 100, 68, 25, 88, 11, 93, 35, 66, 24, 43, 59, 8, 65, 74, 30, 10, 22, 17, 99, 49, 44, 26, 54, 2, 80, 94, 57, 71, 38, 67, 79, 75, 77, 23, 85, 61, 52, 83, 7, 78, and 53 were used. In set 5, genes 49, 55, 13, 97, 59, 83, 61, 34, 80, 19, 12, 65, 86, 72, 89, 25, 39, 77, 82, 47, 22, 48, 20, 11, 23, 84, 31, 4, 54, 91, 8, 87, 33, 14, 32, 45, 68, 27, 51, 28, 96, 1, 100, 92, 37, 29, 64, 15, 7, 98, 60, 53, 17, 69, 24, 75, 81, 74, 5, 18, 26, 78, 62, 94, 88, 46, 73, 44, 63, 52, 9, 93, 76, 6, 95, 99, 42, 50, 66, 38, 90, 70, 35, 57, 85, 58, 16, 43, 30, and 10 were used. 
In set 6, genes 81, 52, 60, 16, 18, 40, 67, 47, 58, 51, 26, 5, 53, 34, 24, 68, 14, 43, 49, 69, 99, 73, 29, 96, 37, 62, 66, 38, 88, 48, 11, 50, 79, 74, 15, 39, 83, 57, 94, 95, 100, 12, 84, 10, 33, 3, 93, 91, 17, 46, 59, 86, 7, 9, 71, 19, 22, 80, 27, 97, 4, 75, 89, 21, 78, 85, 63, 61, 77, 31, 32, 56, 6, 72, 92, 55, 76, 90, 36, 35, 98, 1, 82, 25, 23, 44, 65, 64, 28, and 42 were used. In set 7, genes 51, 1, 54, 94, 93, 56, 22, 29, 53, 67, 88, 82, 16, 44, 65, 21, 14, 35, 48, 91, 12, 97, 31, 74, 6, 99, 86, 26, 28, 19, 72, 58, 24, 34, 5, 38, 81, 11, 49, 39, 3, 89, 75, 64, 96, 52, 59, 69, 42, 78, 33, 100, 2, 25, 66, 77, 90, 40, 71, 9, 4, 57, 13, 36, 10, 50, 17, 87, 15, 47, 60, 46, 63, 68, 70, 23, 80, 37, 30, 92, 7, 32, 27, 43, 98, 84, 8, 61, 73, and 41 were used. In set 8, genes 53, 63, 17, 43, 6, 44, 95, 58, 78, 13, 3, 15, 28, 41, 12, 93, 2, 92, 23, 42, 62, 57, 33, 8, 65, 49, 80, 81, 50, 71, 74, 39, 4, 70, 77, 51, 84, 21, 30, 36, 46, 75, 47, 94, 16, 67, 55, 1, 26, 52, 60, 19, 59, 90, 96, 14, 87, 37, 40, 66, 88, 73, 29, 10, 5, 56, 100, 45, 31, 34, 22, 64, 91, 54, 48, 25, 98, 61, 18, 72, 69, 27, 68, 99, 83, 35, 24, 82, 85, and 38 were used. In set 9, genes 62, 91, 49, 28, 69, 38, 19, 35, 89, 3, 24, 79, 32, 12, 47, 40, 39, 50, 86, 6, 44, 65, 33, 70, 16, 41, 21, 53, 72, 74, 87, 14, 51, 7, 60, 67, 100, 42, 93, 36, 2, 57, 76, 20, 25, 27, 95, 18, 73, 97, 54, 99, 63, 66, 96, 22, 77, 56, 90, 81, 61, 17, 48, 23, 15, 4, 30, 45, 59, 8, 71, 52, 85, 92, 46, 98, 64, 94, 75, 83, 13, 26, 43, 84, 5, 1, 29, 68, 82, and 31 were used. In set 10, genes 45, 10, 63, 9, 18, 7, 70, 50, 22, 52, 91, 88, 5, 38, 17, 80, 54, 92, 20, 19, 24, 8, 13, 40, 15, 21, 87, 72, 12, 14, 2, 53, 46, 93, 4, 44, 99, 76, 47, 32, 60, 27, 23, 81, 78, 68, 36, 71, 64, 30, 95, 82, 90, 26, 74, 86, 100, 89, 62, 37, 66, 35, 83, 94, 31, 43, 65, 84, 11, 67, 25, 33, 61, 79, 97, 16, 75, 73, 98, 57, 28, 59, 1, 96, 51, 41, 69, 3, 56, and 55 were used.
[0147] For 95 genes, set 1, genes 35, 64, 32, 25, 20, 69, 88, 42, 97, 6, 23, 86, 98, 93, 16, 44, 53, 51, 91, 21, 70, 73, 31, 81, 74, 14, 29, 66, 4, 87, 11, 94, 52, 95, 56, 63, 18, 8, 78, 100, 62, 99, 39, 89, 17, 50, 71, 10, 90, 65, 84, 83, 60, 48, 22, 5, 92, 13, 15, 24, 27, 37, 57, 33, 38, 82, 3, 9, 30, 1, 34, 7, 40, 68, 67, 58, 28, 47, 46, 19, 12, 43, 41, 61, 76, 96, 72, 36, 75, 54, 45, 80, 49, 79, and 55 were used. In set 2, genes 58, 44, 39, 62, 1, 19, 61, 33, 84, 36, 91, 21, 53, 30, 63, 35, 92, 45, 11, 87, 10, 82, 96, 64, 8, 32, 42, 78, 69, 59, 24, 72, 48, 66, 15, 27, 49, 75, 40, 47, 57, 52, 31, 95, 97, 94, 26, 5, 93, 34, 60, 81, 88, 29, 23, 67, 76, 6, 98, 37, 74, 43, 100, 20, 18, 12, 13, 51, 41, 54, 14, 2, 68, 99, 3, 38, 70, 77, 50, 4, 17, 22, 9, 83, 71, 85, 25, 79, 46, 86, 7, 73, 16, 65, and 28 were used. In set 3, genes 15, 4, 25, 94, 92, 77, 78, 70, 17, 52, 36, 23, 44, 98, 39, 99, 59, 50, 75, 16, 82, 48, 18, 90, 10, 72, 8, 34, 9, 19, 1, 57, 93, 46, 54, 69, 32, 21, 81, 91, 28, 38, 68, 3, 41, 47, 87, 63, 24, 13, 84, 5, 65, 67, 74, 62, 85, 12, 53, 30, 73, 51, 2, 80, 29, 26, 83, 43, 55, 86, 88, 89, 35, 66, 31, 96, 100, 58, 60, 14, 6, 61, 49, 22, 20, 27, 7, 64, 37, 45, 97, 95, 40, 71, and 11 were used. In set 4, genes 21, 78, 42, 23, 84, 10, 64, 36, 48, 26, 79, 71, 72, 39, 49, 56, 44, 20, 47, 82, 63, 1, 91, 2, 8, 40, 96, 18, 68, 9, 57, 28, 100, 89, 60, 75, 70, 73, 25, 15, 46, 85, 86, 97, 32, 94, 65, 90, 74, 98, 16, 45, 3, 6, 31, 77, 41, 11, 12, 35, 95, 93, 53, 50, 30, 61, 81, 92, 80, 54, 13, 38, 58, 14, 52, 22, 76, 83, 5, 17, 37, 69, 66, 87, 19, 88, 51, 34, 59, 99, 24, 33, 27, 4, and 62 were used.
In set 5, genes 29, 34, 28, 58, 89, 1, 73, 30, 92, 76, 68, 33, 38, 8, 49, 3, 42, 9, 40, 36, 43, 81, 97, 59, 7, 79, 54, 15, 11, 61, 18, 82, 100, 41, 52, 23, 31, 13, 57, 66, 65, 27, 72, 44, 16, 69, 39, 26, 2, 55, 71, 80, 86, 77, 12, 25, 14, 50, 88, 22, 93, 51, 75, 64, 47, 62, 96, 10, 35, 5, 67, 60, 32, 84, 94, 48, 56, 90, 95, 83, 21, 6, 37, 91, 46, 70, 24, 87, 85, 17, 98, 99, 45, 19, and 63 were used. In set 6, genes 36, 34, 46, 2, 5, 77, 91, 59, 61, 29, 9, 85, 52, 16, 17, 60, 51, 95, 69, 58, 57, 23, 82, 33, 18, 45, 43, 49, 90, 1, 94, 93, 47, 37, 35, 63, 27, 96, 32, 15, 25, 86, 55, 24, 26, 71, 48, 7, 28, 79, 11, 44, 76, 3, 68, 88, 62, 73, 54, 39, 22, 13, 75, 19, 66, 98, 70, 10, 83, 100, 42, 31, 38, 4, 92, 78, 99, 97, 56, 21, 20, 6, 72, 40, 65, 67, 53, 30, 8, 14, 84, 50, 12, 80, and 81 were used. In set 7, genes 26, 7, 14, 64, 91, 50, 8, 48, 23, 29, 34, 28, 9, 20, 74, 97, 27, 63, 25, 66, 60, 43, 92, 61, 58, 46, 68, 49, 21, 98, 2, 41, 52, 1, 51, 77, 53, 69, 36, 93, 62, 55, 17, 38, 31, 40, 76, 54, 71, 5, 99, 83, 82, 78, 42, 15, 24, 70, 84, 100, 73, 10, 59, 33, 96, 4, 56, 3, 94, 75, 90, 13, 32, 65, 89, 79, 19, 30, 11, 87, 37, 95, 12, 6, 88, 80, 18, 47, 81, 72, 44, 16, 86, 85, and 67 were used. In set 8, genes 24, 84, 92, 71, 56, 68, 93, 67, 59, 75, 85, 35, 72, 86, 39, 46, 65, 51, 23, 100, 8, 37, 70, 69, 57, 27, 17, 87, 44, 1, 2, 50, 9, 91, 63, 29, 95, 3, 5, 40, 96, 47, 54, 64, 66, 18, 28, 13, 14, 36, 80, 21, 12, 61, 48, 26, 88, 83, 7, 43, 42, 97, 99, 41, 10, 16, 94, 53, 45, 98, 15, 73, 89, 55, 74, 81, 20, 90, 79, 34, 38, 82, 76, 4, 60, 33, 31, 78, 58, 62, 22, 6, 52, 49, and 19 were used.
In set 9, genes 99, 77, 10, 92, 24, 43, 41, 15, 46, 78, 38, 19, 2, 5, 3, 81, 82, 22, 56, 63, 47, 90, 33, 34, 75, 100, 62, 65, 13, 30, 95, 98, 94, 25, 67, 11, 6, 66, 14, 48, 93, 4, 21, 89, 35, 68, 97, 45, 27, 59, 76, 85, 42, 49, 23, 40, 37, 74, 26, 52, 8, 91, 53, 57, 58, 86, 31, 20, 9, 16, 84, 69, 96, 44, 32, 54, 60, 7, 51, 83, 72, 28, 29, 61, 80, 55, 64, 17, 18, 70, 50, 1, 12, 73, and 39 were used. In set 10, genes 76, 1, 12, 25, 77, 24, 100, 17, 66, 65, 26, 29, 60, 91, 63, 52, 6, 30, 8, 72, 82, 68, 15, 16, 54, 43, 59, 34, 89, 20, 44, 87, 70, 56, 3, 28, 74, 86, 7, 2, 33, 35, 46, 67, 58, 22, 49, 21, 75, 14, 27, 64, 90, 42, 73, 36, 97, 40, 11, 37, 51, 19, 83, 45, 47, 50, 55, 23, 80, 61, 95, 71, 78, 32, 81, 93, 98, 62, 92, 99, 9, 4, 53, 84, 18, 13, 41, 57, 88, 5, 79, 38, 39, 31, and 94 were used.
[0148] Classification of subsets of the 39 tumor types was performed using random selections of tumor types from the group of 39. The expression levels of gene sequence sets as described herein were used to classify random combinations of tumor types. Different random sets of tumor types were used with each of the sets of 100, 74, and 90 gene sequences described in these examples. Representative, non-limiting examples of random sets of from 2 to 20 tumor types are as follows, where the 39 tumor types were indexed from 1 to 39.
[0149] For 2 tumor types, set 1 used types 26 and 16. Set 2 used types 8 and 5. Set 3 used types 39 and 8. Set 4 used types 27 and 23. Set 5 used types 8 and 19. Set 6 used types 12 and 21. Set 7 used types 30 and 15. Set 8 used types 30 and 5. Set 9 used types 18 and 22. Set 10 used types 27 and 26.
[0150] For 4 tumor types, set 1 used types 20, 35, 15 and 7. Set 2 used types 36, 1, 28 and 19. Set 3 used types 13, 4, 12 and 21. Set 4 used types 12, 33, 14 and 28. Set 5 used types 6, 28, 5 and 37. Set 6 used types 5, 25, 36 and 15. Set 7 used types 12, 26, 21 and 19. Set 8 used types 19, 3, 20 and 17. Set 9 used types 18, 10, 8 and 9. Set 10 used types 28, 20, 2 and 22.
[0151] For 6 tumor types, set 1 used types 27, 3, 10, 39, 11 and 20. Set 2 used types 33, 10, 20, 32, 13 and 19. Set 3 used types 31, 27, 18, 39, 8 and 16. Set 4 used types 25, 28, 10, 12, 7 and 39. Set 5 used types 14, 13, 28, 24, 30 and 36. Set 6 used types 9, 24, 8, 17, 36 and 26. Set 7 used types 20, 1, 34, 26, 6 and 19. Set 8 used types 12, 13, 3, 17, 34 and 22. Set 9 used types 7, 1, 17, 13, 20 and 34. Set 10 used types 5, 11, 25, 29, 28 and 35.
[0152] For 8 tumor types, set 1 used types 34, 33, 28, 3, 23, 25, 9 and 29. Set 2 used types 27, 8, 38, 28, 20, 14, 12 and 9. Set 3 used types 29, 21, 19, 1, 13, 26, 11 and 31. Set 4 used types 25, 17, 7, 20, 34, 8, 28 and 10. Set 5 used types 36, 28, 35, 26, 2, 8, 29 and 7. Set 6 used types 10, 23, 2, 27, 33, 21, 25 and 35. Set 7 used types 10, 18, 38, 2, 6, 7, 19 and 32. Set 8 used types 11, 37, 6, 28, 3, 9, 2 and 16. Set 9 used types 22, 2, 10, 8, 17, 19 and 33. Set 10 used types 35, 39, 8, 10, 37, 4, 36 and 6.
[0153] For 10 tumor types, set 1 used types 25, 10, 26, 2, 32, 31, 39, 23, 22 and 18. Set 2 used types 12, 35, 6, 16, 20, 3, 39, 36, 11 and 2. Set 3 used types 34, 1, 15, 29, 5, 39, 2, 12, 25 and 18. Set 4 used types 10, 8, 14, 18, 31, 19, 23, 20, 32 and 33. Set 5 used types 10, 18, 37, 15, 4, 35, 33, 24, 39 and 20. Set 6 used types 22, 16, 4, 3, 18, 21, 1, 25, 37 and 13. Set 7 used types 14, 6, 28, 18, 11, 13, 2, 32, 33 and 19. Set 8 used types 39, 2, 38, 4, 34, 8, 25, 6, 32 and 35. Set 9 used types 3, 10, 11, 16, 6, 15, 18, 14, 12 and 26. Set 10 used types 24, 25, 21, 9, 36, 29, 20, 39, 10 and 37.
[0154] For 12 tumor types, set 1 used types 26, 20, 4, 12, 2, 31, 38, 18, 16, 39, 3 and 33. Set 2 used types 25, 16, 4, 9, 29, 27, 14, 24, 21, 7, 23 and 2. Set 3 used types 31, 18, 23, 13, 25, 1, 29, 21, 35, 10, 32 and 39. Set 4 used types 8, 34, 23, 9, 35, 14, 25, 21, 2, 33, 18 and 28. Set 5 used types 6, 11, 21, 8, 5, 7, 19, 32, 3, 13, 36 and 9. Set 6 used types 12, 33, 14, 26, 27, 15, 2, 21, 36, 35, 9 and 39. Set 7 used types 26, 29, 32, 17, 31, 19, 6, 5, 20, 34, 2 and 24. Set 8 used types 17, 12, 8, 22, 28, 9, 27, 29, 14, 35, 4 and 32. Set 9 used types 29, 9, 36, 23, 33, 18, 21, 35, 3, 6, 2 and 1. Set 10 used types 1, 3, 35, 29, 22, 27, 8, 23, 2, 36, 14 and 19.
[0155] For 14 tumor types, set 1 used types 9, 26, 38, 25, 31, 3, 15, 14, 17, 33, 12, 35, 39 and 16. Set 2 used types 1, 26, 16, 25, 20, 12, 14, 37, 38, 24, 23, 33, 27 and 35. Set 3 used types 11, 21, 35, 38, 32, 34, 27, 39, 16, 15, 4, 5, 13 and 18. Set 4 used types 27, 5, 13, 28, 18, 17, 15, 20, 29, 37, 21, 36, 25 and 14. Set 5 used types 5, 12, 17, 9, 25, 21, 33, 37, 8, 15, 24, 3, 34 and 28. Set 6 used types 11, 19, 34, 26, 9, 6, 32, 14, 27, 29, 30, 16, 24 and 17. Set 7 used types 31, 26, 11, 18, 19, 20, 9, 8, 5, 36, 12, 6, 27 and 38. Set 8 used types 20, 17, 11, 5, 15, 9, 2, 39, 34, 24, 27, 26, 35 and 10. Set 9 used types 1, 14, 39, 30, 17, 6, 10, 35, 31, 33, 15, 29, 32 and 7. Set 10 used types 1, 19, 24, 28, 34, 12, 13, 18, 32, 11, 14, 21, 22 and 25.
[0156] For 16 tumor types, set 1 used types 27, 15, 8, 12, 6, 20, 26, 19, 25, 2, 37, 38, 7, 39, 4 and 33. Set 2 used types 17, 18, 28, 5, 6, 31, 25, 13, 8, 20, 37, 36, 35, 9, 23 and 27. Set 3 used types 23, 37, 34, 14, 16, 27, 32, 33, 21, 38, 4, 30, 24, 22, 17 and 25. Set 4 used types 7, 37, 38, 21, 34, 31, 32, 25, 10, 36, 19, 11, 6, 26, 18 and 35. Set 5 used types 9, 32, 12, 24, 20, 13, 38, 21, 39, 23, 36, 18, 37, 22, 5 and 3. Set 6 used types 14, 21, 5, 17, 6, 20, 18, 35, 22, 10, 3, 23, 13, 2, 34 and 26. Set 7 used types 1, 8, 19, 6, 9, 39, 28, 18, 13, 31, 14, 16, 37, 12, 3 and 25. Set 8 used types 32, 36, 28, 38, 9, 33, 2, 5, 4, 11, 19, 18, 13, 8, 12 and 3. Set 9 used types 9, 14, 10, 5, 28, 32, 23, 6, 39, 3, 17, 8, 19, 1, 31 and 12. Set 10 used types 4, 34, 11, 6, 38, 19, 7, 20, 23, 3, 25, 37, 26, 1, 15 and 12.
[0157] For 18 tumor types, set 1 used types 15, 24, 39, 35, 7, 30, 16, 13, 20, 3, 26, 4, 12, 10, 34, 25, 21 and 28. Set 2 used types 21, 23, 29, 11, 10, 19, 13, 28, 4, 20, 17, 24, 30, 12, 39, 34, 31 and 9. Set 3 used types 7, 17, 27, 6, 30, 8, 22, 2, 32, 26, 21, 14, 4, 38, 1, 35, 16 and 28. Set 4 used types 17, 13, 20, 33, 10, 3, 16, 22, 1, 38, 2, 9, 28, 5, 6, 19, 12 and 11. Set 5 used types 4, 35, 21, 25, 18, 17, 8, 14, 31, 30, 9, 1, 2, 23, 36, 29, 32 and 37. Set 6 used types 17, 34, 2, 18, 19, 15, 16, 13, 4, 24, 5, 35, 6, 22, 28, 37, 38 and 1. Set 7 used types 34, 26, 12, 25, 27, 3, 17, 7, 2, 32, 9, 36, 21, 19, 22, 8, 20 and 29. Set 8 used types 12, 34, 38, 25, 17, 22, 14, 39, 10, 7, 31, 2, 3, 11, 29, 30, 16 and 24. Set 9 used types 13, 26, 27, 14, 5, 10, 8, 7, 16, 30, 37, 4, 6, 35, 28, 1, 36 and 20. Set 10 used types 15, 2, 17, 23, 26, 28, 36, 38, 12, 6, 19, 37, 20, 14, 9, 39, 11 and 21.
[0158] For 20 tumor types, set 1 used types 25, 13, 21, 15, 37, 20, 12, 28, 9, 10, 26, 22, 14, 24, 16, 7, 39, 34, 33 and 4. Set 2 used types 20, 17, 10, 27, 19, 28, 5, 1, 23, 21, 38, 7, 13, 22, 32, 31, 9, 4, 3 and 24. Set 3 used types 17, 13, 7, 20, 11, 38, 34, 3, 15, 12, 5, 39, 9, 10, 4, 35, 27, 6, 21 and 33. Set 4 used types 6, 13, 17, 26, 1, 7, 33, 5, 10, 32, 3, 23, 35, 4, 14, 28, 12, 38, 8 and 27. Set 5 used types 10, 23, 9, 38, 5, 29, 12, 27, 25, 6, 7, 26, 37, 31, 24, 36, 19, 15, 16 and 11. Set 6 used types 30, 24, 21, 11, 23, 25, 8, 9, 7, 31, 27, 5, 14, 29, 1, 19, 16, 12, 22 and 17. Set 7 used types 26, 13, 23, 19, 22, 11, 25, 21, 33, 20, 6, 17, 2, 10, 31, 34, 27, 37, 7 and 9. Set 8 used types 30, 1, 38, 7, 31, 37, 11, 25, 6, 19, 28, 33, 17, 29, 10, 27, 16, 3, 14 and 15. Set 9 used types 15, 19, 26, 24, 5, 33, 11, 2, 13, 18, 31, 22, 32, 20, 23, 6, 10, 25, 36 and 3. Set 10 used types 24, 25, 21, 29, 14, 18, 31, 2, 20, 39, 23, 9, 38, 12, 6, 32, 22, 26, 33 and 7.
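As an illustration only, random tumor-type sets of the kind listed above (sizes 2 to 20 in steps of 2, ten sets per size, types indexed 1 to 39) could be drawn as in the following Python sketch. The function name, the fixed seed, and the use of Python's `random` module are assumptions made for this example; the text does not state how the listed sets were generated.

```python
import random

N_TUMOR_TYPES = 39            # tumor types are indexed 1..39 in the text
SET_SIZES = range(2, 21, 2)   # 2, 4, ..., 20 tumor types per set
SETS_PER_SIZE = 10            # ten representative sets are listed per size


def draw_tumor_type_sets(seed=0):
    """Return {size: [set_1, ..., set_10]} of randomly drawn tumor-type indices.

    Hypothetical helper: the seed and sampling routine are assumptions,
    not the procedure used to produce the sets listed in the text.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    population = range(1, N_TUMOR_TYPES + 1)
    return {
        # sample() draws without replacement, so no type repeats within a set
        size: [rng.sample(population, size) for _ in range(SETS_PER_SIZE)]
        for size in SET_SIZES
    }


tumor_type_sets = draw_tumor_type_sets()
```

Sampling without replacement matches the listed sets, in which no tumor type appears twice within a set.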
Example 4: Specified Gene Sets
[0159] A first set of 74 genes and a second set of 90 genes, where the two sets have 38 members in common, were used in the practice of the invention. The performance of the two sets versus varying numbers of tumor types is shown in FIG. 3.
[0160] Random subsets of 50 to all members of the set of 74 expressed gene sequences were evaluated in a manner analogous to that described in Example 3. Again, the expression levels of random combinations of 50, 55, 60, 65, 70, and all 74 (each combination sampled 10 times) of the 74 expressed sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to all 39 types. The resulting data are shown in FIGS. 4 and 5.
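The resampling scheme just described can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `evaluate` is a hypothetical callback standing in for whatever train-and-predict procedure is used, the particular tumor-type counts in `type_counts` are chosen for illustration, and pairing one gene draw with one tumor-type draw per repeat is one possible reading of "each sampled 10 times".

```python
import random
from statistics import mean


def resample_accuracy(evaluate, n_genes_total=74,
                      gene_counts=(50, 55, 60, 65, 70, 74),
                      n_types_total=39, type_counts=(2, 10, 20, 39),
                      repeats=10, seed=0):
    """Average a score over repeated random gene and tumor-type subsets.

    `evaluate(gene_subset, type_subset)` is a hypothetical callback that
    returns a classification accuracy for the given index subsets.
    """
    rng = random.Random(seed)
    genes = range(1, n_genes_total + 1)   # genes indexed 1..74
    types = range(1, n_types_total + 1)   # tumor types indexed 1..39
    results = {}
    for g in gene_counts:
        for t in type_counts:
            scores = []
            for _ in range(repeats):
                gene_subset = rng.sample(genes, g)   # without replacement
                type_subset = rng.sample(types, t)
                scores.append(evaluate(gene_subset, type_subset))
            results[(g, t)] = mean(scores)
    return results
```

Passing `n_genes_total=90` and gene counts ending in 90 would adapt the same sketch to the second gene set.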
[0161] The members of the 74 gene sequences were indexed from 1 to 74, and representative random sets used in the invention are as follows:
[0162] For 50-genes, set 1, genes 69, 64, 74, 29, 4, 57, 30, 72, 36, 59, 42, 47, 11, 33, 60, 35, 39, 10, 50, 49, 41, 12, 34, 51, 32, 66, 71, 37, 13, 14, 8, 25, 53, 21, 68, 7, 67, 55, 27, 22, 1, 44, 46, 28, 48, 19, 73, 23, 16, and 3 were used. In set 2, genes 60, 61, 23, 17, 10, 31, 16, 8, 72, 73, 18, 49, 71, 46, 29, 21, 66, 39, 22, 27, 43, 30, 51, 3, 38, 19, 37, 35, 70, 54, 40, 2, 55, 28, 45, 33, 25, 14, 48, 20, 36, 47, 62, 9, 69, 68, 53, 58, 15, and 7 were used. In set 3, genes 53, 68, 31, 2, 62, 17, 49, 71, 6, 56, 3, 66, 23, 21, 33, 30, 45, 73, 74, 11, 58, 27, 64, 18, 72, 42, 7, 28, 34, 43, 38, 65, 12, 47, 16, 40, 41, 36, 54, 61, 19, 63, 25, 46, 59, 9, 39, 55, 22, and 48 were used. In set 4, genes 23, 70, 48, 1, 11, 25, 60, 26, 5, 58, 46, 39, 28, 71, 35, 34, 2, 59, 69, 55, 49, 40, 15, 14, 68, 57, 10, 31, 67, 74, 62, 44, 16, 12, 64, 63, 61, 13, 52, 45, 19, 50, 36, 33, 9, 24, 32, 29, 56, and 72 were used. In set 5, genes 30, 26, 10, 34, 67, 73, 15, 59, 3, 64, 14, 70, 23, 47, 72, 71, 44, 49, 31, 48, 5, 61, 53, 20, 33, 58, 37, 50, 43, 18, 21, 38, 29, 16, 12, 63, 39, 4, 45, 60, 69, 25, 24, 65, 55, 13, 36, 11, 17, and 22 were used. In set 6, genes 43, 34, 61, 19, 35, 56, 24, 3, 23, 15, 13, 69, 1, 67, 42, 41, 64, 25, 63, 28, 8, 53, 38, 71, 6, 36, 68, 14, 18, 65, 51, 33, 4, 60, 5, 22, 40, 30, 50, 37, 29, 17, 27, 11, 9, 66, 62, 57, 59, and 10 were used. In set 7, genes 51, 55, 46, 31, 21, 72, 8, 67, 56, 1, 64, 6, 63, 32, 20, 16, 25, 61, 2, 45, 35, 22, 66, 38, 36, 3, 34, 27, 74, 47, 54, 30, 14, 13, 37, 23, 19, 12, 59, 18, 52, 5, 17, 33, 7, 39, 43, 58, 41, and 10 were used. In set 8, genes 28, 68, 71, 46, 48, 47, 5, 23, 22, 35, 60, 3, 40, 33, 41, 72, 12, 24, 15, 37, 1, 20, 45, 53, 61, 65, 74, 4, 10, 51, 26, 30, 38, 44, 55, 73, 66, 6, 39, 52, 36, 2, 59, 67, 27, 43, 50, 18, 8, and 69 were used.
In set 9, genes 73, 51, 67, 63, 24, 55, 42, 61, 13, 29, 23, 64, 49, 53, 19, 2, 43, 11, 15, 31, 58, 40, 38, 46, 44, 4, 27, 41, 28, 69, 8, 26, 5, 68, 37, 70, 25, 62, 22, 52, 1, 57, 54, 34, 16, 71, 9, 65, 14, and 30 were used. In set 10, genes 9, 13, 46, 2, 62, 47, 50, 36, 58, 23, 55, 31, 6, 40, 32, 27, 35, 33, 39, 1, 22, 19, 65, 16, 52, 72, 30, 3, 12, 7, 74, 21, 54, 20, 41, 10, 28, 37, 24, 53, 69, 11, 14, 67, 25, 71, 15, 42, 18, and 73 were used.
[0163] For 55 genes, set 1, genes 19, 3, 26, 44, 16, 59, 11, 39, 46, 54, 22, 7, 60, 30, 72, 6, 74, 53, 57, 14, 43, 47, 27, 45, 37, 24, 33, 64, 21, 36, 20, 50, 68, 62, 63, 17, 61, 10, 70, 18, 25, 71, 29, 65, 51, 56, 58, 69, 5, 55, 12, 1, 40, 49, and 13 were used. In set 2, genes 35, 15, 11, 33, 5, 29, 73, 69, 31, 70, 10, 45, 41, 72, 74, 26, 32, 12, 30, 34, 16, 64, 13, 50, 46, 38, 18, 48, 37, 68, 40, 61, 62, 6, 63, 47, 36, 65, 17, 67, 71, 39, 4, 59, 22, 24, 8, 9, 58, 3, 52, 20, 14, 25, and 7 were used. In set 3, genes 7, 19, 50, 62, 47, 74, 22, 26, 37, 8, 41, 53, 52, 67, 16, 40, 54, 34, 30, 46, 25, 55, 31, 3, 69, 38, 29, 65, 45, 43, 51, 68, 18, 57, 21, 5, 32, 20, 27, 73, 66, 10, 49, 24, 12, 13, 11, 71, 60, 23, 63, 35, 48, 39, and 70 were used. In set 4, genes 58, 70, 43, 68, 39, 57, 71, 27, 21, 53, 16, 23, 25, 60, 40, 36, 2, 63, 33, 49, 5, 54, 32, 66, 50, 59, 14, 52, 15, 48, 45, 44, 19, 72, 26, 10, 6, 41, 34, 61, 42, 67, 17, 24, 8, 11, 29, 74, 3, 51, 47, 65, 69, 28, and 1 were used. In set 5, genes 60, 53, 21, 63, 7, 19, 69, 3, 9, 22, 10, 50, 59, 71, 20, 11, 70, 6, 4, 17, 58, 16, 40, 68, 73, 38, 18, 15, 57, 26, 34, 67, 41, 27, 49, 28, 46, 54, 1, 13, 31, 48, 32, 61, 42, 66, 29, 5, 55, 72, 25, 30, 39, 44, and 56 were used. In set 6, genes 4, 36, 17, 47, 16, 6, 14, 51, 65, 42, 31, 38, 26, 15, 70, 28, 41, 72, 30, 3, 29, 55, 34, 32, 54, 24, 48, 39, 22, 57, 37, 23, 71, 61, 50, 21, 27, 53, 25, 40, 20, 69, 58, 66, 46, 1, 43, 12, 33, 63, 18, 68, 10, 56, and 45 were used. In set 7, genes 71, 7, 38, 61, 22, 33, 51, 25, 68, 6, 1, 49, 9, 58, 18, 55, 5, 50, 65, 52, 26, 59, 35, 11, 15, 70, 54, 27, 60, 28, 19, 63, 21, 10, 32, 42, 73, 36, 45, 66, 47, 2, 56, 23, 64, 44, 34, 29, 48, 69, 37, 16, 74, 53, and 43 were used. In set 8, genes 25, 42, 70, 28, 6, 48, 43, 20, 60, 18, 56, 74, 27, 9, 55, 67, 58, 68, 39, 38, 29, 1, 21, 45, 44, 66, 53, 34, 47, 64, 41, 57, 10, 3, 31, 65, 54, 46, 50, 59, 23, 73, 24, 51, 36, 26, 16, 49, 37, 62, 7, 32, 19, 22, and 14 were used.
In set 9, genes 49, 65, 20, 59, 21, 45, 54, 29, 51, 50, 17, 37, 55, 47, 57, 9, 8, 18, 11, 10, 25, 1, 30, 68, 5, 6, 74, 70, 60, 53, 48, 39, 4, 23, 27, 73, 35, 40, 41, 44, 24, 3, 58, 19, 14, 13, 33, 63, 62, 46, 2, 12, 72, 36, and 7 were used. In set 10, genes 73, 53, 26, 24, 58, 25, 59, 71, 34, 65, 46, 2, 57, 48, 68, 21, 44, 22, 16, 70, 60, 8, 66, 45, 14, 27, 43, 37, 20, 36, 72, 18, 56, 4, 7, 6, 23, 15, 74, 1, 9, 50, 5, 35, 40, 32, 12, 38, 69, 33, 61, 62, 10, 47, and 39 were used.
[0164] For 60 genes, set 1, genes 49, 60, 66, 26, 22, 53, 33, 56, 10, 44, 17, 36, 41, 6, 21, 57, 39, 65, 24, 30, 31, 15, 43, 68, 64, 59, 28, 73, 13, 18, 51, 34, 63, 40, 71, 58, 48, 11, 37, 42, 70, 45, 72, 3, 67, 35, 52, 46, 32, 55, 27, 38, 19, 25, 5, 69, 62, 14, 23, and 4 were used. In set 2, genes 57, 5, 31, 15, 20, 54, 21, 42, 71, 50, 17, 68, 61, 53, 9, 35, 67, 12, 14, 52, 41, 38, 22, 45, 32, 39, 70, 18, 6, 26, 59, 40, 25, 28, 56, 10, 3, 47, 34, 8, 60, 2, 9, 62, 66, 19, 11, 37, 27, 36, 69, 7, 65, 4, 33, 24, 51, 55, 48, and 44 were used. In set 3, genes 37, 54, 44, 66, 36, 1, 61, 62, 47, 69, 4, 30, 31, 11, 8, 63, 38, 16, 65, 25, 74, 21, 34, 60, 20, 71, 12, 19, 43, 15, 27, 57, 6, 55, 64, 22, 14, 39, 53, 23, 17, 28, 51, 56, 40, 29, 58, 48, 42, 59, 68, 5, 35, 50, 72, 10, 45, 32, 33, and 73 were used. In set 4, genes 24, 2, 49, 57, 35, 45, 40, 51, 42, 7, 47, 5, 8, 17, 61, 74, 64, 72, 50, 60, 70, 26, 9, 56, 32, 4, 16, 44, 27, 43, 53, 33, 46, 55, 41, 68, 48, 11, 10, 39, 19, 6, 3, 14, 65, 69, 30, 34, 29, 36, 58, 28, 1, 23, 73, 15, 25, 13, 54, and 18 were used. In set 5, genes 18, 28, 1, 22, 71, 37, 62, 46, 31, 25, 70, 64, 66, 35, 5, 60, 10, 26, 9, 43, 67, 20, 59, 51, 33, 42, 3, 24, 49, 13, 27, 38, 61, 14, 52, 63, 11, 74, 7, 16, 23, 72, 39, 73, 15, 6, 17, 30, 57, 8, 50, 48, 34, 53, 2, 69, 29, 56, 44, and 47 were used. In set 6, genes 33, 74, 12, 7, 49, 25, 38, 1, 8, 4, 48, 26, 58, 54, 21, 50, 72, 45, 62, 66, 36, 13, 42, 5, 39, 17, 28, 23, 67, 41, 29, 73, 19, 56, 51, 69, 10, 16, 55, 14, 24, 64, 22, 59, 52, 35, 2, 31, 3, 9, 27, 71, 30, 32, 53, 11, 40, 61, 15, and 70 were used. In set 7, genes 30, 65, 26, 48, 47, 20, 17, 56, 35, 32, 10, 11, 1, 59, 50, 53, 45, 13, 63, 49, 41, 74, 16, 57, 15, 64, 12, 54, 5, 8, 67, 69, 31, 14, 61, 60, 37, 66, 43, 71, 23, 36, 51, 44, 34, 2, 42, 19, 58, 25, 27, 68, 18, 52, 21, 7, 70, 22, 28, and 62 were used.
In set 8, genes 12, 58, 11, 5, 72, 70, 63, 66, 49, 44, 14, 48, 26, 73, 51, 47, 13, 65, 1, 39, 61, 17, 40, 8, 24, 54, 42, 34, 64, 21, 53, 59, 46, 4, 20, 29, 57, 74, 31, 67, 6, 69, 7, 68, 41, 3, 18, 62, 19, 32, 10, 43, 36, 71, 28, 60, 30, 15, 23, and 52 were used. In set 9, genes 7, 20, 69, 12, 58, 40, 70, 57, 3, 37, 6, 16, 61, 11, 13, 31, 55, 17, 49, 22, 36, 47, 44, 18, 45, 68, 25, 72, 19, 14, 39, 46, 30, 59, 56, 5, 66, 2, 41, 51, 9, 54, 35, 15, 26, 27, 23, 65, 4, 63, 1, 60, 21, 74, 24, 43, 38, 64, 50, and 67 were used. In set 10, genes 5, 43, 54, 22, 49, 48, 25, 24, 52, 35, 14, 70, 26, 72, 59, 71, 9, 41, 74, 36, 17, 47, 29, 34, 20, 27, 65, 68, 3, 73, 45, 62, 57, 56, 53, 44, 6, 7, 31, 55, 30, 23, 15, 33, 38, 42, 10, 60, 66, 8, 1, 64, 19, 16, 12, 61, 63, 51, 18, and 2 were used.
[0165] For 65 genes, set 1, genes 11, 10, 1, 69, 43, 33, 54, 24, 39, 27, 42, 18, 9, 46, 12, 20, 61, 44, 51, 64, 35, 8, 36, 38, 21, 7, 57, 59, 23, 49, 17, 15, 22, 55, 29, 16, 37, 72, 30, 31, 45, 63, 40, 28, 41, 32, 66, 65, 5, 47, 53, 60, 25, 50, 74, 52, 14, 68, 48, 13, 2, 4, 3, 6, and 67 were used. In set 2, genes 37, 8, 31, 4, 23, 57, 69, 40, 3, 9, 5, 32, 42, 44, 56, 21, 10, 34, 74, 61, 39, 38, 13, 70, 41, 19, 48, 47, 29, 52, 26, 72, 49, 45, 7, 63, 16, 25, 24, 14, 18, 60, 59, 11, 35, 2, 30, 68, 58, 67, 27, 33, 66, 12, 71, 51, 55, 6, 20, 54, 1, 46, 64, 62, and 53 were used. In set 3, genes 24, 19, 35, 57, 27, 8, 23, 30, 65, 32, 59, 29, 4, 47, 17, 53, 34, 54, 73, 14, 20, 63, 43, 3, 38, 61, 31, 49, 25, 42, 41, 51, 18, 7, 40, 39, 33, 50, 70, 28, 13, 74, 36, 45, 64, 5, 16, 58, 1, 66, 62, 46, 15, 12, 72, 21, 2, 68, 71, 9, 44, 26, 37, 6, and 55 were used. In set 4, genes 62, 29, 5, 41, 18, 4, 21, 63, 65, 8, 55, 61, 66, 34, 23, 28, 14, 49, 68, 15, 1, 11, 19, 73, 13, 57, 20, 27, 50, 2, 72, 22, 6, 7, 40, 67, 51, 45, 10, 36, 53, 64, 54, 24, 25, 37, 74, 12, 52, 26, 38, 32, 3, 30, 33, 39, 48, 58, 17, 42, 71, 43, 69, 56, and 9 were used. In set 5, genes 49, 58, 74, 65, 67, 44, 57, 28, 56, 18, 59, 31, 10, 17, 41, 39, 63, 7, 21, 55, 38, 2, 51, 42, 5, 53, 20, 34, 16, 43, 19, 15, 50, 4, 6, 11, 52, 37, 8, 64, 69, 12, 48, 60, 1, 66, 27, 36, 45, 30, 14, 72, 68, 73, 35, 47, 71, 22, 70, 33, 32, 46, 25, 13, and 54 were used. In set 6, genes 7, 44, 23, 68, 46, 30, 10, 4, 3, 53, 22, 38, 50, 26, 55, 49, 20, 11, 73, 12, 62, 63, 43, 69, 6, 61, 52, 25, 65, 16, 47, 34, 33, 28, 42, 58, 29, 39, 31, 1, 36, 13, 5, 60, 35, 19, 40, 18, 59, 64, 41, 70, 72, 57, 67, 9, 74, 8, 14, 71, 45, 56, 32, 51, and 2 were used. In set 7, genes 57, 61, 9, 48, 31, 4, 40, 35, 1, 16, 44, 67, 68, 34, 6, 64, 7, 54, 53, 10, 18, 39, 23, 14, 33, 74, 51, 38, 24, 19, 72, 63, 36, 65, 32, 2, 27, 45, 3, 43, 21, 49, 30, 60, 50, 70, 41, 20, 11, 37, 13, 15, 5, 12, 46, 26, 22, 71, 8, 62, 29, 28, 25, 17, and 52 were used.
In set 8, genes 11, 21, 3, 6, 74, 58, 52, 40, 17, 23, 41, 63, 22, 56, 55, 8, 60, 54, 51, 57, 66, 68, 29, 24, 69, 39, 16, 49, 72, 59, 48, 61, 2, 7, 44, 37, 43, 45, 35, 25, 1, 4, 20, 14, 36, 42, 65, 62, 71, 32, 19, 70, 28, 27, 9, 46, 33, 18, 67, 15, 30, 26, 12, 47, and 53 were used. In set 9, genes 48, 27, 64, 55, 30, 2, 33, 16, 31, 21, 57, 50, 63, 17, 44, 29, 4, 6, 60, 65, 23, 19, 58, 68, 25, 59, 14, 7, 42, 12, 69, 45, 53, 73, 56, 34, 41, 3, 18, 5, 72, 70, 40, 37, 62, 43, 51, 24, 52, 20, 39, 8, 13, 26, 10, 66, 54, 22, 49, 61, 11, 46, 32, 67, and 36 were used. In set 10, genes 31, 39, 50, 60, 17, 33, 73, 30, 3, 27, 10, 62, 29, 12, 59, 1, 34, 69, 51, 72, 65, 52, 16, 36, 28, 23, 42, 40, 66, 58, 48, 46, 38, 74, 20, 55, 21, 49, 63, 2, 70, 7, 26, 53, 41, 45, 25, 44, 71, 32, 24, 13, 14, 6, 57, 11, 68, 35, 54, 22, 64, 8, 9, 56, and 37 were used.
[0166] For 70 genes, set 1, genes 72, 74, 31, 73, 52, 16, 32, 24, 14, 66, 59, 28, 54, 1, 11, 12, 34, 57, 5, 67, 25, 42, 62, 71, 68, 69, 48, 7, 18, 20, 47, 19, 53, 2, 4, 15, 26, 63, 37, 17, 10, 60, 65, 8, 22, 70, 36, 30, 41, 9, 21, 35, 49, 38, 33, 56, 46, 27, 45, 44, 39, 43, 29, 50, 61, 40, 23, 64, 55, and 3 were used. In set 2, genes 45, 32, 60, 2, 42, 56, 8, 46, 30, 27, 17, 62, 26, 24, 65, 49, 16, 70, 3, 47, 50, 4, 40, 28, 1, 36, 22, 59, 48, 9, 57, 5, 72, 23, 13, 44, 67, 14, 12, 34, 21, 41, 71, 39, 66, 25, 69, 19, 15, 6, 68, 29, 52, 43, 64, 58, 54, 11, 37, 38, 55, 7, 20, 61, 53, 63, 10, 74, 51, and 35 were used. In set 3, genes 66, 71, 40, 62, 60, 51, 61, 5, 19, 15, 34, 13, 18, 8, 28, 59, 35, 54, 2, 55, 29, 22, 41, 37, 45, 64, 48, 7, 73, 27, 30, 69, 63, 23, 25, 42, 1, 24, 14, 38, 4, 70, 53, 3, 36, 12, 74, 68, 26, 57, 33, 17, 67, 72, 52, 58, 46, 39, 43, 56, 65, 10, 44, 11, 20, 47, 50, 9, 21, and 49 were used. In set 4, genes 73, 26, 33, 40, 71, 50, 62, 59, 10, 39, 64, 68, 3, 1, 44, 9, 72, 57, 43, 37, 24, 65, 48, 6, 11, 23, 36, 19, 7, 31, 67, 69, 38, 29, 16, 35, 63, 21, 46, 15, 47, 28, 2, 5, 52, 70, 14, 22, 56, 45, 17, 4, 25, 66, 13, 55, 20, 30, 32, 54, 51, 49, 58, 74, 42, 53, 61, 34, 12, and 60 were used. In set 5, genes 7, 1, 24, 70, 26, 35, 68, 71, 74, 33, 5, 20, 49, 27, 65, 10, 72, 21, 66, 12, 4, 43, 9, 55, 23, 56, 47, 31, 42, 59, 61, 45, 67, 13, 63, 58, 17, 54, 28, 3, 64, 53, 39, 36, 30, 40, 37, 16, 41, 11, 52, 14, 62, 8, 46, 25, 44, 69, 29, 48, 51, 22, 73, 57, 18, 15, 19, 38, 6, and 50 were used. In set 6, genes 41, 36, 1, 27, 9, 51, 4, 38, 8, 25, 73, 5, 7, 22, 68, 30, 6, 33, 65, 70, 21, 26, 60, 62, 63, 54, 57, 74, 58, 44, 11, 31, 53, 34, 10, 48, 23, 3, 42, 35, 49, 13, 71, 17, 50, 28, 19, 20, 40, 64, 56, 43, 69, 59, 39, 66, 72, 46, 32, 2, 14, 47, 52, 45, 15, 37, 12, 16, 24, and 67 were used.
In set 7, genes 39, 70, 16, 5, 43, 6, 36, 30, 9, 53, 2, 34, 72, 42, 64, 73, 56, 63, 38, 13, 19, 27, 29, 60, 37, 52, 1, 3, 21, 22, 68, 69, 26, 55, 61, 11, 18, 12, 45, 8, 51, 65, 32, 33, 67, 48, 50, 10, 20, 28, 58, 7, 49, 35, 57, 71, 23, 17, 24, 62, 59, 54, 15, 40, 14, 41, 47, 46, 44, and 4 were used. In set 8, genes 3, 5, 50, 35, 53, 57, 14, 49, 55, 8, 25, 22, 71, 60, 13, 19, 12, 32, 26, 44, 15, 39, 17, 31, 61, 23, 66, 68, 4, 6, 7, 41, 24, 40, 58, 67, 46, 70, 45, 64, 51, 69, 18, 62, 47, 52, 11, 30, 73, 28, 33, 2, 36, 1, 72, 42, 20, 27, 10, 16, 63, 38, 59, 74, 43, 9, 56, 34, 21, and 65 were used. In set 9, genes 18, 49, 70, 46, 29, 9, 52, 53, 64, 28, 37, 27, 7, 57, 44, 19, 72, 61, 67, 30, 62, 47, 2, 39, 8, 65, 26, 14, 63, 4, 20, 59, 45, 15, 10, 3, 16, 58, 25, 38, 60, 71, 66, 32, 23, 55, 69, 12, 33, 6, 42, 36, 22, 48, 24, 68, 41, 17, 54, 13, 21, 51, 73, 74, 40, 43, 1, 11, 56, and 35 were used. In set 10, genes 14, 12, 65, 74, 58, 6, 36, 5, 34, 11, 18, 33, 32, 7, 22, 37, 64, 59, 9, 52, 41, 26, 3, 19, 48, 35, 56, 62, 42, 60, 1, 8, 43, 50, 25, 61, 54, 49, 20, 70, 44, 30, 15, 46, 72, 38, 4, 29, 68, 21, 39, 17, 16, 53, 45, 73, 63, 31, 55, 47, 24, 69, 2, 71, 13, 28, 66, 23, 57, and 40 were used.
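Listings of this kind can be checked mechanically. The following Python sketch, offered only as an illustration, parses sentences of the form "set N, genes a, b, ..., and z were used" and reports sets whose size, uniqueness, or index range does not match what is expected. The function names, the regular expression, and the short sample text are assumptions for this example and are not taken from the listings above.

```python
import re

# Matches "set 3, genes 1, 2, ..., and 9 were used" (hypothetical pattern
# written for the sentence form used in these listings).
SET_PATTERN = re.compile(
    r"set (\d+), genes ([\d,\s]+?),? and (\d+) were used", re.IGNORECASE
)


def parse_gene_sets(text):
    """Return {set_number: [gene indices in listed order]}."""
    sets = {}
    for match in SET_PATTERN.finditer(text):
        members = [int(tok) for tok in re.findall(r"\d+", match.group(2))]
        members.append(int(match.group(3)))  # the index after "and"
        sets[int(match.group(1))] = members
    return sets


def check_set(members, expected_size, max_index):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if len(members) != expected_size:
        problems.append(f"expected {expected_size} members, found {len(members)}")
    if len(set(members)) != len(members):
        problems.append("duplicate index")
    if any(not 1 <= m <= max_index for m in members):
        problems.append("index out of range")
    return problems


# Illustrative sample text only, not one of the listed sets.
sample = ("For 5 genes, set 1, genes 3, 1, 4, 7, and 9 were used. "
          "In set 2, genes 2, 8, 5, 5, and 10 were used.")
parsed = parse_gene_sets(sample)
report = {n: check_set(m, expected_size=5, max_index=74) for n, m in parsed.items()}
```

In this sample, set 1 passes all three checks and set 2 is flagged for a duplicate index.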
[0167] A similar experiment was performed with random subsets of 50 to all members of the set of 90 expressed gene sequences. Again, the expression levels of random combinations of 50, 55, 60, 65, 70, and all 90 (each combination sampled 10 times) of the 90 expressed sequences were used with data from tumor types and then used to predict test random sets of tumor samples (each sampled 10 times) ranging from 2 to all 39 types. The resulting data are shown in FIGS. 6 and 7.
[0168] The members of the 90 gene sequences were indexed from 1 to 90, and representative random sets used in the invention are as follows:
[0169] For 50-genes, set 1, genes 89, 30, 62, 23, 31, 20, 53, 25, 15, 38, 11, 22, 68, 44, 58, 7, 14, 61, 67, 32, 18, 71, 9, 54, 46, 3, 57, 50, 59, 79, 48, 90, 82, 64, 39, 21, 60, 37, 47, 10, 52, 77, 33, 45, 35, 83, 16, 69, 74, and 27 were used. In set 2, genes 25, 17, 64, 82, 23, 5, 77, 48, 72, 63, 34, 60, 61, 35, 58, 19, 56, 83, 8, 13, 38, 89, 59, 62, 88, 71, 11, 29, 31, 68, 65, 67, 78, 44, 27, 81, 24, 1, 18, 55, 85, 46, 41, 14, 84, 26, 16, 21, 12, and 69 were used. In set 3, genes 24, 39, 35, 15, 49, 44, 28, 58, 20, 3, 88, 23, 54, 31, 33, 37, 62, 25, 87, 75, 17, 41, 21, 19, 38, 85, 86, 74, 8, 12, 77, 30, 27, 43, 76, 73, 9, 14, 6, 63, 64, 81, 26, 66, 2, 56, 34, 60, 57, and 61 were used. In set 4, genes 40, 71, 55, 63, 2, 13, 38, 58, 26, 18, 76, 74, 17, 67, 69, 4, 9, 20, 21, 10, 35, 70, 49, 37, 12, 77, 61, 60, 15, 7, 36, 89, 33, 59, 78, 39, 82, 16, 64, 28, 6, 66, 52, 5, 44, 73, 34, 75, 31, and 29 were used. In set 5, genes 16, 37, 57, 18, 29, 66, 54, 6, 44, 70, 20, 65, 5, 61, 72, 83, 85, 58, 87, 73, 23, 76, 25, 68, 49, 24, 79, 89, 55, 75, 47, 19, 33, 39, 21, 63, 84, 32, 77, 40, 12, 11, 42, 50, 1, 9, 78, 3, 74, and 7 were used. In set 6, genes 42, 29, 74, 68, 27, 54, 15, 63, 30, 51, 78, 56, 82, 66, 80, 79, 90, 64, 22, 44, 71, 2, 89, 39, 46, 52, 36, 32, 84, 6, 59, 9, 38, 4, 55, 19, 7, 60, 49, 23, 73, 5, 11, 50, 70, 34, 61, 81, 67, and 28 were used. In set 7, genes 31, 27, 24, 75, 7, 46, 40, 60, 51, 37, 87, 28, 67, 62, 50, 66, 61, 63, 49, 1, 39, 74, 81, 4, 52, 22, 79, 45, 12, 41, 15, 90, 26, 33, 78, 48, 83, 10, 53, 73, 6, 19, 71, 59, 68, 56, 64, 13, 32, and 30 were used. 
In set 8, genes 88, 57, 5, 4, 1, 43, 12, 32, 66, 81, 90, 19, 51, 18, 55, 9, 29, 75, 11, 73, 23, 61, 6, 79, 69, 60, 13, 62, 8, 71, 2, 52, 67, 59, 87, 33, 80, 21, 14, 89, 39, 65, 56, 38, 47, 31, 84, 25, 45, and 41 were used. In set 9, genes 60, 45, 51, 32, 49, 2, 44, 66, 83, 50, 87, 1, 90, 28, 42, 85, 13, 40, 70, 82, 79, 89, 64, 63, 27, 52, 10, 86, 77, 15, 56, 8, 33, 53, 38, 46, 67, 19, 68, 29, 48, 21, 34, 61, 18, 55, 25, 35, 39, and 80 were used. In set 10, genes 80, 39, 23, 76, 87, 33, 30, 88, 85, 89, 24, 47, 44, 43, 48, 55, 14, 73, 22, 19, 67, 1, 42, 51, 60, 12, 9, 6, 75, 17, 40, 25, 28, 74, 38, 66, 5, 50, 8, 37, 15, 29, 21, 11, 35, 31, 13, 36, 52, and 18 were used.
[0170] For 55 genes, set 1, genes 86, 47, 80, 15, 74, 20, 79, 35, 14, 49, 41, 2, 48, 30, 81, 65, 5, 24, 51, 10, 31, 68, 7, 21, 28, 38, 4, 18, 23, 44, 77, 42, 19, 61, 27, 75, 67, 36, 22, 26, 50, 32, 58, 71, 57, 76, 1, 88, 72, 33, 6, 34, 59, and 13 were used. In set 2, genes 73, 88, 39, 52, 87, 78, 84, 1, 42, 69, 62, 58, 10, 51, 38, 14, 77, 49, 36, 35, 34, 15, 65, 60, 20, 17, 61, 2, 59, 22, 81, 11, 19, 41, 5, 29, 12, 43, 7, 4, 64, 40, 74, 48, 72, 54, 68, 86, 66, 6, 67, 89, 21, 16, and 9 were used. In set 3, genes 28, 89, 35, 86, 49, 56, 69, 18, 15, 27, 13, 6, 51, 77, 8, 80, 16, 78, 43, 29, 37, 20, 9, 31, 32, 67, 48, 65, 82, 62, 76, 25, 54, 41, 90, 47, 2, 87, 84, 57, 74, 61, 59, 85, 75, 10, 66, 46, 73, 24, 44, 14, 4, and 7 were used. In set 4, genes 48, 76, 17, 62, 65, 87, 19, 24, 83, 29, 55, 12, 68, 82, 73, 18, 20, 10, 81, 53, 33, 56, 34, 5, 60, 46, 16, 25, 2, 42, 6, 49, 4, 45, 88, 32, 77, 8, 1, 71, 3, 27, 72, 59, 79, 64, 11, 80, 57, 61, 75, 39, 23, 52, and 37 were used. In set 5, genes 54, 77, 74, 76, 81, 17, 25, 57, 29, 36, 55, 75, 66, 15, 2, 41, 37, 59, 12, 45, 4, 9, 69, 18, 49, 22, 42, 62, 10, 52, 48, 31, 44, 19, 79, 50, 40, 32, 89, 87, 11, 5, 73, 20, 80, 35, 70, 53, 83, 72, 88, 47, 84, 39, and 65 were used. In set 6, genes 86, 43, 75, 90, 32, 85, 38, 54, 87, 42, 73, 55, 27, 34, 11, 14, 65, 82, 77, 21, 26, 46, 83, 10, 15, 22, 66, 20, 67, 72, 35, 68, 3, 53, 44, 50, 70, 40, 30, 31, 84, 81, 62, 51, 80, 79, 59, 57, 88, 69, 2, 64, 23, 28, and 16 were used. In set 7, genes 76, 15, 53, 8, 89, 52, 20, 3, 47, 83, 45, 31, 80, 82, 4, 57, 65, 41, 29, 77, 46, 60, 24, 33, 70, 37, 12, 66, 42, 61, 63, 86, 30, 11, 40, 27, 39, 56, 9, 49, 35, 22, 10, 48, 18, 68, 58, 62, 34, 85, 84, 26, 43, 81, and 38 were used. In set 8, genes 3, 46, 11, 89, 63, 61, 26, 69, 47, 82, 27, 39, 52, 2, 70, 6, 41, 14, 36, 30, 65, 74, 28, 34, 42, 79, 59, 4, 72, 37, 66, 50, 45, 23, 73, 71, 10, 19, 78, 62, 20, 5, 56, 25, 75, 38, 13, 86, 88, 22, 32, 58, 60, 1, and 51 were used.
In set 9, genes 16, 61, 85, 3, 42, 24, 55, 4, 9, 22, 28, 31, 53, 74, 25, 52, 10, 49, 2, 21, 30, 78, 54, 26, 38, 87, 35, 37, 45, 84, 83, 57, 64, 65, 68, 50, 1, 34, 75, 67, 60, 5, 7, 58, 59, 76, 27, 77, 44, 32, 6, 11, 48, 56, and 15 were used. In set 10, genes 72, 86, 46, 5, 3, 29, 54, 66, 20, 44, 41, 47, 14, 65, 83, 56, 43, 26, 49, 48, 69, 24, 45, 27, 73, 11, 40, 22, 78, 2, 39, 15, 31, 35, 77, 61, 9, 52, 37, 1, 89, 79, 60, 18, 50, 17, 88, 80, 57, 71, 12, 53, 36, 58, and 42 were used.
[0171] For 60 genes, set 1, genes 75, 54, 79, 78, 4, 48, 36, 29, 28, 32, 82, 38, 21, 8, 80, 46, 47, 57, 76, 50, 18, 68, 85, 13, 61, 65, 71, 56, 45, 58, 84, 25, 72, 43, 7, 77, 74, 69, 86, 31, 19, 63, 35, 83, 70, 3, 62, 90, 52, 87, 44, 41, 66, 12, 23, 59, 1, 10, 49, and 67 were used. In set 2, genes 6, 50, 10, 38, 29, 59, 60, 12, 74, 14, 65, 61, 54, 2, 89, 68, 9, 62, 20, 81, 70, 67, 66, 52, 45, 58, 43, 31, 86, 79, 82, 1, 42, 88, 85, 22, 87, 84, 24, 21, 5, 39, 25, 51, 40, 63, 49, 7, 35, 36, 71, 90, 47, 15, 56, 23, 83, 34, 76, and 19 were used. In set 3, genes 17, 68, 41, 53, 15, 58, 90, 21, 10, 61, 72, 44, 22, 8, 32, 47, 55, 48, 45, 3, 5, 7, 1, 4, 24, 49, 75, 54, 39, 57, 19, 70, 79, 66, 6, 60, 51, 56, 46, 14, 85, 80, 36, 31, 37, 86, 42, 84, 87, 23, 2, 81, 11, 50, 40, 52, 13, 65, 62, and 76 were used. In set 4, genes 54, 24, 50, 11, 77, 63, 84, 71, 16, 51, 78, 83, 10, 28, 31, 29, 43, 14, 30, 61, 81, 58, 4, 48, 64, 37, 18, 39, 1, 67, 45, 40, 80, 79, 8, 55, 36, 2, 32, 25, 21, 46, 73, 38, 34, 52, 49, 65, 13, 66, 6, 76, 20, 85, 15, 44, 60, 69, 86, and 88 were used. In set 5, genes 89, 22, 12, 82, 28, 14, 87, 8, 79, 48, 69, 84, 66, 43, 88, 13, 9, 50, 75, 71, 20, 36, 5, 54, 80, 62, 4, 23, 24, 60, 19, 10, 63, 81, 68, 30, 32, 52, 56, 37, 15, 83, 16, 90, 26, 44, 78, 39, 61, 59, 45, 74, 58, 86, 35, 33, 47, 57, 18, and 42 were used. In set 6, genes 41, 38, 76, 54, 12, 29, 66, 35, 68, 80, 64, 57, 46, 25, 27, 49, 86, 36, 20, 5, 16, 19, 69, 59, 48, 4, 10, 70, 17, 60, 50, 63, 18, 33, 65, 39, 23, 82, 51, 55, 8, 28, 53, 84, 67, 22, 71, 77, 13, 9, 42, 21, 62, 31, 78, 11, 89, 45, 52, and 74 were used. In set 7, genes 84, 12, 17, 10, 33, 56, 50, 61, 74, 21, 78, 11, 37, 36, 3, 5, 30, 43, 47, 54, 27, 32, 77, 51, 42, 4, 76, 71, 83, 46, 57, 73, 87, 24, 90, 8, 72, 29, 35, 66, 28, 70, 22, 39, 65, 85, 1, 82, 40, 89, 80, 58, 52, 38, 59, 86, 69, 13, 16, and 14 were used. 
In set 8, genes 71, 3, 44, 6, 16, 69, 34, 20, 56, 72, 5, 68, 9, 52, 49, 58, 79, 76, 2, 59, 7, 73, 51, 74, 19, 88, 60, 30, 61, 13, 89, 50, 31, 40, 81, 10, 21, 54, 45, 77, 67, 36, 46, 1, 43, 83, 55, 12, 80, 28, 41, 86, 47, 39, 53, 17, 78, 63, 87, and 48 were used. In set 9, genes 47, 30, 10, 11, 39, 23, 41, 29, 21, 36, 45, 49, 69, 1, 24, 66, 57, 12, 56, 22, 71, 9, 89, 52, 83, 28, 80, 37, 72, 67, 76, 87, 82, 5, 88, 4, 3, 68, 58, 64, 62, 46, 74, 7, 20, 15, 48, 53, 54, 63, 19, 13, 43, 32, 51, 31, 33, 27, 35, and 40 were used. In set 10, genes 75, 29, 27, 66, 15, 47, 14, 3, 12, 80, 31, 32, 41, 17, 74, 7, 57, 59, 64, 25, 13, 77, 33, 43, 81, 55, 48, 68, 30, 54, 69, 88, 62, 86, 67, 37, 20, 8, 42, 19, 70, 24, 49, 73, 23, 10, 1, 85, 89, 44, 58, 2, 11, 63, 76, 5, 53, 83, 50, and 9 were used.
[0172] For 65 genes, set 1, genes 55, 36, 14, 26, 67, 60, 28, 31, 46, 85, 16, 10, 17, 45, 73, 87, 7, 72, 90, 4, 84, 34, 78, 19, 71, 54, 29, 43, 76, 12, 35, 61, 49, 57, 89, 20, 50, 47, 86, 88, 59, 75, 15, 8, 5, 3, 32, 81, 74, 23, 41, 13, 33, 63, 77, 22, 9, 38, 64, 69, 80, 25, 1, 18, and 30 were used. In set 2, genes 32, 81, 5, 65, 79, 12, 52, 83, 2, 39, 19, 17, 44, 66, 63, 72, 56, 60, 3, 22, 70, 64, 9, 67, 15, 8, 50, 48, 71, 82, 76, 14, 28, 18, 25, 11, 29, 58, 35, 31, 10, 69, 38, 90, 80, 74, 53, 75, 4, 77, 89, 55, 57, 59, 51, 42, 41, 68, 23, 84, 45, 40, 20, 85, and 61 were used. In set 3, genes 33, 52, 22, 67, 7, 36, 40, 6, 56, 29, 48, 41, 28, 68, 83, 90, 51, 70, 60, 24, 87, 88, 18, 58, 73, 1, 17, 8, 26, 89, 38, 4, 10, 47, 75, 72, 50, 13, 23, 66, 20, 30, 12, 43, 46, 15, 16, 5, 55, 31, 63, 32, 53, 69, 39, 71, 42, 62, 57, 34, 44, 14, 25, 64, and 80 were used. In set 4, genes 30, 45, 74, 3, 13, 63, 76, 27, 46, 11, 51, 2, 20, 78, 66, 65, 43, 7, 69, 40, 28, 19, 25, 52, 26, 34, 49, 44, 60, 59, 38, 48, 85, 87, 18, 82, 15, 42, 24, 67, 61, 71, 70, 35, 68, 79, 47, 83, 80, 84, 31, 32, 9, 77, 72, 62, 8, 55, 54, 1, 58, 16, 53, 89, and 90 were used. In set 5, genes 14, 55, 53, 45, 32, 63, 49, 15, 10, 11, 47, 52, 3, 13, 71, 68, 85, 34, 66, 64, 83, 78, 28, 21, 30, 54, 29, 88, 59, 73, 26, 84, 50, 77, 65, 82, 20, 86, 19, 57, 62, 25, 43, 27, 8, 6, 87, 38, 51, 61, 56, 2, 18, 46, 44, 80, 9, 31, 36, 76, 1, 7, 33, 48, and 58 were used. In set 6, genes 66, 44, 18, 85, 54, 28, 80, 65, 25, 1, 88, 72, 74, 46, 76, 71, 24, 51, 47, 31, 21, 60, 83, 32, 3, 63, 64, 69, 52, 27, 2, 38, 34, 10, 12, 35, 77, 33, 29, 56, 67, 40, 30, 22, 49, 5, 7, 43, 17, 13, 81, 20, 79, 14, 48, 73, 53, 90, 70, 59, 19, 16, 8, 36, and 23 were used. In set 7,
genes 89, 37, 48, 32, 75, 46, 90, 2, 66, 44, 55, 17, 9, 59, 68, 83, 24, 53, 19, 67, 74, 35, 72, 4, 13, 76, 15, 62, 63, 28, 51, 26, 39, 20, 18, 45, 36, 78, 41, 84, 87, 11, 80, 12, 81, 3, 50, 86, 6, 61, 73, 31, 27, 88, 42, 71, 33, 43, 60, 30, 69, 34, 21, 49, and 70 were used. In set 8, genes 84, 73, 14, 23, 36, 47, 31, 61, 57, 50, 78, 53, 90, 68, 37, 39, 75, 4, 10, 80, 35, 32, 85, 18, 81, 29, 66, 76, 54, 41, 62, 30, 58, 49, 33, 64, 45, 87, 25, 79, 20, 69, 42, 17, 88, 24, 2, 34, 16, 28, 86, 15, 7, 82, 1, 60, 11, 48, 89, 22, 77, 74, 72, 6, and 43 were used. In set 9, genes 1, 74, 39, 48, 44, 47, 3, 8, 80, 54, 16, 41, 76, 9, 85, 86, 49, 70, 52, 89, 19, 66, 43, 17, 15, 63, 29, 53, 42, 32, 30, 4, 36, 7, 77, 2, 84, 87, 28, 67, 20, 56, 65, 31, 12, 25, 40, 10, 73, 6, 83, 64, 50, 13, 22, 58, 45, 21, 57, 60, 72, 82, 26, 33, and 35 were used. In set 10, genes 18, 31, 52, 70, 48, 76, 57, 66, 10, 14, 60, 30, 67, 45, 35, 51, 1, 79, 46, 71, 3, 42, 33, 85, 4, 61, 2, 63, 87, 50, 36, 37, 90, 80, 24, 6, 77, 28, 21, 88, 17, 82, 83, 49, 75, 54, 25, 5, 7, 73, 59, 29, 69, 47, 65, 19, 15, 56, 9, 55, 58, 40, 20, 89, and 74 were used.
[0173] For 70 genes, set 1, genes 79, 50, 38, 63, 74, 71, 66, 4, 33, 1, 69, 88, 85, 18, 27, 77, 70, 65, 14, 40, 64, 29, 59, 6, 3, 9, 84, 22, 62, 60, 30, 7, 11, 13, 45, 57, 35, 72, 15, 75, 39, 36, 10, 53, 67, 80, 83, 31, 5, 25, 90, 89, 58, 23, 56, 2, 16, 41, 76, 47, 26, 43, 17, 55, 82, 87, 24, 12, 48, and 81 were used. In set 2, genes 6, 66, 68, 83, 77, 81, 21, 88, 18, 60, 50, 17, 13, 61, 14, 25, 39, 76, 75, 78, 89, 37, 87, 55, 90, 9, 5, 12, 10, 43, 29, 51, 31, 46, 58, 49, 24, 52, 28, 64, 42, 8, 11, 67, 84, 70, 19, 41, 45, 71, 16, 33, 23, 34, 30, 86, 69, 4, 57, 47, 80, 20, 82, 2, 1, 56, 65, 62, and 48 were used. In set 3, genes 72, 87, 89, 53, 56, 17, 84, 60, 45, 61, 62, 76, 13, 37, 20, 21, 2, 23, 3, 57, 83, 90, 82, 49, 24, 59, 9, 48, 32, 33, 47, 42, 78, 88, 65, 52, 79, 41, 34, 19, 74, 66, 43, 27, 36, 63, 81, 44, 40, 80, 31, 86, 12, 29, 77, 67, 14, 71, 68, 1, 35, 16, 10, 30, 6, 22, 75, 55, 85, and 4 were used. In set 4, genes 70, 81, 71, 17, 8, 59, 6, 15, 52, 74, 23, 9, 19, 14, 82, 86, 27, 73, 66, 38, 22, 41, 88, 76, 47, 58, 56, 11, 55, 64, 44, 84, 62, 21, 35, 80, 36, 28, 12, 13, 4, 1, 75, 60, 5, 87, 89, 2, 50, 46, 25, 85, 37, 90, 78, 34, 24, 18, 45, 79, 77, 30, 32, 51, 57, 67, 83, 68, 54, and 29 were used. In set 5, genes 70, 23, 22, 30, 85, 48, 21, 32, 86, 84, 78, 87, 64, 40, 4, 34, 67, 9, 25, 7, 55, 42, 65, 53, 49, 83, 50, 80, 62, 16, 37, 77, 71, 54, 28, 27, 29, 18, 13, 57, 79, 56, 15, 36, 5, 24, 3, 1, 75, 90, 73, 47, 51, 88, 38, 58, 66, 81, 35, 76, 43, 46, 82, 68, 10, 14, 8, 41, 39, and 59 were used. In set 6, genes 88, 3, 40, 60, 24, 43, 62, 85, 58, 53, 39, 56, 59, 81, 71, 63, 25, 16, 22, 14, 10, 72, 89, 90, 84, 5, 33, 12, 45, 57, 70, 38, 32, 19, 44, 46, 2, 64, 8, 49, 42, 27, 37, 29, 13, 6, 28, 7, 77, 41, 17, 50, 31, 69, 26, 83, 23, 73, 80, 51, 61, 76, 82, 18, 15, 78, 67, 54, 36, and 65 were used. 
In set 7, genes 35, 52, 48, 42, 65, 38, 61, 79, 23, 20, 12, 8, 53, 57, 22, 54, 69, 9, 56, 43, 5, 66, 86, 49, 81, 19, 40, 45, 85, 60, 10, 50, 55, 11, 15, 73, 13, 2, 29, 59, 78, 67, 18, 80, 84, 39, 87, 90, 58, 46, 17, 32, 7, 62, 14, 34, 27, 6, 83, 70, 51, 26, 68, 21, 82, 77, 44, 47, 24, and 37 were used. In set 8, genes 40, 55, 22, 47, 86, 19, 62, 51, 25, 59, 8, 65, 48, 79, 1, 66, 17, 70, 32, 49, 23, 61, 85, 28, 36, 54, 20, 39, 83, 73, 50, 4, 81, 27, 41, 63, 15, 80, 87, 7, 46, 33, 9, 68, 56, 77, 14, 75, 82, 74, 12, 37, 16, 84, 72, 30, 2, 38, 13, 57, 76, 5, 64, 45, 89, 58, 29, 10, 78, and 90 were used. In set 9, genes 84, 16, 21, 81, 89, 60, 79, 30, 47, 69, 83, 85, 75, 52, 49, 72, 86, 3, 9, 59, 18, 55, 17, 82, 14, 23, 38, 24, 87, 65, 77, 80, 66, 19, 41, 53, 1, 34, 27, 56, 40, 67, 32, 20, 37, 70, 36, 15, 22, 8, 29, 48, 58, 45, 25, 71, 7, 4, 73, 10, 12, 2, 42, 90, 63, 43, 51, 6, 54, and 78 were used. In set 10, genes 19, 51, 29, 22, 66, 13, 32, 89, 62, 45, 65, 35, 24, 73, 55, 47, 67, 76, 69, 26, 37, 64, 53, 10, 15, 34, 79, 2, 56, 30, 3, 20, 78, 31, 75, 46, 27, 52, 6, 86, 16, 9, 54, 87, 58, 33, 61, 11, 43, 40, 74, 60, 50, 25, 80, 72, 83, 38, 1, 70, 5, 7, 77, 85, 59, 88, 63, 14, and 84 were used.
[0174] For 75 genes, set 1, genes 87, 17, 52, 44, 57, 53, 78, 37, 2, 71, 9, 68, 6, 63, 50, 58, 13, 26, 16, 60, 67, 3, 32, 21, 79, 12, 77, 73, 24, 35, 80, 47, 29, 40, 30, 84, 39, 90, 11, 81, 75, 76, 89, 66, 86, 42, 34, 64, 54, 7, 41, 62, 55, 46, 28, 5, 25, 27, 83, 19, 20, 49, 69, 85, 33, 18, 23, 74, 1, 10, 43, 22, 8, and 45 were used. In set 2, genes 75, 33, 52, 86, 24, 50, 70, 10, 17, 90, 28, 46, 48, 77, 47, 61, 12, 4, 83, 27, 45, 88, 35, 36, 22, 68, 73, 31, 57, 69, 65, 64, 15, 9, 54, 39, 14, 20, 67, 79, 44, 38, 78, 23, 84, 37, 66, 5, 11, 18, 41, 13, 21, 49, 16, 76, 1, 29, 53, 40, 42, 63, 25, 56, 6, 82, 71, 85, 89, 80, 34, 51, 60, 30, and 58 were used. In set 3, genes 39, 82, 36, 31, 52, 84, 30, 83, 49, 1, 44, 10, 87, 78, 77, 18, 79, 9, 73, 69, 75, 45, 14, 16, 56, 40, 58, 15, 32, 34, 42, 60, 19, 63, 47, 41, 68, 13, 61, 90, 89, 5, 46, 57, 8, 7, 66, 43, 21, 17, 11, 72, 74, 4, 33, 53, 12, 65, 50, 2, 81, 24, 62, 6, 23, 25, 88, 51, 67, 64, 7, 80, 54, 22, and 3 were used. In set 4, genes 63, 2, 5, 52, 10, 62, 75, 4, 6, 51, 29, 54, 49, 55, 36, 37, 77, 46, 44, 79, 11, 59, 38, 14, 65, 43, 48, 35, 86, 78, 73, 72, 57, 8, 16, 58, 56, 82, 60, 42, 80, 13, 9, 90, 53, 66, 21, 67, 88, 89, 45, 22, 71, 31, 84, 74, 15, 23, 26, 3, 68, 1, 39, 7, 50, 41, 40, 81, 87, 34, 18, 12, 70, 47, and 25 were used. In set 5, genes 62, 82, 46, 89, 81, 43, 57, 69, 9, 19, 18, 16, 80, 63, 72, 2, 54, 86, 44, 53, 31, 5, 1, 61, 20, 37, 58, 32, 28, 47, 34, 6, 41, 68, 15, 90, 85, 13, 23, 10, 4, 70, 76, 33, 11, 51, 35, 88, 67, 84, 8, 24, 66, 65, 26, 59, 40, 79, 64, 42, 45, 22, 17, 87, 30, 12, 27, 14, 39, 56, 38, 71, 52, 36, and 60 were used. In set 6, genes 16, 85, 19, 39, 64, 76, 44, 15, 50, 73, 27, 36, 6, 62, 54, 46, 58, 68, 28, 13, 14, 21, 86, 47, 71, 87, 18, 5, 67, 1, 65, 78, 12, 66, 43, 82, 38, 23, 75, 24, 49, 57, 17, 10, 29, 72, 22, 89, 90, 26, 42, 45, 2, 33, 41, 9, 8, 7, 69, 31, 30, 79, 80, 84, 55, 35, 20, 70, 83, 48, 88, 60, 25, 74, and 63 were used. 
In set 7, genes 24, 66, 86, 48, 63, 51, 74, 37, 2, 82, 77, 22, 72, 21, 11, 90, 80, 55, 76, 68, 34, 42, 29, 62, 46, 39, 56, 31, 47, 28, 16, 38, 44, 52, 1, 43, 14, 20, 64, 83, 78, 58, 12, 18, 84, 67, 75, 85, 36, 25, 50, 49, 40, 33, 23, 45, 41, 73, 88, 59, 17, 32, 70, 13, 60, 57, 3, 7, 54, 4, 8, 53, 26, 15, and 69 were used. In set 8, genes 80, 38, 59, 41, 85, 44, 12, 22, 39, 17, 52, 24, 32, 62, 18, 8, 78, 74, 9, 66, 76, 14, 3, 16, 40, 28, 48, 58, 54, 29, 43, 5, 81, 77, 86, 23, 75, 82, 34, 7, 51, 64, 4, 6, 72, 61, 37, 84, 45, 33, 71, 19, 67, 88, 1, 35, 47, 83, 25, 49, 11, 42, 50, 70, 2, 46, 15, 26, 27, 68, 57, 65, 13, 53, and 90 were used. In set 9, genes 4, 66, 28, 44, 20, 34, 12, 85, 6, 17, 88, 8, 39, 65, 22, 19, 10, 48, 63, 23, 33, 13, 47, 81, 79, 89, 64, 53, 87, 11, 46, 74, 14, 70, 37, 62, 30, 7, 71, 76, 50, 59, 77, 51, 15, 68, 55, 72, 83, 82, 78, 54, 25, 21, 27, 41, 69, 9, 58, 3, 31, 75, 84, 26, 86, 49, 18, 42, 61, 45, 16, 2, 24, 80, and 73 were used. In set 10, genes 78, 47, 32, 30, 46, 6, 2, 64, 11, 27, 85, 22, 79, 63, 80, 39, 90, 65, 71, 72, 21, 26, 58, 15, 16, 23, 81, 1, 44, 43, 40, 55, 13, 19, 25, 83, 41, 18, 53, 68, 37, 20, 49, 69, 33, 61, 38, 28, 60, 45, 17, 82, 24, 4, 86, 89, 36, 51, 84, 31, 14, 88, 59, 76, 48, 5, 35, 75, 74, 7, 67, 62, 52, 56, and 54 were used.
[0175] For 80 genes, set 1, genes 29, 80, 5, 50, 63, 3, 1, 55, 38, 48, 58, 30, 86, 82, 83, 6, 23, 2, 41, 60, 54, 69, 15, 34, 64, 10, 27, 70, 28, 44, 8, 68, 56, 14, 36, 17, 73, 13, 88, 42, 72, 59, 67, 71, 26, 53, 37, 24, 79, 62, 52, 74, 4, 40, 47, 19, 78, 11, 76, 31, 90, 12, 87, 89, 75, 66, 81, 16, 49, 65, 57, 84, 46, 20, and 21 were used. In set 2, genes 15, 21, 70, 5, 79, 85, 84, 53, 69, 33, 28, 14, 75, 76, 58, 48, 13, 45, 51, 88, 25, 74, 39, 71, 64, 9, 60, 44, 78, 7, 8, 3, 32, 89, 73, 1, 4, 29, 41, 17, 46, 57, 72, 20, 86, 47, 49, 87, 55, 19, 37, 27, 80, 62, 54, 18, 52, 67, 63, 77, 65, 24, 31, 26, 83, 2, 22, 90, 50, 12, 16, 35, 11, 10, and 56 were used. In set 3, genes 41, 4, 59, 73, 29, 22, 60, 45, 70, 10, 64, 21, 81, 36, 52, 67, 54, 38, 65, 90, 27, 87, 28, 7, 74, 43, 56, 75, 9, 35, 42, 20, 72, 47, 14, 63, 18, 68, 23, 69, 8, 50, 89, 3, 11, 82, 39, 80, 46, 16, 53, 58, 25, 79, 49, 76, 37, 30, 78, 83, 2, 84, 57, 88, 6, 32, 12, 71, 15, 55, 48, 34, 62, 61, and 13 were used. In set 4, genes 23, 31, 53, 90, 3, 40, 34, 6, 1, 83, 9, 60, 56, 50, 44, 85, 51, 35, 43, 80, 65, 46, 38, 88, 17, 54, 87, 10, 45, 42, 75, 68, 63, 58, 36, 64, 67, 77, 21, 47, 30, 59, 14, 49, 70, 66, 72, 74, 27, 61, 19, 81, 20, 25, 33, 57, 62, 76, 55, 78, 84, 16, 69, 37, 79, 29, 39, 32, 15, 5, 2, 12, 71, 11, and 73 were used. In set 5, genes 29, 71, 21, 60, 43, 78, 55, 61, 51, 90, 10, 37, 35, 53, 28, 62, 15, 1, 31, 67, 48, 36, 75, 27, 63, 87, 24, 32, 54, 79, 16, 70, 64, 40, 47, 41, 17, 38, 3, 45, 81, 68, 72, 56, 77, 8, 13, 34, 57, 26, 73, 14, 6, 82, 4, 58, 89, 30, 7, 74, 69, 88, 20, 5, 46, 2, 11, 49, 50, 23, 33, 42, 83, 52, and 86 were used. In set 6, genes 39, 54, 30, 24, 80, 10, 21, 7, 14, 69, 38, 83, 52, 65, 46, 42, 66, 36, 61, 16, 50, 33, 2, 73, 13, 81, 48, 8, 6, 41, 12, 25, 43, 79, 35, 26, 89, 75, 60, 67, 82, 45, 20, 90, 68, 77, 58, 34, 18, 47, 22, 84, 4, 57, 32, 5, 19, 59, 86, 74, 1, 31, 62, 85, 29, 53, 88, 28, 40, 37, 63, 15, 64, 49, and 55 were used.
In set 7, genes 21, 68, 81, 50, 36, 6, 80, 76, 90, 74, 12, 79, 34, 53, 1, 4, 5, 41, 56, 47, 15, 63, 11, 14, 7, 78, 57, 65, 73, 20, 8, 64, 84, 30, 3, 13, 52, 49, 27, 86, 60, 72, 62, 29, 75, 40, 32, 2, 82, 33, 10, 24, 51, 17, 46, 38, 19, 37, 28, 69, 61, 85, 88, 22, 48, 89, 18, 25, 71, 58, 31, 35, 26, 55, and 44 were used. In set 8, genes 30, 64, 67, 79, 52, 71, 13, 3, 22, 8, 75, 41, 65, 21, 60, 36, 49, 84, 33, 29, 57, 86, 15, 12, 85, 63, 6, 20, 66, 53, 51, 90, 87, 55, 11, 32, 31, 61, 78, 58, 42, 48, 5, 1, 17, 50, 70, 76, 25, 45, 2, 73, 28, 14, 89, 56, 39, 44, 7, 74, 16, 72, 35, 19, 47, 27, 43, 83, 68, 26, 18, 37, 69, 54, and 23 were used. In set 9, genes 79, 85, 48, 29, 23, 31, 62, 37, 5, 33, 3, 19, 53, 9, 36, 18, 58, 17, 81, 46, 8, 35, 66, 87, 14, 30, 74, 77, 21, 40, 75, 43, 42, 15, 39, 70, 60, 13, 10, 2, 72, 44, 45, 38, 4, 25, 84, 68, 50, 24, 7, 27, 82, 55, 80, 32, 89, 57, 6, 69, 83, 28, 56, 22, 16, 1, 41, 63, 26, 78, 12, 59, 64, 61, and 11 were used. In set 10, genes 45, 9, 24, 85, 68, 80, 73, 17, 56, 7, 8, 5, 69, 58, 37, 44, 21, 29, 50, 15, 53, 25, 40, 88, 36, 32, 59, 75, 49, 35, 43, 67, 83, 31, 51, 28, 60, 77, 30, 74, 22, 41, 42, 64, 61, 23, 90, 13, 33, 11, 16, 20, 46, 66, 6, 87, 39, 47, 65, 3, 82, 10, 72, 34, 18, 1, 38, 57, 79, 71, 26, 27, 19, 48, and 76 were used.
[0176] For 85 genes, set 1, genes 62, 19, 38, 77, 64, 49, 14, 16, 47, 73, 28, 3, 54, 78, 70, 12, 75, 35, 15, 40, 21, 60, 58, 86, 83, 33, 66, 59, 44, 45, 56, 9, 5, 81, 72, 68, 27, 37, 71, 52, 48, 36, 79, 6, 41, 74, 22, 46, 2, 20, 34, 13, 55, 53, 10, 88, 57, 61, 4, 39, 24, 85, 76, 87, 65, 25, 23, 90, 32, 26, 80, 63, 89, 82, and 7 were used. In set 2, genes 72, 30, 36, 64, 47, 57, 67, 20, 58, 1, 6, 61, 71, 32, 42, 53, 87, 65, 25, 17, 9, 60, 83, 12, 51, 8, 37, 75, 59, 89, 85, 22, 44, 19, 63, 7, 62, 13, 81, 41, 79, 43, 49, 4, 34, 68, 88, 74, 28, 31, 10, 39, 11, 55, 15, 5, 69, 50, 66, 18, 77, 46, 76, 33, 3, 35, 38, 14, 40, 86, 54, 23, 24, 48, and 78 were used. In set 3, genes 5, 67, 57, 18, 12, 42, 43, 71, 50, 19, 26, 51, 52, 32, 74, 88, 46, 2, 9, 77, 30, 58, 69, 81, 35, 87, 90, 34, 22, 15, 84, 44, 8, 3, 47, 60, 55, 66, 33, 20, 86, 39, 16, 37, 85, 73, 4, 13, 56, 27, 65, 76, 49, 54, 75, 31, 68, 82, 23, 62, 7, 53, 78, 36, 64, 40, 45, 6, 70, 17, 79, 10, 21, 48, and 89 were used. In set 4, genes 67, 47, 68, 38, 50, 82, 54, 56, 64, 49, 63, 14, 22, 7, 25, 12, 57, 85, 88, 5, 28, 23, 77, 44, 80, 89, 83, 20, 81, 73, 11, 17, 76, 75, 32, 34, 55, 62, 21, 6, 30, 10, 71, 39, 36, 74, 42, 60, 43, 8, 59, 58, 65, 3, 61, 72, 70, 79, 16, 13, 18, 19, 45, 4, 84, 1, 87, 26, 46, 40, 37, 78, 15, 69, and 41 were used. In set 5, genes 82, 42, 35, 86, 14, 37, 39, 30, 41, 60, 44, 9, 12, 34, 50, 68, 5, 29, 46, 19, 11, 28, 48, 3, 20, 77, 67, 57, 88, 55, 32, 78, 51, 71, 47, 63, 6, 10, 45, 70, 8, 81, 18, 43, 69, 79, 21, 13, 66, 59, 33, 1, 31, 74, 36, 2, 24, 54, 23, 85, 72, 73, 80, 64, 84, 7, 38, 87, 58, 75, 22, 65, 15, 53, and 52 were used. In set 6, genes 55, 2, 11, 72, 4, 85, 43, 18, 46, 27, 80, 69, 9, 31, 39, 5, 81, 22, 32, 3, 36, 17, 83, 37, 90, 38, 87, 44, 56, 66, 13, 6, 28, 77, 54, 79, 41, 78, 47, 29, 8, 21, 63, 64, 73, 48, 14, 34, 82, 70, 30, 58, 84, 24, 26, 68, 1, 65, 60, 42, 33, 20, 7, 75, 12, 57, 59, 16, 74, 88, 23, 49, 50, 40, and 71 were used. 
In set 7, genes 18, 40, 66, 35, 20, 85, 12, 19, 86, 26, 36, 89, 84, 88, 74, 15, 33, 75, 50, 16, 49, 32, 38, 31, 2, 27, 87, 68, 69, 53, 60, 79, 7, 21, 63, 17, 90, 30, 29, 11, 56, 25, 58, 62, 48, 8, 45, 9, 72, 64, 28, 76, 3, 78, 46, 1, 10, 34, 43, 83, 5, 52, 14, 65, 51, 41, 22, 44, 61, 24, 70, 54, 59, 77, and 13 were used. In set 8, genes 35, 40, 80, 57, 23, 28, 9, 83, 13, 47, 82, 36, 86, 44, 90, 55, 30, 22, 12, 42, 38, 49, 45, 8, 87, 17, 52, 3, 33, 15, 32, 21, 76, 58, 7, 53, 20, 67, 19, 29, 85, 68, 71, 39, 24, 25, 84, 4, 6, 75, 63, 73, 5, 18, 31, 48, 65, 41, 60, 37, 88, 72, 1, 46, 79, 16, 78, 10, 77, 34, 66, 56, 61, 70, and 2 were used. In set 9, genes 51, 59, 73, 9, 79, 21, 39, 67, 71, 68, 28, 65, 85, 30, 41, 61, 29, 8, 16, 78, 34, 1, 77, 90, 45, 33, 60, 89, 49, 56, 43, 62, 83, 6, 11, 18, 50, 66, 47, 19, 4, 22, 13, 27, 86, 26, 20, 17, 52, 10, 70, 54, 42, 53, 24, 76, 81, 75, 38, 64, 74, 36, 48, 32, 82, 44, 37, 57, 72, 35, 7, 14, 15, 3, and 23 were used. In set 10, genes 18, 85, 61, 1, 52, 87, 42, 13, 88, 66, 46, 57, 50, 36, 75, 39, 14, 27, 54, 20, 53, 10, 4, 30, 37, 43, 79, 80, 40, 84, 76, 45, 60, 74, 12, 31, 15, 44, 48, 3, 56, 11, 68, 19, 86, 72, 6, 9, 21, 70, 34, 83, 89, 5, 69, 64, 22, 24, 63, 65, 55, 8, 41, 28, 2, 16, 35, 77, 26, 47, 90, 49, 59, 23, and 17 were used.
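The subset listings above lend themselves to a mechanical check. The following is a minimal Python sketch, not part of the disclosed method, that (a) draws a subset of a given size from genes numbered 1 to 90, on the assumption that such sets are chosen at random without replacement, and (b) checks a transcribed listing for repeated or out-of-range gene numbers. The function names, the seed, and the short example listing are hypothetical.

```python
import random
from collections import Counter

TOTAL_GENES = 90  # genes are referred to by number, 1 through 90


def draw_subset(size, rng):
    """Draw `size` distinct gene numbers from 1..TOTAL_GENES.

    Random sampling without replacement is an assumption of this sketch.
    """
    return rng.sample(range(1, TOTAL_GENES + 1), size)


def check_listing(genes, expected_size):
    """Report size, duplicate, and range problems in a transcribed listing."""
    counts = Counter(genes)
    return {
        "size_ok": len(genes) == expected_size,
        "duplicates": sorted(g for g, n in counts.items() if n > 1),
        "out_of_range": sorted(g for g in counts if not 1 <= g <= TOTAL_GENES),
    }


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only so the sketch is repeatable
    subset = draw_subset(65, rng)
    print(check_listing(subset, 65))
    # Hypothetical listing with one repeated and one out-of-range entry.
    print(check_listing([7, 12, 7, 91], 4))
```

Running `check_listing` on each transcribed set with its stated size would flag any listing whose length differs from that size or that repeats a gene number.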
Example 5: PCR Based Detection
[0177] As noted above, the determination or measurement of gene expression may be performed by PCR, such as the use of quantitative PCR. Detecting expression of 50 or more expressed sequences in the human genome may be used in such embodiments of the invention. Additionally, expression levels of 50 or more gene sequences in the set of 74, the set of 90, or a combination set of the two (with a total of 126 gene sequences given the presence of 38 gene sequences in common between the two sets) may also be used. The invention contemplates the use of quantitative PCR to measure expression levels, as described above, of 87 gene sequences (or 50 or more sequences thereof), all of which are present in either the set of 74 or the set of 90. Of the 87 gene sequences, 60 are present in the set of 74, and 63 are present in the set of 90. The identifiers/accession numbers of the 87 gene sequences are AA456140, AA745593, AA765597, AA782845, AA865917, AA946776, AA993639, AB038160, AF104032, AF133587, AF301598, AF332224, AI041545, AI147926, AI309080, AI341378, AI457360, AI620495, AI632869, AI683181, AI685931, AI802118, AI804745, AI952953, AI985118, AJ000388, AK025181, AK027147, AK054605, AL023657, AL039118, AL110274, AL157475, AW118445, AW194680, AW291189, AW298545, AW445220, AW473119, AY033998, BC000045, BC001293, BC001504, BC001639, BC002551, BC004331, BC004453, BC005364, BC006537, BC006811, BC006819, BC008764, BC008765, BC009084, BC009237, BC010626, BC011949, BC012926, BC013117, BC015754, BC017586, BE552004, BE962007, BF224381, BF437393, BF446419, BF592799, BI493248, H05388, H07885, H09748, M95585, N64339, NM_000065, NM_001337, NM_003914, NM_004062, NM_004063, NM_004496, NM_006115, NM_019894, NM_033229, R15881, R45389, R61469, X69699, and X96757.
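The set counts in the preceding paragraph follow from the inclusion-exclusion rule for two sets. A small Python sketch of that arithmetic is given below; the helper names are hypothetical, and the second calculation assumes, as stated above, that every one of the 87 sequences belongs to at least one of the two sets.

```python
def union_size(size_a, size_b, shared):
    """Size of the combined set: |A| + |B| minus the members in common."""
    return size_a + size_b - shared


def shared_size(size_a, size_b, union):
    """Members in common, given both set sizes and the size of their union."""
    return size_a + size_b - union


# Set of 74 and set of 90 with 38 sequences in common.
print(union_size(74, 90, 38))   # 126

# 87 sequences in total, 60 in the set of 74 and 63 in the set of 90.
print(shared_size(60, 63, 87))  # 36
```

On these figures, 36 of the 87 sequences used for quantitative PCR would be present in both the set of 74 and the set of 90.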
[0178] The use of from 50 to all of these sequences in the practice of the invention may include the use of expression levels measured for reference gene sequences as described herein. In some embodiments, the reference gene sequences are one or more of the 8 disclosed herein. The invention contemplates the use of one or more of the reference sequences identified by AF308803, AL137727, BC003043, BC006091, and BC016680 in PCR or QPCR based embodiments of the invention. Of course, all 5 of these reference sequences may also be used.
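One common way to use reference gene measurements in quantitative PCR is to subtract the mean reference cycle threshold (Ct) from each target Ct. The Python sketch below illustrates that delta-Ct calculation; it is an assumption made for illustration, not a normalization procedure stated in this disclosure, and the Ct values shown are invented. Only the accession numbers are taken from the text above.

```python
from statistics import mean

# Reference sequences named in the paragraph above.
REFERENCE_IDS = ["AF308803", "AL137727", "BC003043", "BC006091", "BC016680"]


def delta_ct(target_ct, reference_ct):
    """Return target Ct minus the mean reference Ct for each target sequence.

    Both arguments map an accession number to a measured Ct value.
    A lower delta-Ct indicates higher expression relative to the references.
    """
    if not reference_ct:
        raise ValueError("at least one reference Ct value is required")
    ref_mean = mean(reference_ct.values())
    return {gene: ct - ref_mean for gene, ct in target_ct.items()}


if __name__ == "__main__":
    # Invented Ct values, for illustration only.
    references = dict(zip(REFERENCE_IDS, [20.0, 22.0, 21.0, 19.0, 23.0]))
    targets = {"AA456140": 25.0, "X96757": 18.5}
    print(delta_ct(targets, references))
```

Averaging over several reference sequences rather than relying on one makes the normalization less sensitive to variation in any single reference measurement.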
[0179] All references cited herein, including patents, patent applications, and publications, are hereby incorporated by reference in their entireties, whether previously specifically incorporated or not.
[0180] Having now fully described this invention, it will be appreciated by those skilled in the art that the same can be performed within a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation.
[0181] While this invention has been described in connection with specific embodiments thereof, it will be understood that it is capable of further modifications. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth.
Sequence CWU
1
1
26812930DNAHomo sapiens 1ggccactctg cagacagctc cagacaatca ggcactcgtc
acacagagtc ttcctctcgt 60ggacaggctg cgtcatccca tgaacaggca agatcaagtg
caggagaaag acatggatcc 120caccaccagc agtcagcaga cagctccaga cacgcaggca
ttgggcacgg acaagcttca 180tctgcagtca gagacagtgg acaccgaggg tacagaggta
gtcaggccac tgacagtgag 240ggacattcag aagactcaga cacacagtca gtgtcagcac
agggacaagc tgggccccat 300cagcagagcc accaagagtc cgcacgtggc cagtcagggg
aaagctctgg acgttcaggg 360tctttcctct accaggtgag cactcatgaa cagtctgagt
ccacccatgg acagtctgtg 420cccagcactg gaggaagaca aggatcccac catgatcagg
cacaagacag ctccaggcac 480tcagcatccc aagagggtca ggacaccatt cgtggacacc
cggggccaag cagaggagga 540agacaggggt cccaccacga gcaatcggta gataggtctg
gacactcagg gtcccatcac 600agccacacca catcccaggg aaggtctgat gcctcccgtg
ggcagtcagg atccagaagt 660gcaagcagac aaacacatga ccaggaacaa tcaggagacg
gctctaggca ctcagggtcg 720cgtcatcagg aagcttcctc ttgggccgac agctctagac
actcacaggc agtccaggga 780caatcagagg ggtccaggac aagcaggcgc cagggatcca
gtgttagcca ggacagtgac 840agtcagggac actcagaaga ctctgagagg cggtctgggt
ctgcttccag aaaccatcgt 900ggatctgctc aggagcagtc aagagatggc tccagacacc
ccaggtccca tcacgaagac 960agagccggtc acggggactc tgcagagagc tccagacaat
caggcactca tcatgcagag 1020aattcctctg gtggacaggc tgcatcatcc catgaacagg
caagatcaag tgcaggagag 1080agacatggat cccactacca gcagtcagca gacagctcca
gacactcagg cattgggcac 1140ggacaagctt catctgcagt cagagacagt ggacaccgag
ggtccagtgg tagtcaggcc 1200agtgacaatg agggacattc agaagactca gacacacagt
cagtgtcagc ccaccgacag 1260gctgggcgcc atcacgagag ccaccaagag tccacacgtg
gccggtcacg aggaaggtct 1320ggacgttcag ggtctttcct ctaccaggtg agcactcatg
aacagtctga gtctgcccat 1380ggacgggctg ggcccagtac tggaggaaga caaggatccc
gccacgagca ggcacgagac 1440agctccaggc actcagcgtc ccaagagggt caggacacca
ttcgtggaca cccggggtca 1500aggagaggag gaagacaggg atcctaccac gagcaatcgg
tagataggtc tggacactca 1560gggtcccatc acagccacac cacatcccag ggaaggtctg
atgcctccca tgggcagtca 1620ggatccagaa gtgcaagcag agaaacacgt aatgaggaac
agtcaggaga cggctccagg 1680cactcagggt cgcgtcacca tgaagcttcc actcaggctg
acagctctag acactcacag 1740tccggccagg gtgaatcagc ggggtccagg agaagcaggc
gccagggatc cagtgttagc 1800caggacagtg acagtgaggc atacccagag gactctgaga
ggcgatctga gtctgcttcc 1860agaaaccatc atggatcttc tcgggagcag tcaagagatg
gctccagaca ccccggatcc 1920tctcaccgcg atacagccag tcatgtacag tcttcacctg
tacagtcaga ctctagtacc 1980gctaaggaac atggtcactt tagtagtctt tcacaagatt
ctgcgtatca ctcaggaata 2040cagtcacgtg gcagtcctca cagttctagt tcttatcatt
atcaatctga gggcactgaa 2100aggcaaaaag gtcaatcagg tttagtttgg agacatggca
gctatggtag tgcagattat 2160gattatggtg aatccgggtt tagacactct cagcacggaa
gtgttagtta caattccaat 2220cctgttcttt tcaaggaaag atctgatatc tgtaaagcaa
gtgcgtttgg taaagatcat 2280ccaaggtatt atgcaacgta tattaataag gacccaggtt
tatgtggcca ttctagtgat 2340atatcgaaac aactgggatt tagtcagtca cagagatact
attactatga gtaagaaatt 2400aatggcaaag gaattaatcc aagaatagaa gaatgaagca
agttcacttt caatcaagaa 2460acttcataat actttcaggg aagttatctt ttcctgtcaa
tctgtttaaa atatgctata 2520gtatttcatt agtttggtgg taacttattt ttattgtgta
atgatcttta aacgctatat 2580ttcagaaata ttaaatggaa gaaatcaata tcatggagag
ctaactttag aaaactagct 2640ggagtatttt aggagattct gggtcaagta atgttttatg
tttttgaaag tttaagtttt 2700agacactccc caaatttcta aattaatctt tttcagaaat
atcgaaggag ccaaaaatat 2760aaaacagttc tgatatccaa agtggctata tcaacatcag
ggctagcaca tctttctcta 2820ttatccttct attggaattc tagtattctg tattcaaaaa
atcatcttgg acataattaa 2880tattttagta agctgcatct aaattaaaaa taaactattc
atcatataat 293021591DNAHomo sapiens 2tagaatcggg ggtttcagct
cactgctcct tttctttttt ttctttctct cccccgccca 60cccccccaaa aataattgat
ttgctttaca atcatccaca ctgtgttttg tggatcttta 120attatatata acaatagtag
tcattttaaa tatatattct gaaatctttg caaattttaa 180cagaagagtc gaagctctgc
gagacccaat atttgccaat aagaatggtt atgataatta 240gcaccatgga gcctcaggtg
tcaaatggtc cgacatccaa tacaagcaat ggaccctcca 300gcaacaacag aaactgtcct
tctcccatgc aaacaggggc aaccacagat gacagcaaaa 360ccaacctcat cgtcaactat
ttaccccaga atatgaccca agaagaattc aggagtctct 420tcgggagcat tggtgaaata
gaatcctgca aacttgtgag agacaaaatt acaggacaga 480gtttagggta tggatttgtt
aactatattg atccaaagga tgcagagaaa gccatcaaca 540ctttaaatgg actcagactc
cagaccaaaa ccataaaggt ctcatatgcc cgtccgagct 600ctgcctcaat cagggatgct
aacctctatg ttagcggcct tcccaaaacc atgacccaga 660aggaactgga gcaacttttc
tcgcaatacg gccgtatcat cacctcacga atcctggttg 720atcaagtcac aggagtgtcc
agaggggtgg gattcatccg ctttgataag aggattgagg 780cagaagaagc catcaaaggg
ctgaatggcc agaagcccag cggtgctacg gaaccgatta 840ctgtgaagtt tgccaacaac
cccagccaga agtccagcca ggccctgctc tcccagctct 900accagtcccc taaccggcgc
tacccaggtc cacttcacca ccaggctcag aggttcaggc 960tggacaattt gcttaatatg
gcctatggcg taaagagact gatgtctgga ccagtccccc 1020cttctgcttg ttcccccagg
ttctccccaa ttaccattga tggaatgaca agccttgtgg 1080gaatgaacat ccctggtcac
acaggaactg ggtggtgcat ctttgtctac aacctgtccc 1140ccgattccga tgagagtgtc
ctctggcagc tctttggccc ctttggagca gtgaacaacg 1200taaaggtgat tcgtgacttc
aacaccaaca agtgcaaggg attcggcttt gtcaccatga 1260ccaactatga tgaggcggcc
atggccatcg ccagcctcaa cgggtaccgc ctgggagaca 1320gagtgttgca agtttccttt
aaaaccaaca aagcccacaa gtcctgaatt tcccattctt 1380acttactaaa atatatatag
aaatatatac gaacaaaaca cacgcgcgca cacacacaca 1440tacacgaaag agagagaaac
aaacttttca aggcttatat tcaaccatgg actttataag 1500ccagtgttgc ctaagtatta
aaacattgga ttatcctgag gtgtaccagg aaaggatttt 1560ataatgctta gaaaaaaaaa
aaaaaaaaaa a 159132872DNAHomo sapiens
3tccaggaatc gatagtgcat tcgtgcgcgc ggccgcccgt cgcttcgcac agggctggat
60ggttgtattg ggcagggtgg ctccaggatg ttaggaactg tgaagatgga agggcatgaa
120accagcgact ggaacagcta ctacgcagac acgcaggagg cctactcctc ggtcccggtc
180agcaacatga actcaggcct gggctccatg aactccatga acacctacat gaccatgaac
240accatgacta cgagcggcaa catgaccccg gcgtccttca acatgtccta tgccaacccg
300gccttagggg ccggcctgag tcccggcgca gtagccggca tgccgggggg ctcggcgggc
360gccatgaaca gcatgactgc ggccggcgtg acggccatgg gtacggcgct gagcccgagc
420ggcatgggcg ccatgggtgc gcagcaggcg gcctccatga tgaatggcct gggcccctac
480gcggccgcca tgaacccgtg catgagcccc atggcgtacg cgccgtccaa cctgggccgc
540agccgcgcgg gcggcggcgg cgacgccaag acgttcaagc gcagttaccc gcacgccaag
600ccgccctact cgtacatctc gctcatcacc atggccatcc agcgggcgcc cagcaagatg
660ctcacgctga gcgagatcta ccagtggatc atggacctct tcccctatta ccggcagaac
720cagcagcgct ggcagaactc catccgccac tcgctgtcct tcaatgactg cttcgtcaag
780gtggcacgct ccccggacaa gccgggcaag ggctcctact ggacgctgca cccggactcc
840ggcaacatgt tcgagaacgg ctgctacttg cgccgccaga agcgcttcaa gtgcgagaag
900cagccggggg ccggcggcgg gggcgggagc ggaagcgggg gcagcggcgc caagggcggc
960cctgagagcc gcaaggaccc ctctggcgcc tctaacccca gcgccgactc gcccctccat
1020cggggtgtgc acgggaagac cggccagcta gagggcgcgc cggccccggg cccggccgcc
1080agcccccaga ctctggacca cagtggggcg acggcgacag ggggcgcctc ggagttgaag
1140actccagcct cctcaactgc gccccccata agctccgggc ccggggcgct ggcctctgtg
1200cccgcctctc acccggcaca cggcttggca ccccacgagt cccagctgca cctgaaaggg
1260gacccccact actccttcaa ccacccgttc tccatcaaca acctcatgtc ctcctcggag
1320cagcagcata agctggactt caaggcatac gaacaggcac tgcaatactc gccttacggc
1380tctacgttgc ccgccagcct gcctctaggc agcgcctcgg tgaccaccag gagccccatc
1440gagccctcag ccctggagcc ggcgtactac caaggtgtgt attccagacc cgtcctaaac
1500acttcctagc tcccgggact ggggggtttg tctggcatag ccatgctggt agcaagagag
1560aaaaaatcaa cagcaaacaa aaccacacaa accaaaccgt caacagcata ataaaatcca
1620acaactattt ttatttcatt tttcatgcac aaccttgccc ccagtgcaaa agactgttac
1680tttattattg tattcaaaat tcattgtgta tattactaca aagacggccc caaaccaatt
1740tttttcctgc gaagtttaat gatccacaag tgtatatatg aaattctcct ccttccttgc
1800ccccctctct ttcttccctc ttggccctcc agacattcta gtttgtggag ggttatttaa
1860aaaacaaaaa ggaagatggt caagtttgta aaatatttgt ttgtgctttt cccccctcct
1920tacctgaccc cctacgagtt tacaggcttg tggcaatact cttaaccata agaattgaaa
1980tggtgaagaa acaagtatac actagaggct cttaaaagta ttgaaaagac aatactgctg
2040ttatatagca agacataaac agattataaa catcagagcc atttgcttct cagtttacat
2100ttctgataca tgcagatagc agatgtcttt aaatgaaata catgtatatt gtgtatggac
2160ttaattatgc acatgctcag atgtgtagac atcctccgta tatttacata acatatagag
2220gtaatagata ggtgatatac gtgatacgtt ctcaagagtt gcttgaccga aagttacaag
2280gaccccaacc cctttgctct ctacccacag atggccctgg gaacaatcct caggaattgc
2340cctcaagaac tcgcttcttt gctttgagag tgccatggtc atgtcattct gaggtacata
2400acacataaat tagtttctat gagtgtatac catttaaaga ttttttcagt aaagggaata
2460ttacatgttg ggaggaggag ataagttata gggagctgga tttcaaacgg tggtccaaga
2520ttcaaaaatc ctattgatag tggccatttt aatcattgcc atcgtgtgct tgtttcatcc
2580agtgttatgc actttccaca gttggtgtta gtatagccag agggtttcat tattatttct
2640ctttgctttc tcaatgttaa tttattgcat ggtttattct ttttctttac agctgaaatt
2700gctttaaatg atggttaaaa ttacaaatta aattgggaat ttttatcaat gtgattgtaa
2760ttaaaaatat tttgatttaa ataacaaaaa taataccaga ttttaagccg cggaaaatgt
2820tcttgatcat ttgcagttaa ggactttaaa taaatcaaat gttaacaaaa aa
28724540DNAHomo sapiens 4tgtttttcta gttcattttg tgtttccaac ttttcatgta
aaattttaat tatttttgaa 60tgtgtggatg tgagactgag gtgccttttg gtactgaaat
tctttttcca tgtacctgaa 120gtgttacttt tgtgatatag gaaatccttg tatatatact
ttattggtcc ctaggcttcc 180tattttgtta ccttgctttc tctatggcat ccaccatttt
gattgttcta cttttatgat 240atgttttcat aagtggttaa gcaagtattc tcgttacttt
tgctcttaaa tccctattca 300ttacagcaat gttggtggtc aaagaaaatg ataaacaact
tgaatgttca atggtcctga 360aatacataac aacattttag tacattgtaa agtagaatcc
tctgttcata atgaacaaga 420tgaaccaatg tggattagaa agaagtccga gatattaatt
ccaaaatatc cagacattgt 480taaagggaaa aaattgcaat aaaatatttg taacataaaa
aaaaaaaaaa aaaaaaaaaa 54052508DNAHomo sapiens 5agctctcccc accaataaaa
ggaccaggga ggatcagaga gagcagaagg atcctgagcc 60tcgcactctg ccgcccgcac
caccttccgc tgcctctcag actctgctca gcctcacacg 120atgtcgtgcc gctcctacag
gatcagctca ggatgcgggg tcaccaggaa cttcagctcc 180tgctcagctg tggcccccaa
aactggcaac cgctgctgca tcagcgccgc cccctaccga 240ggggtgtcct gctaccgagg
gctgacgggc ttcggcagcc gcagcctctg caacctgggc 300tcctgcgggc cccggatagc
tgtaggtggc ttccgagccg gctcctgcgg acgcagcttc 360ggctaccgct ccgggggcgt
gtgcggaccc agccccccat gcatcactac cgtgtcggtc 420aacgagagcc tcctcacgcc
cctcaacctg gagatcgacc ccaacgcaca gtgcgtgaag 480caggaggaga aggagcagat
caagtccctc aacagcaggt tcgcggcctt catcgacaag 540gtgcgcttcc tggagcagca
gaacaagctg ctggagacca agtggcagtt ctaccagaac 600cagcgctgct gcgagagcaa
cctggagcca ctgttcagtg gctacatcga gactctgcgg 660cgggaggccg agtgcgtgga
ggccgacagc gggaggctgg cctcagagct caaccatgtg 720caggaggtgc tggagggcta
caagaagaag tatgaagagg aggtggccct gagagccaca 780gcagagaatg agtttgtcgt
tctaaagaag gacgtggact gtgcctacct gcggaaatca 840gacctggagg ccaatgtgga
ggccctggtg gaggagtcta gcttcctgag gcgcctctat 900gaagaggaga tccgcgttct
ccaagcccac atctcagaca cctcggtcat agtcaagatg 960gacaacagcc gagacctgaa
catggactgc atcatcgctg agatcaaggc tcagtatgac 1020gatgttgcca gccgcagccg
ggccgaggct gagtcctggt accgtagcaa gtgtgaggag 1080atgaaggcca cggtgatcag
gcatggggag accctgcgcc gcaccaagga ggagatcaac 1140gagctgaacc gcatgatcca
gaggctgacg gccgagattg agaatgccaa gtgccagcgt 1200gccaagctgg aggctgctgt
ggctgaggca gagcagcagg gtgaggcggc cctcagcgat 1260gcccgctgca agctggctga
gctggagggc gccctgcaga aggccaagca ggacatggcc 1320tgcctgctca aggagtacca
ggaggtgatg aactccaagc tgggcctgga catcgagatc 1380gccacctaca ggcgcctgct
ggagggcgag gaacacaggc tgtgtgaagg tgtgggctct 1440gtgaatgtct gtgtcagcag
ctcccgtggt ggagtctcct gtgggggcct ctcctacagc 1500accaccccag ggcgccagat
cacttctggc ccctcagcca taggcggcag catcacggtg 1560gtggcccctg actcctgtgc
cccctgccag cctcgttcct ccagcttcag ctgcgggagt 1620agccggtcgg tccgctttgc
ctagtagagt catggagcca gggcttcctg ccaagcacct 1680gcctgcctgc atcactgcac
tgaatggcat gtgaatggaa aatgtgtgct tgcttccaga 1740atcttctgga tgttcctaca
gagggaaaga cctacagagg gaaagaccct cgggccgctc 1800ccctgcgcct tttcatgcta
gggagatgca tcctagttgt cctcctggca gctgttttca 1860gaggcattcc cagcccttca
cttaactcct acttagctcc aaaatacctg tatccaattt 1920gtattattcc cccagctctc
agggacaaga ccagtccccc agcgtggtgg tcagcacgga 1980agctccacct tctgggtgga
ggcgccatcc taaccatcca gccaggccac ccacaacccg 2040agaatcaggg agaaagtccc
tccccagcag ccccctcctc ctggctggga agaatggtcc 2100cccagcaagc acttgcctgt
tcattcccgt tcatgttttg cttctctctc agactgcctt 2160cctgcttctg ggctaacctg
ttccagccag gctcctcatg tgacctcgca gttgagaagc 2220ccattatcgt ggggcatcct
tttgcctaca gcccctggtt agggcacttt ggacaggtct 2280tgctattcag tgaacctttg
tacatttcaa agaagactcc atggctgctc cagatgcccc 2340cttgctgggt gcaggtgggg
actgtccaat gcagagctgg cgggacagag agttaagcca 2400cttcctgggt ctccttctta
tgactgtcta tgggtgcatt gccttctggg ttgtctcgat 2460ctgtgtttca ataaatgccg
ctgcaatgca aaaaaaaaaa aaaaaaaa 250861354DNAHomo
sapiensmisc_feature(57)..(57)a or g or c or t/u 6caatcagtga aaattctata
ttcctttggc atttttgtga catattcaat tcagttntat 60gttccagcag agatcattat
ccctgggatc acatccaaat ttcatactaa atggaagcaa 120atctgtgaat ttgggataag
atccttcttg gttagtatta cttgcgccgg agcaatgtct 180tattcctcgt ttagacattg
tgatttcctt cgttggagct gtgagcagca gcacattggc 240cctaatcctg ccacctttgg
ttgaaattct tacattttcg aaggaacatt ataatatatg 300gatggtcctg aaaaatattt
ctatagcatt cactggagtt gttggcttct tattaggtac 360atatataact gttgaagaaa
ttatttatcc tactcccaaa gttgtagctg gcactccaca 420gagtcctttt ctaaatttga
attcaacatg cttaacatct ggtttgaaat agtaaaagca 480gaatcatgag tcttctattt
ttgtcccatt tctgaaaatt atcaagataa ctagtaaaat 540acattgctat atacataaaa
atggtaacaa actctgtttt ctttggcacg atattaatat 600tttggaagta atcataactc
tttaccagta gtggtaaacc tatgaaaaat ccttgctttt 660aagtgttagc aatagttcaa
aaaattaagt tctgaaaatt gaaaaaatta aaatgtaaaa 720aaattaaaga ataaaaatac
ttctattatt cttttatctc agtaagaaat accttaacca 780agatatctct cttttatgct
actcttttgc cactcacttg agaacagaat aggatttcaa 840caataagaga ataaaataag
aacatgtata acaaaaagct ctctccagat catccctgtg 900aatgccaaag taaactttat
gtacagtgta aaaaaaaaaa aatctcagtt atgtttttat 960tagccaaatt ctaatgattg
gctcctggaa gtatagaaaa ctcccattaa cataatataa 1020gcatcagaaa attgcaaaca
ctagaattaa ttttacactc taatggtagt tgatcttcat 1080agtcaagagg cactgttcaa
gatcatgact tagtgtttca atgaaatttg aaaagggact 1140ttaaaactta tccagtgcaa
ctcccttgtt tttcgtcaga ggaaaaggag gcctagaaag 1200gttaagtaac ttggtcgaga
ccactcagcc ttgagatcaa gaaaacctaa tcttctgact 1260cccaggccag gatgttttat
ttctcacatc atgtccaaga aaaagaataa attatgttca 1320gcttaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaa 135472460DNAHomo sapiens
7cggaggcggc gccgacgggg actgctgagg cgcgcagagg gtcggcggcg cccgggagcc
60tgtcgctggc gcggtccggg cgggaggctc ggcggcgggc ggcagcatgt cggtggcggg
120gctgaagaag cagttctaca aggcgagcca gctggtcagt gagaaggtcg gaggggccga
180ggggaccaag ctggatgatg acttcaaaga gatggagaag aaggtggatg tcaccagcaa
240ggcggtgaca gaagtgctgg ccaggaccat cgagtacctg cagcccaacc cagcctcgcg
300ggctaagctg accatgctca acacggtgtc caagatccgg ggccaggtga agaaccccgg
360ctacccgcag tcggaggggc ttctgggcga gtgcatgatc cgccacggga aggagctggg
420cggcgagtcc aactttggtg acgcattgct ggatgccggc gagtccatga agcgcctggc
480agaggtgaag gactccctgg acatcgaggt caagcagaac ttcattgacc ccctccagaa
540cctgtgcgag aaagacctga aggagatcca gcaccacctg aagaaactgg agggccgccg
600cctggacttt gactacaaga agaagcggca gggcaagatc cccgatgagg agctacgcca
660ggcgctggag aagttcgagg agtccaagga ggtggcagaa accagcatgc acaacctcct
720ggagactgac atcgagcagg tgagtcagct ctcggccctg gtggatgcac agctggacta
780ccaccggcag gccgtgcaga tcctggacga gctggcggag aagctcaagc gcaggatgcg
840ggaagcttcc tcacgcccta agcgggagta taagccgaag ccccgggagc cctttgacct
900tggagagcct gagcagtcca acgggggctt cccctgcacc acagccccca agatcgcagc
960ttcatcgtct ttccgatctt ccgacaagcc catccggacc cctagccgga gcatgccgcc
1020cctggaccag ccgagctgca aggcgctgta cgacttcgag cccgagaacg acggggagct
1080gggcttccat gagggcgacg tcatcacgct gaccaaccag atcgatgaga actggtacga
1140gggcatgctg gacggccagt cgggcttctt cccgctcagc tacgtggagg tgcttgtgcc
1200cctgccgcag tgactcaccc gtgtccccgc cccgcccctc cgtccacact ggccggcacc
1260ccctgctggg tctcctgcat tccacggagc ccctgctgcc agggcggtgt ctgagcctgc
1320cggcgccacc tgggccccgg cccttgaggt actccctgag caggacccca cacttgggtg
1380ggggggctta tctgggtggg tggggatgcc tgtttacact agcgctgact cccaacggtg
1440acggctccct tccccactcc atggcgccag cctcctcccc cgctccccaa cttctcgccc
1500agctggccga ggcggggcaa cactaaggtg ctcttagaaa cactaatgtt cctctggggc
1560agcccccacc tccgtcctga cccgacgggg gcccggccca ctgcctaccc tcgagtcccg
1620cagccttaac aggatgggat cgagggtccc catggggtgg ctcagagata ggaccctggt
1680tttaaatccc tcccagcctg gtgctggtga tgggccctgg ccctactcca gggccaatgc
1740acccccgcct cacacacgca ctccttctcc tcaaggccag ggcagagggc ctcaccgcct
1800cccgggcctg ctgtcagctt gcagcccggg gacagaggcc agctgggatc tgcctgagga
1860cagagaacat ggtctcctgc agggccctgc ctcccaagcc ccgccctcag aaagccaagt
1920accttttcag ctttttaact gcccccatcc caacccaggg aggcctgtgt cactctggca
1980caagctgcca ccaccagcca cccacaccca ccccagcaca cctcacacgg gaccacagcc
2040gcgctgccga gggccaagca caaaggttcc agtgagcgca tgtcccagcc cctggtggcc
2100aggctcccct tgctgagccg ctgccacttc accctgtggg aagtggcccc agccatctcc
2160tctagaccaa ggcaggcagc cccgacatct gcttcctcta tcgcccaatg caaaatcgat
2220gaaatgggga gttctctggg ccaggccaca ttcacattcc cctccccctg tggtccagtg
2280aagcctccgg accccaggct ctgctctgcc ctgccctgca cccccctcgt cagaagtaca
2340tgaggggcgc agagatgagc acacagcttt gggcacggtc cagggcaaac tgaaatgtac
2400gcctgaattt tgtaaacaga agtattaaat gtctctttct acaaaaaaaa aaaaaaaaaa
2460
8 1112 DNA Homo sapiens
8 ggcacgaggg aggtgcagag ctgagaatga ggcgatttcg
gaggatggag aaatagcccc 60gagtcccgtg gaaaatgagg ccggcggact tgctgcagct
ggtgctgctg ctcgacctgc 120ccagggacct gggcggaatg gggtgttcgt ctccaccctg
cgagtgccat caggaggagg 180acttcagagt cacctgcaag gatattcaac gcatccccag
cttaccgccc agtacgcaga 240ctctgaagct tattgagact cacctgagaa ctattccaag
tcatgcattt tctaatctgc 300ccaatatttc cagaatctac gtatctatag atgtgactct
gcagcagctg gaatcacact 360ccttctacaa tttgagtaaa gtgactcaca tagaaattcg
gaataccagg aacttaactt 420acatagaccc tgatgccctc aaagagctcc ccctcctaaa
gttccttggc attttcaaca 480ctggacttaa aatgttccct gacctgacca aagtttattc
cactgatata ttctttatac 540ttgaaattac agacaaccct tacatgacgt caatccctgt
gaatgctttt cagggactat 600gcaatgaaac cttgacactg aagctgtaca acaatggctt
tacttcagtc caaggatatg 660ctttcaatgg gacaaagctg gatgctgttt acctaaacaa
gaataaatac ctgacagtta 720ttgacaaaga tgcatttgga ggagtataca gtggaccaag
cttgctgctg cctcttggaa 780gaaagtcctt gtcctttgag actcagaagg ccccaagctc
cagtatgcca tcatgatgcc 840tgctaaggca gccaccttgg tgtacatgct cacagaggct
ctgttcatgg agcagctgct 900gtttgaaaaa ttttgaaatg caagatccac aactagatgg
aaggcactct agtctttgca 960gaaaaaaatg tacctgaatg tacattgcac aatgcctggc
acaaagaagg aagaatataa 1020atgatagttc gactcgtctg tggaagaact tacaatcatg
gggaaagatg gaataaaaac 1080attttttaaa cagcaaaaaa aaaaaaaaaa aa
1112
9 478 DNA Homo sapiens
9 ccccagcccc actcacccac
cctccttccc accagcctgc tctccgcagg cccactgtct 60ttgggtttaa tgacgtctct
tctctgtgga acttcacgat tccttcccac ggtcaactcg 120ggacctccca gcgaccactg
cagcctgcgg acgaggccgg gacttggccg agcggatcct 180aataagggga aaatggtaaa
tgcaaacgtc ccgttacaat tttaccgcca gtgtgctgtc 240gttccccctc cccctctccg
agtcctcgtg gggacacggc ggggtctgta ggaagttggg 300ccgggttggg ggttgctaga
aggcgctggt gttttgctct gagttttaag agatcccttc 360cttcctcttc ggtgaatgca
ggttatttaa actttgggaa atgtactttt agtctgtcat 420atcaaggcat gagtcactgt
ctttttttgt gtgaataaat ggtttctagt acaatgga 478
10 845 DNA Homo sapiens
10 gcggccgccc gcacgtccgc gggtcccggc cgcgccgccg ccgcgcgccc ctgcccgaga
60gagctctggc cccgctagcg gggccaggag ccgggcctcc caccgcagcg tcccccgccg
120cgccagtccc cgctagtggt agtatctcgt aatagcttct gtgtgtgagc taccgtggat
180ctccttccct tctcttgggg gccgggggga aagaaaagga tttaagcaaa ggctccctcg
240ccctgtgagg gcgagcggca aaggcccggc tgagcccccc atgcccctcc cctccccgtg
300taaaaagcct ccttgtgcaa ttgtcttttt tttcctttga acgtgcttct ttgtaatgac
360caaggtaccg atttctgcta agttctccca acaacatgaa actgcctatt cacgccgtaa
420ttctttctgt ctcccttctc tctctctctc tcgctcgctc gctctcgctc tcgctctctc
480tcgctgcgtc ctcatttccc ctcccaatcc tctctcccct ctgcaacccc ccagctcgct
540ggctttctct ctggcttctc tcttttcctc ctccacccac cccctttggt ttgacaattt
600tgtcttaagt gtttctcaaa agaggttact ttagttagca tgcgcgctgt gggcaattgt
660tacaagtgtt cttaggttta ctgtgaagag aatgtattct gtatccgtga attgctttat
720gggggggagg gagggctaat tatatatttt gttgttcctc tatactttgt tctgttgtct
780gcgcctgaaa agggcggaag agttacaata aagtttacaa gcgagaaccc gaaaaaaaaa
840aaaaa
845
11 1721 DNA Homo sapiens
11 caccagcaca gcaaacccgc cgggatcaaa gtgtaccagt
cggcagcatg gctacgaaat 60gtgggaattg tggacccggc tactccaccc ctctggaggc
catgaaagga cccagggaag 120agatcgtcta cctgccctgc atttaccgaa acacaggcac
tgaggcccca gattatctgg 180ccactgtgga tgttgacccc aagtctcccc agtattgcca
ggtcatccac cggctgccca 240tgcccaacct gaaggacgag ctgcatcact caggatggaa
cacctgcagc agctgcttcg 300gtgatagcac caagtcgcgc accaagctgg tgctgcccag
tctcatctcc tctcgcatct 360atgtggtgga cgtgggctct gagccccggg ccccaaagct
gcacaaggtc attgagccca 420aggacatcca tgccaagtgc gaactggcct ttctccacac
cagccactgc ctggccagcg 480gggaagtgat gatcagctcc ctgggagacg tcaagggcaa
tggcaaaggg ggttttgtgc 540tgctggatgg ggagacgttc gaggtgaagg ggacatggga
gagacctggg ggtgctgcac 600cgttgggcta tgacttctgg taccagcctc gacacaatgt
catgatcagc actgagtggg 660cagctcccaa tgtcttacga gatggcttca accccgctga
tgtggaggct ggactgtacg 720ggagccactt atatgtatgg gactggcagc gccatgagat
tgtgcagacc ctgtctctaa 780aagatgggct tattcccttg gagatccgct tcctgcacaa
cccagacgct gcccaaggct 840ttgtgggctg cgcactcagc tccaccatcc agcgcttcta
caagaacgag ggaggtacat 900ggtcagtgga gaaggtgatc caggtgcccc ccaagaaagt
gaagggctgg ctgctgcccg 960aaatgccagg cctgatcacc gacatcctgc tctccctgga
cgaccgcttc ctctacttca 1020gcaactggct gcatggggac ctgaggcagt atgacatctc
tgacccacag agaccccgcc 1080tcacaggaca gctcttcctc ggaggcagca ttgttaaggg
aggccctgtg caagtgctgg 1140aggacgagga actaaagtcc cagccagagc ccctagtggt
caagggaaaa cgggtggctg 1200gaggccctca gatgatccag ctcagcctgg atgggaagcg
cctctacatc accacgtcgc 1260tgtacagtgc ctgggacaag cagttttacc ctgatctcat
cagggaaggc tctgtgatgc 1320tgcaggttga tgtagacaca gtaaaaggag ggctgaagtt
gaaccccaac ttcctggtgg 1380acttcgggaa ggagcccctt ggcccagccc ttgcccatga
gctccgctac cctgggggcg 1440attgtagctc tgacatctgg atttgaactc caccctcatc
acccacactc cctattttgg 1500gccctcactt ccttggggac ctggcttcat tctgctctct
cttggcaccc gacccttggc 1560agcatgtacc acacagccaa gctgagactg tggcaatgtg
ttgagtcata tacatttact 1620gaccactgtt gcttgttgct cactgtgctg cttttccatg
agctcttgga ggcaccaaga 1680aataaactcg taaccctgtc cttcaaaaaa aaaaaaaaaa a
1721
12 1061 DNA Homo sapiens
12 ccggagataa cttgagggct
atagaggacc ggctaatact ggtcctgaat ttggcttcag 60gcctcaccaa ccaagtggcc
gtggccttgc cgtcttgccc gtcggccccc ggtgaggcct 120ggacccctgg ggtcccggca
ccaggccccg gcttccgacc ctggcagaag cccaagatct 180ggtccctcgc ggagactgcc
acaagccccg gacacccgcg ccggctcgcc tcccggcgcg 240ggggggtctc caccgggggg
caacggtcgc gcctttccgc cctgcagctc tctccgggcc 300gccgccgccg ccgccgctca
cagactggtc tcagcgccgc tgggcaagtt cccggcttgg 360accaaccggc cgtttccagg
cccaccgccc ggcccccgcc cgcacccgct ctccctgctg 420ggctctgccc ctccgcacct
gctgggactt cccggagccg cgggccaccc ggctgccgcc 480gccgccttcg ctcggccagc
ggagcccgaa ggcggaacag atcgctgtag tgccttggaa 540gtggagaaaa agttactcaa
gacagctttc catcccgtgc ccaggcggcc ccagaaccat 600ctggacgccg ccctggtctt
atcggctctc tcctcatcct agttctttaa aaaaaaacaa 660aaaaacaaaa aaaacttttt
ttaatcgttg taataattgt ataaaaaaaa tcgctctgta 720tagttacaac ttgtaagcat
gtccgtgtat aaatacctaa aagcaaaact aaacaaagaa 780agtaagaaaa agaaataaaa
ccagtcctcc tcagccctcc ccaagtcgct tctgtggcac 840cccgcattcg ctgtgaggtt
tgtttgtccg gttgattttg gggggtggag tttcagtgag 900aataaacgtg tctgcctttg
tgtgtgtgta tatatacaga gaaatgtaca tatgtgtgaa 960ccaaattgta cgagaaagta
tctatttttg gctaaataaa tgagctgctg ccactttgac 1020tataaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa a 1061
13 577 DNA Homo sapiens
13 tatgagcacc ttcacatgga tccacttgag gaaagaaggt ggaccgaatt tgtaaacggt
60gtgcagcaat atatatcaat tcgttctgag ataatcgcca cttacgctct ctgtggtttt
120gccaatatcg ggtccctagg aatcgtgatc ggcggactca catccatggc tccttccaga
180aagcgtgata tcgcctcggg ggcagtgaga gctctgattg cggggaccgt ggcctgcttc
240atgacagcct gcatcgcagg catactctcc agcactcctg tggacatcaa ctgccatcac
300gttttagaga atgccttcaa ctccactttc cctggaaccc caaccaaggg tgatagcttg
360ttgccaaagt ctgttgagca gccctgttgc ccagggtcct ggtgaagtca tcccaggagg
420aaaccccagt ctgtattctt tgaagggctg ctgcacattg ttgaatccat cgacctttag
480ctgcaatggg atctctaata cattttgagg tcagccactt ctccagtgga actctgaagt
540acagatgctg aattttctgc tttggaaaga aaaaaaa
577
14 456 DNA Homo sapiens
14 actcggcatg tgatgaacac ccatagttaa gaaaccatgg
agcaagaaag cttgtggaaa 60gtctctctcc ttcctcataa gacatgcaca ctaatacaca
tacacaccaa aaaattacac 120attttaaaac tgctaagctt ggatttaact gaatcatata
tcttttatca tgttatccta 180aaagtgagaa gacataacca agacatggaa ataaatgtga
aagctggagc cgaagagtca 240aagagctaaa aaattaagtc tagaacattc tatgaggata
gtataaataa aaagaaatac 300agtctagaca tgctgcaagg aaagaagatt ctaaagtccg
tttatggagg caattccata 360tcctttcttg aacgcacatt cagcttaccc cagagagcaa
gtgaggcaat ctggcaaaag 420attaataaag atgtaaaccc ctggaaaaaa aaaaaa
456
15 3628 DNA Homo sapiens
15 gaattcggca cgagatagtt
ttcaggttaa gaaagccaga atctttgttc agccacactg 60actgaacaga cttttagtgg
ggttacctgg ctaacagcag cagcggcaac ggcagcagca 120gcagcagcag cagcagcagc
agcagcaggg ctcctgggat aactcaggca tagttcaaca 180ctatgggtcc tcctctgaag
ctcttcaaaa accagaaata ccaggaactg aagcaggaat 240gcatcaaaga cagcagactt
ttctgtgatc caacatttct gcctgagaat gattctcttt 300tctacttccg actgcttcct
ggaaaggtgg tgtggaaacg tccccaggac atctgtgatg 360acccccatct gattgtgggc
aacattagca accaccagct gacccaaggg agactggggc 420acaagccaat ggtttctgca
ttttcctgtt tggctgttca ggagtctcat tggacaaaga 480caattcccaa ccataaggaa
caggaatggg accctcaaaa aacagaaaaa tacgctggga 540tatttcactt tcgtttctgg
cattttggag aatggactga agtggtgatt gatgacttgt 600tgcccaccat taacggagat
ctggtcttct ctttctccac ttccatgaat gagttttgga 660atgctctgct ggaaaaagct
tatgcaaagc tgctaggctg ttatgaggcc ctggatggtt 720tgaccatcac tgatattatt
gtggacttca cgggcacatt ggctgaaact gttgacatgc 780agaaaggaag atacactgag
cttgttgagg agaagtacaa gctattcgga gaactgtaca 840aaacatttac caaaggtggt
ctgatctgct gttccattga gtctcccaat caggaggagc 900aagaagttga aactgattgg
ggtctgctga agggccatac ctataccatg actgatattc 960gcaaaattcg tcttggagag
agacttgtgg aagtcttcag tgctgagaag gtgtatatgg 1020ttcgcctgag aaaccccttg
ggaagacagg aatggagtgg cccctggagt gaaatttctg 1080aagagtggca gcaactgact
gcatcagatc gcaagaacct ggggcttgtt atgtctgatg 1140atggagagtt ttggatgagc
ttggaggact tttgccgcaa ctttcacaaa ctgaatgtct 1200gccgcaatgt gaacaaccct
atttttggcc gaaaggagct ggaatcggtg ttgggatgct 1260ggactgtgga tgatgatccc
ctgatgaacc gctcaggagg ctgctataac aaccgtgata 1320ccttcctgca gaatccccag
tacatcttca ctgtgcctga ggatgggcac aaggtcatta 1380tgtcactgca gcagaaggac
ctgcgcactt accgccgaat gggaagacct gacaattaca 1440tcattggctt tgagctcttc
aaggtggaga tgaaccgcaa attccgcctc caccacctct 1500acatccagga gcgtgctggg
acttccacct atattgacac ccgcacagtg tttctgagca 1560agtacctgaa gaagggcaac
tatgtgcttg tcccaaccat gttccagcat ggtcgcacca 1620gcgagtttct cctgagaatc
ttctctgaag tgcctgtcca gctcagggaa ctgactctgg 1680acatgcccaa aatgtcctgc
tggaacctgg ctcgtggcta cccgaaagta gttactcaga 1740tcactgttca cagtgctgag
gacctggaga agaagtatgc caatgaaact gtaaacccat 1800atttggtcat caaatgtgga
aaggaggaag tccgttctcc tgtccagaag aatacagttc 1860atgccatttt tgacacccag
gccattttct acagaaggac cactgacatt cctattatag 1920tacaggtctg gaacagccga
aaattctgtg atcagttctt ggggcaggtt actctggatg 1980ctgaccccag cgactgccgt
gatctgaagt ctctgtacct gcgtaagaag ggtggtccaa 2040ctgccaaagt caagcaaggc
cacatcagct tcaaggttat ttccagcgat gatctcactg 2100agctctaaat ctgcaatccc
agagaatcct gacaaagcgt gccacccttt tattttccgt 2160caggtgccag gtcttagtta
agattcacaa tctttagaaa gaatgagatt cacaataatt 2220aactcttcct ctcttctgat
aaattcccca tacctcccaa tccaagtagc atctgtagct 2280acataaccta tatacctcca
gcagctggac atggggagcg acagtcctat ctagacatca 2340tacacatttg ccaagaaagg
atctctgggg cttccggggg tgagattcaa gcaggacaat 2400aacaagaggc tggacaccct
acagatgtct ttgatgtttt cagttgtttg atatatctcc 2460cctgtagggc atgttgagga
aggaggaggg ctgatcaagg ccaagctggt ctagcctgac 2520atcctagctc ctgactgaac
actatagact tcccagcagc attttcaccc agcagccaga 2580gccggcttta agtccccaac
ccttacagac accactgcca ccaccaccaa ccacgaccac 2640caccaccacc accactcacc
accatcatca cctccggaaa gtgtagtcct gccctaaccc 2700taaccccaag tcacccccca
cagtaaattt taccttcatg ttgagaaagc ttcctggtgc 2760ttaatcaaga gctggagttc
aatgagtcct agacagtgag aggggcctga gcttcagctc 2820aatggaagcc tgctgtgtgc
tcacaagacg gaaaagtgga agaagctgca gtgggagaca 2880aagcctcggt cccccaccca
tccacacaca cctacactca cacacgcgca catgggcgcg 2940caacggaact accatttcag
gcagtcagtg ggcaagagga aagataagta agtaccatac 3000acaccttaaa agatgaggag
aattcatcca gacatattac agccagtttg gggcccctga 3060cttgcaatgt gaaacctctt
cgcttgctgc taggtttaca aacaagccca ttgttcctgt 3120gcctcctaat attcatttgt
tactgaagga ccccatctgg ggacttgaga ctttggtccc 3180agcccagacg cctcagactg
gtctcaaagt caagcaaggc ttcacatcag ctgcaagtgt 3240tagtttgcca gcgcatgatc
tcactgagct tctacagaat ctgcaatccc agagtcaatc 3300atgacgaaat gtacgtccca
ccatcttaac ctatcaactt tctgcccctc cttcaaggcc 3360cagtataaat gccacctcct
ccatgaagcc ttccctaatt ccaccccaaa cccccacctt 3420caacaatatt tcaacgcttc
tgcaatgatg aaaaagaaac atagttgtag tacttagcct 3480acctagacca gcaagcattc
atttttagct cgctcatttt ttaccatgtt ttccagtctg 3540tttaacttct gcagtgcctt
cactacactg ccttacataa accaaatcac aataaagttc 3600atattcagta caattaaaaa
aaaaaaaa 3628
16 1580 DNA Homo sapiens
16 tatgcaagtg tttaacagat gcttcactat taaaatattt tccccccaag tctcaaatat
60tgaagaatct ctaaccaggg acaccagtcc ctacgaagac cttgggcgat tttgaagtgc
120gggcacctcg attccccgaa tctgtagtgt ggctggtatc ggtgttcccc tggtttaact
180agcctgtttg aaggcacaga tcattcatgg ggaagtataa ccgaatccag tcctctccac
240cgcctgggga tcttcacttt cgcagtctac gactgcctgt gactccagaa agacaaactg
300cagattggcc aagatgggga aattgaggca gagaagccaa gacatgtgct aaaggtcatg
360caggctatga atggagctgg aatgtgaacg caggccatat gaccccagag cccatgttct
420tgaaccctta gaaagacagc agcaacacac ctggtgcagc agctgcttag ttggagtggc
480tgacaaggag agaatgattt ccaggaagag cggaacacat atggaaggcc ttagcttatc
540tttagcgcct catacacccg ttctggactt cagaaaggcc agtgagtggg attaggcctc
600agagatagga tgtcagtccc agtgagggat ggcctagagc attctttaat tctttccttt
660gggtcacaca taagaaacaa ttttccagca ctgatgagtg ttattaacaa tgagatggga
720tagaatttag ttttccctat ggctgtgctt caaaaataga aaagctgtct tttctctgga
780atgattgaat gaagctctgg ggaggaaaag gtggattggc agatctctta aaggaagctt
840ctccttctag gcactattct aaggcttaat attttaactc cctatattaa cctagttcaa
900ctaaacagtg atctgagtaa ttttattttt attaaagctc agatcaaaat gccattaaca
960ttgattgaga aaatcaaagg aatctttgat gtgagtggtt aaattgctga attatttcag
1020tcccataccc tcacagcatg agtacctgat ctgatagact tctttggaat tccttttttg
1080tttgagacag agtcttgctc tgtcgcccag gctggagtgc agcggtgtga tctcaaccat
1140tgcaacctcc acctcccagg ttcaggtgat tctcatgcct cagcctcctg agtagctggg
1200attacagatg tgcaccacca tgcccggcta attattttgt atctttagta gagatgaagt
1260tttgccatgt gggccaggct gttctcaaac tactggcctc aagtgatctg cccgcctcgg
1320cctcccagac tgctgggatt acaggcgtga ggcaccgtgc ctggctggga ttccataata
1380aatccctctg tgtctatttc ttttttcaaa tataattttc ttcatttcca aacatcatct
1440ttaagactcc aaggattttt ccaggcacag tggctcatac ctgtaatccc attgcttgga
1500gaggccaagg tggaagttca tttgaggcca ggagttcgag accaggtggg caacatagtg
1560aaaccttgtc tctacaacat
1580
17 809 DNA Homo sapiens
17 tgtttatata actgtgttcg tttttgttgt tccgtcccgt
cgtccttgta gactctcatc 60ctcgtgtgtt ttggaccctc caggggtgac atcgggtctt
gtgttcagct ctcctggact 120gttattcctt gtccgcgtgt tcgtgttaga cattgtccac
gatctgtatc atgcctatgt 180ctcactttgg tctcttattt cagcgtgaac actatagttc
caagtttgtt cggataattc 240tgattcttgt caccagcgtg agatttcaac agaacttgtt
tggaacaaat actcacttaa 300aacttcagca gaagaaaaat tacttagtcc ttaggccaac
caatttaact gcagtgtcat 360gtttcacagg ccttcctaca tttagaaatc gtcacacagc
tgtgataaga gtagattatt 420ttactatgaa ataattctga atagatgaaa gcataaaatg
tgagaaactg aatgtattat 480tcaggaagaa tactgagtgc cttcatttaa ctaaagttga
atgtaaaagt caatttgcac 540ttctttataa tcctctggtt tagaattata aattgttaaa
accttgataa ttgtcattta 600attatatttc aggtgtcctg aacaggtcac tagactctac
attgggcagc ctttaaatat 660gattctttgt aatgctaaat agcctttttt tctcttttta
ctgcaactta atatttctat 720ttagaacaca gaaaatgaaa atatttagaa taagttgtac
atttgatgac aaataaatca 780ctattaaaat aaaaaaaaaa aaaaaaaaa
809
18 488 DNA Homo sapiens
misc_feature(95)..(95) a or g or c or t/u
18 aggaacccct gtgggaaagg tttaaaccta aaacagtgcc ccctttggct
cctcctccct 60tggcggaatg ggttcctgga ccatgtgcat ttcantgggc catgggattt
acatttcctt 120gcatccccag gtggtttgat ccctgccagg gccccttcct tcctgctcat
ggttttcagg 180gggcctgatc atggaaagta agggggttgg gccttccctt ttgggggtga
accctgactc 240catcccccta ttgcccccct aaccaatcat gcaaactttt ccccccctgg
ggtaattcac 300cagttaaaaa aagctttttt taaatgtttt gttttggggg gggggcaggg
cccccttttt 360gtttttttaa ggagttggtt ttggtttttg gctgatgttt tgttttttaa
catgccccca 420gtttgtaagg ccaaaggtaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa 480aaaaaaaa
488
19 463 DNA Homo sapiens
19 taagctttaa aggctctgtg ttagggcata
gtctagaaac atggggccca agggcaccgg 60gaaaacttac aaagggaaga gatggaactg
ggagggttca agctaccagt tccatctctc 120catgttttag agaattgggg cactaagtca
gccaggtaag gtcaggtcag aggagggccc 180ggatgaagca tgagatgcag agggacagtg
cgtgaatgga gaccttgggt agcaccaacg 240tgtagcggca gaggtggggt ggatgtggct
gatgtcaggg agagaatggg gagcatgcac 300agggctcagt cttatacata cattgaaaat
cctttagcct ttcaaagatt attaacccaa 360atcacctttc ttgcttactc cagatgcctc
agcctctgat ataattgcta agtatctgcc 420gtgttaaaaa taaacatttg agaatcaaaa
aaaaaaaaaa aaa 463
20 2148 DNA Homo sapiens
20 gcttcagggt
acagctcccc cgcagccaga agccgggcct gcagcgcctc agcaccgctc 60cgggacaccc
cacccgcttc ccaggcgtga cctgtcaaca gcaacttcgc ggtgtggtga 120actctctgag
gaaaaaccat tttgattatt actctcagac gtgcgtggca acaagtgact 180gagacctaga
aatccaagcg ttggaggtcc tgaggccagc ctaagtcgct tcaaaatgga 240acgaaggcgt
ttgtggggtt ccattcagag ccgatacatc agcatgagtg tgtggacaag 300cccacggaga
cttgtggagc tggcagggca gagcctgctg aaggatgagg ccctggccat 360tgccgccctg
gagttgctgc ccagggagct cttcccgcca ctcttcatgg cagcctttga 420cgggagacac
agccagaccc tgaaggcaat ggtgcaggcc tggcccttca cctgcctccc 480tctgggagtg
ctgatgaagg gacaacatct tcacctggag accttcaaag ctgtgcttga 540tggacttgat
gtgctccttg cccaggaggt tcgccccagg aggtggaaac ttcaagtgct 600ggatttacgg
aagaactctc atcaggactt ctggactgta tggtctggaa acagggccag 660tctgtactca
tttccagagc cagaagcagc tcagcccatg acaaagaagc gaaaagtaga 720tggtttgagc
acagaggcag agcagccctt cattccagta gaggtgctcg tagacctgtt 780cctcaaggaa
ggtgcctgtg atgaattgtt ctcctacctc attgagaaag tgaagcgaaa 840gaaaaatgta
ctacgcctgt gctgtaagaa gctgaagatt tttgcaatgc ccatgcagga 900tatcaagatg
atcctgaaaa tggtgcagct ggactctatt gaagatttgg aagtgacttg 960tacctggaag
ctacccacct tggcgaaatt ttctccttac ctgggccaga tgattaatct 1020gcgtagactc
ctcctctccc acatccatgc atcttcctac atttccccgg agaaggaaga 1080gcagtatatc
gcccagttca cctctcagtt cctcagtctg cagtgcctgc aggctctcta 1140tgtggactct
ttatttttcc ttagaggccg cctggatcag ttgctcaggc acgtgatgaa 1200ccccttggaa
accctctcaa taactaactg ccggctttcg gaaggggatg tgatgcatct 1260gtcccagagt
cccagcgtca gtcagctaag tgtcctgagt ctaagtgggg tcatgctgac 1320cgatgtaagt
cccgagcccc tccaagctct gctggagaga gcctctgcca ccctccagga 1380cctggtcttt
gatgagtgtg ggatcacgga tgatcagctc cttgccctcc tgccttccct 1440gagccactgc
tcccagctta caaccttaag cttctacggg aattccatct ccatatctgc 1500cttgcagagt
ctcctgcagc acctcatcgg gctgagcaat ctgacccacg tgctgtatcc 1560tgtccccctg
gagagttatg aggacatcca tggtaccctc cacctggaga ggcttgccta 1620tctgcatgcc
aggctcaggg agttgctgtg tgagttgggg cggcccagca tggtctggct 1680tagtgccaac
ccctgtcctc actgtgggga cagaaccttc tatgacccgg agcccatcct 1740gtgcccctgt
ttcatgccta actagctggg tgcacatatc aaatgcttca ttctgcatac 1800ttggacacta
aagccaggat gtgcatgcat cttgaagcaa caaagcagcc acagtttcag 1860acaaatgttc
agtgtgagtg aggaaaacat gttcagtgag gaaaaaacat tcagacaaat 1920gttcagtgag
gaaaaaaagg ggaagttggg gataggcaga tgttgacttg aggagttaat 1980gtgatctttg
gggagataca tcttatagag ttagaaatag aatctgaatt tctaaaggga 2040gattctggct
tgggaagtac atgtaggagt taatccctgt gtagactgtt gtaaagaaac 2100tgttgaaaat
aaagagaagc aatgtgaagc aaaaaaaaaa aaaaaaaa 2148
21 707 DNA Homo sapiens
misc_feature(17)..(17) a or g or c or t/u
misc_feature(83)..(83) a or g or c or t/u
21 aacacagccc taccaancaa tgatgaccag tggaaaacaa tgaagtcacc
aaaccctgga 60cagggctcat gctccaggac aanttgctgt ggcgtaaatg gtccatcaga
ctggcaaaaa 120tacacatctg ccttccggac tgagaataat gatgctgact atccctggcc
tcgtcaatgc 180tgtgttatga acaatcttaa agaacctctc aacctggagg cttgtaaact
aggcgtgcct 240ggtttttatc acaatcaggg ctgctatgaa ctgatctctg gtccaatgaa
ccgacacgcc 300tggggggttg cctggtttgg atttgccatt ctctgctgga ctttttgggt
tctcctgggt 360accatgttct actggagcag aattgaatat taagcataaa gtgttgccac
catacctcct 420tccccgagtg actctggatt tggtgctgga accagctctc tcctaatatt
ccacgtttgt 480gccccacact aacgtgtgtg tcttacattg ccaagtcaga tggtacggac
ttcctttagg 540atctcaggct tctgcagttc tcatgactcc tacttttcat cctagtctag
cattctgcaa 600catttatata gactgttgaa aggagaattt gaaaaatgca taataactac
ttccatccct 660gcttattttt aatttgggaa aataaataca ttcgaaggaa aaaaaaa
707
22 2832 DNA Homo sapiens
22 ggcacgaggg cgaaattgag gtttcttggt
attgcgcgtt tctcttcctt gctgactctc 60cgaatggcca tggactcgtc gcttcaggcc
cgcctgtttc ccggtctcgc tatcaagatc 120caacgcagta atggtttaat tcacagtgcc
aatgtaagga ctgtgaactt ggagaaatcc 180tgtgtttcag tggaatgggc agaaggaggt
gccacaaagg gcaaagagat tgattttgat 240gatgtggctg caataaaccc agaactctta
cagcttcttc ccttacatcc gaaggacaat 300ctgcccttgc aggaaaatgt aacaatccag
aaacaaaaac ggagatccgt caactccaaa 360attcctgctc caaaagaaag tcttcgaagc
cgctccactc gcatgtccac tgtctcagag 420cttcgcatca cggctcagga gaatgacatg
gaggtggagc tgcctgcagc tgcaaactcc 480cgcaagcagt tttcagttcc tcctgccccc
actaggcctt cctgccctgc agtggctgaa 540ataccattga ggatggtcag cgaggagatg
gaagagcaag tccattccat ccgaggcagc 600tcttctgcaa accctgtgaa ctcagttcgg
aggaaatcat gtcttgtgaa ggaagtggaa 660aaaatgaaga acaagcgaga agagaagaag
gcccagaact ctgaaatgag aatgaagaga 720gctcaggagt atgacagtag ttttccaaac
tgggaatttg cccgaatgat taaagaattt 780cgggctactt tggaatgtca tccacttact
atgactgatc ctatcgaaga gcacagaata 840tgtgtctgtg ttaggaaacg cccactgaat
aagcaagaat tggccaagaa agaaattgat 900gtgatttcca ttcctagcaa gtgtctcctc
ttggtacatg aacccaagtt gaaagtggac 960ttaacaaagt atctggagaa ccaagcattc
tgctttgact ttgcatttga tgaaacagct 1020tcgaatgaag ttgtctacag gttcacagca
aggccactgg tacagacaat ctttgaaggt 1080ggaaaagcaa cttgttttgc atatggccag
acaggaagtg gcaagacaca tactatgggc 1140ggagacctct ctgggaaagc ccagaatgca
tccaaaggga tctatgccat ggcctcccgg 1200gacgtcttcc tcctgaagaa tcaaccctgc
taccggaagt tgggcctgga agtctatgtg 1260acattcttcg agatctacaa tgggaagctg
tttgacctgc tcaacaagaa ggccaagctg 1320cgcgtgctgg aggacggcaa gcaacaggtg
caagtggtgg ggctgcagga gcatctggtt 1380aactctgctg atgatgtcat caagatgatc
gacatgggca gcgcctgcag aacctctggg 1440cagacatttg ccaactccaa ttcctcccgc
tcccacgcgt gcttccaaat tattcttcga 1500gctaaaggga gaatgcatgg caagttctct
ttggtagatc tggcagggaa tgagcgaggc 1560gcggacactt ccagtgctga ccggcagacc
cgcatggagg gcgcagaaat caacaagagt 1620ctcttagccc tgaaggagtg catcagggcc
ctgggacaga acaaggctca caccccgttc 1680cgtgagagca agctgacaca ggtgctgagg
gactccttca ttggggagaa ctctaggact 1740tgcatgattg ccacgatctc accaggcata
agctcctgtg aatatacttt aaacaccctg 1800agatatgcag acagggtcaa ggagctgagc
ccccacagtg ggcccagtgg agagcagttg 1860attcaaatgg aaacagaaga gatggaagcc
tgctctaacg gggcgctgat tccaggcaat 1920ttatccaagg aagaggagga actgtcttcc
cagatgtcca gctttaacga agccatgact 1980cagatcaggg agctggagga gaaggctatg
gaagagctca aggagatcat acagcaagga 2040ccagactggc ttgagctctc tgagatgacc
gagcagccag actatgacct ggagaccttt 2100gtgaacaaag cggaatctgc tctggcccag
caagccaagc atttctcagc cctgccagat 2160gtcatcaagg ccttgcgcct ggccatgcag
ctggaagagc aggctagcag acaaataagc 2220agcaagaaac ggccccagtg acgactgcaa
ataaaaatct gtttggtttg acacccagcc 2280tcttccctgg ccctccccag agaactttgg
gtacctggtg ggtctaggca gggtctgagc 2340tgggacaggt tctggtaaat gccaagtatg
ggggcatctg ggcccagggc agctggggag 2400ggggtcagag tgacatggga cactcctttt
ctgttcctca gttgtcgccc tcacgagagg 2460aaggagctct tagttaccct tttgtgttgc
ccttctttcc atcaagggga atgttctcag 2520catagagctt tctccgcagc atcctgcctg
cgtggactgg ctgctaatgg agagctccct 2580ggggttgtcc tggctctggg gagagagacg
gagcctttag tacagctatc tgctggctct 2640aaaccttcta cgcctttggg ccgagcactg
aatgtcttgt actttaaaaa aatgtttctg 2700agacctcttt ctactttact gtctccctag
agatcctaga ggatccctac tgttttctgt 2760tttatgtgtt tatacattgt atgtaacaat
aaagagaaaa aataaaaaaa aaaaaaaaaa 2820aaaaaaaaaa aa
2832
23 670 DNA Homo sapiens
misc_feature(14)..(14) a or g or c or t/u
misc_feature(19)..(19) a or g or c or t/u
misc_feature(37)..(37) a or g or c or t/u
misc_feature(113)..(113) a or g or c or t/u
23 atcggacttc ggtnaactnt
ggcaaggatt ggacagncta ggtaggctaa atgtgtgctc 60tgtccctgtt tgcttcaaca
gaggagcaag cctcagctga gaaggagggc acntggaaca 120cctagctcct cccgtgattc
cccaaaccca taacattctt ccatagggct ggaaccagtg 180ccccgtcctg acagggatga
aaagtgaacc cctcaggtca ggagaggcca gagttgaggt 240tctgccactt cctgtccctg
gggagccact caagttacca gggctaccgg ctgaaataaa 300tcttttccgg gtagggtcaa
gggcagtgtg ttccaaggca actgatgtag gccagttgcg 360tgactccagg tttgtcctgg
tactcagtgg gtccaatcac ctggcattga tcacctggca 420ttgatcagca cccaccccac
ccctgaggct tgcccagccc ccaggccctc agatccctgc 480tcttcctgcc tttcctgccc
atgtgtcacc cagcacccaa ggttcagtga cacagggtgg 540tttggagctg gtcactgtca
tagcagctgt gatttcacaa ggaagggtgc tgcaggggga 600cctggttgat ggggagtggg
aaggggaagg aataaagaga tcttcctcag gtaaaaaaaa 660aaaaaaaaaa
670
24 964 DNA Homo sapiens
24 acctcgtttg ctcccagtta cttcttatct ggagcagtaa tgtagtccac ttcactcatg
60cctaccccgc gtgtctcgtc tcctgacatg tctcacagac gctcctgaag ttaggtcatt
120acctaaccca tagttattta ccttgaaaga tgggtctccg cacttggaaa ggtttcaaga
180cttgatactg caataaatta tggctcttca cctgggcgcc aactgctgat caacgaaatg
240cttgttgaat caggggcaaa cggagtacag acgtctcaag actgaaacgg ccccattgcc
300tggtctagta gcggatctca ctcagccgca gacaagtaat cactaacccg ttttattcta
360ttcctatctg tggatgtgta aatggctggg gggccagccc tggataggtt tttatgggaa
420ttctttacaa taaacatagc ttgtaacttg agatctacaa atccattcat cctgattggg
480catgaaatcc atggtcaaga ggacaagtgg aaagtgagag ggaaggtttg ctagacacct
540tcgcttgtta tcttgtcaag atagaaaaga tagtatcatt tcacccttgc cagtaaaaac
600ctttccatcc acccattctc agcagactcc agtattggca cagtcactca ctgccattct
660cacactataa caagaaaaga aatgaagtgc ataagtctcc tgggaaaaga accttaaccc
720cttctcgtgc catgactggt gatttcatga ctcataagcc cctccgtagg catcattcaa
780gatcaatggc ccatgcatgc tgtttgcagc agtcaattga gttgaattag aattccaacc
840atacatttta aaggtatttg tgctgtgtgt atattttgat aaaatgttgt gacttcatgg
900caaacaggtg gatgtgtaaa aatggaataa aaaaaaaaaa agagtcaaaa aaaaaaaaaa
960aatt
964
25 1568 DNA Homo sapiens
25 ggcgcccaag ccgccgccgc cagatcggtg ccgattcctg
ccctgccccg accgccagcg 60cgaccatgtc ccatcactgg gggtacggca aacacaacgg
acctgagcac tggcataagg 120acttccccat tgccaaggga gagcgccagt cccctgttga
catcgacact catacagcca 180agtatgaccc ttccctgaag cccctgtctg tttcctatga
tcaagcaact tccctgagga 240tcctcaacaa tggtcatgct ttcaacgtgg agtttgatga
ctctcaggac aaagcagtgc 300tcaagggagg acccctggat ggcacttaca gattgattca
gtttcacttt cactggggtt 360cacttgatgg acaaggttca gagcatactg tggataaaaa
gaaatatgct gcagaacttc 420acttggttca ctggaacacc aaatatgggg attttgggaa
agctgtgcag caacctgatg 480gactggccgt tctaggtatt tttttgaagg ttggcagcgc
taaaccgggc cttcagaaag 540ttgttgatgt gctggattcc attaaaacaa agggcaagag
tgctgacttc acaaactttg 600cagctcgtgg cctccttcct gaatccctgg attactggac
ctacccaggc tcactgacca 660cccctcctct tctggaatgt gtgacctgga ttgtgctcaa
ggaacccatc agcgtcagca 720gcgagcaggt gttgaaattc cgtaaactta acttcaatgg
ggagggtgaa cccgaagaac 780tgatggtgga caactggcgc ccagctcagc cactgaagaa
caggcaaatc aaagcttcct 840tcaaataaga tggtcccata gtctgtatcc aaataatgaa
tcttcgggtg tttcccttta 900gctaagcaca gatctacctt ggtgatttgg accctggttg
ctttgtgtct agttttctag 960acccttcatc tcttacttga tagacttact aataaaatgt
gaagactaga ccaattgtca 1020tgcttgacac aactgctgtg gctggttggt gctttgttta
tggtagtagt ttttctgtaa 1080cacagaatat aggataagaa ataagaataa agtaccttga
ctttgttcac agcatgtagg 1140gtgatgagca ctcacaattg ttgactaaaa tgctgccttt
aaaacatagg aaagtagaat 1200ggttgagtgc aaatccatag cacaagataa attgagctag
ttaaggcaaa tcaggtaaaa 1260tagtcatgat tctatgtaat gtaaaccaga aaaaataaat
gttcatgatt tcaagatgtt 1320atattaaaga aaaactttaa aaattattat atatttatag
caaagttatc ttaaatatga 1380attctgttgt aatttaatga cttttgaatt acagagatat
aaatgaagta ttatctgtaa 1440aaattgttat aattagagtt gtgatacaga gtatatttcc
attcagacaa tatatcataa 1500cttaataaat attgtatttt agatatattc tctaataaaa
ttcagaattc taaaaaaaaa 1560aaaaaaaa
1568
26 1964 DNA Homo sapiens
26 ggcacgaggc atggaggcgc
tgctgctggg cgcggggttg ctgctgggcg cttacgtgct 60tgtctactac aacctggtga
aggccccgcc gtgcggcggc atgggcaacc tgcggggccg 120cacggccgtg gtcacgggtg
agtgcggagg cgggtgagtg cgagctggcg gggcgcgcgg 180agaggaggcc gggccggcgg
tagcagcggc ccgccgggct cagctcagct cggctcccgc 240ccgcggtccg caggcgccaa
cagcggcatc ggaaagatga cggcgctgga gctggcgcgc 300cggggagcgc gcgtggtgct
ggcctgccgc agccaggagc gcggggaggc ggctgccttc 360gacctccgcc aggagagtgg
gaacaatgag gtcatcttca tggccttgga cttggccagt 420ctggcctcgg tgcgggcctt
tgccactgcc tttctgagct ctgagccacg gttggacatc 480ctcatccaca atgccggtat
cagttcctgt ggccggaccc gtgaggcgtt taacctgctg 540cttcgggtga accatatcgg
tccctttctg ctgacacatc tgctgctgcc ttgcctgaag 600gcatgtgccc ctagccgcgt
ggtggtggta gcctcagctg cccactgtcg gggacgtctt 660gacttcaaac gcctggaccg
cccagtggtg ggctggcggc aggagctgcg ggcatatgct 720gacactaagc tggctaatgt
actgtttgcc cgggagctcg ccaaccagct tgaggccact 780ggcgtcacct gctatgcagc
ccacccaggg cctgtgaact cggagctgtt cctgcgccat 840gttcctggat ggctgcgccc
acttttgcgc ccattggctt ggctggtgct ccgggcacca 900agagggggtg cccagacacc
cctgtattgt gctctacaag agggcatcga gcccctcagt 960gggagatatt ttgccaactg
ccatgtggaa gaggtgcctc cagctgcccg agacgaccgg 1020gcagcccatc ggctatggga
ggccagcaag aggctggcag ggcttgggcc tggggaggat 1080gctgaacccg atgaagaccc
ccagtctgag gactcagagg ccccatcttc tctaagcacc 1140ccccaccctg aggagcccac
agtttctcaa ccttacccca gccctcagag ctcaccagat 1200ttgtctaaga tgacgcaccg
aattcaggct aaagttgagc ctgagatcca gctctcctaa 1260ccctcaggcc aggatgcttg
ccatggcact tcatggtcct tgaaaacctc ggatgtgtgc 1320gaggccatgc cctggacact
gacgggtttg tgatcttgac ctccgtggtt actttctggg 1380gccccaagct gtgccctgga
catctctttt cctggttgaa ggaataatgg gtgattattt 1440cttcctgaga gtgacagtaa
ccccagatgg agagataggg gtatgctaga cactgtgctt 1500ctcggaaatt tggatgtagt
attttcaggc cccaccctta ttgattctga tcagctctgg 1560agcagaggca gggagtttgc
aatgtgatgc actgccaaca ttgagaatta gtgaactgat 1620ccctttgcaa ccgtctagct
aggtagttaa attaccccca tgttaatgaa gcggaattag 1680gctcccgagc taagggactc
gcctagggtc tcacagtgag taggaggagg gcctgggatc 1740tgaacccaag ggtctgaggc
cagggccgac tgccgtaaga tgggtgctga gaagtgagtc 1800agggcagggc agctggtatc
gaggtgcccc atgggagtaa ggggacgcct tccgggcgga 1860tgcagggctg gggtcatctg
tatctgaagc ccctcggaat aaagcgcgtt gaccgccaaa 1920aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaa 1964
27 953 DNA Homo sapiens
27 agcggtggag aaaaggcaga accagagtag agattgacag tgagctgagc caatcaggct
60gtgaatctgc agcagtgatc ccaggtcctc caattaatac taagagagtg gaccagggcc
120cctgaggaag acagatggca gggacagcgc gccatgaccg agagatggcg atccaggcca
180agaaaaagct caccacggcc accaacccca ttgaaagact ccgactgcag tgcctggcca
240ggggctctgc tgggatcaaa ggacttggca gagtgtttag aattatggat gacgataata
300atcgaaccct tgattttaaa gaatttatga aagggttaaa tgattatgct gtggtcatgg
360aaaaagaaga ggtggaagaa cttttccgga ggtttgataa agatggaaat ggaacaatag
420acttcaatga atttcttctc acattaagac ctccaatgtc cagagccaga aaagaggtaa
480tcatgcaagc ttttagaaag ttagacaaga ctggagatgg tgttataaca atcgaagacc
540ttcgtgaagt atataatgca aaacaccacc caaagtacca gaatggggaa tggagtgagg
600aacaagtatt taggaaattt ctggataact ttgattcacc ctatgacaaa gatggattgg
660tgacccctga ggagttcatg aactactatg caggtgtgag cgcatccatt gacactgatg
720tgtacttcat catcatgatg agaaccgcct ggaagcttta agcacatgac ctggggacca
780ggccctggga cagccatgtg gctccaaatg actaaatgtc agctcaaaaa ccagaatcgt
840atttgatttc acactcatcc taatgttttt ttctgtgtca aaatattgca ttttctgggg
900ccaaaaaaca ggcagaaata aaagacattg agtagtcaaa aaaaaaaaaa aaa
953
28 850 DNA Homo sapiens
28 tagagcatta aaataactat caggcagaag aatctttctt
ctcgcctagg atttcagcca 60tgcgcgcgct ctctctcttt ctctctcttt tcctctctct
ccctctttct agcctggggc 120ttgaatttgc atgtctaatt catttactca ccatatttga
attggcctga acagatgtaa 180atcgggaagg atgggaaaaa ctgcagtcat caacaatgat
taatcagctg ttgcaggcag 240tgtcttaagg agactggtag gaggaggcat ggaaaccaaa
aggccgtgtg tttagaagcc 300taattgtcac atcaagcatc attgtcccca tgcaacaacc
accaccttat acatcacttc 360ctgttttaag cagctctaaa acatagactg aagatttatt
tttaatatgt tgactttatt 420tctgagcaaa gcatcggtca tgtgtgtatt ttttcatagt
cccaccttgg agcatttatg 480tagacattgt aaataaattt tgtgcaaaaa ggactggaaa
aatgaactgt attattgcaa 540tttttttttg taaaagtagc agtttggtat gagttggcat
gcatacaaga tttactaagt 600gggataagct aattatactt tttgttgtgg ataaacaaat
gcttgttgat agcctttttc 660tatcaagaaa ccaaggagct aattattaat aacaatcatt
gcacactgag tcttagcgtt 720tctgatggaa acagtttgga ttgtataata acgccaagcc
cagttgtagt cgtttgagtg 780cagtaatgaa atctgaatct aaaataaaaa caagattatt
tttgtcaaaa aaaaaaaaaa 840aaaaaaaaaa
850
29 4670 DNA Homo sapiens
29 gcggcgcgca cactgctcgc
tgggccgcgg ctcccgggtg tcccaggccc ggccggtgcg 60cagagcatgg cgggtgcggg
cccgaagcgg cgcgcgctag cggcgccggc ggccgaggag 120aaggaagagg cgcgggagaa
gatgctggcc gccaagagcg cggacggctc ggcgccggca 180ggcgagggcg agggcgtgac
cctgcagcgg aacatcacgc tgctcaacgg cgtggccatc 240atcgtgggga ccattatcgg
ctcgggcatc ttcgtgacgc ccacgggcgt gctcaaggag 300gcaggctcgc cggggctggc
gctggtggtg tgggccgcgt gcggcgtctt ctccatcgtg 360ggcgcgctct gctacgcgga
gctcggcacc accatctcca aatcgggcgg cgactacgcc 420tacatgctgg aggtctacgg
ctcgctgccc gccttcctca agctctggat cgagctgctc 480atcatccggc cttcatcgca
gtacatcgtg gccctggtct tcgccaccta cctgctcaag 540ccgctcttcc ccacctgccc
ggtgcccgag gaggcagcca agctcgtggc ctgcctctgc 600gtgctgctgc tcacggccgt
gaactgctac agcgtgaagg ccgccacccg ggtccaggat 660gcctttgccg ccgccaagct
cctggccctg gccctgatca tcctgctggg cttcgtccag 720atcgggaagg gtgatgtgtc
caatctagat cccaacttct catttgaagg caccaaactg 780gatgtgggga acattgtgct
ggcattatac agcggcctct ttgcctatgg aggatggaat 840tacttgaatt tcgtcacaga
ggaaatgatc aacccctaca gaaacctgcc cctggccatc 900atcatctccc tgcccatcgt
gacgctggtg tacgtgctga ccaacctggc ctacttcacc 960accctgtcca ccgagcagat
gctgtcgtcc gaggccgtgg ccgtggactt cgggaactat 1020cacctgggcg tcatgtcctg
gatcatcccc gtcttcgtgg gcctgtcctg cttcggctcc 1080gtcaatgggt ccctgttcac
atcctccagg ctcttcttcg tggggtcccg ggaaggccac 1140ctgccctcca tcctctccat
gatccaccca cagctcctca cccccgtgcc gtccctcgtg 1200ttcacgtgtg tgatgacgct
gctctacgcc ttctccaagg acatcttctc cgtcatcaac 1260ttcttcagct tcttcaactg
gctctgcgtg gccctggcca tcatcggcat gatctggctg 1320cgccacagaa agcctgagct
tgagcggccc atcaaggtga acctggccct gcctgtgttc 1380ttcatcctgg cctgcctctt
cctgatcgcc gtctccttct ggaagacacc cgtggagtgt 1440ggcatcggct tcaccatcat
cctcagcggg ctgcccgtct acttcttcgg ggtctggtgg 1500aaaaacaagc ccaagtggct
cctccagggc atcttctcca cgaccgtcct gtgtcagaag 1560ctcatgcagg tggtccccca
ggagacatag ccaggaggcc gagtggctgc cggaggagca 1620tgcgcagagg ccagttaaag
tagatcacct cctcgaaccc actccggttc cccgcaaccc 1680acagctcagc tgcccatccc
agtccctcgc cgtccctccc aggtcgggca gtggaggctg 1740ctgtgaaaac tctggtacga
atctcatccc tcaactgagg gccagggacc caggtgtgcc 1800tgtgctcctg cccaggagca
gcttttggtc tccttgggcc ctttttccct tccctccttt 1860gtttacttat atatatattt
tttttaaact taaattttgg gtcaacttga caccactaag 1920atgatttttt aaggagctgg
gggaaggcag gagccttcct ttctcctgcc ccaagggccc 1980agaccctggg caaacagagc
tactgagact tggaacctca ttgctacgac agacttgcac 2040tgaagccgga cagctgccca
gacacatggg cttgtgacat tcgtgaaaac caaccctgtg 2100ggcttatgtc tctgccttag
ggtttgcaga gtggaaactc agccgtaggg tggcactggg 2160agggggtggg ggatctgggc
aaggtgggtg attcctctca ggaggtgctt gaggccccga 2220tggactcctg accataatcc
tagccctgag acaccatcct gagccaggga acagccccag 2280ggttgggggg tgccggcatc
tcccctagct caccaggcct ggcctctggg cagtgtggcc 2340tcttggctat ttctgtgtcc
agttttggag gctgagttct ggttcatgca gacaaagccc 2400tgtccttcag tcttctagaa
acagagacaa gaaaggcaga cacaccgcgg ccaggcaccc 2460atgtgggcgc ccaccctggg
ctccacacag cagtgtcccc tgccccagag gtcgcagcta 2520ccctcagcct ccaatgcatt
ggcctctgta ccgcccggca gccccttctg gccggtgctg 2580ggttcccact cccggcctag
gcacctcccc gctctccctg tcacgctcat gtcctgtcct 2640ggtcctgatg cccgttgtct
aggagacaga gccaagcact gctcacgtct ctgccgcctg 2700cgtttggagg cccctgggct
ctcacccagt ccccacccgc ctgcagagag ggaactaggg 2760caccccttgt ttctgttgtt
cccgtgaatt tttttcgcta tgggaggcag ccgaggcctg 2820gccaatgcgg cccactttcc
tgagctgtcg ctgcctccat ggcagcagcc aaggaccccc 2880agaacaagaa gacccccccg
caggatccct cctgagctcg gggggctctg ccttctcagg 2940ccccgggctt cccttctccc
cagccagagg tggagccaag tggtccagcg tcactccagt 3000gctcagctgt ggctggagga
gctggcctgt ggcacagccc tgagtgtccc aagccgggag 3060ccaacgaagc cggacacggc
ttcactgacc agcggctgct caagccgcaa gctctcagca 3120agtgcccagc ggagcctgcc
gcccccacct gggcaccggg accccctcac catccagtgg 3180gcccggagaa acctgatgaa
cagtttgggg actcaggacc agatgtccgt ctctcttgct 3240tgaggaatga agacctttat
tcacccctgc cccgttgctt cccgctgcac atggacagac 3300ttcacagcgt ctgctcatag
gacctgcatc cttcctgggg acgaattcca ctcgtccaag 3360ggacagccca cggtctggag
gccgaggacc accagcaggc aggtggactg actgtgttgg 3420gcaagacctc ttccctctgg
gcctgttctc ttggctgcaa ataaggacag cagctggtgc 3480cccacctgcc tggtgcattg
ctgtgtgaat ccaggaggca gtggacatcg taggcagcca 3540cggccccggg tccaggagaa
gtgctccctg gaggcacgca ccactgcttc ccactggggc 3600cggcggggcc cacgcacgac
gtcagcctct taccttcccg cctcggctag gggtcctcgg 3660gatgccgttc tgttccaacc
tcctgctctg ggacgtggac atgcctcaag gatacaggga 3720gccggcggcc tctcgacggc
acgcacttgc ctgttggctg ctgcggctgt gggcgagcat 3780gggggctgcc agcgtctgtt
gtggaaagta gctgctagtg aaatggctgg ggccgctggg 3840gtccgtcttc acactgcgca
ggtctcttct gggcgtctga gctggggtgg gagctcctcc 3900gcagaaggtt ggtggggggt
ccagtctgtg atccttggtg ctgtgtgccc cactccagcc 3960tggggacccc acttcagaag
gtaggggccg tgtcccgcgg tgctgactga ggcctgcttc 4020cccctccccc tcctgctgtg
ctggaattcc acagggacca gggccaccgc aggggactgt 4080ctcagaagac ttgatttttc
cgtccctttt tctccacact ccactgacaa acgtccccag 4140cggtttccac ttgtgggctt
caggtgtttt caagcacaac ccaccacaac aagcaagtgc 4200attttcagtc gttgtgcttt
tttgttttgt gctaacgtct tactaattta aagatgctgt 4260cggcaccatg tttatttatt
tccagtggtc atgctcagcc ttgctgctct gcgtggcgca 4320ggtgccatgc ctgctccctg
tctgtgtccc agccacgcag ggccatccac tgtgacgtcg 4380gccgaccagg ctggacaccc
tctgccgagt aatgacgtgt gtggctggga ccttctttat 4440tctgtgttaa tggctaacct
gttacactgg gctgggttgg gtagggtgtt ctggcttttt 4500tgtggggttt ttatttttaa
agaaacactc aatcatccta aaaaaaaaaa aaaaaaaaaa 4560aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 4620aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 4670
30 176 DNA Homo sapiens
30 gggacttgga aaggggaact gggatttggg gaggggctgg aggacttccg cacgcttcca
60cctccttcga cctccactgc gccccacctc cctgcctgtg tgtgttattt caaaggaaaa
120gaacaaaagg aataaatttt ctaagctctt taaaaaaaaa aaaaaaaaaa aaaaaa
176
31 2255 DNA Homo sapiens
31 gcaggctctg cctgtggcca ctagcagaga agctgctgtc
cttccaccac cagcaccgga 60ccacctgctc caagaccagc ctcctggggg gaccaggcac
ccggccttca ctggcaccca 120gggagccgtc ctcagcagcg tcaacatgtc aaggcccagc
agcagagcca tttacttgca 180ccggaaggag tactcccaga acctcacctc agagcccacc
ctcctgcagc acagggtgga 240gcacttgatg acatgcaagc aggggagtca gagagtccag
gggcccgagg atgccttgca 300gaagctgttc gagatggatg cacagggccg ggtgtggagc
caagacttga tcctgcaggt 360cagggacggc tggctgcagc tgctggacat tgagaccaag
gaggagctgg actcttaccg 420cctagacagc atccaggcca tgaatgtggc gctcaacaca
tgttcctaca actccatcct 480gtccatcacc gtgcaggagc cgggcctgcc aggcactagc
actctgctct tccagtgcca 540ggaagtgggg gcagagcgac tgaagaccag cctgcagaag
gctctggagg aagagctgga 600gcaaagcaga cctcgacttg gaggccttca gccaggccag
gacagatgga gggggcctgc 660tatggaaagg ccgctcccta tggagcaggc acgctatctg
gagccgggga tccctccaga 720acagccccac cagaggaccc tagagcacag cctcccacca
tccccaaggc ccctgccacg 780ccacaccagt gcccgagaac caagtgcctt tactctgcct
cctccaaggc ggtcctcttc 840ccccgaggac ccagagaggg acgaggaagt gctgaaccat
gtcctaaggg acattgagct 900gttcatggga aagctggaga aggcccaggc aaagaccagc
aggaagaaga aatttgggaa 960aaaaaacaag gaccagggag gtctcaccca ggcacagtac
attgactgct tccagaagat 1020caagtacagc ttcaacctcc tgggaaggct ggccacctgg
ctgaaggaga caagtgcccc 1080tgagctcgta cacatcctct tcaagtccct gaacttcatc
ctggccaggt gccctgaggc 1140tggcctagca gcccaagtga tctcacccct cctcacccct
aaagctatca acctgctaca 1200gtcctgtcta agcccacctg agagtaacct ttggatgggg
ttgggcccag cctggaccac 1260tagccgggcc gactggacag gcgatgagcc cctgccctac
caacccacat tctcggatga 1320ctggcaactt ccagagccct ccagccaagc acccttagga
taccaggacc ctgtttccct 1380tcggcgggga agtcataggt tagggagcac ctcacacttt
cctcaggaga agacacacaa 1440ccatgaccct cagcctgggg accccaactc caggccctcc
agccccaaac ctgcccagcc 1500agccctgaaa atgcaagtct tgtacgagtt tgaagctagg
aacccacggg aactgactgt 1560ggtccaggga gagaagctgg aggttctgga ccacagcaag
cggtggtggc tggtgaagaa 1620tgaggcggga cggagcggct acattccaag caacatcctg
gagcccctac agccggggac 1680ccctgggacc cagggccagt caccctctcg ggttccaatg
cttcgactta gctcgaggcc 1740tgaagaggtc acagactggc tgcaggcaga gaacttctcc
actgccacgg tgaggacact 1800tgggtccctg acggggagcc agctacttcg cataagacct
ggggagctac agatgctatg 1860tccacaggag gccccacgaa tcctgtcccg gctggaggct
gtcagaagga tgctggggat 1920aagcccttag gcaccagctt agacacctcc aagaaccagg
ccccgctgat gcaagatggc 1980agatctgata cccattagag ccccgagaat tcctcttctg
gatcccagtt tgcagcaaac 2040cccacacccc agctcacaca gcaaaaacaa tggacaggcc
cagagggtga agcaaacagt 2100gtcccttctg gctgtgttgg agcctcccca gtaaccacct
atttatttta cctctttccc 2160aaacctggag catttatgcc taggcttgtc aagaatctgt
tcagtccctc tccttctcaa 2220taaaagcatc ttcaagcttg aaaaaaaaaa aaaaa
2255
32 945 DNA Homo sapiens
misc_feature (36)..(36) a or g or c or t/u
32 ccttccattg aattccacca gacacattca ggttancttc gtaatgtctt
catatgagta 60tcaatcaaca ccttccccaa ctcaattgta ctaggttgta gagcacaagg
atggtctcgt 120gctgctctgt ggcacctgtg cctacactgc tctgagcttt gaggaggctg
ctctctttgc 180tgaccccatg atcttttctg cccttctgtt aagggcattg gccacagcaa
cggggcaaat 240gccccaagct ggctgtaagt gacccatccc tttggctccc atgattagac
caaggagagg 300catggggtcc agctgagcca ttcagaacca ttccttagca ttttccactc
aaaggttaga 360gatgagattt tctcttccca aggctacctc tggccatggt tccagcttca
tgggggcaat 420gggattagga aaatgaggtc aacctgcaaa ggaaagcaga tgcaagagat
ggagacagaa 480tgggggtgtc ctggggatct tggagcctga attcattggc acaaaaggca
gcagcatcct 540cactgtatct gcagtccatt tggactcaat aaaaactttg aaagtcacat
gtgttatgga 600attccttctc agtgacacat tcatctgtgc tcagttgtcc cagcaagggt
cagcccctca 660tacccctgca gcatccgctg ctatgaagca gagctgtaaa cgccctccct
gtgtatagga 720aaagctacat ggagcaaatc ctcctgcctg aagaagtgca tctcagcatc
acttcagctg 780tcggggcatt tgtggggaga accagaccac ctctgcggaa ggcagcagac
cctcttccag 840ccatggatgg agttgaattc tctataaacg gttcaccagc aaaccaccaa
tacattccat 900tgtttgccta gagagaaatt taaaaataaa taaatgttca cttat
945
33 736 DNA Homo sapiens
33 tgcggccgcg gcatgaaagg cggcgaggag
aggcagcact gctgctcttg acttctgagc 60agggcttaga gagcctgccc cggcttaagc
cgagctgctg gtgctgaccc tgagcgccga 120gtccgcgagc tctgagtccg gagcctccca
gccgtggagc cgtgggatga ggggggcgtt 180gggggacagg gcaaagtcga tcttggttgt
acagccgccc gatcctagcg cggagctgcg 240agcctgaccg gccgcgtctg gcatggtcag
agaaagaatt ttcttttccc aactccggct 300tttggttttg tgtgtccacc ttgcgcaact
ccggagccag ccgaccccac atggattctc 360aacaggtggc cggcacatct tctgagcctc
gctctctcat ctgaaagtgg agtgtaagtc 420caagaagatt catttagaca aagaaggtgg
aaaaaaagga ctttctgggc cagcaagtcg 480gatgaccacc ctccaagggg cagaggaggg
cccattttgt gaagaagaaa tcaactaccc 540ggaaaacgcc acaggaggac atgtttctgc
agatgtagtt gccctagaaa cagaagagta 600tgggggtgtg aatgtcttct cttttggggg
caaacactat gtccttttct ttttctagat 660acagttaatt cctggaaatt ttagcgagtt
tgttcttgtg gatattttga acaataaaga 720gtgaaaatca aaaaaa
736
34 2104 DNA Homo sapiens
34 cccaatcact
cctggaatac acagagagag gcagcagctt gctcagcgga caaggatgct 60gggcgtgagg
gaccaaggcc tgccctgcac tcgggcctcc tccagccagt gctgaccagg 120gacttctgac
ctgctggcca gccaggacct gtgtggggag gccctcctgc tgccttgggg 180tgacaatctc
agctccaggc tacagggaga ccgggaggat cacagagcca gcatgttaca 240ggatcctgac
agtgatcaac ctctgaacag cctcgatgtc aaacccctgc gcaaaccccg 300tatccccatg
gagaccttca gaaaggtggg gatccccatc atcatagcac tactgagcct 360ggcgagtatc
atcattgtgg ttgtcctcat caaggtgatt ctggataaat actacttcct 420ctgcgggcag
cctctccact tcatcccgag gaagcagctg tgtgacggag agctggactg 480tcccttgggg
gaggacgagg agcactgtgt caagagcttc cccgaagggc ctgcagtggc 540agtccgcctc
tccaaggacc gatccacact gcaggtgctg gactcggcca cagggaactg 600gttctctgcc
tgtttcgaca acttcacaga agctctcgct gagacagcct gtaggcagat 660gggctacagc
agcaaaccca ctttcagagc tgtggagatt ggcccagacc aggatctgga 720tgttgttgaa
atcacagaaa acagccagga gcttcgcatg cggaactcaa gtgggccctg 780tctctcaggc
tccctggtct ccctgcactg tcttgcctgt gggaagagcc tgaagacccc 840ccgtgtggtg
ggtggggagg aggcctctgt ggattcttgg ccttggcagg tcagcatcca 900gtacgacaaa
cagcacgtct gtggagggag catcctggac ccccactggg tcctcacggc 960agcccactgc
ttcaggaaac ataccgatgt gttcaactgg aaggtgcggg caggctcaga 1020caaactgggc
agcttcccat ccctggctgt ggccaagatc atcatcattg aattcaaccc 1080catgtacccc
aaagacaatg acatcgccct catgaagctg cagttcccac tcactttctc 1140aggcacagtc
aggcccatct gtctgccctt ctttgatgag gagctcactc cagccacccc 1200actctggatc
attggatggg gctttacgaa gcagaatgga gggaagatgt ctgacatact 1260gctgcaggcg
tcagtccagg tcattgacag cacacggtgc aatgcagacg atgcgtacca 1320gggggaagtc
accgagaaga tgatgtgtgc aggcatcccg gaagggggtg tggacacctg 1380ccagggtgac
agtggtgggc ccctgatgta ccaatctgac cagtggcatg tggtgggcat 1440cgttagctgg
ggctatggct gcgggggccc gagcacccca ggagtataca ccaaggtctc 1500agcctatctc
aactggatct acaatgtctg gaaggctgag ctgtaatgct gctgcccctt 1560tgcagtgctg
ggagccgctt ccttcctgcc ctgcccacct ggggatcccc caaagtcaga 1620cacagagcaa
gagtcccctt gggtacaccc ctctgcccac agcctcagca tttcttggag 1680cagcaaaggg
cctcaattcc tgtaagagac cctcgcagcc cagaggcgcc cagaggaagt 1740cagcagccct
agctcggcca cacttggtgc tcccagcatc ccagggagag acacagccca 1800ctgaacaagg
tctcaggggt attgctaagc caagaaggaa ctttcccaca ctactgaatg 1860gaagcaggct
gtcttgtaaa agcccagatc actgtgggct ggagaggaga aggaaagggt 1920ctgcgccagc
cctgtccgtc ttcacccatc cccaagccta ctagagcaag aaaccagttg 1980taatataaaa
tgcactgccc tactgttggt atgactaccg ttacctactg ttgtcattgt 2040tattacagct
atggccacta ttattaaaga gctgtgtaac atcaaaaaaa aaaaaaaaaa 2100aaaa
2104
35 3865 DNA Homo sapiens
35 tttttcaatt ttgaacattt tgcaaaacga ggggttcgag gcaggtgaga
gcatcctgca 60cgtcgccggg gagcccgcgg gcacttggcg cgctctcctg ggaccgtctg
cactggaaac 120ccgaaagttt ttttttaata tatattttta tgcagatgta tttataaaga
tataagtaat 180ttttttcttc ccttttctcc accgccttga gagcgagtac ttttggcaaa
ggacggagga 240aaagctcagc aacattttag ggggcggttg tttctttctt tcttatttct
tttttaaggg 300gaaaaaattt gagtgcatcg cgatggagaa aatgtcccga ccgctccccc
tgaatcccac 360ctttatcccg cctccctacg gcgtgctcag gtccctgctg gagaacccgc
tgaagctccc 420ccttcaccac gaagacgcat ttagtaaaga taaagacaaa gaaaagaagc
tggatgatga 480gagtaacagc ccgacggtcc cccagtcggc attcctgggg cctaccttat
gggacaaaac 540ccttccctat gacggagata ctttccagtt ggaatacatg gacctggagg
agtttttgtc 600agaaaatggc attcccccca gcccatctca gcatgaccac agccctcacc
ctcctgggct 660gcagccagct tcctcggctg ccccctcggt catggacctc agcagccggg
cctctgcacc 720ccttcaccct ggcatcccat ctccgaactg tatgcagagc cccatcagac
caggtcagct 780gttgccagca aaccgcaata caccaagtcc cattgatcct gacaccatcc
aggtcccagt 840gggttatgag ccagacccag cagatcttgc cctttccagc atccctggcc
aggaaatgtt 900tgaccctcgc aaacgcaagt tctctgagga agaactgaag ccacagccca
tgatcaagaa 960agctcgcaaa gtcttcatcc ctgatgacct gaaggatgac aagtactggg
caaggcgcag 1020aaagaacaac atggcagcca agcgctcccg cgacgcccgg aggctgaaag
agaaccagat 1080cgccatccgg gcctcgttcc tggagaagga gaactcggcc ctccgccagg
aggtggctga 1140cttgaggaag gagctgggca aatgcaagaa catacttgcc aagtatgagg
ccaggcacgg 1200gcccctgtag gatggcattt ttgcaggctg gctttggaat agatggacag
tttgtttcct 1260gtctgatagc accacacgca aaccaacctt tctgacatca gcactttacc
agaggcataa 1320acacaactga ctcccatttt ggtgtgcatc tgtgtgtgtg tgcgtgtata
tgtgcttgtg 1380ctcatgtgtg tggtcagcgg tatgtgcgtg tgcgtgttcc tttgctcttg
ccattttaag 1440gtagccctct catcgtcttt tagttccaac aaagaaaggt gccatgtctt
tactagactg 1500aggagccctc tcgcgggtct cccatcccct ccctccttca ctcctgcctc
ctcagctttg 1560cttcatgttc gagcttacct actcttccag gactctctgc ttggattcac
taaaaagggc 1620cctggtaaaa tagtggatct cagtttttaa gagtacaagc tcttgtttct
gtttagtccg 1680taagttacca tgctaatgag gtgcacacaa taacttagca ctactccgca
gctctagtcc 1740tttataagtt gctttcctct tactttcagt tttggtgata atcgtcttca
aattaaagtg 1800ctgtttagat ttattagatc ccatatttac ttactgctat ctactaagtt
tccttttaat 1860tctaccaacc ccagataagt aagagtacta ttaatagaac acagagtgtg
tttttgcact 1920gtctgtacct aaagcaataa tcctattgta cgctagagca tgctgcctga
gtattactag 1980tggacgtagg atattttccc tacctaagaa tttcactgtc ttttaaaaaa
caaaaagtaa 2040agtaatgcat ttgagcatgg ccagactatt ccctaggaca aggaagcaga
gggaaatggg 2100aggtctaagg atgaggggtt aatttatcag tacatgagcc aaaaactgcg
tcttggatta 2160gcctttgaca ttgatgtgtt cggttttgtt gttccccttc cctcacaccc
tgcctcgccc 2220ccacttttct agttaacttt ttccatatcc ctcttgacat tcaaaacagt
tacttaagat 2280tcagttttcc cactttttgg taatatatat atttttgtga attatacttt
gttgttttta 2340aaaagaaaat cagttgatta agttaataag ttgatgtttt ctaaggccct
ttttcctagt 2400ggtgtcattt ttgaatgcct cataaattaa tgattctgaa gcttatgttt
cttattctct 2460gtttgctttt gaacgtatgt gctcttataa agtggacttc tgaaaaatga
atgtaaaaga 2520cactggtgta tctcagaagg ggatggtgtt gtcacaaact gtggttaatc
caatcaattt 2580aaatgtttac tatagaccaa aaggagagat tattaaatcg tttaatgttt
atacagagta 2640attataggaa gttctttttt gtacagtatt tttcagatat aaatactgac
aatgtatttt 2700ggaagacata tattatatat agaaaagagg agaggaaaac tattccatgt
tttaaaatta 2760tatagcaaag atatatattc accaatgttg tacagagaag aagtgcttgg
gggtttttga 2820agtctttaat attttaagcc ctatcactga cacatcagca tgttttctgc
tttaaattaa 2880aattttatga cagtatcgag gcttgtgatg acgaatcctg ctctaaaata
cacaaggagc 2940tttcttgttt cttattaggc ctcagaaaga agtcagttaa cgtcacccaa
aagcacaaaa 3000tggattttag tcaaatattt attggatgat acagtgtttt ttaggaaaag
catctgccac 3060aaaaatgttc acttcgaaat tctgagttcc tggaatggca cgttgctgcc
agtgccccag 3120acagttcttt tctaccctgc gggcccgcac gttttatgag gttgatatcg
gtgctatgtg 3180tttggtttat aatttgatag atgtttgact ttaaagatga ttgttctttt
gtttcattaa 3240gttgtaaaat gtcaagaaat tctgctgtta cgacaaagaa acattttacg
ctagattaaa 3300atatcctttc atcaatggga ttttctagtt tcctgccttc agagtatcta
atcctttaat 3360gatctggtgg tctcctcgtc aatccatcag caatgcttct ctcatagtgt
catagacttg 3420ggaaacccaa ccagtaggat atttctacaa ggtgttcatt ttgtcacaag
ctgtagataa 3480cagcaagaga tgggggtgta ttggaattgc aatacattgt tcaggtgaat
aataaaatca 3540aaaacttttg caatcttaag cagagataaa taaaagatag caatatgaga
cacaggtgga 3600cgtagagttg gcctttttac aggcaaagag gcgaattgta gaattgttag
atggcaatag 3660tcattaaaaa catagaaaaa tgatgtcttt aagtggagaa ttgtggaagg
attgtaacat 3720ggaccatcca aatttatggc cgtatcaaat ggtagctgaa aaaactatat
ttgagcactg 3780gtctctcttg gaattagatg tttatatcaa atgagcatct caaatgtttt
ctgcagaaaa 3840aaataaaaag attctaataa aaaaa
3865
36 359 DNA Homo sapiens
misc_feature (17)..(17) a or g or c or t/u
36 ttccttccct ccctccnttc ctcaggagcc gccagtcccc aagttggctg tggttgggca
60cctggtttgg gtcctgcaga gctgggctca ggccctgggc tctgaacctg tgaacccttg
120ctgtgttacg aaactttcct tcctctgagg gccttgaacc ctctcctttt cttcttttgg
180gggtgggggt taactttatt ttctcttccc tgtatctgcc tctcccttcc ctcaatttcc
240tgttttaaaa ctgaatggca cgaaattgtt ttcctcaact cggagattcc tgtatggaga
300gaatcaattt ctatatttgc aataaatttc ttatttaaag ctaaaaaaaa aaaaaaaaa
359
37 1848 DNA Homo sapiens
37 ggcacgaggg ccatctgtgg gggctttggg ccaggggtct
ccggacagca tgagcgtggg 60cttcatcggc gctggccagc tggcttttgc cctggccaag
ggcttcacag cagcaggcgt 120cttggctgcc cacaagataa tggctagctc cccagacatg
gacctggcca cagtttctgc 180tctcaggaag atgggggtga agttgacacc ccacaacaag
gagacggtgc agcacagtga 240tgtgctcttc ctggctgtga agccacacat catccccttc
atcctggatg aaataggcgc 300cgacattgag gacagacaca ttgtggtgtc ctgcgcggcc
ggcgtcacca tcagctccat 360tgagaagaag ctgtcagcgt ttcggccagc ccccagggtc
atccgctgca tgaccaacac 420tccagtcgtg gtgcgggagg gggccaccgt gtatgccaca
ggcacgcacg cccaggtgga 480ggacgggagg ctcatggagc agctgctgag cagcgtgggc
ttctgcacgg aggtggaaga 540ggacctgatt gatgccgtca cggggctcag tggcagcggc
cccgcctacg cattcacagc 600cctggatgcc ctggctgatg ggggcgtgaa gatgggactt
ccaaggcgcc tggcagtccg 660cctcggggcc caggccctcc tgggggctgc caagatgctg
ctgcactcag aacagcaccc 720aggccagctc aaggacaacg tcagctctcc tggtggggcc
accatccatg ccttgcatgt 780gctggagagt gggggcttcc gctccctgct catcaacgct
gtggaggcct cctgcatccg 840cacacgggag ctgcagtcca tggctgacca ggagcaggtg
tcaccagccg ccatcaagaa 900gaccatcctg gacaaggtga agctggactc ccctgcaggg
accgctctgt cgccttctgg 960ccacaccaag ctgctccccc gcagcctggc cccagcgggc
aaggattgac acgtcctgcc 1020tgaccaccat cctgccacca ccttctcttc tcttgtcact
agggggacta gggggtcccc 1080aaagtggccc actttctgtg gctctgatca gcgcaggggc
cagccaggga catagccagg 1140gaggggccac atcacttccc actggaaatc tctgtggtct
gcaagtgctt cccagcccag 1200aacaggggtg gattccccaa cctcaacctc ctttcttctc
tgctcccaaa ccatgtcagg 1260accaccttcc tctagagctc gggagcccgg agggtcttca
cccactccta ctccagtatc 1320agctggcacg ggctccttcc tgagagcaaa ggtcaaggac
cccctctgtg aaggctcagc 1380agaggtggga tcccacgccc cctcccggcc cctccctgcc
ctccattcag ggagaaacct 1440ctccttcccg tgtgagaagg gccagagggt ccaggcatcc
caagtccagc gtgaagggcc 1500acagcccctc ttggctgcca agcacgcaga tcccatggac
atttggggaa agggctcctt 1560gggctgctgg tgaacttctg tggccaccac ctcctgctcc
tgacctccct gggagggtgc 1620tatcagttct gtcctggccc tttcagtttt ataagttggt
ttccagcccc cagtgtcctg 1680acttctgtct gccacatgag gagggaggcc ctgcctgtgt
gggagggtgg ttactgtggg 1740tggaatagtg gaggccttca actgattaga caaggcccgc
ccacatcttg gagggcatct 1800gccttactga ttaaaatgtc aatgtaatct aaaaaaaaaa
aaaaaaaa 1848
38 3003 DNA Homo sapiens
38 gataaatgcg gagggacggt
ccagctttag ctctctgctc gccgccgccg ctgtcgccgc 60cacctcctct gatctacgaa
agtcatgtta cccaacaccg ggaggctggc aggatgtaca 120gtttttatca caggtgcaag
ccgtggcatt ggcaaagcta ttgcattgaa agcagcaaag 180gatggagcaa atattgttat
tgctgcaaag accgcccagc cacatccaaa acttctaggc 240acaatctata ctgctgctga
agaaattgaa gcagttggag gaaaggcctt gccatgtatt 300gttgatgtga gagatgaaca
gcagatcagt gctgcagtgg agaaagccat caagaaattt 360ggagcttata ccattgctaa
gtatggtatg tctatgtatg tgcttggaat ggcagaagaa 420tttaaaggtg aaattgcagt
caatgcatta tggcctaaaa cagccataca cactgctgct 480atggatatgc tgggaggacc
tggtatcgaa agccagtgta gaaaagttga tatcattgca 540gatgcagcat attccatttt
ccaaaagcca aaaagtttta ctggcaactt tgtcattgat 600gaaaatatct taaaagaaga
aggaatagaa aattttgacg tttatgcaat taaaccaggt 660catcctttgc aaccagattt
cttcttagat gaatacccag aagcagttag caagaaagtg 720gaatcaactg gtgctgttcc
agaattcaaa gaagagaaac tgcagctgca accaaaacca 780cgttctggag ctgtggaaga
aacatttaga attgttaagg actctctcag tgatgatgtt 840gttaaagcca ctcaagcaat
ctatctgttt gaactctccg gtgaagatgg tggcacgtgg 900tttcttgatc tgaaaagcaa
gggtgggaat gtcggatatg gagagccttc tgatcaggca 960gatgtggtga tgagtatgac
tactgatgac tttgtaaaaa tgttttcagg gaaactaaaa 1020ccaacaatgg cattcatgtc
agggaaattg aagattaaag gtaacatggc cctagcaatc 1080aaattggaga agctaatgaa
tcagatgaat gccagactgt gaaggaaaat ataaaaaaaa 1140agtcgactgc tatgctcaaa
aagtaaaaaa agctcaacag ttaaaatcta atgtttgttt 1200tctttcctgt tatattataa
ggatatgcac gtttgttctg gaaaagatag aatttgtctc 1260taaaagactt gaaattgtaa
ttaaaatggc aagctaatca aacataagct tcattaagtg 1320ggattctaag acagtctgtg
tttttatatt tcaagggttt aaccctttga gccttacatc 1380tcattcactg tctttctcca
agaaaagtat tttgggcgga cagtcagatc aagcagtaaa 1440attagctctt tcaaatcttc
ttgtcatgta aaatgaagct agtctgtttt aaaattttta 1500gttttggatt gtatactaat
gaaaatctta atgatgtttt tgatttttat atacttattt 1560taaagaaaat cttatatagt
acattttaca aaaattataa aaaatgaatt agtactggcg 1620aggactaaat gaaacaataa
tttttcattt tgataactag ctttccaggt ggacttagcc 1680ataggaaaat attactaatg
taatttaaca aattgctgca tgtattccat ttaaaaatat 1740gtttaaattg tcctaaaaca
aaataatttt ctccctagga gtatgcattt ggctacagtg 1800ttttgaaaca gaaaccttag
aataggtcat tggtatgggc tgaactgtgt atcccccaat 1860tcatttgttg aggtcctaac
tcccatttct tttgaatgtg actgttcgga gatgaggcct 1920ttaaagaggt gacttaagtt
caaaggaggc tgttagtcta atccaacatg gtgtcctttg 1980gacataagag ataccagcaa
tgtgtgcaca gaacaaagac caggagagga cacagtgaga 2040aggcagttat ctgcaagcaa
agagagaggc ttcagaagaa acaaaatcac cagcaccttg 2100atctttgact tctaatctcc
agaatagtga gaaataaatt tctgttgtta agccgtccac 2160tgtgggaggc cgacgcagga
ggattgcttg aggccaggag ttcaaggcca gcctggacaa 2220catagtaaga ccctatctct
acccccctaa taaattaatt taaaaagccc cccaatctgt 2280ggtattttat tatggcagcc
ctagcaagct aatacagtgg tttgagaggc tgggagggtt 2340gaggggaaga taaactttta
aaaagctctt atctttcatt tcaatcagtt aaaaatactt 2400gctcagtgta acaattttgc
ttctcagctt ccactctaat attgttgtgc cattaagcaa 2460tttagctaat cctgacattt
cttagattca taatgttagg agcatttaat ctgtatttta 2520caagttagga agcagaggat
cagagatggg aaaggactag cccaaggcca acattaacaa 2580gccctctaac aaaaacttta
caatacattt atgttgaatg gaactccaag atctcacctc 2640tccatccagg aatggagtcc
atgtaatcaa agtgaactta aaaataggac agtttcaaca 2700agtcaggaga ttcacagcaa
ctgatcaaag ggagtccagt caacgtgagc aagcgtgatt 2760atgatgagga agccccctct
gctttaatcc acacaaggaa cgtaacctga agtaacctga 2820tgttaaccaa tctgctgtgt
ctactatgct gtttccttgt tcctgctagt gctgctttac 2880aaatgcagac cattctatca
tacctggcag ggcttctgtt ttattttgta ggctggatgc 2940tacccagttc atgaatcgct
aataaaagcc aattagatct ttaaaaaaaa aaaaaaaaaa 3000aaa
3003
39 1824 DNA Homo sapiens
39 tattaaaagt accccatgga tggacctcca aatgagttta gggtaattgc gcttaaaata
60ttaggaccaa agtacattta ttttatagat ggaggagggg aggagacgag tggggaccag
120cttgacatcc agtcttcacc tggacatatg gaaagaacaa atgtgcgatc tgctcgttcc
180ctctgaaggt ctctgttacg tatttcctcc tctcctccag agcataataa ccaatgactg
240ctctcagaaa ggtactgtga ccaccacttg cttggctctc caacttcctc ccccatttcc
300ctcttgactc ctgtttgcca taacaccttc tgtcccctag ccttgcctca ggtccccgac
360gaatcctgcc cttaatctgt gggggtggta ggtggcactg gtttgaagag cttactggat
420ctccctcagt gagtcagcct ggagttgtgt ttgaaaacca caggccctga ctgtggctgt
480aagacctccc agacaccacc tgctgctgcc tatcatcatc ttcaggtgct gggctcccct
540gtgggcctcg tctgcccgcc ctctgctgca gctgtcccat gggcgcccgc cctctctgac
600accacaagag agcccatcta gattccagga aaaaactcat ctttatttgc cttcttccca
660ctgaaggtaa aagcaacatt aataaccaca acaaatactt agtgagtgct tactattatt
720catttaattg taggcccttc catccctggc catgatgaga gacatgccat agcttactcc
780taaagagacc tgaggacaca cgtgcacaaa catattgggc atatcatcaa tggcatcaaa
840actgattttc cctgtctacc cagaacaggc ctgagggaga gggaaaagcg gatacccacc
900tgtgtcgctg tttgcgtgcc aagtccagga acagtccata cagccctgct gcatcccacg
960acgctgtcac aaagcaggag ttcatccgag gccaaggtat ggagaaactg aggcccagaa
1020attgatgtcc agaatgcttt gctcttagcc actgtactat tatggcatat tttatcttta
1080tgtattgcat catttcatgg attcaagttt atcaatgtcc tttgacaagt ttaaaaatct
1140gtctgctaaa atctatcaaa tacattaagg aaaagtccca cttggcacat ctcccacacc
1200agatgttaat tattcatact gcatgactga ggattttgga ggcagagaga gattcatctg
1260caatatttgg aacaccaatg gaggtctatg tcaacacaga atttatacag cagctggtgc
1320tagtcagagc taatgacaga atttcagttt aataaaaaga cccccaactg agcacaccat
1380cttgaaaaaa gtatacttat caaacagctt tcaatcagtt caagagagac accttaattg
1440gggagaggaa gaattgcaga gtagtttgta atcatgccaa ttccagatca ataactgcat
1500gtctgttctt tggtagaaat agcttttgct ttatattaag taatcacata tatattctct
1560ctatttggat aaggaaacct tcgctttatt tgacaatgta taatgatata ctcttctaat
1620tcacctctgt gtcttcacaa taaacatgag taaaatttag acaagtgatg gtaaaggtca
1680atataattat ttatttttaa aataaatttt gtatctaaca ggaaagcagt tcttatgaaa
1740tttttatatt ttcaaaaatt gttttgttca aataaaattt tatgagtaaa gttaaaaaaa
1800aaaaaaaaaa aaaaaaaaaa aaaa
1824
40 630 DNA Homo sapiens
40 gggtacctgg tggggccaat caccgagcca tgaacatcag
taacgtactc taaagaccaa 60ggctacgatg gctatgatgg tcagaattac taccaccacc
agtgaagctc cagcctggga 120tgaattcatc cattctggct ttgcatccgg ctaccatttt
cgaagttcaa ctcaggaagg 180tgcaatataa caaatgtgca tattataatg aggaatggta
ctaccgttcc agattttctg 240taattgcttc tgcaaagtaa taggcttctt gtcccttttt
tttctggcat gttatggaat 300gatcattgta aatcaggacc atttatcaag cagtacacca
actcataaga tcaaatttca 360ttgaatggtt tgaggttgta gctctataaa tagtagtttt
taacatgcct gtagtattgc 420taactgcaaa aacatactct ttgtacaaga agtgcttcta
agaatttcat tgacattaat 480gacactgtat acaataaatg tgtagtttct taatcgcact
acctatgcaa cactgtgtat 540taggtttatc atcctcatgt atttttatgt gacctgtatg
tatattctaa tctacgagtt 600ttatcacaaa taaaaatgca atccttcaaa
630
41 970 DNA Homo sapiens
41 aaggtgggct ttcattgtga
tttttgttct gttgcagtaa tataggagca cattttggcc 60attgtaatta cagggaacaa
agggattgcg gacacatatc tggacttctt ttcctccctt 120attgttgtgg aagagacact
agaaatgctc aaacacctgc aatatacaga atatacacaa 180ttttattcca gtatttccct
aacatatggt ttaaaattat tccaggtata cagtgtatgc 240aattctgcat tatcacagag
gaacaacttc ttttttaaaa aataaatagg tcagccattt 300ttattaacgt gcaaaaactt
tatcactcta acatgctcta ggtagttgag gaaaagaggt 360ctgatcactg tttgtatttt
attttctttg tgggaacatt tcacctgctg agtgtacatg 420aatttgcttt ctataaaagg
cttttatgag tttacagtag aatcagtgga aggaagagtt 480aataagggct gtttttaaaa
aaacaaacaa acaaacaaaa caaataatta aaaaaaaatt 540ttacattcct tcctattctc
taactacact tgggaagtgc acttcagata agtttgcagt 600gtgactgaga gatgaaggaa
atccatagaa aaggtcctct tagtgaacaa aatttagtta 660ttaactttat agctatgaaa
tttccccggg catttgtttt tgttcaaaca gactttaacc 720tctgcatcat acttaaccct
gcgacatgcg tacagtatgc atattttgtt ttgaaaaaaa 780atgtttcgtt ccagtctgtt
aagaatattc aaaaataata aaggtattgc ttaataaaat 840tgctagaatt gtttagcagt
acatgcacaa tattttacta gattctttgt tttaatagtg 900ttttgttgag actgaaaatc
ttaaaatggt ctgcgcaaat acaaaaaaaa agaaaacacc 960aaaaaaaaaa
970
42 1743 DNA Homo sapiens
42 ggtgttgttc cggacacata gaaagataac gacgggaaga gcggggcccg ctttggggtc
60caggcaggtt ttggggcctc ctgtctggtg ggaggaggcc gcagcgcagc accctgctcg
120tcacttggga tggagaccgg ctttcccgca atcatgtacc ctggatcttt tattgggggc
180tggggagaag agtatctcag ctgggaagga ccggggctcc cagatttcgt cttccagcag
240cagcccgtgg agtctgaagc aatgcactgc agcaacccca agagtggagt tgtgctggct
300acagtggccc gaggtcccga tgcttgtcag atactcacca gagccccgct gggccaggat
360cccccgcaga ggacagtgct agggctgcta actgcaaatg ggcagtacag gaggacctgt
420ggccagggga tcacaagaat caggtgttat tctggatcag aaaatgcctt ccctccagct
480ggaaagaaag cactccctga ctgtggggtc caagagcccc ccaagcaagg gtttgacatc
540tacatggatg aactagagca gggggacaga gacagctgct cggtcagaga ggggatggca
600tttgaggatg tgtatgaagt agacaccggc acactcaagt cagacctgca cttcctgctg
660gatttcaaca cagtttcccc tatgctggta gattcatctc tcctctccca gtctgaagat
720atatccagtc ttggcacaga tgtgataaat gtgactgaat atgctgaaga aatttatcag
780taccttaggg aagctgaaat aaggcacaga cccaaagcac actacatgaa gaagcagcca
840gacatcacgg aaggcatgcg cacgattctg gtggactggc tggtggaggt tggggaagaa
900tataaacttc gagcagagac cctgtatctg gctgtcaact tcctggacag gttcctttca
960tgtatgtctg ttctgagagg gaaactgcag ctcgtaggaa cagcagctat gcttttggct
1020tcgaaatatg aagagatata tcctcctgaa gtagacgagt ttgtctatat caccgatgat
1080acatacacaa aacgacaact gttaaaaatg gaacacttgc ttctgaaagt tctagctttt
1140gatctgacag taccaaccac caaccagttt ctccttcagt acttgaggcg acaaggagtg
1200tgcgtcagga ctgagaacct ggctaagtac gtagcagagc tgagtctact tgaagcagat
1260ccattcttga aatatcttcc ttcactgata gctgcagcag ctttttgcct ggcaaactat
1320actgtgaaca agcacttttg gccagaaacc cttgctgcat ttacagggta ttcattaagt
1380gaaattgtgc cttgcctgag tgagcttcat aaagcgtacc ttgatatacc ccatcgacct
1440cagcaagcaa ttagggagaa gtacaaggct tcaaagtacc tgtgtgtgtc cctcatggag
1500ccacctgcag ttcttcttct acaataagtt tctgaatgga agcacttcca gaacttcacc
1560tccatatcag aagtgccaat aatcgtcata ggcttctgca cgttggatca actaatgttg
1620tttacaatat agatgacatt ttaaaaatgt aaatgaattt agtttccctt agactttagt
1680agtttgtaat atagtccaac attttttaaa caataaactg cttgtcttat gacaaaaaaa
1740aaa
1743
43 697 DNA Homo sapiens
43 tccaagccat taaggactgt ggaacttgct atgatcatgg
acgtgctgta tggtggcgtt 60tgttatgcag gaattgatac agatcctgag ctaaaatacc
caaaaggtgc tgggcgagtt 120gctttctcca atcagcagag ctatattgct gccattagtg
ctcggtttgt tcagcttcag 180catggtgata ttgataaacg tgtggaggta aagccatatg
tgctagatga ccagatgtgt 240gatgaatgcc agggcgcacg ctgtggtgga aaatttgctc
cctttttttg tgccaatgtc 300acttgcctgc agtattactg tgagttttgt tgggcaaata
tccactctcg tgctggacgt 360gagttccata agccattggt aaaggaaggt gctgatcgcc
cacgtcagat ccacttccgc 420tggaactaag aatagcaaac tggcctctgt ttaacaagga
aagaaagggt gcatgtggct 480tactgtgtct gaagatactg acatgcagaa gaaataagtg
cattcttctg cttttcaccc 540cagctatcaa tacatgcatc tttatcagca gccaaaacac
tacaagcctc ttgtttttca 600ccaaaaccct acatctcagg cttactaatt tttgtgatat
tttcatgttc aaataaaatg 660tttttttgta ttttcaaaaa aaaaaaaaaa aaaaaaa
697
44 2227 DNA Homo sapiens
44 ctcgatgtag aggggttggt
agcagacagg tggttacatt agaatagtca cacaaactgt 60tcagtgttgc aggaaccttt
tcttgggggt gggggagttt cccttttcta aaaatgcaat 120gcactaaaac tattttaaga
atgtagttaa ttctgcttat tcataaagtg ggcatcttct 180gtgttttagg tgtaatatcg
aagtcctggc ttttctcgtt ttctcacttg ctctcttgtt 240ctctgttttt ttaaaccaat
tttactttat gaatatattc atgacatttg taataaatgt 300cttgagaaag aatttgtttc
atggcttcat ggtcatcact caagctcccg taaggatatt 360accgtctcag gaaaggatca
ggactccatg tcacagtcct gccatcttac tttcctcttg 420tcgagttctg agtggaaata
actgcattat ggctgcttta acctcagtca tcaaaagaaa 480cttgctgttt tttaggcttg
atctttttcc tttgtggtta attttcctgt atattgtgaa 540aatgggggat tttccctctg
ctcccaccca cctaaacaca gcagccattt gtacctgttt 600gcttcccatc ccacttggca
cccactctga cctcttgtca gtttcctgtt cctggttcca 660tctttttgaa aaaggccctc
ctttgagcta caaacatctg gtaagacaag tacatccact 720catgaatgca gacacagcag
ctggtggttt tgtgtatacc tgtaaagaca agctgagagg 780cttacttttt ggggaagtaa
aagaagatgg aaatggatgt ttcatttgta tgagtttgga 840gcagtgctga aggccaaagc
cgcctactgg tttgtagtta acctagagaa ggttgaaaaa 900ttaatcctac ctttaaaggg
atttgaggta ggctggattc catcgccaca ggactttagt 960tagaattaaa ttcctgcttg
taatttatat ccatgtttag gcttttcata agatgaaaca 1020tgccacagtg aacacactcg
tgtacatatc aagagaagaa ggaaaggcac aggtggagaa 1080cagtaaaagg tgggcagatg
tctttgaaga aatgctcaat gtctgatgct aagtgggaga 1140aggcagagaa caaaggatgt
ggcataatgg tcttaacatt atccaaagac ttgaagctcc 1200atgtctgtaa gtcaaatgtt
acacaaaaaa aaatgcaaat ggtgtttcat tggaattacc 1260aagtgcttag aacttgctgg
ctttcccata ggtggtaaag gggtctgagc tcacaccgag 1320ttgtgcttgg cttgcttgtg
cagctccagg cacccggtgg gcactctggt ggtgtttgtg 1380gtgaactgaa ttgaatccat
tgttgggctt aagttactga aattggaaca ccctttgtcc 1440ttctcggcgg gggcttcctg
gtctgtgctt tacttggctt ttttccttcc cgtcttagcc 1500tcaccccctt gtcaaccaga
ttgagttgct atagcttgat gcagggaccc agtgaagttt 1560ctccgttaaa gattgggagt
cgtcgaaatg tttagattct tttaggaaag gaattatttt 1620cccccctttt acagggtagt
aacttctcca cagaagtgcc aatatggcaa aattacacaa 1680gaaaacagta ttgcaatgac
accattacat aaggaacatt gaactgttag aggagtgctc 1740ttccaaacaa aacaaaaatg
tctctaggtt tagtcagagc tttcacaagt aataaccttt 1800ctgtattaaa atcagagtaa
ccctttctgt attgagtgca gtgtttttta ctcttttctc 1860atgcacatgt tacgttggag
aaaatgttta caaaaatggt tttgttacac taatgcgcac 1920cacatattta tggtatattt
taagtgactt tttatgggtt atttaggttt tcgtcttagt 1980tgtagcacac ttaccctaat
tttgccaatt attaatttgc taaatagtaa tacaaatgac 2040aactgcatta aatttactaa
ttataaaagc tgcaagcaga ctggtggcaa gtacacagcc 2100cttttttttg cagtgctaac
ttgtctactg tgtattatga aaattactgt tgtcccccca 2160cccttttttc cttaaataaa
gtaaaaatga caccctaaaa aaaaaaaaaa aaaaaaaaaa 2220aaaaaaa
222745267DNAHomo sapiens
45tatacggctg ctagaagacg acagaaggtg gcttgggggt ggatatcttt gggttgctgg
60aaaaggtgtg ggaaggttca ggatggtggg agggactgag gtccctgagg tgaagaggcc
120cttggtcctg acgggtttga cccgtgcctg gacccttgga gcagtgttgt gtgaacttgc
180ctagaactct gccttctccg ttgtcaataa agcctccccc tcatgaccta aaaaaaaaaa
240aaaaaaaaaa aaaaaaagtc gtatcga
267464415DNAHomo sapiens 46gagcaggaaa atatataccc taaacagaaa ctcttacttg
ttttatgagc aagtctgagt 60gagtcctaaa atggctggcg aagagctacc aatactgact
gacaggtcac cttaaagcct 120ctaggtgtgc caagtttgat ttatcttagg gactagaacc
tagtcttcta aatgtgattt 180tgccttgctg tttcgtcctg atgtgaaggt aaccacacag
agagattggg ctgcatcagt 240aatgatatgc atacctttcg tgcatcagtg agcttcttcc
ctgttaactg tatgaccaca 300aaatttagct ggagtaaata aatatgcgac agaaatcctg
gaacaagatg gtgaaattgc 360ttaagaatcg agacttcagg gctcaatgac ctctgagcat
gtttcccaaa gtgtgaccca 420catgaccatc tgtctctcag tctcctggtc cctccgtaga
gcttctgaaa ctgaatcttt 480gtggggtggg ggtagcgttc aagaatcaaa agttgaacca
agctctttgg gtgatactta 540tgtatactga ggttcaggaa ctgctggaga gatgactggg
caccaagagg atgacagtga 600ctcagctggc atcccttagc tggttcatgg cagagctgag
tgggcactcc tgtctctgac 660cccagcttca gtgctcttta tctcctccat gcctcctcag
tcgtgctgct ctaagactgc 720ttactggctt tccttcatgt cctgggcaca gagcagttct
tttggtagca gatttgagtc 780cacttccccc gtgcacagat cactgctcag gacccagaga
ggagcagctc tgctccagca 840gggttttcca ttgcatcaca cacccaaacg gtaggatcca
acagtcacac ttgaaagcaa 900ccataattgt gaggtttctg atgctgtaga cttccttaca
tttctcacaa cctagttaga 960gagtcacatg ggggtgaagt gtggctcgcg acctgcccca
acaagtgcgt gcagaagcca 1020ggaaacaaag gagtaaattc acttcaaatg ggatgcacat
ggtgtccgtg atgaagagac 1080acattcagaa ttgcccaagg acaggaaaat gaccagagag
agccagagct gagctggtaa 1140taaagagact ccgagactga gtggagttaa tgagggaagc
atgcaacgag tggggcaatt 1200tcagttggtt tctctcattg ctttaagcga aatgaactat
acggacagga gaacagcctg 1260cttgccccag tctctccttg gccgccctct gttgtccctg
tcaactcagg tgcccacggt 1320gctcagagga ggtgctggca aagcccctgg agccttatgt
aggccatggg ggctcctaaa 1380aggaacctga atgaatcatt tacagcaggt ctctcttgta
aagcccagcc acagtaactc 1440gtacactgac tgtttcaaaa gacagccttt cttaatcatt
taattgtttc atattcaaat 1500atatctccta attgttttta ttttttcctg atctagaaga
tatgacaaca gggtagaact 1560tgggaagagg gaataggaag ctcgcccttc ctccttccct
cctcccctct ctactttcct 1620tccttccttg gtcatcaggt accttctttg tgcctgctgt
tgtaggctac accctatgtt 1680tggtggaagg caaaaagaaa aatcagtagg atacaactca
gtagggaaga cagagatatt 1740caagcccctt gtcctcccag tgtgataagt gtggtggttg
aggtgtgaac aaggggctct 1800gtgaacagag aggacgaaag aggagctcct cctgaggctg
ttgggaaaag catcactgaa 1860gagtgacttt cagaagaaga gaagaaaaag aggagaacat
gcgtgatttt ataatgaaat 1920agattagata aggggaaaaa aggcatttaa acaaggcaaa
aagaacagga gaatagagaa 1980gagatgtgga ggagaaggag cactgtagta aacacgcaga
aggacaggaa cacttagaca 2040tgcaacccac tcccaccctc cgtcttgggg gaggaaagca
cactactgtc ccaaagaact 2100aatactgaac cagtgctgcc ttgtggagag aggcatggcc
aaggcgttca gagacctggg 2160cctggtccca ccgctgccca cagcactcag cctctgagca
cagcctgggg tcatctgtgt 2220gccctctggc caaggctgat ggtagttctc tgagtaattg
agagtcattg cctgtctgtg 2280cagtattgtg aaaacaagtc accttttaac tttaaaacta
ctttaaaaaa ctttaaagtt 2340ttaaaaaaac ttctttaaaa actactcatg agatgacagt
ttctctgacc ctcagaggaa 2400ggctgggctg cgcatacgtg aggaattttt acatgaacat
cccaggactt gctgttcgca 2460ggtgataaac tgcacctccc caggactccc gctgcactca
catgcagctc cctggacttc 2520tggtatctga cccggcccat ttctgtgttt caggggagaa
tttggcttgc gggagtactc 2580agaagttaag acggtgacag taaagatccc ccagaagaac
tcctaagaag gccaagaagg 2640aggatgaagc ccagcctgca cgtctgtccc tctctgcttt
ctctgtaggg cccagctctc 2700aggaatacaa agttgagcca cggtccttac ttaaagattg
aaaagataac atgtaggcca 2760ggcaggtcac tgcacaacta aagcaaacca gctgggtaca
gtttcttggc actctgtaag 2820gggccacctt aatcatacca aatattgggg aaagtgggat
aaagggagga ggaggagcta 2880gcagacacat ccagtatctc cttctggagc acaggatgaa
ataagggagc tgtattattt 2940catgtctttg tcacaaagaa ctttcctctc aaggaaaggt
gacctttctc ctgtcttcat 3000tttcctcctt ccaggccctc ctcgctcacc cacccctccc
tctcttccaa ggagatgtca 3060gctgagctca ttctggggca gatgtttggg ccgggaacaa
tttttcaagg ttgtaaagcc 3120aaattatcat ttcatgttat ccatttcttc aaagcaaaac
atgaaatggt tttagctaga 3180gtcagaccag aatgaaaatg ccaggagctg gtacactaca
gatgtagtaa gaacctggga 3240tattcctgac ccaatctggt tttcttttac ccataaataa
catgaatgaa aaaagattgg 3300gacaatagag actggaagtc atcatgtgca gttcaccgct
tctgagcttg ctgcagtttt 3360ggggtgtgtg tgtattagat tccttctcag ttattctgga
ataaggcaag gagtgggttg 3420tttttcatag ctagataaga tcttttccaa agtttttctt
agaaccaacc aaaaaacaat 3480ccgagtaggc ccgagaattt gataatgctg gatgccttgc
agacatcatt cagtttctaa 3540tattgggcaa caattattat taaatgaatt atttctgtag
ttggaatctg taccttctga 3600acctctacac caataactgc tgcaggtgtg attttggtct
gtcacactgt acatctatca 3660taatgtgccc tgtatctatt ggcagtgacc ttggaaaatc
tggccaagcc taggggtttc 3720cttttccatt tgccaagttc cattgtgcca ggactgccgt
gctccactga gctcctctgt 3780cacaccccat tcttgcccct cactgggcag gccatggcct
acagcttgca gggagtaaag 3840caggcccgcc tccctttctt cccatccaca tactcctctt
ctgctttcca gtgactccac 3900cagtttgatg tgggaagtgt tagcttcctt tccttcttcc
atcccttctt ccatctttcc 3960agctgtcaaa tccaatccag tctctaacct aaatgcagat
catttattta aaagtaccaa 4020acataaccca gagtatgtgg aatatgggca acatatatat
agccttctgt atttaacgat 4080cttctgcttc ttaaccgtac cagttttcta tttataactc
ttatctatcc atgatgtttt 4140aaagtctcca cttgctgtta tttacaaacg acagtgcatt
cagcagccca gtgccgtgag 4200ccctgacaga tgccgtattt ctgagtgctt ccatgtgaat
gctgccctcc tgtagcatgt 4260gtccaagtgg acatagccac taaccaacta gttacctttg
gactgcaaca aaaaatgtga 4320aaatgaagat ttatttcttt taatttactt aaaaagaaac
ctctgtgcta gcaataaagc 4380atttatattg tgcaaaaaaa aaaaaaaaaa aaaac
441547453DNAHomo sapiens 47tgaaaattta tataactgtt
gttgataagg aacattatcc aggaattgat acgtttatta 60ggaaaagata tttttatagg
cttggatgtt tttagttctg actttgaatt tatataaagt 120atttttataa tgactggtct
tccttacctg gaaaaacatg cgatgttagt tttagaatta 180caccacaagt atctaaattt
ggaacttaca aagggtctat cttgtaaata ttgttttgca 240ttgtctgttg gcaaatttgt
gaactgtcat gatacgctta aggtggaaag tgttcattgc 300acaatatatt tttactgctt
tctgaatgta gacggaacag tgtggaagca gaaggctttt 360ttaactcatc cgtttgccaa
tcattgcaaa caactgaaat gtggatgtga ttgcctcaat 420aaagctcgtc cccattgctt
aagccttcaa aaa 453481587DNAHomo sapiens
48cttttagctg ccagccctgg cccatcatgt agctgcagca cagccttccc taacgttgca
60actgggggaa aaatcacttt ccagtctgtt ttgcaaggtg tgcatttcca tcttgattcc
120ctgaaagtcc atctgctgca tcggtcaaga gaaactccac ttgcatgaag attgcacgcc
180tgcagcttgc atctttgttg caaaactagc tacagaagag aagcaaggca aagtcttttg
240tgctcccctc ccccatcaaa ggaaagggga aaatgtctca gtcgaaaggc aagaagcgaa
300accctggcct taaaattcca aaagaagcat ttgaacaacc tcagaccagt tccacaccac
360ctagagattt agactccaag gcttgcattt ctattggaaa tcagaacttt gaggtgaagg
420cagatgacct ggagcctata atggaactgg gacgaggtgc gtacggggtg gtggagaaga
480tgcggcacgt gcccagcggg cagatcatgg cagtgaagcg gatccgagcc acagtaaata
540gccaggaaca gaaacggcta ctgatggatt tggatatttc catgaggacg gtggactgtc
600cattcactgt caccttttat ggcgcactgt ttcgggaggg tgatgtgtgg atctgcatgg
660agctcatgga tacatcacta gataaattct acaaacaagt tattgataaa ggccagacaa
720ttccagagga catcttaggg aaaatagcag tttctattgt aaaagcatta gaacatttac
780atagtaagct gtctgtcatt cacagagacg tcaagccttc taatgtactc atcaatgctc
840tcggtcaagt gaagatgtgc gattttggaa tcagtggcta cttggtggac tctgttgcta
900aaacaattga tgcaggttgc aaaccataca tggcccctga aagaataaac ccagagctca
960accagaaggg atacagtgtg aagtctgaca tttggagtct gggcatcacg atgattgagt
1020tggccatcct tcgatttccc tatgattcat ggggaactcc atttcagcag ctcaaacagg
1080tggtagagga gccatcgcca caactcccag cagacaagtt ctctgcagag tttgttgact
1140ttacctcaca gtgcttaaag aagaattcca aagaacggcc tacataccca gagctaatgc
1200aacatccatt tttcacccta catgaatcca aaggaacaga tgtggcatct tttgtaaaac
1260tgattcttgg agactaaaaa gcagtggact taatcggttg accctactgt ggattggtgg
1320gtttcggggt gaagcaagtt cactacagca tcaatagaaa gtcatctttg agataattta
1380accctgcctc tcagagggtt ttctctccca attttctttt tactccccct cttaaggggg
1440ccttggaatc tatagtatag aatgaactgt ctagatggat gaattatgat aaaggcttag
1500gacttcaaaa ggtgattaaa tatttaatga tgtgtcatat gaaaaaaaaa aaaaaaaaaa
1560aaaaaaaaaa aaaaaaaaaa aaaaaaa
158749558DNAHomo sapiens 49cagtcccacc atgtattttg ctttgtttct aaaaagcttt
ttaaaaactg ttatttaata 60ccaaagggag gaatcgtatg ggttcttctg cccaccgttg
tgactaagaa tgcacaggga 120cttggttctc gttgcacctt tttttagtaa catgtttcat
ggggacccac tgtacagccc 180ttcattctgc tgtgtcagtt tggcctggcc tgacactggc
tgccccagcg gggaccacgg 240aagcagagtg agagccttcg ctgagtcaat gctaccttca
gccccagacg catcccattt 300ccatgtcttc catgctcact gctcatgcac tttttacacg
gtttcttcca aacagcccgg 360tcttgatgca ggagagtctg gaaaaggaag aaaatggttt
cagtttcaaa attcaaagga 420aaaagttgag gacttatttt gtcctgtcaa gattgcaaga
acatgtaaaa tgtacggagc 480ttcataatac gttatattgt tccgaagcag ctcgttgaga
aacatttgtt ttcaataaca 540ttttagctta aaaaaaaa
55850841DNAHomo sapiensmisc_feature(54)..(54)a or
g or c or t/u 50tcacctcgtg gcgtagggga gaggtaacac cgagaagagg cagcggcggt
ggcncagaga 60cgattggtgc caaacagggc agaacgcaac tcagctctgg gtttgtgaat
agcacaatgg 120aagaagctgg actttgtggg ttaagagaga aagcagatat gttgtgtaac
tctgaatcac 180atgatattct tcaacatcaa gactcaaatt gcagtgccac aagtaataaa
catttattgg 240aagatgaaga aggccgtgac tttataacaa agaacaggag ttgggtgagc
ccagtgcact 300gcacacaaga gtcaagaagg gagcttcctg agcaagaagt agcccctccg
tctggtcagc 360aagctttaca attgcaacag gaacaaagaa aaagtcttag gaaaagaagt
tttattattg 420atgcaagccc taaacactct ttccgactcc agaggagaag ctggcagctc
tctgtaagaa 480atatgctgat cttggaaatt cacctcttct atagaagagt ttgttttgaa
ctatacgatt 540tgaaacaaaa ttcttttttt ggagactatg gaaacattct caacagggaa
accctactag 600actttgtaaa gcaaataatg gaaaagatac agaacttttt gaagaatcat
gggaaatttt 660tataattaaa taaatgctaa aattctgttt tgtgaaacat ttatgggaat
tatcactgac 720agtttttgta cactttcaaa tagtgttaaa gcagcaactc catgttgtaa
atgcacaaaa 780caaatattta gttaataatc aactccaaga ataaagctgt aacaataata
gttaaaaaaa 840a
841512384DNAHomo sapiens 51ggcacgaggg tcagcagccg ccagacttcc
tgccgaagtc cgagccccct cccggggctg 60gaggggggca agcgggttcc gaggtgcaaa
gcctggtgcc ccgagccctg cggagctcgg 120ggccagcatg gcccccacgc tgcaacaggc
gtaccggagg cgctggtgga tggcctgcac 180ggctgtgctg gagaacctct tcttctctgc
tgtactcctg ggctggggct ccctgttgat 240cattctgaag aacgagggct tctattccag
cacgtgccca gctgagagca gcaccaacac 300cacccaggat gagcagcgca ggtggccagg
ctgtgaccag caggacgaga tgctcaacct 360gggcttcacc attggttcct tcgtgctcag
cgccaccacc ctgccactgg ggatcctcat 420ggaccgcttt ggcccccgac ccgtgcggct
ggttggcagt gcctgcttca ctgcgtcctg 480caccctcatg gccctggcct cccgggacgt
ggaagctctg tctccgttga tattcctggc 540gctgtccctg aatggctttg gtggcatctg
cctaacgttc acttcactca cgctgcccaa 600catgtttggg aacctgcgct ccacgttaat
ggccctcatg attggctctt acgcctcttc 660tgccattacg ttcccaggaa tcaagctgat
ctacgatgcc ggtgtggcct tcgtggtcat 720catgttcacc tggtctggcc tggcctgcct
tatctttctg aactgcaccc tcaactggcc 780catcgaagcc tttcctgccc ctgaggaagt
caattacacg aagaagatca agctgagtgg 840gctggccctg gaccacaagg tgacaggtga
cctcttctac acccatgtga ccaccatggg 900ccagaggctc agccagaagg cccccagcct
ggaggacggt tcggatgcct tcatgtcacc 960ccaggatgtt cggggcacct cagaaaacct
tcctgagagg tctgtcccct tacgcaagag 1020cctctgctcc cccactttcc tgtggagcct
cctcaccatg ggcatgaccc agctgcggat 1080catcttctac atggctgctg tgaacaagat
gctggagtac cttgtgactg gtggccagga 1140gcatgagaca aatgaacagc aacaaaaggt
ggcagagaca gttgggttct actcctccgt 1200cttcggggcc atgcagctgt tgtgccttct
cacctgcccc ctcattggct acatcatgga 1260ctggcggatc aaggactgcg tggacgcccc
aactcagggc actgtcctcg gagatgccag 1320ggacggggtt gctaccaaat ccatcagacc
acgctactgc aagatccaaa agctcaccaa 1380tgccatcagt gccttcaccc tgaccaacct
gctgcttgtg ggttttggca tcacctgtct 1440catcaacaac ttacacctcc agtttgtgac
ctttgtcctg cacaccattg ttcgaggttt 1500cttccactca gcctgtggga gtctctatgc
tgcagtgttc ccatccaacc actttgggac 1560gctgacaggc ctgcagtccc tcatcagtgc
tgtgttcgcc ttgcttcagc agccactttt 1620catggcgatg gtgggacccc tgaaaggaga
gcccttctgg gtgaatctgg gcctcctgct 1680attctcactc ctgggattcc tgttgccttc
ctacctcttc tattaccgtg cccggctcca 1740gcaggagtac gccgccaatg ggatgggccc
actgaaggtg cttagcggct ctgaggtgac 1800cgcatagact tctcagacca agggacctgg
atgacaggca atcaaggcct gagcaaccaa 1860aaggagtgcc ccatatggct tttctacctg
taacatgcac atagagccat ggccgtagat 1920ttataaatac caagagaagt tctatttttg
taaagactgc aaaaaggagg aaaaaaaacc 1980ttcaaaaacg ccccctaagt caacgctcca
ttgactgaag acagtcccta tcctagaggg 2040gttgagcttt cttcctcctt gggttggagg
agaccagggt gcctcttatc tccttctagc 2100ggtctgcctc ctggtacctc ttggggggat
cggcaaacag gctacccctg aggtcccatg 2160tgccatgagt gtgcacacat gcatgtgtct
gtgtatgtgt gaatgtgaga gagacacagc 2220cctcctttca gaaggaaagg ggcctgaggt
gccagctgtg tcctgggtta ggggttgggg 2280gtcggcccct tccagggcca ggagggcagg
ttccctctct ggtgctgctg cttgcaagtc 2340ttagaggaaa taaaaaggga agtgagaaaa
aaaaaaaaaa aaaa 2384521923DNAHomo sapiens 52ggcacgaggg
aggcggcggc tccagccggc gcggcgcgag gctcggcggt gggatccggc 60gggcggtgct
agctccgcgc tccctgcctc gctcgctgcc gggggcggtc ggaaggcgcg 120gcgcgaagcc
cgggtggccc gagggcgcga tggctgctcc tgtcccgtgg gcctgctgtg 180ctgtgcttgc
cgccgccgcc gcagttgtct acgcccagag acacagtcca caggaggcac 240cccatgtgca
gtacgagcgc ctgggctctg acgtgacact gccatgtggg acagcaaact 300gggatgctgc
ggtgacgtgg cgggtaaatg ggacagacct ggcccctgac ctgctcaacg 360gctctcagct
ggtgctccat ggcctggaac tgggccacag tggcctctac gcctgcttcc 420accgtgactc
ctggcacctg cgccaccaag tcctgctgca tgtgggcttg ccgccgcggg 480agcctgtgct
cagctgccgc tccaacactt accccaaggg cttctactgc agctggcatc 540tgcccacccc
cacctacatt cccaacacct tcaatgtgac tgtgctgcat ggctccaaaa 600ttatggtctg
tgagaaggac ccagccctca agaaccgctg ccacattcgc tacatgcacc 660tgttctccac
catcaagtac aaggtctcca taagtgtcag caatgccctg ggccacaatg 720ccacagctat
cacctttgac gagttcacca ttgtgaagcc tgatcctcca gaaaatgtgg 780tagcccggcc
agtgcccagc aaccctcgcc ggctggaggt gacgtggcag accccctcga 840cctggcctga
ccctgagtct tttcctctca agttctttct gcgctaccga cccctcatcc 900tggaccagtg
gcagcatgtg gagctgtccg acggcacagc acacaccatc acagatgcct 960acgccgggaa
ggagtacatt atccaggtgg cagccaagga caatgagatt gggacatgga 1020gtgactggag
cgtagccgcc cacgctacgc cctggactga ggaaccgcga cacctcacca 1080cggaggccca
ggctgcggag accacgacca gcaccaccag ctccctggca cccccaccta 1140ccacgaagat
ctgtgaccct ggggagctgg gcagcggcgg gggaccctcg gcacccttct 1200tggtcagcgt
ccccatcact ctggccctgg ctgccgctgc cgccactgcc agcagtctct 1260tgatctgagc
ccggcacccc atgaggacat gcagagcacc tgcagaggag caggaggccg 1320gagctgagcc
tgcagacccc ggtttctatt ttgcacacgg gcaggaggac cttttgcatt 1380ctcttcagac
acaatttgtg gagaccccgg cgggcccggg cctgccgccc cccagccctg 1440ccgcaccaag
ctggccctcc ttcctccctc aggggaggtg ggccatgcag ctaacccacc 1500caccaaagac
cccctcaccc tggccccttg ggctggaccc tccaatgcca gcgactccca 1560ggagcccttg
ggggacgtga ggggagcctc tcacatccga tttctcctcc tgccccagcc 1620tcctgtctat
cccagggtct ctgttgccac catcagatta taagctcctg atgctggggg 1680ggcccagcca
tccccctccc cccagcaccc acaattttca gtcccctccc ctctgccctg 1740ttttgtatac
ccctcccctg accctgctcc tatcccacag tatttaatgc cctgtcagtc 1800ccttctagtc
tgactcaatg gtaacttgct gtatttgaat tttttataga tgtatataca 1860gggtgggggg
agtgggcggt tctcattaaa cgtcaccatt tcatgaaaaa aaaaaaaaaa 1920aaa
1923532065DNAHomo sapiens 53ggcacgagga gtttcataat ttccgtgggt cgggccgggc gggccaggcg
ctgggcacgg 60tgatggccac cactggggcc ctgggcaact actacgtgga ctcgttcctg
ctgggcgccg 120acgccgcgga tgagctgagc gttggccgct atgcgccggg gaccctgggc
cagcctcccc 180ggcaggcggc gacgctggcc gagcaccccg acttcagccc gtgcagcttc
cagtccaagg 240cgacggtgtt tggcgcctcg tggaacccag tgcacgcggc gggcgccaac
gctgtacccg 300ctgcggtgta ccaccaccat caccaccacc cctacgtgca cccccaggcg
cccgtggcgg 360cggcggcgcc ggacggcagg tacatgcgct cctggctgga gcccacgccc
ggtgcgctct 420ccttcgcggg cttgccctcc agccggcctt atggcattaa acctgaaccg
ctgtcggcca 480gaaggggtga ctgtcccacg cttgacactc acactttgtc cctgactgac
tatgcttgtg 540gttctcctcc agttgataga gaaaaacaac ccagcgaagg cgccttctct
gaaaacaatg 600ctgagaatga gagcggcgga gacaagcccc ccatcgatcc caataaccca
gcagccaact 660ggcttcatgc gcgctccact cggaaaaagc ggtgccccta tacaaaacac
cagaccctgg 720aactggagaa agagtttctg ttcaacatgt acctcaccag ggaccgcagg
tacgaggtgg 780ctcgactgct caacctcacc gagaggcagg tcaagatctg gttccagaac
cgcaggatga 840aaatgaagaa aatcaacaaa gaccgagcaa aagacgagtg atgccatttg
ggcttattta 900gaaaaaaggg taagctagag agaaaaagaa agaactgtcc gtcccccttc
cgccttctcc 960cttttctcac ccccacccta gcctccacca tccccgcaca aagcggctct
aaacctcagg 1020ccacatcttt tccaaggcaa accctgttca ggctggctcg taggcctgcc
gctttgatgg 1080aggaggtatt gtaagctttc cattttctat aagaaaaagg aaaagttgag
gggggggcat 1140tagtgctgat agctgtgtgt gttagcttgt atatatattt ttaaaaatct
acctgttcct 1200gacttaaaac aaaaggaaag aaactacctt tttataatgc acaactgttg
atggtaggct 1260gtatagtttt tagtctgtgt agttaattta atttgcagtt tgtgcggcag
attgctctgc 1320caagatactt gaacactgtg ttttattgtg gtaattatgt tttgtgattc
aaacttctgt 1380gtactgggtg atgcacccat tgtgattgtg gaagatagaa ttcaatttga
actcaggttg 1440tttatgaggg gaaaaaaaca gttgcataga gtatagctct gtagtggaat
atgtcttctg 1500tataactagg ctgttaacct atgattgtaa agtagctgta agaatttccc
agtgaaataa 1560aaaaaaattt taagtgttct cggggatgca tagattcatc attttctcca
ccttaaaaat 1620gcgggcattt aagtctgtcc attatctata tagtcctgtc ttgtctattg
tatatataat 1680ctatatgatt aaagaaaata tgcataatca gacaagcttg aatattgttt
ttgcaccaga 1740cgaacagtga ggaaattcgg agctatacat atgtgcagaa ggttactacc
tagggtttat 1800gcttaatttt aatcggagga aatgaatgct gattgtaacg gagttaattt
tattgataat 1860aaattataca ctatgaaacc gccattgggc tactgtagat ttgtatcctt
gatgaatctg 1920gggtttccat cagactgaac ttacactgta tattttgcaa tagttacctc
aaggcctact 1980gaccaaattg ttgtgttgag atgatattta actttttgcc aaataaaata
tattgattct 2040tttctaaaaa aaaaaaaaaa aaaaa
2065541045DNAHomo sapiens 54aaaccagtgt atccagtcat ggaaaagaag
gaggaagatg gcaccctgga gcgggggcac 60tggaacaaca agatggagtt tgtgctgtca
gtggctgggg agatcattgg cttaggcaac 120gtctggaggt ttccctatct ctgctacaaa
aatgggggag gtgagatgag agcccttgtg 180ccaccccacc cactcctgga aggaggatac
ttccatctcc tgcacttacg gcccctctgg 240ggagtcccat agatgtatag aattctggag
gtaggaggac gcttggaggt cattaaggac 300actctgtaag agactaagac ctagaaaggt
tacgtgacta tcccagggct ctttctatta 360taacgtggca tcgtagaaat atgagcacaa
gctggaacca ggtggatgag agtttggatt 420ctggctctgc tacttaacac tctgtgtgat
cttggacaag ttacttaagc tctcagagca 480tcaattgccg ctcctgcaaa ttgagataat
aatgcctgcc tttcaaggtc attgtaagga 540ttagagacaa tgtgtgtaaa gcacttaata
aatagtagct ctgctgatga tgacgttgat 600aaccaaactg ttctgtggtc ttaagtaata
aatagtagct ctgctgatga tgacgttgat 660aaccaaactg ttctgtggtc ttaagtaata
agtagtagct ctgttgatga tgacgttgat 720aaccaaactg ttctgtggtc ttaagtaata
agtagtagct ctgctgatga tgacgttgat 780aaccaaactg ttctgtggtc ttaagtaata
aatagtagct ctgctgatga tgatgttgat 840aaccaaactg ttctgtggtc ttaagtaata
aatagtagct ctgctgatga tgacgttgat 900aaccaaactg ttctgtggtc ttaagtaata
aatagtagct ctgctgatga tgacgttgat 960aaccaaactg ttctgtggtc ttaagtaata
aatagtagct ctgctgatga tgacgttgat 1020aaaaaaaaaa aaaaaaaaaa aaaaa
1045552024DNAHomo sapiens 55ggaagacatc
aggatgtacc atctgccctt ctgtcggacc ccagggtacg tcccatgagc 60gcggccgagc
tgcgtcgagg gcagcagagc gtgctgcact gctcagggac ccggactctg 120cagtttctcc
tgcactgttt tcacctttgg ccagacgggc tctgggaaga cctacaccct 180gactggaccc
cctccccagg gggagggggt gcctgtaccc cccagcctgg ctggcatcat 240gcagaggacc
ttcgcctggc tgttggaccg cgtgcagcac ctgggtgccc ctgtcaccct 300tcgcgcctct
tatctggaga tctacaatga gcaggttcgg gacttgctga gcctggggtc 360tccccggccc
ctccctgttc gctggaacaa gactcggggc ttctatgtgg agcagctgcg 420ggtggtggaa
tttgggagtc tggaggccct gatggaactt ttgcaaacgg gtctcagccg 480tcgaaggaac
tcagcccaca ccctgaacca ggcctccagc cgaagccatg ccctgctcac 540cctttacatc
agccgtcaaa ctgcccagca gatgccttct gtggaccctg gggagccccc 600tgttggtggg
aagctgtgct ttgtggacct ggcaggcagt gagaaggtag cagccacggg 660atcccgtggg
gagctgatgc ttgaggctaa cagcatcaac cgaagcctgc tggccctggg 720tcactgcatc
tccctgctgc tggacccaca gcggaagcag agccacatcc ctttccggga 780cagcaagctc
accaagttgc tggcagactc actgggaggg cgcggggtca ccctcatggt 840ggcctgcgtg
tccccctcag cccagtgcct tcctgagact ctcagcaccc tgcgatatgc 900aagccgagct
cagcgggtca ccacccgacc acaggccccc aagtctcctg tggcaaagca 960gccccagcgt
ttggagacag agatgctgca gctccaggag gagaaccgtc gcctgcagtt 1020ccagctggac
caaatggact gcaaggcctc agggctcagt ggagcccggg tggcctgggc 1080ccagcggaac
ctgtacggga tgctacagga gttcatgcta gagaatgaga ggctcaggaa 1140agaaaagagc
cagctgcaga atagccgaga cctggcccag aatgagcagc gcatcctggc 1200ccagcaggtc
catgcactag agaggcgtct cctctctgcc tgctaccatc accagcaggg 1260tcctggcctg
accccaccgt gtccctgctt gatggcccca gctccccctt gccatgcact 1320gccacccctc
tactcctgcc cctgctgcca catctgccca ctgtgtcgag tgcccctggc 1380ccactgggcc
tgcctgccag gggagcacca cctgccccag gtgttggacc ctgaggcctc 1440aggtggcagg
cccccatctg cccggccccc accctgggca cccccatgca gccctggctc 1500tgccaagtgc
ccaagagaga ggagtcacag tgactggact cagacccgag tcctggcaga 1560gatgttgacg
gaggaggagg tggtaccttc tgcacctccc ctgcctgtga ggcccccgaa 1620gacatcacca
gggctcagag gtggggccgg ggttccaaac ctggcccaga gactggaggc 1680cctcagagac
cagattggca gctccctgcg acgtggccgc agccagccac cctgcagtga 1740gggcgcacgg
agcccaggcc aagtcctccc tccccattga aggccaagtg ggaacccagg 1800agactgctgt
gtgacctcag actgggctcc acactcttgg gcttcagtct gcccatctgc 1860tgaatggaga
cagcagctgc tactccacct gcagctgggc taggggcggg gactgggggt 1920gctatttagg
ggaacaaggg gattcaggag aaaccaggca gcaggggatg aaatacatga 1980ataaagagag
gcatcagctc caaaaaaaaa aaaaaaaaaa aaaa
2024563334DNAHomo sapiens 56ctccccctga gagaggctgg gcagcacccc ccttctgcca
ggagtgccag ccaaggtgcc 60agacccctgt ccagtggcaa gctggaaggc tttcagagca
tcgatgaagc tatagcctgg 120ctcaggaagg aactgacgga gatgcggctg caggaccagc
aactggccag acagctcatg 180cgcctgcgtg gcgacatcaa caagctgaaa atcgaacaca
cctgccgcct ccacaggagg 240atgctcaacg atgccaccta cgagctggag gagcgggatg
agctggccga cctcttctgt 300gactcccctc ttgcctcctc cttcagcctc tccacaccac
tcaagcttat tggcgtgacc 360aagatgaaca tcaactctcg gaggttctct ctctgctgag
gagccctcag actgggcgga 420ggggctggag cggagggctt gggctggagg ggtgtcagag
gaagctgagg ccaagttact 480ccagtgggtc tcccggaggc aggggtccct gggactggcg
actcaagggc cccaggacct 540attcagtggt gctctcccac ccaggggccc tgggtgtgga
tgccagtgtc tctgtgactg 600gctcttgctt actacccaaa gagctctgca gaagggccgc
tccaaccaag atgttaaagg 660agacctgggt tcccaccata atccatccct ccacggtcac
gttcctgttt cctggaatca 720ctggtgctat gaactgggat tcccaaaggg aggcccccca
acaaagctgt catttttgca 780gaaggctgtc ccgcaagggc cttgggggaa attaggcatg
tcagatgtgc ctgtctcacg 840tgctgttgct gtcctctaag tattgtctca aattcaccct
aagtacatga ctcagcaaca 900ttgacaggga gctactagga agggaaaatc gaaaggcatg
acaaatgggc acttggggac 960gcagccccag tggctggcag ccagtgtctc tggtgagcct
gacactacaa ggctgtgtaa 1020attgtaaatt ctggcgtgtg ctgggacatg tgatgggggc
actagcgtag cttgggtgca 1080acaagcacag atgtccccat tgtctcccct ggccacatgc
atctccaaag agcctcttca 1140ctgccaccca caccccaggg tgacagcctg ggagaccact
ggtgactgaa ccaggcaggt 1200cctgaaagca ttttccataa ctgaattctc ctgcaggggc
gtgaccgggg cctcctggtg 1260gattctggtg gtgtcacctt actgccctct ctggaaagac
aatctaggga gcccagaggc 1320ccatcctgag cctcctctga gattttgtgc ctgacctaaa
caactagttt taataagact 1380gttactgatg tgttgttcac ttgttagtaa ctgatttttg
tccaaatgcg gaagccactt 1440gtgtaggtca actacagtgc gtaggatttg attttaagag
tttctccctc ccaacaggct 1500tgaggatcag caagttaaga ccccagcagg ttagggaggt
cagtctgggg tcatacggca 1560tggcaggggt ccctcggcca gacccgtaga atcctgagat
aaggagtgtt tctgaccttt 1620ggtgtcatct agtcgagtcc tctcattagt aaaggagcaa
agtgaaacct gggggaggag 1680aaggacttcc ctcaggttgc acagctgttt aggctataga
atattgatgt gtgaaaccat 1740tattgataat gcctagtaga tcacatgtca atgaacttga
accccaaaga tggtcgtgat 1800gctttgccaa acccgcacac tgccaacccc tctactctcc
acctcagccc ccacccacat 1860ctcccagagt attgcaattc agaacatttg ggtcaaggtg
gagcaaggca ctgacagtgg 1920ccccacaggg catgtgtcac taatcactgt cccatggtct
acgcacggca tctggctgct 1980ctgtctactg tgacttcttc ctgtgtaatc tcagtggggc
ccgtgtccac ccacacatcg 2040tgacccacat aggggagagg ttgcttttct tttgtgggct
gagagtagga caatgcaaat 2100gaatgatctc tagtagacag aaaagaactt ggtctctttt
ttaaaatttc aaagagccag 2160aagttctatg cctccttcaa agtaggcaga acaacgcagc
caagatctac tgtctgccat 2220gctctgtgca atgaagtctg caggcctgag gaccatgtac
tgctgtcctt cctcagagct 2280ctgcacaaac actgccaagt cctgaagacg cattcctttc
ctgccaacct ctttccagat 2340aagcccttga ggtctcgggc tgacctacac acacacacac
acacacacac acacacacac 2400acacccccac acacacacac acacgacaga gaacatgcca
taaacatcct tgaacccatg 2460caggaaagcc catcccatat tctgaaaaaa tgccaaatta
ggtttttctt tctttttgga 2520aatcagtcat tacagtaacc gaaaccattg ggttcagcga
aaatggaaag atttagctga 2580atgtagtcag tccaattaag ttggatgcaa ctgagtgatt
tagttgcttg ggtaacccag 2640tgcttgcttg ctttcttcat tctctgggtg gaaactaaga
tcaagacaca tgtttgggga 2700taagttaaat gtctgagcta ttttgctcgg tttatcctaa
gagaacttta ttatgggatg 2760aggaggtgac ccaagatgag aagtggaggg ggacagcgat
gttttctaaa catcgtccag 2820tgttgactgg cttccttact ttgcacagtg aacacaacta
accacattaa ttcagctttg 2880tgaagtccct gctctctgtg ggttctatga gtcagcagca
acattggcct aacctccgtc 2940ccagcctcct ggctcaccac atgtgtacag tgctgtttgc
agttgtactc attatccatc 3000catctctctg ccatccccaa gcatcgctgg gtgtaaaacg
caaactctcc accgacactg 3060ccatgcgtgg tcatgtcttg atgccttcag gggctcagta
gctatcaaag aggcctggag 3120ggcctgggca ggcttgacga tgcctgaccg agttcaagac
ccacaccctg tagcaatacc 3180aagtgctatt acataatcaa tggacgattt atacttttat
tttttatgat tatttgtttc 3240tatattgctg ttagaaaaag tgaaataaaa atacttcaaa
agaaaaaaaa aaaaaaaaaa 3300aaaaaaaaaa aaaagaaaaa aaaaaaaaaa aaaa
333457573DNAHomo sapiensmisc_feature(567)..(569)a
or g or c or t/u 57tgaaggaccg cgatcctaaa gagattgaat gggacgacct ggcccagctg
cccttcctga 60ccatgtgcgt gaaggagagc ctgaggttac atcccccagc tcccttcatc
tcccgatgct 120gcacccagga cattgttctc ccagatggcc gagtcatccc caagggcatt
acctgcctca 180tcgatattat aggggtccat cacaacccaa ctgtgtggcc ggatcctgag
tctacgaccc 240cttccgcttt gacccagaga acagcaaggg gaggtcacct ctggctttta
attcccttct 300ccgcagggcc caggaactgc atcgggccag cgtttcccat ggcggagatg
aaagtggttc 360ctggcgttga tgctgctgca cttccggttc ctgccagacc acactgagcc
ccgcaggaag 420ctggaactga tcattgcggc cgagggcggg ctttggctgc gggtggagcc
cctgaatgta 480ggcttgcagt gactttctga cccatccacc tgtttttttg cagattgtca
tgaataaaac 540ggtgctgtca cctcaaaaaa aaaaaannna aaa
573582534DNAHomo sapiens 58gagtcctctc gttggtcccg gaggtggggt
tgcgctcaca aggggcgacc gtcgccacgg 60tggcggccac tgcatcgcgt cccacctccg
cggccctggg cgccgtggtg tcgacgggcc 120ccgagcctat gacgggccag ggccagtcgg
cgtccgggtc gtcggcgtgg agcacggtat 180tccgccacgt ccggtatgag aacctgatag
cgggcgtgag cggcggcgtc ttatccaacc 240ttgcgctgca tccgctcgac ctcgtgaaga
tccgcttcgc cgtgagtgat ggattggaac 300tgagaccgaa atataatgga attttacatt
gcttgactac catttggaaa cttgatggac 360tacggggact ttatcaagga gtaaccccaa
atatatgggg tgcaggttta tcctggggac 420tctacttttt cttttacaat gccatcaagt
catataaaac agaaggaaga gctgaacatt 480tagaggcaac agaatacctt gtctcagctg
ctgaagctgg agccatgacc ctctgcatta 540caaacccatt atgggtaaca aaaactcgcc
ttatgttaca gtatgatgct gttgttaact 600ccccacaccg acaatataaa ggaatgtttg
atacacttgt gaaaatatat aagtatgaag 660gtgtgcgtgg attatataag ggatttgttc
ctgggctgtt tggaacatcg catggtgccc 720ttcagtttat ggcatatgaa ttgctgaagt
tgaagtacaa ccagcatatc aatagattac 780cagaagccca gttgagcaca gtagaatata
tatctgttgc agcactatcc aaaatatttg 840ctgtcgcagc aacataccca tatcaagtcg
taagagctcg tcttcaggat caacacatgt 900tttacagtgg tgtaatagat gtaatcacaa
agacatggag gaaagaaggc gtcggtggat 960tttacaaggg aattgctcct aatttgatta
gagtgactcc agcctgctgt attacctttg 1020tggtatatga aaacgtctca cattttttac
ttgaccttag agaaaagaga aagtaagctc 1080aaagaggaca attccagtat atctgcccaa
ggcagcaaca agctcttttg tgtttaaggc 1140ataaaagaag aattctgcat agaaacatgg
ctcatattcg aaattgctct atagtcatta 1200gaagccagag aactgctaag tctcctgcaa
tgtttttctt gctttttgcc ttccccatat 1260atatggaact tggctacctc tgcctgaaat
ggctgccatc aacacaatgt taaaactgac 1320acgaaggata gagtttcaca gatttctacg
ttttattggt ggaagctgat ttgcaacatt 1380tgctaaatgg attagatgaa tgtacttctt
tttgtgagct tacttgcctg gattgcttta 1440aaattaacct ttgtgcaata ccaagaaaat
agctctttaa aagaatgtct ttgtatgtct 1500caaggtaaat taaggattta ctgaataagg
tgttgaccaa atccagacca ttttatttta 1560tttttttatt tatttatttt ttgagatgga
gtcttgcttt gtcgcccagg ctggagtgca 1620gtggcgtgat ctcagctcac tgcaacctcc
acctcccggg ttcacgccat tctcctgcct 1680cagcctcctg agtagctggg actacaggca
cctgccacca cgcctggcta actttttttt 1740atattttgag tagaaatggg gtttcaccat
gttagccagg atggtctcaa tctcctgacc 1800ttgtgatccg cctgccttgg cctcccaaag
tgctgggatt acaggcgtga gccactgcgc 1860ctggccagac cattttagaa ttgggaaatt
ttagtgagaa aaaatgcact gtaaatatgc 1920tttagtttta attcagttgg gatgcactac
ctagcgaaaa ttgagaaact atatacttct 1980cagagaaata tctgacatct attgtcattc
cattgctatt ttttttcccc agagacttcc 2040ataatttaaa ataaaatcct agatccagtt
cttgtttttt ggcataaata cttaatctat 2100tttaaattta taaaatctga gcttctagga
tccagctgtg tcaaccttta tttagcatat 2160ataactataa atcacttatt acagatgcta
aatagatcac cttttacaga tgctgaaatg 2220tttgggatat gtttgttgac aaggtaaatg
gaaatgagaa actttatact tcagttttca 2280gatatatgga tctagatccc aaataaatga
ttaatcttca ttggtttctc aaattcaggt 2340tgaaatacaa attaatagcc tttattgatt
ttacttttat gagtcattgt agacatctat 2400aaatataaaa gggcctgtac ccaaaggatg
ccagaatact agtattttta tttatcgtaa 2460acatccacga gtgctgttgc actaccatct
atttgttgta aataaaagtg ttgttttcaa 2520aaaaaaaaaa aaaa
2534591232DNAHomo sapiens 59ctagaggggc
ggaaagtaac aaggaggtgg gggtacaaat cctcagctcc tgcttccgca 60agcactaacc
tgctctgaag tgagccaggc agctctggcc atcttttccc agccacagaa 120tcaggtgatg
gtccagaatt aagagctgtc acctgtgtca ttcactcaca atggaagaaa 180tgaagaagac
tgccatccgg ctgcccaaag gcaaacagaa gcctataaag acggaatgga 240attcccggtg
tgtccttttc acctacttcc aaggggacat cagcagcgta gtggatgaac 300acttctccag
agctctgagc aatatcaaga gcccccagga attgaccccc tcgagtcaga 360gtgaaggtgt
gatgctgaaa aacgatgata gcatgtctcc aaatcagtgg cgttactcgt 420ctccatggac
aaagccacaa ccagaagtac ctgtcacaaa ccgtgccgcc aactgcaact 480tgcatgtgcc
tggtcccatg gctgtgaatc agttctcacc gtccctggct aggagggcct 540ctgttcggcc
tggggagctg tggcatttct cctccctggc gggcaccagc tccttagagc 600ctggctactc
tcatcccttc cccgctcggc acctggttcc agagccccag cctgatggga 660aacgtgagcc
tctcctaagt ctcctccagc aagacagatg cctagcccgt cctcaggaat 720ctgccgccag
ggagaatggc aaccctggcc agatagctgg aagcacaggg ttgctcttca 780acctgcctcc
cggctcagtt cactataaga aactatatgt atctcgtgga tctgccagta 840ccagccttcc
aaatgaaact ctttcagagt tagagacacc tgggaaatac tcacttacac 900caccaaacca
ctggggccac ccacatcgat acctgcagca tctttagtca agttggagga 960gaaagacaac
acttggtcta agacacggca gcaagacatc cctgcatatt gttccagata 1020aaaatgaaag
ctgctcacac ccacttgcct ccccaatctg ttaaacagct tcgtgtctag 1080tatgagctca
gtacttgccc tgtgaaaatc ccagaagccc ccgctgtcaa tgttccccat 1140ccacaccctg
cttgctcctg tgtaacagct cagatgatga ataataataa aactgtactt 1200ttttggatgg
tgaaaaaaaa aaaaaaaaaa aa
1232603551DNAHomo sapiens 60ttgccttgtg ttagctagca ataagaaaag aagctttgtt
tggattaaca tatataccct 60cttcattctg catacctatt ttttccccaa taatttgcag
cttaggtccg aggacaccac 120aaactctgct taaagggcct ggaggctctc aaggcatggc
cagacgctct gtcttgtact 180tcatcctgct gaatgctctg atcaacaagg gccaagcctg
cttctgtgat cactatgcat 240ggactcagtg gaccagctgc tcaaaaactt gcaattctgg
aacccagagc agacacagac 300aaatagtagt agataagtac taccaggaaa acttttgtga
acagatttgc agcaagcagg 360agactagaga atgtaactgg caaagatgcc ccatcaactg
cctcctggga gattttggac 420catggtcaga ctgtgaccct tgtattgaaa aacagtctaa
agttagatct gtcttgcgtc 480ccagtcagtt tgggggacag ccatgcactg agcctctggt
agcctttcaa ccatgcattc 540catctaagct ctgcaaaatt gaagaggctg actgcaagaa
taaatttcgc tgtgacagtg 600gccgctgcat tgccagaaag ttagaatgca atggagaaaa
tgactgtgga gacaattcag 660atgaaaggga ctgtgggagg acaaaggcag tatgcacacg
gaagtataat cccatcccta 720gtgtacagtt gatgggcaat gggtttcatt ttctggcagg
agagcccaga ggagaagtcc 780ttgataactc tttcactgga ggaatatgta aaactgtcaa
aagcagtagg acaagtaatc 840cataccgtgt tccggccaat ctggaaaatg tcggctttga
ggtacaaact gcagaagatg 900acttgaaaac agatttctac aaggatttaa cttctcttgg
acacaatgaa aatcaacaag 960gctcattctc aagtcagggg gggagctctt tcagtgtacc
aattttttat tcctcaaaga 1020gaagtgaaaa tatcaaccat aattctgcct tcaaacaagc
cattcaagcc tctcacaaaa 1080aggattctag ttttattagg atccataaag tgatgaaagt
cttaaacttc acaacgaaag 1140ctaaagatct gcacctttct gatgtctttt tgaaagcact
taaccatctg cctctagaat 1200acaactctgc tttgtacagc cgaatattcg atgactttgg
gactcattac ttcacctctg 1260gctccctggg aggcgtgtat gaccttctct atcagtttag
cagtgaggaa ctaaagaact 1320caggtttaac cgaggaagaa gccaaacact gtgtcaggat
tgaaacaaag aaacgcgttt 1380tatttgctaa gaaaacaaaa gtggaacata ggtgcaccac
caacaagctg tcagagaaac 1440atgaaggttc atttatacag ggagcagaga aatccatatc
cctgattcga ggtggaagga 1500gtgaatatgg agcagctttg gcatgggaga aagggagctc
tggtctggag gagaagacat 1560tttctgagtg gttagaatca gtgaaggaaa atcctgctgt
gattgacttt gagcttgccc 1620ccatcgtgga cttggtaaga aacatcccct gtgcagtgac
aaaacggaac aacctcagga 1680aagctttgca agagtatgca gccaagttcg atccttgcca
gtgtgctcca tgccctaata 1740atggccgacc caccctctca gggactgaat gtctgtgtgt
gtgtcagagt ggcacctatg 1800gtgagaactg tgagaaacag tctccagatt ataaatccaa
tgcagtagac ggacagtggg 1860gttgttggtc ttcctggagt acctgtgatg ctacttataa
gagatcgaga acccgagaat 1920gcaataatcc tgccccccaa cgaggaggga aacgctgtga
gggggagaag cgacaagagg 1980aagactgcac attttcaatc atggaaaaca atggacaacc
atgtatcaat gatgatgaag 2040aaatgaaaga ggtcgatctt cctgagatag aagcagattc
cgggtgtcct cagccagttc 2100ctccagaaaa tggatttatc cggaatgaaa agcaactata
cttggttgga gaagatgttg 2160aaatttcatg ccttactggc tttgaaactg ttggatacca
gtacttcaga tgcttaccag 2220acgggacctg gagacaaggg gatgtggaat gccaacggac
ggagtgcatc aagccagttg 2280tgcaggaagt cctgacaatt acaccatttc agagattgta
tagaattggt gaatccattg 2340agctaacttg ccccaaaggc tttgttgttg ctgggccatc
aaggtacaca tgccagggga 2400attcctggac accacccatt tcaaactctc tcacctgtga
aaaagatact ctaacaaaat 2460taaaaggcca ttgtcagctg ggacagaaac aatcaggatc
tgaatgcatt tgtatgtctc 2520cagaagaaga ctgtagccat cattcagaag atctctgtgt
gtttgacaca gactccaacg 2580attactttac ttcacccgct tgtaagtttt tggctgagaa
atgtttaaat aatcagcaac 2640tccattttct acatattggt tcctgccaag acggccgcca
gttagaatgg ggtcttgaaa 2700ggacaagact ttcatccaac agcacaaaga aagaatcctg
tggctatgac acctgctatg 2760actgggaaaa atgttcagcc tccacttcca aatgtgtctg
cctattgccc ccacagtgct 2820tcaagggtgg aaaccaactc tactgtgtca aaatgggatc
atcaacaagt gagaaaacat 2880tgaacatctg tgaagtggga actataagat gtgcaaacag
gaagatggaa atactgcatc 2940ctggaaagtg tttggcctag cacaattact gctaggccca
gcacaatgaa cagatttacc 3000atcccgaaga accaactcct acaaatgaga attcttgcac
aaacagcaga ctggcatgct 3060caaagttact gacaaaaatt attttctgtt agtttgagat
cattattctc ccctgactct 3120cctgtttggg catgtcttat tcagttccag ctcatgacgc
cctgtagcat acccctaggt 3180accaacttcc acagcagtct cgtaaattct cctgttcaca
ttgtacaaaa ataatgtgac 3240ttctgaggcc cttatgtagc ctgtgacatt aagcattctc
acaattagaa ataagaataa 3300aacccataat tttcttcaat gagttaataa acagaaatct
ccagaacctc tgaaacacat 3360tcttgaagcc cagctttcat atcttcattc aacaaataat
ttctgagtgt gtatacagga 3420tgtcaagtac tgaccaaagt cctgagaact cggcagataa
taaaacagac aaaagccttt 3480gccttcatga agcatacatt cattcagggg tagacacaca
aaaaatgaaa taaacaggta 3540aaatatgtag c
3551
<210> 61
<211> 1673
<212> DNA
<213> Homo sapiens
<400> 61
ctctcctcgc ccgctgggtg
ctgaagttgg gcggatggca gcaaaccggc tccgctagag 60gaccgagccg cccagccccg
ctcccccgga cccatcggcg cgctgcccac acctccaggc 120gaccggccaa ctgggtcctg
aagtagctga aatgcgaaaa aggcagcagt cccaaaatga 180aggaacacct gccgtgtctc
aagctcctgg aaaccagagg cccaacaaca cctgttgctt 240ttgttggtgc tgttgttgca
gctgctcctg cctcactgtg aggaatgaag aaagagggga 300aaatgcggga agacccacac
acactacaaa aatggagagt atccaggtcc tagaggaatg 360ccaaaacccc actgcagagg
aagtcttgtc ctggtctcaa aattttgaca agatgatgaa 420ggccccagca ggaagaaacc
ttttcagaga gttcctccga acagaataca gtgaagagaa 480cctacttttc tggcttgctt
gtgaagactt aaagaaggag cagaacaaaa aagtaattga 540agaaaaggct aggatgatat
atgaagatta catttctata ctatcaccaa aagaggtcag 600tcttgattct cgagttagag
aggtgatcaa tagaaatctg ttggatccca atcctcacat 660gtatgaagat gcccaacttc
agatatatac tttaatgcac agagattctt ttccaaggtt 720tttgaactct caaatttata
agtcatttgt tgaaagtact gctggctctt cttctgaatc 780ttaatgttca tttaaaaaca
atcattttgg agggctgaga tgggaaataa aagtagttaa 840ataacatcag aaactgagtt
cctggagaac tacagtttag cattcctcag gctactgtga 900aaacacaacc gttatggtct
ttgtctccat ttttatcaag gttttccatg gttaagtttg 960gagaaaatac cacacaaaac
aatgaattgc caaattgttt gttttattca agactcattc 1020tacttgcaag caaagtgtat
ttgtagtcct atgaacagtc tcctcgtgta tctccagaga 1080ctgcatgtgc aaagtaaaat
gcttcatttg ccacatagtt gttgtaatat ttaatccagt 1140agcataactt atatctgtat
ttaaggactt ttgtgcaata tggtcttaag aaataattgc 1200caaaaaaatc ggccatggtt
ctgcattttt aacataatct aagacagaaa aaaagcaatt 1260tttactatgt aacaatggta
ttcaacattc tatatactgt gtttagtaca ctaattttga 1320agccaatatt tctgtacatg
aaaaagagct atttatctct gtttgttgga aaatcctaat 1380ggggattcct ctggttgttc
actgccaaaa ctgtggcatt ttcattacag gagagtttac 1440tatgctaaaa gcaaaaaaca
aaaaaaaaaa aaaagggaag aaggaaaaaa gcaaaaaaca 1500atttgaagat atcctatctc
aatgacaaat caaaagagtg atattgcttt taactgtaat 1560agaagaaaat gaatttatgt
atatatcaga tgtccaatac tgtaattaat ttattaaaga 1620ctggctctcc agttttaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaa 1673
<210> 62
<211> 1867
<212> DNA
<213> Homo sapiens
<400> 62
aaaagaacca ggattgcatt tgaagttaag ctgcaaaaaa ccagtcgatc aacagatttt
60cgagtcccac agtcaatatg caccatgttt aatgttatgg ttgatgccaa agctcaatca
120acaaaacttt gcagcatgga aatgggccaa gagtttgcta aaatgtggca tcaataccat
180tcaaaaatag acgaactaat tgaagaaact gttaaagaaa tgataacact cttggttgca
240aagttcgtta ctatcttgga aggagtgctg gcaaaattat ccagatatga cgaagggact
300ttgttttctt cttttctgtc atttaccgtg aaggcagctt ccaaatatgt ggatgtacct
360aaacccggga tggacgtggc cgacgcctac gtgactttcg tccgccattc tcaggatgtc
420ctgcgtgata aggtcaatga ggagatgtac atagaaaggt tatttgatca atggtacaac
480agctccatga acgtgatctg cacctggttg acggaccgga tggacttaca gcttcatatt
540tatcagttga aaacactaat taggatggta aagaaaacct acagagattt ccgattgcaa
600ggggtcctgg actccacctt aaacagcaag acctatgaaa cgatccggaa ccgtctcact
660gtggaggaag ccacagcatc agtgagtgaa ggtgggggac tgcagggcat cagcatgaag
720gacagcgatg aggaagacga agaagacgat tagaccattt ggtcctagag tctgctggga
780cagagtcctg taatcagtgc atgtccttag tctgttagtt aaacccatta ggaattttct
840gtcaactacc atgcccatga gatgtttatc aatacaactg ccattttagc tatgtggtac
900caagattagc aaatgacctt catatccact gatttcctga tgtccatgtc tatatgttta
960caagcaatat ggagcaccat tctttaaata ctgttcatgg agaatacata gtctaaccac
1020taggcgtgtc cctgttatca gcaaagatca atgatgcttc attcatgtac tatgtatgca
1080ttggtggtaa atggatgtga gggcaagtac atcaagtaca ttcactctgt ttcacgtatg
1140tggatgccag ttaattaaat gagtacgtaa ataaattaat taaaacacat agatctgctt
1200tgtgttttta tttttatttt ttgaaaaaca aaaggcaagt ctccaacaat taacttttga
1260tgctttctgt tcccctaaaa ccaaaaaatg aaccccttgt gtcgttgtta acccatcctt
1320tcatttactc atataattag ccaaaaaaaa aaggatggct acataccaat ggattgattc
1380tcttaattgc cacggcaagg gggcgatcct atcatgactt aacatcaagc gcgcagttca
1440aaactactgt cttctgtcaa agttttctcc tcttaaatgt tattttgctt ttacgtctca
1500actgtgtatg taaaaaaaac gaatatttaa attacaaccc tagactaaaa atgtgtttat
1560aataagatgt ggatatttcc ttcagtagat tgtaaccata atttaaatta ttttgttcca
1620cactgttttt tatatctgtc atgtacattg cattttgatc tgtaactgca caaccctggg
1680gtttgctgca gagctatttc tttccatgta aagtagtgga tccatcttgc ttttgcctta
1740tataaagcct acagttatgg aagtgtggaa aactgtggct tctcaataaa tattcagatg
1800tcctaagaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
1860aaaaaaa
1867
<210> 63
<211> 601
<212> DNA
<213> Homo sapiens
<400> 63
acctgaactg tctaagatat tctaagcaaa gttgacaaag
acaattctcc acttgagccc 60ttaaaaatgt aaccactata aaggtttcac gcggtggttc
ttattgattc gctgtgtcat 120cacatcagct ccactgttgc caaactttgt cgcatgcata
atgtatgatg gaggcttgga 180tgggaatatg ctgattttgt tctgcactta aaggcttctc
ctcctggagg gctgcctagg 240gccacttgct tgatttatca tgagagaaga ggagagagag
agagactgag cgctaggagt 300gtgtgtatgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtat
gtgtgtagcg ggagatgtgg 360gcggagcgag agcaaaagga ctgcggcctg atgcatgctg
gaaaaagaca cgcttttcat 420ttctgatcag ttgtacttca tcctatatca gcacagctgc
catacttcga cttatcagga 480ttctggctgg tggcctgcgc gagggtgcag tcttacttaa
aagactttca gttaattctc 540actggtatca tcgcagtgaa cttaaagcaa agacctctta
gtaaaaaata aaaaaaataa 600a
601
<210> 64
<211> 367
<212> DNA
<213> Homo sapiens
<400> 64
gcttctcttt aaaattgacc
caaggcatga gccactgcgc ctggccagca aatgcttttt 60gtgcagaata cacttctttc
aggcattgtc aggtgctgtt ttgtttaagc tctaactcac 120ccctggaata caggggaatg
atgacaacca gcccagccag gcctgactca tcatggtcac 180atccagcccc cacccccggc
caactaacca ctgcaggctc ctcttccaga ctcaccaggg 240ggcctcgagg ccccggcatc
tcccttggcc ctgggtgtgg gttttacaag actgtgtctt 300tcatgacatc atagcccaac
catgtgagaa gaaggagaag gccccccttt cttcattaat 360ctgaaaa
367
<210> 65
<211> 2484
<212> DNA
<213> Homo sapiens
<400> 65
ggcacgagga agggcctgtg ggtttattat aaggcggagc tcggcgggag aggtgcgggc
60cgaatccgag ccgagcggag aggaatccgg cagtagagag cggactccag ccggcggacc
120ctgcagccct cgcctgggac agcggcgcgc tgggcaggcg cccaagagag catcgagcag
180cggaacccgc gaagccggcc cgcagccgcg acccgcgcag cctgccgctc tcccgccgcc
240ggtccgggca gcatgaggcg cgcggcgctc tggctctggc tgtgcgcgct ggcgctgagc
300ctgcagccgg ccctgccgca aattgtggct actaatttgc cccctgaaga tcaagatggc
360tctggggatg actctgacaa cttctccggc tcaggtgcag gtgctttgca agatatcacc
420ttgtcacagc agaccccctc cacttggaag gacacgcagc tcctgacggc tattcccacg
480tctccagaac ccaccggcct ggaggctaca gctgcctcca cctccaccct gccggctgga
540gaggggccca aggagggaga ggctgtagtc ctgccagaag tggagcctgg cctcaccgcc
600cgggagcagg aggccacccc ccgacccagg gagaccacac agctcccgac cactcatcag
660gcctcaacga ccacagccac cacggcccag gagcccgcca cctcccaccc ccacagggac
720atgcagcctg gccaccatga gacctcaacc cctgcaggac ccagccaagc tgaccttcac
780actccccaca cagaggatgg aggtccttct gccaccgaga gggctgctga ggatggagcc
840tccagtcagc tcccagcagc agagggctct ggggagcagg acttcacctt tgaaacctcg
900ggggagaata cggctgtagt ggccgtggag cctgaccgcc ggaaccagtc cccagtggat
960cagggggcca cgggggcctc acagggcctc ctggacagga aagaggtgct gggaggggtc
1020attgccgtag gcctcgtggg gctcatcttt gctgtgtgcc tggtgggttt catgctgtac
1080cgcatgaaga agaaggacga aggcagctac tccttggagg agccgaaaca agccaacggc
1140ggggcctacc agaagcccac caaacaggag gaattctatg cctgacgcgg gagccatgcg
1200ccccctccgc cctgccactc actaggcccc cacttgcctc ttccttgaag aactgcaggc
1260cctggcctcc cctgccacca ggccacctcc ccagcattcc agcccctctg gtcgctcctg
1320cccacggagt cgtggggtgt gctgggagct ccactctgct tctctgactt ctgcctggag
1380acttagggca ccaggggttt ctcgcatagg acctttccac cacagccagc acctggcatc
1440gcaccattct gactcggttt ctccaaactg aagcagcctc tccccaggtc cagctctgga
1500ggggaggggg atccgactgc tttggaccta aatggcctca tgtggctgga agatcctgcg
1560ggtggggctt ggggctcaca cacctgtagc acttactggt aggaccaagc atcttggggg
1620ggtggccgct gagtggcagg ggacaggagt ccactttgtt tcgtggggag gtctaatcta
1680gatatcgact tgtttttgca catgtttcct ctagttcttt gttcatagcc cagtagacct
1740tgttacttct gaggtaagtt aagtaagttg attcggtatc cccccatctt gcttccctaa
1800tctatggtcg ggagacagca tcagggttaa gaagactttt tttttttttt tttttaaact
1860aggagaacca aatctggaag ccaaaatgta ggcttagttt gtgtgttgtc tcttgagttt
1920gtcgctcatg tgtgcaacag ggtatggact atctgtctgg tggccccgtt tctggtggtc
1980tgttggcagg ctggccagtc caggctgccg tggggccgcc gcctctttca agcagtcgtg
2040cctgtgtcca tgcgctcagg gccatgctga ggcctgggcc gctgccacgt tggagaagcc
2100cgtgtgagaa gtgaatgctg ggactcagcc ttcagacaga gaggactgta gggagggcgg
2160caggggcctg gagatcctcc tgcagaccac gcccgtcctg cctgtggcgc cgtctccagg
2220ggctgcttcc tcctggaaat tgacgagggg tgtcttgggc agagctggct ctgagcgcct
2280ccatccaagg ccaggttctc cgttagctcc tgtggcccca ccctgggccc tgggctggaa
2340tcaggaatat tttccaaaga gtgatagtct tttgcttttg gcaaaactct acttaatcca
2400atgggttttt ccctgtacag tagattttcc aaatgtaata aactttaata taaagtaaaa
2460aaaaaaaaaa aaaaaaaaaa aaaa
2484
<210> 66
<211> 1989
<212> DNA
<213> Homo sapiens
<400> 66
cggatgggga aaaaaaaaga tgtcagctcc tccgctgtag
tattgctcct taaaaacccc 60tctctctgaa aatgacatgc cctcgcaatg taactccgaa
ctcgtacgcg gagcccttgg 120ctgcgcccgg cggaggagag cgctatagcc ggagcgcagg
catgtatatg cagtctggga 180gtgacttcaa ttgcggggtg atgaggggct gcgggctcgc
gccctcgctc tccaagaggg 240acgagggcag cagccccagc ctcgccctca acacctatcc
gtcctacctc tcgcagctgg 300actcctgggg cgaccccaaa gccgcctatc gcctggaaca
acctgttggc aggccgctgt 360cctcctgctc ctacccacct agtgtcaagg aggagaatgt
ctgctgcatg tacagcgcag 420agaagcgggc gaaaagtggc cccgaggcag ctctctactc
ccaccccttg ccggagtcct 480gccttgggga gcacgaggta cccgtgccca gctactaccg
cgccagcccg agctactccg 540cgctggacaa gacgccccac tgttctgggg ccaacgactt
cgaagcccct ttcgagcagc 600gggccagtct caacccgcgc gccgaacatc tggaatcgcc
tcagctgggg ggcaaagtga 660gtttccctga gacccccaag tccgacagcc agacccccag
ccccaatgaa atcaagacgg 720agcagagcct ggcgggccct aaagggagcc cctcggagag
cgaaaaggag agggccaaag 780ctgccgactc cagcccagac acctcggata acgaagcgaa
agaggagata aaggcagaaa 840acaccacagg aaattggctg acagcaaaga gcggaaggaa
gaagaggtgc ccctatacta 900aacaccagac gctggaattg gagaaagaat ttctgttcaa
tatgtatttg acgcgagagc 960gccgcctgga gattagcaag accattaacc ttacagacag
acaagtcaaa atctggtttc 1020aaaatcgcag aatgaaactc aagaaaatga accgagagaa
tcggatccgg gaactgacct 1080ccaattttaa tttcacctga gagcgcggcc tctcctcctc
ccttcccgct ccttcctctc 1140cccgcccctc ctccctttgt gcctggtgat atattttttt
ttcctccctg agtataaatg 1200caatgcgact gcaaaaaagg caaagacctc agactctcct
tccaagggac ctgtggttcg 1260tgctgcgaag atgcttccac ttaaagcatg agaaatgggg
tgccgggatg tggggtgtgg 1320tgtgtgccct catagatggg ggtgggagtg tggctggtgt
gtgtgtcaaa ccctcactca 1380cccacgcact cacacacagc attctgttct ccatgcaaag
ttaagatcga atccatccgc 1440ttgtagggga aaaaaaggaa aaaaattaac cagagagggt
ctgtaatctc gcagagcaca 1500ggcagaatcg ttccttcctt gctgcatttc ctccttagac
taatagacgt tttggaaagt 1560tcggctagtg ttcgtgtgtt tgtcgtagca cccagagcct
ccaccaaacc ctctccatgt 1620ctttacctcc cagtcgctct aagaatctgc ttgaagtctc
gtatttgtac tgctttctgc 1680ttttctccca cccctcctag cacccccaca tcccccatct
agtaacatct cagaaatttc 1740atccagagga acaaaaaaat taaaaataga acatagcaaa
gcaaagacag aatgcccccc 1800cccaaatatt gtcctgtccc tgtctgggag ttgtgttatt
taaagatatt ctgtatgttg 1860tatcttttgc atgtagcttc cttaatggag aaaaaaaaat
cctaataaat ttccagaatc 1920ataatcctca aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa 1980aaaaaaaaa
1989
<210> 67
<211> 2125
<212> DNA
<213> Homo sapiens
<400> 67
gcagtggcca cgagaggcag
gctggctggg acatgaggtt ggcagagggc aggcaagctg 60gcccttggtg ggcctcgtcc
tgagcactcg gaggcactcc tatgcttgga aagctcgcta 120tgctgctgtg ggtccagcag
gcgctgctcg ccttgctcct ccccacactc ctggcacagg 180gagaagccag gaggagccga
aacaccacca ggcccgctct gctgaggctg tcggattacc 240ttttgaccaa ctacaggaag
ggtgtgcgcc ccgtgaggga ctggaggaag ccaaccaccg 300tatccattga cgtcattgtc
tatgccatcc tcaacgtgga tgagaagaat caggtgctga 360ccacctacat ctggtaccgg
cagtactgga ctgatgagtt tctccagtgg aaccctgagg 420actttgacaa catcaccaag
ttgtccatcc ccacggacag catctgggtc ccggacattc 480tcatcaatga gttcgtggat
gtggggaagt ctccaaatat cccgtacgtg tatattcggc 540atcaaggcga agttcagaac
tacaagcccc ttcaggtggt gactgcctgt agcctcgaca 600tctacaactt ccccttcgat
gtccagaact gctcgctgac cttcaccagt tggctgcaca 660ccatccagga catcaacatc
tctttgtggc gcttgccaga aaaggtgaaa tccgacagga 720gtgtcttcat gaaccaggga
gagtgggagt tgctgggggt gctgccctac tttcgggagt 780tcagcatgga aagcagtaac
tactatgcag aaatgaagtt ctatgtggtc atccgccggc 840ggcccctctt ctatgtggtc
agcctgctac tgcccagcat cttcctcatg gtcatggaca 900tcgtgggctt ctacctgccc
cccaacagtg gcgagagggt ctctttcaag attacactcc 960tcctgggcta ctcggtcttc
ctgatcatcg tttctgacac gctgccggcc actgccatcg 1020gcactcctct cattggtgtc
tactttgtgg tgtgcatggc tctgctggtg ataagtttgg 1080ccgagaccat cttcattgtg
cggctggtgc acaagcaaga cctgcagcag cccgtgcctg 1140cttggctgcg tcacctggtt
ctggagagaa tcgcctggct actttgcctg agggagcagt 1200caacttccca gaggccccca
gccacctccc aagccaccaa gactgatgac tgctcagcca 1260tgggaaacca ctgcagccac
atgggaggac cccaggactt cgagaagagc ccgagggaca 1320gatgtagccc tcccccacca
cctcgggagg cctcgctggc ggtgtgtggg ctgctgcagg 1380agctgtcctc catccggcaa
ttcctggaaa agcgggatga gatccgagag gtggcccgag 1440actggctgcg cgtgggctcc
gtgctggaca agctgctatt ccacatttac ctgctggcgg 1500tgctggccta cagcatcacc
ctggttatgc tctggtccat ctggcagtac gcttgagtgg 1560gtacagccca gtggaggagg
gggtacagtc ctggttaggt ggggacagag gatttctgct 1620taggcccctc aggacccagg
gaatgccagg gacattttca agacacagac aaagtcccgt 1680gccctgtttc caatgccaat
tcatctcagc aatcacaagc caaggtctga acccttccac 1740caaaaactgg gtgttcaagg
cccttacacc cttgtcccac ccccagcagc tcaccatggc 1800tttaaaacat gctctcttag
atcaggagaa actcgggcac tccctaagtc cactctagtt 1860gtggactttt ccccattgac
cctcacctga ataagggact ttggaattct gcttctcttt 1920cacaactttg cttttaggtt
gaaggcaaaa ccaactctct actacacagg cctgataact 1980ctgtacgagg cttctctaac
ccctagtgtc ttttttttct tcacctcact tgtggcagct 2040tccctgaaca ctcatccccc
atcagatgat gggagtggga agaataaaat gcagtgaaac 2100cctaaaaaaa aaaaaaaaaa
aaaaa 2125
<210> 68
<211> 574
<212> DNA
<213> Homo sapiens
<400> 68
tcttcgctcc tctaccccat aaaattccct acaaatgcaa aaattcgaga tagaagaagc
60cgtccctgaa attgctgtct aacattcacc ggaaacctct ccataaacaa ggagaaacga
120atgcacacgc atttttgcta agaagcccgg gattaagatt taaggataca agctgaaaga
180aaaaatgaaa aatgcttctc cgcgcgtcaa tcgaggggtg gatgcgccac gcagctgagc
240ccagctcaca gccacgcgta agaccaaaag ctgccatggg ttctgcgcgc ggagacctca
300gagccgaaga gagaagtccc cgcgtcagaa acgctgcgga tgccaggtct tgaaaatgct
360gacttctgag gctaagaatt atttcaaaga caaaaagaaa agactggtga ggaggccttc
420cggtgcaagg gcgcctatcc gctaattttg gatggggaag tagggattat tcgtttaaat
480tcaatcgcga gcaccaagtc ggactggccg gggatggaga agggcaaccc ccacctttag
540aaaaataaaa gatctcgaag gccaaaaaaa aaaa
574
<210> 69
<211> 3697
<212> DNA
<213> Homo sapiens
<400> 69
agggagtgtt cccgggggag atactccagt cgtagcaaga
gtctcgacca ctgaatggaa 60gaaaaggact tttaaccacc attttgtgac ttacagaaag
gaatttgaat aaagaaaact 120atgatacttc aggcccatct tcactccctg tgtcttctta
tgctttattt ggcaactgga 180tatggccaag aggggaagtt tagtggaccc ctgaaaccca
tgacattttc tatttatgaa 240ggccaagaac cgagtcaaat tatattccag tttaaggcca
atcctcctgc tgtgactttt 300gaactaactg gggagacaga caacatattt gtgatagaac
gggagggact tctgtattac 360aacagagcct tggacaggga aacaagatct actcacaatc
tccaggttgc agccctggac 420gctaatggaa ttatagtgga gggtccagtc cctatcacca
tagaagtgaa ggacatcaac 480gacaatcgac ccacgtttct ccagtcaaag tacgaaggct
cagtaaggca gaactctcgc 540ccaggaaagc ccttcttgta tgtcaatgcc acagacctgg
atgatccggc cactcccaat 600ggccagcttt attaccagat tgtcatccag cttcccatga
tcaacaatgt catgtacttt 660cagatcaaca acaaaacggg agccatctct cttacccgag
agggatctca ggaattgaat 720cctgctaaga atccttccta taatctggtg atctcagtga
aggacatggg aggccagagt 780gagaattcct tcagtgatac cacatctgtg gatatcatag
tgacagagaa tatttggaaa 840gcaccaaaac ctgtggagat ggtggaaaac tcaactgatc
ctcaccccat caaaatcact 900caggtgcggt ggaatgatcc cggtgcacaa tattccttag
ttgacaaaga gaagctgcca 960agattcccat tttcaattga ccaggaagga gatatttacg
tgactcagcc cttggaccga 1020gaagaaaagg atgcatatgt tttttatgca gttgcaaagg
atgagtacgg aaaaccactt 1080tcatatccgc tggaaattca tgtaaaagtt aaagatatta
atgataatcc acctacatgt 1140ccgtcaccag taaccgtatt tgaggtccag gagaatgaac
gactgggtaa cagtatcggg 1200acccttactg cacatgacag ggatgaagaa aatactgcca
acagttttct aaactacagg 1260attgtggagc aaactcccaa acttcccatg gatggactct
tcctaatcca aacctatgct 1320ggaatgttac agttagctaa acagtccttg aagaagcaag
atactcctca gtacaactta 1380acgatagagg tgtctgacaa agatttcaag accctttgtt
ttgtgcaaat caacgttatt 1440gatatcaatg atcagatccc catctttgaa aaatcagatt
atggaaacct gactcttgct 1500gaagacacaa acattgggtc caccatctta accatccagg
ccactgatgc tgatgagcca 1560tttactggga gttctaaaat tctgtatcat atcataaagg
gagacagtga gggacgcctg 1620ggggttgaca cagatcccca taccaacacc ggatatgtca
taattaaaaa gcctcttgat 1680tttgaaacag cagctgtttc caacattgtg ttcaaagcag
aaaatcctga gcctctagtg 1740tttggtgtga agtacaatgc aagttctttt gccaagttca
cgcttattgt gacagatgtg 1800aatgaagcac ctcaattttc ccaacacgta ttccaagcga
aagtcagtga ggatgtagct 1860ataggcacta aagtgggcaa tgtgactgcc aaggatccag
aaggtctgga cataagctat 1920tcactgaggg gagacacaag aggttggctt aaaattgacc
acgtgactgg tgagatcttt 1980agtgtggctc cattggacag agaagccgga agtccatatc
gggtacaagt ggtggccaca 2040gaagtagggg ggtcttcctt gagctctgtg tcagagttcc
acctgatcct tatggatgtg 2100aatgacaacc ctcccaggct agccaaggac tacacgggct
tgttcttctg ccatcccctc 2160agtgcacctg gaagtctcat tttcgaggct actgatgatg
atcagcactt atttcggggt 2220ccccatttta cattttccct cggcagtgga agcttacaaa
acgactggga agtttccaaa 2280atcaatggta ctcatgcccg actgtctacc aggcacacag
agtttgagga gagggagtat 2340gtcgtcttga tccgcatcaa tgatgggggt cggccaccct
tggaaggcat tgtttcttta 2400ccagttacat tctgcagttg tgtggaagga agttgtttcc
ggccagcagg tcaccagact 2460gggataccca ctgtgggcat ggcagttggt atactgctga
ccacccttct ggtgattggt 2520ataattttag cagttgtgtt tatccgcata aagaaggata
aaggcaaaga taatgttgaa 2580agtgctcaag catctgaagt caaacctctg agaagctgaa
tttgaaaagg aatgtttgaa 2640tttatatagc aagtgctatt tcagcaacaa ccatctcatc
ctattacttt tcatctaacg 2700tgcattataa ttttttaaac agatattccc tcttgtcctt
taatatttgc taaatatttc 2760ttttttgagg tggagtcttg ctctgtcgcc caggctggag
tacagtggtg tgatcccagc 2820tcactgcaac ctccgcctcc tgggttcaca tgattctcct
gcctcagctt cctaagtagc 2880tgggtttaca ggcacccacc accatgccca gctaattttt
gtatttttaa tagagacggg 2940gtttcgccat ttggccaggc tggtcttgaa ctcctgacgt
caagtgatct gcctgccttg 3000gtctcccaat acaggcatga accactgcac ccacctactt
agatatttca tgtgctatag 3060acattagaga gatttttcat ttttccatga catttttcct
ctctgcaaat ggcttagcta 3120cttgtgtttt tcccttttgg ggcaagacag actcattaaa
tattctgtac attttttctt 3180tatcaaggag atatatcagt gttgtctcat agaactgcct
ggattccatt tatgtttttt 3240ctgattccat cctgtgtccc cttcatcctt gactcctttg
gtatttcact gaatttcaaa 3300catttgtcag agaagaaaaa cgtgaggact caggaaaaat
aaataaataa aagaacagcc 3360ttttccctta gtattaacag aaatgtttct gtgtcattaa
ccatctttaa tcaatgtgac 3420atgttgctct ttggctgaaa ttcttcaact tggaaatgac
acagacccac agaaggtgtt 3480caaacacaac ctactctgca aaccttggta aaggaaccag
tcagctggcc agatttcctc 3540actacctgcc atgcatacat gctgcgcatg ttttcttcat
tcgtatgtta gtaaagtttt 3600ggttattata tatttaacat gtggaagaaa acaagacatg
aaaagagtgg tgacaaatca 3660agaataaaca ctggttgtag tcagttttgt ttgttaa
3697
<210> 70
<211> 2530
<212> DNA
<213> Homo sapiens
<400> 70
aaatccttct tccaatgttc
ctcccctctc tgtatgaacc ctgtgttggg gggcagaaga 60tggaagccct tggcaagctc
gatcgaacca agctactaaa ttgctgagct cgttttaact 120gaagtgtgag aaggaggttt
aaggcaagta gacaacatcc tgttgttggg gtgcttctct 180cttttttgca catctggctg
aactgggagt caggtggttg acttgtgcct ggctgcagta 240gcagcggcat ctcccttgca
cagttctcct cctcggcctg cccaagagtc caccaggcca 300tggacgcagt ggctgtgtat
catggcaaaa tcagcaggga aaccggcgag aagctcctgc 360ttgccactgg gctggatggc
agctatttgc tgagggacag cgagagcgtg ccaggcgtgt 420actgcctatg tgtgctgtat
cacggttaca tttatacata ccgagtgtcc cagacagaaa 480caggttcttg gagtgctgag
acagcacctg gggtacataa aagatatttc cggaaaataa 540aaaatctcat ttcagcattt
cagaagccag atcaaggcat tgtaatacct ctgcagtatc 600cagttgagaa gaagtcctca
gctagaagta cacaaggtac tacagggata agagaagatc 660ctgatgtctg cctgaaagcc
ccatgaagaa aaataaaaca ccttgtactt tattttctat 720aatttaaata tatgctaagt
cttatatatt gtagataata cagttcggtg agctacaaat 780gcatttctaa agccattgta
gtcctgtaat ggaagcatct agcatgtcgt caaagctgaa 840atggactttt gtacatagtg
aggagctttg aaacgaggat tgggaaaaag taattccgta 900ggttattttc agttattata
tttacaaatg ggaaacaaaa ggataatgaa tactttataa 960aggattaatg tcaattcttg
ccaaatataa ataaaaataa tcctcagttt ttgtgaaaag 1020ctccattttt agtgaaatat
tattttatag ctactaattt taaaatgtct tgcttgattg 1080tatggtggga agttggctgg
tgtcccttgt ctttgccaag ttctccacta gctatggtgt 1140cataggctct tttgggattt
ttgaagctgt atactgtgtg ctaaaacaag cactaaacaa 1200agagtgaagg atttatgttt
aattctgaaa gcaaccttct tgcctagtgt tctgatattg 1260gacagtaaaa tccacagacc
aacctggagt tgaaaatctt ataatttaaa atatgctcta 1320aacatgttta tcgtatttga
tgctacagga tttgaaattg tattacaaat ccaatgaaat 1380gagtttttct tttcatttac
ctctgcccca gttgtttcta ctacatggaa gacctcattt 1440tgaagggaaa tttcagcagc
tgcagctcat gagtaactga tttgtaacaa gcctcctttt 1500aaagtaaccc tacaaaacca
ctggaaagtt tatggttgta ttatttttta aaaaaattcc 1560aagtgattga aacctacacg
agatacagaa ttttatgcgg cattttcttc tcacatttat 1620atttttgtga ttttgtgatt
gattatatgt cactttgcta cagggctcac agaattcatt 1680cactcaacaa acataatagg
gcgctgaggg catagaagta aaaacacctg gtccctgctc 1740tcagttcact gtcttgttgg
acgagaaaag aaacaataac gataaaagac agtgaaagaa 1800aataacgata aaagacagtg
aaagaaaata acaataaaag acaaggaaaa aataacaatg 1860aaagttgata agtacatgat
aagcgaggtt ccccgtgtgt aggtagatct ggtctttaga 1920ggcagataga taggtcagtg
caaatactct ggtccatggg ccatatgaaa aggctaagct 1980tcactgtaaa ataataactg
ggaattctgg attgtgtatg ggtgttggtg aacttggttt 2040taattagtga actgctgaga
gacagagcta ttctccatgt actggcaaga cctgatttct 2100gagcatttaa tatggatgcc
gtgggagtac aaaagtggag tgtggcctga gtaatgcatt 2160atgggtggtt taccatttct
tgaggtaaaa gcatcacatg aacttgtaaa ggaatttaaa 2220aatcctactt tcataataag
ttgcataggt ttaataattt ttaattatat ggcttgagtt 2280taaattgtaa taggcgtaac
taattttaac tctataatgt gttcattctg gaataatcct 2340aaacatatga attatgtttg
catgttcact tccaagagcc tttttttgaa aaaaagcttt 2400ttttgaatca tcaagtcttt
cacatttaaa taaagtgttt gaaagcttta tttaaaaaaa 2460aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2520aagaaaaaaa
2530
<210> 71
<211> 601
<212> DNA
<213> Homo sapiens
<400> 71
aattgttttc taagtaattg ctgcctctat tatggcactt catttttgca ctgtcttttg
60agattcaaga aaaatttcta ttcttttttt tgcatccaat tgtgcctgaa cttttaaaat
120atgtaaatgc tgccatgttc caaacccatc gtcagtgtgt gtgtttagag ctgtgcaccc
180tagaaacaac atattgtccc atgagcaggt gcctgagaca cagacccctt tgcattcaca
240gagaggtcat tggttataga gacttgaatt aataagtgac attatgccag tttctgttct
300ctcacaggtg ataaacaatg ctttttgtgc actacatact cttcagtgta gagctcttgt
360tttatgggaa aaggctcaaa tgccaaattg tgtttgatgg attaatatgc ccttttgccg
420atgcatacta ttactgatgt gactcggttt tgtcgcagct ttgctttgtt taatgaaaca
480cacttgtaaa cctcttttgc actttgaaaa agaatccagc gggatgctcg agcacctgta
540aacaattttc tcaacctatt tgatgttcaa ataaagaatt aaactaaaaa aaaaaaaaaa
600a
601
<210> 72
<211> 1286
<212> DNA
<213> Homo sapiens
<400> 72
ggcgccgcgg acgctgctgg agtcgcctgg caacgatgtc
gcctggcaac tgaataggtt 60ggccagtggc gcgggctact ggaagcagaa agggctgcgg
aggcagtgag tggtttctgc 120agagcttcat ttggaaaggc ctctgtagtt ggggaaagat
ggcccattcc cagaactcct 180tggagcttcc cattaacatc aatgccaccc agattaccac
tgcctatggc catcgggccc 240tgcccaagct gaaggaggag ctgcagtcag aggacctcca
gacgaggcag aaagccctca 300tggccctgtg tgacctcatg catgaccccg agtgtatcta
caaggccatg aacataggct 360gtatggagaa cctgaaagct ttgctgaagg atagcaacag
tatggtgcgc ataaagacca 420ccgaggtgct ccacatcacg gcaagccata gcgtgggcag
atacgccttt ctagagcacg 480acatcgtcct tgccctgtcc ttcctgctga atgaccccag
cccagtctgc cgggggaacc 540tgtacaaggc atacatgcag ctggtccagg tgcctagagg
ggcccaagag atcatcagca 600aaggtctgat ttcctcactg gtatggaagc tgcaggtgga
ggtggaggag gaggagttcc 660aggagttcat cctggacaca ctggtcctct gcctgcagga
ggatgccacc gaggccctgg 720gcagcaatgt ggtgcttgtc ctgaagcaga agctcctcag
cgccaaccag aacatccgca 780gcaaggccgc ccgtgcgctc cttaatgtca gcatatctcg
agagggcaag aaacaggtgt 840gtcattttga cgtcatcccc atcctggtcc atctgctgaa
agacccagtg gagcatgtga 900agtctaacgc tgccggtgcc ctgatgttcg ccacagtgat
cactgaaggg aagtatgcgg 960ccctggaggc acaagccatc ggcctgctcc tggagctgct
gcactccccc atgaccatag 1020cgcgcctgaa tgccaccaag gcccttacca tgctggcaga
ggcccccgag ggccgcaagg 1080ccctgcagac gcacgtgccc actttccgtg ccatggaggt
ggagacttac gaaaagcctc 1140aagtggccga agccttacag cgggcagccc ggatcgccat
cagtgtcatc gagttcaaac 1200cctgagccct tcattcacct ctgtgagtga ataaatgtgc
taagtctctt taaaaaaaaa 1260aaaaaaaaaa aaaaaaaaaa aaaaaa
1286
<210> 73
<211> 2651
<212> DNA
<213> Homo sapiens
<400> 73
agagcagtaa gcttgtgata
aaggccaatt ccaggtagct cttgaaggtg atagccatct 60actttccagt ggctgccaac
cacagggagt gccagttaac actggaagga ttaaggcaag 120gtcccttctc ttgagactcc
cctctgagat ctgaaaaatg aagtggctta ggaacatcag 180cagtgaagaa ctgccaagag
ttggtgaagg ttgtctcttc cgagggcctt ctgaagacag 240ggctcttgaa cagacaagtg
gaagggctgt accagggata aaggaaagaa gtgcctgtcc 300agcagggagc ttgaatttaa
gttccatgta tgaagtcatt ggctctatct gcatttttct 360gtcattctct tcatttgttt
taaggtggaa aattttctta cagttgatgc aaagtatcaa 420ctactttacc ctaccttctc
cccttttaga tgggttcttc ctgagttttg gagtcttgta 480tgattatcag tattcccctg
tcaaaatcaa atctattcag gtttcttcac tgttgagaac 540acctaaatgt ttttattttt
gagaagtggg gacagagtct cactatgtca cccaggctgg 600agtgcaatgg catgatctca
gctcactgca accttcgcct cctgggttca agcgattctc 660ctgcctccgc ctcctgagta
gctgggatta taggcacgca ccaccacgcc cagctaattt 720tttgtatttt tagtagagac
agagtttcac catgttggcc aggctggtct tgaactcctg 780accttgtgat ccacccacct
cggcctccca gagtgctggg attacaggca tgagccacca 840cgcttggcta agaacaccta
aatttttatg tttcttggct caaaaaccag ttccatttct 900aatgttgtcc tcacaagaag
gctaattggt ggtgagacag caggggagga ggaagagctg 960tggtttgtaa cttgttcaac
tcaggcaata agcgatttta gctttattta aagtcttctg 1020tccagcttta agcactttgt
aagacatggc tgaaagtagc ttttctatca gaattgcaga 1080tagtcatgtt gggctaacag
tcaattggat atattccttt acctcacatg accccagcaa 1140ctgtggtggt atctagaggt
gaaacaggca agtgaaatgg acacctctgc tgtgaatgtt 1200ttagagaagg aaattcaaaa
aatgttgtaa ctgaaagcac tgttgaatat gggtatcggc 1260tttctttttc actttgactc
ttaacattat cagtcaactt ccacattaat gaaagttgac 1320catagttatt tccaaataaa
aagaaaccaa ctcttaccag gtcttggact gtgatgtcat 1380attattcagt tttatgcttg
ttcctgagca gaactcataa gagtgacata gtcagctgct 1440gacggcacct cagccacgcc
actcttactc agttcagtgg gtgtgcttgc gtggtaggat 1500gtggtgcagc cctctctacg
ctcttctatt tttggtatat ttcctatcta accttcaaat 1560agcttccaat tctttttttc
ttggactggc ttcattctga atttgtgcta aaataatctt 1620tcataaagag acctcagttt
atagcgtaac agactacaca atgcactgat gttttcataa 1680tgtttaaggg acccactgca
agaagcttgc tgcctccttt taattgtatt catttagatt 1740ttgattttcc atgttaagaa
ggtgaggtcc atgttggtgc ccttcagagt agagaaccat 1800gtaaacatta ggaatgaaca
gaggccttag gaatgaatag agagtttgcc ttatacaatt 1860tcctgttaca aagctctccc
tctcatgcaa agtagggaac accttttgag catctttgaa 1920tttgacaaat ggtgctgttg
caaacacttt ttttttgaga tgaagtctcg cggttgtcac 1980ccgggctgga gtgcagtggc
gtgatctcgg ctcactgcaa cttccacctc ctgggttcca 2040gcagttctcc tgcctcagcc
tcccaagtag ctgagattac aggcgcctgc caccccacct 2100ggctgatttt tgtaatttta
gtagagacgg ggtttcacca tgttggccag gctgattaac 2160tcctgacctc aggtgatcca
cctttctcgg cctcccaaag tgctgggatt acgggtgtga 2220gccaccgtgc ccggcctgca
aacacatttt aattgacaac actagggctg ttgtacaaaa 2280tagtaatgat agccatggaa
gttttacctt attctgtgag aagtgttctt aaacttatta 2340agtgtctaaa ctaaggttta
gtgctttttt aaaggaaagt tgtcccagga ttcatcctaa 2400agaaagcaaa agttaattca
actgatccac caatggaatt agatgggtag agttgggttc 2460ttgagtttta ccaccactta
gttcccactg aattttgtaa cttcctgtgt ttgcatcctc 2520tgttcctatt ctgcccttgc
tctgtgtcat ctcagtcatt tgacttagaa agtgcccttc 2580aaaaggaccc tgttcactgc
tgcacttttc aatgaattaa aatttatttc tgttctaaaa 2640aaaaaaaaaa a
2651
<210> 74
<211> 3403
<212> DNA
<213> Homo sapiens
<400> 74
tgatcaacaa ctgtcagctc ccagtcagag agaaagggcc tcttcagtct gtctcaggag
60actgggagaa acagcataaa ggaccccaca aggaagggag aggtaccctg ggtcaggcgc
120ttgtggagag agggcttcgc atgtaaagtg acgtcaggga aaatagaaca gaaaaaaagc
180cagggccagc ccagaggcac ctgagaagaa tcagacccac agctcagccc agccctggca
240cagagaagag acaggcctgg cagcacccag ggaccccctt tcctcagcct ccacctgcag
300gacagcagga gcactgatgc gctgaaggta cgttctggag tctggaagca gcagaactga
360aggaagtaaa cacgggtgtc tgggaagacc cctcaagctg cagtaaagcc caggactgaa
420ttggccacct gaggccaagg gtggcactcc aacctcctcc taaaggctgg ctagagccac
480aggaaagggc cagaagccag agaaagggca aaggtggacc cctgcctcca aacctcctct
540ggagactgac ctcctctttc ctgtgcctta ttgtttctcc ctcttctctt tgttcgccac
600tgggcggtga cctcagggat cctggcctaa cctggtgatt gtgcaggcaa ctgtgtccga
660gaagaccctt ctctggaaga ttgaacccca attcagccat ggtgactcct ttgatgtcaa
720actggtaagg gctgagccgt gggcacagga taccactcct tccagctctt ctgctgtgac
780ctgcccatgg aagtccctgt ggacacgaaa tcctgtttgg atcatctaac tggaggctct
840ctgttcttca cctccacgcg ccctcttgac cccaggaggt tcaggggagg aagtacgcca
900ctctccactg gcaccctcct tggcctacac agagtcaccc ctgagcccct caatgtgtgc
960tgaggtgggc cctgctctct gcaggggtat ggagagaaat agcttggggt gctgtgaggc
1020cccgaagaag ctgggcctgt ccttctccat cgaggcgatc ctaaagaggc ctgccaggag
1080gagtgatatg gacagaccag aagggccagg tgaagagggc cccggagaag ctgcggcctc
1140aggctctggg ctagaaaagc ctccaaagga ccagccccag gaaggaagga agagcaagcg
1200gagggttcgt accaccttca ccactgagca gctgcatgag ctggagaaga tcttccactt
1260tacccactac ccagacgttc acatccgcag ccagctggca gccaggatca acctcccaga
1320agctcgggtg cagatctggt tccagaatca gcgagccaag tggcggaagc aggagaagat
1380tggcaacctg ggggctccac agcagctgag tgaagccagt gtggtcctgc ccacaaatct
1440ggatgtggct gggcccacgt ggacatccac tgctctgcgc aggctggctc ctcccacgag
1500ctgttgtcca tcggctcaag atcagctggc ctctgcctgg ttccctgcct ggatcaccct
1560cctcccagcg cacccatggg aaacacagcc tgtcccaggt cttcccatcc atcaaacttg
1620catccctgtg ctatgcatcc ttccacctcc acaccccaaa tggggcagca tctgtgctac
1680ttcaacatag agattggaca tgctctcccc aaatgagcca ctttcctctc caggtgaagg
1740caggtagcag atgtgccctg ggcctctggg gaaatcgatc tcacaatcca aaaatggccc
1800acagcccagg aagctaccct gaacatgcca gttggaaggc tgcaccagac tcaaaagcaa
1860actaaacaat aaaggacagc tctcttctct cctggctaaa gctgctctcc tggttcagaa
1920gacaggctgg atgagatctc aggccgagct ctgaaatagg gaggtaatcc tccagcacct
1980gtgtttcctc taacttgctg tgtgacctcc agccggtcac tcaccctctc tggacctcat
2040ctgtaagagg agccagctgg ataagatgat ttctgaagac gcttccatgg tgggcactga
2100ggcacagagg aggccaagga gaggttgttt gttcatgcat gcattcatcc gtgacacatg
2160agtacctact gaggactcca taaacagaac gggatacaga gataaacaat ttgggttctg
2220tccacgtttg tcaaaaggtg gtgctggccc acctctgaaa gcagaacact tgctcaacaa
2280ccttgctgtt ggcccaagtc taacacattc tttatgactg tgagcatctc agagtgagag
2340aaaaatgtag aaagtttttt aaattctaaa caggatttag tgtctttagt tatcttgctg
2400gatgggaaag ggatgttgtc atttctggca caaatgaaaa gtaggacgga aagctccttt
2460cattcagttt atctttccag gatatatgaa aagggaccag ctggaagact agcctcactc
2520tgtcctcgaa agcctgagct ttcattcaac tccctatttc catgcaaaga cgctgggcaa
2580accacatgtt ctgtctgagc ctcagttttc ctatccataa aatgaaggta gccaggcctg
2640cctcaaagag cattcaggag gctctgagag gacatgagag tattttgcaa agtgagggca
2700aggcccagtg tggagtgata ttgttattcc aagattccac tgcaaaagtg gctgctttgg
2760atgccagccc aggatgagta gttcctgttc tcagggaggt catccgctga gcatcccttc
2820tgcacagatg tctctgattc ttgtccttgc aggtggagga cagggcctgc tcccctaagc
2880tgggaagcct ggaatgacct cttgcacaag cctaaattcc aggaatcttc cccaaatccc
2940agatcctctg caatctacct gcacccctga cccacccagg agttggaccg ggagttggga
3000agcctaggtc ttagtcctac actccttcta atttgctgtg taaccttacc attaatctct
3060ctgggtctca gttttctcat ctgtattgga ggtagcagtg ctagctctgc cttcaggcat
3120gcaatatgcc agaactacag acaacagccc acaggatgca aaagtgcttt gccatcttaa
3180aaatgccaga tcactcagag cctatgaatg tggatatcaa caccaggtct ctagcaccgc
3240tggatgaaag gagaaggcta gaggctgagg gaggaaagag cagttaacaa acaaaggcag
3300tagctcatca cttgggtagc aggtacccat tttaggaccc tacactcaaa tgtgcaaaat
3360aaaatttcta tcattttgct ataaaaaaaa aaaaaaaaaa aaa
34037560DNAArtificial SequenceSynthetic construct - oligonucleotide
75cccggatcgc catcagtgtc atcgagttca aaccctgagc ccttcattca cctctgtgag
607660DNAArtificial SequenceSynthetic construct - oligonucleotide
76tgcccttgct ctgtgtcatc tcagtcattt gacttagaaa gtgcccttca aaaggaccct
607760DNAArtificial SequenceSynthetic construct - oligonucleotide
77ggagggaggg ctaattatat attttgttgt tcctctatac tttgttctgt tgtctgcgcc
607860DNAArtificial SequenceSynthetic construct - oligonucleotide
78cagtttggat tgtataataa cgccaagccc agttgtagtc gtttgagtgc agtaatgaaa
607960DNAArtificial SequenceSynthetic construct - oligonucleotide
79aaatcagagt aaccctttct gtattgagtg cagtgttttt tactcttttc tcatgcacat
608060DNAArtificial SequenceSynthetic construct - oligonucleotide
80tgcctggcac aaagaaggaa gaatataaat gatagttcga ctcgtctgtg gaagaactta
608160DNAArtificial SequenceSynthetic construct - oligonucleotide
81agtcttttgc ttttggcaaa actctactta atccaatggg tttttccctg tacagtagat
608260DNAArtificial SequenceSynthetic construct - oligonucleotide
82ggttactgtg ggtggaatag tggaggcctt caactgatta gacaaggccc gcccacatct
608360DNAArtificial SequenceSynthetic construct - oligonucleotide
83taaaatgcac tgccctactg ttggtatgac taccgttacc tactgttgtc attgttatta
608460DNAArtificial SequenceSynthetic construct - oligonucleotide
84ttctcttttg ggggcaaaca ctatgtcctt ttctttttct agatacagtt aattcctgga
608560DNAArtificial SequenceSynthetic construct - oligonucleotide
85aagacccaca ccctgtagca ataccaagtg ctattacata atcaatggac gatttatact
608660DNAArtificial SequenceSynthetic construct - oligonucleotide
86agtgttgcaa gtttccttta aaaccaacaa agcccacaag tcctgaattt cccattctta
608760DNAArtificial SequenceSynthetic construct - oligonucleotide
87gtcactgtca tagcagctgt gatttcacaa ggaagggtgc tgcaggggga cctggttgat
608860DNAArtificial SequenceSynthetic construct - oligonucleotide
88tttcatccag tgttatgcac tttccacagt tggtgttagt atagccagag ggtttcatta
608960DNAArtificial SequenceSynthetic construct - oligonucleotide
89gggaagtagg gattattcgt ttaaattcaa tcgcgagcac caagtcggac tggccgggga
609060DNAArtificial SequenceSynthetic construct - oligonucleotide
90gggaccaggc cctgggacag ccatgtggct ccaaatgact aaatgtcagc tcaaaaacca
609160DNAArtificial SequenceSynthetic construct - oligonucleotide
91tccgtttatg gaggcaattc catatccttt cttgaacgca cattcagctt accccagaga
609260DNAArtificial SequenceSynthetic construct - oligonucleotide
92agagttaagc cacttcctgg gtctccttct tatgactgtc tatgggtgca ttgccttctg
609360DNAArtificial SequenceSynthetic construct - oligonucleotide
93gtggcctgag taatgcatta tgggtggttt accatttctt gaggtaaaag catcacatga
609460DNAArtificial SequenceSynthetic construct - oligonucleotide
94acacatgcat gtgtctgtgt atgtgtgaat gtgagagaga cacagccctc ctttcagaag
609560DNAArtificial SequenceSynthetic construct - oligonucleotide
95tctgtaactg cacaaccctg gggtttgctg cagagctatt tctttccatg taaagtagtg
609660DNAArtificial SequenceSynthetic construct - oligonucleotide
96aaacactctt tccgactcca gaggagaagc tggcagctct ctgtaagaaa tatgctgatc
609760DNAArtificial SequenceSynthetic construct - oligonucleotide
97gcttcctcta tcgcccaatg caaaatcgat gaaatgggga gttctctggg ccaggccaca
609860DNAArtificial SequenceSynthetic construct - oligonucleotide
98gtagaatcct ctgttcataa tgaacaagat gaaccaatgt ggattagaaa gaagtccgag
609960DNAArtificial SequenceSynthetic construct - oligonucleotide
99ctgttttaaa actgaatggc acgaaattgt tttcctcaac tcggagattc ctgtatggag
6010060DNAArtificial SequenceSynthetic construct - oligonucleotide
100aataaatagt agctctgctg atgatgacgt tgataaccaa actgttctgt ggtcttaagt
6010160DNAArtificial SequenceSynthetic construct - oligonucleotide
101caaacagccc ggtcttgatg caggagagtc tggaaaagga agaaaatggt ttcagtttca
6010260DNAArtificial SequenceSynthetic construct - oligonucleotide
102aacatggacc atccaaattt atggccgtat caaatggtag ctgaaaaaac tatatttgag
6010360DNAArtificial SequenceSynthetic construct - oligonucleotide
103ttgtaatcat gccaattcca gatcaataac tgcatgtctg ttctttggta gaaatagctt
6010460DNAArtificial SequenceSynthetic construct - oligonucleotide
104aaagattatt aacccaaatc acctttcttg cttactccag atgcctcagc ctctgatata
6010560DNAArtificial SequenceSynthetic construct - oligonucleotide
105gacttccttt aggatctcag gcttctgcag ttctcatgac tcctactttt catcctagtc
6010660DNAArtificial SequenceSynthetic construct - oligonucleotide
106ctgtatattt tgcaatagtt acctcaaggc ctactgacca aattgttgtg ttgagatgat
6010760DNAArtificial SequenceSynthetic construct - oligonucleotide
107tgttcaaaca gactttaacc tctgcatcat acttaaccct gcgacatgcg tacagtatgc
6010860DNAArtificial SequenceSynthetic construct - oligonucleotide
108tgagtcatat acatttactg accactgttg cttgttgctc actgtgctgc ttttccatga
6010960DNAArtificial SequenceSynthetic construct - oligonucleotide
109ctgaaatgtg gatgtgattg cctcaataaa gctcgtcccc attgcttaag ccttcaaaaa
6011060DNAArtificial SequenceSynthetic construct - oligonucleotide
110atcaagaaaa cctaatcttc tgactcccag gccaggatgt tttatttctc acatcatgtc
6011160DNAArtificial SequenceSynthetic construct - oligonucleotide
111ttcatttcca aacatcatct ttaagactcc aaggattttt ccaggcacag tggctcatac
6011260DNAArtificial SequenceSynthetic construct - oligonucleotide
112agttagaaat agaatctgaa tttctaaagg gagattctgg cttgggaagt acatgtagga
6011360DNAArtificial SequenceSynthetic construct - oligonucleotide
113caattttctt tttactcccc ctcttaaggg ggccttggaa tctatagtat agaatgaact
6011460DNAArtificial SequenceSynthetic construct - oligonucleotide
114gggtggagtt tcagtgagaa taaacgtgtc tgcctttgtg tgtgtgtata tatacagaga
6011560DNAArtificial SequenceSynthetic construct - oligonucleotide
115ctcgctcatt ttttaccatg ttttccagtc tgtttaactt ctgcagtgcc ttcactacac
6011660DNAArtificial SequenceSynthetic construct - oligonucleotide
116ctttgggccg agcactgaat gtcttgtact ttaaaaaaat gtttctgaga cctctttcta
6011760DNAArtificial SequenceSynthetic construct - oligonucleotide
117ctggaccctt ggagcagtgt tgtgtgaact tgcctagaac tctgccttct ccgttgtcaa
6011860DNAArtificial SequenceSynthetic construct - oligonucleotide
118ccacctcctt cgacctccac tgcgccccac ctccctgcct gtgtgtgtta tttcaaagga
6011960DNAArtificial SequenceSynthetic construct - oligonucleotide
119tctggctggt ggcctgcgcg agggtgcagt cttacttaaa agactttcag ttaattctca
6012060DNAArtificial SequenceSynthetic construct - oligonucleotide
120agatgctgtc ggcaccatgt ttatttattt ccagtggtca tgctcagcct tgctgctctg
6012160DNAArtificial SequenceSynthetic construct - oligonucleotide
121tccttcctct tcggtgaatg caggttattt aaactttggg aaatgtactt ttagtctgtc
6012260DNAArtificial SequenceSynthetic construct - oligonucleotide
122gtcctgtccc tgtctgggag ttgtgttatt taaagatatt ctgtatgttg tatcttttgc
6012360DNAArtificial SequenceSynthetic construct - oligonucleotide
123attatatttc aggtgtcctg aacaggtcac tagactctac attgggcagc ctttaaatat
6012460DNAArtificial SequenceSynthetic construct - oligonucleotide
124aggaatggta ctaccgttcc agattttctg taattgcttc tgcaaagtaa taggcttctt
6012560DNAArtificial SequenceSynthetic construct - oligonucleotide
125ctgtacccaa aggatgccag aatactagta tttttattta tcgtaaacat ccacgagtgc
6012660DNAArtificial SequenceSynthetic construct - oligonucleotide
126attgcccccc taaccaatca tgcaaacttt tccccccctg gggtaattca ccagttaaaa
6012760DNAArtificial SequenceSynthetic construct - oligonucleotide
127cccacagtat ttaatgccct gtcagtccct tctagtctga ctcaatggta acttgctgta
6012860DNAArtificial SequenceSynthetic construct - oligonucleotide
128aaaaccaact ctctactaca caggcctgat aactctgtac gaggcttctc taacccctag
6012960DNAArtificial SequenceSynthetic construct - oligonucleotide
129ctcagactgg gctccacact cttgggcttc agtctgccca tctgctgaat ggagacagca
6013060DNAArtificial SequenceSynthetic construct - oligonucleotide
130cctaatgggg attcctctgg ttgttcactg ccaaaactgt ggcattttca ttacaggaga
6013160DNAArtificial SequenceSynthetic construct - oligonucleotide
131cactcacaat tgttgactaa aatgctgcct ttaaaacata ggaaagtaga atggttgagt
6013260DNAArtificial SequenceSynthetic construct - oligonucleotide
132ctttgaaggg ctgctgcaca ttgttgaatc catcgacctt tagctgcaat gggatctcta
6013360DNAArtificial SequenceSynthetic construct - oligonucleotide
133tgcctcatcg atattatagg ggtccatcac aacccaactg tgtggccgga tcctgagtct
6013460DNAArtificial SequenceSynthetic construct - oligonucleotide
134aaaacagaca aaagcctttg ccttcatgaa gcatacattc attcaggggt agacacacaa
6013560DNAArtificial SequenceSynthetic construct - oligonucleotide
135taacaaacaa aggcagtagc tcatcacttg ggtagcaggt acccatttta ggaccctaca
6013660DNAArtificial SequenceSynthetic construct - oligonucleotide
136atatcagaag tgccaataat cgtcataggc ttctgcacgt tggatcaact aatgttgttt
6013760DNAArtificial SequenceSynthetic construct - oligonucleotide
137atcatagccc aaccatgtga gaagaaggag aaggcccccc tttcttcatt aatctgaaaa
6013860DNAArtificial SequenceSynthetic construct - oligonucleotide
138gcagaccatt ctatcatacc tggcagggct tctgttttat tttgtaggct ggatgctacc
6013960DNAArtificial SequenceSynthetic construct - oligonucleotide
139actacaagcc tcttgttttt caccaaaacc ctacatctca ggcttactaa tttttgtgat
6014060DNAArtificial SequenceSynthetic construct - oligonucleotide
140gccatgcata catgctgcgc atgttttctt cattcgtatg ttagtaaagt tttggttatt
6014160DNAArtificial SequenceSynthetic construct - oligonucleotide
141cacctattta ttttacctct ttcccaaacc tggagcattt atgcctaggc ttgtcaagaa
6014260DNAArtificial SequenceSynthetic construct - oligonucleotide
142gtggacatag ccactaacca actagttacc tttggactgc aacaaaaaat gtgaaaatga
6014360DNAArtificial SequenceSynthetic construct - oligonucleotide
143acttgtaaac ctcttttgca ctttgaaaaa gaatccagcg ggatgctcga gcacctgtaa
6014460DNAArtificial SequenceSynthetic construct - oligonucleotide
144aattctctat aaacggttca ccagcaaacc accaatacat tccattgttt gcctagagag
6014560DNAArtificial SequenceSynthetic construct - oligonucleotide
145aatggcccat gcatgctgtt tgcagcagtc aattgagttg aattagaatt ccaaccatac
6014660DNAArtificial SequenceSynthetic construct - oligonucleotide
146gagctcagta cttgccctgt gaaaatccca gaagcccccg ctgtcaatgt tccccatcca
6014760DNAArtificial SequenceSynthetic construct - oligonucleotide
147atgaagcgga attaggctcc cgagctaagg gactcgccta gggtctcaca gtgagtagga
6014860DNAArtificial SequenceSynthetic construct - oligonucleotide
148agtggctata tcaacatcag ggctagcaca tctttctcta ttatccttct attggaattc
601491108DNAHomo sapiens 149gagtgagtga gagggcagag gaaatactca atctgtgcca
ctcactgcct tgagcctgct 60tcctcactcc aggactgcca gaggctcact cccttgagcc
tgcttcctca ctccaggact 120gccagaggaa gcaatcacca aaatgaagac tgctttaatt
ttgctcagca ttttgggaat 180ggcctgtgct ttctcaatga aaaatttgca tcgaagagtc
aaaatagagg attctgaaga 240aaatggggtc tttaagtaca ggccacgata ttatctttac
aagcatgcct acttttatcc 300tcatttaaaa cgatttccag ttcagggcag tagtgactca
tccgaagaaa atggagatga 360cagttcagaa gaggaggagg aagaagagga gacttcaaat
gaaggagaaa acaatgaaga 420atcgaatgaa gatgaagact ctgaggctga gaataccaca
ctttctgcta caacactggg 480ctatggagag gacgccacgc ctggcacagg gtatacaggg
ttagctgcaa tccagcttcc 540caagaaggct ggggatataa caaacaaagc tacaaaagag
aaggaaagtg atgaagaaga 600agaggaggaa gaggaaggaa atgaaaacga agaaagcgaa
gcagaagtgg atgaaaacga 660acaaggcata aacggcacca gtaccaacag cacagaggca
gaaaacggca acggcagcag 720cggaggagac aatggagaag aaggggaaga agaaagtgtc
actggagcca atgcagaagg 780caccacagag accggagggc agggcaaggg cacctcgaag
acaacaacct ctccaaatgg 840tgggtttgaa cctacaaccc caccacaagt ctatagaacc
acttccccac cttttgggaa 900aaccaccacc gttgaatacg agggggagta cgaatacacg
ggcgtcaatg aatacgacaa 960tggatatgaa atctatgaaa gtgagaacgg ggaacctcgt
ggggacaatt accgagccta 1020tgaagatgag tacagctact ttaaaggaca aggctacgat
ggctatgatg gtcagaatta 1080ctaccaccac cagtgaagct ccagcctg
11081504767DNAHomo sapiens 150gcctcccgcc gcctcccgcg
cggccatgga ctgagcgccg ccggccaggc cgcggggatg 60gggccgccgc tcccgctgct
gctgctgcta ctgctgctgc tgccgccacg cgtcctgcct 120gccgcccctt cgtccgtccc
ccgcggccgg cagctcccgg ggcgtctggg ctgcctgctc 180gaggagggcc tctgcggagc
gtccgaggcc tgtgtgaacg atggagtgtt tggaaggtgc 240cagaaggttc cggcaatgga
cttttaccgc tacgaggtgt cgcccgtggc cctgcagcgc 300ctgcgcgtgg cgttgcagaa
gctttccggc acaggtttca cgtggcagga tgactatact 360cagtatgtga tggaccagga
acttgcagac ctcccgaaaa cctacctgag gcgtcctgaa 420gcatccagcc cagccaggcc
ctcaaaacac agcgttggca gcgagaggag gtacagtcgg 480gagggcggtg ctgccctggc
caacgccctc cgacgccacc tgcccttcct ggaggccctg 540tcccaggccc cagcctcaga
cgtgctcgcc aggacccata cggcgcagga cagacccccc 600gctgagggtg atgaccgctt
ctccgagagc atcctgacct atgtggccca cacgtctgcg 660ctgacctacc ctcccgggcc
ccggacccag ctccgcgagg acctcctgcc gcggaccctc 720ggccagctcc agccagatga
gctcagccct aaggtggaca gtggtgtgga cagacaccat 780ctgatggcgg ccctcagtgc
ctatgctgcc cagaggcccc cagctccccc cggggagggc 840agcctggagc cacagtacct
tctgcgtgca ccctcaagaa tgcccaggcc tttgctggca 900ccagccgccc cccagaagtg
gccttcacct ctgggagatt ccgaagaccc ctccagcaca 960ggcgatggag cacggattca
taccctcctg aaggacctgc agaggcagcc ggctgaggtg 1020aggggcctga gtggcctgga
gctggacggc atggctgagc tgatggctgg cctgatgcaa 1080ggcgtggacc atggagtagc
tcgaggcagc cctgggagag cggccctggg agagtctgga 1140gaacaggcgg atggccccaa
ggccaccctc cgtggagaca gctttccaga tgacggagtg 1200caggacgacg atgatagact
ttaccaagag gtccatcgtc tgagtgccac actcgggggc 1260ctcctgcagg accacgggtc
tcgactctta cctggagccc tcccctttgc aaggcccctc 1320gacatggaga ggaagaagtc
cgagcaccct gagtcttccc tgtcttcaga agaggagact 1380gccggagtgg agaacgtcaa
gagccagacg tattccaaag atctgctggg gcagcagccg 1440cattcggagc ccggggccgc
tgcgtttggg gagctccaaa accagatgcc tgggccctcg 1500aaggaggagc agagccttcc
agcgggtgct caggaggccc tcagcgacgg cctgcaattg 1560gaggtccagc cttccgagga
agaggcgcgg ggctacatcg tgacagacag agaccccctg 1620cgccccgagg aaggaaggcg
gctggtggag gacgtcgccc gcctcctgca ggtgcccagc 1680agtgcgttcg ctgacgtgga
ggttctcgga ccagcagtga ccttcaaagt gagcgccaat 1740gtccaaaacg tgaccactga
ggatgtggag aaggccacag ttgacaacaa agacaaactg 1800gaggaaacct ctggactgaa
aattcttcaa accggagtcg ggtcgaaaag caaactcaag 1860ttcctgcctc ctcaggcgga
gcaagaagac tccaccaagt tcatcgcgct caccctggtc 1920tccctcgcct gcatcctggg
cgtcctcctg gcctctggcc tcatctactg cctccgccat 1980agctctcagc acaggctgaa
ggagaagctc tcgggactag ggggcgaccc aggtgcagat 2040gccactgccg cctaccagga
gctgtgccgc cagcgtatgg ccacgcggcc accagaccga 2100cctgagggcc cgcacacgtc
acgcatcagc agcgtctcat cccagttcag cgacgggccg 2160atccccagcc cctccgcacg
cagcagcgcc tcatcctggt ccgaggagcc tgtgcagtcc 2220aacatggaca tctccaccgg
ccacatgatc ctgtcctaca tggaggacca cctgaagaac 2280aagaaccggc tggagaagga
gtgggaagcg ctgtgcgcct accaggcgga gcccaacagc 2340tcgttcgtgg cccagaggga
ggagaacgtg cccaagaacc gctccctggc tgtgctgacc 2400tatgaccact cccgggtcct
gctgaaggcg gagaacagcc acagccactc agactacatc 2460aacgctagcc ccatcatgga
tcacgacccg aggaaccccg cgtacatcgc cacccaggga 2520ccgctgcccg ccaccgtggc
tgacttttgg cagatggtgt gggagagcgg ctgcgtggtg 2580atcgtcatgc tgacacccct
cgcggagaac ggcgtccggc agtgctacca ctactggccg 2640gatgaaggct ccaatctcta
ccacatctat gaggtgaacc tggtctccga gcacatctgg 2700tgtgaggact tcctggtgag
gagcttctat ctgaagaacc tgcagaccaa cgagacgcgc 2760accgtgacgc agttccactt
cctgagttgg tatgaccgag gagtcccttc ctcctcaagg 2820tccctcctgg acttccgcag
aaaagtaaac aagtgctaca ggggccgttc ttgtccaata 2880attgttcatt gcagtgacgg
tgcaggccgg agcggcacct acgtcctgat cgacatggtt 2940ctcaacaaga tggccaaagg
tgctaaagag attgatatcg cagcgaccct ggagcacttg 3000agggaccaga gacccggcat
ggtccagacg aaggagcagt ttgagttcgc gctgacagcc 3060gtggctgagg aggtgaacgc
catcctcaag gcccttcccc agtgagcggc agcctcaggg 3120gcctcagggg agcccccacc
ccacggatgt tgtcaggaat catgatctga ctttaattgt 3180gtgtcttcta ttataactgc
atagtaatag ggcccttagc tctcccgtag tcagcgcagt 3240ttagcagtta aaagtgtatt
tttgtttaat caaacaataa taaagagaga tttgtggaaa 3300aatccagtta cgggtggagg
ggaatcggtt catcaatttt cacttgctta aaaaaaatac 3360tttttcttaa agcacccgtt
caccttcttg gttgaagttg tgttaacaat gcagtagcca 3420gcacgttcga ggcggtttcc
aggaagagtg tgcttgtcat ctgccacttt cgggagggtg 3480gatccactgt gcaggagtgg
ccggggaagc tggcagcact cagtgaggcc gcccggcaca 3540caaggcacgt ttggcatttc
tctttgagag agtttatcat tgggagaagc cgcggggaca 3600gaactgaacg tcctgcagct
tcggggcaag tgagacaatc acagctcctc gctgcgtctc 3660catcaacact gcgccgggta
ccatggacgg ccccgtcagc cacacctgtc agcccaagca 3720gagtgattca ggggctcccc
gggggcagac acctgtgcac cccatgagta gtgcccactt 3780gaggctggca ctcccctgac
ctcacctttg caaagttaca gatgcacccc aacattgaga 3840tgtgttttta atgttaaaat
attgatttct acgttatgaa aacagatgcc cccgtgaatg 3900cttacctgtg agataaccac
aaccaggaag aacaaatctg ggcattgagc aagctatgag 3960ggtccccggg agcacacgaa
ccctgccagg cccccgctgg ctcctccagg cacgtcccgg 4020acctgtgggg ccccagagag
gggacatttc cctcctggga gagaaggaga tcagggcaac 4080tcggagaggg ctgcgagcat
ttccctcccg ggagaggaga tcagggcgac ctgcacgcac 4140tgcgtagagc ctggaaggga
agtgagaaac cagccgaccg gccctgcccc tcttcccggg 4200atcacttaat gaaccacgtg
ttttgacatc atgtaaacct aagcacgtag agatgattcg 4260gatttgacaa aataacattt
gagtatccga ttcgccatca ccccctaccc cagaaatagg 4320acaattcact tcattgacca
ggatgatcac atggaaggcg gcgcagaggc agctgtgtgg 4380gctgcagatt tcctgtgtgg
ggttcagcgt agaaaacgca cctccatccc gcccttccca 4440cagcattcct ccatcttaga
tagatggtac tctccaaagg ccctaccaga gggaacacgg 4500cctactgagc ggacagaatg
atgccaaaat attgcttatg tctctacatg gtattgtaat 4560gaatatctgc tttaatatag
ctatcatttc ttttccaaaa ttacttctct ctatctggaa 4620tttaattaat cgaaatgaat
ttatctgaat ataggaagca tatgcctact tgtaatttct 4680aactccttat gtttgaagag
aaacctccgg tgtgagatat acaaatatat ttaattgtgt 4740catattaaac ttctgattca
aaaaaaa 47671511148DNAHomo sapiens
151ggcacgaggc cacgagctgt tgtgcatcca gaggtggaat tggggcccgg cattccctcc
60tcgtcccggg ctggcccttg cccccaccct gcaactcctg gttgagatgg gctcagccaa
120gagcgtccca gtcacaccag cgcggcctcc gccgcacaac aagcatctgg ctcgagtggc
180ggacccccgt tcacctagtg ctggcatcct gcgcactccc atccaggtgg agagctctcc
240acagccaggc ctaccagcag gggagcaact ggagggtctt aaacatgccc aggactcaga
300tccccgctct cctactcttg gtattgcacg gacacctatg aagaccagca gtggagaccc
360cccaagccca ctggtgaaac agctgagtga agtatttgaa actgaagact ctaaatcaaa
420tcttccccca gagcctgttc tgcccccaga ggcaccttta tcttctgaat tggacttgcc
480tctgggtacc cagttatctg ttgaggaaca gatgccacct tggaaccaga ctgagttccc
540ctccaaacag gtgttttcca aggaggaagc aagacagccc acagaaaccc ctgtggccag
600ccagagctcc gacaagccct caagggaccc tgagactccc agatcttcag gttctatgcg
660caatagatgg aaaccaaaca gcagcaaggt actagggaga tcccccctca ccatcctgca
720ggatgacaac tcccctggca ccctgacact acgacagggt aagcggcctt cacccctaag
780tgaaaatgtt agtgaactaa aggaaggagc cattcttgga actggacgac ttctgaaaac
840tggaggacga gcatgggagc aaggccagga ccatgacaag gaaaatcagc actttccctt
900ggtggagagc taggccctgc atggccccag caatgcagtc acccagggcc tggtgatatc
960tgtgtcctct caccccttct ttcccaggga tactgaggaa tggcttgttt tcttagactc
1020ctcctcagct accaaactgg gactcacagc tttattgggc tttctttgtg tcttgtgtgt
1080ttcttttata ttaaaggaag taattttaaa tgttacttta aaaaggtaaa aaaaaaaaaa
1140aaaaaaaa
1148152539DNAHomo sapiens 152gcattcgtag taaaggtgcc caagaaatta ttttggccat
ttattgtttt gtccttttct 60ttaaagaact gttttttttt cttttgttta cttttagacc
aaagattggg ttctagaaaa 120tgcacttggt atactaagta ttaaaacaaa caaaaaggaa
agttgtttca gttggcaaca 180ctgcccattc aattgaatca gaaggggaca aaattaacga
ttgccttcag tttgtgttgt 240gtatattttg atgtatgtgg tcactaacag gtcactttta
ttttttctaa atgtagtgaa 300atgttaatac ctattgtact tataggtaaa ccttgcaaat
atgtaacctg tgttgcgcaa 360atgccgcata aatttgagtg attgttaatg ttgtcttaaa
atttcttgat tgtgatactg 420tggtcatatg cccgtgtttg tcacttacaa aaatgtttac
tatgaacaca cagaaataaa 480aaataggcta aattcatata aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaa 5391531673DNAHomo sapiens 153gaggcagtaa
ggacttggac tcctctgtcc agcttttaac aatctaagtt acggttaccc 60tcttctgggt
cacgctagaa tcagatctgc tctccagcat cttctgtttc ctggcaagtg 120tttcctgcta
ctttggattg gccacgatgg gctggagctg ccttgtgaca ggagcaggag 180ggcttctggg
tcagaggatc gtccgcctgt tggtggaaga gaaggaactg aaggagatca 240gggccttgga
caaggccttc agaccagaat tgagagagga attttctaag ctccagaaca 300ggaccaagct
gactgtactt gaaggagaca ttctggatga gccattcctg aaaagagcct 360gccaggacgt
ctcggtcgtc atccacaccg cctgtatcat tgatgtcttt ggtgtcactc 420acagagagtc
catcatgaat gtcaatgtga aaggtaccca gctactgttg gaggcctgtg 480tccaagccag
tgtgccagtc ttcatctaca ccagtagcat agaggtagcc gggcccaact 540cctacaagga
aatcatccag aacggccacg aagaagagcc tctggaaaac acatggccca 600ctccataccc
gtacagcaaa aagcttgctg agaaggctgt gctggcggct aatgggtgga 660atctaaaaaa
tggtgatacc ttgtacactt gtgcgttaag acccacatat atctatgggg 720aaggaggccc
attcctttct gccagtataa atgaggccct gaacaacaat gggatcctgt 780caagtgttgg
aaagttctct acagtcaacc cagtctatgt tggcaacgtg gcctgggccc 840acattctggc
cttgagggct ctgcgggacc ccaagaaggc cccaagtgtc cgaggtcaat 900tctattacat
ctcagatgac acgcctcacc aaagctatga taaccttaat tacatcctga 960gcaaagagtt
tggcctccgc cttgattcca gatggagcct tcctttaacc ctgatgtact 1020ggattggctt
cctgctggaa gtagtgagct tcctactcag cccaatttac tcctatcaac 1080cccccttcaa
ccgccacaca gtcacattat caaatagtgt gttcaccttc tcttacaaga 1140aggctcagcg
agatctggcg tataagccac tctacagctg ggaggaagcc aagcagaaaa 1200ccgtggagtg
ggttggttcc cttgtggacc ggcacaagga gaccctgaag tccaagactc 1260agtgatttaa
ggatgacaga gatgtgcatg tgggtattgt taggaaatgt catcaaactc 1320cacccacctg
gcttcataca gaaggcaaca ggggcacaag cccaggtcct gctgcctctc 1380tttcacacaa
tgcccaactt actgtcttct tcatgtcatc aaaatctgca cagtcactgg 1440cccaaccaga
actttctgtc ctaatcatac accagaagac aaacaatatg atttgctgtt 1500accaaatctc
agtggctgat tctgaacaat tgtggtctct cttaacttga ggttctcttt 1560tgactaatag
agctccattt cccctcttaa atgagaaagc atttcttttc tctttaatct 1620cctattcctt
cacacagttc aacataaaga gcaataaatg ttttaatgct taa
1673154518DNAHomo sapiens 154aaattttgac cccatataaa gaaatgtgtt atgtatgttg
tgcctcctta gagacataaa 60tttagtgtca aaacatggga gatggcttac tcagaagcat
actccactta acataccatg 120gcctgagcta agtaccatgt cctgtttgtg tcttattttt
aaatattttc tttgtccaca 180tgggccgttg accttagagt taaggcggtt gcttttttga
agaaatcacc aaagtttctg 240ggaaactatg ttcaaggttg aaatggagag tagatttaat
tttatttgtc ttgtagggaa 300gaaatcttcc tttgaaccgc ttttcttgct ttttcccttt
ttcccaaact aggttacagg 360ttcttatctg caaggttcaa gttgcttaga cattgttttc
cagtattctg cagggccagt 420cagttgtaca gaagttggaa tattctgttc cagaattaaa
gaagttttta gattatgaaa 480tattatgata ataaagctat atttctgaaa aaaaaaaa
5181552833DNAHomo sapiens 155gaaggagctc tcttcttgct
tggcagctgg accaagggag ccagtcttgg gcgctggagg 60gcctgtcctg accatggtcc
ctgcctggct gtggctgctt tgtgtctccg tcccccaggc 120tctccccaag gcccagcctg
cagagctgtc tgtggaagtt ccagaaaact atggtggaaa 180tttcccttta tacctgacca
agttgccgct gccccgtgag ggggctgaag gccagatcgt 240gctgtcaggg gactcaggca
aggcaactga gggcccattt gctatggatc cagattctgg 300cttcctgctg gtgaccaggg
ccctggaccg agaggagcag gcagagtacc agctacaggt 360caccctggag atgcaggatg
gacatgtctt gtggggtcca cagcctgtgc ttgtgcacgt 420gaaggatgag aatgaccagg
tgccccattt ctctcaagcc atctacagag ctcggctgag 480ccggggtacc aggcctggca
tccccttcct cttccttgag gcttcagacc gggatgagcc 540aggcacagcc aactcggatc
ttcgattcca catcctgagc caggctccag cccagccttc 600cccagacatg ttccagctgg
agcctcggct gggggctctg gccctcagcc ccaaggggag 660caccagcctt gaccacgccc
tggagaggac ctaccagctg ttggtacagg tcaaggacat 720gggtgaccag gcctcaggcc
accaggccac tgccaccgtg gaagtctcca tcatagagag 780cacctgggtg tccctagagc
ctatccacct ggcagagaat ctcaaagtcc tatacccgca 840ccacatggcc caggtacact
ggagtggggg tgatgtgcac tatcacctgg agagccatcc 900cccgggaccc tttgaagtga
atgcagaggg aaacctctac gtgaccagag agctggacag 960agaagcccag gctgagtacc
tgctccaggt gcgggctcag aattcccatg gcgaggacta 1020tgcggcccct ctggagctgc
acgtgctggt gatggatgag aatgacaacg tgcctatctg 1080ccctccccgt gaccccacag
tcagcatccc tgagctcagt ccaccaggta ctgaagtgac 1140tagactgtca gcagaggatg
cagatgcccc cggctccccc aattcccacg ttgtgtatca 1200gctcctgagc cctgagcctg
aggatggggt agaggggaga gccttccagg tggaccccac 1260ttcaggcagt gtgacgctgg
gggtgctccc actccgagca ggccagaaca tcctgcttct 1320ggtgctggcc atggacctgg
caggcgcaga gggtggcttc agcagcacgt gtgaagtcga 1380agtcgcagtc acagatatca
atgatcacgc ccctgagttc atcacttccc agattgggcc 1440tataagcctc cctgaggatg
tggagcccgg gactctggtg gccatgctaa cagccattga 1500tgctgacctc gagcccgcct
tccgcctcat ggattttgcc attgagaggg gagacacaga 1560agggactttt ggcctggatt
gggagccaga ctctgggcat gttagactca gactctgcaa 1620gaacctcagt tatgaggcag
ctccaagtca tgaggtggtg gtggtggtgc agagtgtggc 1680gaagctggtg gggccaggcc
caggccctgg agccaccgcc acggtgactg tgctagtgga 1740gagagtgatg ccacccccca
agttggacca ggagagctac gaggccagtg tccccatcag 1800tgccccagcc ggctctttcc
tgctgaccat ccagccctcc gaccccatca gccgaaccct 1860caggttctcc ctagtcaatg
actcagaggg ctggctctgc attgagaaat tctccgggga 1920ggtgcacacc gcccagtccc
tgcagggcgc ccagcctggg gacacctaca cggtgcttgt 1980ggaggcccag gatacagatg
agccgagact gagcgcttct gcacccctgg tgatccactt 2040cctaaaggcc cctcctgccc
cagccctgac tcttgcccct gtgccctccc aatacctctg 2100cacaccccgc caagaccatg
gcttgatcgt gagtggaccc agcaaggacc ccgatctggc 2160cagtgggcac ggtccctaca
gcttcaccct tggtcccaac cccacggtgc aacgggattg 2220gcgcctccag actctcaatg
gttcccatgc ctacctcacc ttggccctgc attgggtgga 2280gccacgtgaa cacataatcc
ccgtggtggt cagccacaat gcccagatgt ggcagctcct 2340ggttcgagtg atcgtgtgtc
gctgcaacgt ggaggggcag tgcatgcgca aggtgggccg 2400catgaagggc atgcccacga
agctgtcggc agtgggcatc cttgtaggca ccctggtagc 2460aataggaatc ttcctcatcc
tcattttcac ccactggacc atgtcaagga agaaggaccc 2520ggatcaacca gcagacagcg
tgcccctgaa ggcgactgtc tgaatggccc aggcagctct 2580agctgggagc ttggcctctg
gctccatctg agtcccctgg gagagagccc agcacccaag 2640atccagcagg ggacaggaca
gagtagaagc ccctccatct gccctggggt ggaggcacca 2700tcaccatcac caggcatgtc
tgcagagcct ggacaccaac tttatggact gcccatggga 2760gtgctccaaa tgtcagggtg
tttgcccaat aataaagccc cagagaactg ggctgggccc 2820tatgggattg gta
2833156592DNAHomo sapiens
156tctttaccta tgtgaagcga ggtgacgtga tacgtcactg gcgccgtctt ataatttaga
60tgtaaaaatc tttagaaaca aataaaactc tctatatatg tgtatgtctg tgtacaaaaa
120aatgacagag ctgatggcca gtgtatacag agcgtggccc gcggtgtaca atacccatat
180aaggtacatt gtgcaggagg ggaattgctg gctgctttta cttcctgacc aagactgaaa
240aattatttac tgaaatctgt aaaccttttt atgaaacttt taagcaccag gctgtttact
300tacacaattt aggtctgcca gaaaattcta tctgtgatag atctgtaaag agggtcaggg
360gttagagttt actatttttg aagtttacat tgttacatat gaaatggaaa cattattttg
420aaacgttgtc ataacccaat ggtgcattct gtaaccatgg agtcttctgt ttcctggggg
480aaaggggcat tcatgacctg aactttttag caaattatta ttctcagttt ccattacctg
540tttggccaaa cagattaata aaatatttga aaaagaagca ataaaaaaaa aa
592157818DNAHomo sapiensmisc_feature(60)..(60)a or g or c or
t/umisc_feature(80)..(80)a or g or c or t/umisc_feature(114)..(114)a or g
or c or t/umisc_feature(121)..(121)a or g or c or t/u 157ctgagaaagt
ccggtcccta taaggggaca tcagtgcgag acctgctccg tgctgtgagn 60acaagaggca
ccatacaagn aagctcccag ttgaggtgcg acaggcactc gccnaagtcc 120ntgatggctt
cgtccagtac tcacaaaacg gctccccccg gctggtcctt cacacgcacc 180gagccatgag
gagctggcgc ctctgagagc ctcttcctgc cctactaccc gccagactca 240gaggccagga
ggccatgccc tggggccaca gggaggtgag gtgggctgga tgccacacag 300atggtctccg
tgctggctca ctgaagagct gagcctgtgg ctggcctcag aatcaggctg 360ggtgcagtgg
ctcacacctg taatcccagc attttgggag gctgagtgag aggatcactt 420gagctcagga
gttcgagacc agcctggcca acatggcaac accccatttc tacaaaaaat 480ttgtaaaatt
agccaggcat ggtggcgcac gcctgtagtc ccagctgctt gggaggctga 540ggtgggagaa
tcacttgagc ccaggagttc gaggctgcag tgagccagga tcatgccact 600gcactccagc
ctggtccaca gagagacact gtcaccccct ttcccccaca agactggcag 660aggctgggca
gcctggggct gatgaagcag agatgttcgc tggatcccag gccctggcac 720ccctcaggaa
atacaagaaa aagaatattc acatctgttt aatgtgcata aagccaagga 780aaggacagtt
ccgaattcaa aaaaaaaaaa aaaaaaaa
818158753DNAHomo sapiens 158tttttttttt tttttttaaa tatttaactt atttatttaa
caaagtagaa gggaatccat 60tgctagcttt tctgtgttgg tgtctaatat ttgggtaggg
tgggggatcc ccaacaatca 120ggtcccctga gatagctggt cattgggctg atcattgcca
gaatcttctt ctcctggggt 180ctggcccccc aaaatgccta acccaggacc ttgggaattc
tactcatccc aaatgataat 240tccaaatgct gttacccaag gttagggtgt tgaaggaagg
tagagggtgg ggcttcaggt 300ctcaacggct tccctaacca cccctcttct cttggcccag
cctggttccc cccacttcca 360ctcccctcta ctctctctag gactgggctg atgaaggcac
tgcccaaaat ttcccctacc 420cccaactttc ccctaccccc aactttcccc accagctcca
caaccctgtt tggagctact 480gcaggaccag aagcacaaag tgcggtttcc caagcctttg
tccatctcag cccccagagt 540atatctgtgc ttggggaatc tcacacagaa actcaggagc
accccctgcc tgagctaagg 600gaggtcttat ctctcagggg gggtttaagt gccgtttgca
ataatgtcgt cttatttatt 660tagcggggtg aatattttat actgtaagtg agcaatcaga
gtataatgtt tatggtgaca 720aaattaaagg ctttcttata tgtttaaaaa aaa
753159516DNAHomo sapiens 159gccttataaa gcaccaagag
gctgccagtg ggacattttc tcggccctgc cagcccccag 60gaggaaggtg ggtctgaatc
tagcaccatg acggaactag agacagccat gggcatgatc 120atagacgtct tttcccgata
ttcgggcagc gagggcagca cgcagaccct gaccaagggg 180gagctcaagg tgctgatgga
gaaggagcta ccaggcttcc tgcagagtgg aaaagacaag 240gatgccgtgg ataaattgct
caaggacctg gacgccaatg gagatgccca ggtggacttc 300agtgagttca tcgtgttcgt
ggctgcaatc acgtctgcct gtcacaagta ctttgagaag 360gcaggactca aatgatgccc
tggagatgtc acagattcct ggcagagcca tggtcccagg 420cttcccaaaa gtgtttgtgg
caattattcc cctaggctga gcctgctcat gtacctctga 480ttaataaatg cttatgaaat
gaaaaaaaaa aaaaaa 516160354DNAHomo sapiens
160ccagcaaagt ctcttttgac cacacgcttt atccgagatg cttagaagta tatttggctg
60ttttatttgc atctttgatt aagatgtcta tcattgtaaa aaggtattca aaacaaaagt
120gtactctttt attattatga atcacattgt actgagctgt gaagtcagtg ttttaaaaat
180gtagagttta ttcatggagc atgccattga ggtttggatg gtggcaggta aaacagaaag
240gcaagatgtc atctgacatt aggctactta taaataaatg tttatctagc ttttatttca
300tgccctaatg aataaaacat gcttcgaaaa agaaagtaaa aaaaaaaaac aaaa
3541612904DNAHomo sapiens 161ggcgagagag acgctcccgc tcgccgccag ctctgattgg
cccagcggta ggaaaggtta 60aaccaaaaat ttttttacag ccctagtgtg cgcctgtagc
tcggaaaatt aattgtggct 120atagccgcct cgatcgctgt ctccccagcc tcgccgcgga
cgctccggga cgcgcccgcc 180cgccgcccgg ttctcccccc ctttgggctg gtgctgctgc
tgctgtgact gctgctgcga 240aaggaggagg aggaggagga agcagcgggg gggggagcgg
tgggtgtggg ggaaaccaag 300agtacagtgg acgaggactc accccggcgt ggtgttcttt
tttcttcttc tttttctttc 360cttttttttt tttttttcta attcctgagg ggtggttgct
gcttttgcta catgacttgc 420cagcgcccga gcctgcggtc caactgcgct gctgccggag
cgctcagtgc cgccgctgcc 480gcccgtgccc cccgcgcccc gttcggcacc caccggtcgc
cgccccgccc gcgcgccgct 540gtcccgctcc cgcgccgccg ccgccgtttc cccccgacga
ctgggtgatg ctggacatgg 600gagataggaa agaggtgaaa atgatcccca agtcctcgtt
cagcatcaac agcctggtgc 660ccgagggcct ccagaacgac aaccaccacg cgagccacgg
ccaccacaac agccaccacc 720cccagcacca ccaccaccac caccaccatc accaccaccc
gccgccgccc gccccgcaac 780cgccgccgcc gccgcagcag cagcagccgc cgccgccgcc
gagacgcggg gcccggcgcc 840gacgacgacg aggccccagc agttgttgtt ccgccgcgca
cgcacacggc gcgcctgagg 900gccaacggca gctggcgcaa ggcgaccggc gcggccgggg
gatctgcccc gtcgggccgg 960acgagaagga gaaggcccgc gccggggggg aggagaagaa
gggggcgggc gagggcggca 1020aggacgggga ggggggcaag gagggcgaga agaagaacgg
caagtacgag aagccgccgt 1080tcagctacaa cgcgctcatc atgatggcca tgcggcagag
ccccgagaag cggctcacgc 1140tcaacggcat ctacgagttc atcatgaaga acttccctta
ctaccgcgag aacaagcagg 1200gctggcagaa ctccatccgc cacaatctgt ccctcaacaa
gtgcttcgtg aaggtgccgc 1260gccactacga cgacccgggc aagggcaact actggatgct
ggacccgtcg agcgacgacg 1320tgttcatcgg cggcaccacg ggcaagctgc ggcgctccac
cacctcgccg gccaagccgg 1380ccttcaagcg cggtgccgcg ctcacctcca ccggcctcac
cttcatggac gcgccggctc 1440cctctactgg cccatgtcgc ccttcctgtc cctgcaccac
ccccgccagc agcactttga 1500gttacaacgg gaccacgtcg gcctacccca gccaccccat
gccctacagc tccgtgttga 1560ctcaaaactc gctgggcaac aaccactcct cctccaccgc
caacgggctg agcgtggacc 1620ggctggtcaa cgggggaatc ccgtacgcca cgcaccacct
cacggccgcc gcgctaaccg 1680cctcggtgcc ctgcggcctg ctggtgccct gctctgggac
ctactccctc aacccctgct 1740ccgtcaacct gctcgcgggc cagaccagtt actttttccc
ccacgtcccg cacccgtcaa 1800tgacttcgca gagcagcacg tccatgagcg ccagggccgc
gtcctcctcc acgtcgccgg 1860caggcccccc tcgacccctg ccctgtgagt ctttaagacc
ctctttgcca agttttacga 1920cgggactgtc tgggggactg tctgattatt tcacacatca
aaatcagggg tcttcttcca 1980accctttaat acattaacat ccctgggacc agactgtaag
tgaacgtttt acacacattt 2040gcattgtaaa tgataattaa aaaaataagt ccaggtattt
tttattaagc ccccccctcc 2100catttctgta cgtttgttca gtctctaggg ttgtttatta
ttctaacaag gtgtggagtg 2160tcagcgaggt gcaatgtggg gagaatacat tgtagaatat
aaggtttgga agtcaaatta 2220tagtagaatg tgtatctaaa tagtgactgc tttgccattt
cattcaaacc tgacaagtct 2280atctctaaga gccgccagat ttccatgtgt gcagtattat
aagttatcat ggaactatat 2340ggtggacgca gaccttgaga acaacctaaa ttatggggag
aattttaaaa tgttaaactg 2400taatttgtat ttaaaaagca ttcgtagtaa aggtgcccaa
gaaattattt tggccattta 2460ttgttttctc cttttcttta aagaactgtt tttttttctt
ttgtttactt ttagaccaaa 2520gattgggcgg ttctagaaaa tgcgccttgg tatactaagt
attaaaacaa acaaaaagga 2580aagttgtttc agttaacgct gcccattcaa ttgaatcaga
aggggacaaa attaacgatt 2640gccttcagtt tgtgttgtgt atattttgat gtatgtggtc
actaacaggt cacttttatt 2700ttttctaaat gtagtgaaat gttaatacct attgtactta
taggtaaacc ttgcaaatat 2760gtaacctgtg ttgcgcaaat gccgcataaa tttgagtgat
tgttaatgtt gtcttaaaat 2820ttcttgattg tgactatgtg gtcatatgcc cgtgtttgtc
acttacaaaa atgtttacta 2880tgaacacaca taaataaaaa atag
2904
162 2327 DNA Homo sapiens
162 aaaatgctta ctcttgtggg
ctacttgttg tgtggaaaaa ggaaaacgga ttcattttcc 60catcggcgac tttatgacga
cagaaatgaa ccagttctgc gattagacaa tgcaccggaa 120ccttatgatg tgagttttgg
gaattctagc tactacaatc caactttgaa tgattcagcc 180atgccagaaa gtgaagaaaa
tgcacgtgat ggcattccta tggatgacat acctccactt 240cgtacttctg tatagaacta
acagcaaaaa ggcgttaaac agcaagtgtc atctacatcc 300tagccttttg acaaattcat
ctttcaaaag gttacacaaa attactgtca cgttggattt 360tgtcaaggag aatcataaaa
gcaggagacc agtagcagaa atgtagacag gatgtatcat 420ccaaaggttt tctttcttac
aatttttggc catcctgagg catttactaa gtagccttaa 480tttgtatttt agtagtattt
tcttagtaga aaatatttgt ggaatcagat aaaactaaaa 540gatttcacca ttacagccct
gcctcataac taaataataa aaattattcc accaaaaaat 600tctaaaacaa tgaagatgac
tctttactgc tctgcctgaa gccctagtac cataattcaa 660gattgcattt tcttaaatga
aaattgaaag ggtgcttttt aaagaaaatt tgacttaaag 720ctaaaaagag gacatagccc
agagtttctg ttattgggaa attgaggcaa tagaaatgac 780agacctgtat tctagtacgt
tataattttc tagatcagca cacacatgat cagcccactg 840agttatgaag ctgacaatga
ctgcattcaa cggggccatg gcaggaaagc tgaccctacc 900caggaaagta atagcttctt
taaaagtctt caaaggtttt gggaatttta acttgtctta 960atatatctta ggcttcaatt
atttgggtgc cttaaaaact caatgagaat catggtaaaa 1020aaaaaaagtt aaccaaagaa
tatacctgta cataatttgt acagttttaa gttgttagat 1080aggaactgga tttcttatgt
attagacatt attgctcaat cataatggaa tagattctgc 1140atccctaaat gtatgaacca
taaggttaaa aaagatgaat ggaaatatca aacaactttt 1200cactgagcat cagtttcata
atcaataata taagaagatt aatttggatt ctagtatgtt 1260tcagtttgtt tttaattacc
accttccttt ggtagaaaaa atatgttcct tgatgtagga 1320aagtctaggt tttagagatt
agaggatgag atcaagagtt aaattcctaa agaagcactg 1380aatatatgaa gagagcaaac
aaatcaagta ccaacctaga ggctttattt ttgaattgat 1440tcatggtgtg tgtgtgtgtg
tgtgtgtgtg tgtgtgtgtg tgtgtaacac agaaacagct 1500ttcagaaaat aagggataga
aagtaatgaa gaaagtactt accccatatt gccataaaaa 1560tagcaaagaa gactgtccct
ccattatcga acaaatatgt cacctgagta gaaaacaaac 1620agaaatatta gtcatgcaaa
ttgattataa taagccagtg aatactgttt gcactcaggt 1680actatgattt tttctcaaat
agaatcatat tattttatag tacagaaata ttatatatga 1740attcctttca tgggtcttgc
aacaatttca catgattttt ctcatgggga gaggtgaaga 1800aacaacatta gccctcttct
ctcctctctt gattcccttt ataccccacc atcatttctg 1860attataaata attctaccat
tctatggaag tatttgtggg tcacagattg tcaaactact 1920taatgaaagt tgtatgaaat
tagtttttca ggtgaggcat tcctagttgc aattcctgtt 1980agcaaaactt ctaggagtgg
ggaagttgga aaatgcagga ttcttccagt gagccagcat 2040ttcccatagc taaccctatt
ctcttagtct ttcaaaatgt agaatgggtc caataatggc 2100tataagatgt aataaatccc
atcttaattt gttttaaaag tttcataaat cactgaacac 2160ttatgaaaca aagtgttttt
taatcagata tcaactgaaa cttcataaag gatgcatagt 2220tttataatgt tattgaatca
aattttaagg cttgtattgt ttgattttaa taaagtataa 2280tctccttttt aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaa 2327
163 1841 DNA Homo sapiens
163 ggcacgaggc tgcctgcccc ccgggtgggg ctgcggctct ggcctcccag gcccatcctc
60aacagctacc ccagccaaca ccaaggccac aaggggaccc cggcctagga ggcaggaagc
120caaggtacag agagcagcct ggccctcacc agtgcgcaag ctggggcagc aaggctgaca
180gttgctgcat gcccagggca gggtgtggta ctggcaccca agttcagcat ggcagagctg
240gccaacagct tgtccccgat ctgcctccag ccccaagatg cctacagccc ccaggcccct
300tcggcagcac tgcctctgcc cacctgcctt taagagactc cagggctgct cctgtcatgc
360agcgaaggtt ttgtctgttt caaagttcga gactcaactt gagggactgt ttttgacaat
420ccccgctgac ctccgctcct cgtggcgccc tggccctaca cccagcctgg cccagggccg
480gctttgcctg gtgaggctgg agggagcacc aggacctgct gtctgctgtc agcccctcct
540ggtgctggtg ccctgatgct gtgccttgtc acccattgag ctgcaagagg gaccaagagg
600gggccacgca gccagccaga tgcctggccc tgtgctgggg cagacaacgc tgcagagccc
660agggagcctg gcgctaggac gtgcgtcctt gtgacactgg cctgtctgaa ctcacctggc
720ctgggaagca ccgtctgccc gggcccaagc cctgcccctc cagagtccag agccaggaag
780gggctgctga gggcgagcat cctgctgggc tctctgcccg gcccacccct ccaaggggct
840ggcctgtgag ccttgactgg gattcatgat gtggaggccc ccaacttcca gaagcagctg
900gtactctgct cacacaagcg actgggccgg ccggccctgg acccctagac cccgagccgc
960ctgccgactg cctgcacagg gagagcagtt gaggcccggg cagggccccc acaccagacc
1020ccaacatagc ttccccaccc aggcaccccc tcccggggca gcaggcgtgg gagtcagggc
1080tgcatgctcc tcccctccca cctcacaggc ggccttaggc aagtcatttt ctgtcatcac
1140aaggtcgcct ctgcctagtc aggtcctggc gtccagagta aggatgtgcg gcccccaggc
1200ccccgcacac ctccctcagc accaagaccg ggaccccccc acccacgtgt ctcattgtgg
1260ctgcctatgg actcccgggc cttgtgtgca ggccaggccc ttccactgat tttttaaagt
1320gaaccattgc tggatctcag attctgtggc atctaaggcc tagcaggggt gggcacacgg
1380gtcacccgag gcccatacca agactctgtt cctgccctag gcccagtctc aaaggaagcc
1440acaaggcgcg ggggccactg aggaaggaaa tgttcatttt catttgtcca aaaccacctt
1500aagttttaag tatattaatc ttgatgcttt ttaactattg ctttttaact tgctgagatt
1560tagaaatact gttataaaaa cttttttaat ttctgtattt tttttctgta ttgtatcttc
1620atgggacatt aggggttttc tatggtaagc acacctatgg ttttggtaaa aacattatca
1680aatatatatc cagacggttc ttccctagaa gaaaaacaag tctttacacc tgataaaata
1740ttttgcgaag agaggtgttc tttttcctta ctggtgctga aaggaaggat ggataacgag
1800gagaaaataa aactgtgagg ctcaaaaaaa aaaaaaaaaa a
1841
164 848 DNA Homo sapiens
164 cctgcccttc tctatatgta ccatctccaa aaaccatgta
catctccaaa aactggagta 60gaaagttaga ttgctcaact acaactcctc tagaactcta
tagctctgac atacagattc 120acactctcct ctatttgcta agtatgtaaa gaatgttttc
ttttaaaatg ttctcttttg 180agaacaactg cttatttgtt ataaaagcat ttggttaaaa
tgatgtcatc ataaagaaca 240gtggctttgt ttcaatacat atttttgaga tgattatcta
gaagccagat taataaaatc 300agcttgtgac cttgctaagc atataaactg gaaattcaga
tacattcaaa attatgggtt 360catttaaaag tgttctacct tttgggtatg agactaatat
cactaattcc tcaatagtta 420tcatggctct atcttaatta attagaaaat atgtgtgttt
aattctttga gaattaaaat 480agagaatatt aacagagggt taaaaactgc ttcaactcca
ataagataaa ggaagctcaa 540aatctatgag ctgagtgttc aattagcttt gcctactgag
ttcaatttta tgtcaataca 600acagtggatc agacagtacg actttgaact ggtgaatgta
aacaattgtt tttcacctaa 660gctgctttgg aagaactgat gcttgctgct aactaaagtt
ttggatgtat cgatttagag 720aaccaattaa tacctgcaaa ataaagcata ctgtggtact
tctgtttgat ctagtatgtg 780tgattttaga ttgatggatt aaaaattaat aaagatcata
cattccatac caaaaaaaaa 840aaaaaaaa
848
165 1767 DNA Homo sapiens
165 ccagaagcct gcatttctgc
attctgctta attccctttc cttagatttg aaagaagcca 60acactaaacc acaaatatac
aacaaggcca ttttctcaaa cgagagtcag cctttaacga 120aatgaccatg gttgacacag
agatgccatt ctggcccacc aactttggga tcagctccgt 180ggatctctcc gtaatggaag
accactccca ctcctttgat atcaagccct tcactactgt 240tgacttctcc agcatttcta
ctccacatta cgaagacatt ccattcacaa gaacagatcc 300agtggttgca gattacaagt
atgacctgaa acttcaagag taccaaagtg caatcaaagt 360ggagcctgca tctccacctt
attattctga gaagactcag ctctacaata agcctcatga 420agagccttcc aactccctca
tggcaattga atgtcgtgtc tgtggagata aagcttctgg 480atttcactat ggagttcatg
cttgtgaagg atgcaagggt ttcttccgga gaacaatcag 540attgaagctt atctatgaca
gatgtgatct taactgtcgg atccacaaaa aaagtagaaa 600taaatgtcag tactgtcggt
ttcagaaatg ccttgcagtg gggatgtctc ataatgccat 660caggtttggg cggatgccac
aggccgagaa ggagaagctg ttggcggaga tctccagtga 720tatcgaccag ctgaatccag
agtccgctga cctccgggcc ctggcaaaac atttgtatga 780ctcatacata aagtccttcc
cgctgaccaa agcaaaggcg agggcgatct tgacaggaaa 840gacaacagac aaatcaccat
tcgttatcta tgacatgaat tccttaatga tgggagaaga 900taaaatcaag ttcaaacaca
tcacccccct gcaggagcag agcaaagagg tggccatccg 960catctttcag ggctgccagt
ttcgctccgt ggaggctgtg caggagatca cagagtatgc 1020caaaagcatt cctggttttg
taaatcttga cttgaacgac caagtaactc tcctcaaata 1080tggagtccac gagatcattt
acacaatgct ggcctccttg atgaataaag atggggttct 1140catatccgag ggccaaggct
tcatgacaag ggagtttcta aagagcctgc gaaagccttt 1200tggtgacttt atggagccca
agtttgagtt tgctgtgaag ttcaatgcac tggaattaga 1260tgacagcgac ttggcaatat
ttattgctgt cattattctc agtggagacc gcccaggttt 1320gctgaatgtg aagcccattg
aagacattca agacaacctg ctacaagccc tggagctcca 1380gctgaagctg aaccaccctg
agtcctcaca gctgtttgcc aagctgctcc agaaaatgac 1440agacctcaga cagattgtca
cggaacacgt gcagctactg caggtgatca agaagacgga 1500gacagacatg agtcttcacc
cgctcctgca ggagatctac aaggacttgt actagcagag 1560agtcctgagc cactgccaac
atttcccttc ttccagttgc actattctga gggaaaatct 1620gacacctaag aaatttactg
tgaaaaagca ttttaaaaag aaaaggtttt agaatatgat 1680ctattttatg catattgttt
ataaagacac atttacaatt tacttttaat attaaaaatt 1740accatattat gaaaaaaaaa
aaaaaaa 1767
166 8448 DNA Homo sapiens
166 gcagtggttt ctcctccttc ctcccaggaa gggccaggaa aatggccctg gtcctggaga
60tcttcaccct gctggcctcc atctgctggg tgtcggccaa tatcttcgag taccaggttg
120atgcccagcc ccttcgtccc tgtgagctgc agagggaaac ggcctttctg aagcaagcag
180actacgtgcc ccagtgtgca gaggatggca gcttccagac tgtccagtgc cagaacgacg
240gccgctcctg ctggtgtgtg ggtgccaacg gcagtgaagt gctgggcagc aggcagccag
300gacggcctgt ggcttgtctg tcattttgtc agctacagaa acagcagatc ttactgagtg
360gctacattaa cagcacagac acctcctacc tccctcagtg tcaggattca ggggactacg
420cgcctgttca gtgtgatgtg cagcatgtcc agtgctggtg tgtggacgca gaggggatgg
480aggtgtatgg gacccgccag ctggggaggc caaagcgatg tccaaggagc tgtgaaataa
540gaaatcgtcg tcttctccac ggggtgggag ataagtcacc accccagtgt tctgcggagg
600gagagtttat gcctgtccag tgcaaatttg tcaacaccac agacatgatg atttttgatc
660tggtccacag ctacaacagg tttccagatg catttgtgac cttcagttcc ttccagagga
720ggttccctga ggtatctggg tattgccact gtgctgacag ccaagggcgg gaactggctg
780agacaggttt ggagttgtta ctggatgaaa tttatgacac catttttgct ggcctggacc
840ttccttccac cttcactgaa accaccctgt accggatact gcagagacgg ttcctcgcag
900ttcaatcagt catctctggc agattccgat gccccacaaa atgtgaagtg gagcggttta
960cagcaaccag ctttggtcac ccctatgttc caagctgccg ccgaaatggc gactatcagg
1020cggtgcagtg ccagacggaa gggccctgct ggtgtgtgga cgcccagggg aaggaaatgc
1080atggaacccg gcagcaaggg gagccgccat cttgtgctga aggccaatct tgtgcctccg
1140aaaggcagca ggccttgtcc agactctact ttgggacctc aggctacttc agccagcacg
1200acctgttctc ttccccagag aaaagatggg cctctccaag agtagccaga tttgccacat
1260cctgcccacc cacgatcaag gagctctttg tggactctgg gcttctccgc ccaatggtgg
1320agggacagag ccaacagttt tctgtctcag aaaatcttct caaagaagcc atccgagcaa
1380tttttccctc ccgagggctg gctcgtcttg cccttcagtt taccaccaac ccaaagagac
1440tccagcaaaa cctttttgga gggaaatttt tggtgaatgt tggccagttt aacttgtctg
1500gagcccttgg cacaagaggc acatttaact tcagtcaatt tttccagcaa cttggtcttg
1560caagcttctt gaatggaggg agacaagaag atttggccaa gccactctct gtgggattag
1620attcaaattc ttccacagga acccctgaag ctgctaagaa ggatggtact atgaataagc
1680caactgtggg cagctttggc tttgaaatta acctacaaga gaaccaaaat gccctcaaat
1740tccttgcttc tctcctggag cttccagaat tccttctctt cttgcaacat gctatctctg
1800tgccagaaga tgtggcaaga gatttaggtg atgtgatgga aacggtactc gactcccaga
1860cctgtgagca gacacctgaa aggctatttg tcccatcatg cacgacagaa ggaagctatg
1920aggatgtcca atgcttttcc ggagagtgct ggtgtgtgaa ttcctggggc aaagagcttc
1980caggctcaag agtcagagat ggacagccaa ggtgccccac agactgtgaa aagcaaaggg
2040ctcgcatgca aagcctcatg ggcagccagc ctgctggctc caccttgttt gtccctgctt
2100gtactagtga gggacatttc ctgcctgtcc agtgcttcaa ctcagagtgc tactgtgttg
2160atgctgaggg tcaggccatt cctggaactc gaagtgcaat agggaagccc aagaaatgcc
2220ccacgccctg tcaattacag tctgagcaag ctttcctcag gacggtgcag gccctgctct
2280ctaactccag catgctaccc accctttccg acacctacat cccacagtgc agcaccgatg
2340ggcagtggag acaagtgcaa tgcaatgggc ctcctgagca ggtcttcgag ttgtaccaac
2400gatgggaggc tcagaacaag ggccaggatc tgacgcctgc caagctgcta gtgaagatca
2460tgagctacag agaagcagct tccggaaact tcagtctctt tattcaaagt ctgtatgagg
2520ctggccagca agatgtcttc ccggtgctgt cacaataccc ttctctgcaa gatgtcccac
2580tagcagcact ggaagggaaa cggccccagc ccagggagaa tatcctcctg gagccctacc
2640tcttctggca gatcttaaat ggccaactca gccaataccc ggggtcctac tcagacttca
2700gcactccttt ggcacatttt gatcttcgga actgctggtg tgtggatgag gctggccaag
2760aactggaagg aatgcggtct gagccaagca agctcccaac gtgtcctggc tcctgtgagg
2820aagcaaagct ccgtgtactg cagttcatta gggaaacgga agagattgtt tcagcttcca
2880acagttctcg gttccctctg ggggagagtt tcctggtggc caagggaatc cggctgagga
2940atgaggacct cggccttcct ccgctcttcc cgccccggga ggctttcgcg gagtttctgc
3000gtgggagtga ttacgccatt cgcctggcgg ctcagtctac cttaagcttc tatcagagac
3060gccgcttttc cccggacgac tcggctggag catccgccct tctgcggtcg ggcccctaca
3120tgccacagtg tgatgcgttt ggaagttggg agcctgtgca gtgccacgct gggactgggc
3180actgctggtg tgtagatgag aaaggagggt tcatccctgg ctcactgact gcccgctctc
3240tgcagattcc acagtgcccg acaacctgcg agaaatctcg aaccagtggg ctgctttcca
3300gttggaaaca ggctagatcc caagaaaacc catctccaaa agacctgttc gtcccagcct
3360gcctagaaac aggagaatat gccaggctgc aggcatcggg ggctggcacc tggtgtgtgg
3420accctgcatc aggagaagag ttgcggcctg gctcgagcag cagtgcccag tgcccaagcc
3480tctgcaatgt gctcaagagt ggagtcctct ctaggagagt cagcccaggc tatgtcccag
3540cctgcagggc agaggatggg ggcttttccc cagtgcaatg tgaccaggcc cagggcagct
3600gctggtgtgt catggacagc ggagaagagg tgcctgggac gcgcgtgacc gggggccagc
3660ccgcctgtga gagcccgcgg tgtccgctgc cattcaacgc gtcggaggtg gttggtggaa
3720caatcctgtg tgagacaatc tcgggcccca caggctctgc catgcagcag tgccaattgc
3780tgtgccgcca aggctcctgg agcgtgtttc caccagggcc attgatatgt agcctggaga
3840gcggacgctg ggagtcacag ctgcctcagc cccgggcctg ccaacggccc cagctgtggc
3900agaccatcca gacccaaggg cactttcagc tccagctccc gccgggcaag atgtgcagtg
3960ctgactacgc gggtttgctg cagactttcc aggttttcat attggatgag ctgacagccc
4020gcggcttctg ccagatccag gtgaagactt ttggcaccct ggtttccatt cctgtctgca
4080acaactcctc tgtgcaggtg ggttgtctga ccagggagcg tttaggagtg aatgttacat
4140ggaaatcacg gcttgaggac atcccagtgg cttctcttcc tgacttacat gacattgaga
4200gagccttggt gggcaaggat ctccttgggc gcttcacaga tctgatccag agtggctcat
4260tccagcttca tctggactcc aagacgttcc cagcggaaac catccgcttc ctccaagggg
4320accactttgg cacctctcct aggacacggt ttgggtgctc ggaaggattc taccaagtct
4380tgacaagtga ggccagtcag gacggactgg gatgcgttaa gtgccatgaa ggaagctatt
4440cccaagatga ggaatgcatt ccttgtcctg ttggattcta ccaagaacag gcagggagct
4500tggcctgtgt cccatgtcct gtgggcagaa cgaccatttc tgccggagct ttcagccaga
4560ctcactgtgt cactgactgt cagaggaacg aagcaggcct gcaatgtgac cagaatggcc
4620agtatcgagc cagccagaag gacaggggca gtgggaaggc cttctgtgtg gacggcgagg
4680ggcggaggct gccatggtgg gaaacagagg cccctcttga ggactcacag tgtttgatga
4740tgcagaagtt tgagaaggtt ccagaatcaa aggtgatctt cgacgccaat gctcctgtgg
4800ctgtcagatc caaagttcct gattctgagt tccccgtgat gcagtgcttg acagattgca
4860cagaggacga ggcctgcagc ttcttcaccg tgtccacgac ggagccagag atttcctgtg
4920atttctatgc ttggacaagt gacaatgttg cctgcatgac ttctgaccag aaacgagatg
4980cactggggaa ctcaaaggcc accagctttg gaagtcttcg ctgccaggtg aaagtgagga
5040gccatggtca agattctcca gctgtgtatt tgaaaaaggg ccaaggatcc accacaacac
5100ttcagaaacg ctttgaaccc actggtttcc aaaacatgct ttctggattg tacaacccca
5160ttgtgttctc agcctcagga gccaatctaa ccgatgctca cctcttctgt cttcttgcat
5220gcgaccgtga tctgtgttgc gatggcttcg tcctcacaca ggttcaagga ggtgccatca
5280tctgtgggtt gctgagctca cccagtgtcc tgctttgtaa tgtcaaagac tggatggatc
5340cctctgaagc ctgggctaat gctacatgtc ctggtgtgac atatgaccag gagagccacc
5400aggtgatatt gcgtcttgga gaccaggagt tcatcaagag tctgacaccc ttagaaggaa
5460ctcaagacac ctttaccaat tttcagcagg tttatctctg gaaagattct gacatggggt
5520ctcggcctga gtctatggga tgtagaaaaa acacagtgcc aaggccagca tctccaacag
5580aagcaggttt gacaacagaa cttttctccc ctgtggacct caaccaggtc attgtcaatg
5640gaaatcaatc actatccagc cagaagcact ggcttttcaa gcacctgttt tcagcccagc
5700aggcaaacct atggtgcctt tctcgttgtg tgcaggagca ctctttctgt cagctcgcag
5760agataacaga gagtgcatcc ttgtacttca cctgcaccct ctacccagag gcacaggtgt
5820gtgatgacat catggagtcc aatacccagg gctgcagact gatcctgcct cagatgccaa
5880aggccctgtt ccggaagaaa gttatactgg aagataaagt gaagaacttt tacactcgcc
5940tgccgttcca aaaactgatg gggatatcca ttagaaataa agtgcccatg tctgaaaaat
6000ctatttctaa tgggttcttt gaatgtgaac gacggtgcga tgcggaccca tgctgcactg
6060gctttggatt tctaaatgtt tcccagttaa aaggaggaga ggtgacatgt ctcactctga
6120acagcttggg aattcagatg tgcagtgagg agaatggagg agcctggcgc attttggact
6180gtggctctcc tgacattgaa gtccacacct atcccttcgg atggtaccag aagcccattg
6240ctcaaaataa tgctcccagt ttttgccctt tggttgttct gccttccctc acagagaaag
6300tgtctctgga atcgtggcag tccctggccc tctcttcagt ggttgttgat ccatccatta
6360ggcactttga tgttgcccat gtcagcactg ctgccaccag caatttctct gctgtccgag
6420acctctgttt gtcggaatgt tcccaacatg aggcctgtct catcaccact ctgcaaaccc
6480aactcggggc tgtgagatgt atgttctatg ctgatactca aagctgcaca catagtctgc
6540agggtcggaa ctgccgactt ctgcttcgtg aagaggccac ccacatctac cggaagccag
6600gaatctctct gctcagctat gaggcatctg taccttctgt gcccatttcc acccatggcc
6660ggctgctggg caggtcccag gccatccagg tgggtacctc atggaagcaa gtggaccagt
6720tccttggagt tccatatgct gccccgcccc tggcagagag gcacttccag gcaccagagc
6780ccttgaactg gacaggctcc tgggatgcca gcaagccaag ggccagctgc tggcagccag
6840gcaccagaac atccacgtct cctggagtca gtgaagattg tttgtatctc aatgtgttca
6900tccctcagaa tgtggcccct aacgcgtctg tgctggtgtt cttccacaac accatggaca
6960gggaggagag tgaaggatgg ccggctatcg acggctcctt cttggctgct gttggcaacc
7020tcatcgtggt cactgccagc taccgagtgg gtgtcttcgg cttcctgagt tctggatccg
7080gagaggtgag tggcaactgg gggctgctgg accaggtggc ggctctgacc tgggtgcaga
7140cccacatccg aggatttggc ggggaccctc ggcgcgtgtc cctggcagca gaccgtggcg
7200gggctgatgt ggccagcatc caccttctca cggccagggc caccaactcc caacttttcc
7260ggagagctgt gctgatggga ggctccgcac tctccccggc cgccgtcatc agccatgaga
7320gggctcagca gcaggcaatt gctttggcaa aggaggtcag ttgccccatg tcatccagcc
7380aagaagtggt gtcctgcctc cgccagaagc ctgccaatgt cctcaatgat gcccagacca
7440agctcctggc cgtgagtggc cctttccact actggggtcc tgtgatcgat ggccacttcc
7500tccgtgagcc tccagccaga gcactgaaga ggtctttatg ggtagaggtc gatctgctca
7560ttgggagttc tcaggacgac gggctcatca acagagcaaa ggctgtgaag caatttgagg
7620aaagtcgagg ccggaccagt agcaaaacag ccttttacca ggcactgcag aattctctgg
7680gtggcgagga ctcagatgcc cgcgtcgagg ctgctgctac atggtattac tctctggagc
7740actccacgga tgactatgcc tccttctccc gggctctgga gaatgccacc cgggactact
7800ttatcatctg ccctataatc gacatggcca gtgcctgggc aaagagggcc cgaggaaacg
7860tcttcatgta ccatgctcct gaaaactacg gccatggcag cctggagctg ctggcggatg
7920ttcagtttgc cttggggctt cccttctacc cagcctacga ggggcagttt tctctggagg
7980agaagagcct gtcgctgaaa atcatgcagt acttttccca cttcatcaga tcaggaaatc
8040ccaactaccc ttatgagttc tcacggaaag tacccacatt tgcaaccccc tggcctgact
8100ttgtaccccg tgctggtgga gagaactaca aggagttcag tgagctgctc cccaatcgac
8160agggcctgaa gaaagccgac tgctccttct ggtccaagta catctcgtct ctgaagacat
8220ctgcagatgg agccaagggc gggcagtcag cagagagtga agaggaggag ttgacggctg
8280gatctgggct aagagaagat ctcctaagcc tccaggaacc aggctctaag acctacagca
8340agtgaccagc ccttgagctc cccaaaaacc tcacccgagg ctgcccacta tggtcatctt
8400tttctctaaa atagttactt accttcaata aagtatctac atgcggtg
8448
167 4424 DNA Homo sapiens
167 agatctctcc agatcacact gtcacgtgta cctagcacat
ctcgagaact cctttgggcc 60gtctggggcc cgggaaggaa gcctgagttc tcaagattcc
aggactgaga gtgccagctt 120gtctcaaagc caggtcaatg gtttctttgc cagccattta
ggtgaccaaa cctggcagga 180atcacagcat ggcagccctt ccccatctgt aatatccaaa
gccaccgaga aagagacttt 240cactgatagt aaccaaagca aaactaaaaa gccaggcatt
tctgatgtaa ctgattactc 300agaccgtgga gattcagaca tggatgaagc cacttactcc
agcagtcagg atcatcaaac 360accaaaacag gaatcttcct cttcagtgaa tacatccaac
aagatgaatt ttaaaacttt 420tccttcatca cctcctaggt ctggagatat ctttgaggtt
gaactggcta aaaatgataa 480cagcttgggg ataagtgtca cgggaggtgt gaatacgagt
gtcagacatg gtggcattta 540tgtgaaagct gttattcccc agggagcagc agagtctgat
ggtagaattc acaaaggtga 600tcgcgtccta gctgtcaatg gagttagtct agaaggagcc
acccataagc aagctgtgga 660aacactgaga aatacaggac aggtggttca tctgttatta
gaaaagggac aatctccaac 720atctaaagaa catgtcccgg taaccccaca gtgtaccctt
tcagatcaga atgcccaagg 780tcaaggccca gaaaaagtga agaaaacaac tcaggtcaaa
gactacagct ttgtcactga 840agaaaataca tttgaggtaa aattatttaa aaatagctca
ggtctaggat tcagtttttc 900tcgagaagat aatcttatac cggagcaaat taatgccagc
atagtaaggg ttaaaaagct 960ctttcctgga cagccagcag cagaaagtgg aaaaattgat
gtaggagatg ttatcttgaa 1020agtgaatgga gcctctttga aaggactatc tcagcaggaa
gtcatatctg ctctcagggg 1080aactgctcca gaagtattct tgcttctctg cagacctcca
cctggtgtgc taccggaaat 1140tgatactgcg cttttgaccc cacttcagtc tccagcacaa
gtacttccaa acagcagtaa 1200agactcttct cagccatcat gtgtggagca aagcaccagc
tcagatgaaa atgaaatgtc 1260agacaaaagc aaaaaacagt gcaagtcccc atccagaaaa
gacagttaca gtgacagcag 1320tgggagtgga gaagatgact tagtgacagc tccagcaaac
atatcaaatt cgacctggag 1380ttcagctttg catcagactc taagcaacat ggtatcacag
gcacagagtc atcatgaagc 1440accaagagtc aagaagatac catttgtacc atgttttact
atcctcagga aaaggcccaa 1500taaaccagag tttgaggaca gtaatccttc ccctctacca
ccggatatgg ctcctgggca 1560gagttatcaa ccccaatcag aatctgcttc ctctagttcg
atggataagt atcatataca 1620tcacatttct gaaccaacta gacaagaaaa ctggacacct
ttgaaaaatg acttggaaaa 1680tcaccttgaa gactttgaac tggaagtaga actcctcatt
accctaatta aatcagaaaa 1740aggaagcctg ggttttacag taaccaaagg caatcagaga
attggttgtt atgttcatga 1800tgtcatacag gatccagcca aaagtgatgg aaggctaaaa
cctggggacc ggctcataaa 1860ggttaatgat acagatgtta ctaatatgac tcatacagat
gcagttaatc tgctccgggg 1920atccaaaaca gtcagattag ttattggacg agttctagaa
ttacccagaa taccaatgtt 1980gcctcatttg ctaccggaca taacactaac gtgcaacaaa
gaggagttgg gtttttcctt 2040atgtggaggt catgacagcc tttatcaagt ggtatatatt
agtgatatta atccaaggtc 2100cgtcgcagcc attgagggta atctccagct attagatgtc
atccattatg tgaacggagt 2160cagcacacaa ggaatgacct tggaggaagt taacagagca
ttagacatgt cacttccttc 2220attggtattg aaagcaacaa gaaatgatct tccagtggtc
cccagctcaa agaggtctgc 2280tgtttcagct ccaaagtcaa ccaaaggcaa tggttcctac
agtgtggggt cttgcagcca 2340gcctgccctc actcctaatg attcattctc cacggttgct
ggggaagaaa taaatgaaat 2400atcgtacccc aaaggaaaat gttctactta tcagataaag
ggatcaccaa acttgactct 2460gcccaaagaa tcttatatac aagaagatga catttatgat
gattcccaag aagctgaagt 2520tatccagtct ctgctggatg ttgtggatga ggagtcccag
aatcttttaa acgaaaataa 2580tgcagcagga tactcctgtg gtccaggtac attaaagatg
aatgggaagt tatcagaaga 2640gagaacagaa gatacagact gcgatggttc acctttacct
gagtatttta ctgaggccac 2700caaaatgaat ggctgtgaag aatattgtga agaaaaagta
aaaagtgaaa gcttaattca 2760gaagccacaa gaaaagaaga ctgatgatga tgaaataaca
tggggaaatg atgagttgcc 2820aatagagaga acaaaccatg aagattctga taaagatcat
tcctttctga caaacgatga 2880gctcgctgta ctccctgtcg tcaaagtgct tccctctggt
aaatacacgg gcgccaactt 2940aaaatcagtc attcgagtcc tgcgggttgc tagatcagga
attccttcta aggagctgga 3000gaatcttcaa gaattaaaac ctttggatca gtgtctaatt
gggcaaacta aggaaaacag 3060aaggaagaac agatataaaa atatacttcc ctatgatgct
acaagagtgc ctcttggaga 3120tgaaggtggc tatatcaatg ccagcttcat taagatacca
gttgggaaag aagagttcgt 3180ttacattgcc tgccaaggac cactgcctac aactgttgga
gacttctggc agatgatttg 3240ggagcaaaaa tccacagtga tagccatgat gactcaagaa
gtagaaggag aaaaaatcaa 3300atgccagcgc tattggccca acatcctagg caaaacaaca
atggtcagca acagacttcg 3360actggctctt gtgagaatgc agcagctgaa gggctttgtg
gtgagggcaa tgacccttga 3420agatattcag accagagagg tgcgccatat ttctcatctg
aatttcactg cctggccaga 3480ccatgataca ccttctcaac cagatgatct gcttactttt
atctcctaca tgagacacat 3540ccacagatca ggcccaatca ttacgcactg cagtgctggc
attggacgtt cagggaccct 3600gatttgcata gatgtggttc tgggattaat cagtcaggat
cttgattttg acatctctga 3660tttggtgcgc tgcatgagac tacaaagaca cggaatggtt
cagacagagg atcaatatat 3720tttctgctat caagtcatcc tttatgtcct gacacgtctt
caagcagaag aagagcaaaa 3780acagcagcct cagcttctga agtgacatga aaagagcctc
tggatgcatt tccatttctc 3840tccttaacct ccagcagact cctgctctct atccaaaata
aagatcacag agcagcaagt 3900tcatacaaca tgcatgttct cctctatctt agaggggtat
tcttcttgaa aataaaaaat 3960attgaaatgc tgtattttta cagctacttt aacctatgat
aattatttac aaaattttaa 4020cactaaccaa acaatgcaga tcttagggat gattaaaggc
agcatttgat gatagcagac 4080attgttacaa ggacatggtg agtctatttt taatgcacca
atcttgttta tagcaaaaat 4140gttttccaat attttaataa agtagttatt tataggcata
cttgaaacca gtatttaagc 4200tttaaatgac agtaatattg gcatagaaaa aagtagcaaa
tgtttactgt atcaatttct 4260aatgtttact atatagaatt tcctgtaata tatttatata
ctttttcatg aaaatggagt 4320tatcagttat ctgtttgtta ctgcatcatc tgtttgtaat
cattatctca ctttgtaaat 4380aaaaacacac cttaaaacat gaacaagcca aaaaaaaaaa
aaaa 4424
168 1450 DNA Homo sapiens
168 ccaggcagca
gttagcccgc cgcccgcctg tgtgtcccca gagccatgga gagagccagt 60ctgatccaga
aggccaagct ggcagagcag gccgaacgct atgaggacat ggcagccttc 120ccaggcagca
gttagcccgc cgcccgcctg tgtgtcccca gagccatgga gagagccagt 180ctgatccaga
aggccaagct ggcagagcag gccgaacgct atgaggacat ggcagccttc 240atgaaaggcg
ccgtggagaa gggcgaggag ctctcctgcg aagagcgaaa cctgctctca 300gtagcctata
agaacgtggt gggcggccag agggctgcct ggagggtgct gtccagtatt 360gagcagaaaa
gcaacgagga gggctcggag gagaaggggc ccgaggtgcg tgagtaccgg 420gagaaggtgg
agactgagct ccagggcgtg tgcgacaccg tgctgggcct gctggacagc 480cacctcatca
aggaggccgg ggacgccgag agccgggtct tctacctgaa gatgaagggt 540gactactacc
gctacctggc cgaggtggcc accggtgacg acaagaagcg catcattgac 600tcagcccggt
cagcctacca ggaggccatg gacatcagca agaaggagat gccgcccacc 660aaccccatcc
gcctgggcct ggccctgaac ttttccgtct tccactacga gatcgccaac 720agccccgagg
aggccatctc tctggccaag accactttcg acgaggccat ggctgatctg 780cacaccctca
gcgaggactc ctacaaagac agcaccctca tcatgcagct gctgcgagac 840aacctgacac
tgtggacggc cgacaacgcc ggggaagagg ggggcgaggc tccccaggag 900ccccagagct
gagtgttgcc cgccaccgcc ccgccctgcc ccctccagtc cccgccctgc 960cgagaggact
agtatggggt gggaggcccc acccttctcc cctaggcgct gttcttgctc 1020caaagggctc
cgtggagagg gactggcaga gctgaggcca cctggggctg gggatcccac 1080tcttcttgca
gctgttgagc gcacctaacc actggtcatg cccccacccc tgctctccgc 1140acccgcttcc
tcccgacccc aggaccaggc tacttctccc ctcctcttgc ctccctcctg 1200cccctgctgc
ctcttgattc gtaggaattg aggagtgtct ccgccttgtg gctgagaact 1260ggacagtggc
aggggctgga gatgggtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgcgcg 1320cgcgccagtg
caagaccgag actgagggaa agcatgtctg ctgggtgtga ccatgtttcc 1380tctcaataaa
gttcccctgt gacactcaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1440aaaaaaaaaa
1450
169 798 DNA Homo sapiens
169 cggccgcgag gccctgagat gaggctccaa agaccccgac aggccccggc
gggtgggagg 60cgcgcgcccc ggggcgggcg gggctccccc taccggccag acccggggag
aggcgcgcgg 120aggctgcgaa ggttccagaa gggcggggag ggggcgccgc gcgctgaccc
tccctgggca 180ccgctgggga cgatggcgct gctcgccttg ctgctggtcg tggccctacc
gcgggtgtgg 240acagacgcca acctgactgc gagacaacga gatccagagg actcccagcg
aacggacgag 300ggtgacaata gagtgtggtg tcatgtttgt gagagagaaa acactttcga
gtgccagaac 360ccaaggaggt gcaaatggac agagccatac tgcgttatag cggccgtgaa
aatatttcca 420cgttttttca tggttgcgaa gcagtgctcc gctggttgtg cagcgatgga
gagacccaag 480ccagaggaga agcggtttct cctggaagag cccatgccct tcttttacct
caagtgttgt 540aaaattcgct actgcaattt agaggggcca cctatcaact catcagtgtt
caaagaatat 600gctgggagca tgggtgagag ctgtggtggg ctgtggctgg ccatcctcct
gctgctggcc 660tccattgcag ccggcctcag cctgtcttga gccacgggac tgccacagac
tgagccttcc 720ggagcatgga ctcgctccag accgttgtca cctgttgcat taaacttgtt
ttctgttgat 780taaaaaaaaa aaaaaaaa
798
170 3726 DNA Homo sapiens
170 ttcagccgga acgttactcc gtgtccaccc
ggatcgtgtg tgtgatcgag gctgcggaga 60cgcctttcac ggggggtgtc gaggtggacg
tcttcgggaa actgggccgt tcgcctccca 120atgtccagtt caccttccaa cagcccaagc
ctctcagtgt ggagccgcag cagggaccgc 180aggcgggcgg caccacactg accatccacg
gcacccacct ggacacgggc tcccaggagg 240acgtgcgggt gaccctcaac ggcgtcccgt
gtaaagtgac gaagtttggg gcgcagctcc 300agtgtgtcac tggcccccag gcgacacggg
gccagatgct tctggaggtc tcctacgggg 360ggtcccccgt gcccaacccc ggcatcttct
tcacctaccg cgaaaacccc gtactgcgag 420ccttcgagcc gctacgaagc tttgccagtg
gtggccgcag catcaacgtc acgggtcagg 480gcttcagcct gatccagagg tttgccatgg
tggtcatcgc ggagcccctg cagtcctggc 540agccgccgcg ggaggctgaa tccctgcagc
ccatgacggt ggtgggtaca gactacgtgt 600tccacaatga caccaaggtc gtcttcctgt
ccccggctgt gcctgaggag ccagaggtct 660acaacctcac ggtgctgatc gagatggacg
ggcaccgtgc cctgctcaga acagaggccg 720gggccttcga gtacgtgcct gaccccaccc
ttgagaactt cacaggtggc gtcaagaagc 780aggtcaacaa gctcatccac gcccggggca
ccaatctgaa caaggcgatg acgctgcagg 840aggccgaggc cttcgtgggt gccgagcgct
gcaccatgaa gacgctgacg gagaccgacc 900tgtactgtga gcccccggag gtgcagcccc
cgcccaagcg gcggcagaaa cgagacacca 960cacacaacct gcccgagttc attgtgaagt
tcggctctcg cgagtgggtg ctgggccgcg 1020tggagtacga cacacgggtg agcgacgtgc
cgctcagcct catcttgccg ctggtcatcg 1080tgcccatggt ggtcgtcatc gcggtgtctg
tctactgcta ctggaggaag agccagcagg 1140ccgaacgaga gtatgagaag atcaagtccc
agctggaggg cctggaggag agcgtgcggg 1200accgctgcaa gaaggaattc acagacctga
tgatcgagat ggaggaccag accaacgacg 1260tgcacgaggc cggcatcccc gtgctggact
acaagaccta caccgaccgc gtcttcttcc 1320tgccctccaa ggacggcgac aaggacgtga
tgatcaccgg caagctggac atccccgagc 1380cgcggcggcc ggtggtggag caggccctct
accagttctc caacctgctg aacagcaagt 1440ctttcctcat caatttcatc cacaccctgg
agaaccagcg ggagttctcg gcccgcgcca 1500aggtctactt cgcgtccctg ctgacggtgg
cgctgcacgg gaaactggag tactacacgg 1560acatcatgca cacgctcttc ctggagctcc
tggagcagta cgtggtggcc aagaacccca 1620agctgatgct gcgcaggtct gagactgtgg
tggagaggat gctgtccaac tggatgtcca 1680tctgcctgta ccagtacctc aaggacagtg
ccggggagcc cctgtacaag ctcttcaagg 1740ccatcaaaca tcaggtggaa aagggcccgg
tggatgcggt acagaagaag gccaagtaca 1800ctctcaacga cacggggctg ctgggggatg
atgtggagta cgcacccctg acggtgagcg 1860tgatcgtgca ggacgaggga gtggacgcca
tcccggtgaa ggtcctcaac tgtgacacca 1920tctcccaggt caaggagaag atcattgacc
aggtgtaccg tgggcagccc tgctcctgct 1980ggcccaggcc agacagcgtg gtcctggagt
ggcgtccggg ctccacagcg cagatcctgt 2040cggacctgga cctgacgtca cagcgggagg
gccggtggaa gcgcgtcaac acccttatgc 2100actacaatgt ccgggatgga gccaccctca
tcctgtccaa ggtgggggtc tcccagcagc 2160cggaggacag ccagcaggac ctgcctgggg
agcgccatgc cctcctggag gaggagaacc 2220gggtgtggca cctggtgcgg ccgaccgacg
aggtggacga gggcaagtcc aagagaggca 2280gcgtgaaaga gaaggagcgg acgaaggcca
tcaccgagat ctacctgacg cggctgctct 2340cagtcaaggg cacactgcag cagtttgtgg
acaacttctt ccagagcgtg ctggcgcctg 2400ggcacgcggt gccacctgca gtcaagtact
tcttcgactt cctggacgag caggcagaga 2460agcacaacat ccaggatgaa gacaccatcc
acatctggaa gacgaacagt ttaccgctcc 2520ggttctgggt gaacatcctc aagaaccccc
acttcatctt tgacgtgcat gtccacgagg 2580tggtggacgc ctcgctgtca gtcatcgcgc
agaccttcat ggatgcctgc acgcgcacgg 2640agcataagct gagccgcgat tctcccagca
acaagctgct gtacgccaag gagatctcca 2700cctacaagaa gatggtggag gattactaca
aggggatccg gcagatggtg caggtcagcg 2760accaggacat gaacacacac ctggcagaga
tttcccgggc gcacacggac tccttgaaca 2820ccctcgtggc actccaccag ctctaccaat
acacgcagaa gtactatgac gagatcatca 2880atgccttgga ggaggatcct gccgcccaga
agacgcagct ggccttccgc ctgcagcaga 2940ttgccgctgc actggagaac aaggtcactg
acctctgacc tacaatctcc agtgctgcct 3000tgggacatag gtacctgagg tacctgagag
cccctcaggg gaggaggccg agtggctgtg 3060gctgaggccc ccaccctccc ctggaacgcg
ccccaagccg gagtgggtgc agccggaacc 3120cgcccagcgt ctagactgta gcatcttcct
ctgagcaata ccgccgggca ccgcaccagc 3180accagcccca gccccagctc cctccggccg
cagaaccagc atcgggtgtt cactgtcgag 3240tctcgagtga tttgaaaatg tgccttacgc
tgccacgctg ggggcagctg gcctccgcct 3300ccgcccacgc accagcagcc gcctccatgc
cctaggttgg gcccctgggg gatctgaggg 3360cctgtggccc ccagggcaag ttcccagatc
ctatgtctgt ctgtccacca cgagatggga 3420ggaggagaaa aagcggtacg atgccttcct
gacctcaccg gcctccccaa gggtgccggc 3480actctgggtg gactcacggc tgctgggccc
cacgtcaaag gtcaagtgag acgtaggtca 3540agtcctacgt cggggcccag acatcctggg
gtcctggtct gtcagacagg ctgccctaga 3600gccccaccca gtccgggggg actgggagca
gttccaagac caccccaccc ctttttgtaa 3660atcttgttca ttgtaaatca aatacagcgt
ctttttcact ccgaaaaaaa aaaaaaaaaa 3720aaaaaa
3726
171 2255 DNA Homo sapiens
171 gatgtgggca
cgcctcagag ccagaagttt atggctccca cctgctcaat ctgacaggaa 60gcttctgctc
cccagttctc cccagccact gtggtctaca gattccagga aacccatccc 120cctgtgacct
cagggtgtgc tctgttctcc accctaggga ccagaaggag ccaggagtaa 180agaactggct
tacttggccg ccactgggaa attctgggta attcgagacg ccctggaatt 240tggacccact
ccgctgatag gtggtgggca gggttctagg gaacacaaga ggcggagcca 300ggtggcttcc
ctgtgctggc attcttggct ctctctctct ctctttctct ctctctgtct 360ctctctctct
ctctgtctct cagccttgaa gccgtttccc tctgcgattc atgtaagtgt 420gactcgattt
cagggaaagg gaactcgcgt gggctgagga gaccggagtg gacgggctgg 480ggaaggcacc
gtgatgcccg caaccccgtc cctgaaggtg gtccatgagc tgcctgcctg 540taccctctgt
gcggggccgc tggaggatgc ggtgaccgtt ccctgtggac acaccttctg 600ccggctctgc
ctccccgcgc tctcccagat gggggcccaa tcctcgggca agatcctgct 660ctgcccgctc
tgccaagagg aggagcaggc agagactccc atggcccctg tgcccctggg 720cccgctggga
gaaacttact gcgaggagca cggcgagaag atctacttct tctgcgagaa 780cgatgccgag
ttcctctgtg tgttctgcag ggagggtccc acgcaccagg cgcacaccgt 840ggggttcctg
gacgaggcca ttcagcccta ccgggatcgt ctcaggagtc gactggaagc 900tctgagcacg
gagagagatg agattgagga tgtaaagtgt caagaagacc agaagcttca 960agtgctgctg
actcagatcg aaagcaagaa gcatcaggtg gaaacagctt ttgagaggct 1020gcagcaggag
ctggagcagc agcgatgtct cctgctggcc aggctgaggg agctggagca 1080gcagatttgg
aaggagaggg atgaatatat cacaaaggtc tctgaggaag tcacccggct 1140tggagcccag
gtcaaggagc tggaggagaa gtgtcagcag ccagcaagtg agcttctaca 1200agatgtcaga
gtcaaccaga gcaggtgtga gatgaagact tttgtgagtc ctgaggccat 1260ttctcctgac
cttgtcaaga agatccgtga tttccacagg aaaatactca ccctcccaga 1320gatgatgagg
atgttctcag aaaacttggc gcatcatctg gaaatagatt caggggtcat 1380cactctggac
cctcagaccg ccagccggag cctggttctc tcggaagaca ggaagtcagt 1440gaggtacacc
cggcagaaga agagcctgcc agacagcccc ctgcgcttcg acggcctccc 1500ggcggttctg
ggcttcccgg gcttctcctc cgggcgccac cgctggcagg ttgacctgca 1560gctgggcgac
ggcggcggct gcacggtggg ggtggccggg gagggggtga ggaggaaggg 1620agagatggga
ctcagcgccg aggacggcgt ctgggccgtg atcatctcgc accagcagtg 1680ctgggccagc
acctccccgg gcaccgacct gccgctgagc gagatcccgc gcggcgtgag 1740agtcgccctg
gactacgagg cggggcaggt gaccctccac aacgcccaga cccaggagcc 1800catcttcacc
ttcactgcct ctttctccgg caaagtcttc cctttctttg ccgtctggaa 1860aaaaggttcc
tgccttacgc tgaaaggctg aagtggggcg cgcgaagggc ggcgaagcgg 1920agacggcggc
tctccgggat ccagctccgc ccctggccag tgtgcggccc gggggctccc 1980tgtgcccgcg
tgaggcgaga gaacagggga cttgagtctc gaacagcggt tgtttttact 2040ttatttatct
taggccctca gctccctgac gtcctgagcc tccctgtgac gctctggcct 2100tctctgcacc
tcagagtgca gaaccacaga cggcttcggc tgtgcctagg gcaacagcca 2160acctaggagc
cagcgggctt tcggggaaaa aaaagaaaaa gacatctaaa ataaaatgtt 2220taaactgttt
caaaataaaa aaaaaaaaaa aaaaa
2255
172 942 DNA Homo sapiens
172 tttatacatt ctaaatctcc ccagtttctt tggggctgga
agatgcaact tccatttaat 60agaaactttg aaatcttggg gtaagggagc agtgggggga
ctagggagaa ggataagaaa 120tagaattatt gaaaagcccc caccagggac cttcctggcc
agaatatgca gagtaattcc 180tgctggcttc acctttgaaa gtccctcgaa actatgcaga
tgaaactgag tctgtttttg 240atattgtcag atgtattcta ccttggaagt cccaacacct
aaactggaat tcttgtattt 300acatctcctc cactgtcccc cacaccaccc ctcaattcct
gctgcccctg ctaatgttaa 360gcatttttct cttgttatca tcaggttcac attaaaaaca
gatacttaca aactgacttg 420aagcacagat acttttacga atgtgataaa atattttctt
aagaaaagga aagaggatgt 480gggtcaaata aaacaccgca tggatgttga ttggtgaata
ctggtgtaag aaaagggagc 540tcaggaattt ttattactgt atttgtaaat gagtttgaag
gaatttgtaa atgccactgg 600tacattttta aggtgacaca tttgctcctt ataaagttat
taaaaattac agggtaagct 660taaatgacgt ttgccagtag ttttacttta tataatcaat
attgatattg ttgctgaact 720atgtaacttt atgatgcatt tttcagtccc ttttcagagc
aaatgctttt gcaatggtag 780taatgtttag tttaaattga cttaataaat tattacctga
gcaaaaaaaa aaaaaaaaaa 840aaaaaaaaaa taaaaaaaaa aaaaaaaaaa aaaaaaaaaa
taataaaaaa aaaaaaaaca 900aacaaatcaa taaaacttaa acaaaaaaaa aataaaaaaa
aa 942
173 1070 DNA Homo sapiens
173 gcagagatcg
ccacatcgtc ggacaaggtc aaggacgggg gcggcgggaa cgagggctct 60ccatgcccac
cgtgtcccgg gcccatagcc gggcaagccc taggaggcag ccgggcgtcg 120ccggccccgg
cgccgtcacg ctcgccctcg gcgcagtgtc cttttccagg cgggacggtg 180ctgtcccggc
ctctctacta caccgcgccc ttctatcccg gctacacgaa ctatggctcc 240ttcggacacc
ttcatggcca cccggggccg gggccgggcc ccacacccgg tccggggtct 300catttcaatg
gattaaacca gaccgtgttg aaccgagcgg acgctttggc taaagacccg 360aaaatgttgc
ggagccagtc tcagctagac ctgtgcaaag actctcccta tgaattgaag 420aaaggtatgt
ccgacattta acgcgggctg cgtcggtccc ggacttttct aatttattaa 480aaacatggcc
ttggcagtta tttttccatc accgagagag agagacagag agagaaaata 540aactacccct
cctattcaga agtttatagt ttatggagat ggatgacata aaaatgtaaa 600catctccaca
cacacaaaaa aatgtcttaa ccaaccgaaa agaaaaatta aaaaaggatt 660tgtattaaat
cttattctgt atatttaatg tagcattttt gtatttaaat tgataattca 720atatctttga
agtaaattat gaaatcaaga cacctgtaca ggcatttaat gtttttttgt 780aatataaata
tatacatttg tgtttccccc aaaactgttt catagttaaa aaatacaagt 840ttaatttaat
tttttacacc tattgattct gctgggtatg agctaaagta ttacggaaag 900gaaacaggtt
atactcttag atttaaaaag tgaaagaaac tgcaggcgcc tttgtaaaat 960gcaaaatatt
taattaaaag agattttaac ataatgagag ccactcatta ctttttagaa 1020gcctcaataa
actgtccatt gccttggtca aaaaaaaaaa aaaaaaaaaa
1070
174 668 DNA Homo sapiens
misc_feature (60)..(60) a or g or c or t/u
174 atatccaaga aatttggaca cctataccta cagaataatg aaatagaaaa gatgaatctn
60acagtgatgt gtccttctat tgacccacta cattaccacc atttaacata cattcgtgtg
120gaccaaaata aactaaaaga accaataagc tcatacatct tcttctgctt ccctcatata
180cacactattt attatggtga acaacgaagc actaatggtc aaacaataca actaaagacc
240caagttttca ggagatttcc agatgatgat gatgaaagtg aagatcacga tgatcctgac
300aatgctcatg agagcccaga acaagaagga gcagaagggc actttgacct tcattattat
360gaaaatcaag aatagcaaga aactatatag gtatacactt acgacttcac aaaacctata
420cttaatatag taaatctaag taaacatgta ttactcaaag taatatattt agaattatgt
480attagtataa gatcagaatt gaatttaagt tgttggtgac atctgcatca tttcatagga
540ttagaactta ctcaaaataa tgtaaatctt taaaaatata aattagaatg acaagtggga
600atcataaatt aaacgttaat ggtttcttat gctcttttta aatatagaaa tatcatgtta
660aaaaaaaa
668
175 2953 DNA Homo sapiens
175 atgattgcaa cagtggattt aaaagtcaat gaatatgaga
aaaaccaaaa atggcttgag 60atcctaaata agattgaaaa caaaacatac acgaagctca
aaaatggaca tgtgtttagg 120aagcaggcac tgatgagtga agaaaggact ctgttatatg
atggccttgt ttactggaaa 180actgctacag gtcgtttcaa agatatccta gctctacttc
taactgatgt gctgctcttt 240ttacaagaaa aagaccagaa atacatcttt gcagccgttg
atcagaagcc atcagttatt 300tcccttcaaa agcttattgc tagagaagtt gctaatgagg
agagaggaat gtttctgatc 360agtgcttcat ctgctggtcc tgagatgtat gaaattcaca
ccaattccaa ggaggaacgc 420aataactgga tgagacggat ccagcaggct gtagaaagtt
gtcctgaaga aaaaggggga 480aggacaagtg aatctgatga agacaagagg aaagctgaag
ccagagtggc caaaattcag 540caatgtcaag aaatactcac taaccaagac caacaaattt
gtgcgtattt ggaggagaag 600ctgcatatct atgctgaact tggagaactg agcggatttg
aggacgtcca tctagagccc 660cacctcctta ttaaacctga cccaggcgag cctccccagg
cagcctcatt actggcagca 720gcactgaaag aagcattagt cacaggaggg agagaaggaa
gaggctgttc ggatgtggat 780cccgggatcc agggtgtggt aaccgacttg gccgtctctg
atgcagggga gaaggtggaa 840tgtagaaatt ttccaggttc ttcacaatca gagattatac
aagccataca gaatttaacc 900cgtctcttat acagccttca ggccgccttg accattcagg
acagccacat tgagatccac 960aggctggttc tccagcagca ggagggcctg tctctcggcc
actctatcct ccgaggcggc 1020cccttgcagg accagaagtc tcgcgacgcg gacaggcagc
atgaggagct ggccaatgtg 1080caccagcttc agcaccagct ccagcagggg cagcggcgct
ggctgcgcag gtgtgagcag 1140cagcagcggg cgcaggcgac cagggagagc tggctgcagg
agcgggagcg ggagtgccag 1200tcgcaggagg agctgctgct gcggagccgg ggcgagctgg
acctccagct ccaggagtac 1260cagcacagcc tggagcggct gagggagggc cagcgcctgg
tggagaggga gcaggcgagg 1320atgcgggccc agcagagcct gctgggccac tggaagcacg
gccggcagag gagcctgtcc 1380gcggtgctcc ttccgggtgg ccccgaggta atggaactta
atcgatctga gagtttatgt 1440catgaaaact cattcttcat caatgaagct ttagtacaaa
tgtcatttaa cactttcaac 1500aaactgaatc catcagttat ccatcaggat gccacttacc
ctacaactca atctcattct 1560gacttggtga ggactagtga acatcaagta gacctcaagg
tggacccttc tcagccttcg 1620aatgtcagtc acaaactgtg gacagccgct ggttccggcc
atcagatact tcctttccat 1680gaaagcagca aggattcttg taaaaatggc tccagtatga
caaagtgcag ttgtacgttg 1740acatctcccc cgggactgtg gactggaacc acatctactt
tgaaggattt ggacacctcc 1800cacactgagt ccccaacccc ccatgactca aattcacacc
gccctcaact gcaggcgttt 1860ataacagaag caaagctaaa tctaccgaca aggacaatga
ccagacaaga tggggaaact 1920ggagatggag ccaaagaaaa tattgtttac ctctaattgt
gttgtcattt ttccaaacaa 1980aacaaaacac tggcactttt gggagaaact ttttgtctcc
attccttatg tatgtgtgat 2040tgtctgtgtc caaattgctt taagaataat atttaatatt
tcctggaagc tcattttttt 2100ggcatgagtc taattaaatt attgaaagcc accctgtttg
tataatcttt aacttatcaa 2160atctaatttc agatttctgg aggagaaact aacttgaata
agcaggacta ttttaaaagt 2220tgttttgacg ctagagtaaa attccatgtc acattttcta
cccaatcatc tggatttcaa 2280gattcctttt aagatctcaa tgaagcaatt tggatttaaa
gagtggtatt cacaaggggt 2340gaactttcac agtcagggca gttgcctcag tgcccacata
ggcagaggag gatgtgggaa 2400agggcttttc tcagctagtt tttgtgtgct catttcttct
gggagcatta aaagtggtga 2460tctgttacag tcactattca actgggcacg tgttgtgatt
ggtcagtcac tgagccaggg 2520atacagtccg gacttgctta gtacctaagc ctaatgctgg
tggggtttca agacatggtt 2580cagcatcatc ttttaacaag gcccagaggc ccagagcccg
catcaagtca ttttgatgta 2640aatagtgaac tttgttagag ccctcacttc tatcaatcag
ctgtcctgtc cctgccagca 2700cctggagcac caactaccac tccctggaaa gaacccttcc
ctgcagtttt ttaaggacaa 2760aactgcccac tcctcattaa gtttgctgcc tggatacact
tttccacaaa ggaaaactgg 2820catatcctgc cttccgagta gtatgggtct ctgtgtgaga
aaccaggaga tattttcatc 2880ttgttcggaa atacttgtat gtattttggt gtcaataaat
atcttgtacc tcattaaaaa 2940aaaaaaaaaa aaa
2953
176 4157 DNA Homo sapiens
176 ctgagccgca tctgcaatag
cacacttgcc cggccacctg ctgccgtgag cctttgctgc 60tgaagcccct ggggtcgcct
ctacctgatg aggatgtgca cccccattag ggggctgctc 120atggcccttg cagtgatgtt
tgggacagcg atggcatttg cacccatacc ccggatcacc 180tgggagcaca gagaggtgca
cctggtgcag tttcatgagc cagacatcta caactactca 240gccttgctgc tgagcgagga
caaggacacc ttgtacatag gtgcccggga ggcggtcttc 300gctgtgaacg cactcaacat
ctccgagaag cagcatgagg tgtattggaa ggtctcagaa 360gacaaaaaag caaaatgtgc
agaaaagggg aaatcaaaac agacagagtg cctcaactac 420atccgggtgc tgcagccact
cagcgccact tccctttacg tgtgtgggac caacgcattc 480cagccggcct gtgaccacct
gaacttaaca tcctttaagt ttctggggaa aaatgaagat 540ggcaaaggaa gatgtccctt
tgacccagca cacagctaca catccgtcat ggttgatgga 600gaactttatt cggggacgtc
gtataatttt ttgggaagtg aacccatcat ctcccgaaat 660tcttcccaca gtcctctgag
gacagaatat gcaatccctt ggctgaacga gcctagtttc 720gtgtttgctg acgtgatccg
aaaaagccca gacagccccg acggcgagga tgacagggtc 780tacttcttct tcacggaggt
gtctgtggag tatgagtttg tgttcagggt gctgatccca 840cggatagcaa gagtgtgcaa
gggggaccag ggcggcctga ggaccttgca gaagaaatgg 900acctccttcc tgaaagcccg
actcatctgc tcccggccag acagcggctt ggtcttcaat 960gtgctgcggg atgtcttcgt
gctcaggtcc ccgggcctga aggtgcctgt gttctatgca 1020ctcttcaccc cacagctgaa
caacgtgggg ctgtcggcag tgtgcgccta caacctgtcc 1080acagccgagg aggtcttctc
ccacgggaag tacatgcaga gcaccacagt ggagcagtcc 1140cacaccaagt gggtgcgcta
taatggcccg gtacccaagc cgcggcctgg agcgtgcatc 1200gacagcgagg cacgggccgc
caactacacc agctccttga atttgccaga caagacgctg 1260cagttcgtta aagaccaccc
tttgatggat gactcggtaa ccccaataga caacaggccc 1320aggttaatca agaaagatgt
gaactacacc cagatcgtgg tggaccggac ccaggccctg 1380gatgggactg tctatgatgt
catgtttgtc agcacagacc ggggagctct gcacaaagcc 1440atcagcctcg agcacgctgt
tcacatcatc gaggagaccc agctcttcca ggactttgag 1500ccagtccaga ccctgctgct
gtcttcaaag aagggcaaca ggtttgtcta tgctggctct 1560aactcgggcg tggtccaggc
cccgctggcc ttctgtggga agcacggcac ctgcgaggac 1620tgtgtgctgg cgcgggaccc
ctactgcgcc tggagcccgc ccacagcgac ctgcgtggct 1680ctgcaccaga ccgagagccc
cagcaggggt ttgattcagg agatgagcgg cgatgcttct 1740gtgtgcccgg ataaaagtaa
aggaagttac cggcagcatt ttttcaagca cggtggcaca 1800gcggaactga aatgctccca
aaaatccaac ctggcccggg tcttttggaa gttccagaat 1860ggcgtgttga aggccgagag
ccccaagtac ggtcttatgg gcagaaaaaa cttgctcatc 1920ttcaacttgt cagaaggaga
cagtggggtg taccagtgcc tgtcagagga gagggttaag 1980aacaaaacgg tcttccaagt
ggtcgccaag cacgtcctgg aagtgaaggt ggttccaaag 2040cccgtagtgg cccccacctt
gtcagttgtt cagacagaag gtagtaggat tgccaccaaa 2100gtgttggtgg catccaccca
agggtcttct cccccaaccc cagccgtgca ggccacctcc 2160tccggggcca tcacccttcc
tcccaagcct gcgcccaccg gcacatcctg cgaaccaaag 2220atcgtcatca acacggtccc
ccagctccac tcggagaaaa ccatgtatct taagtccagc 2280gacaaccgcc tcctcatgtc
cctcttcctc ttcttctttg ttctcttcct ctgcctcttt 2340ttctacaact gctataaggg
atacctgccc agacagtgct tgaaattccg ctcggcccta 2400ctaattggga agaagaagcc
caagtcagat ttctgtgacc gtgagcagag cctgaaggag 2460acgttagtag agccagggag
cttctcccag cagaatgggg agcaccccaa gccagccctg 2520gacaccggct atgagaccga
gcaagacacc atcaccagca aagtccccac ggatagggag 2580gactcacaga ggatcgacga
cctttctgcc agggacaagc cctttgacgt caagtgtgag 2640ctgaagttcg ctgactcaga
cgcagatgga gactgaggcc ggctgtgcat ccccgctggt 2700gcctcggctg cgacgtgtcc
aggcgtggag agttttgtgt ttctcctgtt cagtatccga 2760gtctcgtgca gtgctgcgta
ggttagcccg catcgtgcag acaacctcag tcctcttgtc 2820tattttctct tgggttgagc
ctgtgacttg gtttctcttt gtccttttgg aaaaatgaca 2880agcattgcat cccagtcttg
tgttccgaag tcagtcggag tacttgaaga aggcccacgg 2940gcggcacgga gttcctgagc
cctttctgta gtgggggaaa ggtggctgga cctctgttgg 3000ctgagaagag catcccttca
gcttcccctc cccgtagcag ccactaaaag attatttaat 3060tccagattgg aaatgacatt
ttagtttatc agattggtaa cttatcgcct gttgtccaga 3120ttggcacgaa ccttttcttc
cacttaatta tttttttagg attttgcttt gattgtgttt 3180atgtcatggg tcattttttt
ttagttacag aagcagttgt gttaatattt agaagaagat 3240gtatatcttc cagattttgt
tatatatttg gcataaaata cggcttacgt tgcttaagat 3300tctcagggat aaacttcctt
ttgctaaatg cattctttct gcttttagaa atgtagacat 3360aaacactccc cggagcccac
tcaccttttt tctttttctt tttttttttt taactttatt 3420ccttgaggga agcattgttt
ttggagagat tttctttctg tacttcgttt tacttttctt 3480tttttttaac ttttactctc
tcgaagaaga ggaccttccc acatccacga ggtgggtttt 3540gagcaaggga aggtagcctg
gatgagctga gtggagccag gctggcccag agctgagatg 3600ggagtgcggt acaatctgga
gcccacagct gtcggtcaga acctcctgtg agacagatgg 3660aaccttcaca agggcgcctt
tggttctctg aacatctcct ttctcttctt gcttcaattg 3720cttacccact gcctgcccag
actttctatc cagcctcact gagctgccca ctactggaag 3780ggaactgggc ctcggtggcc
ggggccgcga gctgtgacca cagcaccctc aagcatacgg 3840cgctgttcct gccactgtcc
tgaagatgtg aatgggtggt acgatttcaa cactggttaa 3900tttcacactc catctccccg
ctttgtaaat acccatcggg aagagacttt ttttccatgg 3960tgaagagcaa taaactctgg
atgtttgtgc gcgtgtgtgg acagtcttat cttccagcat 4020gataggattt gaccattttg
gtgtaaacat ttgtgtttta taagatttac cttgttttta 4080tttttctact ttgaattgta
tacatttgga aagtacccaa ataaatgaga agcttctatc 4140cttaaaaaaa aaaaaaa
4157
177 1023 DNA Homo sapiens
misc_feature (4)..(4) a or g or c or t/u
misc_feature (23)..(23) a or g or c or t/u
misc_feature (28)..(28) a or g or c or t/u
misc_feature (47)..(47) a or g or c or t/u
misc_feature (53)..(53) a or g or c or t/u
misc_feature (69)..(69) a or g or c or t/u
177 cccntcccca
gaggcaggaa aancagtntg ccgaaaggat agactgnggt gcngtctttc 60cccaagttnt
gaactagttt taaggtagct taggatgaaa aatggagaat gattgggggt 120tccaaaccac
tttcttctcc cttggcttat atctcttcac catttggtgg tcaactgtgg 180gcctaccctg
gacctcatct actcagcgag aattggacat gaagctagag gcagctgcct 240tggaagggaa
gtcaggctca cttggacagc ccaggccatg gcaggaagaa tcccttcctc 300ttggggtcct
tgatgggcat gtgtgatggg gaaggagcag tctcccagcc ctgggtctgc 360tccccacatc
tctcctaatt ccacttcacc ttttgccacc ccctccccac cagaggccta 420gcccttttgt
caccgaaggc ccccagagtg tttctgtgtg aaaccctctc atttacactg 480tggcatcaaa
atccacaaaa gatggattaa ttgcactctg gttaatagca gcagcacaat 540gattaaaatc
tatattccta tcttctctag caccctggtg tggggatggg gcggaagggt 600gtcttgaggg
gcagggagga ccccataaaa caatccctcc tgcattctca ggctaaatag 660ggcccccagt
gactacctgt tcttggctgt cccctctgaa gagctctgcc ttctcacagc 720caccaccagt
tgccccactc ccaggaaaac agcacatgtt cttcttctcc tgccttgaga 780ctgcgtgtta
gtcttccatt cataactcat cagcagctca gtccttctta tgtctagtct 840cagttcattc
agccaaagct catttttgtc ctatccaaag tagaaagggt tcttttagaa 900aacttgaaga
atgtgcctcc tcttagcatc tgtttctgac tcccagttat ttttaaaata 960aatgatgaat
aaaatgcctg ccctgaaggg ttctggagga gtcaggtatc aaaaaaaaaa 1020aaa
1023
178 550 DNA Homo sapiens
178 tttttaagat gatcttgctc cgtcacccag gctggagtgc agtggcgtaa
tcatggcttc 60ctgcagcctc aaactcctgg gctcaatgag ttccttgaga tcttccatcc
tcagcttccc 120aagtagctag tagtagtagt ggcttgcacc aacgctcctg ccctaatttt
caatattttt 180tttgtagaga taggatctca ctgtgttacc caagctagac ttgaactcct
ggcctcaagc 240gatccttccg ccttggcctc ccaaagtgtt gggattacag gcattagcta
ccacacctgg 300ccaaggccca ggtttcgaca gaaagggaga gaaaacctgc cagagatgcc
atttcggagc 360cactctgctt ggcagggacc tgtgttcccc tcatgcaggt tcatccttag
agggctgcgg 420tcttatctgg ttgtgcaaaa gtcccacaac ctttctggat tgatagtttg
tggtgaaata 480aacaatttta gtttgtttgg agaatctttt gtatacaaaa tacaaataaa
acctaaatca 540aagaaacaga
550
179 2798 DNA Homo sapiens
179 gaggggccgg aggcgtcccc gctcccgctc
gctactagcc cgcgggccag cgccgcgtcc 60cgagccccgg cgggagccat ggctctaaaa
ggacaagaag attatattta tcttttcaag 120gattcaacac atccagtgga ttttctggat
gcattcagaa cattttactt ggatggatta 180tttactgata ttactcttca gtgtccttca
ggcataattt tccattgtca ccgagccgtt 240ttagctgctt gcagcaatta ttttaaggca
atgttcacag ctgacatgaa agaaaaattt 300aaaaataaaa taaaactctc tggcatccac
catgatattc tggaaggcct tgtaaattat 360gcatacactt cccaaattga aataactaaa
agaaatgttc aaagcctgct tgaggcagcg 420gatctgctac agttcctttc agtaaagaag
gcttgtgagc ggtttttggt aaggcacttg 480gatattgata attgtattgg aatgcactcc
tttgcagaat ttcatgtgtg tccagaacta 540gagaaggaat ctcgaagaat tctatgttca
aagtttaagg aagtgtggca acaagaagaa 600tttctggaaa tcagccttga aaagtttctc
tttatcttgt ccagaaagaa tctcagtgtt 660tggaaagaag aagctatcat agagccagtt
attaagtgga ctgctcatga tgtagaaaat 720cgaattgaat gcctctataa tctactgagc
tatatcaaca ttgatataga tccagtgtac 780ttaaaaacag ccttaggcct tcaaagaagc
tgcctgctca ccgaaaataa gatccgctcc 840ctaatataca atgccttgaa tcccatgcat
aaagagattt cccagaggtc cacagccaca 900atgtatataa ttggaggcta ttactggcat
cctttatcag aggttcacat atgggatcct 960ttgacaaatg tttggattca gggagcagaa
ataccagatt ataccaggga gagctatggt 1020gttacatgtt taggacccaa catttatgta
actgggggct acaggacgga taacatagaa 1080gctcttgaca cagtgtggat ctataacagt
gaaagtgatg aatggacaga aggtttgcca 1140atgctcaatg ccaggtatta ccactgtgca
gtcaccttgg gtggctgtgt ctatgcttta 1200ggtggttaca gaaaaggggc tccagcagaa
gaggctgagt tctatgatcc tttaaaagag 1260aaatggattc ctattgcaaa catgattaaa
ggtgtgggaa atgctactgc ctgtgtctta 1320catgatgtta tctacgtcat tggtggccac
tgtggctaca gaggaagctg cacctatgac 1380aaagttcaga gctacaattc cgatatcaac
gaatggagcc tcatcacctc cagtccacat 1440ccagaatatg gattgtgctc agttccgttt
gaaaataagc tctatctagt cggtggacaa 1500actacaatca cagaatgcta tgaccctgaa
caaaatgaat ggagagagat agctcccatg 1560atggaaagga ggatggagtg cggtgccgtc
atcatgaatg gatgtattta tgtcactgga 1620ggatactcct actcaaaggg aacgtatctt
cagagcattg agaaatatga tccagatctt 1680aataagtggg aaatagtggg taatcttccc
agtgccatgc ggtctcatgg gtgtgtttgt 1740gtgtataatg tctaattgaa tctgcagaaa
tgaccaagca atcacttttt tggagtatag 1800ttttataaaa aaagaatgca gggtttgaag
ttccttacct gataattgtg tctggcacat 1860gataggggat cagtaaattg taattcctaa
ccctactgta ctcccaaaca tggtgattca 1920tggtcaagaa aaatcttata tatatatata
cacacacata tatatgtgtt catatatatg 1980tatacatata tgtgtatata tacgcatgta
tgtatacata tatgtgtata tatacgcatg 2040tatgtatgca tatgtgtgta tatatacgta
tgtatgtata catatgtgta tatatacgta 2100tgtatgtata catatatgtg tatatatgcg
tatgtatgta tacatatatg tgtatatata 2160cgtatgtatg tatacatata tgtgtatata
tacgtatgta tgtatacata tatgtgtata 2220tatacgtatg tatgtataca tatatgtgtg
tatatacgtg tgtatgtata catatatgtg 2280tatatatacg tgtgtatgta tacatatatg
tgtatatatg cgtgtgtatg tatacatata 2340tgtgtatata tacgtgtgta tgtatacata
tatgtgtata tatacgtgtg tatgtataca 2400tatatgtgta tatatgcgtg tgtatatata
tacacatata tacgtatata tgtatatata 2460tatacacagt tgaatcagtg ggattaatac
ctataatctc tggttttcaa aggtaatatg 2520gaatatttga cacttggtaa aaggtgaact
acctttgtag tgaatctttt cctcttggta 2580gcatcaacac tggggataaa tcagaaccat
tctgtggaat gaaatgtttc tcaagagcct 2640ataatatagt agatagtgca tattaagatg
tctggctggg catggtggct catgcctgta 2700atcccagcac tttgggaggc tgaggcggga
ggatcacttg agcctagaag ttggagacta 2760acctggcgag accctgtctc aaaaaaaaaa
aaaaaaaa 2798
180 439 DNA Homo sapiens
180 acccttttgt gaccagctgc ataccccaaa accttttgga atctgggcta actggctgtg
60cctacatcaa cagcacccgt gaacccccgt gtgctatgct ctgtgcaaca aaacattcag
120aacccacttt caagatgctg ctgctgtgcc agtgtgacaa aaaaaagagg cgcaagcagc
180agtaccagca gagacagtcg gtcatttttc acaagcgcgc acccgagcag gccttgtaga
240atgaggttgt atcaatagca gtgacaaaac gcacacatca acccacagac cttaggagga
300ggaaggcgag ggcggggtga cttctggtga tgataaaaat ggttttatca cccagatgtg
360aaagaagctg cctgtttact gatccattga ataaacccat tttaatagaa aaagtcaata
420ccaattcagc aaaaaaaaa
439
181 1309 DNA Homo sapiens
181 tatctatgta acaaatcgca gcacaggagt cccctgggct
ccctcaggct ctggtatgac 60atatttgagc catataaatt cagcttctcc tctggcatct
gttagccgac tcacttgcaa 120ctccacctca gcagtggtct ctcagtcctc tcaaagcaag
gaaagagtac tgtgtgctga 180gagaccatgg caaagaatcc tccagagaat tgtgaagact
gtcacattct aaatgcagaa 240gcttttaaat ccaagaaaat atgtaaatca cttaagattt
gtggactggt gtttggtatc 300ctgaccctaa ctctaattgt cctgttttgg gggagcaagc
acttctggcc ggaggtaccc 360aaaaaagcct atgacatgga gcacactttc tacagcagtg
gagagaagaa gaagatttac 420atggaaattg atcctgtgac cagaactgaa atattcagaa
gcggaaatgg cactgatgaa 480acattggaag tacacgactt taaaaacgga tacactggca
tctacttcgt gggtcttcaa 540aaatgtttta tcaaaactca gattaaagtg attcctgaat
tttctgaacc agaagaggaa 600atagatgaga atgaagaaat taccacaact ttctttgaac
agtcagtgat ttgggtccca 660gcagaaaagc ctattgaaaa ccgagatttt cttaaaaatt
ccaaaattct ggagatttgt 720gataacgtga ccatgtattg gatcaatccc actctaatat
cagtttctga gttacaagac 780tttgaggagg agggagaaga tcttcacttt cctgccaacg
aaaaaaaagg gattgaacaa 840aatgaacagt gggtggtccc tcaagtgaaa gtagagaaga
cccgtcacgc cagacaagca 900agtgaggaag aacttccaat aaatgactat actgaaaatg
gaatagaatt tgatcccatg 960ctggatgaga gaggttattg ttgtatttac tgccgtcgag
gcaaccgcta ttgccgccgc 1020gtctgtgaac ctttactagg ctactaccca tatccatact
gctaccaagg aggacgagtc 1080atctgtcgtg tcatcatgcc ttgtaactgg tgggtggccc
gcatgctggg gagggtctaa 1140taggaggttt gagctcaaat gcttaaactg ctggcaacat
ataataaatg catgctattc 1200aatgaatttc tgcctatgag gcatctggcc cctggtagcc
agctctccag aattacttgt 1260aggtaattcc tctcttcatg ttctaataaa cttctacatt
atcaaaaaa 1309
182 1477 DNA Homo sapiens
182 gcggatcgct
gctccctctc gccatggcgc aggtgctgat cgtgggcgcc gggatgacag 60gaagcttgtg
cgctgcgctg ctgaggaggc agacgtccgg tcccttgtac cttgctgtgt 120gggacaaggc
tgacgactca gggggaagaa tgactacagc ctgcagtcct cataatcctc 180agtgcacagc
tgacttgggt gctcagtaca tcacctgcac tcctcattat gccaaaaaac 240accaacgttt
ttatgatgaa ctgttagcct atggcgtttt gaggcctcta agctcgccta 300ttgaaggaat
ggtgatgaaa gaaggagact gtaactttgt ggcacctcaa ggaatttctt 360caattattaa
gcattacttg aaagaatcag gtgcagaagt ctacttcaga catcgtgtga 420cacagatcaa
cctaagagat gacaaatggg aagtatccaa acaaacaggc tcccctgagc 480agtttgatct
tattgttctc acaatgccag ttcctgagat tctgcagctt caaggtgaca 540tcaccacctt
aattagtgaa tgccaaaggc agcaactgga ggctgtgagc tactcctctc 600gatatgctct
gggcctcttt tatgaagctg gtacgaagat tgatgtccct tgggctgggc 660agtacatcac
cagtaatccc tgcatacgct tcgtctccat tgataataag aagcgcaata 720tagagtcatc
agaaattggg ccttccctcg tgattcacac cactgtccca tttggagtta 780catacttgga
acacagcatt gaggatgtgc aagagttagt cttccagcag ctggaaaaca 840ttttgccggg
tttgcctcag ccaattgcta ccaaatgcca aaaatggaga cattcacagg 900ttacaaatgc
tgctgccaac tgtcctggcc aaatgactct gcatcacaaa cctttccttg 960catgtggagg
ggatggattt actcagtcca actttgatgg ctgcatcact tctgccctat 1020gtgttctgga
agctttaaag aattatattt agtgcctata tccttattct ctatatgtgt 1080attgggtttt
tattttcaca attttctgtt attgattatt ttgttttcta ttttgctaag 1140aaaaattact
ggaaaattgt tcttcactta ttatcatttt tcatgtggag tataaaatca 1200attttgtaat
tttgatagtt acaacccatg ctagaatgga aattcctcac accttgcacc 1260ttccctactt
ttctgaattg ctatgactac tccttgttgg aggaaaagtg gtacttaaaa 1320aataacaaac
gactctctca aaaaaattac attaaatcac aataacagtt tgtatgccaa 1380aaacttgatt
atccttatga aaatttcaat tctgaataaa gaataatcac attatcaaag 1440ccccatcaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaa
1477
183 3100 DNA Homo sapiens
183 actcgtctct ggtaaagtct gagcaggaca gggtggctga
ctggcagatc cagaggttcc 60cttggcagtc cacgccaggc cttcaccatg gatcagttcc
ctgaatcagt gacagaaaac 120tttgagtacg atgatttggc tgaggcctgt tatattgggg
acatcgtggt ctttgggact 180gtgttcctgt ccatattcta ctccgtcatc tttgccattg
gcctggtggg aaatttgttg 240gtagtgtttg ccctcaccaa cagcaagaag cccaagagtg
tcaccgacat ttacctcctg 300aacctggcct tgtctgatct gctgtttgta gccactttgc
ccttctggac tcactatttg 360ataaatgaaa agggcctcca caatgccatg tgcaaattca
ctaccgcctt cttcttcatc 420ggcttttttg gaagcatatt cttcatcacc gtcatcagca
ttgataggta cctggccatc 480gtcctggccg ccaactccat gaacaaccgg accgtgcagc
atggcgtcac catcagccta 540ggcgtctggg cagcagccat tttggtggca gcaccccagt
tcatgttcac aaagcagaaa 600gaaaatgaat gccttggtga ctaccccgag gtcctccagg
aaatctggcc cgtgctccgc 660aatgtggaaa caaattttct tggcttccta ctccccctgc
tcattatgag ttattgctac 720ttcagaatca tccagacgct gttttcctgc aagaaccaca
agaaagccaa agccattaaa 780ctgatccttc tggtggtcat cgtgtttttc ctcttctgga
caccctacaa cgttatgatt 840ttcctggaga cgcttaagct ctatgacttc tttcccagtt
gtgacatgag gaaggatctg 900aggctggccc tcagtgtgac tgagacggtt gcatttagcc
attgttgcct gaatcctctc 960atctatgcat ttgctgggga gaagttcaga agataccttt
accacctgta tgggaaatgc 1020ctggctgtcc tgtgtgggcg ctcagtccac gttgatttct
cctcatctga atcacaaagg 1080agcaggcatg gaagtgttct gagcagcaat tttacttacc
acacgagtga tggagatgca 1140ttgctccttc tctgaaggga atcccaaagc cttgtgtcta
cagagaacct ggagttcctg 1200aacctgatgc tgactagtga ggaaagattt ttgttgttat
ttcttacagg cacaaaatga 1260tggacccaat gcacacaaaa caaccctaga gtgttgttga
gaattgtgct caaaatttga 1320agaatgaaca aattgaactc tttgaatgac aaagagtaga
catttctctt actgcaaatg 1380tcatcagaac tttttggttt gcagatgaca aaaattcaac
tcagactagt ttagttaaat 1440gagggtggtg aatattgttc atattgtggc acaagcaaaa
gggtgtctga gccctcaaag 1500tgaggggaaa ccagggcctg agccaagcta gaattccctc
tctctgactc tcaaatcttt 1560tagtcattat agatccccca gactttacat gacacagctt
tatcaccaga gagggactga 1620cacccatgtt tctctggccc caagggaaaa ttcccaggga
agtgctctga taggccaagt 1680ttgtatcagg tgcccatccc tggaaggtgc tgttatccat
ggggaaggga tatataagat 1740ggaagcttcc agtccaatct catggagaag cagaaataca
tatttccaag aagttggatg 1800ggtgggtact attctgatta cacaaaacaa atgccacaca
tcacccttac catgtgcctg 1860atccagcctc tcccctgatt acaccagcct cgtcttcatt
aagccctctt ccatcatgtc 1920cccaaacctg caagggctcc ccactgccta ctgcatcgag
tcaaaactca aatgcttggc 1980ttctcatacg tccaccatgg ggtcctacca atagattccc
cattgcctcc tccttcccaa 2040aggactccac ccatcctatc agcctgtctc ttccatatga
cctcatgcat ctccacctgc 2100tcccaggcca gtaagggaaa tagaaaaacc ctgcccccaa
ataagaaggg atggattcca 2160accccaactc cagtagcttg ggacaaatca agcttcagtt
tcctggtctg tagaagaggg 2220ataaggtacc tttcacatag agatcatcct ttccagcatg
aggaactagc caccaactct 2280tgcaggtctc aacccttttg tctgcctctt agacttctgc
tttccacacc tgcactgctg 2340tgctgtgccc aagttgtggt gctgacaaag cttggaagag
cctgcaggtg ccttggccgc 2400gtgcatagcc cagacacaga agaggctggt tcttacgatg
gcacccagtg agcactccca 2460agtctacaga gtgatagcct tccgtaaccc aactctcctg
gactgccttg aatatcccct 2520cccagtcacc ttgtgcaagc ccctgcccat ctgggaaaat
accccatcat tcatgctact 2580gccaacctgg ggagccaggg ctatgggagc agcttttttt
tcccccctag aaacgtttgg 2640aacaatgtaa aactttaaag ctcgaaaaca attgtaataa
tgctaaagaa aaagtcatcc 2700aatctaacca catcaatatt gtcattcctg tattcacccg
tccagacctt gttcacactc 2760tcacatgttt agagttgcaa tcgtaatgta cagatggttt
tataatctga tttgttttcc 2820tcttaacgtt agaccacaaa tagtgctcgc tttctatgta
gtttggtaat tatcatttta 2880gaagactcta ccagactgtg tattcattga agtcagatgt
ggtaactgtt aaattgctgt 2940gtatctgata gctctttggc agtctatatg tttgtataat
gaatgagaga ataagtcatg 3000ttccttcaag atcatgtacc ccaatttact tgccattact
caattgataa acatttaact 3060tgtttccaat gtttagcaaa tacatatttt atagaacttc
3100
184 662 DNA Homo sapiens
184 tgaacatatt caggctgatt
ggggacgtgt cccacctggc ggccatcgtc atcttgatgg 60tagagatctg gaagacgcgc
tcctgcgccg gtatttctgg gaaaagccag cttctgtctg 120cactggtctt cacaactcgt
gacctggatc ttttcacttc atttatttca gtgtatcaca 180catctatcaa ggttatctac
gttgcctgct cgtatgccac agtgtacctg atctacctta 240aatttaaggc aacatcggat
ggaaatcatg ataccttccg agtggagttt ctggtggtcc 300ctgtgggagg cctcctcatt
tttagttaat cacgatttct ctcctcttga gtactcaagg 360gaaagaagct cagtttgcca
gcataagtgc caaagaccat cgccagcatc tgtccttcag 420ggtgttcgga cagaattctt
accacagcaa aggcataaga tgcttgatac ggaaaatcaa 480gaacttaact tttttgttgc
agatagtcat cagtggttct gtaaaaacgc agaggaaaag 540agccagaagg tttctgttta
atgcatcttg ccttatcttt ttttattact gtgcacaaag 600atttttttac acaaacatcc
ttaatgctgt tttaataaat tcagtgtgta gcttcaaaaa 660aa
662
185 5920 DNA Homo sapiens
185 ggcaggtctc gctctcggca ccctcccggc gcccgcgttc tcctggccct gcccggcatc
60ccgatggccg ccgctgggcc ccggcgctcc gtgcgcggag ccgtctgcct gcatctgctg
120ctgaccctcg tgatcttcag tcgtgatggt gaagcctgca aaaaggtgat acttaatgta
180ccttctaaac tagaggcaga caaaataatt ggcagagtta atttggaaga gtgcttcagg
240tctgcagacc tcatccggtc aagtgatcct gatttcagag ttctaaatga tgggtcagtg
300tacacagcca gggctgttgc gctgtctgat aagaaaagat catttaccat atggctttct
360gacaaaagga aacagacaca gaaagaggtt actgtgctgc tagaacatca gaagaaggta
420tcgaagacaa gacacactag agaaactgtt ctcaggcgtg ccaagaggag atgggcacct
480attccttgct ctatgcaaga gaattccttg ggccctttcc cattgtttct tcaacaagtt
540gaatctgatg cagcacagaa ctatactgtc ttctactcaa taagtggacg tggagttgat
600aaagaacctt taaatttgtt ttatatagaa agagacactg gaaatctatt ttgcactcgg
660cctgtggatc gtgaagaata tgatgttttt gatttgattg cttatgcgtc aactgcagat
720ggatattcag cagatctgcc cctcccacta cccatcaggg tagaggatga aaatgacaac
780caccctgttt tcacagaagc aatttataat tttgaagttt tggaaagtag tagacctggt
840actacagtgg gggtggtttg tgccacagac agagatgaac cggacacaat gcatacgcgc
900ctgaaataca gcattttgca gcagacacca aggtcacctg ggctcttttc tgtgcatccc
960agcacaggcg taatcaccac agtctctcat tatttggaca gagaggttgt agacaagtac
1020tcattgataa tgaaagtaca agacatggat ggccagtttt ttggattgat aggcacatca
1080acttgtatca taacagtaac agattcaaat gataatgcac ccactttcag acaaaatgct
1140tatgaagcat ttgtagagga aaatgcattc aatgtggaaa tcttacgaat acctatagaa
1200gataaggatt taattaacac tgccaattgg agagtcaatt ttaccatttt aaagggaaat
1260gaaaatggac atttcaaaat cagcacagac aaagaaacta atgaaggtgt tctttctgtt
1320gtaaagccac tgaattatga agaaaaccgt caagtgaacc tggaaattgg agtaaacaat
1380gaagcgccat ttgctagaga tattcccaga gtgacagcct tgaacagagc cttggttaca
1440gttcatgtga gggatctgga tgaggggcct gaatgcactc ctgcagccca atatgtgcgg
1500attaaagaaa acttagcagt ggggtcaaag atcaacggct ataaggcata tgaccccgaa
1560aatagaaatg gcaatggttt aaggtacaaa aaattgcatg atcctaaagg ttggatcacc
1620attgatgaaa tttcagggtc aatcataact tccaaaatcc tggataggga ggttgaaact
1680cccaaaaatg agttgtataa tattacagtc ctggcaatag acaaagatga tagatcatgt
1740actggaacac ttgctgtgaa cattgaagat gtaaatgata atccaccaga aatacttcaa
1800gaatatgtag tcatttgcaa accaaaaatg gggtataccg acattttagc tgttgatcct
1860gatgaacctg tccatggagc tccattttat ttcagtttgc ccaatacttc tccagaaatc
1920agtagactgt ggagcctcac caaagttaat gatacagctg cccgtctttc atatcagaaa
1980aatgctggat ttcaagaata taccattcct attactgtaa aagacagggc cggccaagct
2040gcaacaaaat tattgagagt taatctgtgt gaatgtactc atccaactca gtgtcgtgcg
2100acttcaagga gtacaggagt aatacttgga aaatgggcaa tccttgcaat attactgggt
2160atagcactgc tcttttctgt attgctaact ttagtatgtg gagtttttgg tgcaactaaa
2220gggaaacgtt ttcctgaaga tttagcacag caaaacttaa ttatatcaaa cacagaagca
2280cctggagacg atagagtgtg ctctgccaat ggatttatga cccaaactac caacaactct
2340agccaaggtt tttgtggtac tatgggatca ggaatgaaaa atggagggca ggaaaccatt
2400gaaatgatga aaggaggaaa ccagaccttg gaatcctgcc ggggggctgg gcatcatcat
2460accctggact cctgcagggg aggacacacg gaggtggaca actgcagata cacttactcg
2520gagtggcaca gttttactca accccgtctc ggtgaagaat ccattagagg acacactggt
2580taaaaattaa acataaaaga aattgcatcg atgtaatcag aatgaagacc gcatgccatc
2640ccaagattat gtcctcactt ataactatga gggaagagga tctccagctg gttctgtggg
2700ctgctgcagt gaaaagcagg aagaagatgg ccttgacttt ttaaataatt tggaacccaa
2760atttattaca ttagcagaag catgcacaaa gagataatgt cacagtgcta caattaggtc
2820tttgtcagac attctggagg tttccaaaaa taatattgta aagttcaatt tcaacatgta
2880tgtatatgat gatttttttc tcaattttga attatgctac tcaccaattt atatttttaa
2940agccagttgt tgcttatctt ttccaaaaag tgaaaaatgt taaaacagac aactggtaaa
3000tctcaaactc cagcactgga attaaggtct ctaaagcatc tgctcttttt tttttttacg
3060gatattttag taataaatat gctggataaa tattagtcca acaatagcta agttatgcta
3120atatcacatt attatgtatt cactttaagt gatagtttaa aaaataaaca agaaatattg
3180agtatcacta tgtgaagaaa gttttggaaa agaaacaatg aagactgaat taaattaaaa
3240atgttgcagc tcataaagaa ttgggactca cccctactgc actaccaaat tcatttgact
3300ttggaggcaa aatgtgttga agtgccctat gaagtagcaa ttttctatag gaatatagtt
3360ggaaataaat gtgtgtgtgt atattattat taatcaatgc aatatttaaa atgaaatgag
3420aacaaagagg aaaatggtaa aaacttgaaa tgaggctggg gtatagtttg tcctacaata
3480gaaaaaagag agagcttcct aggcctgggc tcttaaatgc tgcattataa ctgagtctat
3540gaggaaatag ttcctgtcca atttgtgtaa tttgtttaaa attgtaaata aattaaactt
3600ttctggtttc tgtgggaagg aaatagggaa tccaatggaa cagtagcttt gctttgcagt
3660ctgtttcaag atttctgcat ccacaagtta gtagcaaact ggggaatact cgctgcagct
3720ggggttccct gctttttggt agcaagggtc cagagatgag gtgttttttt cggggagcta
3780ataacaaaaa cattttaaaa cttaccttta ctgaagttaa atcctctatt gctgtttcta
3840ttctctctta tagtgaccaa catcttttta atttagatcc aaataaccat gtcctcctag
3900agtttagagg ctagagggag ctgaggggag gatcttactg aaagcaccct ggggagattg
3960attgtcctta aacctaagcc ccacaaactt gacacctgat caggtctggg agctacaaaa
4020tttcattttt ctcctcactg cccttcttct gagtggcatt ggcctgaatc aaggaaagcc
4080aggccttgtg ggcccccttc tttcggcttt ctgctaaagc aacacctcca gcagagattc
4140ccttaagtga ctccaggttt tccaccatcc ttcagcgtga attaattttt aatcagtttg
4200ctttctccag agaaatttta aaataataga agaaatagaa attttgaatg tataaaagaa
4260aaagatcaag ttgtcatttt agaacagagg gaactttggg agaaagcagc ccaagtaggt
4320tatttgtaca gtcagagggc aacaggaaga tgcaggcctt caagggcaag gagaggccac
4380aaggaatatg ggtgggagta aaagcaacat cgtctgcttc atactttttc ctaggcttgg
4440cactgccttt tcctttctca ggccaatggc aactgccatt tgagtccggt gagggatcag
4500ccaacctctt ctctatggct caccttattt ggagtgagaa atcaaggaga cagagctgac
4560tgcatgatga gtctgaaggc atttgcagga tgagcctgaa ctggttgtgc agaacaaaca
4620aggcattcat gggaattgtt gtattccttc tgcagccctc cttctgggca ctaagaaggt
4680ctatgaatta aatgcctatc taaaattctg atttattcct acattttctg ttttctaatt
4740tgaccctaaa atctatgtgt tttagactta gactttttat tgcccccccc cccttttttt
4800ttgagacgga gtctcgctct gacgcacagg ctggagtgca gtggctccga tctctgctca
4860ctgaaagctc cgcctcccgg gttcatgcca ttctcctgcc tcagcctcct gagtagctgg
4920gactacaggc gcccaccacc acgcccggct aattttttgt atttttaata gagacggggt
4980ttcactgtgt tagccaggat ggtctcgatc tcctgacctc gtgatccgcc tgcctcggcc
5040tcccaaagtg ctgggattac aggcatgacc caccgctccc ggccttgttt tccgtttaaa
5100gtcgtcttct tttaatgtaa tcattttgaa catgtgtgaa agttgatcat acgaattgga
5160tcaatcttga aatactcaac caaaagacag tcgagaagcc agggggagaa agaactcagg
5220gcacaaaata ttggtctgag aatggaattc tctgtaagcc tagttgctga aatttcctgc
5280tgtaaccaga agccagtttt atctaacggc tactgaaaca cccactgtgt tttgctcact
5340cccactcacc gatcaaaacc tgctacctcc ccaagacttt actagtgccg ataaactttc
5400tcaaagagca accagtatca cttccctgtt tataaaacct ctaaccatct ctttgttctt
5460tgaacatgct gaaaaccacc tggtctgcat gtatgcccga atttgtaatt cttttctctc
5520aaatgaaaat ttaattttag ggattcattt ctatattttc acatatgtag tattattatt
5580tccttatatg tgtaaggtga aatttatggt atttgagtgt gcaagaaaat atatttttaa
5640agctttcatt tttcccccag tgaatgattt agaatttttt atgtaaatat acagaatgtt
5700ttttcttact tttataagga agcagctgtc taaaatgcag tggggtttgt tttgcaatgt
5760tttaaacaga gttttagtat tgctattaaa agaagttact ttgcttttaa agaaacttgg
5820ctgcttaaaa taagcaaaaa ttggatgcat aaagtaatat ttacagatgt ggggagatgt
5880aataaaacaa tattaacttg gaaaaaaaaa aaaaaaaaaa
5920 186 696 DNA Homo sapiens misc_feature (8)..(8) a or g or c or t/u
186gactcagnct tcagccgctc tcctccccct gggcaaacag gactcatctg atgatgtgag
60aagagttcag aggagggaga aaaatcgtat tgccgcccag aagagccgac agaggcagac
120acagaaggcc gacaccctgc acctggagag cgaagacctg gagaaacaga acgcggctct
180acgcaaggag atcaagcagc tcacagagga actgaagtac ttcacgtcgg tgctgaacag
240ccacgagccc ctgtgctcgg tgctggccgc cagcacgccc tcgccccccg aggtggtgta
300cagcgcccac gcattccacc aacctcatgt cagctccccg cgcttccagc cctgagcttc
360cgatgcgggg agagcagagc ctcgggaggg gcacacagac tgtggcagag ctgcgcccat
420cccgcagagg cccctgtcca cctggagacc cggagacaga ggcctggaca aggagtgaac
480acgggaactg tcacgactgg aagggcgtga ggcctcccag cagtgccgca gcgtttcgag
540gggcgtgtgc tggaccccac cactgtgggt tgcaggccca atgcagaaga gtattaagaa
600agatgctcaa gtcccatggc acagagcaag gcgggcaggg aacggttatt tttctaaata
660aatgctttaa aagaaaaaaa aaaaaaaaaa aaaaaa
696 187 586 DNA Homo sapiens misc_feature (9)..(10) a or g or c or t/u
misc_feature (31)..(31) a or g or c or t/u
misc_feature (44)..(45) a or g or c or t/u
misc_feature (116)..(116) a or g or c or t/u
misc_feature (130)..(130) a or g or c or t/u 187atgcaaggnn taggcaaaga
ttgttgaccc nggagataga ggtnncaatg agccagatca 60ttccattgca ttccagcttg
ggcgacagaa tgagactctg tctcaaaatt aaaaancaaa 120aaaccaaaan caaatagatg
aaaaagtaga ctggagacaa ataaaagtga gtttctaaag 180gaaattcaca gtaatgctgc
attaaacact aagctcactt aggtcacttt ctagtgagct 240aaccgtaaca gagagcctac
aggatacacg tgagataatg tcacgtgtag aagatcgttg 300tgaattaaag ttcaaaatta
agacttctta gattatgatg tagattttag agctccttaa 360aacataaagc gaatcttata
aatgttcaat tctaaagtta ttccacttgg aaaaattagc 420ttttgggaca atttttaaga
acttttgtgt aaaatgcagc tccatgttta gcataatcta 480aaaataattt caagcaatcc
agaatcttcc aagaatgtta ttaaagcttt aaaacaaagc 540aaaacaaaaa gacccttttg
tgccttatat gggaagacta aaaaaa 586 188 1359 DNA Homo sapiens
188accgggcacc ggacggctcg ggtactttcg ttcttaatta ggtcatgccc gtgtgagcca
60ggaaagggct gtgtttatgg gaagccagta acactgtggc ctactatctc ttccgtggtg
120ccatctacat ttttgggact cgggaattat gaggtagagg tggaggcgga gccggatgtc
180agaggtcctg aaatagtcac catgggggaa aatgatccgc ctgctgttga agcccccttc
240tcattccgat cgctttttgg ccttgatgat ttgaaaataa gtcctgttgc accagatgca
300gatgctgttg ctgcacagat cctgtcactg ctgccattga agttttttcc aatcatcgtc
360attgggatca ttgcattgat attagcactg gccattggtc tgggcatcca cttcgactgc
420tcagggaagt acagatgtcg ctcatccttt aagtgtatcg agctgatagc tcgatgtgac
480ggagtctcgg attgcaaaga cggggaggac gagtaccgct gtgtccgggt gggtggtcag
540aatgccgtgc tccaggtgtt cacagctgct tcgtggaaga ccatgtgctc cgatgactgg
600aagggtcact acgcaaatgt tgcctgtgcc caactgggtt tcccaagcta tgtgagttca
660gataacctca gagtgagctc gctggagggg cagttccggg aggagtttgt gtccatcgat
720cacctcttgc cagatgacaa ggtgactgca ttacaccact cagtatatgt gagggaggga
780tgtgcctctg gccacgtggt taccttgcag tgcacagcct gtggtcatag aaggggctac
840agctcacgca tcgtgggtgg aaacatgtcc ttgctctcgc agtggccctg gcaggccagc
900cttcagttcc agggctacca cctgtgcggg ggctctgtca tcacgcccct gtggatcatc
960actgctgcac actgtgttta tgacttgtac ctccccaagt catggaccat ccaggtgggt
1020ctagtttccc tgttggacaa tccagcccca tcccacttgg tggagaagat tgtctaccac
1080agcaagtaca agccaaagag gctgggcaat gacatcgccc ttatgaagct ggccgggcca
1140ctcacgttca atggtacatc tgggtctcta tgtggttctg cagctcttcc tttgtttcaa
1200gaggatttgc aattgctcat tgaagcattc ttatgatggc tgctttataa tccttgtcag
1260atattaataa ttccaactcc tgattcatgt tggtgttggc atcagttgat tatcttttct
1320cattaaaatt gtgatgctcc taaaaaaaaa aaaaaaaaa
1359 189 2711 DNA Homo sapiens 189ttcagaagga ggagagacac cgggcccagg gcaccctcgc
gggcgggcgg acccaagcag 60tgagggcctg cagccggccg gccagggcag cggcaggcgc
ggcccggacc tacgggagga 120agccccgagc cctcggcggg ctgcgagcga ctccccggcg
atgcctcaca actccatcag 180atctggccat ggagggctga accagctggg aggggccttt
gtgaatggca gacctctgcc 240ggaagtggtc cgccagcgca tcgtagacct ggcccaccag
ggtgtaaggc cctgcgacat 300ctctcgccag ctccgcgtca gccatggctg cgtcagcaag
atccttggca ggtactacga 360gactggcagc atccggcctg gagtgatagg gggctccaag
cccaaggtgg ccacccccaa 420ggtggtggag aagattgggg actacaaacg ccagaaccct
accatgtttg cctgggagat 480ccgagaccgg ctcctggctg agggcgtctg tgacaatgac
actgtgccca gtgtcagctc 540cattaataga atcatccgga ccaaagtgca gcaaccattc
aacctcccta tggacagctg 600cgtggccacc aagtccctga gtcccggaca cacgctgatc
cccagctcag ctgtaactcc 660cccggagtca ccccagtcgg attccctggg ctccacctac
tccatcaatg ggctcctggg 720catcgctcag cctggcagcg acaagaggaa aatggatgac
agtgatcagg atagctgccg 780actaagcatt gactcacaga gcagcagcag cggaccccga
aagcaccttc gcacggatgc 840cttcagccag caccacctcg agccgctcga gtgcccattt
gagcggcagc actacccaga 900ggcctatgcc tcccccagcc acaccaaagg cgagcagggc
ctctacccgc tgcccttgct 960caacagcacc ctggacgacg ggaaggccac cctgacccct
tccaacacgc cactggggcg 1020caacctctcg actcaccaga cctaccccgt ggtggcagat
cctcactcac ccttggccat 1080aaagcaggaa acccccgagg tgtccagttc tagctccacc
ccttgctctt tatctagctc 1140cgcccttttg gatctgcagc aagtcggctc cggggtcccg
cccttcaatg cctttcccca 1200tgctgcctcc gtgtacgggc agttcacggg ccaggccctc
ctctcagggc gagagatggt 1260ggggcccacg ctgcccggat acccacccca catccccacc
agcggacagg gcagctatgc 1320ctcctctgcc atcgcaggca tggtggcagg aagtgaatac
tctggcaatg cctatggcca 1380caccccctac tcctcctaca gcgaggcctg gggcttcccc
aactccagct tgctgagttc 1440cccatattat tacagttcca catcaaggcc gagtgcaccg
cccaccactg ccacggcctt 1500tgaccatctg tagttgccat ggggacagtg ggagcgactg
agcaacagga ggactcagcc 1560tgggacaggc cccagagagt cacacaaagg aatctttatt
attacatgaa aaataaccac 1620aagtccagca ttgcggcaca ctccctgtgt ggttaattta
atgaaccatg aaagacagga 1680tgaccttgga caaggccaaa ctgtcctcca agactcctta
atgaggggca ggagtcccag 1740ggaaagagaa ccatgccatg ctgaaaaaga caaaattgaa
gaagaaatgt agccccagcc 1800ggtaccctcc aaaggagaga agaagcaata gccgaggaac
ttggggggat ggcgaatggt 1860tcctgcccgg gcccaagggt gcacagggca cctccatggc
tccattatta acacaactct 1920agcaattatg gaccataagc acttccctcc agcccacaag
tcacagcctg gtgccgaggc 1980tctgctcacc agccacccag ggagtcacct ccctcagcct
cccgcctgcc ccacacggag 2040gctctggctg tcctctttcc tccactccat ttgcttggct
ctttctacac ctccctcttg 2100gatgggctga gggctggagc gagtccctca gaaattccac
caggctgtca gctgacctct 2160ttttcctgct gctgtgaagg tatagcacca cccaggtcct
cctgcagtgc ggcatcccct 2220tggcagctgc cgtcagccag gccagcccca gggagcttaa
aacagacatt ccacagggcc 2280tgggcccctg ggaggtgagg tgtggtgtgc ggcttcaccc
agggcagaac aaggcagaat 2340cgcaggaaac ccgcttcccc ttcctgacag ctcctgccaa
gccaaatgtg cttcctgcag 2400ctcacgccca ccagctactg aagggaccca aggcaccccc
tgaagccagc gatagagggt 2460ccctctctgc tccccagcag ctcctgcccc caaggcctga
ctgtatatac tgtaaatgaa 2520actttgtttg ggtcaagctt ccttctttct aacccccaga
ctttggcctc tgagtgaaat 2580gtctctcttt gccctgtggg gcttctctcc ttgatgcttc
tttctttttt taaagacaac 2640ctgccattac cacatgactc aataaaccat tgctcttcaa
aaaaaaaaaa aaaaaaaaaa 2700aaaaaaaaaa a
2711 190 3323 DNA Homo sapiens 190tgcttcataa aatttaccta
agcaagtggt cttgcttgcc tcaaatccaa gcagtcttga 60acacttggag gcaattaatg
agtatatctt agtcaaaaga attgttggag ctttttatta 120aagctgcagt ttcagttctg
cttttgggga attgtgctat gaaagcagct gccaaaataa 180gctcatttat tttcttcaat
cccactcagt gctcagtcac tatattctgt ttcctttttt 240tttttcaagt tgcatatttg
gtttcccctt atgattggga aagatgaatt ttcagcagaa 300aacagtgttt gttcactttc
aaagagtgat agtttctaaa acatttagag caataaatat 360tcatcagagg taccaagtaa
gccagcagaa gagttaaggg ttagagaaat cccttatttc 420atgtcttgac tctaaaatga
tcaaagtact tttccttgta atgtggattt cttcttatgc 480ggatatgcaa aaacttcagt
tatacgtagt aatgctagca ggtaatttta gtggacattt 540tataacaact gtcactttgt
tttgccacat gtagagtttg ttcagctatt ttccagatat 600ctccccacaa aaggaggcaa
agggtaccag cttttcaatg agcattacct attacttggc 660aaagatgatg aagactctat
taatagttca tttgataaat gttgacataa ccaacaatag 720agattaggaa gttagtttta
agaaatcaat agcatataga cattaccctc atggagtttg 780tattctacta cttgaactga
ttgtagctat aaaagcatag ttagatagct gaatagttag 840atcataagca aagaaggcca
gaacacatct cttatcaaga aatcaatgaa tagtttatct 900catttttaaa gcaactttat
ccttctttaa ttccttcctt tcttctagtg caaaactact 960taataaggtt ggtgtttagg
ttagtgttca caccattcct catctggtgt gaattacctt 1020ctctttcttt actatttact
accaacctag tacatgtgtt gactgaattc ttttcaaaca 1080atgttgagtt atcatggtgc
acctaataaa ttaacaccac agattacagc atccttgctg 1140attttctcag caaagccaga
ttagatggaa ataaacaaag aaaatgatcc tagagtgaat 1200ttttctagaa aatatctatt
atgaaccatg ctgtttaaag tattagcttg aaggtgatgg 1260atccagctat tcagaaaata
actttcatat aaccatgatt ttgcacagta tgaggtctta 1320aatgtgtgga aagagataaa
ttttttatca ttaccacaaa ccccttttaa agattcaaag 1380gtggaagaaa gtgatttatt
ttttctcttc agcatacata tataaaagac ttgtcagatg 1440tttaatttgg ggaggttgat
aatgaaacat atcaacagag tatagtagtt atagtagtgt 1500ttgtgggtaa ataatttcct
ggggtcagac atatataaac atatttgctt caaaatgata 1560aaggcatgaa atcagtctta
aaaattgaaa tgggggtgat gggggagaaa aagaagaaca 1620aatttgaagt gccctttcaa
atctgctgga tacaagtatt gaagttttaa gtcatcttat 1680tctgtctgaa agtgtatttt
tcattctaca atagacccaa tcaacaagac gtataacttg 1740agttgcatga tgttcagttt
atgtaatcta ctgttgggat ggtaagaatt gatgtaggct 1800gtggtgtaag aatgaattaa
aatatagttt cactggcttt tctctacata tccactatca 1860caatggctag gtttcctgtt
gctcactgtt ggattctgga gaaaaattta atgaaagatg 1920atatcagagg aagaataagt
ggaggtagag aagaaaggag tgatagagga ggggaaaaaa 1980acaaaacata tttttgtgtt
atccaaagga gctttttcct tattctgtca agcattgaga 2040tcttcttcag ctttcaatgt
agttgctaaa tacaaataat gctactaggt agtgactaaa 2100tatagcaaac acttcatcag
atattagaat taggtcacac tattgaggtt ataatctgaa 2160ggttgtgtta catagaaacc
actttagatt attatcaact tgggctaggc tttattttat 2220aatagcatag taagtaatat
ctattgtgtc atttcttcaa ccattttatt ctaagatcca 2280tgaagcttct tgaggccaaa
taaaataata agtttagaca agaagtagat tgtgactttt 2340tttcccttag agatactatt
tactatctcc tatcctgata ggtggaaggt ttactgaatt 2400ggaaattggt tgactattag
tttttaacta aaatgtgcaa taacacattg cagtttcctc 2460aaactagttt cctatgatca
ttaaactcat tctcagggtt aagaaaggaa tgtaaatttc 2520tgcctcaatt tgtacttcat
caataagttt ttgaagagtg cagattttta gtcaggtctt 2580aaaaataaac tcacaaatct
ggatgcattt ctaaattctg caaatgtttc ctggggtgac 2640ttaacaagga ataatcccac
aatataccta gctacctaat acatggagct ggggctcaac 2700ccactgtttt taaggatttg
cgcttacttg tggctgagga aaaataagta gttcgaggaa 2760gtagttttta aatgtgagct
tatagataga aacagaatat caacttaatt atgaaattgt 2820tagaacctgt tctcttgtat
ctgaatctga ttgcaattac tattgtactg atagactcca 2880gccattgcaa gtctcagata
tcttagctgt gtagtgattc ttgaaattct ttttaagaaa 2940aattgagtag aaagaaataa
accctttgta aatgaggctt ggcttttgtg aaagatcatc 3000cgcaggctat gttaaaagga
ttttagctca ctaaaagtgt aataatggaa atgtggaaaa 3060tatcgtaggt aaaggaaact
acctcatgct ctgaaggttt tgtagaagca caattaaaca 3120tctaaaatgg ctttgttaca
ccagagccat ctggtgtgaa gaactctata tttgtatgtt 3180gagagggcat ggaataattg
tattttgctg gcaatagaca cattctttat tatttgcaga 3240ttcctcatca aatctgtaat
tatgcacagt ttctgttatc aataaaacaa aagaatcctg 3300ttaaaaaaaa aaaaaaaaaa
aaa 3323 191 671 DNA Homo sapiens
misc_feature (21)..(21) a or g or c or t/u
misc_feature (49)..(49) a or g or c or t/u 191tggctctctc cttcaaaagg nccaggccct gtcccccttt ctccccgant
ccaaccccag 60ctcccctgtg aagaaaaaag ttaaaaaatt tgttatttat ttgctttttg
cgttgggatg 120ggttcgtgtc cagtcccggg ggtctgatat ggccatcaca ggctgggtgt
tcccagcagc 180cctggcttgg gggcttgacg cccttcccct tgccccaggc catcatctcc
ccacctctcc 240tcccctctcc tcagttttgc cgactgcttt tcatctgagt caccatttac
tccaagcatg 300tattccagac ttgtcactga ctttccttct ggagcaggtg gctagaaaaa
gaggctgtgg 360gcaggaaaga aaggctcctg tttctcattt gtgaggccag cctctggctt
ttctgccgtg 420gattctcccc ctgtcttctc ccctcagcaa ttcctgcaaa gggttaaaaa
tttaactggt 480ttttactact gatgacttga tttaaaaaaa atacaaagat gctggatgct
aacttgatac 540taaccatcag attgtacagt ttggttgttg ctgtaaatat ggtagcgttt
tgttgttgtt 600gttttttcat gccccatact actgaataaa ctagttctgt gcgggtaaaa
aaaaaaaaaa 660aaaaaaaaaa a
671 192 3485 DNA Homo sapiens 192cacaaagaaa aaagaaatac ctgtagaagc
gcatcgaaag ctcctggaac agagttgtgt 60ctcatatttg caaagatgca gaaaaaataa
acccgggaca tccagctttc ttttcctttc 120ttctttgact attctgagaa gctatgcgac
taggagcaca ttttaggtaa acacgtggct 180tgagtagcca taaggccact cttccctgtc
gtgtgacccg cgcctgggcc tttaagagat 240attggtgttt gaaaagggag gaatctgttt
gccctcagat atttagttca actgcctgca 300ttgcttccta ttttgttgtc caactctgta
gtagttagca ctggccttac caacatgtaa 360agaaattttc tttactgccc catgagtagt
tggaggcaaa gagaaatttt taaagcgcag 420aaaaaggcct gcagggagat ggaatttgtt
ctgccagaga aacgagatga tagctgtatt 480taataaagtt actgacctct tgtcaaaatt
taaaacgcaa aagaagatgt ttcaaaatgc 540agagaatgtc agaaaacaaa aactacaggg
accagaccag tataatgttt agttttcatt 600atactaactt ttgtctagac tggagttgat
tcactatttt ttctttaact cctcaggaag 660caaaccttcc cgatgatgaa gacttcttga
aggatttcat gggtgatttg ggatcccagg 720accatttggc tagtgtgcct aggtgaccac
atgattgctg ttttaccagg aatgcagcat 780cccattgaca aaacaagtgc tctgagaagg
tttaaaatac tacagagaat atgggaacac 840agaccttgaa atttagctga gttgtaacag
ctgaaactcc aagaggtgtc ttccttgttt 900gaggtgaaac tagtgttgct tccagagggc
agctggaaac cgtaaagctg tttggaaatc 960tttttgactg acttgctgac aaagaggtac
tgtgatgcat tttaacaata tctaagttga 1020ttttttttta aatcaaggaa aataaaaacc
aagcatgaat gctatggtat gtgccccttt 1080tgaccatcct gggctgatta acatcattta
aatcaaagta atcataaaaa ggcatattct 1140acttcaatta tgtggtcaaa taagagtaaa
cacacacact cacacatgct gaccccaatt 1200gccagagcat tactgcacta taaattacgg
ttaattccca aattatacta ctgtttatct 1260tatttaacaa gtcagaaagc acttttaaaa
taacttgagg gctacaaggt cattctatta 1320atgtcattct ccattcgggt tgtaggcatg
tggaagtacc cattaaaaga taagttagag 1380tttaaatact gataaacaaa accttttatt
gcaactggac agtttctgga gagttagcgg 1440aagaatcttg gagtttcctt tggtcagatg
aatacaacat ttcacttttg cagcactatt 1500tagaatgtac tccatggttc tcttgttccc
aacttccaaa aagaacagaa aactttggtt 1560tacacagaac acgggcatct gaggcaggac
ctcttccctg ccctttgatc tgactcacac 1620ctccacatat gacgtaatca acccaaattt
gacaccaatt cactcttttc tgcaaagggc 1680atattttgaa acaagggaca gcctgagggc
ggctataatg agaatgttca tgggggttac 1740tgggtcccta attctgaact tgcttatgac
acccagagtg aatagattca gattcagaac 1800cttctgagaa ataacccaaa gaaaatttgt
tacccagcca attcttcgaa agcttaatat 1860caaaatatat cttttcaaga agaaaatcgt
tagagagaag aatgtggagg ggagagaaat 1920gggtttctca ttgatatgat attttgttaa
ccatttcatt ttgaattatt caagttttgg 1980ttaatattgt attctttttt cgtaactatt
ttaccgtgag agtaggtcat tgggttactt 2040agatatttat ttttacacag ttattagtct
tcagatagtt ttattttact tcatatgatt 2100ttagtttttg tcagtataat tttaaatcat
gtttttcttg gtcatctctt tgtgtatatt 2160gtgtaattgg attttcattg actgcaagtg
gagtgtttgc cactcaattc agtactcagt 2220actatggtga cttgttttca aataagtctc
agatacacat ttagggagcc tttgctggcc 2280gaatatagac tctgtcagga cagcaggtcc
cctgatctaa gaattttccc caatggttgc 2340tctaaaaatg ctgctatttt gctgttcact
gtattgcact tagttaaaaa gaagataatg 2400tgaaagatga gagcagtttt ttaaaggatc
ttttcatata cccaattccc ttattttcag 2460atgtcccatc aattttagat atgaaagctt
taagtaaaag tgtgtatgcc tttctactgt 2520cagaacagga tggatgcagc ctgggtcaga
tttatttaag ataaaaatca tgcagactca 2580tcattcatat cataggtgaa aaatgtaaaa
accaaatggt ttccactaaa gccaccaaga 2640tcttttagaa atgtttgcac ctttggtggt
ggcacaggaa aagagaagaa ttcagctgga 2700gtgaattcta gaagtagata tcagaaacgg
ggcatgaaga acaggggaac tgggtggcat 2760cagactccta aagaagtgag ttaattttcc
ttcccttcca ttcagattca tgccacagct 2820ccatatcttg agtatgtgta agaggtgagt
tccttcttca gccaggggcg gtggctcatg 2880cctttaatcc caatgctttg ggaggccaag
gtgggaggat cacttgtgcc ttggggttca 2940aggttgcagt gaaccatgat tgcaccactg
cactccagcc tgagtgacag agcaagaccc 3000tgtctctaaa aatatatata aaaagtaaaa
ctaaagaact tcttgcctaa acctgaatta 3060ccgcaatttg ctgagtgact ttgagaaaaa
tcagactgtt tagttcagtc gggatgaaaa 3120gcttgcgatt gcttcccaca agaatgggca
atagtgacgg ctgcaaggta cttttatttg 3180ttcatgaaag aacgacaatt tttcaaaatg
taattaaaca taatagaatg ttttaaacta 3240ctgggcactg aaactggaag aaaaaggagg
ctttattgaa cattcccctt tttcagttgg 3300ttcaaagttc agcactgtgg ttatcattgg
tgatgccaga aaacattagt agacttagac 3360aattgctatg gcagtttcta aacagagctt
tttctataca ctatttgcaa ctggagtgca 3420atattgtata ttctgtgtta aagaaataaa
gtatttttat catttattaa aaaaaaaaaa 3480aaaaa
3485 193 1915 DNA Homo sapiens 193ccatccagaa
cgatgaggcc gtggccccgc tcatgaagta cctggatgag aagctggccc 60tgctgaacgc
ctcgctggtg aaggggaacc tgagcagggt gctggaggcc ctgtgggagc 120tactcctcca
ggccattctg caggcgctgg gtgcaaaccg tgacgtctct gctgatttct 180acagccgctt
ccatttcacg ctggaggccc tggtcagttt tttccacgca gagggtcagg 240gtttgcccct
ggagagcctg agggatggaa gctacaagag gctgaaggag gagctgcggc 300tgcacaaatg
ttccacccgc gagtgcatcg agcagttcta cctggacaag ctcaaacaga 360ggaccctgga
gcagaaccgg tttggacgcc tgagcgtccg ttgccattac gaggcggctg 420agcagcggct
ggccgtggag gtgctgcacg ccgcggacct gctccccctg gatgccaacg 480gcttaagtga
cccctttgtg attgtggagc tgggcccacc gcatctcttt ccactggtcc 540gcagccagag
gacccaggtg aagacccgga cgctgcaccc tgtatacgac gaactcttct 600acttttccgt
gcctgccgag gcgtgccgcc gccgcgcggc ctgtgtgttg ttcaccgtca 660tggaccacga
ctggctgtcc accaacgact tcgctgggga ggcggccctc ggcctaggtg 720gcgtcactgg
tgtcgcccgg ccccaggtgg gcgggggtgc aagggctggg cagcctgtca 780ccctgcacct
gtgccggccc agagcccagg tgagatctgc gctgaggagg ctggaaggcc 840gcaccagcaa
ggaggcgcag gagttcgtga agaaactcaa ggagctggag aagtgcatgg 900aggcggaccc
ctgagtccat cagctgccag ccccggccct ggcccccacc ccaagttccc 960tgaagcatcc
tccagctcac tgtggccagc tttgtgcaac cagggcccac ggcgcccctc 1020ctgtgctgtg
acgtgtgtgt cgtggctggc cccgcggcgc ctaccgccct ggccgtgtct 1080gtctggtgtg
tgctgtgaac ccctgcaccc aaccccacat ctgggtggcc aacttggcag 1140gacttggcca
gcagctgccc aggacacagt gcaggccaga gcgggcttga ccacctggtg 1200ggcctccctg
cccgcttcct tgggctcccc ggccctgggt gggcggtgcg cagctggtct 1260ccagggactc
agtgagtggc tgtgctctct gcacaacggg caatgtgcag acgcattttt 1320ggtaatcaca
gctggggagt gaaaagggtg ccactggcac cactgggtgg atggtccaga 1380gcctccaccc
acagagggga tgcaaagggc aggtgagtca agaaccgcat aggtctccag 1440tccccacggg
gctcccaggc cggggaaagg ttcccctgag gtcactctga ggccagggac 1500gtcacccaag
gctggtggtc agtgtgaagg gctccgtgcc aactggtcag ctgtccttca 1560cgcacatatc
cgtggccacc tgagacctgc tccacgaccc ttccaggcag agccgagagt 1620tcgccccaac
ccttccccag gcccagtgtg aaaaacagac tcacaagggg cttcttggcc 1680tgcagcttca
tttgcgagag cgccgaggca ggacacagag cacagctgtg ctggaagtgt 1740ggggagaacc
cggacagctc agtcctgcca gcagccgcaa agagccgagg ctgccaggcc 1800catttatgtc
cctcatgtct ctagattttc tcgtcaccca gcctcaaaaa tatatgtgtc 1860tgcaaccctc
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaa
1915 194 2681 DNA Homo sapiens 194gggggggctc cgtgacagcc aacgcagtga ccctcgcccc
ttccttggca gcacatcatg 60cttgtgcagc ggcagatgtc tgtgatggaa gaggacctgg
aagaattcca gctcgctctg 120aaacactacg tggagagtgc ttcctcccaa agtggatgct
tgcgtatttc tatacagaag 180ctttcaaatg aatctcgcta catgatctat gagttctggg
agaatagtag tgtatggaat 240agccaccttc agacaaatta tagcaagaca ttccaaagaa
gtaatgtgga tttcttggaa 300actccagaac tcacatctac aatgctagtt cctgcttcgt
ggtggatcct gaacaactag 360atgttcctag acattttctt tatggttcca agtgcaaaac
aggtgttctt atctaaaacg 420tcaattagaa aattatctgc ggttgttaat ctactgtata
tttttgtttg gtatatttac 480taagtgcact ctttcaaaac ttattctata actttatcaa
ttcatgtgaa ttttagctca 540attttcaaag ttcactaata ttctcaatat ttaatgctaa
atgctttgct acattgtaac 600tcacctaaaa ccttttagtg acaaaatcct aatatgtgga
aaaaagcata tgcataaagg 660aataatattg tgaaaatgaa tctgttatga taaagaaaaa
ataaagtgga aacttttaga 720gtattacttc atagggcaga ttttgtaaac tgtcgtatac
tgtaaagggt taaatcagcg 780ttttgtgatt tttaagtaac tgtgagtgaa gtttattctt
caacaatgtc tactccatcc 840ccaacccaac tcacagccct atgactacta tctttgcatt
agttaaaaag ttagtatata 900ggcatcaaac aaccttggct gtaacctata gaatctctat
ccatgtatca ggttatagac 960tggtttttca aaagtgaaca atcctgtgat aagttggagt
accatttagt aatacagcaa 1020cattgtgtca tttattagca tcataattct ttgttatgta
agttaaatat atcaagaaag 1080aagagactgt ttggaaaaat gtggttcaag ttttatgcta
tatagttttg gtatgcgata 1140cagacagcta acttttctta tgaaaaatac atatttgcat
gtaaacaatg atttcaaaat 1200acttgaaaaa taaaatttta acccaaatga ataactaaga
aatataaaac aagcacaaaa 1260tcttagggaa gtcataaaat agtagtgaaa gtattagaca
gaagacatct gttttcgaat 1320ttcaacacta gaatgactaa aactatctac ctatagaact
atctgtagat agtatactat 1380ctacactctg ctcaacaagc tcagaaatta aatattttta
gtaataaaaa tctgttctgg 1440ttataaacct tgctaatgaa aatacaatac atataaaaat
gtatagccat gttattttct 1500agtataaatt cctttgaaac tataagtctt tgaggaaaat
tataaggtaa aattttcctg 1560tttttccccc tttgaaaaac tcaggaaaaa aggaagattg
aactaataaa attttatttc 1620ttaaatataa atttgaccta aaatattttc tcaaactaat
tcatgaaaca gcaactttta 1680ccaatacctt tgtatactct cagttctcat tcagtataaa
taaaatttta aaatcctttc 1740atagttctat tagaaataag tagtaaattt tgatatattg
tacatacaca cgtgtgtgtg 1800tgtgtgtgtg tgtgtgtgta tttgtgtgcc tctggtcaac
tctaaggatg acagacactg 1860tgtaacaaca cctgggtcaa ctcttttaat ttatatacaa
agcaaagaac aacattaatg 1920gagatgcaca atgattattc aaacaagcta tatatatgta
caaaggcaaa cagacacata 1980acagtctctg cagactgatt gtatatagta agaaaagatc
aaaagacttt aaaacctaaa 2040tgacttttga catacaaact cttcttgaga atgtttgttg
taaatggttt caaaaataca 2100aattatagcc aatcaaaaca ttgctttggt tggtgcattt
aagtatccaa ctcaaaaagc 2160atatcaaata ttttgggtac taggcagttt ccaaagtagc
atggtagtat tacttgttaa 2220aagggttctg ttttcattaa cagtactaag tggaagggat
ctgcagattc caaattggaa 2280taagctctat catattctga aacaagaatt agaatgactt
gagaacgggc aaataacaaa 2340gcaaaccaat ataattatat ggtcattctg accccagctc
ttatacaaat tatacatgta 2400tttttgtgta tgtttgtgag agttgtatgt atgtgaatgt
gtgtgagtgt gtattcacat 2460acacatatat actggaacct atagtagaaa aggaaactag
tagggccaaa aaaaaaaaga 2520aaaagaaaaa gaaaaaagaa aaaaaaagaa aaaactggga
cctaagtata aatatctcat 2580cctaaagtaa acaataagtt tatagttaac gaagattttt
ttctatttaa aaccccattt 2640tcctaaagaa caaaaaaaaa aaaaaaaaaa aaaaaaaaaa a
2681 195 2805 DNA Homo sapiens 195ggcacgaggg cagggggaag
ggaagtgcgg ctcggtcggc gcgggtggag ggggcgtgag 60gccgccctac ggtggccgtc
gagggacggc gctacggctc ccacgctagg ccaaacgcct 120ccggcggccg cgcccgagag
ccccttcacc tgcagggcga ccccagccgg cgacgcgtga 180accacgccct cagccgcctt
gccagcgccc ccagccgcgc gccccagcac catgcggccg 240ccctgcgcac ggagccccga
gggacagggg cacccgcagg cccggcccct agcaccgccg 300gccggccccg aggtccggga
cgccggcgcc gccgcggaga gggcaccggg ccgacgcctc 360cccccagggt cagctgcggg
ctcccaggcc taggcgccca tgacccctac gccaaccgcc 420gcctggacac cgccgccgcc
actgcgacct agcgccgccg ccgccggggc ccaatgccgg 480tcatgcccat tccgcggcgg
gtgcgctcct tccacggccc gcacaccacc tgcctgcatg 540cggcctgcgg gcccgtgcgc
gcctcccacc tggcccgcac caagtacaac aacttcgacg 600tgtacatcaa gacgcgctgg
ctgtacggct tcatccgctt cctactctac tttagctgca 660gcctgttcac tgcggcgctc
tggggtgcgc tggccgccct cttctgccta cagtacctgg 720gcgttcgcgt cctgctgcgc
ttccagcgca agctgtcggt gctgctgctg ctgctgggcc 780gccggcgcgt ggacttccgc
ctggtgaacg agctgctcgt ctatggcatc cacgtcacca 840tgctgctggt cgggggcctg
ggctggtgct tcatggtctt cgtggacatg tgagggccgt 900gggtgcgagc ttgatgtatc
gtcccggcct gtggctgtgt tctctccatg ggtggggtcg 960gccagcgcct tcccttcgcc
catcccccag gcagtcgctg ctgcccggcg cccacggaga 1020gaaaagaaag ggctgagact
tctgtgatgg gggcgcggac accaccccta ggctggcttc 1080ctggacccac cctccccgta
tgcactctca ggggcagcgc ccacctgccg gtggctcctg 1140ctcacatgtc ttcgggtcgt
actgcggggt gggccctccg ttccgcctct ctgtgggcct 1200ctctccagga ccacagctgc
cagggacttt agacatcacc ctgggaggcc cctggacaca 1260gagggctgtg tgcccaggag
caattccgga ggggggccct cctggctgca cagccccttc 1320tgcgtgccct ggccccagcc
ccagccaacg ggacacggaa ggctcccctc gctgacacac 1380cacactgcca caaagctgct
tactctgccc tgggccgcct gaggcctggc actgcccgcg 1440gaccaccctg tgtgtgtcat
cctgaggggc tgtgtgggtc ctgagtcccc agccagcctt 1500cagggtcccc ttggattgtg
tagatgcagt ctagcggggg gccggagaag ggctcaggtg 1560ggaggggcct cagcaggctc
ccagctcagg ggctggcctg gggggaaccc tgggagccag 1620gggctgactc cagcaacact
ggcctgtctg cctgttctgg gagggctgtg aggatgtctt 1680gcagatgctc tggatttctg
cggaggcacc tccattcctt tctggctttt tttgcggggg 1740agggctttgg gcctctttct
ttgagggaac accgtcaaag aaagcctggg agatcgaggc 1800ttcagtgagc caggatggaa
acgcgtgtcc caagtgtccg gagcaggcgg cagaggcctc 1860agtgcggcaa acacagcccc
agagcctgtg tggcaccagc agcatcttag agccccaggt 1920atatgctgag atcttatctc
acgctgtcct ccagtgtctg gggggcccaa atgatggcac 1980agggtcaggt gggctggagg
ggcgcagatg cctgtgttca gggagggtgg ccaccatggg 2040ccgaggtctc acccaggacc
ccttgctctg ctcctcagcc ttgcagtcac ggcagcacta 2100tggtggactg cccatggccg
tgtgactttg ggggcaagtg ggagggcgcc ctgaataatg 2160attgcaagga caacaggcag
aggctaccct agagcaggac acagggtgtg gtactgacaa 2220ccctagtgtc acctcaaatc
catgtcccca cactctgggc atgggtggga cttgtgaccc 2280taccctgtca ggcggaccag
tggcccagga gccatgagga cagttgtgtg ccactggaag 2340agaaactttt tgaaaaaccc
taaatcaggt agagaaagca aaaaatctct ggccgtaaac 2400cgtgctctct aatttatcgg
cagcttctgt ggatgacctc tgatgagccc gggctgcgtc 2460cacgccctgg gcaggtaggc
gggagcttcc ctgcgtgggc ctcatttctt gctgcagaga 2520atcttttgca ctaagtcatg
ctgtttcctc aaagaagctt tgttttttgt taacgtatta 2580ctcagagtca cccaagcctc
ttggctgagg gtgaaggtgg gacgggaggc gggagggggc 2640tggtggtgcc gctcgtgcgg
tgtcaacgct gcagggagtt gtggcacctt ggtgccctct 2700gagcacctgg ccgcctgctg
tccccggtgc ctgtgaaatt cgtcatgcca tgacccacct 2760gcattaaacc tattttttta
atgtgttaaa aaaaaaaaaa aaaaa 2805 196 496 DNA Homo sapiens
misc_feature (2)..(2) a or g or c or t/u
misc_feature (26)..(26) a or g or c or t/u
misc_feature (49)..(49) a or g or c or t/u
misc_feature (60)..(60) a or g or c or t/u
misc_feature (88)..(88) a or g or c or t/u
misc_feature (115)..(115) a or g or c or t/u
misc_feature (124)..(124) a or g or c or t/u
misc_feature (159)..(159) a or g or c or t/u
misc_feature (207)..(207) a or g or c or t/u 196gnggaaacac
gggccaaacc cgtgantttg gtgccccttg taaactcanc ccctgcaaan 60ccaaagaccc
caatggattt aaagttgntt ggcatttgta ctggcaaggc aaaanatttt 120taantacctt
ttcctaatac ttattgtatg agcttttgnt gtttacttgg aggttttgtc 180ttttactaca
agtttggaac tatttantat tgccttggta tttgtgctct gtttaagaaa 240caggcacttt
tttttattat ggataaaatg ttgagatgac aggaggtcat ttcaatatgg 300cttagtaaaa
tatttattgt tcctttattc tctgtacaag attttgggcc tctttttttc 360cttaatgtca
caatgttgag ttcagcatgt gtctgtccat ttcatttgta cgcttgttca 420aaaccaagtt
tgttctggtt tcaagttata aaaataaatt ggacatttaa cttgatctcc 480aaaaaaaaaa
aaaaaa
496 197 2802 DNA Homo sapiens 197ggcacgaggc aatctgagga gcaggaggac cggggcgccg
gtgtcctgcc gcctccttct 60ccttgctctc acctgcgcct attagtccac gcgccttcaa
ggccaggggc tacagcccag 120acagagaggg gacagcagag ggagagagag cacctgagga
tacagagctg gcactggact 180gccttttcac cccccaggtg atgagtgagg ttcgaagaac
ggaagattta aaaagcagcc 240ggggcctccg tattgaatga aagacccagt gcaaagacat
caccatgaac actagcattc 300cttatcagca gaatccttac aatccacggg gcagctccaa
tgtcatccag tgctaccgct 360gtggagacac ctgcaaaggg gaagtggtcc gcgtgcacaa
caaccacttc cacatcagat 420gcttcacctg tcaagtatgt ggctgtggcc tggcccagtc
aggcttcttc ttcaagaacc 480aggagtacat ctgcacccag gactaccagc aactctatgg
cacccgctgt gacagctgcc 540gggacttcat cacaggcgaa gtcatctcgg ccctgggccg
cacttaccac cccaagtgct 600tcgtgtgcag cttgtgcagg aagcctttcc ccattggaga
caaggtgacc ttcagcggta 660aagaatgtgt gtgccaaacg tgctcccagt ccatggccag
cagtaagccc atcaagattc 720gtggaccaag ccactgtgcc gggtgcaagg aggagatcaa
gcacggccag tcactcctgg 780ctctggacaa gcagtggcac gtcagctgct tcaagtgcca
gacctgcagc gtcatcctca 840ccggggagta tatcagcaag gatggtgttc catactgtga
gtccgactac catgcccagt 900ttggcattaa atgtgagact tgtgaccgat acatcagtgg
cagagtcttg gaggcaggag 960ggaagcacta ccacccaacc tgtgccaggt gtgtacgctg
ccaccagatg ttcaccgaag 1020gagaggaaat gtacctcaca ggttccgagg tttggcaccc
catctgcaaa caggcagccc 1080gggcagagaa gaagttaaag catagacgga catctgaaac
ctccatctca ccccctggat 1140ccagcattgg gtcacccaac cgagtcatct gcgacatcta
cgagaacctg gacctccggc 1200agagacgggc ctccagcccg gggtacatag actcccccac
ctacagccgg cagggcatgt 1260cccccacctt ctcccgctca cctcaccact actaccgctc
tggtgatttg tctacagcaa 1320ccaagagcaa aacaagtgaa gacatcagcc agacctccaa
gtacagtccc atctactcgc 1380cagaccccta ctatgcttcg gagtctgagt actggaccta
ccatgggtcc cccaaagtgc 1440cccgagccag aaggttctcg tctggaggag aggaggatga
ttttgaccgc agcatgcaca 1500agctccaaag tggaattggc cggctgattc tgaaggaaga
aatgaaggcc cggtcgagct 1560cctatgcaga tccctggacc cctccccgga gctccaccag
cagccgggaa gccctgcaca 1620cagctggcta tgagatgtcc ctcaatggct cccctcggtc
gcactacctg gctgacagtg 1680atcctctcat ctccaaatct gcctccctgc ctgcctaccg
aagaaatggg ctgcacagga 1740cacccagcgc agacctcttc cactacgaca gcatgaacgc
agtcaactgg ggcatgcgag 1800agtacaagat ctacccttat gaactgctgc tggtgactac
aagaggaaga aaccgactgc 1860ccaaggatgt agacaggacc cgtttagagg gaaacttttg
gaagagtggc tgcttatgag 1920attccaaaat gaagtgttgg ccaacaccgc tcatggccat
cctggatttt cccagtggct 1980tcccttcctg ctcgcctccc tgaacagggg agaaagctta
acctctcttc tcctctccaa 2040acctttcacc ttgaatgggt aatgtttggt gggggctgtt
ccttcttgga gaagccttga 2100gtcggaccat tttgagatca tggaggaagg atgaagaagt
gaaaatgaca ataatgactc 2160tcaagaggct ggcgatgtga catggcaaat gtagaactga
cttaaattga acaaaccctc 2220actgagcacc tctgatgttg agcacctgct gaatactgag
cactgaatgg gggaggggga 2280ggggagcacg gggtgagtca acctgggact cggtctcagg
gatatgccta ccaatagcgg 2340gtatcgtaag gcatgtaccc aaacataacg gatgtaaggc
agaaagtgat cggagaagga 2400atgagaaagt gtgcgtgatg ttaatgaaaa gtcatatgca
gctagagcag acccaggaaa 2460gctttctgga agagattgca tctgaggaaa ttcaggaagg
atctttgtag attgggggga 2520gattctaaat tgaaggggtg atggggtgag gggccagagg
gaagtctgct gtgttctcat 2580gtaggatgtc agccctccct gcaacttctc tttttggcca
atgtcttttc actttcctga 2640ccctttagaa tcatccccag ccagacgcaa tcatggaagt
tgccttattg tcactggtta 2700agaacttggc gagattgaag ggcttttgtt attgttgttg
gatatttttg tttcccataa 2760aagcacatca tttcaaccct aaaaaaaaaa aaaaaaaaaa
aa 2802 198 3278 DNA Homo sapiens 198gaagaattag
atacttttga gtgggctttg aagagctggt ctcagtgttc caaaccctgt 60ggtggaggtt
tccagtacac taaatatgga tgccgtagga aaagtgataa taaaatggtc 120catcgcagct
tctgtgaggc caacaaaaag ccgaaaccta ttagacgaat gtgcaatatt 180caagagtgta
cacatccact ctgggtagca gaagaatggg aacactgcac caaaacctgt 240ggaagttctg
gctatcagct tcgcactgta cgctgccttc agccactcct tgatggcacc 300aaccgctctg
tgcacagcaa atactgcatg ggtgaccgtc ccgagagccg ccggccctgt 360aacagagtgc
cctgccctgc acagtggaaa acaggaccct ggagtgagtg ttcagtgacc 420tgcggtgaag
gaacggaggt gaggcaggtc ctctgcaggg ctggggacca ctgtgatggt 480gaaaagcctg
agtcggtcag agcctgtcaa ctgcctcctt gtaatgatga accatgtttg 540ggagacaagt
ccatattctg tcaaatggaa gtgttggcac gatactgctc cataccaggt 600tataacaagt
tatgttgtga gtcctgcagc aagcgcagta gcaccctgcc accaccatac 660cttctagaag
ctgctgaaac tcatgatgat gtcatctcta accctagtga cctccctaga 720tctctagtga
tgcctacatc tttggttcct tatcattcag agacccctgc aaagaagatg 780tctttgagta
gcatctcttc agtgggaggt ccaaatgcat atgctgcttt caggccaaac 840agtaaacctg
atggtgctaa tttacgccag aggagtgctc agcaagcagg aagtaagact 900gtgagactgg
tcaccgtacc atcctcccca cccaccaaga gggtccacct cagttcagct 960tcacaaatgg
ctgctgcttc cttctttgca gccagtgatt caataggtgc ttcttctcag 1020gcaagaacct
caaagaaaga tggaaagatc attgacaaca gacgtccgac aagatcatcc 1080accttagaaa
gatgagaaag tgaaccaaaa aggctagaaa ccagaggaaa acctggacaa 1140cctctctctt
cccatggtgc atatgcttgt ttaaagtgga aatctctata gatcgtcagc 1200tcattttatc
tgtaattgga agaacagaaa gtgctggctc actttctagt tgctttcatc 1260ctccttttgt
tctgcattga ctcatttacc agaattcatt ggaagaaatc accaaagatt 1320attacaaaag
aaaaatatgt tgctaagatt gtgttggtcg ctctctgaag cagaaaaggg 1380actggaacca
attgtgcata tcagctgact ttttgtttgt tttagaaaag ttacagtaaa 1440aattaaaaag
agataccaat ggtttacact ttaacaagaa attttggata tggaacaaag 1500aattcttaga
cttgtattcc tatttatcta tattagaaat attgtatgag caaatttgca 1560gctgttgtgt
aaatactgta tattgcaaaa atcagtatta ttttaagaga tgtgttctca 1620aatgattgtt
tactatatta catttctgga tgttctaggt gcctgtcgtt gagtattgcc 1680ttgtttgaca
ttctataggt taattttcaa agcagagtat tacaaaagag aagttagaat 1740tacagctact
gacaatataa agggttttgt tgaatcaaca atgtgatacg taaattatag 1800aaaaagaaaa
gaaacacaaa agctatagat atacagatat cagcttacct attgccttct 1860atacttataa
tttaaaggat tggtgtctta gtacacttgt ggtcacaggg atcaacgaat 1920agtaaataat
gaactcgtgc aagacaaaac tgaaaccctc tttccaggac ctcagtaggc 1980accgttgagg
tgtcctttgt ttttgtgtgt gtgtgttctt ttttaatttt cgcattgttg 2040acagatacaa
acagttatac tcaatgtact gtaataatcg caaaggaaaa agttttggga 2100taacttattt
gtatgttggt agctgagaaa aatatcatca gtctagaatt gatatttgag 2160tatagtagag
ctttggggct ttgaaggcag gttcaagaaa gcatatgtcg atggttgaga 2220tatttatttt
ccatatggtt catgttcaaa tgttcacaac cacaatgcat ctgactgcaa 2280taatgtgcta
ataatttatg tcagtagtca ccttgctcac agcaaagcca gaaatgctct 2340ctccagggag
tagatgtaaa gtacttgtac atagaattca gaactgaaga tatttattaa 2400aagttgattt
ttttttcttg atagtatttt tatgtactaa atatttacac taatatcaat 2460tacatatttt
ggtaaactag agagacataa ttagagatgc atgctttgtt ctgtgcatag 2520agacctttaa
gcaaactact acagccaact caaaagctaa aactgaacaa atttgatgtt 2580atgcaaacat
cttgcatttt tagtagttga tattaagttg atgacttgtt tcccttcaag 2640gaaacattaa
attgtatgga ctcagctagc tgttcaatga aattgtgaat tagaaacatt 2700tttaaaagtt
tttgaaagag ataagtgcat catgaattac atgtacatga gaggagatag 2760tgatatcagc
ataatgattt tgaggtcagt acctgagctg tctaaaaata tattatacaa 2820actaaaatgt
agatgaatta acctctcaaa gcacagaatg tgcaagaact tttgcatttt 2880aatcgttgta
aactaacagc ttaaactatt gactctatac ctctaaagaa ttgctgctac 2940tttgtgcaag
aactttgaag gtcaaattag gcaaattcca gatagtaaaa caatccctaa 3000gccttaagtc
tttttttttt tcctaaaaat tcccatagaa taaaattctc tctagtttac 3060ttgtgtgtgc
atacatctca tccacagggg aagataaaga tggtcacaca aacagtttcc 3120ataaagatgt
acatattcat tatacttctg acctttgggc tttcttttct actaagctaa 3180aaattccttt
ttatcaaagt gtacactact gatgctgttt gttgtactga gagcacgtac 3240caataaaaat
gttaacaaaa tataaaaaaa aaaaaaaa
3278
199 567 DNA Homo sapiens
199 tcctgtgttc tagacctctg gaggctgctg tggggaccac
actgatcctg gagaaaaggg 60atggagctga aaaagatgga atgcttgcag agcatgacct
gaggagggag gaacgtggtc 120aactcacacc tgcctcttcc tgcagcctca cctctacctg
cccccatcat aagggcactg 180agcccttccc aggctggata ctaagcacaa agcccatagc
actgggctct gatggctgct 240ccactgggtt acagaatcac agccctcatg atcattctca
gtgagggctc tggattgaga 300gggaggccct gggaggagag aagggggcag agtcttccct
accaggtttc tacacccccg 360ccaggctgcc catcagggcc cagggagccc ccagaggact
ttattcggac caagcagagc 420tcacagctgg acaggtgttg tatatagagt ggaatctctt
ggatgcagct tcaagaataa 480atttttcttc tcttttcaaa aatgtataaa aatcattata
catagcatta aagaaacatt 540tttgagaagt acaaaacaaa aaaaaaa
567
200 2907 DNA Homo sapiens
200 cgggcgccgc aggagcgagt
gagctgggag cgaggggcga aggcgcggag aagcccggcc 60gcccggtggg cggcagaagg
ctcagccgag gcggcggcgc cgactccgtt ccactctcgg 120cccggatcca ggcctccggg
ttcccaggcg ctcacctccc tctgacgcac tttaaagagt 180ctcccccctt ccacctcagg
gcgagtaata gcgaccaatc atcaagccat ttaccaggct 240tcggaggaag ctgtttatgt
gatccccgca ctaattaggc tcatgaacta acaaatcgtt 300tgcacaactt gtgaagaagc
gaacacttcc atggattgtc cttggactta gggcgccctg 360cccgcctttt gcagaggaga
aaaaactttt tttttttttt gcctcccccg agaactttcc 420ccccttctcc tccctgcctc
taactccgat ccccccacgc catctcgcca aaaaaaaaaa 480aaaaaaaaaa aaagaaaaaa
aaagaaaaaa aaagaaaaaa aattacccca atccacgcct 540gcaaattctt ctggaaggat
tttcccccct ctcttcaggt tgggcgcgtt tggtgcaaga 600ttctcgggat cctcggcttt
gcctctccct ctccctcccc cctcctttcc tttttccttt 660cctttccttt ctttcttcct
ttccttcccc ccacccccac ccccacccca aacaaacgag 720tccccaattc tcgtccgtcc
tcgccgcggg cagcgggcgg cggaggcagc gtgcggcggt 780cgccaggagc tgggagccca
gggcgcccgc tcctcggcgc agcatgttcc agccggcgcc 840caagcgctgc ttcaccatcg
agtcgctggt ggccaaggac agtcccctgc ccgcctcgcg 900ctccgaggac cccatccgtc
ccgcggcact cagctacgct aactccagcc ccataaatcc 960gttcctcaac ggcttccact
cggccgccgc cgccgccgcc ggtaggggcg tctactccaa 1020cccggacttg gtgttcgccg
aggcggtctc gcacccgccc aaccccgccg tgccagtgca 1080cccggtgccg ccgccgcacg
ccctggccgc ccacccccta ccctcctcgc actcgccaca 1140ccccctattc gcctcgcagc
agcgggatcc gtccaccttc tacccctggc tcatccaccg 1200ctaccgatat ctgggtcatc
gcttccaagg gaacgacact agccccgaga gtttcctttt 1260gcacaacgcg ctggcccgaa
agcccaagcg gatccgaacc gccttctccc cgtcccagct 1320tctaaggctg gaacacgcct
ttgagaagaa tcactacgtg gtgggcgccg aaaggaagca 1380gctggcacac agcctcagcc
tcacggaaac tcaggtaaaa gtatggtttc agaaccgaag 1440aacaaagttc aaaaggcaga
agctggagga agaaggctca gattcgcaac aaaagaaaaa 1500agggacgcac catattaacc
ggtggagaat cgccaccaag caggcgagtc cggaggaaat 1560agacgtgacc tcagatgatt
aaaaacataa acctaacccc acagaaacgg acaacatgga 1620gcaaaagaga cagggagagg
tggagaagga aaaaacccta caaaacaaaa acaaaccgca 1680tacacgttca ccgagaaagg
gagagggaat cggagggagc agcggaatgc ggcgaagact 1740ctggacagcg agggcacagg
gtcccaaacc gaggccgcgc caagatggca gaggatggag 1800gctccttcat caacaagcga
ccctcgtcta aagaggcagc tgagtgagag acacagagag 1860aaggagaaag agggagggag
agagagaaag agagagaaag agagagagag agagagagag 1920agaaagctga acgtgcactc
tgacaagggg agctgtcaat caaacaccaa accggggaga 1980caagatgatt ggcaggtatt
ccgtttatca cagtccactt aaaaaatgat gatgatgata 2040aaaaccacga cccaaccagg
cacaggactt ttttgttttt tgcacttcgc tgtgtttccc 2100ccccatcttt aaaaataatt
agtaataaaa aacaaaaatt ccatatctag ccccatccca 2160cacctgtttc aaatccttga
aatgcatgta gcagttgttg ggcgaatggt gtttaaagac 2220cgaaaatgaa ttgtaatttt
cttttccttt taaagacagg ttctgtgtgc tttttatttt 2280gatttttttt cccaagaaat
gtgcagtctg taaacacttt ttgatacctt ctgatgtcaa 2340agtgattgtg caagctaaat
gaagtaggct cagcgatagt ggtcctctta cagagaaacg 2400gggagcagga cgacgggggg
gctgggggtg gcgggggagg gtgcccacaa aaagaatcag 2460gacttgtact gggaaaaaaa
cccctaaatt aattatattt cttggacatt ccctttccta 2520acatcctgag gcttaaaacc
ctgatgcaaa cttctccttt cagtggttgg agaaattggc 2580cgagttcaac cattcactgc
aatgcctatt ccaaacttta aatctatcta ttgcaaaacc 2640tgaaggactg tagttagcgg
ggatgatgtt aagtgtggcc aagcgcacgg cggcaagttt 2700tcaagcactg agtttctatt
ccaagatcat agacttacta aagagagtga caaatgcttc 2760cttaatgtct tctataccag
aatgtaaata tttttgtgtt ttgtgttaat ttgttagaat 2820tctaacacac tatatacttc
caagaagtat gtcaatgtca atattttgtc aataaagatt 2880tatcaatatg ccaaaaaaaa
aaaaaaa 2907
201 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
201 acttctggtg atgataaaaa tggttttatc acccagatgt gaaagaagct gcctgtttac 60
202 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
202 gtggttctgt aaaaacgcag aggaaaagag ccagaaggtt tctgtttaat gcatcttgcc 60
203 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
203 tttataagga agcagctgtc taaaatgcag tggggtttgt tttgcaatgt tttaaacaga 60
204 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
204 cttatgaagc tggccgggcc actcacgttc aatggtacat ctgggtctct atgtggttct 60
205 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
205 gtgagccagc atttcccata gctaacccta ttctcttagt ctttcaaaat gtagaatggg 60
206 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
206 ctttacacct gataaaatat tttgcgaaga gaggtgttct ttttccttac tggtgctgaa 60
207 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
207 gcatacatct catccacagg ggaagataaa gatggtcaca caaacagttt ccataaagat 60
208 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
208 tgagttcagc atgtgtctgt ccatttcatt tgtacgcttg ttcaaaacca agtttgttct 60
209 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
209 aagaccgaga ctgagggaaa gcatgtctgc tgggtgtgac catgtttcct ctcaataaag 60
210 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
210 ggcatctggc ccctggtagc cagctctcca gaattacttg taggtaattc ctctcttcat 60
211 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
211 tggatgtttg tgcgcgtgtg tggacagtct tatcttccag catgatagga tttgaccatt 60
212 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
212 tcctggcaga gccatggtcc caggcttccc aaaagtgttt gtggcaatta ttcccctagg 60
213 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
213 tttgatgata gcagacattg ttacaaggac atggtgagtc tatttttaat gcaccaatct 60
214 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
214 ttcttccagt tgcactattc tgagggaaaa tctgacacct aagaaattta ctgtgaaaaa 60
215 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
215 gaacaattgt ggtctctctt aacttgaggt tctcttttga ctaatagagc tccatttccc 60
216 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
216 gttaagtgtg gccaagcgca cggcggcaag ttttcaagca ctgagtttct attccaagat 60
217 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
217 cggcctactg agcggacaga atgatgccaa aatattgctt atgtctctac atggtattgt 60
218 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
218 cagggtgttt gcccaataat aaagccccag agaactgggc tgggccctat gggattggta 60
219 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
219 tgtacagttt ggttgttgct gtaaatatgg tagcgttttg ttgttgttgt tttttcatgc 60
220 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
220 taccaaactg ggactcacag ctttattggg ctttctttgt gtcttgtgtg tttcttttat 60
221 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
221 cattgaggtt tggatggtgg caggtaaaac agaaaggcaa gatgtcatct gacattaggc 60
222 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
222 agttcagcac tgtggttatc attggtgatg ccagaaaaca ttagtagact tagacaattg 60
223 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
223 taaaatttct tgattgtgac tatgtggtca tatgcccgtg tttgtcactt acaaaaatgt 60
224 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
224 agccatctgg tgtgaagaac tctatatttg tatgttgaga gggcatggaa taattgtatt 60
225 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
225 cttattgtca ctggttaaga acttggcgag attgaagggc ttttgttatt gttgttggat 60
226 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
226 ctttctagtg agctaaccgt aacagagagc ctacaggata cacgtgagat aatgtcacgt 60
227 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
227 ttgtcttaaa atttcttgat tgtgatactg tggtcatatg cccgtgtttg tcacttacaa 60
228 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
228 cctgggggaa aggggcattc atgacctgaa ctttttagca aattattatt ctcagtttcc 60
229 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
229 ttcattaaca gtactaagtg gaagggatct gcagattcca aattggaata agctctatca 60
230 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
230 ccaatgcaga agagtattaa gaaagatgct caagtcccat ggcacagagc aaggcgggca 60
231 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
231 caaggctacg atggctatga tggtcagaat tactaccacc accagtgaag ctccagcctg 60
232 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
232 agctcacagc tggacaggtg ttgtatatag agtggaatct cttggatgca gcttcaagaa 60
233 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
233 tccaaagtag aaagggttct tttagaaaac ttgaagaatg tgcctcctct tagcatctgt 60
234 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
234 gatgcatttt tcagtccctt ttcagagcaa atgcttttgc aatggtagta atgtttagtt 60
235 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
235 cctgtggggc ttctctcctt gatgcttctt tcttttttta aagacaacct gccattacca 60
236 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
236 ttgcactaag tcatgctgtt tcctcaaaga agctttgttt tttgttaacg tattactcag 60
237 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
237 ctggatccca ggccctggca cccctcagga aatacaagaa aaagaatatt cacatctgtt 60
238 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
238 ttagaggggc cacctatcaa ctcatcagtg ttcaaagaat atgctgggag catgggtgag 60
239 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
239 ggcccattta tgtccctcat gtctctagat tttctcgtca cccagcctca aaaatatatg 60
240 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
240 tccccaaaaa cctcacccga ggctgcccac tatggtcatc tttttctcta aaatagttac 60
241 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
241 gaaattcctc acaccttgca ccttccctac ttttctgaat tgctatgact actccttgtt 60
242 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
242 tgtctgtcca ccacgagatg ggaggaggag aaaaagcggt acgatgcctt cctgacctca 60
243 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
243 gtcttatctc tcaggggggg tttaagtgcc gtttgcaata atgtcgtctt atttatttag 60
244 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
244 ccgagtagta tgggtctctg tgtgagaaac caggagatat tttcatcttg ttcggaaata 60
245 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
245 ttgtgcaaaa gtcccacaac ctttctggat tgatagtttg tggtgaaata aacaatttta 60
246 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
246 tccagtattc tgcagggcca gtcagttgta cagaagttgg aatattctgt tccagaatta 60
247 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
247 gtctcgaaca gcggttgttt ttactttatt tatcttaggc cctcagctcc ctgacgtcct 60
248 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
248 agtgaatctt ttcctcttgg tagcatcaac actggggata aatcagaacc attctgtgga 60
249 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
249 tgagagccca gaacaagaag gagcagaagg gcactttgac cttcattatt atgaaaatca 60
250 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
250 ggaagaactg atgcttgctg ctaactaaag ttttggatgt atcgatttag agaaccaatt 60
251 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
251 gaatgagaga ataagtcatg ttccttcaag atcatgtacc ccaatttact tgccattact 60
252 60 DNA Artificial Sequence
Synthetic construct - oligonucleotide
252 tacggaaagg aaacaggtta tactcttaga tttaaaaagt gaaagaaact gcaggcgcct 60
253 2888 DNA Homo sapiens
253 gtggcggcgg aggcggcgga ggccagggag gaagatgtcg taatgagcga tccacagacc
60agcatggctg ccactgctgc tgtgagtccc agtgactacc tgcagcctgc cgcctccacc
120acccaggact cccagccatc tcccttagcc ctgcttgctg caacatgtag caaaattggc
180cctccagcag ttgaagctgc tgtgacacct cctgctcccc cacagcccac accgcggaaa
240cttgtcccta tcaaacctgc ccctctccct ctcagccccg gcaagaatag ctttggaatc
300ttgtcctcca aaggaaatat acttcagatt caggggtcac aactgagcgc ctcctatcct
360ggagggcagc tggtgttcgc tatccagaat cccaccatga tcaacaaagg gacccgatca
420aatgccaata tccagtacca ggcggtccct cagattcagg caagcaattc ccaaaccatc
480caagtacagc ccaatctcac caaccagatc cagatcatcc ctggcaccaa ccaagccatc
540atcaccccct caccgtccag tcacaagcct gtccccatca agccagcccc catccagaag
600tcgagtacga ccaccacccc cgtgcagagc ggggccaatg tggtgaagtt gacaggtggg
660ggcggcaatg tgacgctcac tctgcccgtc aacaacctcg tgaacgccag tgacaccggg
720gcccctactc agctcctcac tgaaagcccc ccaaccccgc tgtctaagac taacaagaaa
780gcaaggaaga agagccttcc tgcctcccag ccccctgtgg ctgtggctga gcaggtggag
840acggtgctga tcgagaccac cgcggacaac atcatccagg caggaaataa cctgctcatt
900gttcagagcc ctggtggggg ccagccagct gtggtccagc aggtccaggt ggtgcccccc
960aaggccgagc agcagcaggt ggtacagatc ccccagcagg ctctgcgggt ggtgcaggcg
1020gcatctgcca ccctccccac tgtaccccag aagccctccc agaactttca gatccaggca
1080gctgagccga cacctactca ggtctacatc cgcacgcctt ccggtgaggt gcagacagtc
1140cttgtccagg acagcccccc agcaacagct gcagccacct ctaacaccac ctgtagcagc
1200cctgcatccc gtgctcccca tctgagtggg accagcaaaa agcactcagc tgcaattctc
1260cgaaaagagc gtcccctgcc aaagattgcc ccagccggga gcatcatcag cctgaatgca
1320gcccagttgg cggcagctgc ccaggcaatg cagaccatca acatcaatgg tgtccaggtc
1380cagggcgtgc ctgtcaccat caccaacaca ggcgggcagc agcagctgac agtgcagaat
1440gtttctggga acaacctgac catcagtggg ctgagcccca cccagatcca gctgcaaatg
1500gaacaagccc tggccggaga gacccagccc ggggagaagc ggcgccgcat ggcctgcacg
1560tgtcccaact gcaaggatgg ggagaagagg tctggagagc agggcaagaa gaagcacgtg
1620tgccacatcc ccgactgtgg caagacgttc cgtaagacgt ccttgctgcg tgcccatgtg
1680cgcctgcaca ctggcgagcg gccctttgtc tgcaactggt tcttctgtgg gaagaggttc
1740acacggagtg acgagctcca acggcatgct cgcacccaca caggggacaa acgcttcgag
1800tgcgcccagt gtcagaagcg cttcatgagg agtgaccacc tcaccaagca ttacaagacc
1860cacctggtca cgaagaactt gtaaggccaa ctgcggcggg aggccctgaa gatgcagtcc
1920cccacctgtg tcctccctgg gcccctggtg gaaaggagcc ctgtggctgc cttgggcctg
1980ccctcagccc cactcctgtt ctgcaactgt ccccacagga aggggctctg ttccctgtat
2040tgtcctcctt ctgaagcccc ttggctctgc cttggccctt cccctcacca cgagctcccg
2100gcctgcccag actgtggaca ctggccgtgc ccaatgagac gttctaaacc aggacgcgtg
2160ggaaccctta tttccaaagg aaaaacatgc atttcactcc gtcgaggagc aaagtgagcc
2220cctacccccc accccgatcc ccgctcccaa cactgccgga gtcgcgtcat gccatgcccc
2280ctctcctgca cctccctggc cctgccggcc actgtggacg ccctggggct tggcacccac
2340ctctggagaa actcggggcc acctccactc catgtgccca gccccgccac aacctctcct
2400ccagcacatt ccagctctat ttaaaaagta aagacaccca ccgactcctg atccccctct
2460ttttctatgg agaacgttgc cttatactct ctacttcaga tgatgaacac tgtgtactgt
2520gtgtgcttta aagaagtttt atttaattgc tcccttcttc ctttccttgt tattcacctc
2580cctgatgcct gctttcagtt gagggttggg ggcaatgatg agcatatgaa ttttttctca
2640ctctagcaat tcccttttct aaatgacaca gcatttaaac tcaaatctgg attcagataa
2700cagcacctgc acatcctgca cctcctccct ctcccttcac ctcacccctg cccggcccaa
2760gctctacttg tgtacagtgt atattgtata atagacaatt gtgtctacta catgtttaaa
2820aacacattgc ttgttatttt tgaggctttt aaattaaaca aaaatccaac tttaaaaaaa
2880aaaaaaaa
2888
254 999 DNA Homo sapiens
254 cccgcgtcgg tgcccgcgcc cctccccggg ccccgccatg
ggcctcaccg tgtccgcgct 60cttttcgcgg atcttcggga agaagcagat gcggattctc
atggttggct tggatgcggc 120tggcaagacc acaatcctgt acaaactgaa gttgggggag
attgtcacca ccatcccaac 180cataggcttc aatgtagaaa cagtggaata taagaacatc
tgtttcacag tctgggacgt 240gggaggccag gacaagattc ggcctctgtg gcggcactac
ttccagaaca ctcagggcct 300catctttgtg gtggacagta atgaccggga gcgggtccaa
gaatctgctg atgaactcca 360gaagatgctg caggaggacg agctgcggga tgcagtgctg
ctggtatttg ccaacaagca 420ggacatgccc aacgccatgc ccgtgagcga gctgactgac
aagctggggc tacagcactt 480acgcagccgc acgtggtatg tccaggccac ctgtgccacc
caaggcacag gtctgtacga 540tggtctggac tggctgtccc acgagctgtc aaagcgctaa
ccagccaggg gcaggcccct 600gatgcccgga agctcctgcg tgcatccccg gatgaccata
ctcccggact cctcaggcag 660tgccctttcc tcccactttt cctcccccat agccacaggc
ctctgctcct gctcctgcct 720gcatgttctc tctgttgttg gagcctggag ccttgctctc
tgggcacaga ggggtccact 780ctcctgcctg ctgggaccta tggaaggggc ttcctggcca
aggccccctc ttccagagga 840ggagcaggga tctgggtttc cttttttttt tctgttttgg
gtgtactcta ggggccaggt 900tgggaggggg aaggtgaggg cttcgggtgg tgctataatg
tggcactgga tcttgagtaa 960taaatttgct gtggtttgaa aaaaaaaaaa aaaaaaaaa
999
255 3487 DNA Homo sapiens
255 gtggcggtgg ctgcggcgac
ggcagaggcg aagggagccg gatcgccgac ctgagcggga 60ggcggcggtg gcggccatgg
cggcagatgg agagcgttcc ccgctgctgt ctgagcccat 120cgacggtggc gcgggcggca
acggtttagt ggggcccggc gggagtgggg ctgggcccgg 180gggaggcctg accccctccg
caccaccgta cggagccggt aaacatgccc cgccccaggg 240taagccgggg cgggtccgag
gtgctccccg gggtactctg aaagccgggg agggggcggg 300accgagggcg gaggcgggtc
ccagtcgcca ggtgcgggac tgctgcacct gtgactgggc 360gaggcttcct tccctccgta
atcgcgacca cagcctaggg acggaagggg gttctgagca 420acctgataga agtgccaatt
atgagaagcc ctccgagctt ggtcagaggg ttgaagatca 480gaaggacttc cctaccaccg
tggagcatca gtgggggtgt aagtgatccc agcccttcta 540tttgcttcct ctccagcatt
tcccccgttt cccgaggggc atccagccgt gttgcctggg 600gaggacccac ccccctattc
acccttaact agcccggaca gtgggagtgc ccctatgatc 660acctgccgag tctgccaatc
tctcatcaac gtggaaggca agatgcatca gcatgtagtc 720aaatgtggtg tctgcaatga
agccaccgtg agttacacat atctatgaaa tgggccctgt 780ttcctggatc ctctttctga
tgtcttggtt ctagaccctg accttccggc tattagccaa 840gtgcttttga tgatacccag
gtttcagttc caggtgtctc acacagccat ttccccagaa 900gccactcacc aaagctaatg
ttcactttct ctcactttta cacctagcct agttcctatt 960tgcaaatctc atgatatagt
ctttctttta tttctccttc ctggttagca ccttattttt 1020ctgatctcat aaagtgtttt
tggagggaag tggaggggat tgggattaga ggtttgcttg 1080ctgatgaccc tattattctc
tagccaatca agaatgcacc cccagggaaa aaatatgttc 1140gatgcccctg taactgtctc
cttatctgca aagtgacatc ccaacggatt gcatgccctc 1200gtccctactg gtaagaggca
taaggtgggg aagggcctaa gtggggaact ggaaagtcaa 1260aaaaggatga gcgtatacag
agaatgtaaa ggtgagagag cctagtgttt atttaggaga 1320aaaggctttg aagcatgtgc
ctcaggaatg ttatagctgt ctttctcgtt tctcaataaa 1380aatattgaga tgaaatgatg
tcgtttcgga gaatagagag ccttggggac tgggtgtgtt 1440atcctgaggt cggaggggaa
ttggggacct gaagtttaaa cagtgctctt tctttctcaa 1500ggattcttga gggtatacag
ttgggggaca gagtatctta agtacagaga agtcgagtga 1560cttaatagac agggagtggg
ggatgtggaa cagggactgt gaagattttt aggattaaaa 1620atttttcaaa cacaagtttg
aaaatacaag tctttttctt ttgtatagca aaagaatcat 1680caacctgggg cctgtgcatc
ccggacctct gagtccagaa ccccaaccca tgggtgtcag 1740ggttatctgt ggacattgca
agaatacttt tctggtgagg aaggggtatt gggaagggga 1800ggggaaagga gactaagagt
catttcgagt atatttctta gagtaatggt aatgacccct 1860gaaaggtctg tcctatggga
acatgttctg catccccacc ccaaggttct cattgaggga 1920gaccctgctt gtgctattat
ttttgttttc tttctccata gtggacagag ttcacagacc 1980gcactttggc acgttgtcct
cactgcagga aagtgtcatc tattgggcgc agatacccac 2040gtaagagatg tatctgctgc
ttcttgcttg gcttgctttt ggcagtcact gccactggcc 2100ttgccgtgag tacccttgcc
ccaacctctt tcattctgca gcctcatctc cataggctaa 2160gatttgggaa actgctaccc
taaaaaaaag tggaagaaac ttaggggact agtttgtttt 2220gttttaagat atggatgagc
taaagtgcaa agtggctgat caaacagact ttattactac 2280tacaagagtg aaaaacagcc
ttcctttctc tgtaggatga ggataggaca gtgaaattct 2340taatttaaga gttgctattt
ttcaaacctg gctcagttgt cagatattaa gaaaaactga 2400gatacagtgt gggatgggat
gagtatgtta cgcctaaggg aaggaagctg atcagctctg 2460cctttaagaa ggtccctgag
ggtggctaca tgtggataag gaacaaggac tgaagcgtga 2520gttattactg ttcttagaac
taataggagg tagtggagac caacattaac cccatctttc 2580ttttcttctc cctccttatc
ttcatcagtt tggcacatgg aagcatgcac ggcgatatgg 2640aggcatctat gcagcctggg
catttgtcat cctgttggct gtgctgtgtt tgggccgggc 2700tctttattgg gcctgtatga
aggtcagcca ccctgtccag aacttctcct gagcctgatg 2760acccacagac tgtgcctggc
ccctccctgg tggggacagt gacactacga agggagctgg 2820ggtagttaaa ggctcccggg
gcttctagaa ggaagccaag cagctgcctt ccttttccct 2880ggggagaggt aggaaggaac
caggccctca cttaggtttg gaggggcaga taagagcact 2940gctgaccatc tgctttcctc
caagggttgc tgtgtctagg gtgaagtagg caaaacgttg 3000cccttaaaac tgggccctga
agacggttcc agccttgtcc ttcctgtgtg ctccctgaga 3060gccattcctg tcccttacac
attccagggc agggtggggg tgggtagccc tgggggttcc 3120cctccctctt gtgcaccatt
aggactttgc tgctgctatt gcacttcacc agaggttggc 3180tctggcctca gtaccctcag
tctcctctcc ccacattgtg tcctgtgggg gtggggtcag 3240ccgctgctct gtacagaacc
acaggaactg atgtgtatat aactatttaa tgtgggatat 3300gttcccctat tcctgtattt
cccttaattc ctcctcccga ccttttttac ccccccagtt 3360gcagtattta actgggctgg
gtagggttgc tcagtctttg ggggaggtta gggacttatc 3420ctgtgcttgt aaataaataa
ggtcatgact ctaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3480aaaaaaa
3487
256 1651 DNA Homo sapiens
256 ctgtcagcac ggggcctggc atgtaattgg tctgcaccca ctggtgcact gaactgccat
60aacctcaggt tttctttctt gctgataccc ctgggtcatg ttctttggca aataacatga
120ttcattatga agtagagttc agcaaaggac aaggatgaaa gttgtcattt agagaactgc
180cattcagact ttcttgtcta ggtaaagagc aaggtcttct ctcttttcaa ctcattttct
240aaatttaaac tgacgatgag aatatggatg atgtgtagct tccttctccc ccactgattt
300ttggttcagg ctctgggttt ttggcaagaa cttacagatc tcacttatta ttggccaccc
360ttctgcttta agacctgtca gggcttgtct gaaataaaac tggaagcact tctgattcca
420tcctcactgc tttcctcctt caccgtcaga cagcattact gtatagcact gagtgagggg
480ccctgacact ggaaggtggc aggtggggcc tggccgccag tgaggtatca tcatttgtgt
540gtgctcatgt gtgcgttggg cttgttgtat ctgaggcatg aacattccat atacacggct
600taaagagttt tcttcccata ccgaaagcat atattcggag aggacccaac ttattcagca
660tagccttgtt cccatagtag ccatcctatt cccccacagc ctctacttta ggaaagctcc
720ccgtccccat atgaaatcca aaccaaaaaa gatatatcac tttcagctca attattccat
780aattacaaga tattaggcta gtgggctctt tattggttgg gtcttatatt aatgttatat
840gctagccttg taattttgag ctcctctatg gatgttaatt ttagtgaaac tctatattga
900agaaaagatg ggactaaggg ggagacagga ggaggaaaga aagcagagac aggcaaagaa
960tcatagcctg aaattcaaca gcaagcatgg cttatgaaga tcaagttata tttttgcttc
1020atgaatcatt gtcagacaaa ttaagaacat attgtttctt atttatctat tgtcaaggat
1080tcactatcag acactaagaa tgaatcttga ttttcataag ctctgttgac accatggagc
1140cacagagcat aaaacttgca tctaataaag aaagtgcaac atggaacagc agggagtgga
1200ataccagcac aactcacagc tgcttcctgt tcctcgtccc tgttttcagg aatgtttctt
1260agcaggaagt tttttaatag accgagaatt tgttatatgt attctaagaa aagttgtagt
1320tgtagatgca ttactctccc aaatcttaga gatcagggat gattatgttc catttttgtt
1380tggtgagttc ccatctttgt atgtacctcc ttgctcccgg ctgtcctcct ctcctcttcc
1440ctagtgagtg gttaatgagt gttaatgcct aaaccatact tgttttatgg acacttctat
1500aatggattcg ttgcataatt ttcatgcagt gtatagtgtt actagttgga aattcttgga
1560ggactcttag ctgtctgatg aaattcctag tagaaatttt tgttttgaat tcctaaagtt
1620gaaatatgaa aattatattt taatttgatt c
1651
257 2511 DNA Homo sapiens
257 agtttttctg gtagaaggcg gggttctcct cgtacgctgc
ggagtctctg cggggtgtag 60accggaatcc tgctgacggg cagagtggat cagggaggga
gggtcgagac acggtggctg 120caggtctgag acaaggctgc tccgaggtag tagctctctt
gcctggaggt ggccattcat 180tcctggagtg ctgctgagga gcgagggccc atctggggtc
tctggaagtc ggtgcccagg 240cctgaaggat agcccccctt gcgcttccct gggctgcggc
cggccttctc agaacgaagg 300gcgtccttcc accccgcggc gcaggtgacc gctgccatgg
cttttcccca tcggccggac 360gcccctgagc tgcctgactt ctccatgctg aagaggctgg
ctcgagacca gctcatctat 420ctgctggagc agcttcctgg aaaaaaggat ttattcattg
aggcagatct catgagccct 480ttggatcgaa ttgccaatgt ctccatcctg aagcaacacg
aagtagacaa gctatacaag 540gtggagaaca agccagccct cagctccaat gaacaattgt
gcttcttggt cagaccccgc 600atcaagaata tgcgatacat tgccagtctt gtcaatgctg
acaaattggc tggccgaact 660cgcaaataca aagtgatctt cagccctcaa aagttctatg
cgtgtgagat ggtgcttgag 720gaagagggaa tctatggaga tgtgagctgt gatgaatggg
ccttctcttt gctgcctctt 780gatgtggatc tgctgagcat ggaactacca gaatttttca
gggattactt tctggaagga 840gatcagcgtt ggatcaacac tgtagctcag gccttacacc
ttctcagcac tctctatgga 900ccctttccaa actgctatgg aattggcagg tgcgccaaga
tggcatatga attgtggagg 960aacctggagg aggaggagga tggcgaaacc aagggccgaa
ggccagagat tggacatatc 1020tttctcttgg acagagatgt ggactttgtg acagcacttt
gctcccaagt ggtttatgag 1080ggcctagtag atgacacctt ccgcatcaag tgtgggagtg
tcgactttgg cccagaagtc 1140acatcctctg acaagagcct gaaggtgcta ctcaatgccg
aggacaaggt gtttaatgag 1200attcggaacg agcacttctc caatgtcttt ggcttcttga
gccagaaggc ccggaacttg 1260caggcccagt atgatcgccg gagaggcatg gacattaagc
agatgaagaa tttcgtgtcc 1320caggagctca agggcctgaa acaggagcac cgcctgctga
gtctccatat tggggcctgt 1380gaatccatca tgaagaagaa aaccaagcag gatttccagg
agctaatcaa gactgagcat 1440gcactgctag aggggttcaa catccgggag agcaccagct
acattgagga acacatagac 1500cggcaggtgt cgcctataga aagcctgcgc ctcatgtgcc
ttttgtccat cactgagaat 1560ggtttgatcc ccaaggatta ccgatctctg aaaacacagt
atctgcagag ctatggccct 1620gagcacctgc taaccttctc caatctgcga agagctgggc
tcctaacgga gcaggccccc 1680ggggacaccc tcacagccgt ggagagtaaa gtgagcaagc
tggtgaccga caaggctgca 1740ggaaagatta ctgatgcctt cagttctctg gccaagagga
gcaattttcg tgccatcagc 1800aaaaagctga atttgatccc acgtgtggac ggcgagtatg
atctgaaagt gccccgagac 1860atggcttacg tcttcagtgg tgcttatgtg cccctgagct
gccgaatcat tgagcaggtg 1920ctagagcggc gaagctggca gggccttgat gaggtggtac
ggctgctcaa ctgcagtgac 1980tttgcattca cagatatgac taaggaagac aaggcttcca
gtgagtccct gcgcctcatc 2040ttggtggtgt tcttgggtgg ttgtacattc tctgagatct
cagccctccg gttcctgggc 2100agagagaaag gctacaggtt cattttcctg acgacagcag
tcacaaacag cgctcgcctt 2160atggaggcca tgagtgaggt gaaagcctga tgtttttccc
ggccagtgtt gacatcttcc 2220ctgaacacat tcctcagtga gatgcaggca tctggcaccc
agctgctata accaagtgtc 2280caccaactac ctgctaagag ccgggagcat ggaacgtgtt
gggatttaga gaacattatc 2340tgagaaaaga gttcacttcc tgctcccagg atatttctct
tttctgttta tgaagtacaa 2400cccatgctgc taagatgcga gcaggaagag gcatcctttg
ctaaatcctg tttgaatgtc 2460attgtaaata aagcctctgc tctcagatgt aaaaaaaaaa
aaaaaaaaaa a 2511
258 2401 DNA Homo sapiens
258 ggcacgaggg
gtcgcgctgc cgccgtttta tttgaagaca tcgtccagtt ctgaccatgg 60actcgcagcc
atcggccctt agtttccatc ccctctagtg ggccttcggg ggctctactg 120acgtccctcc
ttcccttggt accgggccgg ggaagtgttc tcgggcgcgg gaggttccgc 180atgcccaggc
ctggccaggg gagatgaccg atccgtcgct ggggctgaca gtccccatgg 240cgccgcctct
ggccccgctc cctccccggg acccaaacgg ggcgggatcc gagtggagaa 300agcccggggc
cgtgagcttc gccgacgtgg ccgtgtactt ctcccgggag gagtggggct 360gcctgcggcc
cgcgcagagg gccctgtacc gggacgtgat gcgggagacc tacggccacc 420tgggcgcgct
cggtgagagc cccacctgct tgcctgggcc ctgcgcctcc acaggccctg 480ccgcgcctct
gggagctgcg tgtggagttg ggggccccgg ggccgggcag gcggcctcct 540cgcagcgtgg
ggtttgcgtt cttctccccc aggagtcgga ggcagcaagc cggcgctcat 600ctcctgggtg
gaggagaagg ccgaactgtg ggatccggct gcccaggatc cggaggtggc 660gaagtgtccg
acagaagcgg acccagcaga ttccagaaac aaggaagagg aaagacaaag 720ggaagggacg
ggagccctgg agaagcccga ccctgtggcc gccgggtctc ctgggctgaa 780ggctccccaa
gccccctttg ccgggttgga gcagctgtcc aaggcccggc gccggagtcg 840cccccgcttt
tttgcccacc cccctgtccc ccgagctgac cagcgtcacg gctgctacgt 900gtgcgggaag
agcttcgcct ggcgctccac actggtggag cacatttaca gccacagggg 960cgagaagccc
ttccactgcg cagactgcgg caagggcttc ggccacgctt cctccctgag 1020caaacaccgg
gccatccatc gtggggagcg gccccaccgc tgtcccgagt gtggtcgggc 1080cttcatgcgc
cgcacggcgc tgacttctca cctgcgcgtt cacactggcg agaagcccta 1140ccgctgcccg
cagtgtggcc gctgcttcgg cctgaagacc ggcatggcca agcaccaatg 1200ggtccatcgg
cccgggggcg aggggcgtag gggccggcgc cctggggggc tgtctgtgac 1260cctgactcct
gtccgcgggg acctggaccc gcctgtgggc ttccagctgt atccagagat 1320attccaggaa
tgtgggtgac ggcctaaaaa gtgaccatct agacattgtg ggcggcccga 1380gatgggctca
ggggcccgaa cctctgcagc ggcctgcagg gaggtcccag aatccaccgc 1440aagagctggc
ctggggtgcg gacagtctga tcttgggctc tcagcagcct cttctgccag 1500caccttgctc
cccgctgccc tgggctctcc aaggccccct ttgctgaggc agggctgagg 1560tgagaacccc
ccagacctcc atacagggaa gcaaaagctg tttctcctcc cagagatgct 1620aagaggattg
aggtagagaa gaaccttgtt ttctctgttg tctttttctt tttacttttt 1680taattttttg
agacggagtt ttgctcttgt tgcccaggct ggagtgcaat ggtgcgatct 1740cgactcactg
caacttccac ctcctggagt caagcgattc tcctgcctca gccacccaag 1800tagctggaat
tacaggcacc tgccactatg cccggctaac tttttgtatt tttagtagag 1860atggggtttc
accatgttgg ctaggctggt ctcgaactcc tgccctcagg tgatccaccc 1920acctctgcct
cccaaagtgc tgggattaca ggcgtgagcc acctcacctg gccttttctt 1980ttttattctt
tgaccttccc acaagacaat acccattgtc tgtttttttt gtttatttat 2040ttacttatta
agacagcatc ttgctcctca cccaggctgg aatgcagtgg tgtgaactgg 2100gctcactgca
gcctagacct gctgggctca aggaatcctc ctgccccagc ctctcagatg 2160gctgtgacta
caggtgggca acactatgcc tggttaattt ttaaattttt ttgcagagat 2220ggggttccca
ctatgttgat caggctggtc tcaaactcct cggttcaagc aattcgccca 2280ccttggcctc
ccaaagtgct gggattacag gggagccact gcactggcct tcattgtctt 2340tttgctgcac
aacctaaaaa accagtgacc ctgtattgga aaaaaaaaaa aaaaaaaaaa 2400a
2401
259 2384 DNA Homo sapiens
259 gccatggccg ccggccccgc gccgcccccc ggccgccccc gggcgcagat
gccgcatctg 60aggaaggtgc gaggcggatg gagcgggtgg tcgtgagcat gcaggacccc
gaccagggcg 120tgaagatgcg gagccagcgc ctgctggtca ccgtcattcc ccacgcggtg
acaggcagcg 180acgtcgtgca gtggttggcc cagaagttct gcgtctcgga ggaggaggcc
ctgcacctgg 240
gcgccgtcct ggtgcagcat ggctacatct acccgctgcg cgacccccgt agcctcatgc 300
tccggccaga cgagacgccc tacaggttcc agaccccgta cttctggaca agtaccctga 360
ggccggctgc agagctggac tatgccatct acctggccaa gaagaacatc cgaaaacggg 420
ggaccctggt ggattatgag aaggactgct atgaccggct acacaagaag atcaaccacg 480
catgggacct ggtgctgatg caggcgaggg agcagctgag ggcagccaag cagcgcagca 540
agggggacag gctggtcatt gcgtgccagg agcagaccta ctggctggtg aacaggcccc 600
cgcccggggc ccccgatgtg ctggagcagg gtccagggcg gggatcctgc gctgccagcc 660
gtgtgctcat gaccaagagt gcagatttcc ataagcggga gatcgagtac ttcaggaaag 720
cgctgggcag gacccgagtg aagtcctccg tctgccttga ggcgtacctg agtttctgcg 780
gccagcgtgg accccacgat cccctcgtgt cggggtgcct gcccagcaat ccctggatct 840
cagacaatga cgcctactgg gtcatgaatg cccccacggt ggctgccccc acgaagctcc 900
gtgtggagag atggggcttc agcttccggg agctcctgga ggaccccgtg gggcgggccc 960
acttcatgga ctttctggga aaggagttca gtggagaaaa cctcagcttc tgggaggcat 1020
gtgaggagct tcgatatgga gcgcaggccc aggtccccac cctggtggat gccgtgtacg 1080
agcagttcct ggcccccgga gctgcccact gggtcaacat cgacagccgg accatggagc 1140
agaccctgga ggggctgcgc cagccccacc gctatgtcct ggatgacgcc cagctgcaca 1200
tatacatgct catgaagaag gactcctacc caaggttcct gaagtctgac atgtacaagg 1260
ccctcctggc agaggctggg atcccgctgg agatgaagag acgcgtgttc ccgtttacgt 1320
ggaggccacg gcactcgagc cccagccctg cactccttcc cacccctgtg gagcccacag 1380
cggcttgtgg ccctgggggt ggagatgggg tggcctagtg gacctggccc atctgccact 1440
ctagtccctg cagctcaacg tcctgcgtga atgcagcagc cacccccgtc ttggcccagg 1500
tcctgggggc tgctgaaccc agcaccagtg tccccttgtg cccagggggc ccagtcttct 1560
gtggggtgca cagcctccct ccctccagca agccctccct gcccagaagg aatgggtcca 1620
ggtgtggatt cccagggagg gggttcattg gctcagcttg ggtcagggca gagcctgtta 1680
cctgaagaga ggtgagacca aggccacagg gagctccacc ttctctggtc ttcagtccag 1740
cactgggtgc ccatccccat ctctaaaacc agtaaatcag ccagcgaata cccggaagca 1800
agatgcacag gcgggcggct tcccacacac ccgtcacaag acgcggacat gcaggtctcg 1860
gcgcgagctc tgccccgtcc aagagcctct ccgctgtcgc ccagtgtgag cctggaagag 1920
gacccaagag agtgccgtgc tgaggctgcc tcgaggtcac tgccttccgg agctgcgcct 1980
attcctccct cgccaaacgc gttccagaat ttgtccacag gtgcgccggc acctgctttc 2040
ccacctcgag gccgcggcct cccccccgat ttatagacaa ctctgacatt gtcaccccac 2100
tgacgaggcc cgattccata gggtggatcc ttgccaggcg tccctgatcc tccctgccca 2160
agtcttcctt cgtgagctgg ccttgctccc catcccccaa gtgcctcacc agtcccccag 2220
actgggtgaa ggtacagctg gctcctttcg ggggtgcagc ttcaactctc tcggcggtag 2280
ggcggtgcca tccccaccca tagggctggc tcacatccag tcactcccaa cagcgtccag 2340
cacacaaata aaagaccctt gggccctggc tctgagaaaa aaaa
2384

<210> 260
<211> 1500
<212> DNA
<213> Homo sapiens
<400> 260
agactgccga gcagccttga gccgttgagc agctgaacag aggccatgcc ggggcactcc 60
gaggcctgag acgaccacgc ctgtgccgct gaggaccttc atcagggctc cgtccacttg 120
gcccgcttgg ctgtccaatc acactccagt gtcaaccact ggcacccagc agccaagaga 180
ggtgtggcgt ggccctgggg acgcatggct gaggcaggaa caggtgagcc gtcccccagc 240
gtggagggcg aacacgggac ggagtatgac acgctgcctt ccgacacagt ctccctcagt 300
gactcggact ctgacctcag cttgcccggt ggtgctgaag tggaagcact gtccccgatg 360
gggctgcctg gggaggagga ttcaggtcct gatgagccgc cctcaccccc gtcaggcctc 420
ctcccagcca cggtgcagcc attccatctg agaggcatga gctccacctt ctcccagcgc 480
agccgtgaca tctttgactg cctggagggg gcggccagac gggctccatc ctctgtggcc 540
cacaccagca tgagtgacaa cggaggcttc aagcggcccc tagcgccctc aggccggtct 600
ccagtggaag gcctgggcag ggcccatcgg agccctgcct caccaagggt gcctccggtc 660
cccgactacg tggcacaccc cgagcgctgg accaagtaca gcctggaaga tgtgaccgag 720
gtcagcgagc agagcaatca ggccaccgcc ctggccttcc tgggctccca gagcctggct 780
gcccccactg actgcgtgtc ctccttcaac caggatccct ccagctgtgg ggaggggagg 840
gtcatcttca ccaaaccagt ccgaggggtc gaagccagac acgagaggaa gagggtcctg 900
gggaaggtgg gagagccagg caggggcggc cttgggaatc ctgccacaga caggggcgag 960
ggccctgtgg agctggccca tctggccggg cccgggagcc cagaggctga ggagtggggc 1020
agcccccatg gaggcctgca ggaggtggag gcactgtcag ggtctgtcca cagtgggtct 1080
gtgccaggtc tcccgccggt ggaaactgtt ggcttccatg gcagcaggaa gcggagtcga 1140
gaccacttcc ggaacaagag cagcagcccc gaggacccag gtgctgaggt ctgagaggga 1200
gatggcccag cctgacccca ctggccactg ccatcctgct gccttcccag tggggctggt 1260
cagggggcag cctggccact gcctagctgg aatgggagga agcctgcagg tggcaccggt 1320
ggccctggct gcagttctgg gcagcatcct cccaagcaga gaccttgctg aagctcctgg 1380
ggtgtggggt gtgggctgga agcactggct ccctggtagg gacaataaag gttttgggtc 1440
tttcaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaac 1500

<210> 261
<211> 60
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic construct - oligonucleotide
<400> 261
tcatcttcac caaaccagtc cgaggggtcg aagccagaca cgagaggaag agggtcctgg 60

<210> 262
<211> 60
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic construct - oligonucleotide
<400> 262
ctctgctcct gctcctgcct gcatgttctc tctgttgttg gagcctggag ccttgctctc 60

<210> 263
<211> 60
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic construct - oligonucleotide
<400> 263
tgctcccggc tgtcctcctc tcctcttccc tagtgagtgg ttaatgagtg ttaatgccta 60

<210> 264
<211> 60
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic construct - oligonucleotide
<400> 264
ccccatctct aaaaccagta aatcagccag cgaatacccg gaagcaagat gcacaggcgg 60

<210> 265
<211> 60
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic construct - oligonucleotide
<400> 265
ccagaaacaa ggaagaggaa agacaaaggg aagggacggg agccctggag aagcccgacc 60

<210> 266
<211> 60
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic construct - oligonucleotide
<400> 266
aagtacaacc catgctgcta agatgcgagc aggaagaggc atcctttgct aaatcctgtt 60

<210> 267
<211> 60
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic construct - oligonucleotide
<400> 267
acctcacccc tgcccggccc aagctctact tgtgtacagt gtatattgta taatagacaa 60

<210> 268
<211> 60
<212> DNA
<213> Artificial Sequence
<220>
<223> Synthetic construct - oligonucleotide
<400> 268
ttcccttaat tcctcctccc gacctttttt acccccccag ttgcagtatt taactgggct 60