Patent application title: NOVEL COMPOSITIONS OF COMBINATIONS OF NON-COVALENT DNA BINDING AGENTS AND ANTI-CANCER AND/OR ANTI-INFLAMMATORY AGENTS AND THEIR USE IN DISEASE TREATMENT
Inventors:
IPC8 Class: AA61K315517FI
USPC Class:
Class name:
Publication date: 2018-02-15
Patent application number: 20180042938
Abstract:
The invention provides for compositions for treating a cancer or an
inflammatory disorder comprising a combination of agents in a
pharmaceutically acceptable carrier, wherein said agents comprise: (i) a
non-covalent DNA binding agent; and (ii) an anti-cancer or
anti-inflammatory agent.
Claims:
1.-43. (canceled)
44. A method of treating a subject with cancer or inflammation, comprising: a. identifying a subject in need of treatment; b. administering to said subject a therapeutically effective amount of one or more of (a) a non-covalent DNA binding agent and (b) an anti-cancer agent or an anti-inflammatory agent; wherein following said administration, there is inhibition of growth of a cancer cell or inflammation.
45. The method of claim 44, wherein said identification step comprises determining whether said patient has a mutation in one or more genes and/or the gene pathway selected from the group consisting of: PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, XRCC1, XRCC2, XRCC3, RAD51, RAD52, REV, ATM, ATR, K-Ras, BRAF and the MRE1/RPA1/RAD51 complex.
46. (canceled)
47. The method of claim 44, wherein said subject has a loss of function of at least one tumor suppressor gene.
48. The method of claim 47, wherein said at least one tumor suppressor gene and/or the gene pathway is selected from the group consisting of: PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, XRCC1, XRCC2, XRCC3, RAD51, RAD52, REV, ATM, ATR, K-Ras, BRAF and the MRE1/RPA1/RAD51 complex.
49. The method of claim 44, wherein said subject has a DNA mismatch repair deficiency.
50. The method of claim 44, wherein said subject does not have a DNA mismatch repair deficiency.
51. The method of claim 44, wherein said cancer is mutant K-ras positive or has other mutations in oncogenes and/or the oncogene pathway, conferring "gain of function".
52. The method of claim 44, wherein said cancer is wild-type and/or mutant K-ras or BRAF gene and/or the wild-type or mutant K-ras or BRAF gene pathway, and as such genes or gene pathways in the epidermal growth factor receptor (EGFR) signaling pathway.
53. The method of claim 44, wherein said identification step comprises determining the response of a patient to a therapy for treating cancer.
54. The method of claim 44, wherein said identification step is reported to said subject and/or a health care professional.
55. The method of claim 44, wherein said non-covalent DNA binding agent binds to the minor groove of DNA.
56. The method of claim 44, wherein said non-covalent DNA binding agent binds to a GC rich region of the minor groove.
57. The method of claim 44, wherein said subject has a mutation in one or more genes and/or the gene pathway selected from the group consisting of: PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, XRCC1, XRCC2, XRCC3, RAD51, RAD52, REV, ATM, ATR, K-Ras, BRAF and the MRE1/RPA1/RAD51 complex.
58. (canceled)
59. The method of claim 44, wherein said cancer is selected from the group consisting of: lung cancer, breast cancer, osteosarcoma, neuroblastoma, colon adenocarcinoma, chronic myelogenous leukemia (CML), acute myeloid leukemia (AML), acute promyelocytic leukemia (APL), sarcoma, myxoma, rhabdomyoma, fibroma, lipoma, teratoma; bronchogenic carcinoma, alveolar carcinoma, bronchial adenoma, sarcoma, lymphoma, chondromatous hamartoma, mesothelioma, esophageal cancer, stomach cancer, pancreatic cancer, small bowel cancer, large bowel cancer; kidney cancer, bladder cancer, urethra cancer, prostate cancer, testis cancer; hepatoma, cholangiocarcinoma, hepatoblastoma, angiosarcoma, hepatocellular adenoma, hemangioma, osteogenic sarcoma, fibrosarcoma, malignant fibrous histiocytoma, chondrosarcoma, Ewing's sarcoma, malignant lymphoma, multiple myeloma, malignant giant cell tumor, chordoma, osteochondroma, benign chondroma, chondroblastoma, chondromyxofibroma, osteoid osteoma, giant cell tumors, cancer of the skull, meninges cancer, brain cancer, spinal cord cancer, uterus cancer, cervical cancer, cancer of the ovaries, vulva cancer, vagina cancer, Hodgkin's disease, non-Hodgkin's lymphoma, malignant melanoma, basal cell carcinoma, squamous cell carcinoma, Kaposi's sarcoma, moles, dysplastic nevi, lipoma, angioma, dermatofibroma.
60. The method of claim 44, wherein said cancer is triple negative breast cancer which is negative for the estrogen receptor (ER), progesterone receptor (PR) and HER2/neu (HER2) receptors.
61. The method of claim 44, wherein said cancer is MMR-deficient colorectal cancer.
62. The method of claim 44, wherein said cancer is glioblastoma.
63. (canceled)
64. The method of claim 44, wherein said cancer is non-small cell lung cancer.
65.-70. (canceled)
71. The method of claim 44, wherein said subject is a mammal.
72. The method of claim 44, wherein said subject is a human.
73.-164. (canceled)
Description:
[0001] Throughout this application various publications are referenced.
The disclosures of these publications in their entireties are hereby
incorporated by reference into this application in order to more fully
describe the state of the art to which this invention pertains.
FIELD OF THE INVENTION
[0002] The invention relates to non-covalent DNA binding agents, alone or in combination with anti-cancer agents and/or anti-inflammatory agents, that can be used to treat cancer and inflammation.
BACKGROUND OF THE INVENTION
[0003] Cancers are caused by multiple genetic changes that drive tumorigenesis. Over the past several years, overexpressed oncogenic targets such as receptor tyrosine kinases (RTKs) have been targeted for treatment of cancers. Cancers can also arise from the loss of tumor suppressor gene functions such as through the loss of p53, BRCA1, BRCA2, PTEN and other tumor suppressor genes. Currently no therapeutic approaches have been designed to target cancers that are due to the loss of tumor suppressor gene functions.
[0004] The concept of synthetic lethality was recently introduced into the field of cancer therapeutics. Initial research in the field of synthetic lethality indicated that two genes are synthetic lethal if mutation of either gene alone is compatible with viability but mutation of both genes results in cell death. There have been recent examples of treatment of cancers that have a BRCA1 gene deficiency by administration of a DNA crosslinking agent, such as a platinum drug, in combination with an inhibitor of an overexpressed gene, such as PARP, to produce a synthetic lethal outcome in such BRCA1 deficient tumor cells (A. Ashworth: A synthetic lethal therapeutic approach: Poly(ADP) Ribose Polymerase Inhibitors for the Treatment of Cancers Deficient in DNA Double-Strand Break Repair. J Clinical Oncology 26:3785-3790, 2008; Rehman, F. L., Lord, C. J. and Ashworth, A. Synthetic lethal approaches to breast cancer therapy. Nat Rev Clin Oncol 7: 718-724, 2010; O'Shaughnessy, J., Osborne, C., Pippen, J. E., Yoffe, M., Patt, D., Rocha, C., Koo, I. C., Sherman, B. M. and Bradley, C. Iniparib plus chemotherapy in metastatic triple-negative breast cancer. N Engl J Med 364: 205-214, 2011).
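The definition of synthetic lethality given above can be stated as a simple truth table. The following is a minimal illustrative sketch only, not part of the disclosed methods; the function names `is_viable` and `is_synthetic_lethal` are hypothetical, and the viability rule is an assumed toy model in which a cell survives if at least one of the two genes is functional.

```python
from typing import Callable


def is_viable(gene_a_functional: bool, gene_b_functional: bool) -> bool:
    """Toy viability model (assumption): the cell survives if at least
    one of the two genes retains its function."""
    return gene_a_functional or gene_b_functional


def is_synthetic_lethal(viability: Callable[[bool, bool], bool]) -> bool:
    """Return True if the gene pair meets the definition: loss of either
    gene alone is compatible with viability, but loss of both is lethal."""
    return (
        viability(True, True)          # both genes functional: viable
        and viability(False, True)     # gene A lost alone: viable
        and viability(True, False)     # gene B lost alone: viable
        and not viability(False, False)  # both lost: cell death
    )


if __name__ == "__main__":
    print(is_synthetic_lethal(is_viable))
```

In the therapeutic setting described in this paragraph, one of the two "losses" is a pre-existing deficiency in the tumor cell (for example, BRCA1) and the other is imposed by the administered agent or agents.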
[0005] Currently, labor intensive bioinformatic analysis and small molecule or RNAi screens are needed to identify synthetic lethal relationships between well-established therapeutic targets and/or lesser-known components of cancer cells' signaling networks.
[0006] At present, the only clinical application of synthetic lethality is the use of DNA crosslinking platinum drugs such as carboplatin, together with an antimetabolite such as gemcitabine, in combination with a poly(ADP-ribose) polymerase (PARP) inhibitor, such as iniparib, in patients with triple-negative breast cancer that have BRCA1 and/or BRCA2 mutations (O'Shaughnessy et al., N Engl J Med 364: 205-214, 2011). Preclinical studies were required to establish synthetic lethal relationships among the combination of a DNA crosslinking agent (platinum), an antimetabolite (gemcitabine) and the inhibition of the DNA repair enzyme PARP, together with the genetic inactivation of tumor suppressor genes BRCA1 or BRCA2.
[0007] A clear advantage of cancer treatments based on synthetic lethality is that they have minimal toxicity, because only cells with the impairments that comprise the synthetic lethal relationship (e.g., a mutated gene and a therapeutically inhibited enzyme) should be affected. Those cells should almost exclusively be cancer cells. Treatments based on synthetic lethality offer the advantage of overcoming the problem of targets that, either due to underlying biology or the targets' actual physical makeup, are "undruggable" with small molecule and biologic drugs. As much as 75% of the identified molecular targets for cancer may be "undruggable".
[0008] A key obstacle to appropriate treatment of cancers and other inflammatory diseases is the resistance or refractory responses to available therapies. For example, it is well known that tumor cells develop mutations in various genes and/or their expressed proteins. Such mutations allow the tumor cells to become refractory to currently available anticancer agents and thus the patients do not have therapeutic options. The novel invention described in this application shows the benefit of using non-covalent DNA binding agents that show synthetic lethality in tumors that carry mutations, particularly in DNA repair or tumor suppressor genes, that result in a "loss of function" in the cell's ability to either repair itself or go into apoptosis or programmed cell death. Since such mutations in DNA repair or tumor suppressor genes also render the tumor cells refractory to available treatments, the novel combinations of one or more non-covalent DNA binding agents with one or more anticancer or anti-inflammatory agents represent a novel and unique way to treat tumor cells that have "loss of function" in tumor suppression and/or DNA repair functions.
[0009] Furthermore, in view of the fact that a) it is difficult to identify and/or predict synthetic lethal relationships, and b) the importance of cancer treatments based on synthetic lethality, there is a real and immediate need for methods of disease treatment based on combinations of agents that can leverage synthetic lethality and to develop such novel combinations in a rapid time frame, so that it does not involve time consuming identification of synthetic lethal relationships amongst genes. Moreover, such novel compositions of agents should result in treatment methods that are non-toxic. This application describes unique and novel compositions of combinations of one or more non-covalent DNA binding agents with one or more available anticancer agents, including but not limited to, those agents that have become refractory due to mutations in such cells, and provides novel methods of therapy for treatment of highly unmet clinical need in cancer and inflammatory diseases, while leveraging the concept of synthetic lethality.
SUMMARY OF THE INVENTION
[0010] The invention relates to novel compositions and methods of disease treatment comprising using one or more non-covalent DNA binding agents to create synthetic lethal combinations in cells that have "loss of function" in tumor suppressor and/or DNA repair pathways. The invention provides for the use of one or more non-covalent DNA binding agents as a monotherapy, that is, they function in the absence of other active agents, to, e.g., create synthetic lethality in tumors that exhibit loss of tumor suppressor gene function, thereby treating disease. In one embodiment of the invention, one or more non-covalent DNA binding agents may be used in combination with one or more anti-cancer agents and/or anti-inflammatory agents to, e.g., create synthetic lethality in tumors that exhibit loss of tumor suppressor gene function, so as to treat disease.
[0011] The invention also relates to novel compositions and methods of disease treatment comprising using one or more non-covalent DNA binding agents to treat a subject with at least one of a DNA repair deficiency, dysregulated apoptosis, a replication deficiency, loss of function of a tumor suppressor gene, deficiencies in DNA recombination, a ubiquitin disorder, cell cycle dysregulation and/or dysregulated translesion synthesis. In a further embodiment, one or more non-covalent DNA binding agents may be used with one or more anti-cancer agents in novel compositions and methods of disease treatment.
[0012] The invention provides for novel compositions and methods of treating a subject with at least one of a gene deficiency, a protein deficiency, a DNA repair deficiency, dysregulated apoptosis, a recombination deficiency, a replication deficiency, a cell proliferation disorder, dysregulated transcription, loss of function of a tumor suppressor gene, a ubiquitin disorder, cell cycle dysregulation and/or dysregulation of translesion synthesis, comprising administering to the subject a therapeutically effective amount of one or more non-covalent DNA binding agents, as the only active agents, or in combination with one or more anti-cancer and/or anti-inflammatory active agents.
[0013] In one embodiment, the DNA repair deficiency is at least one of: DNA mismatch repair (MMR) deficiency, base excision repair (BER) deficiency, nucleotide excision repair (NER) deficiency, recombinational repair deficiency, homologous recombination repair (HRR) deficiency, non-homologous end joining (NHEJ) deficiency, a deficiency in the repair of double stranded breaks, and a deficiency in the repair of chromosomal damage.
[0014] The invention also provides for novel compositions and methods of treating a subject with cancer or inflammation, comprising: identifying a subject in need of treatment; administering to the subject a therapeutically effective amount of one or more non-covalent DNA binding agents, as the only active agents, or in combination with one or more anti-cancer and/or anti-inflammatory active agents; wherein following the administration, there is inhibition of inflammation or growth of a cancer cell.
[0015] In one embodiment, the identification step comprises determining whether the patient has a mutation in at least one gene selected from the group consisting of: PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, XRCC1, XRCC2, XRCC3, RAD51, RAD52, REV, ATM, ATR, and the MRE1/RPA1/RAD51 complex.
[0016] The invention also provides for novel compositions and methods of treating a subject with cancer, comprising administering to the subject a therapeutically effective amount of one or more non-covalent DNA binding agents, as the only active agents, or in combination with one or more anti-cancer active agents, wherein following the administration, there is inhibition of growth of a cancer cell.
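By way of illustration only, the determination described in the identification step above can be sketched as a lookup of a subject's reported mutated genes against the listed panel. This is a hypothetical sketch and not a disclosed assay: the names `PANEL` and `panel_hits` are assumptions, the input is assumed to be a list of gene symbols already produced by some genotyping test, and the MRE1/RPA1/RAD51 complex is omitted from the lookup because it is not a single gene symbol.

```python
# Single-gene members of the group recited in the identification step.
# The multi-protein complex is deliberately left out of this toy lookup.
PANEL = (
    "PTEN", "p53", "BRCA1", "BRCA2", "MLH1", "PMS1", "PMS2", "MSH2",
    "MSH6", "REV3", "XRCC1", "XRCC2", "XRCC3", "RAD51", "RAD52", "REV",
    "ATM", "ATR",
)


def panel_hits(reported_mutations):
    """Return the panel genes found among the reported mutated genes.

    Matching ignores case and surrounding whitespace; each hit is
    returned once, spelled as in PANEL, in the order first reported.
    """
    lookup = {gene.upper(): gene for gene in PANEL}
    hits = []
    for symbol in reported_mutations:
        key = symbol.strip().upper()
        if key in lookup and lookup[key] not in hits:
            hits.append(lookup[key])
    return hits


if __name__ == "__main__":
    # Hypothetical report for one subject; EGFR is not on the panel.
    print(panel_hits(["brca1", "EGFR", "PTEN "]))
```

A non-empty result would correspond to a subject having a mutation in at least one gene of the group; how such a result informs treatment is governed by the embodiments described herein, not by this sketch.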
[0017] In one embodiment, the subject has a loss of function of at least one tumor suppressor gene.
[0018] In another embodiment, at least one tumor suppressor gene and/or the gene pathway is selected from the group consisting of: PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, XRCC1, XRCC2, XRCC3, RAD51, RAD52, REV, ATM, ATR, K-Ras, BRAF and the MRE1/RPA1/RAD51 complex.
[0019] In another embodiment, the subject has a DNA mismatch repair gene or pathway deficiency.
[0020] In another embodiment, the subject does not have a DNA mismatch repair gene or gene pathway deficiency, i.e., the subject has no loss of function in DNA mismatch repair.
[0021] In another embodiment, the cancer is mutant K-ras positive or has mutations in the K-Ras pathway.
[0022] In another embodiment, the cancer has wild-type K-ras and no mutations in the K-Ras signaling pathway.
[0023] In another embodiment, the identification step comprises determining the response of a patient to a therapy for treating cancer.
[0024] In another embodiment, the identification step is reported to the subject and/or a health care professional.
[0025] In another embodiment, the non-covalent DNA binding agent binds to the minor groove of DNA.
[0026] In another embodiment, the non-covalent DNA binding agent binds to a "G-C rich" region of the minor groove.
[0027] In another embodiment, the subject has a mutation in at least one of a gene or gene pathway selected from the group consisting of: PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, XRCC1, XRCC2, XRCC3, RAD51, RAD52, REV, ATM, ATR, K-Ras, BRAF and the MRE1/RPA1/RAD51 complex.
[0028] In another embodiment, the patient cannot be treated by other therapies, i.e., the tumor is refractory or resistant to available therapies.
[0029] In another embodiment, the cancer is selected from the group consisting of: breast cancer, colorectal cancer, leukemia, non-small cell lung cancer, ovarian cancer, renal cancer, melanoma, prostate cancer and CNS-cancers. The cancer may be a primary cancer or a metastatic cancer.
[0030] In another embodiment, the cancer is triple negative breast cancer.
[0031] In another embodiment, the cancer is MMR-deficient colorectal cancer.
[0032] In another embodiment, the cancer is glioblastoma.
[0033] In another embodiment, the novel composition comprises the non-covalent DNA binding agent or the pharmaceutically acceptable salt or prodrug thereof.
[0034] In another embodiment, the subject is a mammal.
[0035] In another embodiment, the subject is a human.
[0036] In another embodiment, the therapeutically effective amount of one or more non-covalent DNA binding agent is in the range of 0.001 mg to 1000 mg per subject.
[0037] In another embodiment, the administration step comprises administering one or more non-covalent DNA binding agent to the subject in accordance with a daily treatment regimen.
[0038] In another embodiment the administration step comprises administering one or more non-covalent DNA binding agent as a pharmaceutical formulation.
[0039] In another embodiment, the pharmaceutical formulation is a bioequivalent formulation of one or more non-covalent DNA binding agent.
[0040] In another embodiment, the pharmaceutical formulation is a pharmaceutically equivalent formulation.
[0041] In another embodiment, the pharmaceutical formulation is a therapeutically equivalent formulation.
[0042] The invention also provides for a novel composition of packaged pharmaceutical comprising one or more non-covalent DNA binding agents or pharmaceutically acceptable salt or prodrug thereof, which, upon administration to a subject, inhibits the growth of a cancer cell.
[0043] The invention also provides for a novel composition of packaged pharmaceutical comprising: one or more non-covalent DNA binding agents or pharmaceutically acceptable salt or prodrug thereof; and associated instructions for using the non-covalent DNA binding agent(s) to treat cancer.
[0044] In one embodiment, one or more of the non-covalent DNA binding agent is present as a pharmaceutical composition comprising a therapeutically effective salt or prodrug thereof and a pharmaceutically acceptable carrier.
[0045] In another embodiment, the packaged pharmaceutical further comprises in the instructions a step of identifying a subject in need of such pharmaceutical.
[0046] In another embodiment, the packaged pharmaceutical further comprises in the instructions a step of identifying one or more non-covalent DNA binding agent and one or more anticancer agent as capable of inhibiting the growth of a cancer cell.
[0047] In another embodiment, the invention provides for a novel composition of packaged pharmaceutical for administration to a subject comprising: one or more non-covalent DNA binding agents, as the only active agents, or in combination with one or more anti-cancer and/or anti-inflammatory active agents; a test for determining if the subject has a mutation in at least one gene; associated instructions for performing the test; and associated instructions for using the non-covalent DNA binding agent to treat cancer and/or inhibit inflammation.
[0048] In one embodiment, the gene or gene pathway is selected from the group consisting of: PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, XRCC1, XRCC2, XRCC3, RAD51, RAD52, REV, ATM, ATR, K-Ras, BRAF and the MRE1/RPA1/RAD51 complex.
[0049] The invention provides for novel compositions and methods of inhibiting the growth of a cancer cell comprising administering to the subject a non-covalent DNA binding agent.
[0050] In one embodiment, the cancer cell comprises a mutation in at least one of a gene or gene pathway selected from the group consisting of: PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, XRCC1, XRCC2, XRCC3, RAD51, RAD52, REV, ATM, ATR, K-Ras, BRAF and the MRE1/RPA1/RAD51 complex.
[0051] In another embodiment, the non-covalent DNA binding agent binds to the minor groove.
[0052] In another embodiment, the non-covalent DNA binding agent binds to a GC rich region of the minor groove.
[0053] In another embodiment the subject has a mutation in at least one of a gene or gene pathway selected from the group consisting of: PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, XRCC1, XRCC2, XRCC3, RAD51, RAD52, REV, ATM, ATR, K-Ras, BRAF and the MRE1/RPA1/RAD51 complex.
[0054] Methods are provided for the synthesis of poly(ethylene glycol) ("PEG") conjugates of non-covalent DNA binding agents of the invention, which conjugates retain unusually high biological potency. Also provided are novel poly(ethylene glycol) ("PEG") conjugates of non-covalent DNA binding agents of the invention and compositions thereof. Preparation of the pegylated conjugates according to the methods of the present invention reduces or avoids steric inhibition of receptor-ligand interactions that may result from the attachment of PEG to a polypeptide or small molecule of interest. The conjugates of the present invention retain a high level of biological potency compared to those produced by traditional PEG coupling methods that are not targeted to avoid receptor-binding domains of cytokines. The biological potency of the PEG conjugates of non-covalent DNA binding agents of the invention may be higher than that of unconjugated non-covalent DNA binding agents of the invention. The conjugates of the present invention may have an extended half-life in vivo compared to the corresponding unconjugated agents of the invention. The present invention also provides kits comprising such conjugates and/or compositions, and methods of use of such conjugates and compositions in a variety of diagnostic, prophylactic and therapeutic applications.
BRIEF DESCRIPTION OF THE DRAWINGS
[0055] FIG. 1 presents the effects of non-covalent DNA binding agents in osteosarcoma U2OS cells.
[0056] FIG. 2 presents the effects of non-covalent DNA binding agents in PTEN-deficient lymphoblastoid CEM cells.
[0057] FIG. 3 presents the effects of non-covalent DNA binding agents in leukemia (CEM) cells with PTEN deficiency (homologous recombination deficiency).
[0058] FIG. 4 presents the effects of non-covalent DNA binding agents in genetically resistant breast cancer cells (MDA-MB-468) with deficiencies in PTEN and epigenetic DNA mismatch repair mutations.
[0059] FIG. 5 presents the effects of non-covalent DNA binding agents in p53-deficient H1299 cells.
[0060] FIG. 6A-B presents the effects of non-covalent DNA binding agents in colorectal cells with (A) normal (SW403) or (B) mutated (SW480) kras.
[0061] FIG. 7A-B presents the effects of non-covalent DNA binding agents in colorectal cancer cells with (A) mutated kras or (B) mutated kras and having a mismatch repair (MMR) deficiency.
[0062] FIG. 8A-B shows that non-covalent DNA binding agents ((A) NSC 723734 and (B) NSC 726260) are synthetic lethal with homologous recombination repair deficiencies.
[0063] FIG. 9A-D presents the results of a comparison of the activity of non-covalent DNA binding agents in U2OS cells wherein MMR, p53 and REV functions have been inhibited using RNAi methods (A) NSC 718813; (B) NSC 723734; (C) NSC 726260; (D) table for data for NSC 718813, NSC 723734 and NSC 726260.
[0064] FIG. 10A-C presents the results of a comparison of the activity of non-covalent DNA binding agents in isogenic p53-deficient H1299 cells wherein MMR functions have been inhibited using RNAi methods (A) NSC 718813; (B) NSC 723734; (C) table for data for NSC 718813 and NSC 723734.
[0065] FIG. 11A-D presents the results of a comparison of the activity of non-covalent DNA binding agents in isogenic MMR-deficient HCT116 cells wherein p53 and REV functions have been inhibited using RNAi methods (A) NSC 718813; (B) NSC 723734; (C) NSC 726260; (D) camptothecin.
[0066] FIG. 12A-E presents a comparison of the activity of non-covalent DNA binding agents in p53, mlh1 and rev deficient U2OS cells (A) NSC 718813; (B) NSC 723734; (C) NSC 726260; (D) Doxorubicin; (E) table for data for NSC 718813, NSC 723734, NSC 726260 and Doxorubicin.
[0067] FIG. 13A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of TP53.
[0068] FIG. 14A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of MLH1.
[0069] FIG. 15A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of MSH2.
[0070] FIG. 16A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of BRCA1.
[0071] FIG. 17A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of REV3L.
[0072] FIG. 18A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of PARP1.
[0073] FIG. 19A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of RAD51.
[0074] FIG. 20A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of MRE11A.
[0075] FIG. 21A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of ATM.
[0076] FIG. 22A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of ATR.
[0077] FIG. 23A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of PTEN.
[0078] FIG. 24A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of ERCC1.
[0079] FIG. 25A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of BRCA2.
[0080] FIG. 26A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of XRCC1.
[0081] FIG. 27A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of KRAS.
[0082] FIG. 28A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of BRAF.
[0083] FIG. 29A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of RAD50.
[0084] FIG. 30A-B presents the amino acid sequence (A) and the nucleic acid sequence (B) of RAD51.
[0085] FIG. 31 shows a line graph of the combination effect of NSC 718813 and Vinblastine in MDA-MB-231.
[0086] FIG. 32 shows a line graph of the combination effect of NSC 718813 and 5-fluorouracil (5-FU) in MDA-MB-231.
[0087] FIG. 33 shows a line graph of the combination effect of NSC 718813 with Vinblastine in MDA-MB-468.
[0088] FIG. 34 shows a line graph of the combination effect of NSC 718813 with Trichostatin in MDA-MB-468.
[0089] FIG. 35 shows a line graph of the combination effect of NSC 718813 with Camptothecin in MDA-MB-468.
[0090] FIG. 36 shows a line graph of the combination effect of NSC 718813 with Cycloheximide in MDA-MB-468.
[0091] FIG. 37 shows a line graph of the combination effect of NSC 718813 with Mitomycin in MDA-MB-468.
[0092] FIG. 38 shows a line graph of the combination effect of NSC 718813 with Doxorubicin in MDA-MB-468.
[0093] FIG. 39 shows a line graph of the combination effect of NSC 718813 with Gefitinib in MDA-MB-468.
[0094] FIG. 40 shows a line graph of the combination effect of NSC 718813 with 5-FU in MDA-MB-468.
[0095] FIG. 41 shows a line graph of Trichostatin in CEM cells.
[0096] FIG. 42 shows a line graph of the combination effect of NSC 718813 with Cycloheximide in CEM cells.
[0097] FIG. 43 shows a line graph of the combination effect of NSC 718813 with Vinblastine in CEM cells.
[0098] FIG. 44 shows a line graph of the combination effect of NSC 718813 with Mitomycin in CEM cells.
[0099] FIG. 45 shows a line graph of the combination effect of NSC 718813 with Doxorubicin in CEM cells.
[0100] FIG. 46A-D shows line graphs of 172Tag. FIG. 46A shows NSC 718813 with Paclitaxel in 172Tag. FIG. 46B shows NSC 718813 with Camptothecin in 172Tag. FIG. 46C shows the effect of NSC 718813 with Doxorubicin in MMR deficient cells (172Tag). FIG. 46D shows the combination effect of NSC 718813 with Trichostatin in MMR deficient cell line (172Tag).
[0101] FIG. 47A-B shows line graphs of 172Tag. FIG. 47A shows the effect of NSC 718813 with Mitomycin C in MMR deficient cell line (172Tag). FIG. 47B shows the combination with NSC 718813 and Actinomycin D in MMR deficient cells (172Tag).
[0102] FIG. 48A-D shows line graphs of HeLa. FIG. 48A shows the effect of NSC 718813 with Camptothecin in MMR proficient cells (HeLa). FIG. 48B shows the effect of NSC 718813 with Cycloheximide in MMR proficient cells (HeLa). FIG. 48C shows the effect of NSC 718813 with Mitomycin C in MMR proficient cells (HeLa). FIG. 48D shows the effect of NSC 718813 with Vinblastine in MMR deficient cells (HEK293T).
[0103] FIG. 49A-D shows line graphs of 293T. FIG. 49A shows the combination effect of NSC 718813 with Mitomycin C in MMR deficient cells (HEK293T). FIG. 49B shows the combination effect of NSC 718813 with Paclitaxel in MMR deficient cells (HEK293T). FIG. 49C shows the combination effect of NSC 718813 with Vincristine in MMR deficient cells (HEK293T). FIG. 49D shows the combination effect of NSC 718813 with Actinomycin in MMR deficient cells (HEK293T).
[0104] FIG. 50A-B shows line graphs of MCF7. FIG. 50A shows NSC 718813 with Doxorubicin in MCF7. FIG. 50B shows NSC 718813 with Paclitaxel in MCF7.
[0105] FIG. 51A-D shows line graphs of CEM. FIG. 51A shows the combination effect of NSC 718813 with Vinblastine in CEM cells. FIG. 51B shows Cycloheximide. FIG. 51C shows Trichostatin. FIG. 51D shows Mitomycin C.
[0106] FIG. 52A-D shows line graphs of SW403. FIG. 52A shows NSC 718813 with Vinblastine in SW403. FIG. 52B shows NSC 718813 with Camptothecin in SW403. FIG. 52C shows NSC 718813 with Trichostatin in SW403. FIG. 52D shows NSC 718813 with Cycloheximide in SW403.
[0107] FIG. 53A-D shows line graphs of SW403. FIG. 53A shows NSC 718813 with Mitomycin in SW403. FIG. 53B shows NSC 718813 with Doxorubicin in SW403. FIG. 53C shows NSC 718813 with Paclitaxel in SW403. FIG. 53D shows NSC 718813 with actinomycin in SW403.
[0108] FIG. 54A-D shows line graphs of SW403. FIG. 54A shows NSC 718813 with Olaparib in SW403. FIG. 54B shows NSC 718813 with Oxaliplatin in SW403. FIG. 54C shows NSC 718813 with Gefitinib in SW403. FIG. 54D shows NSC 718813 with 5-FU in SW403.
[0109] FIG. 55A-D shows line graphs of MDA-MB-231. FIG. 55A shows NSC 718813 with Vinblastine in MDA-MB-231. FIG. 55B shows NSC 718813 with Cycloheximide in MDA-MB-231. FIG. 55C shows NSC 718813 with Trichostatin in MDA-MB-231. FIG. 55D shows NSC 718813 with Mitomycin in MDA-MB-231.
[0110] FIG. 56A-D shows line graphs of MDA-MB-231. FIG. 56A shows NSC 718813 with Paclitaxel in MDA-MB-231. FIG. 56B shows NSC 718813 with Vincristine in MDA-MB-231. FIG. 56C shows NSC 718813 with Doxorubicin in MDA-MB-231. FIG. 56D shows NSC 718813 with 6TG in MDA-MB-231.
[0111] FIG. 57A-C shows line graphs of MDA-MB-231. FIG. 57A shows NSC 718813 with Olaparib in MDA-MB-231. FIG. 57B shows NSC 718813 with Oxaliplatin in MDA-MB-231. FIG. 57C shows NSC 718813 with Gefitinib in MDA-MB-231.
[0112] FIG. 58A-D shows line graphs of MDA-MB-468. FIG. 58A shows NSC 718813 with Vinblastine in MDA-MB-468. FIG. 58B shows NSC 718813 with Camptothecin in MDA-MB-468. FIG. 58C shows NSC 718813 with Trichostatin in MDA-MB-468. FIG. 58D shows NSC 718813 with Cycloheximide in MDA-MB-468.
[0113] FIG. 59A-D shows line graphs of MDA-MB-468. FIG. 59A shows NSC 718813 with Mitomycin in MDA-MB-231. FIG. 59B shows NSC 718813 with Doxorubicin in MDA-MB-468. FIG. 59C shows NSC 718813 with Paclitaxel in MDA-MB-468. FIG. 59D shows NSC 718813 with Olaparib in MDA-MB-468.
[0114] FIG. 60A-C shows line graphs of MDA-MB-468. FIG. 60A shows NSC 718813 with Gefitinib in MDA-MB-468. FIG. 60B shows NSC 718813 with Oxaliplatin in MDA-MB-468. FIG. 60C shows NSC 718813 with Erlotinib in MDA-MB-468.
[0115] FIG. 61A-E shows line graphs of U2OS. FIG. 61A shows NSC 718813 with Olaparib in U2OS. FIG. 61B shows NSC 718813 with Erlotinib in U2OS. FIG. 61C shows NSC 718813 with Gefitinib in U2OS. FIG. 61D shows NSC 718813 with Oxaliplatin in U2OS. FIG. 61E shows NSC 718813 with 5FU in U2OS.
[0116] FIG. 62A-D shows line graphs of SW620. FIG. 62A shows NSC 718813 with Olaparib in SW620. FIG. 62B shows NSC 718813 with Oxaliplatin in SW620. FIG. 62C shows NSC 718813 with Gefitinib in SW620. FIG. 62D shows NSC 718813 with 5FU in SW620.
[0117] FIG. 63 shows line graphs of representative NSC 718813 (A) effects in tumor cells in the NCI-60 in vitro evaluation.
[0118] FIG. 64 shows line graphs of representative NSC 723734 (B) effects in tumor cells in the NCI-60 in vitro evaluation.
[0119] FIG. 65 shows line graphs of representative NSC 723732 (C) effects in tumor cells in the NCI-60 in vitro evaluation.
[0120] FIG. 66 shows line graphs of representative NSC 726260 (D) effects in tumor cells in the NCI-60 in vitro evaluation.
[0121] FIG. 67A-B shows line graphs showing that colorectal cancer cells with competent DNA mismatch repair (MMR) are more sensitive to novel PBDs if they also carry mutant K-ras.
[0122] FIG. 68A-B shows line graphs of PBDs that show more potent growth inhibition in K-ras mutant colorectal cancer cells that are DNA mismatch repair (MMR) deficient.
[0123] FIG. 69A-B shows line graphs of breast cancer cells with BRCA/p53 deficiency (MCF-7) that have similar susceptibility to novel PBDs to those breast cancer cells with DNA MMR deficiency (MDA-MB-231).
[0124] FIG. 70 shows a line graph of breast cancer cells (MDA-MB-468) with loss of function in PTEN and MLH1 hypermethylation (deficient DNA mismatch repair) that are more susceptible to novel IndUS PBDs.
[0125] FIG. 71A-B shows line graphs of novel IndUS PBDs that are very potent in leukemia cells (CEM) that have loss of function in DNA MMR and PTEN compared to that in MSH2 deficient Jurkat lymphoma cells.
[0126] FIG. 72A-B shows line graphs of novel PBDs that show better potency in growth inhibition of p53-deficient H1299 compared to MMR competent A549 lung cancer cells.
[0127] FIG. 73A-E shows a table and line graphs of comparison of activity of IndUS PBDs in Isogenic U2OS with RNAi knockdowns of MMR, p53 and REV3 functions (A) NSC 718813; (B) NSC 723734; (C) NSC 726260; (D) Doxorubicin; (E) table for data for NSC 718813, NSC 723734, NSC 726260 and Doxorubicin.
[0128] FIG. 74A-E shows bar graphs of IndUS PBDs showing synthetic lethality as monotherapy in U2OS cells using RNAi knockdown of DNA mismatch repair (MMR), apoptosis (p53) and homologous recombination/translesional synthesis (REV3) genes (A) NSC 718813; (B) NSC 723734; (C) NSC 726260; (D) Doxorubicin; (E) table for data for NSC 718813, NSC 723734, NSC 726260 and Doxorubicin.
[0129] FIG. 75 is a table showing novel PBDs showing synthetic lethality in tumor cells that have loss of DNA mismatch repair (MMR) and/or apoptosis (p53).
[0130] FIG. 76A-D shows line graphs showing lead IndUS PBD compounds having excellent PK with long half-life in rats (A) NSC 718813; (B) NSC 723734; (C) NSC 726260; (D) NSC 723732.
[0131] FIG. 77 shows a line graph of intravenous and intraperitoneally administered NSC723734 showing dose-dependent reduction in SW620 colon tumor xenograft.
[0132] FIG. 78 shows a line graph of intraperitoneal NSC723734 showing superior activity to NSC718813 in SW620 colon tumor xenograft model following once daily administration for 7 days.
[0133] FIG. 79 shows a line graph of NSC718813 that reduces tumor burden in SW620 colon tumor xenograft model following a Q1Dx5 IV followed by Q4Dx3 IP administration.
[0134] FIG. 80 shows a line graph of NSC726260 showing limited pharmacological activity in SW620 colon tumor xenograft model following combined IV and IP administration.
[0135] FIG. 81 shows a line graph of NSC723734 showing excellent synergy with cisplatin following intermittent IP administration of the two drugs in SW620 colon tumor xenograft mouse model.
[0136] FIG. 82 shows a line graph of NSC723734 that is synergistic with cisplatin and restores antitumor activity of cisplatin at a lower (minimally active) cisplatin dose following intermittent IP administration in SW620 colon tumor xenograft model in mice.
[0137] FIG. 83A-B shows line graphs of quantitative analysis of in vivo SW620 colon tumor xenograft data showing that NSC723734 is synergistic with cisplatin at combination doses achieving >50% efficacy.
[0138] FIG. 84A-B shows line graphs of quantitative analysis of the in vivo effects of NSC723734 and cisplatin, showing a significant dose-reduction index (DRI) that supports the mutual synergism in the SW620 colon tumor xenograft mouse model.
[0139] FIG. 85 shows a table of novel IndUS anticancer PBDs that are significantly different compared to previously described DNA minor groove binders.
DETAILED DESCRIPTION OF THE INVENTION
I. Definitions
[0140] As used in the description of the invention and the appended claims, the singular forms "a", "an" and "the" are used interchangeably and intended to include the plural forms as well and fall within each meaning, unless the context clearly indicates otherwise. Also, as used herein, "and/or" refers to and encompasses any and all possible combinations of one or more of the listed items, as well as the lack of combinations when interpreted in the alternative ("or").
[0141] As used herein, "at least one" is intended to mean "one or more" of the listed elements.
[0142] Singular word forms are intended to include plural word forms and are likewise used herein interchangeably where appropriate and fall within each meaning, unless expressly stated otherwise.
[0143] Except where noted otherwise, capitalized and non-capitalized forms of all terms fall within each meaning.
[0144] Unless otherwise indicated, it is to be understood that all numbers expressing quantities, ratios, and numerical properties of ingredients, reaction conditions, and so forth used in the specification and claims are contemplated to be able to be modified in all instances by the term "about".
[0145] All parts, percentages, ratios, etc. herein are by weight unless indicated otherwise.
[0146] As used herein, a "non-covalent DNA binding agent" means an agent that reacts with one or more different positions in a DNA molecule, wherein binding can result in the formation of crosslinkages, either in the same strand (intrastrand crosslink) or in the opposite strands of the DNA (interstrand crosslink). Non-covalent DNA binding agents can also cause interactions between DNA and proteins that are recruited by the DNA. For example, DNA replication is blocked by non-covalent DNA binding agents of the invention that modulate interactions between DNA and genes or proteins which subsequently cause replication arrest, cell cycle arrest and/or cell death if the crosslink is not repaired.
[0147] A non-covalent DNA binding agent reacts with DNA via non-covalent interactions, for example, hydrogen bonds, Coulombic interactions, ionic bonds, van der Waals forces, and/or hydrophobic interactions. Non-covalent DNA binding agents of the invention include, but are not limited to, the agents presented herein below. The invention provides for a non-covalent DNA binding agent that binds to the minor groove of DNA. A DNA molecule has two types of grooves, the major groove which has the nitrogen and oxygen atoms of the nucleotide base pairs pointing inward toward the helical axis, and the minor groove, wherein the nitrogen and oxygen atoms of the nucleotides point outwards. The major groove is 22 Å wide and the minor groove is 12 Å wide. The majority of currently available DNA damaging chemotherapeutic agents target the major groove of the DNA.
[0148] Most of the currently studied DNA minor groove binding agents target "AT rich" regions of DNA. The current invention provides novel non-covalently linked, DNA minor groove binding agents that target "GC rich" regions of the DNA. As used herein, "GC rich region" means between 25% and 80% of the human genome and regions of hundreds of kilobases, often referred to as the isochores, that have relatively homogenous base compositions (Fullerton, S. M., Carvalho, A. B. and Clark, A. G. Local rates of recombination are positively correlated with GC content in human genome. Mol Biol Evol 18(6): 1139-1142, 2001). "GC rich regions" are preferably between 35% and 75% GC, and more preferably between 45% and 75% GC and most preferably, between 60% and 70% GC. There is evidence that the longest eukaryotic exons and the longest prokaryotic genes are the most "GC-rich". Furthermore, the expected length for random reading frames is a function of the sequence GC content, i.e. the higher the GC content, the higher the probability for longer reading frames. On the other hand, the most GC-rich introns are the shorter ones and GC content has a greater effect on the reduction of intron length (Oliver, J. L. and Marin, A. A relationship between GC content and Coding-sequence length. J Mol Evol 43: 216-223, 1996).
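By way of illustration only, the GC percentage ranges recited above can be applied to a nucleotide sequence as in the following Python sketch. The function names (`gc_percent`, `gc_rich_tier`), the tier labels, the treatment of the range bounds as inclusive, and the decision to ignore ambiguous bases such as N are all assumptions made for this example and are not taken from the specification.

```python
def gc_percent(seq: str) -> float:
    """Return the percentage of G and C among the A, C, G and T bases in seq.

    Bases other than A, C, G and T (for example N) are ignored; this is an
    assumption of the sketch, not a requirement of the definition.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Percent-GC ranges recited in the definition, narrowest first.
# Labels are hypothetical; bounds are assumed to be inclusive.
GC_RICH_TIERS = [
    ("most preferred", 60.0, 70.0),
    ("more preferred", 45.0, 75.0),
    ("preferred", 35.0, 75.0),
    ("broadest", 25.0, 80.0),
]


def gc_rich_tier(seq: str):
    """Return the label of the narrowest recited range containing the GC
    percentage of seq, or None if it falls outside all recited ranges."""
    pct = gc_percent(seq)
    for label, low, high in GC_RICH_TIERS:
        if low <= pct <= high:
            return label
    return None


if __name__ == "__main__":
    for s in ("GGCCAT", "ATGCATGCAT", "AAAATTTT"):
        print(s, round(gc_percent(s), 1), gc_rich_tier(s))
```

The sketch classifies a single short sequence; applying it to genomic regions of hundreds of kilobases would ordinarily use a sliding window, which is not shown here.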
[0149] As used herein, "DNA repair deficiency" refers to a decrease in the ability of a cell to repair DNA as compared to a wild type or control cell. A "DNA repair deficiency" can be genetic and/or epigenetic in nature (Loeb, L. A., Loeb, K. R. and Anderson, J. P. Multiple mutations and cancer. Proc Nat Acad Sci 100(3): 776-781, 2003; Jones, P. A. and Baylin, S. B. The fundamental role of epigenetic events in cancer. Nat Rev Genetics 3: 415-428, 2002). For instance, DNA repair deficiencies can result in "microsatellite instability", a key feature of several cancers that are collectively referred to as Lynch tumors (Hewish, M., Lord, C. J., Martin, S. A., Cunningham, D. and Ashworth, A. Mismatch repair deficient colorectal cancer in the era of personalized treatment. Nat Rev Clin Oncol 7: 197-208, 2010). Further, a well defined subtype of colorectal cancer (CRC) is characterized by a deficiency in the mismatch repair (MMR) pathway. MMR deficiency not only contributes to the pathogenesis of a large proportion (approximately 70%) of colorectal cancer, but also determines the response of that subtype of colorectal cancer to many of the drugs that are frequently used to treat colorectal cancer.
[0150] A DNA repair deficiency can be determined by methods known in the art including but not limited to assays for microsatellite instability, for example by using a microsatellite instability test distributed by Roche (Cat. No. 12 041 901 00).
[0151] Assays for DNA mismatch repair tumors include but are not limited to those presented in Marcus et al., 1999 Am J Surg Pathol Oct: 23(10): 1248-55.
[0152] Although there are typical clinical and pathological features associated with MMR-deficiency phenotype in Lynch syndrome cancers, approximately 40% of the Lynch syndrome cases cannot be reliably diagnosed by morphological characteristics alone. A strong relationship exists between sporadic MMR deficiency colorectal cancer (dMMR CRC) and the CpG island methylator phenotype (CIMP) subtype of CRC. CIMP is characterized by regional hypermethylation of CpG islands in the DNA and thus results in the loss of functional MLH1 expression (Hewish et al., Nat Rev Clin Oncol 7: 197-208, 2010). The relationship of CpG island methylation to microsatellite instability can be used to describe the clinical and pathological features of CRC. Hypermethylation (epigenetic) changes of p16 and MLH1 can be determined by methylation-specific polymerase chain reaction (PCR). Methylation of MINT 1, 2, 12 and 31 loci can be assessed by bisulfite PCR. Microsatellite instability and K-ras and p53 status of patient cancer tissues can be assessed by microsatellite PCR, restriction enzyme-mediated PCR and/or immunohistochemistry (IHC) (Hawkins, N., Norrie, M., Cheong, K., Mokany, E., Ku, S-L., Meagher, A., O'Connor, T. and Ward, R. CpG island methylation in sporadic colorectal cancers and its relationship to microsatellite instability. Gastroenterology 122(5): 1376-1387, 2002).
[0153] As used herein, a "decrease" in the ability of a cell to repair DNA means that the cell repairs damaged DNA, either due to genetic or epigenetic mutations, such that the repaired DNA is less than 100% error free (for example, 99%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5% or less). A cell that has a DNA repair deficiency also refers to a cell that cannot perform any DNA repair.
[0154] As used herein, a "decrease" in the ability of a cell to repair DNA means that the cell repairs damaged DNA at a rate that is less than the rate at which a wild type or control cell repairs DNA.
[0155] As used herein, "less than" as it refers to the rate of repair of DNA damage, means that the rate of repair of DNA damage is 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 100-fold, or more, lower than the rate of repair of DNA damage in a wild type or control cell. As used herein, "less than" as it refers to the rate of repair of DNA damage also means that the rate of repair of DNA damage in a cell is 90%, 80%, 70%, 60%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5% or less, lower than the rate of repair of DNA damage in a control or wild type cell.
[0156] As used herein, a "DNA repair deficiency" includes but is not limited to: base excision repair deficiency, a deficiency in the repair of double stranded breaks and a deficiency in the repair of chromosomal damage. DNA repair deficiencies can result from genetic changes such as mutated DNA mismatch repair genes like MSH2. Furthermore, DNA repair deficiencies can also include epigenetic changes such as hypermethylation of genes involved in DNA mismatch repair, recombination, replication and/or apoptosis. (Helleday, T., Petermann, E., Lundin, C., Hodgson, B and Sharma, R. A. DNA repair pathways as targets for cancer therapy. Nat Rev Cancer 8: 193-204, 2008).
[0157] As used herein, "apoptosis" or "programmed cell death" refers to a mechanism whereby a cell undergoes death or destruction, for example, to control cell number and proliferation or in response to DNA damage. Many cancer cells do not undergo apoptosis and certain cancers involve an alteration in the apoptotic pathway.
[0158] As used herein, "dysregulated apoptosis" refers to a decrease in the ability of a cell to undergo apoptosis or a decrease in the number of cells that undergo apoptosis as compared to a wild type or control cell, for example apoptosis in response to DNA damage. For example, mutations in the p53 gene are a feature of 50% of all reported cancer cases. In the other 50% of cancer cases, the p53 gene is not itself mutated, but the p53-directed apoptosis pathway is partially inactivated (Cheok, C. F., Verma, C. S., Baselga, J. and Lane, D. P. Translating p53 into the clinic. Nat Rev Clin Oncol 8: 25-37, 2011). The p53 protein is a transcription factor that controls the cellular response to stress signals through the induction of cell-cycle arrest, apoptosis and senescence. Apoptosis is detected by any one of the following assays including but not limited to DNA laddering, COMET assays and/or TUNEL staining.
[0159] As used herein, a "decrease" in the ability of a cell to undergo apoptosis means that within a population of cells, less than 100% (for example, 99%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5% or less) of the cells undergo apoptosis, as compared to a wild type or control population of cells, for example, wherein 100% of the cells undergo apoptosis.
[0160] A cell that has dysregulated apoptosis also refers to a cell that does not undergo apoptosis.
[0161] As used herein, "dysregulated apoptosis" also means that a cell or population of cells undergoes apoptosis at a rate that is less than that of a wild type or control cell or a population thereof.
[0162] As used herein, "less than" as it refers to the rate of apoptosis, means 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 100-fold or more, less than the rate at which a wild type or control cell or a population thereof, undergoes apoptosis. As used herein, "less than" as it refers to the rate of apoptosis also means that the rate of apoptosis is 90%, 80%, 70%, 60%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5% or less, than the rate of apoptosis of a control or wild type cell or a population thereof.
[0163] As used herein, a "recombination deficiency" refers to an abnormality in homologous recombination repair in a cell, as compared to a wild type or control cell. While DNA repair is essential for cells to maintain genomic stability, there is increasing evidence that defects in homologous recombination repair (HRR) underlie hereditary and sporadic tumorigenesis (Evers, B. Helleday, T. and Jonkers, J. Targeting homologous recombination repair defects in cancer. Trends Pharmacol Sci 31: 372-380, 2010). Deficiencies in HRR may determine the sensitivity of tumors to many currently available DNA-damaging anti-cancer agents. Furthermore, HRR-deficient tumors are also more susceptible to synthetic lethal interactions. More importantly, HRR-deficient tumors may also have an increased dependence on cell-cycle checkpoints, which could be exploited.
[0164] As used herein, a "replication deficiency" refers to an abnormality in DNA replication in a cell, as compared to a wild type or control cell.
[0165] A "replication deficiency" includes replication of damaged DNA as determined by, for example, a BrdU assay wherein the thymidine analog, 5-Bromo-2-deoxyuridine (BrdU), is added to the cell growth medium just prior to fixing and the cells are stained with an antibody to BrdU, which detects the thymidine analog in DNA.
[0166] A "replication deficiency" also includes replication of DNA prior to cell division.
[0167] As used herein, a "cell proliferation disorder" refers to an increase in the number of divisions that a cell undergoes as compared to a wild type or control cell.
[0168] A "cell proliferation disorder" also refers to an increase in the rate of cellular division as compared to a wild type or control cell.
[0169] A "cell proliferation disorder" also refers to an increase in the frequency of cell division as compared to a wild type or control cell.
[0170] A "cell proliferation disorder" also refers to unregulated cell division, for example, the inability of a cell to respond to signals that cause a wild type or control cell to stop dividing or start dividing.
[0171] A "cell proliferation disorder" also refers to the inability of a cell to enter senescence.
[0172] As used herein, "senescence" refers to a state wherein diploid cells lose the ability to divide.
[0173] A "cell proliferation disorder" is detected by methods known in the art including but not limited to alamar blue assay, as described herein below.
[0174] As used herein, "dysregulated transcription" means transcription of damaged DNA as determined by, for example, real-time reverse transcription polymerase chain reaction (PCR), in vitro transcription methods well known in the art, or S1 nuclease assays.
[0175] As used herein, a "tumor suppressor gene" includes but is not limited to p53, RB1, WT1, NF1, NF2, APC, TSC1, TSC2, DPC4, DCC, BRCA1, BRCA2, PTEN, STK11, MSH2, MLH1, CDH1, VHL, CDKN2A, PTCH and MEN1.
[0176] As used herein, "mutation" refers to a genetic or epigenetic change in phenotype or gene expression.
[0177] A "mutation" refers to a change in the genetic sequence, for example a substitution (transition or transversion), a deletion, an insertion (including a duplication) and a translocation.
[0178] A "mutation" also refers to a chromosomal rearrangement or a chromosomal translocation.
[0179] A "mutation" also refers to an epigenetic mutation or a heritable change in phenotype and/or gene expression that occurs via a mechanism that does not require a change in the genetic sequence.
[0180] An epigenetic mutation can occur by a variety of mechanisms including but not limited to post-translational modification of amino acids of a histone protein, thereby resulting in chromatin remodelling, DNA methylation (hypermethylation or hypomethylation), production of alternate splice forms of RNA and formation of double stranded RNA.
[0181] A "mutation" according to the invention can result in a gain in function, a loss of function, an increase or decrease in expression, an increase or decrease in the rate of expression, expression of a defective mRNA and/or expression or translation of a defective protein.
[0182] A "function" as used herein includes but is not limited to DNA repair, apoptosis, recombination, replication, cell proliferation, transcription, ubiquitination, cell cycle regulation and translesion synthesis.
[0183] "Loss of function" refers to the inability of any cell to perform any of these functions due to any reasons including, but not limited to, mutations, gene silencing and post-translational modifications, that result in a reduction of these functions.
[0184] "Gain of function" refers to the increased activity of any cell to perform any of these functions due to any reasons including but not limited to, mutations, gene amplification, overexpression of gene product or proteins and post-translational modifications resulting in amplified activity of such functions.
[0185] As used herein, "dysregulation of translesion synthesis" means a decrease in the ability of a cell to undergo translesion synthesis as compared to a wild type or control cell.
[0186] As used herein, "translesion synthesis" refers to a DNA damage tolerance process that allows the DNA replication machinery to replicate past DNA lesions such as thymine dimers or AP sites. Translesion synthesis involves replacing the DNA polymerases that mediate DNA synthesis in the absence of DNA damage with specialized translesion polymerases (e.g., DNA polymerase IV or V). In addition to replication functions, translesion synthesis is also involved in the homologous recombination repair pathways.
[0187] As used herein, "decrease" as it refers to translesion synthesis means that the level of translesion synthesis is 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 100-fold, or more, less than the level of translesion synthesis as compared to a wild type or control cell. As used herein, "decrease" as it refers to translesion synthesis also means that the level of translesion synthesis is 90%, 80%, 70%, 60%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5% or less lower than the level of translesion synthesis in a control or wild type cell.
[0188] A "decrease" in translesion synthesis also refers to a decrease in the rate of translesion synthesis as compared to a wild type or control cell.
[0189] As used herein, "decrease" as it refers to the rate of translesion synthesis, means 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 100-fold or more, less than the rate of translesion synthesis in a wild type or control cell. As used herein, "decrease" as it refers to the rate of translesion synthesis also means that the rate is 90%, 80%, 70%, 60%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5% or less, than the rate of translesion synthesis in a control or wild type cell.
[0190] As used herein, a "control cell" or "wild type cell" means a cell that is derived from a subject that does not have at least one of a DNA repair deficiency, dysregulated apoptosis, a recombination deficiency, a replication deficiency, a cell proliferation disorder, dysregulated transcription, loss of function of a tumor suppressor gene, a ubiquitin disorder, cell cycle dysregulation and dysregulation of translesion synthesis.
[0191] A "control cell" or "wild type cell" also means a cell that is derived from a subject that does not have cancer or an inflammatory disease, and/or does not exhibit any detectable symptoms associated with the disease.
[0192] In certain embodiments, a "control cell" means a cell from a subject that has at least one of a DNA repair deficiency, an apoptosis deficiency, a recombination deficiency, a replication deficiency, a cell proliferation disorder, dysregulated transcription, loss of function of a tumor suppressor gene, a ubiquitin disorder, cell cycle dysregulation and dysregulation of translesion synthesis, prior to administration of a DNA binding agent of the invention.
[0193] In certain embodiments, a "control cell" means a cell from a subject that has been diagnosed with cancer, prior to administration of a non-covalent DNA binding agent of the invention.
[0194] In certain embodiments, a "control cell" means a cell from a subject that has been diagnosed with an inflammatory disease, prior to administration of a non-covalent DNA binding agent of the invention.
[0195] In certain embodiments, "patient" or "subject" refers to a mammal that is diagnosed with a disease, e.g., a cancer (including but not limited to cancer of the lung, breast, colon, prostate, kidney, pancreas, ovary, and lymphatic organs; melanomas), an inflammatory disease (including but not limited to autoimmune diseases, such as systemic lupus, rheumatoid arthritis, and multiple sclerosis; graft rejections, such as renal transplant rejection, liver transplant rejection, lung transplant rejection, cardiac transplant rejection, and bone marrow transplant rejection; graft versus host disease) or an infection (including but not limited to bacterial infections, parasitic infections or viral infections). The term "patient" or "subject" includes human and other mammalian subjects that receive either prophylactic or therapeutic treatment.
[0196] As used herein, "mammal" refers to any mammal including but not limited to human, mouse, rat, sheep, monkey, dog, cat, goat, rabbit, hamster, horse, cow or pig.
[0197] A "non-human mammal", as used herein, refers to any mammal that is not a human.
[0198] As used herein, "control subject" means a subject that does not have a disease, and/or does not exhibit any detectable symptoms associated with that disease, for example cancer or an inflammatory disease.
[0199] A "control subject" also means a subject that has a disease, prior to administration of a non-covalent DNA binding agent of the invention.
[0200] A "control subject" also means a subject that does not have at least one of a DNA repair deficiency, dysregulated apoptosis, a recombination deficiency, a replication deficiency, a cell proliferation disorder, dysregulated transcription, loss of function of a tumor suppressor gene, a ubiquitin disorder, cell cycle dysregulation and dysregulation of translesion synthesis.
[0201] A "control subject" also means a subject that has at least one of a DNA repair deficiency, dysregulated apoptosis, a recombination deficiency, a replication deficiency, a cell proliferation disorder, dysregulated transcription, loss of function of a tumor suppressor gene, a ubiquitin disorder, cell cycle dysregulation and dysregulation of translesion synthesis, prior to administration of a non-covalent DNA binding agent of the invention.
[0202] A "control subject" also means a subject that does not have a mutation in at least one of a gene or gene pathway selected from the group consisting of: PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, XRCC1, XRCC2, XRCC3, RAD51, RAD52, REV, ATM, ATR, K-Ras, BRAF and the MRE1/RPA1/RAD51 complex.
[0203] A "control subject" also means a subject that has a mutation in at least one of a gene or gene pathway selected from the group consisting of: PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, XRCC1, XRCC2, XRCC3, RAD51, RAD52, REV, ATM, ATR, K-Ras, BRAF and the MRE1/RPA1/RAD51 complex, prior to administration of a non-covalent DNA binding agent of the invention.
[0204] "Treatment", or "treating" as used herein, is defined as the application or administration of one or more non-covalent DNA binding agent and one or more anticancer or anti-inflammatory agent of the invention, for example, one or more non-covalent DNA minor groove binding agent of the invention, to a subject or patient, or application or administration of one or more non-covalent DNA binding agent and one or more anticancer or anti-inflammatory agent of the invention to an isolated tissue or cell line from a subject or patient, who has a disease, e.g., cancer or an inflammatory disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease or disorder, or symptoms of the disease or disorder. The term "treatment" or "treating" is also used herein in the context of administering agents prophylactically. The term "effective dose" or "effective amount" or "effective dosage" or "therapeutic dosage" is defined as an amount sufficient to achieve or at least partially achieve the desired effect. The terms "therapeutically effective dose" and "therapeutically effective amount" are defined as an amount sufficient to cure or at least partially arrest the disease and its complications in a patient already suffering from the disease.
[0205] As used herein, "treating" a disease refers to preventing the onset of disease and/or reducing, delaying, or eliminating disease symptoms, such as an increase in the rate of growth or number of cancer cells. By "treating" is meant restoring the patient or subject to the basal state as defined herein, and/or to prevent a disease in a subject at risk thereof. Alternatively, "treating" means arresting or otherwise ameliorating symptoms of a disease.
[0206] "Treatment," as used herein, includes any drug, drug product, method, procedure, lifestyle change, or other adjustment introduced in an attempt to effect a change in a particular aspect of a subject's health (i.e., directed to a particular disease, disorder, or condition).
[0207] As used herein, "inhibition" as it refers to growth of a cancer cell means a decrease in the rate of growth, or a decrease in the amount of growth.
[0208] For example, an inhibition of growth of a cancer cell means that the rate of growth of a cancer cell that has been treated with a non-covalent DNA binding agent of the invention is 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 100-fold, or more, less than that of a cancer cell that has not been treated with a non-covalent DNA binding agent of the invention. As used herein, "inhibition" as it refers to the rate of growth of a cancer cell that has been treated with a non-covalent DNA binding agent of the invention also means that the rate is 90%, 80%, 70%, 60%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5% or less, lower than the rate of growth of a cancer cell that has not been treated with a non-covalent DNA binding agent of the invention.
[0209] An inhibition of growth of a cancer cell also means that the number or growth of cancer cells that have been treated with a non-covalent DNA binding agent of the invention is 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 100-fold, or more, less than the number or growth of cancer cells that have not been treated with a non-covalent DNA binding agent of the invention. As used herein, "inhibition" as it refers to the rate of growth of a cancer cell also means that the number or growth of cancer cells that have been treated with a non-covalent DNA binding agent of the invention is 90%, 80%, 70%, 60%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5% or less, lower than the growth or number of cancer cells that have not been treated with a non-covalent DNA binding agent of the invention.
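The fold and percentage comparisons recited in the two preceding paragraphs can be illustrated with a short calculation. The following Python sketch is illustrative only and forms no part of the disclosed methods: the function name `growth_inhibition`, the dictionary keys and the numerical inputs are hypothetical. Because the percentage language above can be read either as a percentage of the untreated rate or as a percentage reduction from it, the sketch reports both values rather than assuming one reading.

```python
def growth_inhibition(treated_rate: float, untreated_rate: float) -> dict:
    """Compare a treated growth rate (or cell number) with an untreated one.

    Returns the fold reduction, the treated value as a percentage of the
    untreated value, and the percentage reduction. All names are hypothetical.
    """
    if untreated_rate <= 0:
        raise ValueError("untreated_rate must be positive")
    if treated_rate < 0:
        raise ValueError("treated_rate must not be negative")

    percent_of_control = 100.0 * treated_rate / untreated_rate
    # Complete suppression of growth has no finite fold value.
    fold_reduction = (
        float("inf") if treated_rate == 0 else untreated_rate / treated_rate
    )
    return {
        "fold_reduction": fold_reduction,
        "percent_of_control": percent_of_control,
        "percent_reduction": 100.0 - percent_of_control,
    }


if __name__ == "__main__":
    # Hypothetical example: untreated cells grow at 10 units per day,
    # treated cells at 2 units per day.
    print(growth_inhibition(treated_rate=2.0, untreated_rate=10.0))
```

With these hypothetical inputs, the treated rate is 5-fold lower than the untreated rate, which corresponds to 20% of the untreated rate, i.e. an 80% reduction.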
[0210] As used herein, "K-ras positive" means having activating mutations, including but not limited to mutations in the RAS oncogenes (KRAS, HRAS and NRAS) and in the PI3K, BRAF, MEK, ERK and MAPK pathways, that are frequent in human cancers. For example, KRAS mutations occur in 60% of pancreatic cancers, 32% of cancers of the large intestine and 17% of lung cancers (Karnoub, A. E. and Weinberg, R. A. Ras oncogenes: split personalities. Nat Rev Mol Cell Biol 9: 517-531, 2008). RAS family members signal through numerous effector molecules with diverse functions such as RAF/MAPK, PI3K and RAL proteins (Bommi-Reddy, A. and Kaelin, W. G. Slaying RAS with a synthetic lethal weapon. Cell Res 20: 119-121, 2010).
[0211] As used herein, "K-ras negative tumors" means tumors presenting with wild type K-ras. Similarly, "BRAF negative tumors" refers to tumors presenting with wild-type BRAF.
[0212] As used herein, a cancer that is "genetically resistant" means those cancers that have developed genetic and/or epigenetic mutations in oncogenes as well as tumor suppressor and DNA repair genes; thereby leading to the genesis of various cancers. Furthermore, those tumors that have loss of tumor suppressor gene function, resulting in dysregulation of DNA repair, recombination, replication, cell cycle regulation and/or apoptosis pathways, are also considered "genetically resistant".
[0213] More specifically, "genetically-resistant" cancers are defined to include all those cancers that have "functional loss of tumor suppressor genes", as well as subtypes of cancers that are resistant to currently available anti-cancer agents. For example, such subtypes of "genetically resistant" cancers include, but are not limited to, metastatic colorectal cancer (mCRC) and other Lynch syndrome tumors, such as endometrial and bladder cancers, that have deficiencies in DNA mismatch repair pathways (dMMR tumors); p53-deficient and/or p53-pathway-deficient tumors; BRCA1 and/or BRCA2-mutated (i.e. homologous recombination repair deficient (dHRR)) tumors such as triple-negative breast cancer and basal-like breast cancer; and PTEN-deficient mCRC subtypes.
[0214] Furthermore, "genetically resistant" cancers are also defined to include "gain of function" cancers with a KRAS-mutator phenotype, such as mCRC and pancreatic cancers.
[0215] As used herein, "determining the response to a therapy for cancer" means comparing a parameter that is indicative of a response to treatment, for example tumor size, rate of growth or number of cancer cells, in a subject before receiving a particular therapy for cancer and after receiving a particular therapy for cancer. "Determining the response to a therapy for cancer" also means comparing a parameter that is indicative of a response to treatment, for example tumor size, rate of growth or number of cancer cells, in a subject that has received a therapy for cancer as compared to a subject that has not received a therapy for cancer. "Determining the response to a therapy for cancer" also means comparing a parameter that is indicative of a response to treatment, for example tumor size, rate of growth or number of cancer cells, in a subject that has received a therapy for cancer as compared to a control subject that has not been diagnosed with cancer and is not in need of cancer treatment.
[0216] As used herein, "cannot be treated" means that following receipt of a therapy for cancer there is no change in a parameter that is indicative of a response to treatment, for example tumor size, rate of growth or number of cancer cells, in a subject, as compared to the parameter before receiving the therapy for cancer. "Cannot be treated" also means that following receipt of a particular therapy for cancer, there is no change in a parameter that is indicative of a response to treatment, for example tumor size, rate of growth or number of cancer cells, in a subject that has received a therapy for cancer as compared to a subject that has not received a therapy for cancer. "Cannot be treated" also means that an individual cannot receive a therapy for cancer, for example due to an adverse reaction to the therapy or because they are receiving another treatment that makes it medically inadvisable, for example, due to a negative drug interaction.
[0217] "Gene," as used herein, means a segment of DNA that contains information for the regulated biosynthesis of an RNA product, including promoters, exons, introns, and other noncoding or untranslated regions that control gene expression.
[0218] The invention contemplates novel compositions and methods of treating a subject who either has failed to respond to prior therapy or has been diagnosed with mutations that would render the treatment regimens ineffective based on existing knowledge among those skilled in treatment of cancers. Both cases would result in "refractory" tumors. Such "refractory" tumors would be candidates for treatment comprising administering to the subject a therapeutically effective amount of one or more non-covalent DNA binding agents and one or more available anti-cancer or anti-inflammatory agents of the invention, for example, one or more DNA minor groove binding agents, either alone or in combination with one or more anti-cancer agents.
[0219] As used herein, prior treatment or therapy as it applies to cancer treatment includes but is not limited to surgery, radiotherapy (for example, gamma-radiation, neutron beam radiotherapy, electron beam radiotherapy, proton therapy, brachytherapy, and systemic radioactive isotopes), endocrine therapy, biologic response modifiers (for example, interferons, interleukins, antibodies, aptamers, siRNAs, oligonucleotides, enzyme, ion channel and receptor inhibitors or activators), hyperthermia and cryotherapy, agents to attenuate any adverse effects (e.g., antiemetics), and other approved chemotherapeutic drugs, including, but not limited to, alkylating drugs (e.g., Mechlorethamine, Chlorambucil, Cyclophosphamide, Melphalan, Ifosfamide), antimetabolites (e.g., Methotrexate), purine antagonists and pyrimidine antagonists (e.g., 6-Mercaptopurine, 5-Fluorouracil, Cytarabine, Gemcitabine), spindle poisons (e.g., Vinblastine, Vincristine, Vinorelbine, Paclitaxel), podophyllotoxins (e.g., Etoposide, Irinotecan, Topotecan), antibiotics (e.g., Doxorubicin, Bleomycin, Mitomycin), nitrosoureas (e.g., Carmustine, Lomustine), inorganic ions (e.g., Cisplatin, Carboplatin), enzymes (e.g., Asparaginase), and hormones (e.g., Tamoxifen, Leuprolide, Flutamide, and Megestrol).
[0220] A method of "administration" useful according to the invention includes but is not limited to: intravenous, subcutaneous, intramuscular, intraperitoneal, intracranial and spinal injection; ingestion via the oral route; inhalation; trans-epithelial diffusion (such as via a drug-impregnated, adhesive patch); use of an implantable, time-release drug delivery device, which may comprise a reservoir of exogenously-produced agent or may, instead, comprise cells that produce and secrete the therapeutic agent; topical application; and administration directly to a blood vessel, including an artery, vein or capillary, for example by intravenous drip or injection. Additional methods of administration are provided herein below in the section entitled "Dosage and Administration."
[0221] A "therapeutically effective amount" of a non-covalent DNA binding agent, according to the invention is in the range of 0.001 mg-1000 mg per subject. In another embodiment, a "therapeutically effective amount" of a non-covalent DNA binding agent according to the invention is in the range of 0.01 mg to 100 mg per subject. In another embodiment, a "therapeutically effective amount" of a non-covalent DNA binding agent according to the invention is in the range of 0.1 mg to 10 mg per subject.
[0222] As used herein, "basal state" refers to an individual who does not have a disease, e.g., cancer or an inflammatory disorder.
[0223] A subject who "does not have a disease" has no detectable symptoms of the disease.
[0224] As used herein, "diagnosing" or "identifying a patient or subject having" refers to a process of determining if an individual is afflicted with a disease or ailment, for example cancer as defined herein. Methods well known and accepted in the art are used to diagnose any of the cancers recited herein.
[0225] "Cancer" refers to any one of cancer, tumor growth, cancer of the colon, breast, bone, brain and others (e.g., osteosarcoma, neuroblastoma, colon adenocarcinoma), chronic myelogenous leukemia (CML), acute myeloid leukemia (AML), acute promyelocytic leukemia (APL), cardiac cancer (e.g., sarcoma, myxoma, rhabdomyoma, fibroma, lipoma and teratoma); lung cancer (e.g., bronchogenic carcinoma, alveolar carcinoma, bronchial adenoma, sarcoma, lymphoma, chondromatous hamartoma, mesothelioma); various gastrointestinal cancers (e.g., cancers of esophagus, stomach, pancreas, small bowel, and large bowel); genitourinary tract cancer (e.g., kidney, bladder and urethra, prostate, testis); liver cancer (e.g., hepatoma, cholangiocarcinoma, hepatoblastoma, angiosarcoma, hepatocellular adenoma, hemangioma); bone cancer (e.g., osteogenic sarcoma, fibrosarcoma, malignant fibrous histiocytoma, chondrosarcoma, Ewing's sarcoma, malignant lymphoma, multiple myeloma, malignant giant cell tumor, chordoma, osteochondroma, benign chondroma, chondroblastoma, chondromyxofibroma, osteoid osteoma and giant cell tumors); cancers of the nervous system (e.g., of the skull, meninges, brain, and spinal cord); gynecological cancers (e.g., uterus, cervix, ovaries, vulva, vagina); hematologic cancer (e.g., cancers relating to blood, Hodgkin's disease, non-Hodgkin's lymphoma); skin cancer (e.g., malignant melanoma, basal cell carcinoma, squamous cell carcinoma, Kaposi's sarcoma, moles, dysplastic nevi, lipoma, angioma, dermatofibroma, keloids, psoriasis); and cancers of the adrenal glands (e.g., neuroblastoma).
[0226] An "inflammatory disorder" includes any one or more of the following: autoimmune diseases or disorders: diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, osteoarthritis, psoriatic arthritis), multiple sclerosis, myasthenia gravis, systemic lupus erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and eczematous dermatitis), psoriasis, Sjogren's Syndrome, including keratoconjunctivitis sicca secondary to Sjogren's Syndrome, alopecia areata, allergic responses due to arthropod bite reactions, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, Stevens Johnson syndrome, idiopathic sprue, lichen planus, Graves ophthalmopathy, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial lung fibrosis.
[0227] "Inflammatory disorder" also includes any one of rheumatoid spondylitis; post ischemic perfusion injury; inflammatory bowel disease; chronic inflammatory pulmonary disease, eczema, asthma, ischemia/reperfusion injury, acute respiratory distress syndrome, infectious arthritis, progressive chronic arthritis, deforming arthritis, traumatic arthritis, gouty arthritis, Reiter's syndrome, acute synovitis and spondylitis, glomerulonephritis, hemolytic anemia, aplastic anemia, neutropenia, host versus graft disease, allograft rejection, chronic thyroiditis, Graves' disease, primary biliary cirrhosis, contact dermatitis, skin sunburns, chronic renal insufficiency, Guillain-Barre syndrome, uveitis, otitis media, periodontal disease, pulmonary interstitial fibrosis, bronchitis, rhinitis, sinusitis, pneumoconiosis, pulmonary insufficiency syndrome, pulmonary emphysema, pulmonary fibrosis, silicosis, or chronic inflammatory pulmonary disease.
[0228] As used herein, the term "pharmaceutically acceptable salt" refers to those salts of the compounds formed by the process of the present invention which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of humans and lower animals without undue toxicity, irritation, allergic response and the like, and are commensurate with a reasonable benefit/risk ratio. Pharmaceutically acceptable salts are well known in the art. For example, S. M. Berge, et al. describes pharmaceutically acceptable salts in detail in J. Pharmaceutical Sciences, 66: 1-19 (1977). The salts can be prepared in situ during the final isolation and purification of the compounds of the invention, or separately by reacting the free base function with a suitable organic acid. Examples of pharmaceutically acceptable salts include, but are not limited to, nontoxic acid addition salts, salts of an amino group formed with inorganic acids such as hydrochloric acid, hydrobromic acid, phosphoric acid, sulfuric acid and perchloric acid or with organic acids such as acetic acid, maleic acid, tartaric acid, citric acid, succinic acid or malonic acid or by using other methods used in the art such as ion exchange. 
Other pharmaceutically acceptable salts include, but are not limited to, adipate, alginate, ascorbate, aspartate, benzenesulfonate, benzoate, bisulfate, borate, butyrate, camphorate, camphorsulfonate, citrate, cyclopentanepropionate, digluconate, dodecylsulfate, ethanesulfonate, formate, fumarate, glucoheptonate, glycerophosphate, gluconate, hemisulfate, heptanoate, hexanoate, hydroiodide, 2-hydroxy-ethanesulfonate, lactobionate, lactate, laurate, lauryl sulfate, malate, maleate, malonate, methanesulfonate, 2-naphthalenesulfonate, nicotinate, nitrate, oleate, oxalate, palmitate, pamoate, pectinate, persulfate, 3-phenylpropionate, phosphate, picrate, pivalate, propionate, stearate, succinate, sulfate, tartrate, thiocyanate, p-toluenesulfonate, undecanoate, valerate salts, and the like. Representative alkali or alkaline earth metal salts include sodium, lithium, potassium, calcium, magnesium, and the like. Further pharmaceutically acceptable salts include, when appropriate, nontoxic ammonium, quaternary ammonium, and amine cations formed using counterions such as halide, hydroxide, carboxylate, sulfate, phosphate, nitrate, alkyl having from 1 to 6 carbon atoms, sulfonate and aryl sulfonate.
[0229] As used herein, "bioequivalence" or "bioequivalent", refers to non-covalent DNA binding agents or drug products of the agents of the invention, which are pharmaceutically equivalent, and their bioavailabilities (rate and extent of absorption) after administration in the same molar dosage or amount are similar to such a degree that their therapeutic effects, as to safety and efficacy, are essentially the same. In other words, bioequivalence or bioequivalent means the absence of a significant difference in the rate and extent to which the non-covalent DNA binding agent becomes available from such formulations at the site of action when administered at the same molar dose under similar conditions, e.g., the rate at which a non-covalent DNA binding agent can leave such a formulation and the rate at which it can be absorbed and/or become available at the site of action to affect cancer. In other words, there is a high degree of similarity in the bioavailabilities of two non-covalent DNA binding agent pharmaceutical products (of the same galenic form) from the same molar dose, that are unlikely to produce clinically relevant differences in therapeutic effects, or adverse reactions, or both. The terms "bioequivalence", as well as "pharmaceutical equivalence" and "therapeutic equivalence" are also used herein as defined and/or used by (a) the FDA, (b) the Code of Federal Regulations ("C.F.R."), Title 21, (c) Health Canada, (d) European Medicines Agency (EMEA), and/or (e) the Japanese Ministry of Health and Welfare.
[0230] Thus, it should be understood that the present invention contemplates novel compositions of one or more non-covalent DNA binding agent formulations, as the only active agents, or in combination with one or more anti-cancer or anti-inflammatory active agents or drug products that may be bioequivalent to other non-covalent DNA binding agent and anti-cancer or anti-inflammatory formulations or drug products of the present invention. By way of example, a first non-covalent DNA binding agent formulation or drug product is bioequivalent to a second non-covalent DNA binding agent formulation or drug product, in accordance with the present invention, when the measurement of at least one pharmacokinetic parameter, such as Cmax, Tmax, AUC, etc., of the first non-covalent DNA binding agent formulation or drug product varies by no more than about ±25% when compared to the measurement of the same pharmacokinetic parameter for the second non-covalent DNA binding agent formulation or drug product.
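By way of illustration only, the comparison of pharmacokinetic parameters described in the preceding paragraph can be sketched as follows. The function names, the dictionary layout and the example values are hypothetical and are not taken from the specification. The sketch applies the approximately 25% tolerance stated in the text to each shared parameter and reports which parameters fall within it; it is not a regulatory bioequivalence procedure.

```python
from typing import Dict, List


def within_tolerance(test_value: float, reference_value: float,
                     tolerance: float = 0.25) -> bool:
    """True if test_value differs from reference_value by no more than the fractional tolerance."""
    if reference_value <= 0:
        raise ValueError("reference value must be positive")
    return abs(test_value - reference_value) / reference_value <= tolerance


def parameters_within_tolerance(first: Dict[str, float], second: Dict[str, float],
                                tolerance: float = 0.25) -> List[str]:
    """Names of the shared pharmacokinetic parameters that fall within the tolerance."""
    return [name for name in first
            if name in second and within_tolerance(first[name], second[name], tolerance)]


def meets_example_criterion(first: Dict[str, float], second: Dict[str, float],
                            tolerance: float = 0.25) -> bool:
    """True if at least one shared parameter varies by no more than the tolerance."""
    return len(parameters_within_tolerance(first, second, tolerance)) >= 1


# Hypothetical measurements for two formulations; not data from the specification.
first_product = {"Cmax": 110.0, "Tmax": 2.0, "AUC": 900.0}
second_product = {"Cmax": 100.0, "Tmax": 1.0, "AUC": 1000.0}
print(parameters_within_tolerance(first_product, second_product))
print(meets_example_criterion(first_product, second_product))
```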
[0231] As used herein, "bioavailability" or "bioavailable", means generally the rate and extent of absorption of a non-covalent DNA binding agent into the systemic circulation and, more specifically, the rate or measurements intended to reflect the rate and extent to which a non-covalent DNA binding agent becomes available at the site of action or is absorbed from a drug product and becomes available at the site of action. In other words, and by way of example, bioavailability is the extent and rate of absorption of a non-covalent DNA binding agent from a formulation of the present invention as reflected by a time-concentration curve of the non-covalent DNA binding agent in systemic circulation.
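By way of illustration only, the following sketch shows how a time-concentration curve may be summarized by the pharmacokinetic parameters AUC, Cmax and Tmax. The function name and the sample data are hypothetical and are not taken from the specification; the linear trapezoidal rule used for the area under the curve is one common choice and is an assumption here.

```python
from typing import Dict, Sequence


def pk_summary(times: Sequence[float], concentrations: Sequence[float]) -> Dict[str, float]:
    """Summarize a time-concentration curve by AUC (trapezoidal rule), Cmax and Tmax."""
    times = list(times)
    concentrations = list(concentrations)
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration points")
    if any(later <= earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("times must be strictly increasing")

    # Area under the curve: sum of trapezoids between consecutive sampling times.
    auc = 0.0
    for i in range(len(times) - 1):
        width = times[i + 1] - times[i]
        auc += width * (concentrations[i] + concentrations[i + 1]) / 2.0

    cmax = max(concentrations)                # peak observed concentration
    tmax = times[concentrations.index(cmax)]  # time of the first observed peak
    return {"AUC": auc, "Cmax": cmax, "Tmax": tmax}


# Hypothetical samples (hours, concentration units); not data from the specification.
summary = pk_summary([0, 1, 2, 4, 8], [0, 4, 6, 3, 1])
print(summary)
```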
[0232] Absolute bioavailability compares the bioavailability (estimated as area under the curve, or AUC) of the active drug in systemic circulation following non-intravenous administration (e.g., after oral, rectal, transdermal, or subcutaneous administration) with the bioavailability of the same drug following intravenous administration. It is the fraction of the drug absorbed through non-intravenous administration compared with the corresponding intravenous administration of the same drug. The comparison must be dose normalized if different doses are used; consequently, each AUC is corrected by dividing by the corresponding dose administered.
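By way of illustration only, the dose-normalized comparison described above can be written as the ratio of the dose-corrected AUC after non-intravenous administration to the dose-corrected AUC after intravenous administration. The function name and the example values below are hypothetical and are not taken from the specification.

```python
def absolute_bioavailability(auc_non_iv: float, dose_non_iv: float,
                             auc_iv: float, dose_iv: float) -> float:
    """Fraction absorbed: (AUC_non_iv / dose_non_iv) divided by (AUC_iv / dose_iv)."""
    if min(auc_non_iv, dose_non_iv, auc_iv, dose_iv) <= 0:
        raise ValueError("AUC and dose values must be positive")
    normalized_non_iv = auc_non_iv / dose_non_iv  # AUC per unit dose, non-intravenous route
    normalized_iv = auc_iv / dose_iv              # AUC per unit dose, intravenous route
    return normalized_non_iv / normalized_iv


# Hypothetical values: oral AUC of 24 after a 10 mg dose, intravenous AUC of 30 after a 5 mg dose.
fraction = absolute_bioavailability(auc_non_iv=24.0, dose_non_iv=10.0,
                                    auc_iv=30.0, dose_iv=5.0)
print(round(fraction, 3))
```

When the same dose is given by both routes, the dose terms cancel and the result reduces to the simple ratio of the two AUC values.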
[0233] As used herein, the terms "pharmaceutical equivalence" or "pharmaceutically equivalent", refer to non-covalent DNA binding agent formulations or drug products of these agents that contain the same amount of non-covalent DNA binding agent, in the same dosage forms, but not necessarily containing the same inactive ingredients, for the same route of administration and meeting the same or comparable compendial or other applicable standards of identity, strength, quality, and purity, including potency and, where applicable, content uniformity and/or stability. Thus, it should be understood that the present invention contemplates non-covalent DNA binding agent formulations or drug products that may be pharmaceutically equivalent to other non-covalent DNA binding agent formulations or drug products used in accordance with the present invention.
[0234] As used herein, the terms "therapeutic equivalence or therapeutically equivalent", mean those non-covalent DNA binding agent formulations or drug products which (a) will produce the same clinical effect and safety profile when utilizing a non-covalent DNA binding agent drug product to treat a disease, for example cancer, in accordance with the present invention and (b) are pharmaceutical equivalents, e.g., they contain the non-covalent DNA binding agent in the same dosage form, they have the same route of administration; and they have the same non-covalent DNA binding agent strength. In other words, therapeutic equivalence means that a chemical equivalent of a non-covalent DNA binding agent formulation of the present invention (i.e., containing the same amount of the non-covalent DNA binding agent in the same dosage form when administered to the same individuals in the same dosage regimen) will provide essentially the same efficacy and toxicity.
[0235] "Biological sample," as used herein, refers to a material containing, for example, a nucleic acid or other biological or chemical material of interest. Biological samples containing DNA include hair, skin, cheek swab, and biological fluids such as blood, serum, plasma, sputum, lymphatic fluid, semen, vaginal mucus, feces, urine, spinal fluid, and the like. Isolation of DNA from such samples is well known to those skilled in the art.
[0236] "Drug" or "drug substance," as used herein, refers to an active ingredient, such as a chemical entity or biological entity, or combinations of chemical entities and/or biological entities, suitable to be administered to a subject to treat a disease, e.g., cancer or an inflammatory disease. In accordance with the present invention, the drug or drug substance is a non-covalent DNA binding agent or a pharmaceutically acceptable salt thereof.
[0237] The term "drug product," as used herein, is synonymous with the terms "medicine," "medicament," "therapeutic intervention," or "pharmaceutical product." Most preferably, a drug product is approved by a government agency for use in accordance with the methods of the present invention. A drug product, in accordance with the present invention, contains a non-covalent DNA binding agent.
II. Non-Covalent DNA Binding Agents
[0238] The invention provides for novel compositions of one or more non-covalent DNA binding agents, for example one or more non-covalent DNA minor groove binding agents, alone or in combination with one or more available anticancer or anti-inflammatory agent, and their use in treating a disease, for example cancer or an inflammatory disease, according to the methods defined herein.
[0239] The invention provides for a library of pyrrolobenzodiazepine dimers (PBDs) (for example as described in U.S. Pat. Nos. 6,362,331, 6,800,622, 6,683,073, 6,884,799 and 7,015,215, the contents of which are incorporated herein by reference in their entirety).
[0240] Non-covalent DNA binding agents of the invention that are PBDs are non-anthramycin DNA minor groove binding agents that exhibit improved properties, for example, water solubility, and decreased cardiotoxicity and metabolic inactivation as compared to natural anti-cancer antibiotics, for example anthramycin, tomaymycin, sibiromycin and neothramycin. The invention provides for PBDs that demonstrate unique S-phase cell cycle specificity resulting in the stalling of the DNA replication fork.
[0241] The invention provides for non-covalent DNA binding agents that are pyrrolobenzodiazepine dimers.
[0242] The non-covalent DNA binding agents of the invention are distinct from anti-tumor antibiotics because of the following:
[0243] They are potent minor groove binders of the DNA with specificity for G-C rich sequences;
[0244] These non-covalent DNA binding agents or intercalators are distinct from previously described DNA minor groove binding agents;
[0245] They exhibit excellent pharmacokinetics in rats;
[0246] They exhibit excellent potency in tumor cells that are deficient in DNA mismatch repair genes and/or pathways, such as those involved in the development of Lynch tumors, that have DNA mismatch repair gene deficiencies, either through genetic or epigenetic mutations;
[0247] These non-covalent DNA binding agents have excellent potency in tumors that exhibit "loss of tumor suppressor gene" function of apoptotic genes such as p53 and PTEN;
[0248] The non-covalent DNA binding agents of the invention show excellent cytotoxic potency in tumor cells that have loss of function in multiple gene targets that regulate DNA repair, replication and/or apoptosis.
[0249] Non-covalent DNA binding agents useful according to the invention include but are not limited to the PBDs presented below:
##STR00001##
III. Non-Covalent DNA Binding Agents May be Conjugated
[0250] PEGylation of Molecules
[0251] Non-covalent DNA binding agents of the invention may be joined to a PEG molecule (also referred to herein as pegylated non-covalent DNA binding agents of the invention) in order to enhance its stability and effectiveness.
[0252] Poly(ethylene glycol) (PEG) may be a linear or branched polyether terminated with hydroxyl groups and having the general structure:
HO-(CH₂CH₂O)ₙ-CH₂CH₂-OH
[0253] A useful modification for PEG is monomethoxy PEG (mPEG) having the general structure:
CH₃O-(CH₂CH₂O)ₙ-CH₂CH₂-OH
[0254] The monofunctionality of mPEG makes it particularly suitable for conjugation with non PEG molecules because it can yield reactive PEGs that do not produce crosslinked products. mPEG can be further modified to have a functional group useful for conjugation with non PEG molecules.
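By way of illustration only, the general structures given above for PEG and mPEG imply a molecular mass that grows linearly with the number n of repeating ethylene oxide units. The following sketch computes that mass from the atom counts of each structure. The function names are hypothetical, and the rounded standard atomic masses used are an assumption not stated in the specification.

```python
# Rounded standard atomic masses in g/mol (assumed values, not from the specification).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def _mass(carbons: int, hydrogens: int, oxygens: int) -> float:
    """Molecular mass for a C/H/O composition."""
    return (carbons * ATOMIC_MASS["C"]
            + hydrogens * ATOMIC_MASS["H"]
            + oxygens * ATOMIC_MASS["O"])


def peg_mass(n: int) -> float:
    """Mass of HO-(CH2CH2O)n-CH2CH2-OH, i.e. C(2n+2) H(4n+6) O(n+2)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return _mass(2 * n + 2, 4 * n + 6, n + 2)


def mpeg_mass(n: int) -> float:
    """Mass of CH3O-(CH2CH2O)n-CH2CH2-OH, i.e. C(2n+3) H(4n+8) O(n+2)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return _mass(2 * n + 3, 4 * n + 8, n + 2)


# Example: masses for a hypothetical chain with n = 100 repeating units.
print(round(peg_mass(100), 3), round(mpeg_mass(100), 3))
```

For any n, the two masses differ by one CH₂ group, reflecting the replacement of a terminal hydroxyl hydrogen by a methyl group in mPEG.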
[0255] To conjugate a PEG molecule to a non-PEG molecule such as a non-covalent DNA binding agent of the invention, it is necessary to activate the PEG by preparing a derivative of the PEG having a functional group at one or both termini. The functional group can be chosen based on the type of available reactive group on the molecule that will be conjugated to the PEG. In certain embodiments of this invention, it can be desirable to use the succinimidyl ester of the monopropionic acid derivative of PEG, as disclosed in Harris, J. M., et al., U.S. Pat. No. 5,672,662, which is incorporated herein fully by reference, or other succinimide activated PEG-carboxylic acids. In certain other embodiments, it can be desirable to use the p-nitrophenyl carbonate derivative of PEG, as disclosed in Kelly, S. J., et al. (2001) supra; PCT publication WO 00/07629 A2, supra, and in PCT publication WO 01/59078 A2 supra. Additional PEG derivatives include, but are not limited to, aldehyde derivatives of PEGs (Royer, G. P., U.S. Pat. No. 4,002,531; Harris, J. M., et al., U.S. Pat. No. 5,252,714), amine, bromophenyl carbonate, carbonylimidazole, chlorophenyl carbonate, fluorophenyl carbonate, hydrazide, iodoacetamide, maleimide, orthopyridyl disulfide, oxime, phenylglyoxal, thiazolidine-2-thione, thioester, thiol, triazine and vinylsulfone derivatives of PEGs.
[0256] In accordance with the practice of the invention, one or several (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, and up to 10) strands of one or more PEGs can be coupled to a non-covalent DNA binding agent of the invention. In one embodiment, one or two strands of PEG may be coupled to a non-covalent DNA binding agent of the invention.
[0257] In an embodiment of the invention, coupling of PEG to non-covalent DNA binding agents of the invention may be effected by, for example, reductive alkylation (also known as reductive amination) using standard methods (see e.g., Bentley, M. D., et al., U.S. Pat. No. 5,990,237; references 1-69).
[0258] In one embodiment, a PEG derivative suitable for conjugation with N-terminal amino acid groups of proteins or polypeptides (e.g. non-covalent DNA binding agents of the invention) is mPEG-propionaldehyde as shown below in a reductive alkylation reaction (see for example U.S. Pat. No. 5,252,714). In this embodiment, sodium cyanoborohydride may be used as the reducing agent (Cabacungan, J. C., et al., (1982) Anal Biochem 124:272-278; U.S. Pat. No. 5,252,714). In accord with the practice of the invention, H₂N-R can be a non-covalent DNA binding agent of the invention.
##STR00002##
[0259] Other PEG derivatives suitable for conjugation with N-terminal amino acid groups include, but are not limited to: PEG-acetaldehyde, PEG carboxylic acids (e.g., PEG propionic acid, PEG butanoic acid).
[0260] Reversible conjugation using PEG derivative molecules can be beneficial in some circumstances. Examples of PEG derivatives that can conjugate and release non-PEG molecules include, but are not limited to: PEG-succinimidyl succinate, PEG maleic anhydride, mPEG phenyl ether succinimidyl carbonates and mPEG benzamide succinimidyl carbonates.
[0261] Heterobifunctional PEGs are PEGs bearing dissimilar terminal groups. Heterobifunctional PEGs with appropriate functional groups can be used to link two entities where a hydrophilic, flexible, and biocompatible spacer is needed. Heterobifunctional PEGs can be used in a variety of ways including, but not limited to, linking molecules to surfaces (for immunoassays, biosensors or various probe applications, etc), targeting of drugs, liposomes, and viruses to specific tissues, liquid phase peptide synthesis and other applications.
[0262] In addition to the linear PEG molecules described above, branched and/or forked PEGs can be used to conjugate non-PEG molecules (e.g. non-covalent DNA binding agents of the invention). Branched PEG molecules have a single functional group at the end of two PEG chains. A branched PEG structure can be more effective than a linear PEG in protecting conjugated agents from proteolysis and in reducing antigenicity and immunogenicity of such conjugates. Forked PEGs have two reactive groups at one end of a single PEG chain. Forked PEG molecules can be used to bring two non PEG molecules in close proximity to each other by attaching the non PEG molecules to the single forked PEG molecule.
[0263] Examples of branched and/or forked PEG molecules are shown below.
[0264] Branched PEG:
##STR00003##
[0265] Linear Forked PEG:
##STR00004##
[0266] Branched Forked PEG:
##STR00005##
[0267] Enhanced Activity of PEGylated Non-Covalent DNA Binding Agents of the Invention
[0268] Enhanced receptor binding activity and functional activity (e.g., increased or extended half-life) may be an advantage of the pegylated non-covalent DNA binding agents of the invention. Increased receptor binding activity and increased functional activity can be measured, or employed, in vitro, and increased potency can be measured either in vitro or in vivo.
III. Anti-Inflammatory Agents
[0269] Anti-inflammatory agents useful in the combination therapy of the invention include, but are not limited to, dihydrofolic acid reductase inhibitors, e.g., methotrexate; cyclophosphamide; cyclosporine; cyclosporin A; chloroquine; hydroxychloroquine; sulfasalazine (sulphasalazopyrine); gold salts; D-penicillamine; leflunomide; azathioprine; anakinra; a Non-Steroidal Anti-Inflammatory Drug (NSAID); TNF blockers, e.g., infliximab (REMICADE®) or etanercept; and a biological agent that targets an inflammatory cytokine. In accordance with the practice of the invention, therapeutically effective salts or prodrugs of these agents may also be used.
[0270] NSAIDs include, but are not limited to acetyl salicylic acid, choline magnesium salicylate, diflunisal, magnesium salicylate, salsalate, sodium salicylate, diclofenac, etodolac, fenoprofen, flurbiprofen, indomethacin, ketoprofen, ketorolac, meclofenamate, naproxen, nabumetone, phenylbutazone, piroxicam, sulindac, tolmetin, acetaminophen, ibuprofen, Cox-2 inhibitors, meloxicam and tramadol. In accordance with the practice of the invention, therapeutically effective salts or prodrugs of these agents may also be used.
IV. Anti-Cancer Agents
[0271] Anti-cancer agents useful in the combination therapy of the invention include, but are not limited to: histone deacetylase inhibitors (HDIs or HDACIs) (such as trichostatin A (7-[4-(dimethylamino)phenyl]-N-hydroxy-4,6-dimethyl-7-oxohepta-2,4-dienamide)); topoisomerase I inhibitors such as camptothecin ((S)-4-ethyl-4-hydroxy-1H-pyrano[3',4':6,7]indolizino[1,2-b]quinoline-3,14-(4H,12H)-dione), topotecan ((S)-10-[(dimethylamino)methyl]-4-ethyl-4,9-dihydroxy-1H-pyrano[3',4':6,7]indolizino[1,2-b]quinoline-3,14(4H,12H)-dione monohydrochloride) and irinotecan ((S)-4,11-diethyl-3,4,12,14-tetrahydro-4-hydroxy-3,14-dioxo-1H-pyrano[3',4':6,7]-indolizino[1,2-b]quinolin-9-yl-[1,4'-bipiperidine]-1'-carboxylate); protein synthesis inhibitors such as cycloheximide (4-[(2R)-2-[(1S,3S,5S)-3,5-Dimethyl-2-oxocyclohexyl]-2-hydroxyethyl]piperidine-2,6-dione); DNA alkylating agents such as mitomycin C ([6-Amino-8a-methoxy-5-methyl-4,7-dioxo-1,1a,2,4,7,8,8a,8b-octahydroazireno[2',3':3,4]pyrrolo[1,2-a]indol-8-yl]methyl carbamate); topoisomerase II inhibitors such as anthracycline antibiotics like doxorubicin ((8S,10S)-10-(4-amino-5-hydroxy-6-methyl-tetrahydro-2H-pyran-2-yloxy)-6,8,11-trihydroxy-8-(2-hydroxyacetyl)-1-methoxy-7,8,9,10-tetrahydrotetracene-5,12-dione) and etoposide (4'-demethyl-epipodophyllotoxin 9-[4,6-O-(R)-ethylidene-beta-D-glucopyranoside], 4'-(dihydrogen phosphate)); anti-metabolite agents (such as 6-thioguanine (6TG) (2-amino-6,7-dihydro-3H-purine-6-thione), and 5-fluorouracil (5-FU) (5-fluoro-1H-pyrimidine-2,4-dione)); epidermal growth factor receptor (EGFR) inhibitors (such as gefitinib (N-(3-chloro-4-fluoro-phenyl)-7-methoxy-6-(3-morpholin-4-ylpropoxy)quinazolin-4-amine) and erlotinib (N-(3-ethynylphenyl)-6,7-bis(2-methoxyethoxy)quinazolin-4-amine)); RNA synthesis inhibitors such as actinomycin D (2-amino-N,N'-bis[(6S,9R,10S,13R,18aS)-6,13-diisopropyl-2,5,9-trimethyl-1,4,7,11,14-pentaoxohexadecahydro-1H-pyrrolo[2,1-i][1,4,7,10,13]oxatetraazacyclohexadecin-10-yl]-4,6-dimethyl-3-oxo-3H-phenoxazine-1,9-dicarboxamide); anti-mitotic agents like tubulin inhibitors such as paclitaxel ((2α,4α,5β,7β,10β,13α)-4,10-bis(acetyloxy)-13-{[(2R,3S)-3-(benzoylamino)-2-hydroxy-3-phenylpropanoyl]oxy}-1,7-dihydroxy-9-oxo-5,20-epoxytax-11-en-2-yl benzoate) (also known as Taxol) and vinca alkaloids like vincristine (methyl (1R,9R,10S,11R,12R,19R)-11-(acetyloxy)-12-ethyl-4-[(13S,15S,17S)-17-ethyl-17-hydroxy-13-(methoxycarbonyl)-1,11-diazatetracyclo[13.3.1.0⁴,¹².0⁵,¹⁰]nonadeca-4(12),5,7,9-tetraen-13-yl]-8-formyl-10-hydroxy-5-methoxy-8,16-diazapentacyclo[10.6.1.0¹,⁹.0²,⁷.0¹⁶,¹⁹]nonadeca-2,4,6,13-tetraene-10-carboxylate) and vinblastine (dimethyl (2β,3β,4β,5α,12β,19α)-15-[(5S,9S)-5-ethyl-5-hydroxy-9-(methoxycarbonyl)-1,4,5,6,7,8,9,10-octahydro-2H-3,7-methanoazacycloundecino[5,4-b]indol-9-yl]-3-hydroxy-16-methoxy-1-methyl-6,7-didehydroaspidospermidine-3,4-dicarboxylate); DNA synthesis inhibitors like fludarabine ([(2R,3R,4S,5R)-5-(6-amino-2-fluoro-purin-9-yl)-3,4-dihydroxy-oxolan-2-yl]methoxyphosphonic acid) and hydroxyurea; poly ADP ribose polymerase (PARP) inhibitors (such as olaparib (4-[(3-[(4-cyclopropylcarbonyl)piperazin-4-yl]carbonyl)-4-fluorophenyl]methyl(2H)phthalazin-1-one)); and DNA crosslinking agents such as cisplatin ((SP-4-2)-diamminedichloridoplatinum), carboplatin (cis-diammine(cyclobutane-1,1-dicarboxylate-O,O)platinum(II)) and oxaliplatin ([(1R,2R)-cyclohexane-1,2-diamine](ethanedioato-O,O)platinum(II)). In accordance with the practice of the invention, therapeutically effective salts or prodrugs of these anti-cancer agents may be used.
V. Genes
[0272] The invention provides for novel compositions and use of one or more non-covalent DNA binding agents, alone (as the only active agent(s)) or in combination with other anticancer or anti-inflammatory active agents, in the treatment of cancer or inflammatory disease in patients with, for example, mutations in genes including but not limited to:
[0273] genes regulating DNA replication, recombination, repair and/or apoptosis such as PTEN, p53, BRCA1 and/or BRCA2, together with the associated BRCA1/rad51/MRE11/replication protein A (RPA) complex;
[0274] genes regulating DNA mismatch repair such as MLH1, MSH2, MSH6, PMS1, PMS2;
[0275] genes regulating translesion synthesis such as REV3 and its associated protein complexes at the replication fork;
[0276] genes regulating cell proliferation such as the KRAS and BRAF kinase pathways;
[0277] genes encoding kinases regulating DNA replication, recombination, repair and/or apoptosis such as the ATM, ATR, Chk1 and/or Chk2 kinases;
[0278] genes involved in base excision repair such as XRCC1;
[0279] nucleotide excision repair genes such as ERCC1;
[0280] homologous recombination genes such as RAD51, RAD52, RAD54, BRCA1, BRCA2, XRCC2 and XRCC3;
[0281] genes regulating non-homologous recombination such as KU70, KU80, XRCC4 and DNA ligase4; and
[0282] genes regulating transcription-coupled repair such as CSA, CSB and XPG.
[0283] The invention therefore provides for novel compositions and use of one or more non-covalent DNA binding agents alone, as the only active agent(s), or in combination with other anticancer or anti-inflammatory active agents, in the treatment of cancer or inflammatory disease in patients with, for example, a mutation in a gene or gene pathway including but not limited to PTEN, p53, BRCA1, BRCA2, MLH1, PMS1, PMS2, MSH2, MSH6, REV3, KRAS, BRAF, Chk1, Chk2, KU70, KU80, DNA ligase 4, CSA, CSB, XRCC1, XRCC2, XRCC3, XRCC4, RAD51, RAD52, RAD54, REV, ATM, ATR, XPF, Ercc1, XPA, XPB, XPD, XPF, XPG, MSH6/3, PCNA, BARD1, RAD50, NBS1, Mre11, BLM, PMS2, MLH1, MED1, RFC, pol.gamma./.epsilon., RPA, DNA ligase I and the MRE1/RPA1/RAD51 complex.
TABLE 1
Symbol    Entrez Gene ID    NCBI Reference Sequence
TP53      7157              NM_000546
MLH1      4292              NM_000249
MSH2      4436              NM_000251
BRCA1     672               NM_007294
REV3L     5980              NM_002912
PARP1     142               NM_001618
RAD51     5888              NM_002875
MRE11A    4361              NM_005591
ATM       472               NM_000051
ATR       545               NM_001184
PTEN      5728              NM_000314
ERCC1     2067              NM_001983
BRCA2     675               NM_000059
XRCC1     7515              NM_006297
KRAS      3845              NM_033360
BRAF      673               NM_004333
RAD50     10111             NM_005732
RAD51     5393              NM_134424
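By way of illustration only, the rows of Table 1 can be held in a small lookup structure when screening a subject's reported gene symbols against the table. The sketch below is in Python; the names TABLE_1 and lookup are hypothetical and not part of the disclosure. The rows are copied as they appear in Table 1, including the two rows that carry the symbol RAD51, so a list of tuples is used rather than a dictionary keyed by symbol.

```python
# Rows copied verbatim from Table 1: (symbol, Entrez Gene ID, NCBI Reference Sequence).
# TABLE_1 and lookup are illustrative names, not part of the original disclosure.
TABLE_1 = [
    ("TP53", 7157, "NM_000546"),
    ("MLH1", 4292, "NM_000249"),
    ("MSH2", 4436, "NM_000251"),
    ("BRCA1", 672, "NM_007294"),
    ("REV3L", 5980, "NM_002912"),
    ("PARP1", 142, "NM_001618"),
    ("RAD51", 5888, "NM_002875"),
    ("MRE11A", 4361, "NM_005591"),
    ("ATM", 472, "NM_000051"),
    ("ATR", 545, "NM_001184"),
    ("PTEN", 5728, "NM_000314"),
    ("ERCC1", 2067, "NM_001983"),
    ("BRCA2", 675, "NM_000059"),
    ("XRCC1", 7515, "NM_006297"),
    ("KRAS", 3845, "NM_033360"),
    ("BRAF", 673, "NM_004333"),
    ("RAD50", 10111, "NM_005732"),
    ("RAD51", 5393, "NM_134424"),
]


def lookup(symbol):
    """Return every (Entrez Gene ID, RefSeq accession) pair listed for a symbol.

    Matching is case-insensitive; an unknown symbol yields an empty list.
    """
    key = symbol.upper()
    return [(gene_id, accession) for sym, gene_id, accession in TABLE_1 if sym == key]


if __name__ == "__main__":
    print(lookup("pten"))   # [(5728, 'NM_000314')]
    print(lookup("RAD51"))  # two rows, as listed in Table 1
```

Returning a list keeps both RAD51 rows visible to the caller instead of silently discarding one of them.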
[0284] PTEN
[0285] Phosphatase and tensin homolog (PTEN) is a protein that is encoded by the PTEN gene. Mutations of this gene are a step in the development of many cancers. PTEN acts as a tumor suppressor gene through the action of its phosphatase protein product. This phosphatase is involved in the regulation of the cell cycle, preventing cells from growing and dividing too rapidly.
[0286] This gene was identified as a tumor suppressor that is mutated in a large number of cancers at high frequency. The protein encoded by this gene is a phosphatidylinositol-3,4,5-trisphosphate 3-phosphatase. It contains a tensin like domain as well as a catalytic domain similar to that of the dual specificity protein tyrosine phosphatases. Unlike most of the protein tyrosine phosphatases, this protein preferentially dephosphorylates phosphoinositide substrates. It negatively regulates intracellular levels of phosphatidylinositol-3,4,5-trisphosphate in cells and functions as a tumor suppressor by negatively regulating the Akt/PKB signaling pathway.
[0287] p53
[0288] p53 is a tumor suppressor protein that in humans is encoded by the TP53 gene. p53 is important in multicellular organisms, where it regulates the cell cycle and thus functions as a tumor suppressor involved in preventing cancer. As such, p53 plays a role in conserving genomic stability by preventing mutation of the genome.
[0289] BRCA1
[0290] BRCA1 (breast cancer 1) is a human tumor suppressor gene that produces a protein called breast cancer type 1 susceptibility protein. BRCA1 is expressed in the cells of breast and other tissue, where it helps repair damaged DNA, or destroy cells if the DNA cannot be repaired. If BRCA1 itself is damaged, damaged DNA is not repaired properly, and this increases the risk of cancer.
[0291] The protein encoded by the BRCA1 gene combines with other tumor suppressors, DNA damage sensors, and signal transducers to form a large multi-subunit protein complex known as the BRCA1-associated genome surveillance complex (BASC). The BRCA1 protein associates with RNA polymerase II, and, through the C-terminal domain, also interacts with histone deacetylase complexes. This protein thus plays a role in transcription, DNA repair of double-stranded breaks, ubiquitination, transcriptional regulation as well as other functions.
[0292] BRCA2
[0293] BRCA2 (Breast Cancer 2 susceptibility protein) is a protein that in humans is encoded by the BRCA2 gene. BRCA2 belongs to the tumor suppressor gene family and the protein encoded by this gene is involved in the repair of chromosomal damage with an important role in the error-free repair of DNA double strand breaks.
[0294] DNA Mismatch Repair Genes
[0295] DNA mismatch repair is a system for recognizing and repairing erroneous insertion, deletion and mis-incorporation of bases that can arise during DNA replication and recombination, as well as repairing some forms of DNA damage.
[0296] Mismatch repair is strand-specific. During DNA synthesis it is common that errors are introduced into the newly synthesized (daughter) strand.
[0297] Any mutational event that disrupts the superhelical structure of DNA carries with it the potential to compromise the genetic stability of a cell.
[0298] Examples of mismatched bases include a G/T or A/C. Mismatches are commonly due to tautomerization of bases during synthesis. The damage is repaired by recognition of the deformity caused by the mismatch, determination of the template and non-template strand, and excision of the wrongly incorporated base and replacement of the incorrect base with the correct nucleotide. The removal process involves more than just the mismatched nucleotide itself. A few or up to thousands of base pairs of the newly synthesized DNA strand can be removed.
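The sequence of steps described above (recognition of the mismatch, reliance on the template strand as the correct copy, excision of a stretch of the daughter strand, and resynthesis) can be pictured with a toy Python sketch. It is a simplification under stated assumptions: both strands are written position by position in the same left-to-right order, strand discrimination is assumed to have already taken place, and the names WATSON_CRICK, find_mismatches and repair as well as the flank size are hypothetical choices made for illustration, not a model of the cellular machinery.

```python
# Toy illustration of mismatch recognition and patch repair.
# Assumption: template[i] is paired with daughter[i]; the template is taken as correct.
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(template, daughter):
    """Return the positions at which the daughter base does not pair with the template."""
    if len(template) != len(daughter):
        raise ValueError("strands must have the same length")
    return [
        i
        for i, (t, d) in enumerate(zip(template, daughter))
        if WATSON_CRICK[t] != d
    ]


def repair(template, daughter, flank=2):
    """Excise a patch around each mismatch and resynthesize it from the template.

    The patch covers `flank` bases on each side of the mismatch, reflecting that
    more than the mismatched nucleotide itself is removed.
    """
    repaired = list(daughter)
    for i in find_mismatches(template, daughter):
        start = max(0, i - flank)
        end = min(len(template), i + flank + 1)
        for j in range(start, end):
            repaired[j] = WATSON_CRICK[template[j]]
    return "".join(repaired)


if __name__ == "__main__":
    template = "ATGGC"
    daughter = "TACTG"  # position 3 pairs G with T, a G/T mismatch
    print(find_mismatches(template, daughter))  # [3]
    print(repair(template, daughter))           # TACCG
```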
[0299] Mismatch repair (MMR) genes are involved in recognition and repair of certain types of DNA damage or replication errors. These genes also function to help preserve the fidelity of the genome through successive cycles of cell division.
[0300] The protein products of MMR genes also repair branched DNA structures, prevent recombination of divergent sequences, direct non-MMR proteins in nucleotide excision and other forms of DNA repair, and are involved in regulation of meiotic crossover. Defects in MMR genes lead to Microsatellite Instability (MSI) and cancer.
[0301] MLH1
[0302] MutL homolog 1, colon cancer, nonpolyposis type 2 (E. coli), also known as MLH1, is a human gene located on Chromosome 3. It is a gene commonly associated with hereditary nonpolyposis colorectal cancer.
[0303] This gene was identified as a locus frequently mutated in hereditary nonpolyposis colon cancer (HNPCC). It is a human homolog of the E. coli DNA mismatch repair gene mutL, consistent with the characteristic alterations in microsatellite sequences (RER+ phenotype) found in HNPCC. Alternatively spliced transcript variants encoding different isoforms have been described, but their full-length natures have not been determined.
[0304] PMS1
[0305] PMS1 protein homolog 1 is a protein that in humans is encoded by the PMS1 gene.
[0306] The protein encoded by this gene was identified by its homology to a yeast protein involved in DNA mismatch repair. This protein forms heterodimers with MLH1, a DNA mismatch repair protein, and some cases of hereditary nonpolyposis colorectal cancer have been found to have mutations in this gene.
[0307] PMS2
[0308] Mismatch repair endonuclease PMS2 is an enzyme that in humans is encoded by the PMS2 gene.
[0309] This gene is one of the PMS2 gene family members which are found in clusters on chromosome 7. The product of this gene is involved in DNA mismatch repair. The protein forms a heterodimer with MLH1 and this complex interacts with MSH2 bound to mismatched bases. Defects in this gene are associated with hereditary nonpolyposis colorectal cancer, with Turcot syndrome, and are a cause of supratentorial primitive neuroectodermal tumors. Alternatively spliced transcript variants have been observed.
[0310] MSH2
[0311] MSH2 is a gene commonly associated with hereditary nonpolyposis colorectal cancer.
[0312] MSH2 was identified as a locus frequently mutated in hereditary nonpolyposis colon cancer (HNPCC). When cloned, it was discovered to be a human homolog of the E. coli mismatch repair gene mutS, consistent with the characteristic alterations in microsatellite sequences (RER+ phenotype) found in HNPCC. It is also associated with some endometrial cancers.
[0313] MSH3
[0314] DNA mismatch repair protein Msh3 is a protein that in humans is encoded by the MSH3 gene. MSH3 has been shown to interact with MSH2, PCNA and BRCA1.
[0315] MSH6
[0316] MSH6 is a gene commonly associated with hereditary nonpolyposis colorectal cancer.
[0317] MSH6 has been shown to interact with MSH2, PCNA and BRCA1.
VI. Cells and Cell Lines
[0318] Cell lines useful according to the invention include but are not limited to breast cancer cell lines (MMR- or PTEN-deficient or BRCA1 mutant), e.g., MDA-MB-231, MCF-7, MDA-MB-468; colon cancer cell lines (MMR-deficient; KRAS-mutant cells) e.g., HCT-116, SW-620, SW-480, SW48, SW-403, Colo205; lymphoblastoid cell lines (MSH2- or PTEN-deficient cells) e.g., CEM and Jurkat; ovarian and uterine cancer cell lines (DNA MMR-deficient cells) e.g., HeLa, SKOV-3; osteosarcoma cells (MMR-competent) e.g., U2OS; and lung cancer cells (MMR-competent or MMR-deficient) e.g., A549 and H1299.
[0319] Cell lines derived from patients with any of the cancers or inflammatory diseases recited herein are also useful according to the methods of the invention.
VII. Diseases
[0320] The invention provides for novel compositions and methods for treatment of a subject with a disease comprising administration of a pharmaceutically effective amount of one or more of a non-covalent DNA binding agent, for example, a non-covalent DNA minor groove binding agent, alone, as the only active agent(s) or in combination with one or more anti-cancer and/or anti-inflammatory active agents. For example, the invention provides for treating cancer with one or more non-covalent DNA-minor groove binding agents that result in DNA crosslinking or intercalation, alone, as the only active agent(s) or in combination with one or more anti-cancer active agents. The invention contemplates treating any one of cancer, tumor growth, cancer of the colon, breast, bone, brain and others (e.g., osteosarcoma, neuroblastoma, colon adenocarcinoma), chronic myelogenous leukemia (CML), acute myeloid leukemia (AML), acute promyelocytic leukemia (APL); cardiac cancer (e.g., sarcoma, myxoma, rhabdomyoma, fibroma, lipoma and teratoma); lung cancer (e.g., bronchogenic carcinoma, alveolar carcinoma, bronchial adenoma, sarcoma, lymphoma, chondromatous hamartoma, mesothelioma); various gastrointestinal cancers (e.g., cancers of esophagus, stomach, pancreas, small bowel, and large bowel); genitourinary tract cancer (e.g., kidney, bladder and urethra, prostate, testis); liver cancer (e.g., hepatoma, cholangiocarcinoma, hepatoblastoma, angiosarcoma, hepatocellular adenoma, hemangioma); bone cancer (e.g., osteogenic sarcoma, fibrosarcoma, malignant fibrous histiocytoma, chondrosarcoma, Ewing's sarcoma, malignant lymphoma, multiple myeloma, malignant giant cell tumor, chordoma, osteochondroma, benign chondroma, chondroblastoma, chondromyxofibroma, osteoid osteoma and giant cell tumors); cancers of the nervous system (e.g., of the skull, meninges, brain, and spinal cord); gynecological cancers (e.g., uterus, cervix, ovaries, vulva, vagina); hematologic cancer (e.g., cancers relating to blood, Hodgkin's disease, non-Hodgkin's lymphoma); skin cancer (e.g., malignant melanoma, basal cell carcinoma, squamous cell carcinoma, Kaposi's sarcoma, moles, dysplastic nevi, lipoma, angioma, dermatofibroma, keloids, psoriasis); and cancers of the adrenal glands (e.g., neuroblastoma).
[0321] In particular, the invention relates to novel compositions of one or more non-covalent DNA binding agents, alone, as the only active agent(s) or in combination with one or more anti-cancer active agents and their use to treat those cancers that are genetically-resistant and have a loss of at least one tumor suppressor gene function. Such cancers include tumors of the brain (such as gliomas and glioblastomas), blood (such as leukemias and lymphomas), bladder, breast, colorectal, endometrial, lung, melanomas, ovarian, prostate, renal and testicular cancers.
[0322] In one embodiment the invention provides for treating MMR-deficient colorectal cancer using a novel composition of one or more non-covalent DNA binding agents, alone, as the only active agent(s) or in combination with one or more anti-cancer active agents of the invention. One of the most studied genotypic subtypes of colorectal cancer is that characterized by a deficient mismatch repair (dMMR) pathway, usually found in combination with microsatellite instability (see Hewish et al., Nature Reviews 7: 197-208, 2010). MMR-deficient colorectal cancer can occur as a result of inherited or sporadic abnormalities in DNA repair pathways. The phenotypic characteristics of this cancer include proximal anatomical location, mucinous features and lymphocytic infiltration.
[0323] Preclinical and clinical studies have demonstrated that MMR-deficient colorectal cancer shows resistance to 5-fluorouracil. Heterogeneity exists within the MMR-deficient colorectal cancer subtype, possibly due to secondary mutations arising from the mutator phenotype associated with MMR deficiency.
[0324] In another embodiment, the invention provides for treating "triple negative" and "basal-like" breast cancers with novel compositions of one or more non-covalent DNA binding agents, alone, as the only active agent(s), or in combination with one or more anti-cancer active agents of the invention. Triple-negative breast cancer is the subgroup of breast cancer that does not express clinically significant levels of the estrogen receptor (ER), progesterone receptor (PR) and HER2/neu (HER2) (Carey, L., Winer, E., Viale, G., Cameron, D. and Gianni, L. Triple-negative breast cancer: disease entity or title of convenience. Nature Reviews 7: 683-692, 2010).
[0325] BRCA1 protein expression levels are significantly lower in tumors of high histological grade that lack hormone receptors (triple negative and basal-like breast tumors). Further, basal-like breast cancers also have significant TP53 (P53) gene mutations and BRCA1 pathway dysfunction. BRCA1-pathway related cancers likely have DNA repair defects. These BRCA1 pathway dysfunction tumors show sensitivity to DNA crosslinking agents, for example platinum, in combination with antimetabolite drugs, such as gemcitabine, and poly ADP-ribose polymerase (PARP) inhibitors, such as olaparib and iniparib.
[0326] In another embodiment, the invention provides for treating human glioblastomas with novel compositions of one or more non-covalent DNA binding agents, alone, as the only active agent(s) or in combination with one or more anti-cancer active agents of the invention. One of the key markers for glioblastomas is the methylation status of MGMT. The MGMT methylation status predicts the sensitivity of human glioblastomas to alkylating agents, e.g., temozolomide.
[0327] The invention also contemplates treating any one of the inflammatory diseases recited herein with novel compositions of one or more non-covalent DNA binding agents, alone, as the only active agent(s) or in combination with one or more anti-inflammatory active agents of the invention.
[0328] The invention also contemplates treating a subject having an infection (e.g., bacterial infection, viral infection, yeast infection, or parasitic infection) with a therapeutically effective amount of one or more PBDs, such as NSC718813, NSC723734, NSC723732 and NSC726260, so as to treat the infection.
[0329] The invention also contemplates treating a subject suffering from an infection (e.g., bacterial infection, viral infection, yeast infection, or parasitic infection) by administering to the subject a therapeutically effective amount of one or more of the following PBDs:
##STR00006##
wherein R is H, OH, or OAc and n is 3 to 5;
##STR00007##
wherein R is H or OH, and n is 1 to 4;
##STR00008##
wherein R and R.sub.1 are independently H or --OH, and n is an integer from 3 to 5;
##STR00009##
wherein n is 2 to 10; or
##STR00010##
wherein R is H, OH or OAc, R.sub.1 is H, and n is 3 to 5.
VIII. Dosages and Modes of Administration
[0330] In general, non-covalent DNA binding agents of the invention may be administered in therapeutically effective amounts via any of the usual and acceptable modes known in the art, either as one or more non-covalent DNA binding agents like the PBDs alone or in combination with one or more additional therapeutic agents, e.g., anti-cancer agents and/or anti-inflammatory agents. A therapeutically effective amount may vary widely depending on the disease, the severity of the disease, the age and relative health of the subject, the potency of the compound used and other factors. In general, satisfactory results are indicated to be obtained systemically at daily dosages of from about 0.001 mg to 1000 mg per subject. An indicated daily intravenous dosage in the larger mammal, e.g. humans, is in the range from about 0.0001 mg to about 100 mg per subject, conveniently administered, e.g. in divided doses up to 1-2 times a day or in retard form. Suitable unit dosage forms for intravenous administration comprise from about 0.001 mg to about 10 mg/ml active ingredient.
[0331] In certain embodiments, a therapeutic amount or intravenous dose of one or more of a non-covalent DNA binding agent of the present invention may range from about 0.001 mg to about 100 mg per subject, alternatively from about 0.01 mg to about 10 mg per subject. In general, treatment regimens according to the present invention comprise administration to a patient in need of such treatment from about 0.001 mg to about 1000 mg of the compound(s) of this invention per day in single or multiple doses. Therapeutic amounts or doses will also vary depending on route of administration, as well as the possibility of co-usage with other agents.
[0332] Upon improvement of a subject's condition, a maintenance dose of one or more non-covalent DNA binding agents, either alone or in combination with one or more additional therapeutic agents, e.g., a chemotherapeutic agent, may be administered, if necessary. Subsequently, the dosage or frequency of administration, or both, may be altered, for example reduced, as a function of the symptoms, to a level at which the improved condition is retained. When the symptoms have been alleviated to the desired level, treatment should cease. The subject may, however, require intermittent treatment on a long-term basis upon any recurrence of disease symptoms.
[0333] It will be understood, however, that the total daily usage of one or more non-covalent DNA binding agents alone or in combination with one or more anti-cancer and/or anti-inflammatory agents of the present invention will be decided by the attending physician within the scope of sound medical judgment. The specific inhibitory dose for any particular patient will depend upon a variety of factors including the disorder being treated and the severity of the disorder; the activity of the specific compound employed; the specific composition employed; the age, body weight, general health, sex and diet of the patient; the time of administration, route of administration, and rate of excretion of the specific compound employed; the duration of the treatment; drugs used in combination or coincidental with the specific compound employed; and like factors well known in the medical arts.
[0334] In general, the anti-inflammatory agents of the invention may be administered in therapeutically effective amounts via any of the acceptable modes known in the art. Depending on the anti-inflammatory agent, an effective amount can be in a range of about 0.01 to about 5000 mg/day. This range can be modified to an amount of about 0.1 to 10 mg/day, about 10 to 50 mg/day, about 50 to 100 mg/day, about 100 to 150 mg/day, about 150 to 200 mg/day, about 200 to 250 mg/day, about 250 to 300 mg/day, about 300 to 350 mg/day, about 350 to 400 mg/day, about 400 to 450 mg/day, about 450 to 500 mg/day, about 500 to 550 mg/day, about 550 to 600 mg/day, about 600 to 650 mg/day, about 650 to 700 mg/day, about 700 to 750 mg/day, about 750 to 800 mg/day, about 800 to 850 mg/day, about 850 to 900 mg/day, about 900 to 950 mg/day, about 950 to 1000 mg/day, about 1000 to 1100 mg/day, about 1100 to 1200 mg/day, about 1200 to 1300 mg/day, about 1300 to 1400 mg/day, about 1400 to 1500 mg/day, about 1500 to 1600 mg/day, about 1600 to 1700 mg/day, about 1700 to 1800 mg/day, about 1800 to 1900 mg/day, about 1900 to 2000 mg/day, about 2000 to 2500 mg/day, about 2500 to 3000 mg/day, about 3000 to 3500 mg/day, about 3500 to 4000 mg/day, about 4000 to 4500 mg/day or about 4500 to 5000 mg/day. It would be clear to one skilled in the art that dosage will vary depending on the particular anti-inflammatory agent being used. Specific examples of appropriate dosages, depending on the anti-inflammatory agent, are described below.
[0335] In another embodiment, an effective amount of an anti-inflammatory agent can be in a range of about 0.1 mg/week to 40 mg/week; 0.1 mg/week to 5 mg/week; 5 mg/week to 10 mg/week; 10 mg/week to 30 mg/week; 30 mg/week to 35 mg/week; 0.1 mg/week to 100 mg/week; or 30 mg/week to 50 mg/week. In another embodiment, an anti-inflammatory agent can be administered in an amount of about 50 mg/week or 25 mg twice weekly. It would be clear to one skilled in the art that dosage range will vary depending on the particular anti-inflammatory agent being used, for example see below.
[0336] Methotrexate is an antimetabolite molecule that interferes with DNA synthesis, repair and cellular replication. Methotrexate is an inhibitor of dihydrofolic acid reductase, i.e., it is a folic acid antagonist. Methotrexate may be administered in an amount of about 0.1 to 40 mg per week, with a dosage ranging from about 5 to 30 mg per week. Methotrexate may be administered to a subject in various increments: about 0.1 to 5 mg/week, about 5 to 10 mg/week, about 10 to 15 mg/week, about 15 to 20 mg/week, about 20 to 25 mg/week, about 25 to 30 mg/week, about 30 to 35 mg/week, or about 35 to 40 mg/week. In one embodiment, an effective amount of methotrexate may be about 10 to 30 mg/week.
[0337] Cyclophosphamide, an alkylating agent, may be administered in dosages ranging from about 0.1 to 10 mg/kg body weight per day.
[0338] Cyclosporine (e.g. NEORAL.RTM.), also known as cyclosporin A, may be administered in dosages ranging from about 1 to 10 mg/kg body weight per day. Dosages ranging from about 2.5 to 4 mg/kg body weight per day may be used.
[0339] Chloroquine or hydroxychloroquine (e.g. PLAQUENIL.RTM.) may be administered in dosages ranging from about 100 to 1000 mg daily. Preferred dosages range from about 200 to 600 mg administered daily.
[0340] Sulfasalazine (e.g., AZULFIDINE EN-Tabs.RTM.) may be administered in amounts ranging from about 50 to 5000 mg per day, with a dosage of about 2000 to 3000 mg per day for adults. Dosages for children may be about 5 to 100 mg/kg of body weight, up to 2 grams per day.
[0341] Injectable gold salts may be prescribed in doses of about 5 to 100 mg every two to four weeks. Orally administered gold salts may be prescribed in doses ranging from about 1 to 10 mg per day.
[0342] D-penicillamine or penicillamine (CUPRIMINE.RTM.) may be administered in dosages of about 50 to 2000 mg per day, with dosages of about 125 mg per day up to 1500 mg per day.
[0343] Azathioprine may be administered in dosages of about 10 to 250 mg per day. For example, a dosage range of about 25 to 200 mg per day is acceptable.
[0344] Anakinra (e.g. KINERET.RTM.) is an interleukin-1 receptor antagonist. A possible dosage range for anakinra is about 10 to 250 mg per day. In one example, the dosage may be about 100 mg per day.
[0345] Infliximab (REMICADE.RTM.) is a chimeric monoclonal antibody that binds tumor necrosis factor alpha (TNF.alpha.) and inhibits the activity of TNF.alpha.. Infliximab may be administered in dosages of about 1 to 20 mg/kg body weight every four to eight weeks. Dosages of about 3 to 10 mg/kg body weight may be administered every four to eight weeks depending on the subject.
[0346] Etanercept (e.g. ENBREL.RTM.) is a dimeric fusion protein that binds the tumor necrosis factor (TNF) and blocks its interactions with TNF receptors. In one example, the dosage range of etanercept may be about 10 to 100 mg per week for adults. In another example, the dosage may be about 50 mg per week. Dosages for juvenile subjects may range from about 0.1 to 50 mg/kg body weight per week with a maximum of about 50 mg per week. For adult patients, etanercept may be administered e.g., injected, in 25 mg doses twice weekly e.g., 72-96 hours apart in time.
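Several of the dosages above are stated per kilogram of body weight, in some cases with an upper limit (for example, the juvenile etanercept range with a maximum of about 50 mg per week). The arithmetic can be sketched in Python as follows. The function name weight_based_dose_mg is hypothetical, and the 0.5 mg/kg rate used in the example is an arbitrary value chosen from inside the stated 0.1 to 50 mg/kg range purely to show the calculation; it is not a recommended dose.

```python
def weight_based_dose_mg(weight_kg, mg_per_kg, cap_mg=None):
    """Dose in mg for one dosing period: body weight times the per-kg rate.

    If cap_mg is given, the result is limited to that maximum.
    Illustrative helper only; the name and signature are assumptions.
    """
    if weight_kg <= 0 or mg_per_kg < 0:
        raise ValueError("weight must be positive and the per-kg rate non-negative")
    dose = weight_kg * mg_per_kg
    if cap_mg is not None:
        dose = min(dose, cap_mg)
    return dose


if __name__ == "__main__":
    # 20 kg at 0.5 mg/kg per week stays under the 50 mg weekly maximum.
    print(weight_based_dose_mg(20, 0.5, cap_mg=50.0))   # 10.0
    # 120 kg at 0.5 mg/kg per week would be 60 mg, so the cap applies.
    print(weight_based_dose_mg(120, 0.5, cap_mg=50.0))  # 50.0
```

The same helper covers the uncapped per-kilogram ranges by leaving cap_mg unset.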
[0347] Leflunomide (ARAVA.RTM.) may be administered at dosages from about 1 to 100 mg per day. In one embodiment, the dosage range is from about 10 to 20 mg per day.
[0348] It is contemplated that global administration of a therapeutic composition to a subject is not needed in order to achieve a highly localized effect. Localized administration of a therapeutic composition according to the invention is preferably by injection, catheter or by means of a drip device, drug pump or drug-saturated solid matrix, implanted at the target site, from which the composition can diffuse. When a tissue that is the target of treatment according to the invention is on a surface of an organism, topical administration of a pharmaceutical composition is possible. For example, antibiotics are commonly applied directly to surface wounds as an alternative to oral or intravenous administration, which methods necessitate a much higher absolute dosage in order to counter the effect of systemic dilution, resulting both in possible side-effects in otherwise unaffected tissues and in increased cost.
[0349] Systemic administration of a therapeutic composition according to the invention may be performed by methods of whole-body drug delivery well known in the art. These include, but are not limited to, intravenous drip or injection, subcutaneous, intramuscular, intraperitoneal, intracranial and spinal injection, ingestion via the oral route, inhalation, trans-epithelial diffusion (such as via a drug-impregnated, adhesive patch) or by the use of an implantable, time-release drug delivery device. Note that injection may be performed by conventional means.
[0350] Systemic administration is advantageous when a pharmaceutical composition must be delivered to a target tissue that is widely-dispersed, inaccessible to direct contact or, while accessible to topical or other localized application, is resident in an environment (such as the digestive tract) wherein the native activity of the nucleic acid or other agent might be compromised, e.g. by digestive enzymes or extremes of pH.
[0351] A novel therapeutic composition for use in the invention can be given in single or multiple doses. A multiple dose schedule is one in which a primary course of administration can include 1-10 or more separate doses, followed by other doses given at subsequent time intervals required to maintain and/or reinforce the level of the therapeutic agent. Such intervals are dependent on the continued need of the recipient for the therapeutic agent, and/or the half-life of a therapeutic agent. The efficacy of administration may be assayed by monitoring the reduction in the levels of a symptom indicative of or associated with the disease which it is designed to inhibit. The assays can be performed as described herein or according to methods known to one skilled in the art.
[0352] A therapeutically effective regimen may be sufficient to arrest or otherwise ameliorate symptoms of a disease. An effective dosage regimen requires providing the regulatory drug over a period of time to achieve noticeable therapeutic effects wherein symptoms are reduced to a clinically acceptable standard or ameliorated. The symptoms are specific for the disease in question. For example, when the disease is associated with tumor formation, the claimed invention is successful when tumor growth is arrested, or tumor mass is decreased by at least 50% and preferably 75%.
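The tumor mass criterion stated above (a decrease of at least 50%, and preferably 75%) reduces to a percentage calculation against a baseline measurement. A minimal Python sketch follows; the function names and the choice of a single baseline and a single follow-up measurement are assumptions made for illustration, and the sketch does not address how arrest of tumor growth would be assessed over time.

```python
def percent_reduction(baseline_mass, current_mass):
    """Percentage decrease in tumor mass relative to baseline (negative if it grew)."""
    if baseline_mass <= 0:
        raise ValueError("baseline mass must be positive")
    return 100.0 * (baseline_mass - current_mass) / baseline_mass


def meets_mass_criterion(baseline_mass, current_mass):
    """Return (at_least_50, at_least_75) for the thresholds named in the text."""
    reduction = percent_reduction(baseline_mass, current_mass)
    return reduction >= 50.0, reduction >= 75.0


if __name__ == "__main__":
    print(percent_reduction(10.0, 5.0))    # 50.0
    print(meets_mass_criterion(8.0, 2.0))  # (True, True)
    print(meets_mass_criterion(10.0, 6.0)) # (False, False)
```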
IX. Pharmaceutical Compositions
[0353] In another aspect, the invention provides for novel pharmaceutical compositions comprising one or more non-covalent DNA binding agents, alone or in combination with other anticancer or anti-inflammatory agents, or a pharmaceutically acceptable ester, salt, or prodrug thereof, together with a pharmaceutically acceptable carrier. This invention provides for a pharmaceutical composition comprising one or more non-covalent DNA binding agents, alone, as the only active agent(s), or in combination with one or more therapeutic active agents, e.g., a chemotherapeutic agent.
[0354] Non-covalent DNA binding agents of the invention can be administered as pharmaceutical compositions by any conventional route, in particular parenterally such as intravenously or by subcutaneous or intramuscular injections; enterally, e.g., orally, e.g., in the form of tablets or capsules, topically, e.g., in the form of lotions, gels, ointments or creams, or in a nasal or suppository form for localized delivery. Pharmaceutical compositions comprising a non-covalent DNA binding agent of the present invention in free form or in a pharmaceutically acceptable salt form in association with at least one pharmaceutically acceptable carrier or diluent can be manufactured in a conventional manner by mixing, granulating or coating methods. For example, oral compositions can be tablets or gelatin capsules comprising the active ingredient together with a) diluents, e.g., lactose, dextrose, sucrose, mannitol, sorbitol, cellulose and/or glycine; b) lubricants, e.g., silica, talcum, stearic acid, its magnesium or calcium salt and/or polyethyleneglycol; for tablets also c) binders, e.g., magnesium aluminum silicate, starch paste, gelatin, tragacanth, methylcellulose, sodium carboxymethylcellulose and or polyvinylpyrrolidone; if desired d) disintegrants, e.g., starches, agar, alginic acid or its sodium salt, or effervescent mixtures; and/or e) absorbents, colorants, flavors and sweeteners. Injectable compositions can be aqueous isotonic solutions or suspensions, and suppositories can be prepared from fatty emulsions or suspensions. The compositions may be sterilized and/or contain adjuvants, such as preserving, stabilizing, wetting or emulsifying agents, solution promoters, salts for regulating the osmotic pressure and/or buffers. In addition, they may also contain other therapeutically valuable substances. Suitable formulations for transdermal applications include an effective amount of a compound of the present invention with a carrier. 
A carrier can include absorbable pharmacologically acceptable solvents to assist passage through the skin of the host. For example, transdermal devices are in the form of a bandage comprising a backing member, a reservoir containing the compound optionally with carriers, optionally a rate controlling barrier to deliver the compound to the skin of the host at a controlled and predetermined rate over a prolonged period of time, and means to secure the device to the skin. Matrix transdermal formulations may also be used. Suitable formulations for topical application, e.g., to the skin and eyes, are preferably aqueous solutions, ointments, creams or gels well-known in the art. Such may contain solubilizers, stabilizers, tonicity enhancing agents, buffers and preservatives.
[0355] One or more non-covalent DNA binding agents of the invention can be administered in therapeutically effective amounts, alone, as the only active agent(s) or in combination with one or more therapeutic active agents (pharmaceutical combinations), resulting in novel compositions. For example, synergistic effects can occur with other anti-proliferative, anti-cancer, immunomodulatory or anti-inflammatory substances. Where the compounds of the invention are administered in conjunction with other therapies, dosages of the co-administered compounds will of course vary depending on the type of co-drug employed, on the specific drug employed, on the condition being treated and so forth.
[0356] The present invention encompasses pharmaceutically acceptable topical formulations of inventive compounds. The term "pharmaceutically acceptable topical formulation," as used herein, means any formulation which is pharmaceutically acceptable for intradermal administration of a compound of the invention by application of the formulation to the epidermis. In certain embodiments of the invention, the topical formulation comprises a carrier system. Pharmaceutically effective carriers include, but are not limited to, solvents (e.g., alcohols, poly alcohols, water), creams, lotions, ointments, oils, plasters, liposomes, powders, emulsions, microemulsions, and buffered solutions (e.g., hypotonic or buffered saline) or any other carrier known in the art for topically administering pharmaceuticals. A more complete listing of art-known carriers is provided by reference texts that are standard in the art, for example, Remington's Pharmaceutical Sciences, 16th Edition, 1980 and 17th Edition, 1985, both published by Mack Publishing Company, Easton, Pa., the disclosures of which are incorporated herein by reference in their entireties. In certain other embodiments, the topical formulations of the invention may comprise excipients. Any pharmaceutically acceptable excipient known in the art may be used to prepare the inventive pharmaceutically acceptable topical formulations. Examples of excipients that can be included in the topical formulations of the invention include, but are not limited to, preservatives, antioxidants, moisturizers, emollients, buffering agents, solubilizing agents, other penetration agents, skin protectants, surfactants, and propellants, and/or additional therapeutic agents used in combination with the inventive compound. Suitable preservatives include, but are not limited to, alcohols, quaternary amines, organic acids, parabens, and phenols. 
Suitable antioxidants include, but are not limited to, ascorbic acid and its esters, sodium bisulfite, butylated hydroxytoluene, butylated hydroxyanisole, tocopherols, and chelating agents like EDTA and citric acid. Suitable moisturizers include, but are not limited to, glycerine, sorbitol, polyethylene glycols, urea, and propylene glycol. Suitable buffering agents for use with the invention include, but are not limited to, citric, hydrochloric, and lactic acid buffers. Suitable solubilizing agents include, but are not limited to, quaternary ammonium chlorides, cyclodextrins, benzyl benzoate, lecithin, and polysorbates. Suitable skin protectants that can be used in the topical formulations of the invention include, but are not limited to, vitamin E oil, allantoin, dimethicone, glycerin, petrolatum, and zinc oxide.
[0357] In certain embodiments, the pharmaceutically acceptable topical formulations of the invention comprise at least a compound of the invention and a penetration enhancing agent. The choice of topical formulation will depend on several factors, including the condition to be treated, the physicochemical characteristics of the inventive compound and other excipients present, their stability in the formulation, available manufacturing equipment, and cost constraints. As used herein, the term "penetration enhancing agent" means an agent capable of transporting a pharmacologically active compound through the stratum corneum and into the epidermis or dermis, preferably with little or no systemic absorption. A wide variety of compounds have been evaluated as to their effectiveness in enhancing the rate of penetration of drugs through the skin. See, for example, Percutaneous Penetration Enhancers, Maibach H. I. and Smith H. E. (eds.), CRC Press, Inc., Boca Raton, Fla. (1995), which surveys the use and testing of various skin penetration enhancers, and Buyuktimkin et al., Chemical Means of Transdermal Drug Permeation Enhancement in Transdermal and Topical Drug Delivery Systems, Ghosh T. K., Pfister W. R., Yum S. I. (Eds.), Interpharm Press Inc., Buffalo Grove, Ill. (1997). In certain exemplary embodiments, penetration agents for use with the invention include, but are not limited to, triglycerides (e.g., soybean oil), aloe compositions (e.g., aloe-vera gel), ethyl alcohol, isopropyl alcohol, octylphenylpolyethylene glycol, oleic acid, polyethylene glycol 400, propylene glycol, N-decylmethylsulfoxide, fatty acid esters (e.g., isopropyl myristate, methyl laurate, glycerol monooleate, and propylene glycol monooleate) and N-methyl pyrrolidine.
[0358] In certain embodiments, the compositions may be in the form of ointments, pastes, creams, lotions, gels, powders, solutions, sprays, inhalants or patches. In certain exemplary embodiments, formulations of the compositions according to the invention are creams, which may further contain saturated or unsaturated fatty acids such as stearic acid, palmitic acid, oleic acid, palmito-oleic acid, cetyl or oleyl alcohols, stearic acid being particularly preferred. Creams of the invention may also contain a non-ionic surfactant, for example, polyoxy-40-stearate. In certain embodiments, the active component is admixed under sterile conditions with a pharmaceutically acceptable carrier and any needed preservatives or buffers as may be required. Ophthalmic formulation, eardrops, and eye drops are also contemplated as being within the scope of this invention. Additionally, the present invention contemplates the use of transdermal patches, which have the added advantage of providing controlled delivery of a compound to the body. Such dosage forms are made by dissolving or dispensing the compound in the proper medium. As discussed above, penetration enhancing agents can also be used to increase the flux of the compound across the skin. The rate can be controlled by either providing a rate controlling membrane or by dispersing the compound in a polymer matrix or gel.
[0359] The pharmaceutical compositions of the present invention comprise a therapeutically effective amount of a compound of the present invention formulated together with one or more pharmaceutically acceptable carriers. As used herein, the term "pharmaceutically acceptable carrier" means a non-toxic, inert solid, semi-solid or liquid filler, diluent, encapsulating material or formulation auxiliary of any type. The pharmaceutical compositions of this invention can be administered to humans and other animals orally, rectally, parenterally, intracisternally, intravaginally, intraperitoneally, topically (as by powders, ointments, or drops), buccally, or as an oral or nasal spray.
[0360] Liquid dosage forms for oral administration include pharmaceutically acceptable emulsions, microemulsions, solutions, suspensions, syrups and elixirs. In addition to the active compounds, the liquid dosage forms may contain inert diluents commonly used in the art such as, for example, water or other solvents, solubilizing agents and emulsifiers such as ethyl alcohol, isopropyl alcohol, ethyl carbonate, ethyl acetate, benzyl alcohol, benzyl benzoate, propylene glycol, 1,3-butylene glycol, dimethylformamide, oils (in particular, cottonseed, groundnut, corn, germ, olive, castor, and sesame oils), glycerol, tetrahydrofurfuryl alcohol, polyethylene glycols and fatty acid esters of sorbitan, and mixtures thereof. Besides inert diluents, the oral compositions can also include adjuvants such as wetting agents, emulsifying and suspending agents, sweetening, flavoring, and perfuming agents.
[0361] Injectable preparations, for example, sterile injectable aqueous or oleaginous suspensions may be formulated according to the known art using suitable dispersing or wetting agents and suspending agents. The sterile injectable preparation may also be a sterile injectable solution, suspension or emulsion in a nontoxic parenterally acceptable diluent or solvent, for example, as a solution in 1,3-butanediol. Among the acceptable vehicles and solvents that may be employed are water, Ringer's solution, U.S.P. and isotonic sodium chloride solution. In addition, sterile, fixed oils are conventionally employed as a solvent or suspending medium. For this purpose any bland fixed oil can be employed including synthetic mono- or diglycerides. In addition, fatty acids such as oleic acid are used in the preparation of injectables.
[0362] In order to prolong the effect of a drug, it is often desirable to slow the absorption of the drug from subcutaneous or intramuscular injection. This may be accomplished by the use of a liquid suspension of crystalline or amorphous material with poor water solubility. The rate of absorption of the drug then depends upon its rate of dissolution which, in turn, may depend upon crystal size and crystalline form. Alternatively, delayed absorption of a parenterally administered drug form is accomplished by dissolving or suspending the drug in an oil vehicle.
[0363] Compositions for rectal or vaginal administration are preferably suppositories which can be prepared by mixing the compounds of this invention with suitable non-irritating excipients or carriers such as cocoa butter, polyethylene glycol or a suppository wax which are solid at ambient temperature but liquid at body temperature and therefore melt in the rectum or vaginal cavity and release the active compound.
[0364] Solid compositions of a similar type may also be employed as fillers in soft and hard-filled gelatin capsules using such excipients as lactose or milk sugar as well as high molecular weight polyethylene glycols and the like.
[0365] The active compounds can also be in micro-encapsulated form with one or more excipients as noted above. The solid dosage forms of tablets, dragees, capsules, pills, and granules can be prepared with coatings and shells such as enteric coatings, release controlling coatings and other coatings well known in the pharmaceutical formulating art. In such solid dosage forms the active compound may be admixed with at least one inert diluent such as sucrose, lactose or starch. Such dosage forms may also comprise, as is normal practice, additional substances other than inert diluents, e.g., tableting lubricants and other tableting aids such as magnesium stearate and microcrystalline cellulose. In the case of capsules, tablets and pills, the dosage forms may also comprise buffering agents.
[0366] Dosage forms for topical or transdermal administration of a compound of this invention include ointments, pastes, creams, lotions, gels, powders, solutions, sprays, inhalants or patches. The active component is admixed under sterile conditions with a pharmaceutically acceptable carrier and any needed preservatives or buffers as may be required. Ophthalmic formulation, ear drops, eye ointments, powders and solutions are also contemplated as being within the scope of this invention.
[0367] The ointments, pastes, creams and gels may contain, in addition to an active compound of this invention, excipients such as animal and vegetable fats, oils, waxes, paraffins, starch, tragacanth, cellulose derivatives, polyethylene glycols, silicones, bentonites, silicic acid, talc and zinc oxide, or mixtures thereof.
[0368] Powders and sprays can contain, in addition to the compounds of this invention, excipients such as lactose, talc, silicic acid, aluminum hydroxide, calcium silicates and polyamide powder, or mixtures of these substances. Sprays can additionally contain customary propellants such as chlorofluorohydrocarbons.
[0369] Transdermal patches have the added advantage of providing controlled delivery of a compound to the body. Such dosage forms can be made by dissolving or dispensing the compound in the proper medium. Absorption enhancers can also be used to increase the flux of the compound across the skin. The rate can be controlled by either providing a rate controlling membrane or by dispersing the compound in a polymer matrix or gel.
[0370] According to the descriptions of novel compositions and methods of treatment of the present invention, disorders are treated or prevented in a subject, such as a human or other animal, by administering to the subject a therapeutically effective amount of one or more non-covalent DNA binding agents, either alone as the only active agent(s) or in combination with one or more other active agents, in such amounts and for such time as is necessary to achieve the desired result. The term "therapeutically effective amount" of a compound of the invention, as used herein, means a sufficient amount of the compound so as to decrease the symptoms of a disorder in a subject. As is well understood in the medical arts, a therapeutically effective amount of a compound of this invention will be at a reasonable benefit/risk ratio applicable to any medical treatment.
[0371] The invention also provides for novel compositions of pharmaceutical combinations, e.g. a kit, comprising an agent which is a compound of the invention as disclosed herein, in free form or in pharmaceutically acceptable salt form. The kit can comprise instructions for its administration to a subject suffering from or susceptible to a disease or disorder.
[0372] Some examples of materials which can serve as pharmaceutically acceptable carriers include, but are not limited to, ion exchangers; alumina; aluminum stearate; lecithin; serum proteins, such as human serum albumin; buffer substances such as phosphates, glycine, sorbic acid, or potassium sorbate; partial glyceride mixtures of saturated vegetable fatty acids; water; salts or electrolytes, such as protamine sulfate, disodium hydrogen phosphate, potassium hydrogen phosphate, sodium chloride, and zinc salts; colloidal silica; magnesium trisilicate; polyvinyl pyrrolidone; polyacrylates; waxes; polyethylene-polyoxypropylene-block polymers; wool fat; sugars such as lactose, glucose and sucrose; starches such as corn starch and potato starch; cellulose and its derivatives such as sodium carboxymethyl cellulose, ethyl cellulose and cellulose acetate; powdered tragacanth; malt; gelatin; talc; excipients such as cocoa butter and suppository waxes; oils such as peanut oil, cottonseed oil, safflower oil, sesame oil, olive oil, corn oil and soybean oil; glycols, such as propylene glycol or polyethylene glycol; esters such as ethyl oleate and ethyl laurate; agar; buffering agents such as magnesium hydroxide and aluminum hydroxide; alginic acid; pyrogen-free water; isotonic saline; Ringer's solution; ethyl alcohol; and phosphate buffer solutions. Other non-toxic compatible lubricants such as sodium lauryl sulfate and magnesium stearate, as well as coloring agents, releasing agents, coating agents, sweetening, flavoring and perfuming agents, preservatives and antioxidants, can also be present in the composition, according to the judgment of the formulator. The non-covalent DNA binding agent compounds (e.g., including those delineated herein), or pharmaceutical salts thereof, may be formulated into pharmaceutical compositions for administration to animals or humans.
These pharmaceutical compositions, which comprise an amount of the non-covalent DNA binding agent compounds effective to treat or prevent a non-covalent DNA cross-link mediated condition and a pharmaceutically acceptable carrier, are another embodiment of the present invention.
[0373] This invention also encompasses novel pharmaceutical compositions containing, and methods of treating disorders through administering, pharmaceutically acceptable prodrugs of one or more non-covalent DNA binding agents of the invention, either alone as the only active agent(s) or in combination with other available active agents. For example, non-covalent DNA binding agents of the invention having free amino, amido, hydroxy or carboxylic groups can be converted into prodrugs. Prodrugs include compounds wherein an amino acid residue, or a polypeptide chain of two or more (e.g., two, three or four) amino acid residues, is covalently joined through an amide or ester bond to a free amino, hydroxy or carboxylic acid group of compounds of the invention. The amino acid residues include but are not limited to the 20 naturally occurring amino acids commonly designated by three letter symbols and also include 4-hydroxyproline, hydroxylysine, desmosine, isodesmosine, 3-methylhistidine, norvaline, beta-alanine, gamma-aminobutyric acid, citrulline, homocysteine, homoserine, ornithine and methionine sulfone. Additional types of prodrugs are also encompassed. For instance, free carboxyl groups can be derivatized as amides or alkyl esters. Free hydroxy groups may be derivatized using groups including but not limited to hemisuccinates, phosphate esters, dimethylaminoacetates, and phosphoryloxymethyloxy carbonyls, as outlined in Advanced Drug Delivery Reviews, 1996, 19, 115. Carbamate prodrugs of hydroxy and amino groups are also included, as are carbonate prodrugs, sulfonate esters and sulfate esters of hydroxy groups. Derivatization of hydroxy groups as (acyloxy)methyl and (acyloxy)ethyl ethers, wherein the acyl group may be an alkyl ester, optionally substituted with groups including but not limited to ether, amine and carboxylic acid functionalities, or where the acyl group is an amino acid ester as described above, is also encompassed. Prodrugs of this type are described in J. Med. Chem. 1996, 39, 10. Free amines can also be derivatized as amides, sulfonamides or phosphonamides. All of these prodrug moieties may incorporate groups including but not limited to ether, amine and carboxylic acid functionalities.
[0374] Combinations of substituents and variables envisioned by this invention are only those that result in the formation of stable compounds. The term "stable", as used herein, refers to compounds which possess stability sufficient to allow manufacture and which maintain the integrity of the compound for a sufficient period of time to be useful for the purposes detailed herein (e.g., therapeutic or prophylactic administration to a subject).
[0375] The terms "isolated," "purified," or "biologically pure" refer to material that is substantially or essentially free from components that normally accompany it as found in its native state. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. Particularly, in embodiments of the invention the compound is at least 85% pure, more preferably at least 90% pure, more preferably at least 95% pure, and most preferably at least 99% pure.
X. Kits or Pharmaceutical Systems
[0376] The novel compositions described in this application may be assembled into kits or pharmaceutical systems for use in disease treatment, e.g., cancer treatment or treatment of an inflammatory disease. Kits or pharmaceutical systems according to this aspect of the invention comprise a carrier means, such as a box, carton, tube or the like, having in close confinement therein one or more container means, such as vials, tubes, ampules, bottles and the like. The kits or pharmaceutical systems of the invention may also comprise associated instructions for using one or more non-covalent DNA binding agents of the invention, alone, as the only active agent(s) or in combination with other active agents. The non-covalent DNA binding agents of the kits or pharmaceutical systems of the invention may have any one of the functional properties described for the non-covalent DNA binding agents of the methods of the invention.
[0377] In certain embodiments, the kits of the invention include a test for determining if a subject has a mutation in a particular gene of interest.
XI. Uses
[0378] The methods of the invention can be used to treat a subject with a disease, e.g., cancer and/or inflammatory disease.
XII. Animal Models
[0379] The invention provides for animal models for various diseases, including but not limited to cancer.
[0380] Additional animal models known in the art are also useful according to the invention, such as those models for inflammatory disorders such as rheumatoid arthritis, psoriasis, Crohn's disease and ulcerative colitis.
[0381] A. Rheumatoid Arthritis:
[0382] Animal models for rheumatoid arthritis include but are not limited to collagen-induced arthritis in mouse and rat, collagen antibody-induced arthritis in mouse, spontaneous rheumatoid arthritis in K/BxN mice, arthritis induced by adoptive transfer of serum from K/BxN mice, and spontaneous arthritis in TNF-α transgenic mice.
[0383] B. Multiple Sclerosis:
[0384] Animal models for multiple sclerosis include but are not limited to experimental autoimmune encephalomyelitis in mouse and rat induced by injection of myelin oligodendrocyte glycoprotein and experimental autoimmune encephalomyelitis in mouse and rat induced by injection of proteolipid protein.
[0385] C. Inflammatory Bowel Disease (Crohn's Disease):
[0386] Animal models for Crohn's disease include but are not limited to dextran sodium sulfate-induced colitis in mouse and rat and colitis induced by adoptive transfer of CD4+CD45RBhigh cells into SCID mice.
[0387] D. Inflammatory Bowel Disease (Ulcerative Colitis):
[0388] An animal model for ulcerative colitis includes but is not limited to trinitrobenzene sulfonic acid induced colitis in mouse and rat.
[0389] E. Type I Diabetes: Spontaneous Type I Diabetes
[0390] An animal model for Type I Diabetes includes but is not limited to BB/Wor rat or NOD mice.
[0391] F. Graft Versus Host Disease:
[0392] An animal model for graft versus host disease includes but is not limited to transfer of allogeneic donor lymphocytes and stem cells into irradiated host mice and transfer of allogeneic donor lymphocytes and stem cells into immune-competent host mice.
[0393] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
EXAMPLES
[0394] Having now generally described the invention, the same will be more readily understood through reference to the following Examples which are provided by way of illustration, and are not intended to be limiting of the present invention, unless specified.
[0395] The following examples are put forth for illustrative purposes only and are not intended to limit the scope of what the inventors regard as their invention.
Example 1
Potency of Non-Covalent DNA Binding Agents in MMR-Proficient Tumor Cells-Pharmacological Profile
[0396] Three novel non-covalent DNA binding agents of the invention, NSC 718813, NSC 723734 and NSC 726260, are tested in five different EGFR-resistant, K-Ras mutant cancer cell lines. These cell lines represent colorectal cancer (SW480, SW620 and HCT116) and breast cancer (MDA-MB-231 and MDA-MB-468). The growth inhibitory effects of the novel non-covalent DNA binding agents in EGFR-resistant, mutant K-Ras cancer cells are compared to those observed in tumor cells that do not express EGFR and/or carry the wild-type K-Ras gene (U2OS), or that have normal EGFR expression or wild-type K-Ras (SW403). The tumor cell lineages and their respective mutations in the EGF receptor and/or its signaling cascade genes are shown in Table 2.
[0397] In Vitro Cancer Screening Methods
[0398] The anticancer potential of non-covalent DNA binding agents is evaluated in vitro using one or more of the assays described below.
[0399] Sulforhodamine B (SRB) Uptake Assay:
[0400] The human tumor cell lines of the cancer screening panel are grown in RPMI 1640 medium containing 5% fetal bovine serum and 2 mM L-glutamine. For a typical screening experiment, cells are inoculated into 96-well microtiter plates in 100 µL at plating densities ranging from 5,000 to 40,000 cells/well depending on the doubling time of individual cell lines. After cell inoculation, the microtiter plates are incubated at 37 °C, 5% CO2, 95% air and 100% relative humidity for 24 h prior to addition of experimental drugs.
[0401] After 24 h, two plates of each cell line are fixed in situ with trichloroacetic acid (TCA), to represent a measurement of the cell population for each cell line at the time of drug addition (Tz). Experimental drugs are solubilized in dimethyl sulfoxide at 400-fold the desired final maximum test concentration and stored frozen prior to use. At the time of drug addition, an aliquot of frozen concentrate is thawed and diluted to twice the desired final maximum test concentration with complete medium containing 50 µg/mL gentamicin. Four additional 10-fold or 1/2-log serial dilutions are made to provide a total of five drug concentrations plus control. Aliquots of 100 µL of these different drug dilutions are added to the appropriate microtiter wells already containing 100 µL of medium, resulting in the required final drug concentrations.
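The dilution scheme in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration: the function names and the example maximum concentration of 100 (in arbitrary units) are assumptions and do not come from this specification.

```python
import math

def final_concentrations(max_final, step="log10", n=5):
    """Return n final well concentrations, highest first.

    max_final: desired final maximum test concentration (any unit).
    step: "log10" for 10-fold dilutions, "half_log" for 1/2-log dilutions.
    """
    factor = 10.0 if step == "log10" else math.sqrt(10.0)
    return [max_final / factor**i for i in range(n)]

def stock_and_working(max_final):
    """Concentrations of the frozen DMSO stock and of the working solution.

    The stock is 400x the final maximum. The working solution is 2x,
    because 100 uL of it is added to 100 uL of medium already in the well.
    """
    return {"dmso_stock": 400 * max_final, "working_2x": 2 * max_final}

# Hypothetical maximum concentration of 100 (arbitrary units).
series = final_concentrations(100.0, step="log10")
half_log_series = final_concentrations(100.0, step="half_log")
print(series)
print(half_log_series)
print(stock_and_working(100.0))
```

Because the stock is prepared at 400-fold the final maximum, the highest-concentration wells contain 1/400 (0.25% v/v) dimethyl sulfoxide.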
[0402] Following drug addition, the plates are incubated for an additional 48 h at 37 °C, 5% CO2, 95% air, and 100% relative humidity. For adherent cells, the assay is terminated by the addition of cold TCA. Cells are fixed in situ by the gentle addition of 50 µL of cold 50% (w/v) TCA (final concentration, 10% TCA) and incubated for 60 minutes at 4 °C. The supernatant is discarded, and the plates are washed five times with tap water and air dried. Sulforhodamine B (SRB) solution (100 µL) at 0.4% (w/v) in 1% acetic acid is added to each well, and plates are incubated for 10 minutes at room temperature. After staining, unbound dye is removed by washing five times with 1% acetic acid and the plates are air dried. Bound stain is subsequently solubilized with 10 mM Trizma base, and the absorbance is read on an automated plate reader at a wavelength of 515 nm. For suspension cells, the methodology is the same except that the assay is terminated by fixing settled cells at the bottom of the wells by gently adding 50 µL of 80% TCA (final concentration, 16% TCA). Using the seven absorbance measurements [time zero (Tz), control growth (C), and test growth in the presence of drug at the five concentration levels (Ti)], the percentage growth is calculated at each of the drug concentration levels. Percentage growth inhibition is calculated as:
[(Ti - Tz)/(C - Tz)] × 100 for concentrations for which Ti ≥ Tz
[(Ti - Tz)/Tz] × 100 for concentrations for which Ti < Tz.
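The two-branch calculation above can be expressed as a short Python function. This is a minimal sketch; the function name and the absorbance values in the usage lines are hypothetical and chosen only to exercise both branches.

```python
def percent_growth(ti, tz, c):
    """Percentage growth from SRB absorbance readings.

    ti: test-well absorbance after drug incubation (Ti)
    tz: absorbance at the time of drug addition (Tz)
    c:  control-well absorbance after incubation without drug (C)
    """
    if ti >= tz:
        # Net growth or stasis: scale against the net growth of the control.
        return (ti - tz) / (c - tz) * 100.0
    # Net cell loss: scale against the starting population.
    return (ti - tz) / tz * 100.0

# Hypothetical absorbances with Tz = 0.5 and C = 1.5.
half_growth = percent_growth(1.0, 0.5, 1.5)
stasis = percent_growth(0.5, 0.5, 1.5)
half_loss = percent_growth(0.25, 0.5, 1.5)
print(half_growth, stasis, half_loss)
```

A positive value indicates growth relative to the control, zero indicates no net growth, and a negative value indicates a net loss of cells relative to time zero.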
[0403] Three dose response parameters are calculated for each experimental agent.
[0404] Growth inhibition of 50% (GI50) is calculated from [(Ti - Tz)/(C - Tz)] × 100 = 50, which is the drug concentration resulting in a 50% reduction in the net protein increase (as measured by SRB staining) in control cells during the drug incubation.
[0405] The drug concentration resulting in total growth inhibition (TGI) is calculated from Ti=Tz.
[0406] The LC50 (concentration of drug resulting in a 50% reduction in the measured protein at the end of the drug treatment as compared to that at the beginning), indicating a net loss of cells following treatment, is calculated from [(Ti - Tz)/Tz] × 100 = -50.
[0407] Values are calculated for each of these three parameters if the level of activity is reached. However, if the effect is not reached or is exceeded, the value for that parameter is expressed as greater or less than the maximum or minimum concentration tested.
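As an illustration of how the three dose response parameters might be read from a five-point curve, the following Python sketch locates the concentration at which percentage growth falls to 50 (GI50), 0 (TGI) and -50 (LC50). The specification does not state how the crossing point is located between tested concentrations; the log-linear interpolation used here, the function name and the example data are all assumptions.

```python
import math

def crossing(concs, growth, level):
    """Concentration at which percentage growth falls to `level`.

    concs:  tested concentrations in ascending order
    growth: percentage growth measured at each concentration
    Returns a float, or a string such as ">100" or "<0.01" when the
    level is not reached or is already exceeded within the tested range.
    Interpolation is linear in log10(concentration) (an assumption).
    """
    if growth[0] <= level:
        return f"<{concs[0]:g}"
    for i in range(1, len(concs)):
        if growth[i] <= level:
            x0, x1 = math.log10(concs[i - 1]), math.log10(concs[i])
            y0, y1 = growth[i - 1], growth[i]
            return 10 ** (x0 + (level - y0) * (x1 - x0) / (y1 - y0))
    return f">{concs[-1]:g}"

# Hypothetical five-point curve (concentrations in arbitrary units).
concs = [0.01, 0.1, 1.0, 10.0, 100.0]
growth = [95.0, 75.0, 25.0, -20.0, -40.0]

gi50 = crossing(concs, growth, 50.0)
tgi = crossing(concs, growth, 0.0)
lc50 = crossing(concs, growth, -50.0)
print(gi50, tgi, lc50)
```

In this hypothetical data set the growth never falls to -50, so the LC50 is reported as greater than the maximum concentration tested, in line with the convention described above.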
[0408] Alamar Blue Cell Survival Assay in Human Tumor Cells:
[0409] Tumor cells are plated in 96-well plates at a density of 8,000 to 10,000 cells per well in a 100 µL volume and grown overnight. On the second day, the cells are supplemented with medium containing an appropriate dilution of the compounds to be tested. The cells are treated with the test compounds for two more days; the growth medium is then replaced with fresh medium containing 3% Alamar Blue, the plates are incubated for 2-3 hours, and they are read in a SpectraMax Gemini XS fluorescence plate reader (Molecular Devices).
[0410] Alamar Blue Cell Survival Assay in Yeast Cells:
[0411] The cells are diluted 100-fold in yeast complete medium. 100 µL of diluted cells is seeded in 96-well plates with or without a non-covalent DNA binding compound and incubated for 24 hours at 30 °C. The following day, an equal volume of yeast complete medium containing 1% Alamar Blue is added and incubated at 30 °C for two hours. Fluorescence intensity is measured in a fluorescence reader to calculate the inhibitory effect of non-covalent DNA binding agents in mutant and wild-type yeast cells.
[0412] Half-Maximal Trypan Blue Exclusion Cytotoxic Concentration (CC50) Assay:
[0413] In this assay, non-specific cytotoxicity of various test compounds is determined based upon trypan blue exclusion. For the trypan blue dye exclusion assay, the cells are seeded at 10^5 cells per well in a 24-well plate and incubated overnight. The medium is replaced with fresh medium containing serial dilutions of a test compound which is diluted in DMSO. DMSO alone is used as a control. The maximum amount of DMSO in each well does not exceed 10%. The cells are incubated with compound for 48 hours, and the supernatant, which may contain dead cells, is collected. The attached cells are trypsinized and transferred to the supernatant. Cells that do not incorporate trypan blue dye are counted as viable using a hemocytometer. From the dose-response curve, the CC50 is determined.
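One way to obtain a CC50 from the viable cell counts is sketched below in Python. The specification does not describe the curve-fitting method, so the log-linear interpolation, the function names and the example counts are assumptions made for illustration only.

```python
import math

def viable_percent(viable_count, control_viable_count):
    """Trypan blue-excluding cells as a percentage of the DMSO-only control."""
    return 100.0 * viable_count / control_viable_count

def cc50(concs, viability):
    """Concentration giving 50% viability, by log-linear interpolation.

    concs:     ascending test concentrations
    viability: percent of control viability at each concentration
    Returns None when 50% is not bracketed by the tested range.
    """
    for i in range(1, len(concs)):
        hi, lo = viability[i - 1], viability[i]
        if hi > 50.0 >= lo:
            frac = (hi - 50.0) / (hi - lo)
            log_lo = math.log10(concs[i - 1])
            log_hi = math.log10(concs[i])
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    return None

# Hypothetical viable counts per well; the DMSO control has 200,000 cells.
concs = [1.0, 10.0, 100.0, 1000.0]
counts = [190000, 150000, 50000, 10000]
viability = [viable_percent(n, 200000) for n in counts]
estimate = cc50(concs, viability)
print(viability, estimate)
```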
[0414] siRNA Inhibition of MMR, p53, and REV Functions
[0415] siRNA specific for different genes is purchased from Dharmacon (Thermo Fisher Scientific Dharmacon Products, Lafayette, Colo. 80026) and the protocol recommended by the supplier is utilized. Confluent cells are trypsinized and 5000 cells are seeded in a well in the presence or absence of siRNA in 100 µL medium. The cells are incubated with siRNA for two days. A non-covalent DNA binding agent of the invention is added in a 10 µL volume and incubated for another 48 hours. After treatment with the agent, the medium is replaced with medium containing 1% Alamar Blue, and fluorescence is measured after two hours. The difference in fluorescence intensity shows the growth inhibition.
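The comparison of fluorescence intensities can be made concrete with a small Python sketch. The normalization to untreated wells, the optional background subtraction, the function name and the example readings are assumptions; the specification states only that the difference in fluorescence intensity reflects growth inhibition.

```python
def percent_inhibition(treated, control, blank=0.0):
    """Growth inhibition (%) from Alamar Blue fluorescence.

    treated: fluorescence of wells treated with the agent
    control: fluorescence of matching wells without the agent
    blank:   medium-only background (assumed; defaults to no correction)
    """
    return 100.0 * (1.0 - (treated - blank) / (control - blank))

# Hypothetical readings: effect of the agent without and with siRNA knockdown.
no_sirna = percent_inhibition(treated=8000.0, control=10000.0)
with_sirna = percent_inhibition(treated=3000.0, control=10000.0)
print(no_sirna, with_sirna)
```

A larger inhibition in the siRNA-treated wells than in the wells without siRNA would suggest that loss of the targeted gene function sensitizes the cells to the agent.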
[0416] Methods for Combination Experiments
[0417] Tumor cells are plated in 96 well plates at a density of 8,000 to 10,000 cells per well in a 100 uL volume and grown overnight. On the second day, the cells are supplemented with medium containing an appropriate dilution of the compounds to be tested, as follows. 100 uL of medium is added to every well. 50 uL of a 3.times. concentration of the novel non-covalent DNA binding agent is added to the top row (row A). After mixing, 50 uL is transferred to the next row (row B), and the 1/3 dilution is continued down to row F (six rows), leaving the seventh and eighth rows. 50 uL of a 3.times. concentration of the other compound in the combination is added to seven wells (A to G) in the leftmost column (column 1) and diluted (1/3 dilution) from left to right through column 6. This is repeated on the other half of the plate, columns 7 to 12. The cells are incubated with the combination of compounds for two more days, the growth medium is replaced with fresh medium containing 3% Alamar Blue, the plates are incubated for 2-3 hours, and the plates are read in a SpectraMax Gemini XS fluorescence plate reader (Molecular Devices). The mean of two wells is taken for calculation of the combination effect.
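The plate layout described above can be sketched as a grid of nominal relative concentrations. This is an illustration under stated assumptions: the top well of each series is taken as 1.0, each step is the stated 1/3 dilution, and the additional dilution that each compound causes to the other is ignored. The function name `nominal_grid` is hypothetical.

```python
ROWS = "ABCDEFGH"


def nominal_grid(agent_top=1.0, partner_top=1.0, dilution=3.0):
    """Return {well: (agent, partner)} for columns 1-6 of a 96 well plate.

    agent   -- DNA binding agent, diluted down rows A-F (rows G and H receive none)
    partner -- second compound, diluted across columns 1-6 in rows A-G
    Columns 7-12 would repeat the same layout.
    """
    grid = {}
    for r, row in enumerate(ROWS):
        for c in range(6):
            agent = agent_top / dilution ** r if r < 6 else 0.0
            partner = partner_top / dilution ** c if r < 7 else 0.0
            grid[f"{row}{c + 1}"] = (agent, partner)
    return grid


grid = nominal_grid()
print(grid["A1"], grid["G1"], grid["H6"])
```

Under these assumptions row G carries the second compound alone and row H carries neither compound, so row H can serve as the untreated reference.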
[0418] Results:
[0419] Novel non-covalent DNA binding agents have IC.sub.50 values ranging from 8 nM to 1075 nM in the tumor cell lines tested (Table 2). In tumor cells harboring mutations in genes in EGFR pathways, both K-RAS and K-RAS/BRAF with or without PTEN deficiency, the IC.sub.50 values for novel non-covalent DNA binding agents of the invention are similar to or better than those observed in tumor cells with the wild type K-RAS gene (U2OS).
[0420] The colon cancer cell line HCT 116, which has mutations in both K-RAS and the DNA mismatch repair gene MLH1, is more susceptible to non-covalent DNA binding agents of the invention than the colon cancer cells which have a K-RAS mutation only. The tumor cells which are deficient in PTEN are more sensitive to novel non-covalent DNA binding agents of the invention than are other mutated tumor cell lines. Among the three compounds tested, NSC 718813 and NSC 723734 have similar potency (<100 nM), while NSC 726260 is comparatively less potent, with IC.sub.50 values around 1 uM. These cellular potency estimates for novel non-covalent DNA binding agents of the invention, in tumor cells that have K-RAS mutations and/or PTEN or mismatch repair gene deficiencies, provide support for a novel approach to treating genetically-resistant cancers with such genetic mutations.
[0421] The results are presented in Figures 1-U2OS, 2-Colo205, 3-HeLa, 4-lymphoblastoid 4-CEM cells, 5-leukemia cells (CEM), 6-Jurkat Cells, 7-MDA-MB-468, 8-2E-H1299 cancer cells, 9A-SW403, 9B-SW403, 10A-SW620 and 10B-HCT116, and Table 2.
[0422] Novel Non-Covalent DNA Binding Agents of the Inventions are Effective in K-RAS Mutant Colon Cancer Cells:
TABLE-US-00002 TABLE 2
                                                                                        IC50 (nM)
Cell Line   Type of Cancer  Mutation (Gain of function)  Deficiency (Loss of function)  NSC 718813       NSC 723734       NSC 726260
U2OS        Osteosarcoma    WT                           Lack of EGFR                   202 .+-. 27.3    178 .+-. 40.9    397 .+-. 51.6
SW403       Colon           WT (EGFR Over Expression)    --                             210 .+-. 35.4    550 .+-. 141.4   1025 .+-. 106.1
SW620       Colon           KRAS                         --                             236 .+-. 37.2    175 .+-. 25.0    1050 .+-. 50.0
SW480       Colon           KRAS                         --                             48 .+-. 17.7     575 .+-. 35.4    1075 .+-. 35.0
HCT116      Colon           KRAS                         MLH1                           17 .+-. 2.5      160 .+-. 42.4    550 .+-. 167.5
MDA231      Breast          KRAS & BRAF (ERK+)           --                             54 .+-. 2.3      394 .+-. 17.0    501 .+-. 29.0
MDA-MB-468  Breast          ERK+ (EGFR over expression)  PTEN                           8 .+-. 1.2       22 .+-. 0.7      364 .+-. 54.8
CEM         Leukemia        --                           PTEN                           51 .+-. 0.6      49 .+-. 0.8      161 .+-. 0.4
Jurkat      Leukemia        --                           PTEN                           17 .+-. 0.2      45 .+-. 4.0      114 .+-. 26.7
WT: Wild Type tumor cell line
Example 2
Non-Covalent DNA Binding Agents Cause Double Strand Breaks
[0423] As evidenced by the sensitivity of yeast RAD52 mutants to the cytotoxicity of novel DNA binding agents, these agents cause double stranded breaks.
[0424] Yeast cells that carry mutations in different genes involved in homologous recombination (rad50, rad51, rad52, and rad57) and nucleotide excision/double strand repair (rad1) are grown to stationary culture overnight. Results are shown in Table 3.
TABLE-US-00003 TABLE 3
                 IC50 (uM)
Yeast mutation   PBD-A 718813   PBD-B 723734   PBD-C 723732   PBD-D 726260
rad1             11             15             R              15
rad50            90             17             R              20
rad51            7              28             100            4.5
rad52            90             50             105            15
rad57            ND             ND             ND             0.3
Wild type yeast  R              R              R              45
R = Resistant (No killing up to 250 uM)
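The relative sensitivity of the repair mutants can be expressed as the ratio of the wild type IC50 to the mutant IC50. The sketch below applies this to the NSC 726260 (PBD-D) column of Table 3, the only column with a measured wild type value. The term "fold sensitization" and the function name are introduced here for illustration and do not appear in the source.

```python
# IC50 values (uM) for NSC 726260 (PBD-D), taken from Table 3.
IC50_UM = {"rad1": 15.0, "rad50": 20.0, "rad51": 4.5, "rad52": 15.0, "rad57": 0.3}
WILD_TYPE_IC50_UM = 45.0


def fold_sensitization(mutant_ic50, wild_type_ic50=WILD_TYPE_IC50_UM):
    """Ratio of wild type IC50 to mutant IC50; values above 1 mean the mutant is more sensitive."""
    return wild_type_ic50 / mutant_ic50


folds = {mutant: round(fold_sensitization(ic50), 1) for mutant, ic50 in IC50_UM.items()}
for mutant, fold in sorted(folds.items(), key=lambda item: -item[1]):
    print(f"{mutant}: {fold}-fold")
```

For the other three compounds the wild type strain is resistant up to 250 uM, so only a lower bound on the ratio could be stated.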
Example 3
Half-Life of Non-Covalent DNA Binding Agents in Rats
[0425] Determination of Pharmacokinetics of Novel Non-Covalent DNA Binding Agents in Rats:
[0426] Intravenous and oral pharmacokinetic studies are conducted on the novel non-covalent DNA binding agents, NSC 718813, NSC 723734 and NSC 726260, in male Sprague-Dawley rats. The studies are conducted in a parallel design with two groups of four male rats each for intravenous and oral administration of the test agents. The protocols for the studies are approved by the appropriate institutional animal care and use committee.
[0427] Groups of rats designated to receive oral doses of the novel non-covalent DNA binding agents of the invention receive an oral dose of 20 mg/kg in a formulation vehicle comprised of N,N-dimethylacetamide (DMA), polyethylene glycol 400 (PEG400), ethanol, Cremophor EL and water (10:10:10:5:65 v/v). The dose volume for the oral doses of the test compounds is 8 mL/kg. Groups of rats are randomized to receive intravenous doses of agents. These rats receive a single intravenous bolus dose of 3 mg/kg of the test compound in a vehicle comprised of DMA:PEG400:ethanol:Cremophor EL:0.9% sodium chloride (saline) (10:10:10:5:65 v/v). The dose volume for intravenous doses of test agents is 1 mL/kg.
[0428] Predose blood samples are obtained from all rats in both the oral and intravenous dosing groups. For the intravenously dosed rats, blood samples (100 uL each) are obtained at 0.083, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 12.0 and 24.0 hours post-dose. For the oral dose groups, the sampling times are identical to those of the intravenous dose group, except that the 0.083 hour sample is not collected. Following collection, an equal volume of water is added to each blood sample to hemolyze it, and the samples are stored frozen at -70.degree. C. until bioanalysis.
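A sampling schedule of this kind is typically used to estimate exposure and half-life by noncompartmental analysis. The source does not state how its parameters were computed, so the sketch below is an assumption: it uses the linear trapezoidal rule for the area under the curve, a log-linear fit to the last three points for the terminal rate constant, and a hypothetical mono-exponential concentration profile in place of measured data.

```python
import math


def auc_linear_trapezoid(times, concs):
    """Area under the concentration-time curve between the first and last samples."""
    return sum((t2 - t1) * (c1 + c2) / 2.0
               for t1, t2, c1, c2 in zip(times, times[1:], concs, concs[1:]))


def terminal_rate_constant(times, concs, n_points=3):
    """Negative least-squares slope of ln(concentration) over the last n_points samples."""
    t = times[-n_points:]
    y = [math.log(c) for c in concs[-n_points:]]
    t_mean = sum(t) / len(t)
    y_mean = sum(y) / len(y)
    slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
             / sum((ti - t_mean) ** 2 for ti in t))
    return -slope


# Intravenous sampling times (h) from the protocol above.
IV_TIMES_H = [0.083, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 12.0, 24.0]

# Hypothetical profile: C(t) = C0 * exp(-K * t); C0 and K are illustrative only.
C0_NG_ML, K_PER_H = 1000.0, 0.3
concs = [C0_NG_ML * math.exp(-K_PER_H * t) for t in IV_TIMES_H]

kel = terminal_rate_constant(IV_TIMES_H, concs)
half_life_h = math.log(2) / kel
auc_0_t = auc_linear_trapezoid(IV_TIMES_H, concs)
auc_0_inf = auc_0_t + concs[-1] / kel  # extrapolate beyond the last sample

print(round(half_life_h, 2), round(auc_0_t, 1), round(auc_0_inf, 1))
```

This sketch starts the area at the first sample (0.083 h) and does not back-extrapolate to time zero, which a full analysis of an intravenous bolus would normally include.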
[0429] Plasma samples are analyzed for the concentration of the test non-covalent DNA binding agents of the invention using an HPLC method with mass spectrometric (MS/MS) detection, following a liquid:liquid extraction of the plasma samples using a dichloromethane:ethyl acetate (20:80) mixture. To a 100 .mu.L aliquot of sample, 50 .mu.L of an internal standard (NSC 723732) is added. After mixing the internal standard well, 2.5 mL of the extracting solvent (dichloromethane:ethyl acetate 20:80 v/v) is added. The mixture is vortexed for one minute and the samples are centrifuged at 3000 rpm for 3 minutes. Approximately 2 mL of the supernatant is taken from the centrifuged tubes and the sample is dried under a nitrogen stream at 50.degree. C. The residue is reconstituted with 100 .mu.L of the mobile phase and 20 .mu.L is injected into the HPLC system for analysis. The mobile phase is comprised of milli-Q water:acetonitrile:formic acid (20:80:0.05) adjusted to pH 7.5 with ammonia.
[0430] Liquid Chromatography Mass Spectrometric (LC/MS/MS) Conditions:
[0431] The analysis of the test agent concentration is conducted by an HPLC method using a Shimadzu Prominence HPLC system, and the eluent is analyzed using an API 4000 LC-MS/MS system (Applied Biosystems). The samples are analyzed on a HyPurity Advance, 50.times.4.6 mm, 5 .mu.m, Thermoelectron column. An injection volume of 20 .mu.L is used for the analytical sample and the flow rate of the mobile phase is 0.6 mL/minute. Mass spectrometric analysis is conducted on the eluent using the API 4000 LC-MS/MS system and the mass parameters are analyzed for MRM transitions using NSC 723732 as the internal standard, in a positive ionization mode at a temperature of 400.degree. C.
[0432] Pharmacokinetics of Novel Non-Covalent DNA Binding Agents, NSC 718813, NSC 723734 and NSC 726260 Following Intravenous and Oral Administration in Male Sprague-Dawley Rats:
[0433] The pharmacokinetics of NSC 718813, NSC723734 and NSC726260 are evaluated in the rat following intravenous and oral administration to evaluate the metabolic stability and clearance profile of these novel agents. Furthermore, the formulation properties of these agents are evaluated to assess their aqueous solubility and ability to administer formulations of these non-covalent DNA binding agents in vehicles similar to those used for various chemotherapeutic agents. Non-covalent DNA binding compounds have somewhat limited aqueous solubility, and require the addition of non-aqueous solvents such as polyethylene glycol 400, Cremophor and dimethylacetamide to allow intravenous administration of these agents in rats.
[0434] Pharmacokinetics of NSC 718813
[0435] NSC 718813 achieves excellent exposure in the blood following intravenous administration of a dose of 3 mg/kg. Concentrations well above its in vitro GI50 and/or TGI are achieved in rat blood for at least 4 hours following intravenous administration (see Table 4 and FIG. 11 below).
TABLE-US-00004 TABLE 4 Pharmacokinetic parameters (mean .+-. SD) of NSC 718813/1 in male Sprague Dawley rats following oral solution and intravenous bolus administration
Route             T.sub.max .sup.a     C.sub.max       AUC.sub.0-t    AUC.sub.0-inf  T.sub.1/2 .sup.b  CL.sub.blood   Vd.sub.ss     F (%) .sup.c
                  (h)                  (ng/mL)         (ng h/mL)      (ng h/mL)      (h)               (mL/min/kg)    (L/kg)
IV-bolus (N = 4)  0.08 (0.08-0.13)     5723 .+-. 1005  2376 .+-. 304  2424 .+-. 309  2.2 .+-. 0.4      20.0 .+-. 2.9  1.5 .+-. 0.4
Oral (N = 5)      0.5 (0.25-2.0)       112 .+-. 32     303 .+-. 129   345 .+-. 142   1.8 .+-. 0.6      NA .sup.d      NA            2.0
.sup.a median (range)
.sup.b harmonic mean
.sup.c F = [(AUC.sub.0-inf).sub.oral .times. dose.sub.iv]/[(AUC.sub.0-inf).sub.iv .times. dose.sub.oral], mean oral dose: 20.50 mg/kg; mean intravenous dose: 2.90 mg/kg
.sup.d not applicable
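The bioavailability formula in footnote c of Table 4 can be checked with the mean values reported in the table. The sketch below is illustrative: the function name is hypothetical, and it uses group means, whereas the reported value may have been derived from individual animals.

```python
def oral_bioavailability_percent(auc_oral, auc_iv, dose_oral, dose_iv):
    """F (%) = 100 * (AUC_oral * dose_iv) / (AUC_iv * dose_oral)."""
    return 100.0 * (auc_oral * dose_iv) / (auc_iv * dose_oral)


# Mean AUC(0-inf) in ng h/mL and mean doses in mg/kg for NSC 718813, from Table 4.
f_percent = oral_bioavailability_percent(
    auc_oral=345.0, auc_iv=2424.0, dose_oral=20.50, dose_iv=2.90
)
print(round(f_percent, 1))  # 2.0
```

The dose terms normalize the two exposures, so the ratio compares the area under the curve per unit dose for the oral and intravenous routes.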
[0436] These novel non-covalent DNA binding agents of the invention are designed to address the metabolic instability and rapid clearance of the naturally occurring antitumor antibiotics like anthramycin and neothramycin. As shown in Table 4, the systemic clearance of NSC 718813 is estimated to be approximately 20 mL/min/kg, which is significantly lower than the hepatic blood flow in the rat, showing that NSC 718813 has a low to moderate clearance following intravenous administration. NSC 718813 has better metabolic stability than its naturally occurring antitumor antibiotic analogs. NSC 718813 at an oral dose of 20 mg/kg has low but measurable blood levels for up to 8 hours post-dose (see FIG. 11) and has an estimated oral bioavailability of 2%. The poor oral bioavailability of NSC 718813, coupled with its low systemic clearance, suggests absorption-limited oral bioavailability, due to poor absorption across the gut wall and/or luminal or gastrointestinal mucosal pre-systemic elimination.
[0437] The pharmacokinetic profile and estimated parameters following intravenous and oral administration for NSC723734 are shown in FIG. 12 and Table 5, below.
TABLE-US-00005 TABLE 5 Pharmacokinetic parameters (mean .+-. SD) of NSC 723734 in male Sprague Dawley rats (N = 4) following oral solution and intravenous bolus administration
Route     T.sub.max .sup.a     C.sub.max      AUC.sub.0-t    AUC.sub.0-inf  T.sub.1/2 .sup.b  CL.sub.blood   Vd.sub.ss     F (%) .sup.c
          (h)                  (ng/mL)        (ng h/mL)      (ng h/mL)      (h)               (mL/min/kg)    (L/kg)
IV-bolus  0.083 (0.083-0.083)  4053 .+-. 472  4246 .+-. 311  4405 .+-. 330  6.3 .+-. 0.3      11.4 .+-. 0.5  3.2 .+-. 0.3  NA .sup.d
Oral      0.25 (0.25-0.25)     90.5 .+-. 56   196 .+-. 93    216 .+-. 84    2.3 .+-. 0.7      NA             NA            0.7
.sup.a median (range)
.sup.b harmonic mean
.sup.c F = [(AUC.sub.0-inf).sub.oral .times. dose.sub.iv]/[(AUC.sub.0-inf).sub.iv .times. dose.sub.oral], mean oral dose: 20.34 mg/kg; mean intravenous dose: 3.00 mg/kg
.sup.d not applicable
[0438] Following intravenous administration, NSC723734 shows a low clearance (11 mL/min/kg) which is about 20% of normal liver blood flow in rat (55 mL/min/kg). The compound is well distributed with a mean volume of distribution (3 L/kg) that is about 4 times the total body water. The compound is eliminated with a mean (harmonic) elimination T.sub.1/2 of 6.3 h. The mean intravenous C.sub.max is 4053 ng/mL and the mean overall intravenous exposure (AUC.sub.0-inf) is 4405 ng h/mL. After oral dosing, NSC723734 shows a median T.sub.max of 0.25 h, indicating that the compound undergoes rapid absorption. The mean oral C.sub.max is 91 ng/mL, and the mean overall exposure (AUC.sub.0-inf) is 216 ng h/mL. The oral absolute bioavailability of NSC723734 in rats is estimated to be approximately 1%. Because the overall blood clearance of the compound in the rat is low, it is unlikely that the low bioavailability of the compound results from a significant first-pass effect. It is possible that low solubility or membrane permeability may determine the oral bioavailability.
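The comparison of clearance with liver blood flow can be reproduced from the reported mean values. The sketch below assumes the standard noncompartmental relation CL = dose/AUC.sub.0-inf, which the source does not state explicitly; the function name is hypothetical, and the inputs are the mean intravenous dose (3.00 mg/kg) and mean AUC.sub.0-inf (4405 ng h/mL) reported for NSC723734.

```python
HEPATIC_BLOOD_FLOW_RAT_ML_MIN_KG = 55.0  # value quoted in the text


def blood_clearance_ml_min_kg(dose_mg_per_kg, auc_inf_ng_h_per_ml):
    """Clearance as dose / AUC(0-inf), converted to mL/min/kg."""
    dose_ng_per_kg = dose_mg_per_kg * 1e6       # mg -> ng
    cl_ml_per_h_per_kg = dose_ng_per_kg / auc_inf_ng_h_per_ml
    return cl_ml_per_h_per_kg / 60.0            # per hour -> per minute


cl = blood_clearance_ml_min_kg(dose_mg_per_kg=3.00, auc_inf_ng_h_per_ml=4405.0)
fraction_of_liver_flow = cl / HEPATIC_BLOOD_FLOW_RAT_ML_MIN_KG
print(round(cl, 1), round(fraction_of_liver_flow, 2))
```

The result from group means is close to the tabulated clearance, and the ratio to liver blood flow is consistent with the figure of about 20% given in the text.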
[0439] Pharmacokinetics of NSC 726260
[0440] The pharmacokinetic profile and estimated parameters following intravenous and oral administration for NSC726260 are shown in FIG. 13 and Table 6, below.
TABLE-US-00006 TABLE 6 Pharmacokinetic parameters (mean .+-. SD) of NSC726260 in male Sprague-Dawley rats (N = 4) following oral solution and intravenous bolus administration
Route     T.sub.max .sup.a     C.sub.max       AUC.sub.0-t    AUC.sub.0-inf  T.sub.1/2 .sup.b  CL.sub.blood   Vd.sub.ss     F (%) .sup.c
          (h)                  (ng/mL)         (ng h/mL)      (ng h/mL)      (h)               (mL/min/kg)    (L/kg)
IV-bolus  0.083 (0.083-0.083)  5587 .+-. 1195  5058 .+-. 874  5112 .+-. 871  4.8 .+-. 0.5      10.4 .+-. 2.0  1.9 .+-. 0.7  NA .sup.d
Oral      4.0 (4.0-4.0)        438 .+-. 146    2474 .+-. 844  2536 .+-. 896  4.6 .+-. 1.7      NA             NA            7.9
.sup.a median (range)
.sup.b harmonic mean
.sup.c F = [(AUC.sub.0-inf).sub.oral .times. dose.sub.iv]/[(AUC.sub.0-inf).sub.iv .times. dose.sub.oral], mean oral dose: 19.55 mg/kg; mean intravenous dose: 3.12 mg/kg
.sup.d not applicable
[0441] Following intravenous administration, NSC726260 shows a low clearance (10.4 mL/min/kg) which is about 20% of normal liver blood flow in rat (55 mL/min/kg). The compound is well distributed with a mean volume of distribution (1.9 L/kg) that is about 3 times the total body water. The compound is eliminated with a mean (harmonic) elimination T.sub.1/2 of 4.8 h. The mean intravenous C.sub.max is 5587 ng/mL and the mean overall intravenous exposure (AUC.sub.0-inf) is 5112 ng h/mL. After oral dosing, NSC 726260 shows a median T.sub.max of 4.0 h, indicating that the compound undergoes sustained absorption. The mean oral C.sub.max is 438 ng/mL, and the mean overall exposure (AUC.sub.0-inf) is 2536 ng h/mL.
[0442] The oral absolute bioavailability of NSC726260 in rats is estimated to be approximately 8%. Because the overall blood clearance of the compound in the rat is low, it is unlikely that the low bioavailability of the compound results from a significant first-pass effect. It is possible that low solubility or membrane permeability may determine the oral bioavailability.
Example 4
siRNA Inhibition of MMR, p53, and REV FUNCTIONS
[0443] siRNA specific for different genes is purchased from Dharmacon (Thermo Fisher Scientific Dharmacon Products, Lafayette, Colo. 80026) and the protocol recommended by the supplier is utilized. Confluent cells are trypsinized and 5000 cells are seeded in a well in the presence or absence of siRNA in 100 .mu.L medium. The cells are incubated with siRNA for two days. A non-covalent DNA binding agent of the invention is added in a 10 .mu.L volume and incubated for another 48 hours. After treatment with the agent, the medium is replaced with 1% alamar blue containing medium to measure fluorescence after two hours. The difference in fluorescence intensity shows the growth inhibition. The results are presented in FIGS. 14-18 and Table 7.
TABLE-US-00007 TABLE 7
Column groups: IC50 (uM) by siRNA knock out (Control, p53, rev, mlh1, msh2); fold improvement in IC50 (p53, rev, mlh1, msh2)
Cell line         Compound      IC50 (uM)                      Fold improvement in IC50
U2OS Wild type    NSC 718813    0.30  0.03  0.06  0.1          10  5  3.0
                  NSC 723734    0.07  0.06  0.001  0.015       1.2  >70  3.5
                  NSC 726260    0.4  0.35  0.003  0.003        1.1  135  135
                  Doxorubicin   0.7  >1 uM  >2 uM  >3 uM       0.7  0.35  0.23
H1299 (p53-)      NSC 718813    0.6  --  --  0.5  0.35         1.3  1.9
                  NSC 723734    0.9  --  --  0.45  0.35        2.0  2.6
HCT116 (mlh-)     NSC 718813    0.1  0.04  0.07  --  --        12.5  7.1
                  NSC 723734    0.3  0.18  0.18  --  --        2.2  2.2
                  NSC 726260    0.75  0.2  0.15  --  --        3.8  5.0
                  Camptothecin  0.25  0.2  0.15  --  --        1.3  1.7
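The fold improvement values in Table 7 appear to be the ratio of the control IC50 to the IC50 after siRNA knock out. The source does not state this formula, so it is an inference; the sketch below checks it against the NSC 718813 row for U2OS cells only, and the function name is hypothetical.

```python
def fold_improvement(ic50_control, ic50_knockdown):
    """Control IC50 divided by IC50 after knock out; values above 1 mean increased potency."""
    return ic50_control / ic50_knockdown


# NSC 718813 in U2OS cells, IC50 in uM, from Table 7.
control_ic50 = 0.30
knockdown_ic50s = [0.03, 0.06, 0.1]

folds = [round(fold_improvement(control_ic50, ic50), 1) for ic50 in knockdown_ic50s]
print(folds)  # [10.0, 5.0, 3.0]
```

Under the same inferred ratio, the doxorubicin row (0.7 uM control against >1, >2 and >3 uM) gives values below 1, meaning that knock out reduces rather than improves potency for that compound.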
Sequence CWU
1
1
361393PRTArtificial SequenceTP53 sequence 1Met Glu Glu Pro Gln Ser Asp Pro
Ser Val Glu Pro Pro Leu Ser Gln 1 5 10
15 Glu Thr Phe Ser Asp Leu Trp Lys Leu Leu Pro Glu Asn
Asn Val Leu 20 25 30
Ser Pro Leu Pro Ser Gln Ala Met Asp Asp Leu Met Leu Ser Pro Asp
35 40 45 Asp Ile Glu Gln
Trp Phe Thr Glu Asp Pro Gly Pro Asp Glu Ala Pro 50
55 60 Arg Met Pro Glu Ala Ala Pro Pro
Val Ala Pro Ala Pro Ala Ala Pro 65 70
75 80 Thr Pro Ala Ala Pro Ala Pro Ala Pro Ser Trp Pro
Leu Ser Ser Ser 85 90
95 Val Pro Ser Gln Lys Thr Tyr Gln Gly Ser Tyr Gly Phe Arg Leu Gly
100 105 110 Phe Leu His
Ser Gly Thr Ala Lys Ser Val Thr Cys Thr Tyr Ser Pro 115
120 125 Ala Leu Asn Lys Met Phe Cys Gln
Leu Ala Lys Thr Cys Pro Val Gln 130 135
140 Leu Trp Val Asp Ser Thr Pro Pro Pro Gly Thr Arg Val
Arg Ala Met 145 150 155
160 Ala Ile Tyr Lys Gln Ser Gln His Met Thr Glu Val Val Arg Arg Cys
165 170 175 Pro His His Glu
Arg Cys Ser Asp Ser Asp Gly Leu Ala Pro Pro Gln 180
185 190 His Leu Ile Arg Val Glu Gly Asn Leu
Arg Val Glu Tyr Leu Asp Asp 195 200
205 Arg Asn Thr Phe Arg His Ser Val Val Val Pro Tyr Glu Pro
Pro Glu 210 215 220
Val Gly Ser Asp Cys Thr Thr Ile His Tyr Asn Tyr Met Cys Asn Ser 225
230 235 240 Ser Cys Met Gly Gly
Met Asn Arg Arg Pro Ile Leu Thr Ile Ile Thr 245
250 255 Leu Glu Asp Ser Ser Gly Asn Leu Leu Gly
Arg Asn Ser Phe Glu Val 260 265
270 Arg Val Cys Ala Cys Pro Gly Arg Asp Arg Arg Thr Glu Glu Glu
Asn 275 280 285 Leu
Arg Lys Lys Gly Glu Pro His His Glu Leu Pro Pro Gly Ser Thr 290
295 300 Lys Arg Ala Leu Pro Asn
Asn Thr Ser Ser Ser Pro Gln Pro Lys Lys 305 310
315 320 Lys Pro Leu Asp Gly Glu Tyr Phe Thr Leu Gln
Ile Arg Gly Arg Glu 325 330
335 Arg Phe Glu Met Phe Arg Glu Leu Asn Glu Ala Leu Glu Leu Lys Asp
340 345 350 Ala Gln
Ala Gly Lys Glu Pro Gly Gly Ser Arg Ala His Ser Ser His 355
360 365 Leu Lys Ser Lys Lys Gly Gln
Ser Thr Ser Arg His Lys Lys Leu Met 370 375
380 Phe Lys Thr Glu Gly Pro Asp Ser Asp 385
390 22550DNAArtificial SequenceTP53 sequence
2gattggggtt ttcccctccc atgtgctcaa gactggcgct aaaagttttg agcttctcaa
60aagtctagag ccaccgtcca gggagcaggt agctgctggg ctccggggac actttgcgtt
120cgggctggga gcgtgctttc cacgacggtg acacgcttcc ctggattggc agccagactg
180ccttccgggt cactgccatg gaggagccgc agtcagatcc tagcgtcgag ccccctctga
240gtcaggaaac attttcagac ctatggaaac tacttcctga aaacaacgtt ctgtccccct
300tgccgtccca agcaatggat gatttgatgc tgtccccgga cgatattgaa caatggttca
360ctgaagaccc aggtccagat gaagctccca gaatgccaga ggctgctccc cccgtggccc
420ctgcaccagc agctcctaca ccggcggccc ctgcaccagc cccctcctgg cccctgtcat
480cttctgtccc ttcccagaaa acctaccagg gcagctacgg tttccgtctg ggcttcttgc
540attctgggac agccaagtct gtgacttgca cgtactcccc tgccctcaac aagatgtttt
600gccaactggc caagacctgc cctgtgcagc tgtgggttga ttccacaccc ccgcccggca
660cccgcgtccg cgccatggcc atctacaagc agtcacagca catgacggag gttgtgaggc
720gctgccccca ccatgagcgc tgctcagata gcgatggtct ggcccctcct cagcatctta
780tccgagtgga aggaaatttg cgtgtggagt atttggatga cagaaacact tttcgacata
840gtgtggtggt gccctatgag ccgcctgagg ttggctctga ctgtaccacc atccactaca
900actacatgtg taacagttcc tgcatgggcg gcatgaaccg gaggcccatc ctcaccatca
960tcacactgga agactccagt ggtaatctac tgggacggaa cagctttgag gtgcgtgttt
1020gtgcctgtcc tgggagagac cggcgcacag aggaagagaa tctccgcaag aaaggggagc
1080ctcaccacga gctgccccca gggagcacta agcgagcact gcccaacaac accagctcct
1140ctccccagcc aaagaagaaa ccactggatg gagaatattt cacccttcag atccgtgggc
1200gtgagcgctt cgagatgttc cgagagctga atgaggcctt ggaactcaag gatgcccagg
1260ctgggaagga gccagggggg agcagggctc actccagcca cctgaagtcc aaaaagggtc
1320agtctacctc ccgccataaa aaactcatgt tcaagacaga agggcctgac tcagactgac
1380attctccact tcttgttccc cactgacagc ctcccacccc catctctccc tcccctgcca
1440ttttgggttt tgggtctttg aacccttgct tgcaataggt gtgcgtcaga agcacccagg
1500acttccattt gctttgtccc ggggctccac tgaacaagtt ggcctgcact ggtgttttgt
1560tgtggggagg aggatgggga gtaggacata ccagcttaga ttttaaggtt tttactgtga
1620gggatgtttg ggagatgtaa gaaatgttct tgcagttaag ggttagttta caatcagcca
1680cattctaggt aggggcccac ttcaccgtac taaccaggga agctgtccct cactgttgaa
1740ttttctctaa cttcaaggcc catatctgtg aaatgctggc atttgcacct acctcacaga
1800gtgcattgtg agggttaatg aaataatgta catctggcct tgaaaccacc ttttattaca
1860tggggtctag aacttgaccc ccttgagggt gcttgttccc tctccctgtt ggtcggtggg
1920ttggtagttt ctacagttgg gcagctggtt aggtagaggg agttgtcaag tctctgctgg
1980cccagccaaa ccctgtctga caacctcttg gtgaacctta gtacctaaaa ggaaatctca
2040ccccatccca caccctggag gatttcatct cttgtatatg atgatctgga tccaccaaga
2100cttgttttat gctcagggtc aatttctttt ttcttttttt tttttttttt tctttttctt
2160tgagactggg tctcgctttg ttgcccaggc tggagtggag tggcgtgatc ttggcttact
2220gcagcctttg cctccccggc tcgagcagtc ctgcctcagc ctccggagta gctgggacca
2280caggttcatg ccaccatggc cagccaactt ttgcatgttt tgtagagatg gggtctcaca
2340gtgttgccca ggctggtctc aaactcctgg gctcaggcga tccacctgtc tcagcctccc
2400agagtgctgg gattacaatt gtgagccacc acgtccagct ggaagggtca acatctttta
2460cattctgcaa gcacatctgc attttcaccc cacccttccc ctccttctcc ctttttatat
2520cccattttta tatcgatctc ttattttaca
25503756PRTArtificial SequenceMLH1 sequence 3Met Ser Phe Val Ala Gly Val
Ile Arg Arg Leu Asp Glu Thr Val Val 1 5
10 15 Asn Arg Ile Ala Ala Gly Glu Val Ile Gln Arg
Pro Ala Asn Ala Ile 20 25
30 Lys Glu Met Ile Glu Asn Cys Leu Asp Ala Lys Ser Thr Ser Ile
Gln 35 40 45 Val
Ile Val Lys Glu Gly Gly Leu Lys Leu Ile Gln Ile Gln Asp Asn 50
55 60 Gly Thr Gly Ile Arg Lys
Glu Asp Leu Asp Ile Val Cys Glu Arg Phe 65 70
75 80 Thr Thr Ser Lys Leu Gln Ser Phe Glu Asp Leu
Ala Ser Ile Ser Thr 85 90
95 Tyr Gly Phe Arg Gly Glu Ala Leu Ala Ser Ile Ser His Val Ala His
100 105 110 Val Thr
Ile Thr Thr Lys Thr Ala Asp Gly Lys Cys Ala Tyr Arg Ala 115
120 125 Ser Tyr Ser Asp Gly Lys Leu
Lys Ala Pro Pro Lys Pro Cys Ala Gly 130 135
140 Asn Gln Gly Thr Gln Ile Thr Val Glu Asp Leu Phe
Tyr Asn Ile Ala 145 150 155
160 Thr Arg Arg Lys Ala Leu Lys Asn Pro Ser Glu Glu Tyr Gly Lys Ile
165 170 175 Leu Glu Val
Val Gly Arg Tyr Ser Val His Asn Ala Gly Ile Ser Phe 180
185 190 Ser Val Lys Lys Gln Gly Glu Thr
Val Ala Asp Val Arg Thr Leu Pro 195 200
205 Asn Ala Ser Thr Val Asp Asn Ile Arg Ser Ile Phe Gly
Asn Ala Val 210 215 220
Ser Arg Glu Leu Ile Glu Ile Gly Cys Glu Asp Lys Thr Leu Ala Phe 225
230 235 240 Lys Met Asn Gly
Tyr Ile Ser Asn Ala Asn Tyr Ser Val Lys Lys Cys 245
250 255 Ile Phe Leu Leu Phe Ile Asn His Arg
Leu Val Glu Ser Thr Ser Leu 260 265
270 Arg Lys Ala Ile Glu Thr Val Tyr Ala Ala Tyr Leu Pro Lys
Asn Thr 275 280 285
His Pro Phe Leu Tyr Leu Ser Leu Glu Ile Ser Pro Gln Asn Val Asp 290
295 300 Val Asn Val His Pro
Thr Lys His Glu Val His Phe Leu His Glu Glu 305 310
315 320 Ser Ile Leu Glu Arg Val Gln Gln His Ile
Glu Ser Lys Leu Leu Gly 325 330
335 Ser Asn Ser Ser Arg Met Tyr Phe Thr Gln Thr Leu Leu Pro Gly
Leu 340 345 350 Ala
Gly Pro Ser Gly Glu Met Val Lys Ser Thr Thr Ser Leu Thr Ser 355
360 365 Ser Ser Thr Ser Gly Ser
Ser Asp Lys Val Tyr Ala His Gln Met Val 370 375
380 Arg Thr Asp Ser Arg Glu Gln Lys Leu Asp Ala
Phe Leu Gln Pro Leu 385 390 395
400 Ser Lys Pro Leu Ser Ser Gln Pro Gln Ala Ile Val Thr Glu Asp Lys
405 410 415 Thr Asp
Ile Ser Ser Gly Arg Ala Arg Gln Gln Asp Glu Glu Met Leu 420
425 430 Glu Leu Pro Ala Pro Ala Glu
Val Ala Ala Lys Asn Gln Ser Leu Glu 435 440
445 Gly Asp Thr Thr Lys Gly Thr Ser Glu Met Ser Glu
Lys Arg Gly Pro 450 455 460
Thr Ser Ser Asn Pro Arg Lys Arg His Arg Glu Asp Ser Asp Val Glu 465
470 475 480 Met Val Glu
Asp Asp Ser Arg Lys Glu Met Thr Ala Ala Cys Thr Pro 485
490 495 Arg Arg Arg Ile Ile Asn Leu Thr
Ser Val Leu Ser Leu Gln Glu Glu 500 505
510 Ile Asn Glu Gln Gly His Glu Val Leu Arg Glu Met Leu
His Asn His 515 520 525
Ser Phe Val Gly Cys Val Asn Pro Gln Trp Ala Leu Ala Gln His Gln 530
535 540 Thr Lys Leu Tyr
Leu Leu Asn Thr Thr Lys Leu Ser Glu Glu Leu Phe 545 550
555 560 Tyr Gln Ile Leu Ile Tyr Asp Phe Ala
Asn Phe Gly Val Leu Arg Leu 565 570
575 Ser Glu Pro Ala Pro Leu Phe Asp Leu Ala Met Leu Ala Leu
Asp Ser 580 585 590
Pro Glu Ser Gly Trp Thr Glu Glu Asp Gly Pro Lys Glu Gly Leu Ala
595 600 605 Glu Tyr Ile Val
Glu Phe Leu Lys Lys Lys Ala Glu Met Leu Ala Asp 610
615 620 Tyr Phe Ser Leu Glu Ile Asp Glu
Glu Gly Asn Leu Ile Gly Leu Pro 625 630
635 640 Leu Leu Ile Asp Asn Tyr Val Pro Pro Leu Glu Gly
Leu Pro Ile Phe 645 650
655 Ile Leu Arg Leu Ala Thr Glu Val Asn Trp Asp Glu Glu Lys Glu Cys
660 665 670 Phe Glu Ser
Leu Ser Lys Glu Cys Ala Met Phe Tyr Ser Ile Arg Lys 675
680 685 Gln Tyr Ile Ser Glu Glu Ser Thr
Leu Ser Gly Gln Gln Ser Glu Val 690 695
700 Pro Gly Ser Ile Pro Asn Ser Trp Lys Trp Thr Val Glu
His Ile Val 705 710 715
720 Tyr Lys Ala Leu Arg Ser His Ile Leu Pro Pro Lys His Phe Thr Glu
725 730 735 Asp Gly Asn Ile
Leu Gln Leu Ala Asn Leu Pro Asp Leu Tyr Lys Val 740
745 750 Phe Glu Arg Cys 755
42662DNAArtificial SequenceMLH1 sequence 4gaagagaccc agcaacccac
agagttgaga aatttgactg gcattcaagc tgtccaatca 60atagctgccg ctgaagggtg
gggctggatg gcgtaagcta cagctgaagg aagaacgtga 120gcacgaggca ctgaggtgat
tggctgaagg cacttccgtt gagcatctag acgtttcctt 180ggctcttctg gcgccaaaat
gtcgttcgtg gcaggggtta ttcggcggct ggacgagaca 240gtggtgaacc gcatcgcggc
gggggaagtt atccagcggc cagctaatgc tatcaaagag 300atgattgaga actgtttaga
tgcaaaatcc acaagtattc aagtgattgt taaagaggga 360ggcctgaagt tgattcagat
ccaagacaat ggcaccggga tcaggaaaga agatctggat 420attgtatgtg aaaggttcac
tactagtaaa ctgcagtcct ttgaggattt agccagtatt 480tctacctatg gctttcgagg
tgaggctttg gccagcataa gccatgtggc tcatgttact 540attacaacga aaacagctga
tggaaagtgt gcatacagag caagttactc agatggaaaa 600ctgaaagccc ctcctaaacc
atgtgctggc aatcaaggga cccagatcac ggtggaggac 660cttttttaca acatagccac
gaggagaaaa gctttaaaaa atccaagtga agaatatggg 720aaaattttgg aagttgttgg
caggtattca gtacacaatg caggcattag tttctcagtt 780aaaaaacaag gagagacagt
agctgatgtt aggacactac ccaatgcctc aaccgtggac 840aatattcgct ccatctttgg
aaatgctgtt agtcgagaac tgatagaaat tggatgtgag 900gataaaaccc tagccttcaa
aatgaatggt tacatatcca atgcaaacta ctcagtgaag 960aagtgcatct tcttactctt
catcaaccat cgtctggtag aatcaacttc cttgagaaaa 1020gccatagaaa cagtgtatgc
agcctatttg cccaaaaaca cacacccatt cctgtacctc 1080agtttagaaa tcagtcccca
gaatgtggat gttaatgtgc accccacaaa gcatgaagtt 1140cacttcctgc acgaggagag
catcctggag cgggtgcagc agcacatcga gagcaagctc 1200ctgggctcca attcctccag
gatgtacttc acccagactt tgctaccagg acttgctggc 1260ccctctgggg agatggttaa
atccacaaca agtctgacct cgtcttctac ttctggaagt 1320agtgataagg tctatgccca
ccagatggtt cgtacagatt cccgggaaca gaagcttgat 1380gcatttctgc agcctctgag
caaacccctg tccagtcagc cccaggccat tgtcacagag 1440gataagacag atatttctag
tggcagggct aggcagcaag atgaggagat gcttgaactc 1500ccagcccctg ctgaagtggc
tgccaaaaat cagagcttgg agggggatac aacaaagggg 1560acttcagaaa tgtcagagaa
gagaggacct acttccagca accccagaaa gagacatcgg 1620gaagattctg atgtggaaat
ggtggaagat gattcccgaa aggaaatgac tgcagcttgt 1680accccccgga gaaggatcat
taacctcact agtgttttga gtctccagga agaaattaat 1740gagcagggac atgaggttct
ccgggagatg ttgcataacc actccttcgt gggctgtgtg 1800aatcctcagt gggccttggc
acagcatcaa accaagttat accttctcaa caccaccaag 1860cttagtgaag aactgttcta
ccagatactc atttatgatt ttgccaattt tggtgttctc 1920aggttatcgg agccagcacc
gctctttgac cttgccatgc ttgccttaga tagtccagag 1980agtggctgga cagaggaaga
tggtcccaaa gaaggacttg ctgaatacat tgttgagttt 2040ctgaagaaga aggctgagat
gcttgcagac tatttctctt tggaaattga tgaggaaggg 2100aacctgattg gattacccct
tctgattgac aactatgtgc cccctttgga gggactgcct 2160atcttcattc ttcgactagc
cactgaggtg aattgggacg aagaaaagga atgttttgaa 2220agcctcagta aagaatgcgc
tatgttctat tccatccgga agcagtacat atctgaggag 2280tcgaccctct caggccagca
gagtgaagtg cctggctcca ttccaaactc ctggaagtgg 2340actgtggaac acattgtcta
taaagccttg cgctcacaca ttctgcctcc taaacatttc 2400acagaagatg gaaatatcct
gcagcttgct aacctgcctg atctatacaa agtctttgag 2460aggtgttaaa tatggttatt
tatgcactgt gggatgtgtt cttctttctc tgtattccga 2520tacaaagtgt tgtatcaaag
tgtgatatac aaagtgtacc aacataagtg ttggtagcac 2580ttaagactta tacttgcctt
ctgatagtat tcctttatac acagtggatt gattataaat 2640aaatagatgt gtcttaacat
aa 26625934PRTArtificial
SequenceMSH2 sequence 5Met Ala Val Gln Pro Lys Glu Thr Leu Gln Leu Glu
Ser Ala Ala Glu 1 5 10
15 Val Gly Phe Val Arg Phe Phe Gln Gly Met Pro Glu Lys Pro Thr Thr
20 25 30 Thr Val Arg
Leu Phe Asp Arg Gly Asp Phe Tyr Thr Ala His Gly Glu 35
40 45 Asp Ala Leu Leu Ala Ala Arg Glu
Val Phe Lys Thr Gln Gly Val Ile 50 55
60 Lys Tyr Met Gly Pro Ala Gly Ala Lys Asn Leu Gln Ser
Val Val Leu 65 70 75
80 Ser Lys Met Asn Phe Glu Ser Phe Val Lys Asp Leu Leu Leu Val Arg
85 90 95 Gln Tyr Arg Val
Glu Val Tyr Lys Asn Arg Ala Gly Asn Lys Ala Ser 100
105 110 Lys Glu Asn Asp Trp Tyr Leu Ala Tyr
Lys Ala Ser Pro Gly Asn Leu 115 120
125 Ser Gln Phe Glu Asp Ile Leu Phe Gly Asn Asn Asp Met Ser
Ala Ser 130 135 140
Ile Gly Val Val Gly Val Lys Met Ser Ala Val Asp Gly Gln Arg Gln 145
150 155 160 Val Gly Val Gly Tyr
Val Asp Ser Ile Gln Arg Lys Leu Gly Leu Cys 165
170 175 Glu Phe Pro Asp Asn Asp Gln Phe Ser Asn
Leu Glu Ala Leu Leu Ile 180 185
190 Gln Ile Gly Pro Lys Glu Cys Val Leu Pro Gly Gly Glu Thr Ala
Gly 195 200 205 Asp
Met Gly Lys Leu Arg Gln Ile Ile Gln Arg Gly Gly Ile Leu Ile 210
215 220 Thr Glu Arg Lys Lys Ala
Asp Phe Ser Thr Lys Asp Ile Tyr Gln Asp 225 230
235 240 Leu Asn Arg Leu Leu Lys Gly Lys Lys Gly Glu
Gln Met Asn Ser Ala 245 250
255 Val Leu Pro Glu Met Glu Asn Gln Val Ala Val Ser Ser Leu Ser Ala
260 265 270 Val Ile
Lys Phe Leu Glu Leu Leu Ser Asp Asp Ser Asn Phe Gly Gln 275
280 285 Phe Glu Leu Thr Thr Phe Asp
Phe Ser Gln Tyr Met Lys Leu Asp Ile 290 295
300 Ala Ala Val Arg Ala Leu Asn Leu Phe Gln Gly Ser
Val Glu Asp Thr 305 310 315
320 Thr Gly Ser Gln Ser Leu Ala Ala Leu Leu Asn Lys Cys Lys Thr Pro
325 330 335 Gln Gly Gln
Arg Leu Val Asn Gln Trp Ile Lys Gln Pro Leu Met Asp 340
345 350 Lys Asn Arg Ile Glu Glu Arg Leu
Asn Leu Val Glu Ala Phe Val Glu 355 360
365 Asp Ala Glu Leu Arg Gln Thr Leu Gln Glu Asp Leu Leu
Arg Arg Phe 370 375 380
Pro Asp Leu Asn Arg Leu Ala Lys Lys Phe Gln Arg Gln Ala Ala Asn 385
390 395 400 Leu Gln Asp Cys
Tyr Arg Leu Tyr Gln Gly Ile Asn Gln Leu Pro Asn 405
410 415 Val Ile Gln Ala Leu Glu Lys His Glu
Gly Lys His Gln Lys Leu Leu 420 425
430 Leu Ala Val Phe Val Thr Pro Leu Thr Asp Leu Arg Ser Asp
Phe Ser 435 440 445
Lys Phe Gln Glu Met Ile Glu Thr Thr Leu Asp Met Asp Gln Val Glu 450
455 460 Asn His Glu Phe Leu
Val Lys Pro Ser Phe Asp Pro Asn Leu Ser Glu 465 470
475 480 Leu Arg Glu Ile Met Asn Asp Leu Glu Lys
Lys Met Gln Ser Thr Leu 485 490
495 Ile Ser Ala Ala Arg Asp Leu Gly Leu Asp Pro Gly Lys Gln Ile
Lys 500 505 510 Leu
Asp Ser Ser Ala Gln Phe Gly Tyr Tyr Phe Arg Val Thr Cys Lys 515
520 525 Glu Glu Lys Val Leu Arg
Asn Asn Lys Asn Phe Ser Thr Val Asp Ile 530 535
540 Gln Lys Asn Gly Val Lys Phe Thr Asn Ser Lys
Leu Thr Ser Leu Asn 545 550 555
560 Glu Glu Tyr Thr Lys Asn Lys Thr Glu Tyr Glu Glu Ala Gln Asp Ala
565 570 575 Ile Val
Lys Glu Ile Val Asn Ile Ser Ser Gly Tyr Val Glu Pro Met 580
585 590 Gln Thr Leu Asn Asp Val Leu
Ala Gln Leu Asp Ala Val Val Ser Phe 595 600
605 Ala His Val Ser Asn Gly Ala Pro Val Pro Tyr Val
Arg Pro Ala Ile 610 615 620
Leu Glu Lys Gly Gln Gly Arg Ile Ile Leu Lys Ala Ser Arg His Ala 625
630 635 640 Cys Val Glu
Val Gln Asp Glu Ile Ala Phe Ile Pro Asn Asp Val Tyr 645
650 655 Phe Glu Lys Asp Lys Gln Met Phe
His Ile Ile Thr Gly Pro Asn Met 660 665
670 Gly Gly Lys Ser Thr Tyr Ile Arg Gln Thr Gly Val Ile
Val Leu Met 675 680 685
Ala Gln Ile Gly Cys Phe Val Pro Cys Glu Ser Ala Glu Val Ser Ile 690
695 700 Val Asp Cys Ile
Leu Ala Arg Val Gly Ala Gly Asp Ser Gln Leu Lys 705 710
715 720 Gly Val Ser Thr Phe Met Ala Glu Met
Leu Glu Thr Ala Ser Ile Leu 725 730
735 Arg Ser Ala Thr Lys Asp Ser Leu Ile Ile Ile Asp Glu Leu
Gly Arg 740 745 750
Gly Thr Ser Thr Tyr Asp Gly Phe Gly Leu Ala Trp Ala Ile Ser Glu
755 760 765 Tyr Ile Ala Thr
Lys Ile Gly Ala Phe Cys Met Phe Ala Thr His Phe 770
775 780 His Glu Leu Thr Ala Leu Ala Asn
Gln Ile Pro Thr Val Asn Asn Leu 785 790
795 800 His Val Thr Ala Leu Thr Thr Glu Glu Thr Leu Thr
Met Leu Tyr Gln 805 810
815 Val Lys Lys Gly Val Cys Asp Gln Ser Phe Gly Ile His Val Ala Glu
820 825 830 Leu Ala Asn
Phe Pro Lys His Val Ile Glu Cys Ala Lys Gln Lys Ala 835
840 845 Leu Glu Leu Glu Glu Phe Gln Tyr
Ile Gly Glu Ser Gln Gly Tyr Asp 850 855
860 Ile Met Glu Pro Ala Ala Lys Lys Cys Tyr Leu Glu Arg
Glu Gln Gly 865 870 875
880 Glu Lys Ile Ile Gln Glu Phe Leu Ser Lys Val Lys Gln Met Pro Phe
885 890 895 Thr Glu Met Ser
Glu Glu Asn Ile Thr Ile Lys Leu Lys Gln Leu Lys 900
905 910 Ala Glu Val Ile Ala Lys Asn Asn Ser
Phe Val Asn Glu Ile Ile Ser 915 920
925 Arg Ile Lys Val Thr Thr 930
SEQ ID NO: 6; Length: 3145; Type: DNA; Organism: Artificial Sequence; Description: MSH2 nucleotide sequence
ggcgggaaac
agcttagtgg gtgtggggtc gcgcattttc ttcaaccagg aggtgaggag 60gtttcgacat
ggcggtgcag ccgaaggaga cgctgcagtt ggagagcgcg gccgaggtcg 120gcttcgtgcg
cttctttcag ggcatgccgg agaagccgac caccacagtg cgccttttcg 180accggggcga
cttctatacg gcgcacggcg aggacgcgct gctggccgcc cgggaggtgt 240tcaagaccca
gggggtgatc aagtacatgg ggccggcagg agcaaagaat ctgcagagtg 300ttgtgcttag
taaaatgaat tttgaatctt ttgtaaaaga tcttcttctg gttcgtcagt 360atagagttga
agtttataag aatagagctg gaaataaggc atccaaggag aatgattggt 420atttggcata
taaggcttct cctggcaatc tctctcagtt tgaagacatt ctctttggta 480acaatgatat
gtcagcttcc attggtgttg tgggtgttaa aatgtccgca gttgatggcc 540agagacaggt
tggagttggg tatgtggatt ccatacagag gaaactagga ctgtgtgaat 600tccctgataa
tgatcagttc tccaatcttg aggctctcct catccagatt ggaccaaagg 660aatgtgtttt
acccggagga gagactgctg gagacatggg gaaactgaga cagataattc 720aaagaggagg
aattctgatc acagaaagaa aaaaagctga cttttccaca aaagacattt 780atcaggacct
caaccggttg ttgaaaggca aaaagggaga gcagatgaat agtgctgtat 840tgccagaaat
ggagaatcag gttgcagttt catcactgtc tgcggtaatc aagtttttag 900aactcttatc
agatgattcc aactttggac agtttgaact gactactttt gacttcagcc 960agtatatgaa
attggatatt gcagcagtca gagcccttaa cctttttcag ggttctgttg 1020aagataccac
tggctctcag tctctggctg ccttgctgaa taagtgtaaa acccctcaag 1080gacaaagact
tgttaaccag tggattaagc agcctctcat ggataagaac agaatagagg 1140agagattgaa
tttagtggaa gcttttgtag aagatgcaga attgaggcag actttacaag 1200aagatttact
tcgtcgattc ccagatctta accgacttgc caagaagttt caaagacaag 1260cagcaaactt
acaagattgt taccgactct atcagggtat aaatcaacta cctaatgtta 1320tacaggctct
ggaaaaacat gaaggaaaac accagaaatt attgttggca gtttttgtga 1380ctcctcttac
tgatcttcgt tctgacttct ccaagtttca ggaaatgata gaaacaactt 1440tagatatgga
tcaggtggaa aaccatgaat tccttgtaaa accttcattt gatcctaatc 1500tcagtgaatt
aagagaaata atgaatgact tggaaaagaa gatgcagtca acattaataa 1560gtgcagccag
agatcttggc ttggaccctg gcaaacagat taaactggat tccagtgcac 1620agtttggata
ttactttcgt gtaacctgta aggaagaaaa agtccttcgt aacaataaaa 1680actttagtac
tgtagatatc cagaagaatg gtgttaaatt taccaacagc aaattgactt 1740ctttaaatga
agagtatacc aaaaataaaa cagaatatga agaagcccag gatgccattg 1800ttaaagaaat
tgtcaatatt tcttcaggct atgtagaacc aatgcagaca ctcaatgatg 1860tgttagctca
gctagatgct gttgtcagct ttgctcacgt gtcaaatgga gcacctgttc 1920catatgtacg
accagccatt ttggagaaag gacaaggaag aattatatta aaagcatcca 1980ggcatgcttg
tgttgaagtt caagatgaaa ttgcatttat tcctaatgac gtatactttg 2040aaaaagataa
acagatgttc cacatcatta ctggccccaa tatgggaggt aaatcaacat 2100atattcgaca
aactggggtg atagtactca tggcccaaat tgggtgtttt gtgccatgtg 2160agtcagcaga
agtgtccatt gtggactgca tcttagcccg agtaggggct ggtgacagtc 2220aattgaaagg
agtctccacg ttcatggctg aaatgttgga aactgcttct atcctcaggt 2280ctgcaaccaa
agattcatta ataatcatag atgaattggg aagaggaact tctacctacg 2340atggatttgg
gttagcatgg gctatatcag aatacattgc aacaaagatt ggtgcttttt 2400gcatgtttgc
aacccatttt catgaactta ctgccttggc caatcagata ccaactgtta 2460ataatctaca
tgtcacagca ctcaccactg aagagacctt aactatgctt tatcaggtga 2520agaaaggtgt
ctgtgatcaa agttttggga ttcatgttgc agagcttgct aatttcccta 2580agcatgtaat
agagtgtgct aaacagaaag ccctggaact tgaggagttt cagtatattg 2640gagaatcgca
aggatatgat atcatggaac cagcagcaaa gaagtgctat ctggaaagag 2700agcaaggtga
aaaaattatt caggagttcc tgtccaaggt gaaacaaatg ccctttactg 2760aaatgtcaga
agaaaacatc acaataaagt taaaacagct aaaagctgaa gtaatagcaa 2820agaataatag
ctttgtaaat gaaatcattt cacgaataaa agttactacg tgaaaaatcc 2880cagtaatgga
atgaaggtaa tattgataag ctattgtctg taatagtttt atattgtttt 2940atattaaccc
tttttccata gtgttaactg tcagtgccca tgggctatca acttaataag 3000atatttagta
atattttact ttgaggacat tttcaaagat ttttattttg aaaaatgaga 3060gctgtaactg
aggactgttt gcaattgaca taggcaataa taagtgatgt gctgaatttt 3120ataaataaaa
tcatgtagtt tgtgg
3145
SEQ ID NO: 7; Length: 1863; Type: PRT; Organism: Artificial Sequence; Description: BRCA1 sequence
Met Asp Leu Ser Ala Leu
Arg Val Glu Glu Val Gln Asn Val Ile Asn 1 5
10 15 Ala Met Gln Lys Ile Leu Glu Cys Pro Ile Cys
Leu Glu Leu Ile Lys 20 25
30 Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys
Met 35 40 45 Leu
Lys Leu Leu Asn Gln Lys Lys Gly Pro Ser Gln Cys Pro Leu Cys 50
55 60 Lys Asn Asp Ile Thr Lys
Arg Ser Leu Gln Glu Ser Thr Arg Phe Ser 65 70
75 80 Gln Leu Val Glu Glu Leu Leu Lys Ile Ile Cys
Ala Phe Gln Leu Asp 85 90
95 Thr Gly Leu Glu Tyr Ala Asn Ser Tyr Asn Phe Ala Lys Lys Glu Asn
100 105 110 Asn Ser
Pro Glu His Leu Lys Asp Glu Val Ser Ile Ile Gln Ser Met 115
120 125 Gly Tyr Arg Asn Arg Ala Lys
Arg Leu Leu Gln Ser Glu Pro Glu Asn 130 135
140 Pro Ser Leu Gln Glu Thr Ser Leu Ser Val Gln Leu
Ser Asn Leu Gly 145 150 155
160 Thr Val Arg Thr Leu Arg Thr Lys Gln Arg Ile Gln Pro Gln Lys Thr
165 170 175 Ser Val Tyr
Ile Glu Leu Gly Ser Asp Ser Ser Glu Asp Thr Val Asn 180
185 190 Lys Ala Thr Tyr Cys Ser Val Gly
Asp Gln Glu Leu Leu Gln Ile Thr 195 200
205 Pro Gln Gly Thr Arg Asp Glu Ile Ser Leu Asp Ser Ala
Lys Lys Ala 210 215 220
Ala Cys Glu Phe Ser Glu Thr Asp Val Thr Asn Thr Glu His His Gln 225
230 235 240 Pro Ser Asn Asn
Asp Leu Asn Thr Thr Glu Lys Arg Ala Ala Glu Arg 245
250 255 His Pro Glu Lys Tyr Gln Gly Ser Ser
Val Ser Asn Leu His Val Glu 260 265
270 Pro Cys Gly Thr Asn Thr His Ala Ser Ser Leu Gln His Glu
Asn Ser 275 280 285
Ser Leu Leu Leu Thr Lys Asp Arg Met Asn Val Glu Lys Ala Glu Phe 290
295 300 Cys Asn Lys Ser Lys
Gln Pro Gly Leu Ala Arg Ser Gln His Asn Arg 305 310
315 320 Trp Ala Gly Ser Lys Glu Thr Cys Asn Asp
Arg Arg Thr Pro Ser Thr 325 330
335 Glu Lys Lys Val Asp Leu Asn Ala Asp Pro Leu Cys Glu Arg Lys
Glu 340 345 350 Trp
Asn Lys Gln Lys Leu Pro Cys Ser Glu Asn Pro Arg Asp Thr Glu 355
360 365 Asp Val Pro Trp Ile Thr
Leu Asn Ser Ser Ile Gln Lys Val Asn Glu 370 375
380 Trp Phe Ser Arg Ser Asp Glu Leu Leu Gly Ser
Asp Asp Ser His Asp 385 390 395
400 Gly Glu Ser Glu Ser Asn Ala Lys Val Ala Asp Val Leu Asp Val Leu
405 410 415 Asn Glu
Val Asp Glu Tyr Ser Gly Ser Ser Glu Lys Ile Asp Leu Leu 420
425 430 Ala Ser Asp Pro His Glu Ala
Leu Ile Cys Lys Ser Glu Arg Val His 435 440
445 Ser Lys Ser Val Glu Ser Asn Ile Glu Asp Lys Ile
Phe Gly Lys Thr 450 455 460
Tyr Arg Lys Lys Ala Ser Leu Pro Asn Leu Ser His Val Thr Glu Asn 465
470 475 480 Leu Ile Ile
Gly Ala Phe Val Thr Glu Pro Gln Ile Ile Gln Glu Arg 485
490 495 Pro Leu Thr Asn Lys Leu Lys Arg
Lys Arg Arg Pro Thr Ser Gly Leu 500 505
510 His Pro Glu Asp Phe Ile Lys Lys Ala Asp Leu Ala Val
Gln Lys Thr 515 520 525
Pro Glu Met Ile Asn Gln Gly Thr Asn Gln Thr Glu Gln Asn Gly Gln 530
535 540 Val Met Asn Ile
Thr Asn Ser Gly His Glu Asn Lys Thr Lys Gly Asp 545 550
555 560 Ser Ile Gln Asn Glu Lys Asn Pro Asn
Pro Ile Glu Ser Leu Glu Lys 565 570
575 Glu Ser Ala Phe Lys Thr Lys Ala Glu Pro Ile Ser Ser Ser
Ile Ser 580 585 590
Asn Met Glu Leu Glu Leu Asn Ile His Asn Ser Lys Ala Pro Lys Lys
595 600 605 Asn Arg Leu Arg
Arg Lys Ser Ser Thr Arg His Ile His Ala Leu Glu 610
615 620 Leu Val Val Ser Arg Asn Leu Ser
Pro Pro Asn Cys Thr Glu Leu Gln 625 630
635 640 Ile Asp Ser Cys Ser Ser Ser Glu Glu Ile Lys Lys
Lys Lys Tyr Asn 645 650
655 Gln Met Pro Val Arg His Ser Arg Asn Leu Gln Leu Met Glu Gly Lys
660 665 670 Glu Pro Ala
Thr Gly Ala Lys Lys Ser Asn Lys Pro Asn Glu Gln Thr 675
680 685 Ser Lys Arg His Asp Ser Asp Thr
Phe Pro Glu Leu Lys Leu Thr Asn 690 695
700 Ala Pro Gly Ser Phe Thr Lys Cys Ser Asn Thr Ser Glu
Leu Lys Glu 705 710 715
720 Phe Val Asn Pro Ser Leu Pro Arg Glu Glu Lys Glu Glu Lys Leu Glu
725 730 735 Thr Val Lys Val
Ser Asn Asn Ala Glu Asp Pro Lys Asp Leu Met Leu 740
745 750 Ser Gly Glu Arg Val Leu Gln Thr Glu
Arg Ser Val Glu Ser Ser Ser 755 760
765 Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly Thr Gln Glu Ser
Ile Ser 770 775 780
Leu Leu Glu Val Ser Thr Leu Gly Lys Ala Lys Thr Glu Pro Asn Lys 785
790 795 800 Cys Val Ser Gln Cys
Ala Ala Phe Glu Asn Pro Lys Gly Leu Ile His 805
810 815 Gly Cys Ser Lys Asp Asn Arg Asn Asp Thr
Glu Gly Phe Lys Tyr Pro 820 825
830 Leu Gly His Glu Val Asn His Ser Arg Glu Thr Ser Ile Glu Met
Glu 835 840 845 Glu
Ser Glu Leu Asp Ala Gln Tyr Leu Gln Asn Thr Phe Lys Val Ser 850
855 860 Lys Arg Gln Ser Phe Ala
Pro Phe Ser Asn Pro Gly Asn Ala Glu Glu 865 870
875 880 Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser
Leu Lys Lys Gln Ser 885 890
895 Pro Lys Val Thr Phe Glu Cys Glu Gln Lys Glu Glu Asn Gln Gly Lys
900 905 910 Asn Glu
Ser Asn Ile Lys Pro Val Gln Thr Val Asn Ile Thr Ala Gly 915
920 925 Phe Pro Val Val Gly Gln Lys
Asp Lys Pro Val Asp Asn Ala Lys Cys 930 935
940 Ser Ile Lys Gly Gly Ser Arg Phe Cys Leu Ser Ser
Gln Phe Arg Gly 945 950 955
960 Asn Glu Thr Gly Leu Ile Thr Pro Asn Lys His Gly Leu Leu Gln Asn
965 970 975 Pro Tyr Arg
Ile Pro Pro Leu Phe Pro Ile Lys Ser Phe Val Lys Thr 980
985 990 Lys Cys Lys Lys Asn Leu Leu Glu
Glu Asn Phe Glu Glu His Ser Met 995 1000
1005 Ser Pro Glu Arg Glu Met Gly Asn Glu Asn Ile
Pro Ser Thr Val 1010 1015 1020
Ser Thr Ile Ser Arg Asn Asn Ile Arg Glu Asn Val Phe Lys Glu
1025 1030 1035 Ala Ser Ser
Ser Asn Ile Asn Glu Val Gly Ser Ser Thr Asn Glu 1040
1045 1050 Val Gly Ser Ser Ile Asn Glu Ile
Gly Ser Ser Asp Glu Asn Ile 1055 1060
1065 Gln Ala Glu Leu Gly Arg Asn Arg Gly Pro Lys Leu Asn
Ala Met 1070 1075 1080
Leu Arg Leu Gly Val Leu Gln Pro Glu Val Tyr Lys Gln Ser Leu 1085
1090 1095 Pro Gly Ser Asn Cys
Lys His Pro Glu Ile Lys Lys Gln Glu Tyr 1100 1105
1110 Glu Glu Val Val Gln Thr Val Asn Thr Asp
Phe Ser Pro Tyr Leu 1115 1120 1125
Ile Ser Asp Asn Leu Glu Gln Pro Met Gly Ser Ser His Ala Ser
1130 1135 1140 Gln Val
Cys Ser Glu Thr Pro Asp Asp Leu Leu Asp Asp Gly Glu 1145
1150 1155 Ile Lys Glu Asp Thr Ser Phe
Ala Glu Asn Asp Ile Lys Glu Ser 1160 1165
1170 Ser Ala Val Phe Ser Lys Ser Val Gln Lys Gly Glu
Leu Ser Arg 1175 1180 1185
Ser Pro Ser Pro Phe Thr His Thr His Leu Ala Gln Gly Tyr Arg 1190
1195 1200 Arg Gly Ala Lys Lys
Leu Glu Ser Ser Glu Glu Asn Leu Ser Ser 1205 1210
1215 Glu Asp Glu Glu Leu Pro Cys Phe Gln His
Leu Leu Phe Gly Lys 1220 1225 1230
Val Asn Asn Ile Pro Ser Gln Ser Thr Arg His Ser Thr Val Ala
1235 1240 1245 Thr Glu
Cys Leu Ser Lys Asn Thr Glu Glu Asn Leu Leu Ser Leu 1250
1255 1260 Lys Asn Ser Leu Asn Asp Cys
Ser Asn Gln Val Ile Leu Ala Lys 1265 1270
1275 Ala Ser Gln Glu His His Leu Ser Glu Glu Thr Lys
Cys Ser Ala 1280 1285 1290
Ser Leu Phe Ser Ser Gln Cys Ser Glu Leu Glu Asp Leu Thr Ala 1295
1300 1305 Asn Thr Asn Thr Gln
Asp Pro Phe Leu Ile Gly Ser Ser Lys Gln 1310 1315
1320 Met Arg His Gln Ser Glu Ser Gln Gly Val
Gly Leu Ser Asp Lys 1325 1330 1335
Glu Leu Val Ser Asp Asp Glu Glu Arg Gly Thr Gly Leu Glu Glu
1340 1345 1350 Asn Asn
Gln Glu Glu Gln Ser Met Asp Ser Asn Leu Gly Glu Ala 1355
1360 1365 Ala Ser Gly Cys Glu Ser Glu
Thr Ser Val Ser Glu Asp Cys Ser 1370 1375
1380 Gly Leu Ser Ser Gln Ser Asp Ile Leu Thr Thr Gln
Gln Arg Asp 1385 1390 1395
Thr Met Gln His Asn Leu Ile Lys Leu Gln Gln Glu Met Ala Glu 1400
1405 1410 Leu Glu Ala Val Leu
Glu Gln His Gly Ser Gln Pro Ser Asn Ser 1415 1420
1425 Tyr Pro Ser Ile Ile Ser Asp Ser Ser Ala
Leu Glu Asp Leu Arg 1430 1435 1440
Asn Pro Glu Gln Ser Thr Ser Glu Lys Ala Val Leu Thr Ser Gln
1445 1450 1455 Lys Ser
Ser Glu Tyr Pro Ile Ser Gln Asn Pro Glu Gly Leu Ser 1460
1465 1470 Ala Asp Lys Phe Glu Val Ser
Ala Asp Ser Ser Thr Ser Lys Asn 1475 1480
1485 Lys Glu Pro Gly Val Glu Arg Ser Ser Pro Ser Lys
Cys Pro Ser 1490 1495 1500
Leu Asp Asp Arg Trp Tyr Met His Ser Cys Ser Gly Ser Leu Gln 1505
1510 1515 Asn Arg Asn Tyr Pro
Ser Gln Glu Glu Leu Ile Lys Val Val Asp 1520 1525
1530 Val Glu Glu Gln Gln Leu Glu Glu Ser Gly
Pro His Asp Leu Thr 1535 1540 1545
Glu Thr Ser Tyr Leu Pro Arg Gln Asp Leu Glu Gly Thr Pro Tyr
1550 1555 1560 Leu Glu
Ser Gly Ile Ser Leu Phe Ser Asp Asp Pro Glu Ser Asp 1565
1570 1575 Pro Ser Glu Asp Arg Ala Pro
Glu Ser Ala Arg Val Gly Asn Ile 1580 1585
1590 Pro Ser Ser Thr Ser Ala Leu Lys Val Pro Gln Leu
Lys Val Ala 1595 1600 1605
Glu Ser Ala Gln Ser Pro Ala Ala Ala His Thr Thr Asp Thr Ala 1610
1615 1620 Gly Tyr Asn Ala Met
Glu Glu Ser Val Ser Arg Glu Lys Pro Glu 1625 1630
1635 Leu Thr Ala Ser Thr Glu Arg Val Asn Lys
Arg Met Ser Met Val 1640 1645 1650
Val Ser Gly Leu Thr Pro Glu Glu Phe Met Leu Val Tyr Lys Phe
1655 1660 1665 Ala Arg
Lys His His Ile Thr Leu Thr Asn Leu Ile Thr Glu Glu 1670
1675 1680 Thr Thr His Val Val Met Lys
Thr Asp Ala Glu Phe Val Cys Glu 1685 1690
1695 Arg Thr Leu Lys Tyr Phe Leu Gly Ile Ala Gly Gly
Lys Trp Val 1700 1705 1710
Val Ser Tyr Phe Trp Val Thr Gln Ser Ile Lys Glu Arg Lys Met 1715
1720 1725 Leu Asn Glu His Asp
Phe Glu Val Arg Gly Asp Val Val Asn Gly 1730 1735
1740 Arg Asn His Gln Gly Pro Lys Arg Ala Arg
Glu Ser Gln Asp Arg 1745 1750 1755
Lys Ile Phe Arg Gly Leu Glu Ile Cys Cys Tyr Gly Pro Phe Thr
1760 1765 1770 Asn Met
Pro Thr Asp Gln Leu Glu Trp Met Val Gln Leu Cys Gly 1775
1780 1785 Ala Ser Val Val Lys Glu Leu
Ser Ser Phe Thr Leu Gly Thr Gly 1790 1795
1800 Val His Pro Ile Val Val Val Gln Pro Asp Ala Trp
Thr Glu Asp 1805 1810 1815
Asn Gly Phe His Ala Ile Gly Gln Met Cys Glu Ala Pro Val Val 1820
1825 1830 Thr Arg Glu Trp Val
Leu Asp Ser Val Ala Leu Tyr Gln Cys Gln 1835 1840
1845 Glu Leu Asp Thr Tyr Leu Ile Pro Gln Ile
Pro His Ser His Tyr 1850 1855 1860
SEQ ID NO: 8; Length: 7224; Type: DNA; Organism: Artificial Sequence; Description: BRCA1 sequence
gtaccttgat
ttcgtattct gagaggctgc tgcttagcgg tagccccttg gtttccgtgg 60caacggaaaa
gcgcgggaat tacagataaa ttaaaactgc gactgcgcgg cgtgagctcg 120ctgagacttc
ctggacgggg gacaggctgt ggggtttctc agataactgg gcccctgcgc 180tcaggaggcc
ttcaccctct gctctgggta aagttcattg gaacagaaag aaatggattt 240atctgctctt
cgcgttgaag aagtacaaaa tgtcattaat gctatgcaga aaatcttaga 300gtgtcccatc
tgtctggagt tgatcaagga acctgtctcc acaaagtgtg accacatatt 360ttgcaaattt
tgcatgctga aacttctcaa ccagaagaaa gggccttcac agtgtccttt 420atgtaagaat
gatataacca aaaggagcct acaagaaagt acgagattta gtcaacttgt 480tgaagagcta
ttgaaaatca tttgtgcttt tcagcttgac acaggtttgg agtatgcaaa 540cagctataat
tttgcaaaaa aggaaaataa ctctcctgaa catctaaaag atgaagtttc 600tatcatccaa
agtatgggct acagaaaccg tgccaaaaga cttctacaga gtgaacccga 660aaatccttcc
ttgcaggaaa ccagtctcag tgtccaactc tctaaccttg gaactgtgag 720aactctgagg
acaaagcagc ggatacaacc tcaaaagacg tctgtctaca ttgaattggg 780atctgattct
tctgaagata ccgttaataa ggcaacttat tgcagtgtgg gagatcaaga 840attgttacaa
atcacccctc aaggaaccag ggatgaaatc agtttggatt ctgcaaaaaa 900ggctgcttgt
gaattttctg agacggatgt aacaaatact gaacatcatc aacccagtaa 960taatgatttg
aacaccactg agaagcgtgc agctgagagg catccagaaa agtatcaggg 1020tagttctgtt
tcaaacttgc atgtggagcc atgtggcaca aatactcatg ccagctcatt 1080acagcatgag
aacagcagtt tattactcac taaagacaga atgaatgtag aaaaggctga 1140attctgtaat
aaaagcaaac agcctggctt agcaaggagc caacataaca gatgggctgg 1200aagtaaggaa
acatgtaatg ataggcggac tcccagcaca gaaaaaaagg tagatctgaa 1260tgctgatccc
ctgtgtgaga gaaaagaatg gaataagcag aaactgccat gctcagagaa 1320tcctagagat
actgaagatg ttccttggat aacactaaat agcagcattc agaaagttaa 1380tgagtggttt
tccagaagtg atgaactgtt aggttctgat gactcacatg atggggagtc 1440tgaatcaaat
gccaaagtag ctgatgtatt ggacgttcta aatgaggtag atgaatattc 1500tggttcttca
gagaaaatag acttactggc cagtgatcct catgaggctt taatatgtaa 1560aagtgaaaga
gttcactcca aatcagtaga gagtaatatt gaagacaaaa tatttgggaa 1620aacctatcgg
aagaaggcaa gcctccccaa cttaagccat gtaactgaaa atctaattat 1680aggagcattt
gttactgagc cacagataat acaagagcgt cccctcacaa ataaattaaa 1740gcgtaaaagg
agacctacat caggccttca tcctgaggat tttatcaaga aagcagattt 1800ggcagttcaa
aagactcctg aaatgataaa tcagggaact aaccaaacgg agcagaatgg 1860tcaagtgatg
aatattacta atagtggtca tgagaataaa acaaaaggtg attctattca 1920gaatgagaaa
aatcctaacc caatagaatc actcgaaaaa gaatctgctt tcaaaacgaa 1980agctgaacct
ataagcagca gtataagcaa tatggaactc gaattaaata tccacaattc 2040aaaagcacct
aaaaagaata ggctgaggag gaagtcttct accaggcata ttcatgcgct 2100tgaactagta
gtcagtagaa atctaagccc acctaattgt actgaattgc aaattgatag 2160ttgttctagc
agtgaagaga taaagaaaaa aaagtacaac caaatgccag tcaggcacag 2220cagaaaccta
caactcatgg aaggtaaaga acctgcaact ggagccaaga agagtaacaa 2280gccaaatgaa
cagacaagta aaagacatga cagcgatact ttcccagagc tgaagttaac 2340aaatgcacct
ggttctttta ctaagtgttc aaataccagt gaacttaaag aatttgtcaa 2400tcctagcctt
ccaagagaag aaaaagaaga gaaactagaa acagttaaag tgtctaataa 2460tgctgaagac
cccaaagatc tcatgttaag tggagaaagg gttttgcaaa ctgaaagatc 2520tgtagagagt
agcagtattt cattggtacc tggtactgat tatggcactc aggaaagtat 2580ctcgttactg
gaagttagca ctctagggaa ggcaaaaaca gaaccaaata aatgtgtgag 2640tcagtgtgca
gcatttgaaa accccaaggg actaattcat ggttgttcca aagataatag 2700aaatgacaca
gaaggcttta agtatccatt gggacatgaa gttaaccaca gtcgggaaac 2760aagcatagaa
atggaagaaa gtgaacttga tgctcagtat ttgcagaata cattcaaggt 2820ttcaaagcgc
cagtcatttg ctccgttttc aaatccagga aatgcagaag aggaatgtgc 2880aacattctct
gcccactctg ggtccttaaa gaaacaaagt ccaaaagtca cttttgaatg 2940tgaacaaaag
gaagaaaatc aaggaaagaa tgagtctaat atcaagcctg tacagacagt 3000taatatcact
gcaggctttc ctgtggttgg tcagaaagat aagccagttg ataatgccaa 3060atgtagtatc
aaaggaggct ctaggttttg tctatcatct cagttcagag gcaacgaaac 3120tggactcatt
actccaaata aacatggact tttacaaaac ccatatcgta taccaccact 3180ttttcccatc
aagtcatttg ttaaaactaa atgtaagaaa aatctgctag aggaaaactt 3240tgaggaacat
tcaatgtcac ctgaaagaga aatgggaaat gagaacattc caagtacagt 3300gagcacaatt
agccgtaata acattagaga aaatgttttt aaagaagcca gctcaagcaa 3360tattaatgaa
gtaggttcca gtactaatga agtgggctcc agtattaatg aaataggttc 3420cagtgatgaa
aacattcaag cagaactagg tagaaacaga gggccaaaat tgaatgctat 3480gcttagatta
ggggttttgc aacctgaggt ctataaacaa agtcttcctg gaagtaattg 3540taagcatcct
gaaataaaaa agcaagaata tgaagaagta gttcagactg ttaatacaga 3600tttctctcca
tatctgattt cagataactt agaacagcct atgggaagta gtcatgcatc 3660tcaggtttgt
tctgagacac ctgatgacct gttagatgat ggtgaaataa aggaagatac 3720tagttttgct
gaaaatgaca ttaaggaaag ttctgctgtt tttagcaaaa gcgtccagaa 3780aggagagctt
agcaggagtc ctagcccttt cacccataca catttggctc agggttaccg 3840aagaggggcc
aagaaattag agtcctcaga agagaactta tctagtgagg atgaagagct 3900tccctgcttc
caacacttgt tatttggtaa agtaaacaat ataccttctc agtctactag 3960gcatagcacc
gttgctaccg agtgtctgtc taagaacaca gaggagaatt tattatcatt 4020gaagaatagc
ttaaatgact gcagtaacca ggtaatattg gcaaaggcat ctcaggaaca 4080tcaccttagt
gaggaaacaa aatgttctgc tagcttgttt tcttcacagt gcagtgaatt 4140ggaagacttg
actgcaaata caaacaccca ggatcctttc ttgattggtt cttccaaaca 4200aatgaggcat
cagtctgaaa gccagggagt tggtctgagt gacaaggaat tggtttcaga 4260tgatgaagaa
agaggaacgg gcttggaaga aaataatcaa gaagagcaaa gcatggattc 4320aaacttaggt
gaagcagcat ctgggtgtga gagtgaaaca agcgtctctg aagactgctc 4380agggctatcc
tctcagagtg acattttaac cactcagcag agggatacca tgcaacataa 4440cctgataaag
ctccagcagg aaatggctga actagaagct gtgttagaac agcatgggag 4500ccagccttct
aacagctacc cttccatcat aagtgactct tctgcccttg aggacctgcg 4560aaatccagaa
caaagcacat cagaaaaagc agtattaact tcacagaaaa gtagtgaata 4620ccctataagc
cagaatccag aaggcctttc tgctgacaag tttgaggtgt ctgcagatag 4680ttctaccagt
aaaaataaag aaccaggagt ggaaaggtca tccccttcta aatgcccatc 4740attagatgat
aggtggtaca tgcacagttg ctctgggagt cttcagaata gaaactaccc 4800atctcaagag
gagctcatta aggttgttga tgtggaggag caacagctgg aagagtctgg 4860gccacacgat
ttgacggaaa catcttactt gccaaggcaa gatctagagg gaacccctta 4920cctggaatct
ggaatcagcc tcttctctga tgaccctgaa tctgatcctt ctgaagacag 4980agccccagag
tcagctcgtg ttggcaacat accatcttca acctctgcat tgaaagttcc 5040ccaattgaaa
gttgcagaat ctgcccagag tccagctgct gctcatacta ctgatactgc 5100tgggtataat
gcaatggaag aaagtgtgag cagggagaag ccagaattga cagcttcaac 5160agaaagggtc
aacaaaagaa tgtccatggt ggtgtctggc ctgaccccag aagaatttat 5220gctcgtgtac
aagtttgcca gaaaacacca catcacttta actaatctaa ttactgaaga 5280gactactcat
gttgttatga aaacagatgc tgagtttgtg tgtgaacgga cactgaaata 5340ttttctagga
attgcgggag gaaaatgggt agttagctat ttctgggtga cccagtctat 5400taaagaaaga
aaaatgctga atgagcatga ttttgaagtc agaggagatg tggtcaatgg 5460aagaaaccac
caaggtccaa agcgagcaag agaatcccag gacagaaaga tcttcagggg 5520gctagaaatc
tgttgctatg ggcccttcac caacatgccc acagatcaac tggaatggat 5580ggtacagctg
tgtggtgctt ctgtggtgaa ggagctttca tcattcaccc ttggcacagg 5640tgtccaccca
attgtggttg tgcagccaga tgcctggaca gaggacaatg gcttccatgc 5700aattgggcag
atgtgtgagg cacctgtggt gacccgagag tgggtgttgg acagtgtagc 5760actctaccag
tgccaggagc tggacaccta cctgataccc cagatccccc acagccacta 5820ctgactgcag
ccagccacag gtacagagcc acaggacccc aagaatgagc ttacaaagtg 5880gcctttccag
gccctgggag ctcctctcac tcttcagtcc ttctactgtc ctggctacta 5940aatattttat
gtacatcagc ctgaaaagga cttctggcta tgcaagggtc ccttaaagat 6000tttctgcttg
aagtctccct tggaaatctg ccatgagcac aaaattatgg taatttttca 6060cctgagaaga
ttttaaaacc atttaaacgc caccaattga gcaagatgct gattcattat 6120ttatcagccc
tattctttct attcaggctg ttgttggctt agggctggaa gcacagagtg 6180gcttggcctc
aagagaatag ctggtttccc taagtttact tctctaaaac cctgtgttca 6240caaaggcaga
gagtcagacc cttcaatgga aggagagtgc ttgggatcga ttatgtgact 6300taaagtcaga
atagtccttg ggcagttctc aaatgttgga gtggaacatt ggggaggaaa 6360ttctgaggca
ggtattagaa atgaaaagga aacttgaaac ctgggcatgg tggctcacgc 6420ctgtaatccc
agcactttgg gaggccaagg tgggcagatc actggaggtc aggagttcga 6480aaccagcctg
gccaacatgg tgaaacccca tctctactaa aaatacagaa attagccggt 6540catggtggtg
gacacctgta atcccagcta ctcaggtggc taaggcagga gaatcacttc 6600agcccgggag
gtggaggttg cagtgagcca agatcatacc acggcactcc agcctgggtg 6660acagtgagac
tgtggctcaa aaaaaaaaaa aaaaaaagga aaatgaaact agaagagatt 6720tctaaaagtc
tgagatatat ttgctagatt tctaaagaat gtgttctaaa acagcagaag 6780attttcaaga
accggtttcc aaagacagtc ttctaattcc tcattagtaa taagtaaaat 6840gtttattgtt
gtagctctgg tatataatcc attcctctta aaatataaga cctctggcat 6900gaatatttca
tatctataaa atgacagatc ccaccaggaa ggaagctgtt gctttctttg 6960aggtgatttt
tttcctttgc tccctgttgc tgaaaccata cagcttcata aataattttg 7020cttgctgaag
gaagaaaaag tgtttttcat aaacccatta tccaggactg tttatagctg 7080ttggaaggac
taggtcttcc ctagcccccc cagtgtgcaa gggcagtgaa gacttgattg 7140tacaaaatac
gttttgtaaa tgttgtgctg ttaacactgc aaataaactt ggtagcaaac 7200acttccaaaa
aaaaaaaaaa aaaa
7224
SEQ ID NO: 9; Length: 3130; Type: PRT; Organism: Artificial Sequence; Description: REV3L sequence
Met Phe Ser Val Arg Ile
Val Thr Ala Asp Tyr Tyr Met Ala Ser Pro 1 5
10 15 Leu Gln Gly Leu Asp Thr Cys Gln Ser Pro Leu
Thr Gln Ala Pro Val 20 25
30 Lys Lys Val Pro Val Val Arg Val Phe Gly Ala Thr Pro Ala Gly
Gln 35 40 45 Lys
Thr Cys Leu His Leu His Gly Ile Phe Pro Tyr Leu Tyr Val Pro 50
55 60 Tyr Asp Gly Tyr Gly Gln
Gln Pro Glu Ser Tyr Leu Ser Gln Met Ala 65 70
75 80 Phe Ser Ile Asp Arg Ala Leu Asn Val Ala Leu
Gly Asn Pro Ser Ser 85 90
95 Thr Ala Gln His Val Phe Lys Val Ser Leu Val Ser Gly Met Pro Phe
100 105 110 Tyr Gly
Tyr His Glu Lys Glu Arg His Phe Met Lys Ile Tyr Leu Tyr 115
120 125 Asn Pro Thr Met Val Lys Arg
Ile Cys Glu Leu Leu Gln Ser Gly Ala 130 135
140 Ile Met Asn Lys Phe Tyr Gln Pro His Glu Ala His
Ile Pro Tyr Leu 145 150 155
160 Leu Gln Leu Phe Ile Asp Tyr Asn Leu Tyr Gly Met Asn Leu Ile Asn
165 170 175 Leu Ala Ala
Val Lys Phe Arg Lys Ala Arg Arg Lys Ser Asn Thr Leu 180
185 190 His Ala Thr Gly Ser Cys Lys Asn
His Leu Ser Gly Asn Ser Leu Ala 195 200
205 Asp Thr Leu Phe Arg Trp Glu Gln Asp Glu Ile Pro Ser
Ser Leu Ile 210 215 220
Leu Glu Gly Val Glu Pro Gln Ser Thr Cys Glu Leu Glu Val Asp Ala 225
230 235 240 Val Ala Ala Asp
Ile Leu Asn Arg Leu Asp Ile Glu Ala Gln Ile Gly 245
250 255 Gly Asn Pro Gly Leu Gln Ala Ile Trp
Glu Asp Glu Lys Gln Arg Arg 260 265
270 Arg Asn Arg Asn Glu Thr Ser Gln Met Ser Gln Pro Glu Ser
Gln Asp 275 280 285
His Arg Phe Val Pro Ala Thr Glu Ser Glu Lys Lys Phe Gln Lys Arg 290
295 300 Leu Gln Glu Ile Leu
Lys Gln Asn Asp Phe Ser Val Thr Leu Ser Gly 305 310
315 320 Ser Val Asp Tyr Ser Asp Gly Ser Gln Glu
Phe Ser Ala Glu Leu Thr 325 330
335 Leu His Ser Glu Val Leu Ser Pro Glu Met Leu Gln Cys Thr Pro
Ala 340 345 350 Asn
Met Val Glu Val His Lys Asp Lys Glu Ser Ser Lys Gly His Thr 355
360 365 Arg His Lys Val Glu Glu
Ala Leu Ile Asn Glu Glu Ala Ile Leu Asn 370 375
380 Leu Met Glu Asn Ser Gln Thr Phe Gln Pro Leu
Thr Gln Arg Leu Ser 385 390 395
400 Glu Ser Pro Val Phe Met Asp Ser Ser Pro Asp Glu Ala Leu Val His
405 410 415 Leu Leu
Ala Gly Leu Glu Ser Asp Gly Tyr Arg Gly Glu Arg Asn Arg 420
425 430 Met Pro Ser Pro Cys Arg Ser
Phe Gly Asn Asn Lys Tyr Pro Gln Asn 435 440
445 Ser Asp Asp Glu Glu Asn Glu Pro Gln Ile Glu Lys
Glu Glu Met Glu 450 455 460
Leu Ser Leu Val Met Ser Gln Arg Trp Asp Ser Asn Ile Glu Glu His 465
470 475 480 Cys Ala Lys
Lys Arg Ser Leu Cys Arg Asn Thr His Arg Ser Ser Thr 485
490 495 Glu Asp Asp Asp Ser Ser Ser Gly
Glu Glu Met Glu Trp Ser Asp Asn 500 505
510 Ser Leu Leu Leu Ala Ser Leu Ser Ile Pro Gln Leu Asp
Gly Thr Ala 515 520 525
Asp Glu Asn Ser Asp Asn Pro Leu Asn Asn Glu Asn Ser Arg Thr His 530
535 540 Ser Ser Val Ile
Ala Thr Ser Lys Leu Ser Val Lys Pro Ser Ile Phe 545 550
555 560 His Lys Asp Ala Ala Thr Leu Glu Pro
Ser Ser Ser Ala Lys Ile Thr 565 570
575 Phe Gln Cys Lys His Thr Ser Ala Leu Ser Ser His Val Leu
Asn Lys 580 585 590
Glu Asp Leu Ile Glu Asp Leu Ser Gln Thr Asn Lys Asn Thr Glu Lys
595 600 605 Gly Leu Asp Asn
Ser Val Thr Ser Phe Thr Asn Glu Ser Thr Tyr Ser 610
615 620 Met Lys Tyr Pro Gly Ser Leu Ser
Ser Thr Val His Ser Glu Asn Ser 625 630
635 640 His Lys Glu Asn Ser Lys Lys Glu Ile Leu Pro Val
Ser Ser Cys Glu 645 650
655 Ser Ser Ile Phe Asp Tyr Glu Glu Asp Ile Pro Ser Val Thr Arg Gln
660 665 670 Val Pro Ser
Arg Lys Tyr Thr Asn Ile Arg Lys Ile Glu Lys Asp Ser 675
680 685 Pro Phe Ile His Met His Arg His
Pro Asn Glu Asn Thr Leu Gly Lys 690 695
700 Asn Ser Phe Asn Phe Ser Asp Leu Asn His Ser Lys Asn
Lys Val Ser 705 710 715
720 Ser Glu Gly Asn Glu Lys Gly Asn Ser Thr Ala Leu Ser Ser Leu Phe
725 730 735 Pro Ser Ser Phe
Thr Glu Asn Cys Glu Leu Leu Ser Cys Ser Gly Glu 740
745 750 Asn Arg Thr Met Val His Ser Leu Asn
Ser Thr Ala Asp Glu Ser Gly 755 760
765 Leu Asn Lys Leu Lys Ile Arg Tyr Glu Glu Phe Gln Glu His
Lys Thr 770 775 780
Glu Lys Pro Ser Leu Ser Gln Gln Ala Ala His Tyr Met Phe Phe Pro 785
790 795 800 Ser Val Val Leu Ser
Asn Cys Leu Thr Arg Pro Gln Lys Leu Ser Pro 805
810 815 Val Thr Tyr Lys Leu Gln Pro Gly Asn Lys
Pro Ser Arg Leu Lys Leu 820 825
830 Asn Lys Arg Lys Leu Ala Gly His Gln Glu Thr Ser Thr Lys Ser
Ser 835 840 845 Glu
Thr Gly Ser Thr Lys Asp Asn Phe Ile Gln Asn Asn Pro Cys Asn 850
855 860 Ser Asn Pro Glu Lys Asp
Asn Ala Leu Ala Ser Asp Leu Thr Lys Thr 865 870
875 880 Thr Arg Gly Ala Phe Glu Asn Lys Thr Pro Thr
Asp Gly Phe Ile Asp 885 890
895 Cys His Phe Gly Asp Gly Thr Leu Glu Thr Glu Gln Ser Phe Gly Leu
900 905 910 Tyr Gly
Asn Lys Tyr Thr Leu Arg Ala Lys Arg Lys Val Asn Tyr Glu 915
920 925 Thr Glu Asp Ser Glu Ser Ser
Phe Val Thr His Asn Ser Lys Ile Ser 930 935
940 Leu Pro His Pro Met Glu Ile Gly Glu Ser Leu Asp
Gly Thr Leu Lys 945 950 955
960 Ser Arg Lys Arg Arg Lys Met Ser Lys Lys Leu Pro Pro Val Ile Ile
965 970 975 Lys Tyr Ile
Ile Ile Asn Arg Phe Arg Gly Arg Lys Asn Met Leu Val 980
985 990 Lys Leu Gly Lys Ile Asp Ser Lys
Glu Lys Gln Val Ile Leu Thr Glu 995 1000
1005 Glu Lys Met Glu Leu Tyr Lys Lys Leu Ala Pro
Leu Lys Asp Phe 1010 1015 1020
Trp Pro Lys Val Pro Asp Ser Pro Ala Thr Lys Tyr Pro Ile Tyr
1025 1030 1035 Pro Leu Thr
Pro Lys Lys Ser His Arg Arg Lys Ser Lys His Lys 1040
1045 1050 Ser Ala Lys Lys Lys Thr Gly Lys
Gln Gln Arg Thr Asn Asn Glu 1055 1060
1065 Asn Ile Lys Arg Thr Leu Ser Phe Arg Lys Lys Arg Ser
His Ala 1070 1075 1080
Ile Leu Ser Pro Pro Ser Pro Ser Tyr Asn Ala Glu Thr Glu Asp 1085
1090 1095 Cys Asp Leu Asn Tyr
Ser Asp Val Met Ser Lys Leu Gly Phe Leu 1100 1105
1110 Ser Glu Arg Ser Thr Ser Pro Ile Asn Ser
Ser Pro Pro Arg Cys 1115 1120 1125
Trp Ser Pro Thr Asp Pro Arg Ala Glu Glu Ile Met Ala Ala Ala
1130 1135 1140 Glu Lys
Glu Ala Met Leu Phe Lys Gly Pro Asn Val Tyr Lys Lys 1145
1150 1155 Thr Val Asn Ser Arg Ile Gly
Lys Thr Ser Arg Ala Arg Ala Gln 1160 1165
1170 Ile Lys Lys Ser Lys Ala Lys Leu Ala Asn Pro Ser
Ile Val Thr 1175 1180 1185
Lys Lys Arg Asn Lys Arg Asn Gln Thr Asn Lys Leu Val Asp Asp 1190
1195 1200 Gly Lys Lys Lys Pro
Arg Ala Lys Gln Lys Thr Asn Glu Lys Gly 1205 1210
1215 Thr Ser Arg Lys His Thr Thr Leu Lys Asp
Glu Lys Ile Lys Ser 1220 1225 1230
Gln Ser Gly Ala Glu Val Lys Phe Val Leu Lys His Gln Asn Val
1235 1240 1245 Ser Glu
Phe Ala Ser Ser Ser Gly Gly Ser Gln Leu Leu Phe Lys 1250
1255 1260 Gln Lys Asp Met Pro Leu Met
Gly Ser Ala Val Asp His Pro Leu 1265 1270
1275 Ser Ala Ser Leu Pro Thr Gly Ile Asn Ala Gln Gln
Lys Leu Ser 1280 1285 1290
Gly Cys Phe Ser Ser Phe Leu Glu Ser Lys Lys Ser Val Asp Leu 1295
1300 1305 Gln Thr Phe Pro Ser
Ser Arg Asp Asp Leu His Pro Ser Val Val 1310 1315
1320 Cys Asn Ser Ile Gly Pro Gly Val Ser Lys
Ile Asn Val Gln Arg 1325 1330 1335
Pro His Asn Gln Ser Ala Met Phe Thr Leu Lys Glu Ser Thr Leu
1340 1345 1350 Ile Gln
Lys Asn Ile Phe Asp Leu Ser Asn His Leu Ser Gln Val 1355
1360 1365 Ala Gln Asn Thr Gln Ile Ser
Ser Gly Met Ser Ser Lys Ile Glu 1370 1375
1380 Asp Asn Ala Asn Asn Ile Gln Arg Asn Tyr Leu Ser
Ser Ile Gly 1385 1390 1395
Lys Leu Ser Glu Tyr Arg Asn Ser Leu Glu Ser Lys Leu Asp Gln 1400
1405 1410 Ala Tyr Thr Pro Asn
Phe Leu His Cys Lys Asp Ser Gln Gln Gln 1415 1420
1425 Ile Val Cys Ile Ala Glu Gln Ser Lys His
Ser Glu Thr Cys Ser 1430 1435 1440
Pro Gly Asn Thr Ala Ser Glu Glu Ser Gln Met Pro Asn Asn Cys
1445 1450 1455 Phe Val
Thr Ser Leu Arg Ser Pro Ile Lys Gln Ile Ala Trp Glu 1460
1465 1470 Gln Lys Gln Arg Gly Phe Ile
Leu Asp Met Ser Asn Phe Lys Pro 1475 1480
1485 Glu Arg Val Lys Pro Arg Ser Leu Ser Glu Ala Ile
Ser Gln Thr 1490 1495 1500
Lys Ala Leu Ser Gln Cys Lys Asn Arg Asn Val Ser Thr Pro Ser 1505
1510 1515 Ala Phe Gly Glu Gly
Gln Ser Gly Leu Ala Val Leu Lys Glu Leu 1520 1525
1530 Leu Gln Lys Arg Gln Gln Lys Ala Gln Asn
Ala Asn Thr Thr Gln 1535 1540 1545
Asp Pro Leu Ser Asn Lys His Gln Pro Asn Lys Asn Ile Ser Gly
1550 1555 1560 Ser Leu
Glu His Asn Lys Ala Asn Lys Arg Thr Arg Ser Val Thr 1565
1570 1575 Ser Pro Arg Lys Pro Arg Thr
Pro Arg Ser Thr Lys Gln Lys Glu 1580 1585
1590 Lys Ile Pro Lys Leu Leu Lys Val Asp Ser Leu Asn
Leu Gln Asn 1595 1600 1605
Ser Ser Gln Leu Asp Asn Ser Val Ser Asp Asp Ser Pro Ile Phe 1610
1615 1620 Phe Ser Asp Pro Gly
Phe Glu Ser Cys Tyr Ser Leu Glu Asp Ser 1625 1630
1635 Leu Ser Pro Glu His Asn Tyr Asn Phe Asp
Ile Asn Thr Ile Gly 1640 1645 1650
Gln Thr Gly Phe Cys Ser Phe Tyr Ser Gly Ser Gln Phe Val Pro
1655 1660 1665 Ala Asp
Gln Asn Leu Pro Gln Lys Phe Leu Ser Asp Ala Val Gln 1670
1675 1680 Asp Leu Phe Pro Gly Gln Ala
Ile Glu Lys Asn Glu Phe Leu Ser 1685 1690
1695 His Asp Asn Gln Lys Cys Asp Glu Asp Lys His His
Thr Thr Asp 1700 1705 1710
Ser Ala Ser Trp Ile Arg Ser Gly Thr Leu Ser Pro Glu Ile Phe 1715
1720 1725 Glu Lys Ser Thr Ile
Asp Ser Asn Glu Asn Arg Arg His Asn Gln 1730 1735
1740 Trp Lys Asn Ser Phe His Pro Leu Thr Thr
Arg Ser Asn Ser Ile 1745 1750 1755
Met Asp Ser Phe Cys Val Gln Gln Ala Glu Asp Cys Leu Ser Glu
1760 1765 1770 Lys Ser
Arg Leu Asn Arg Ser Ser Val Ser Lys Glu Val Phe Leu 1775
1780 1785 Ser Leu Pro Gln Pro Asn Asn
Ser Asp Trp Ile Gln Gly His Thr 1790 1795
1800 Arg Lys Glu Met Gly Gln Ser Leu Asp Ser Ala Asn
Thr Ser Phe 1805 1810 1815
Thr Ala Ile Leu Ser Ser Pro Asp Gly Glu Leu Val Asp Val Ala 1820
1825 1830 Cys Glu Asp Leu Glu
Leu Tyr Val Ser Arg Asn Asn Asp Met Leu 1835 1840
1845 Thr Pro Thr Pro Asp Ser Ser Pro Arg Ser
Thr Ser Ser Pro Ser 1850 1855 1860
Gln Ser Lys Asn Gly Ser Phe Thr Pro Arg Thr Ala Asn Ile Leu
1865 1870 1875 Lys Pro
Leu Met Ser Pro Pro Ser Arg Glu Glu Ile Met Ala Thr 1880
1885 1890 Leu Leu Asp His Asp Leu Ser
Glu Thr Ile Tyr Gln Glu Pro Phe 1895 1900
1905 Cys Ser Asn Pro Ser Asp Val Pro Glu Lys Pro Arg
Glu Ile Gly 1910 1915 1920
Gly Arg Leu Leu Met Val Glu Thr Arg Leu Ala Asn Asp Leu Ala 1925
1930 1935 Glu Phe Glu Gly Asp
Phe Ser Leu Glu Gly Leu Arg Leu Trp Lys 1940 1945
1950 Thr Ala Phe Ser Ala Met Thr Gln Asn Pro
Arg Pro Gly Ser Pro 1955 1960 1965
Leu Arg Ser Gly Gln Gly Val Val Asn Lys Gly Ser Ser Asn Ser
1970 1975 1980 Pro Lys
Met Val Glu Asp Lys Lys Ile Val Ile Met Pro Cys Lys 1985
1990 1995 Cys Ala Pro Ser Arg Gln Leu
Val Gln Val Trp Leu Gln Ala Lys 2000 2005
2010 Glu Glu Tyr Glu Arg Ser Lys Lys Leu Pro Lys Thr
Lys Pro Thr 2015 2020 2025
Gly Val Val Lys Ser Ala Glu Asn Phe Ser Ser Ser Val Asn Pro 2030
2035 2040 Asp Asp Lys Pro Val
Val Pro Pro Lys Met Asp Val Ser Pro Cys 2045 2050
2055 Ile Leu Pro Thr Thr Ala His Thr Lys Glu
Asp Val Asp Asn Ser 2060 2065 2070
Gln Ile Ala Leu Gln Ala Pro Thr Thr Gly Cys Ser Gln Thr Ala
2075 2080 2085 Ser Glu
Ser Gln Met Leu Pro Pro Val Ala Ser Ala Ser Asp Pro 2090
2095 2100 Glu Lys Asp Glu Asp Asp Asp
Asp Asn Tyr Tyr Ile Ser Tyr Ser 2105 2110
2115 Ser Pro Asp Ser Pro Val Ile Pro Pro Trp Gln Gln
Pro Ile Ser 2120 2125 2130
Pro Asp Ser Lys Ala Leu Asn Gly Asp Asp Arg Pro Ser Ser Pro 2135
2140 2145 Val Glu Glu Leu Pro
Ser Leu Ala Phe Glu Asn Phe Leu Lys Pro 2150 2155
2160 Ile Lys Asp Gly Ile Gln Lys Ser Pro Cys
Ser Glu Pro Gln Glu 2165 2170 2175
Pro Leu Val Ile Ser Pro Ile Asn Thr Arg Ala Arg Thr Gly Lys
2180 2185 2190 Cys Glu
Ser Leu Cys Phe His Ser Thr Pro Ile Ile Gln Arg Lys 2195
2200 2205 Leu Leu Glu Arg Leu Pro Glu
Ala Pro Gly Leu Ser Pro Leu Ser 2210 2215
2220 Thr Glu Pro Lys Thr Gln Lys Leu Ser Asn Lys Lys
Gly Ser Asn 2225 2230 2235
Thr Asp Thr Leu Arg Arg Val Leu Leu Thr Gln Ala Lys Asn Gln 2240
2245 2250 Phe Ala Ala Val Asn
Thr Pro Gln Lys Glu Thr Ser Gln Ile Asp 2255 2260
2265 Gly Pro Ser Leu Asn Asn Thr Tyr Gly Phe
Lys Val Ser Ile Gln 2270 2275 2280
Asn Leu Gln Glu Ala Lys Ala Leu His Glu Ile Gln Asn Leu Thr
2285 2290 2295 Leu Ile
Ser Val Glu Leu His Ala Arg Thr Arg Arg Asp Leu Glu 2300
2305 2310 Pro Asp Pro Glu Phe Asp Pro
Ile Cys Ala Leu Phe Tyr Cys Ile 2315 2320
2325 Ser Ser Asp Thr Pro Leu Pro Asp Thr Glu Lys Thr
Glu Leu Thr 2330 2335 2340
Gly Val Ile Val Ile Asp Lys Asp Lys Thr Val Phe Ser Gln Asp 2345
2350 2355 Ile Arg Tyr Gln Thr
Pro Leu Leu Ile Arg Ser Gly Ile Thr Gly 2360 2365
2370 Leu Glu Val Thr Tyr Ala Ala Asp Glu Lys
Ala Leu Phe His Glu 2375 2380 2385
Ile Ala Asn Ile Ile Lys Arg Tyr Asp Pro Asp Ile Leu Leu Gly
2390 2395 2400 Tyr Glu
Ile Gln Met His Ser Trp Gly Tyr Leu Leu Gln Arg Ala 2405
2410 2415 Ala Ala Leu Ser Ile Asp Leu
Cys Arg Met Ile Ser Arg Val Pro 2420 2425
2430 Asp Asp Lys Ile Glu Asn Arg Phe Ala Ala Glu Arg
Asp Glu Tyr 2435 2440 2445
Gly Ser Tyr Thr Met Ser Glu Ile Asn Ile Val Gly Arg Ile Thr 2450
2455 2460 Leu Asn Leu Trp Arg
Ile Met Arg Asn Glu Val Ala Leu Thr Asn 2465 2470
2475 Tyr Thr Phe Glu Asn Val Ser Phe His Val
Leu His Gln Arg Phe 2480 2485 2490
Pro Leu Phe Thr Phe Arg Val Leu Ser Asp Trp Phe Asp Asn Lys
2495 2500 2505 Thr Asp
Leu Tyr Arg Trp Lys Met Val Asp His Tyr Val Ser Arg 2510
2515 2520 Val Arg Gly Asn Leu Gln Met
Leu Glu Gln Leu Asp Leu Ile Gly 2525 2530
2535 Lys Thr Ser Glu Met Ala Arg Leu Phe Gly Ile Gln
Phe Leu His 2540 2545 2550
Val Leu Thr Arg Gly Ser Gln Tyr Arg Val Glu Ser Met Met Leu 2555
2560 2565 Arg Ile Ala Lys Pro
Met Asn Tyr Ile Pro Val Thr Pro Ser Val 2570 2575
2580 Gln Gln Arg Ser Gln Met Arg Ala Pro Gln
Cys Val Pro Leu Ile 2585 2590 2595
Met Glu Pro Glu Ser Arg Phe Tyr Ser Asn Ser Val Leu Val Leu
2600 2605 2610 Asp Phe
Gln Ser Leu Tyr Pro Ser Ile Val Ile Ala Tyr Asn Tyr 2615
2620 2625 Cys Phe Ser Thr Cys Leu Gly
His Val Glu Asn Leu Gly Lys Tyr 2630 2635
2640 Asp Glu Phe Lys Phe Gly Cys Thr Ser Leu Arg Val
Pro Pro Asp 2645 2650 2655
Leu Leu Tyr Gln Val Arg His Asp Ile Thr Val Ser Pro Asn Gly 2660
2665 2670 Val Ala Phe Val Lys
Pro Ser Val Arg Lys Gly Val Leu Pro Arg 2675 2680
2685 Met Leu Glu Glu Ile Leu Lys Thr Arg Phe
Met Val Lys Gln Ser 2690 2695 2700
Met Lys Ala Tyr Lys Gln Asp Arg Ala Leu Ser Arg Met Leu Asp
2705 2710 2715 Ala Arg
Gln Leu Gly Leu Lys Leu Ile Ala Asn Val Thr Phe Gly 2720
2725 2730 Tyr Thr Ser Ala Asn Phe Ser
Gly Arg Met Pro Cys Ile Glu Val 2735 2740
2745 Gly Asp Ser Ile Val His Lys Ala Arg Glu Thr Leu
Glu Arg Ala 2750 2755 2760
Ile Lys Leu Val Asn Asp Thr Lys Lys Trp Gly Ala Arg Val Val 2765
2770 2775 Tyr Gly Asp Thr Asp
Ser Met Phe Val Leu Leu Lys Gly Ala Thr 2780 2785
2790 Lys Glu Gln Ser Phe Lys Ile Gly Gln Glu
Ile Ala Glu Ala Val 2795 2800 2805
Thr Ala Thr Asn Pro Lys Pro Val Lys Leu Lys Phe Glu Lys Val
2810 2815 2820 Tyr Leu
Pro Cys Val Leu Gln Thr Lys Lys Arg Tyr Val Gly Tyr 2825
2830 2835 Met Tyr Glu Thr Leu Asp Gln
Lys Asp Pro Val Phe Asp Ala Lys 2840 2845
2850 Gly Ile Glu Thr Val Arg Arg Asp Ser Cys Pro Ala
Val Ser Lys 2855 2860 2865
Ile Leu Glu Arg Ser Leu Lys Leu Leu Phe Glu Thr Arg Asp Ile 2870
2875 2880 Ser Leu Ile Lys Gln
Tyr Val Gln Arg Gln Cys Met Lys Leu Leu 2885 2890
2895 Glu Gly Lys Ala Ser Ile Gln Asp Phe Ile
Phe Ala Lys Glu Tyr 2900 2905 2910
Arg Gly Ser Phe Ser Tyr Lys Pro Gly Ala Cys Val Pro Ala Leu
2915 2920 2925 Glu Leu
Thr Arg Lys Met Leu Thr Tyr Asp Arg Arg Ser Glu Pro 2930
2935 2940 Gln Val Gly Glu Arg Val Pro
Tyr Val Ile Ile Tyr Gly Thr Pro 2945 2950
2955 Gly Val Pro Leu Ile Gln Leu Val Arg Arg Pro Val
Glu Val Leu 2960 2965 2970
Gln Asp Pro Thr Leu Arg Leu Asn Ala Thr Tyr Tyr Ile Thr Lys 2975
2980 2985 Gln Ile Leu Pro Pro
Leu Ala Arg Ile Phe Ser Leu Ile Gly Ile 2990 2995
3000 Asp Val Phe Ser Trp Tyr His Glu Leu Pro
Arg Ile His Lys Ala 3005 3010 3015
Thr Ser Ser Ser Arg Ser Glu Pro Glu Gly Arg Lys Gly Thr Ile
3020 3025 3030 Ser Gln
Tyr Phe Thr Thr Leu His Cys Pro Val Cys Asp Asp Leu 3035
3040 3045 Thr Gln His Gly Ile Cys Ser
Lys Cys Arg Ser Gln Pro Gln His 3050 3055
3060 Val Ala Val Ile Leu Asn Gln Glu Ile Arg Glu Leu
Glu Arg Gln 3065 3070 3075
Gln Glu Gln Leu Val Lys Ile Cys Lys Asn Cys Thr Gly Cys Phe 3080
3085 3090 Asp Arg His Ile Pro
Cys Val Ser Leu Asn Cys Pro Val Leu Phe 3095 3100
3105 Lys Leu Ser Arg Val Asn Arg Glu Leu Ser
Lys Ala Pro Tyr Leu 3110 3115 3120
Arg Gln Leu Leu Asp Gln Phe 3125 3130
10 10719 DNA Artificial Sequence REV3L sequence 10
catcatcatg gcaacaagag
ctgcagcctg ggaccgagga gcccgtgtga ttcccggcgg 60tggcggcagt ggcggcagca
ccagcaccga cgaaagctcg agggcttctc tcctgcggcc 120ccttgccggg tgctcctgag
gaggcggcgg cagcagcgcc tacaccgccc cgcccgccgc 180tcctcgaggt gcctctgtgt
gaggggaggg ggccgtgccg agaaggggag ggggcgccgc 240cgccgctgcg gagggagccg
ccgccgctgc tgctgccgct gccgggtcgc cagtgaaggg 300aggcagtggc ggcggcggcg
aacatgtttt cagtaaggat agtgactgca gactactaca 360tggccagccc gctgcagggg
ctggatacct gccaatcccc cctcacccag gcccctgtca 420agaaggtgcc ggtggtgcga
gtcttcggag cgaccccggc aggtcagaag acatgtcttc 480atctacatgg catctttcct
tacctctatg tgccatacga tggttatgga cagcagccag 540aaagctatct ttctcagatg
gcattcagta tcgacagagc acttaatgtg gctttaggca 600atccatcttc cactgctcag
catgtgttca aagtgtcatt agtatcagga atgccttttt 660atggttatca tgagaaggaa
agacacttta tgaagatcta tctttacaat cctacaatgg 720tgaaaaggat atgtgaactt
ttgcaaagcg gagccataat gaataaattt taccagcctc 780atgaagcgca tattccctac
ctcctacagc tcttcattga ctacaatctt tatggcatga 840atttaataaa tctggctgct
gtcaagttcc gaaaagcaag aaggaaaagt aatacattgc 900atgcaactgg atcctgcaag
aatcatttat caggaaattc tcttgctgat actttatttc 960ggtgggaaca agatgaaata
ccaagctctt taatattgga aggtgttgaa ccacagagta 1020catgtgaatt agaagtggat
gctgtagctg ctgatatctt aaatcgtctg gacattgaag 1080ctcaaattgg tggaaaccct
ggtctacagg ccatatggga agatgaaaag caacggcgaa 1140gaaacagaaa tgaaacttct
caaatgagcc aacctgagtc acaagatcac aggtttgtgc 1200cagcaacaga aagtgaaaaa
aaatttcaga agagacttca ggaaattctc aaacagaatg 1260atttctctgt aacattatca
ggatctgtgg actacagcga tggatcccag gagttctctg 1320ctgagttaac attgcactct
gaggttctgt ctcctgaaat gcttcagtgt acaccagcca 1380atatggtaga agttcacaaa
gacaaagagt caagcaaagg tcacactaga cacaaagtgg 1440aagaagctct tattaatgaa
gaagcaattt tgaaccttat ggaaaatagt cagacttttc 1500agcctttgac ccaaagactg
agtgagtcac ctgttttcat ggacagtagt cctgatgagg 1560ctctggtaca tcttcttgct
ggtttggaaa gtgatggata tcggggggaa agaaatagga 1620tgccatcacc atgtcgctcc
tttggaaata ataaatatcc acaaaatagt gatgatgaag 1680aaaatgaacc acagattgaa
aaagaggaaa tggagcttag tttggtgatg tcccagagat 1740gggacagcaa tattgaagaa
cattgtgcca aaaagagatc actgtgcaga aatacccaca 1800gaagttcaac tgaagatgat
gactcatctt caggagaaga aatggaatgg agtgataaca 1860gtttgcttct agccagtctt
tctatacctc agttagatgg aactgcagat gaaaatagtg 1920acaatccatt gaacaatgaa
aattctagaa cccactcttc tgtaattgca acaagcaagc 1980tttcagttaa accctccatc
tttcacaaag atgctgctac attagaaccc tcatcttctg 2040ctaagattac ctttcagtgt
aaacacacaa gtgccctttc ttcccatgtt ttgaacaagg 2100aagatttaat tgaagacctt
tcacagacaa acaaaaatac agaaaaaggt ctagataact 2160cagtcacttc ttttacaaac
gaaagcactt attctatgaa ataccctgga tctttaagca 2220gtactgttca ttcagaaaat
tctcataaag agaatagtaa gaaagagatc ctcccagtat 2280cttcctgtga aagtagtatt
tttgattatg aagaagatat tccatctgtt acaagacaag 2340taccaagtag aaaatataca
aacattagaa aaatcgaaaa ggattcccct tttatacata 2400tgcaccgtca ccctaacgag
aatacattgg gcaaaaattc tttcaacttt tctgacttaa 2460atcattcaaa aaataaagta
tcctctgaag gaaatgaaaa aggaaacagc acagctctga 2520gtagtttatt cccttcatca
tttactgaaa attgtgaatt actgtcatgc tcaggggaga 2580atagaactat ggtgcattct
cttaatagca ctgctgatga aagtggacta aataaactta 2640aaattaggta tgaagaattt
caagaacata aaacagaaaa gccaagcctc agccagcaag 2700cagcacacta tatgtttttt
cccagtgttg ttctttctaa ctgtcttact agaccacaga 2760aactatctcc tgtcacatat
aaattacaac ctggcaataa accatcccgg ttaaaattga 2820ataaaaggaa acttgcaggt
catcaggaga cttctaccaa aagtagtgag actggatcca 2880caaaagataa ttttatacaa
aataatcctt gtaatagtaa tcctgagaag gataatgcat 2940tggctagtga tttaactaaa
accactcgtg gagcttttga aaataaaaca cccacagatg 3000gttttataga ctgtcacttt
ggagatggaa cgttagaaac tgagcagtcc tttggactat 3060atggaaataa atacacactt
agagccaaac gcaaggtaaa ttatgagact gaagacagtg 3120agtcaagttt tgtaactcac
aactcaaaaa ttagtctacc tcatcccatg gaaattggtg 3180aaagtttaga tggaactctc
aaatcccgaa aacgaagaaa aatgtctaaa aagctgcccc 3240ctgtcatcat aaagtatatt
attattaata gatttagagg gagaaaaaat atgcttgtga 3300agctaggaaa aatagactct
aaagaaaaac aagtaatatt aacagaagaa aaaatggaac 3360tatataaaaa gcttgcacct
ttgaaggact tttggccaaa agttcccgac tcccctgcaa 3420ccaaatatcc catttatcca
ctaacaccaa agaaaagtca cagaagaaag tcaaaacata 3480aatctgctaa gaaaaaaact
ggtaaacaac aaaggacaaa taatgaaaat attaaaagaa 3540ctttgtcttt caggaaaaaa
cggtcacatg ctattctttc tcctccctca ccatcttaca 3600atgctgaaac cgaagattgt
gacctgaatt atagtgatgt tatgtctaaa ctaggttttc 3660tttctgagag aagcacaagt
cccataaatt cttctccacc tcgctgctgg tctcccacag 3720atccaagagc tgaagaaatc
atggctgctg cagaaaaaga ggcaatgctt tttaagggtc 3780ctaatgtata taagaagact
gttaattctc gtataggaaa aactagtcgc gcaagagcac 3840agattaagaa atcaaaagca
aagcttgcta atccctctat agttactaag aaaaggaaca 3900aacgaaatca gacaaataaa
ctagtagatg atggaaaaaa gaaaccaaga gcaaaacaaa 3960aaacaaatga gaaaggtaca
tcgagaaagc atacaacact taaggatgaa aaaataaaat 4020ctcagtctgg tgctgaggtt
aagtttgtac tgaaacacca gaatgtgtct gaatttgcaa 4080gtagttctgg aggctctcaa
ctacttttta aacagaaaga tatgccacta atgggctctg 4140ctgtagatca tcccctttct
gcttccctac ccactggaat taatgcacaa cagaagttat 4200ctggctgctt ttcttctttc
ttagaaagca agaagtctgt agatttgcag acattcccca 4260gttcacgaga tgatttgcat
ccatcagttg tttgtaattc tataggacct ggagtctcaa 4320aaattaatgt tcaaaggcct
cataatcaaa gtgctatgtt tactctaaag gaatcaacgt 4380taattcaaaa aaatatattt
gacctttcca atcatttatc tcaggtagca cagaatacac 4440agatatcttc tggtatgtcc
tcaaagatag aagataatgc aaataatata caaagaaact 4500atttgtcatc aatcggaaag
ttaagtgaat atcgcaattc cctagaatca aagctggacc 4560aagcatatac ccctaatttt
ttgcattgca aagacagtca gcagcagatt gtgtgcatag 4620cggaacagtc aaagcacagt
gaaacttgtt ctccgggaaa tacagcttca gaggaaagcc 4680aaatgcctaa taattgcttt
gtaacttcct tgagaagtcc aatcaaacaa atagcatggg 4740agcaaaagca aaggggcttt
attttagata tgtcaaattt taaacctgaa agagtaaaac 4800cgaggtcgtt atcagaagca
atttcacaaa ccaaagcact ttctcagtgt aaaaatcgaa 4860atgtgtcaac accttcagca
tttggtgaag gacagtctgg actggcagtt ctaaaagaat 4920tgttacaaaa aagacagcag
aaagcacaaa atgcaaatac tacacaagac ccattatcca 4980ataaacatca accaaataaa
aatatttctg gttcccttga gcataacaaa gcaaataaac 5040ggacacgatc ggtaacgtcc
ccaagaaaac ctcgaactcc cagaagtaca aaacaaaaag 5100aaaaaatccc caaacttctc
aaagtagact ctttaaattt acaaaactct agccagttgg 5160ataactctgt atcagatgat
agtcccatct ttttttcaga tccaggcttt gaaagttgtt 5220actcacttga agatagttta
tctcctgaac ataattataa ttttgatatt aacacaatag 5280gtcagactgg attttgtagc
ttttattctg gaagtcagtt tgtcccagct gatcagaatt 5340tgcctcagaa gttcctaagt
gatgctgttc aggatctttt tccaggacaa gctatagaaa 5400aaaatgagtt tttaagtcat
gacaaccaga aatgtgatga agacaagcat cataccacag 5460actcagcctc atggattaga
tctggtactt taagtcctga aatttttgag aagtcaacca 5520tagatagcaa tgagaatcgt
cgccacaacc agtggaaaaa tagctttcat cctctaacaa 5580ctcggtctaa ctcaataatg
gattctttct gtgttcagca ggcagaagac tgtctaagtg 5640aaaaatctag attgaatagg
agttcagtaa gcaaagaagt gtttcttagc ctcccacagc 5700caaacaattc agactggatt
caaggtcaca ccagaaaaga aatgggacag tctcttgact 5760cagccaatac ctcttttact
gcaatactct cctcccctga tggtgaactt gtagacgtgg 5820cctgtgaaga tttagaactg
tatgtttcaa gaaacaatga tatgttgaca ccaactcctg 5880atagttcacc aagatctact
agctctcctt cacaatctaa aaatggcagc ttcacccctc 5940gaactgctaa cattctgaaa
ccacttatgt cccccccaag tagggaagaa attatggcaa 6000ctttgttgga tcatgacctg
tctgagacta tttaccagga accattttgc agtaatcctt 6060ctgatgtacc agaaaagccc
agggagattg gtggacggct cctcatggta gaaactcgac 6120ttgcaaatga tctggctgag
tttgagggag acttttcctt ggaaggactt cgtctttgga 6180aaacagcatt ctcagcaatg
actcagaatc caaggccagg gtcacccctt cgcagtggcc 6240aaggagttgt caataaaggg
tcaagtaata gccctaagat ggttgaagat aaaaaaattg 6300tgattatgcc ttgcaaatgt
gccccaagtc gacaactggt tcaagtgtgg cttcaagcca 6360aagaagaata cgaacgttcc
aagaaactgc ctaaaaccaa gccaactgga gttgtaaaat 6420ctgctgagaa ctttagctct
tcagttaacc cagatgacaa acctgtagtg cctccaaaaa 6480tggatgtaag tccatgtata
ctccccacta cagcacatac caaggaggat gttgataatt 6540ctcagattgc tttacaagca
ccaaccacgg gatgtagtca aactgcaagt gaaagtcaga 6600tgctgccacc agttgcctct
gcaagtgatc ccgaaaaaga tgaagatgat gatgataact 6660attacattag ttatagctcc
cctgattctc cagtaattcc cccttggcaa caaccaatat 6720ccccagattc caaagcatta
aatggagatg atagaccctc atcaccagta gaggagctgc 6780cttcattggc ttttgagaac
ttcttaaagc caataaaaga tggtatacaa aaaagcccct 6840gcagtgagcc tcaagagcct
ctagtgatat ctccaattaa tactagggca agaactggga 6900aatgtgaatc actttgcttt
catagtacac caatcataca gagaaaactt ctggaaaggc 6960ttcctgaagc acctggcctt
agcccattat caacagaacc aaaaacacag aagttgagta 7020ataagaaagg aagtaatact
gacactctta gaagagtact gttaacacaa gcaaagaatc 7080aatttgcagc agtaaatacc
ccacagaaag aaacttctca gattgatgga ccatctttaa 7140acaatactta cggtttcaaa
gtcagcatac aaaacttaca ggaggcaaaa gctttacatg 7200agatacaaaa tcttacccta
atcagtgtgg agttgcatgc tcgaactaga cgagacttag 7260aaccggatcc tgaatttgac
ccaatctgtg ctctgttcta ctgcatctca tctgacactc 7320cactgccaga tacagaaaaa
acagaactca caggtgtaat agtgattgat aaagacaaga 7380cagttttcag tcaagatatc
agatatcaga ctccattact tattagatct ggaattacag 7440gactcgaagt cacctatgct
gctgatgaga aggcactttt tcatgaaatt gcaaatataa 7500taaagaggta tgatcctgat
attctgctag gatatgagat tcagatgcat tcctggggtt 7560acctcttaca aagggctgcc
gctttaagta ttgacttatg tcggatgatc tctcgggtgc 7620cagatgacaa aattgagaac
agatttgcag ctgaaagaga tgagtatgga tcatatacaa 7680tgagtgagat aaatattgtt
ggccgaatta cactaaatct ttggagaatc atgagaaatg 7740aggtggctct aactaactac
acctttgaaa atgtgagctt tcatgttctt catcagcgtt 7800ttcccctctt tacctttcga
gtcttgtcag actggtttga taacaagaca gatctataca 7860gatggaaaat ggttgatcat
tatgttagcc gtgtccgtgg aaatctccaa atgttagaac 7920agctggacct gattgggaaa
accagtgaga tggctagact ttttggcatt cagtttttac 7980atgtactgac aaggggttca
cagtaccgtg tggaatcaat gatgttgcgt attgctaaac 8040caatgaacta tattcctgtg
acacctagtg ttcagcaaag atcccagatg agagccccac 8100agtgtgttcc tctaattatg
gagcctgaat cccgcttcta tagcaactct gttctcgttt 8160tggatttcca atcactttat
ccttctattg tgattgcata taactactgc ttttccacct 8220gccttggcca tgtggagaac
ttgggaaagt atgatgagtt caaatttggc tgtacctctc 8280tgagagtacc tccagattta
ctttaccaag ttaggcatga tatcacagtg tcccccaatg 8340gagtagcttt tgtcaagcct
tcagtaagaa aaggtgtact accaagaatg cttgaagaaa 8400ttttgaagac tagatttatg
gtgaagcagt caatgaaggc ttacaagcaa gacagagccc 8460tgtcacgaat gcttgatgcg
cgtcagttgg gacttaagct gatagcaaat gtcacatttg 8520gctatacatc tgctaatttt
tctgggagaa tgccatgcat tgaggttggc gatagtattg 8580ttcacaaagc cagagagacc
ttggaacgag ctattaaact ggtgaatgat accaagaaat 8640ggggggctag ggttgtatat
ggcgatactg acagtatgtt tgtgctactg aaaggagcca 8700ctaaggagca gtcttttaag
attggtcagg aaattgccga agctgtaact gctaccaatc 8760ctaaaccagt gaaattgaag
tttgaaaagg tatatttgcc ctgtgtttta caaacaaaaa 8820agaggtatgt gggttacatg
tatgaaacac tggatcagaa ggacccagta tttgatgcaa 8880aaggaataga aacagtcaga
agagattcct gccctgctgt ttctaagata cttgagcgtt 8940ctctaaagct gctatttgaa
acgagagata taagtctaat taaacagtat gttcagcgac 9000aatgtatgaa gcttctggaa
ggaaaggcca gcatacaaga ctttatcttt gccaaggaat 9060acagaggaag tttttcttat
aaaccaggag cttgtgtgcc agcccttgaa cttacaagga 9120aaatgctgac ttatgaccgg
cgctctgagc ctcaggttgg ggagcgagtg ccatacgtca 9180tcatttatgg gacccccgga
gtaccactta tccagcttgt aaggcgccca gtggaagtcc 9240tgcaggaccc aactctgaga
ctgaatgcta cttactatat taccaagcaa atccttccac 9300ccttggcaag aatcttctca
cttattggta ttgatgtctt cagctggtat catgaattac 9360caaggatcca taaagctacc
agctcctcgc gaagtgaacc tgaagggcgg aaaggcacta 9420tttcacaata ttttactacc
ttacactgtc ctgtgtgtga tgacctaact cagcatggca 9480tctgtagtaa atgtcggagc
caacctcagc atgttgcagt catcctcaac caagaaatcc 9540gggagttgga acgtcaacag
gagcaacttg taaagatatg caagaactgt acaggttgct 9600ttgatcgaca catcccatgt
gtttctctga actgcccagt acttttcaaa ctctcccgag 9660taaatagaga attgtccaag
gcaccatatc tccggcagtt attagaccag ttttaaattg 9720tcaatatcac agtattacag
gtgctatttt tttcagtgct taccactaaa ctgttgtgca 9780tggtgctttt taactttcat
cgagtcaagg atgttcactg tctgttatct gaagactatg 9840aagacttcta tgctaaccga
attaaaatgt acttgttgat ctctgaatag ctcacttctt 9900acaatgtaca aattcctcat
tctgtcacct tttaaacatt gttttataat gcaggtgttg 9960gatttgctcc agtatgtgta
ccatcttgta aattcatttg agtagatcat gtttacttcc 10020cagtggaagg agcactgaaa
acctcttaaa gaaaaagcat ttgtgtgttt tccttgaact 10080gtctgtatca agacgtgtta
cttcgagata tccattcact ttataatttt gactgcaaaa 10140tattttgtaa atacactttt
ttacttttca aacgagcaaa ataatgtgca atgattttta 10200tacaaatgat tttcaagttg
tttggtatat ttcctctagg ttttgcttga ctcaaagtag 10260atcgttattt tgatcaaact
gtgcaaacag tagtaccacg tgtagcattt tgaaacatta 10320ttttttttta aaaaatgctg
tcttgcttta gctattaatg gggcattgtg aggaactgtg 10380caaagacatt tttgttacaa
acctgtgggc ctgttgcaat actttaaaaa taaaaaattt 10440tattccattt gcttgttttg
tatagacatt tctattgctt ctaaatatac ttaaaatatt 10500ttctttcctt atgtactgta
cagttaatct tatttgccat catcttgaac acaaaatgtg 10560tatttagaat atttgtataa
ctgtgtaaaa taaaaaagga attatgtggt cagtgcattg 10620ttttttaaac tggaaatcat
tttgttttaa aagttaataa tggaaaccat attaaaattg 10680aataaaatat aaaataatat
aaaaaaaaaa aaaaaaaaa 10719
11 1014 PRT Artificial Sequence PARP1 sequence 11
Met Ala Glu Ser Ser Asp Lys Leu Tyr Arg Val Glu
Tyr Ala Lys Ser 1 5 10
15 Gly Arg Ala Ser Cys Lys Lys Cys Ser Glu Ser Ile Pro Lys Asp Ser
20 25 30 Leu Arg Met
Ala Ile Met Val Gln Ser Pro Met Phe Asp Gly Lys Val 35
40 45 Pro His Trp Tyr His Phe Ser Cys
Phe Trp Lys Val Gly His Ser Ile 50 55
60 Arg His Pro Asp Val Glu Val Asp Gly Phe Ser Glu Leu
Arg Trp Asp 65 70 75
80 Asp Gln Gln Lys Val Lys Lys Thr Ala Glu Ala Gly Gly Val Thr Gly
85 90 95 Lys Gly Gln Asp
Gly Ile Gly Ser Lys Ala Glu Lys Thr Leu Gly Asp 100
105 110 Phe Ala Ala Glu Tyr Ala Lys Ser Asn
Arg Ser Thr Cys Lys Gly Cys 115 120
125 Met Glu Lys Ile Glu Lys Gly Gln Val Arg Leu Ser Lys Lys
Met Val 130 135 140
Asp Pro Glu Lys Pro Gln Leu Gly Met Ile Asp Arg Trp Tyr His Pro 145
150 155 160 Gly Cys Phe Val Lys
Asn Arg Glu Glu Leu Gly Phe Arg Pro Glu Tyr 165
170 175 Ser Ala Ser Gln Leu Lys Gly Phe Ser Leu
Leu Ala Thr Glu Asp Lys 180 185
190 Glu Ala Leu Lys Lys Gln Leu Pro Gly Val Lys Ser Glu Gly Lys
Arg 195 200 205 Lys
Gly Asp Glu Val Asp Gly Val Asp Glu Val Ala Lys Lys Lys Ser 210
215 220 Lys Lys Glu Lys Asp Lys
Asp Ser Lys Leu Glu Lys Ala Leu Lys Ala 225 230
235 240 Gln Asn Asp Leu Ile Trp Asn Ile Lys Asp Glu
Leu Lys Lys Val Cys 245 250
255 Ser Thr Asn Asp Leu Lys Glu Leu Leu Ile Phe Asn Lys Gln Gln Val
260 265 270 Pro Ser
Gly Glu Ser Ala Ile Leu Asp Arg Val Ala Asp Gly Met Val 275
280 285 Phe Gly Ala Leu Leu Pro Cys
Glu Glu Cys Ser Gly Gln Leu Val Phe 290 295
300 Lys Ser Asp Ala Tyr Tyr Cys Thr Gly Asp Val Thr
Ala Trp Thr Lys 305 310 315
320 Cys Met Val Lys Thr Gln Thr Pro Asn Arg Lys Glu Trp Val Thr Pro
325 330 335 Lys Glu Phe
Arg Glu Ile Ser Tyr Leu Lys Lys Leu Lys Val Lys Lys 340
345 350 Gln Asp Arg Ile Phe Pro Pro Glu
Thr Ser Ala Ser Val Ala Ala Thr 355 360
365 Pro Pro Pro Ser Thr Ala Ser Ala Pro Ala Ala Val Asn
Ser Ser Ala 370 375 380
Ser Ala Asp Lys Pro Leu Ser Asn Met Lys Ile Leu Thr Leu Gly Lys 385
390 395 400 Leu Ser Arg Asn
Lys Asp Glu Val Lys Ala Met Ile Glu Lys Leu Gly 405
410 415 Gly Lys Leu Thr Gly Thr Ala Asn Lys
Ala Ser Leu Cys Ile Ser Thr 420 425
430 Lys Lys Glu Val Glu Lys Met Asn Lys Lys Met Glu Glu Val
Lys Glu 435 440 445
Ala Asn Ile Arg Val Val Ser Glu Asp Phe Leu Gln Asp Val Ser Ala 450
455 460 Ser Thr Lys Ser Leu
Gln Glu Leu Phe Leu Ala His Ile Leu Ser Pro 465 470
475 480 Trp Gly Ala Glu Val Lys Ala Glu Pro Val
Glu Val Val Ala Pro Arg 485 490
495 Gly Lys Ser Gly Ala Ala Leu Ser Lys Lys Ser Lys Gly Gln Val
Lys 500 505 510 Glu
Glu Gly Ile Asn Lys Ser Glu Lys Arg Met Lys Leu Thr Leu Lys 515
520 525 Gly Gly Ala Ala Val Asp
Pro Asp Ser Gly Leu Glu His Ser Ala His 530 535
540 Val Leu Glu Lys Gly Gly Lys Val Phe Ser Ala
Thr Leu Gly Leu Val 545 550 555
560 Asp Ile Val Lys Gly Thr Asn Ser Tyr Tyr Lys Leu Gln Leu Leu Glu
565 570 575 Asp Asp
Lys Glu Asn Arg Tyr Trp Ile Phe Arg Ser Trp Gly Arg Val 580
585 590 Gly Thr Val Ile Gly Ser Asn
Lys Leu Glu Gln Met Pro Ser Lys Glu 595 600
605 Asp Ala Ile Glu His Phe Met Lys Leu Tyr Glu Glu
Lys Thr Gly Asn 610 615 620
Ala Trp His Ser Lys Asn Phe Thr Lys Tyr Pro Lys Lys Phe Tyr Pro 625
630 635 640 Leu Glu Ile
Asp Tyr Gly Gln Asp Glu Glu Ala Val Lys Lys Leu Thr 645
650 655 Val Asn Pro Gly Thr Lys Ser Lys
Leu Pro Lys Pro Val Gln Asp Leu 660 665
670 Ile Lys Met Ile Phe Asp Val Glu Ser Met Lys Lys Ala
Met Val Glu 675 680 685
Tyr Glu Ile Asp Leu Gln Lys Met Pro Leu Gly Lys Leu Ser Lys Arg 690
695 700 Gln Ile Gln Ala
Ala Tyr Ser Ile Leu Ser Glu Val Gln Gln Ala Val 705 710
715 720 Ser Gln Gly Ser Ser Asp Ser Gln Ile
Leu Asp Leu Ser Asn Arg Phe 725 730
735 Tyr Thr Leu Ile Pro His Asp Phe Gly Met Lys Lys Pro Pro
Leu Leu 740 745 750
Asn Asn Ala Asp Ser Val Gln Ala Lys Val Glu Met Leu Asp Asn Leu
755 760 765 Leu Asp Ile Glu
Val Ala Tyr Ser Leu Leu Arg Gly Gly Ser Asp Asp 770
775 780 Ser Ser Lys Asp Pro Ile Asp Val
Asn Tyr Glu Lys Leu Lys Thr Asp 785 790
795 800 Ile Lys Val Val Asp Arg Asp Ser Glu Glu Ala Glu
Ile Ile Arg Lys 805 810
815 Tyr Val Lys Asn Thr His Ala Thr Thr His Asn Ala Tyr Asp Leu Glu
820 825 830 Val Ile Asp
Ile Phe Lys Ile Glu Arg Glu Gly Glu Cys Gln Arg Tyr 835
840 845 Lys Pro Phe Lys Gln Leu His Asn
Arg Arg Leu Leu Trp His Gly Ser 850 855
860 Arg Thr Thr Asn Phe Ala Gly Ile Leu Ser Gln Gly Leu
Arg Ile Ala 865 870 875
880 Pro Pro Glu Ala Pro Val Thr Gly Tyr Met Phe Gly Lys Gly Ile Tyr
885 890 895 Phe Ala Asp Met
Val Ser Lys Ser Ala Asn Tyr Cys His Thr Ser Gln 900
905 910 Gly Asp Pro Ile Gly Leu Ile Leu Leu
Gly Glu Val Ala Leu Gly Asn 915 920
925 Met Tyr Glu Leu Lys His Ala Ser His Ile Ser Lys Leu Pro
Lys Gly 930 935 940
Lys His Ser Val Lys Gly Leu Gly Lys Thr Thr Pro Asp Pro Ser Ala 945
950 955 960 Asn Ile Ser Leu Asp
Gly Val Asp Val Pro Leu Gly Thr Gly Ile Ser 965
970 975 Ser Gly Val Asn Asp Thr Ser Leu Leu Tyr
Asn Glu Tyr Ile Val Tyr 980 985
990 Asp Ile Ala Gln Val Asn Leu Lys Tyr Leu Leu Lys Leu Lys
Phe Asn 995 1000 1005
Phe Lys Thr Ser Leu Trp 1010
12 4001 DNA Artificial Sequence PARP1 sequence 12
aggcatcagc aatctatcag ggaacggcgg tggccggtgc
ggcgtgttcg gtggcggctc 60tggccgctca ggcgcctgcg gctgggtgag cgcacgcgag
gcggcgaggc ggcagcgtgt 120ttctaggtcg tggcgtcggg cttccggagc tttggcggca
gctaggggag gatggcggag 180tcttcggata agctctatcg agtcgagtac gccaagagcg
ggcgcgcctc ttgcaagaaa 240tgcagcgaga gcatccccaa ggactcgctc cggatggcca
tcatggtgca gtcgcccatg 300tttgatggaa aagtcccaca ctggtaccac ttctcctgct
tctggaaggt gggccactcc 360atccggcacc ctgacgttga ggtggatggg ttctctgagc
ttcggtggga tgaccagcag 420aaagtcaaga agacagcgga agctggagga gtgacaggca
aaggccagga tggaattggt 480agcaaggcag agaagactct gggtgacttt gcagcagagt
atgccaagtc caacagaagt 540acgtgcaagg ggtgtatgga gaagatagaa aagggccagg
tgcgcctgtc caagaagatg 600gtggacccgg agaagccaca gctaggcatg attgaccgct
ggtaccatcc aggctgcttt 660gtcaagaaca gggaggagct gggtttccgg cccgagtaca
gtgcgagtca gctcaagggc 720ttcagcctcc ttgctacaga ggataaagaa gccctgaaga
agcagctccc aggagtcaag 780agtgaaggaa agagaaaagg cgatgaggtg gatggagtgg
atgaagtggc gaagaagaaa 840tctaaaaaag aaaaagacaa ggatagtaag cttgaaaaag
ccctaaaggc tcagaacgac 900ctgatctgga acatcaagga cgagctaaag aaagtgtgtt
caactaatga cctgaaggag 960ctactcatct tcaacaagca gcaagtgcct tctggggagt
cggcgatctt ggaccgagta 1020gctgatggca tggtgttcgg tgccctcctt ccctgcgagg
aatgctcggg tcagctggtc 1080ttcaagagcg atgcctatta ctgcactggg gacgtcactg
cctggaccaa gtgtatggtc 1140aagacacaga cacccaaccg gaaggagtgg gtaaccccaa
aggaattccg agaaatctct 1200tacctcaaga aattgaaggt taaaaaacag gaccgtatat
tccccccaga aaccagcgcc 1260tccgtggcgg ccacgcctcc gccctccaca gcctcggctc
ctgctgctgt gaactcctct 1320gcttcagcag ataagccatt atccaacatg aagatcctga
ctctcgggaa gctgtcccgg 1380aacaaggatg aagtgaaggc catgattgag aaactcgggg
ggaagttgac ggggacggcc 1440aacaaggctt ccctgtgcat cagcaccaaa aaggaggtgg
aaaagatgaa taagaagatg 1500gaggaagtaa aggaagccaa catccgagtt gtgtctgagg
acttcctcca ggacgtctcc 1560gcctccacca agagccttca ggagttgttc ttagcgcaca
tcttgtcccc ttggggggca 1620gaggtgaagg cagagcctgt tgaagttgtg gccccaagag
ggaagtcagg ggctgcgctc 1680tccaaaaaaa gcaagggcca ggtcaaggag gaaggtatca
acaaatctga aaagagaatg 1740aaattaactc ttaaaggagg agcagctgtg gatcctgatt
ctggactgga acactctgcg 1800catgtcctgg agaaaggtgg gaaggtcttc agtgccaccc
ttggcctggt ggacatcgtt 1860aaaggaacca actcctacta caagctgcag cttctggagg
acgacaagga aaacaggtat 1920tggatattca ggtcctgggg ccgtgtgggt acggtgatcg
gtagcaacaa actggaacag 1980atgccgtcca aggaggatgc cattgagcac ttcatgaaat
tatatgaaga aaaaaccggg 2040aacgcttggc actccaaaaa tttcacgaag tatcccaaaa
agttctaccc cctggagatt 2100gactatggcc aggatgaaga ggcagtgaag aagctgacag
taaatcctgg caccaagtcc 2160aagctcccca agccagttca ggacctcatc aagatgatct
ttgatgtgga aagtatgaag 2220aaagccatgg tggagtatga gatcgacctt cagaagatgc
ccttggggaa gctgagcaaa 2280aggcagatcc aggccgcata ctccatcctc agtgaggtcc
agcaggcggt gtctcagggc 2340agcagcgact ctcagatcct ggatctctca aatcgctttt
acaccctgat cccccacgac 2400tttgggatga agaagcctcc gctcctgaac aatgcagaca
gtgtgcaggc caaggtggaa 2460atgcttgaca acctgctgga catcgaggtg gcctacagtc
tgctcagggg agggtctgat 2520gatagcagca aggatcccat cgatgtcaac tatgagaagc
tcaaaactga cattaaggtg 2580gttgacagag attctgaaga agccgagatc atcaggaagt
atgttaagaa cactcatgca 2640accacacaca atgcgtatga cttggaagtc atcgatatct
ttaagataga gcgtgaaggc 2700gaatgccagc gttacaagcc ctttaagcag cttcataacc
gaagattgct gtggcacggg 2760tccaggacca ccaactttgc tgggatcctg tcccagggtc
ttcggatagc cccgcctgaa 2820gcgcccgtga caggctacat gtttggtaaa gggatctatt
tcgctgacat ggtctccaag 2880agtgccaact actgccatac gtctcaggga gacccaatag
gcttaatcct gttgggagaa 2940gttgcccttg gaaacatgta tgaactgaag cacgcttcac
atatcagcaa gttacccaag 3000ggcaagcaca gtgtcaaagg tttgggcaaa actacccctg
atccttcagc taacattagt 3060ctggatggtg tagacgttcc tcttgggacc gggatttcat
ctggtgtgaa tgacacctct 3120ctactatata acgagtacat tgtctatgat attgctcagg
taaatctgaa gtatctgctg 3180aaactgaaat tcaattttaa gacctccctg tggtaattgg
gagaggtagc cgagtcacac 3240ccggtggctc tggtatgaat tcacccgaag cgcttctgca
ccaactcacc tggccgctaa 3300gttgctgatg ggtagtacct gtactaaacc acctcagaaa
ggattttaca gaaacgtgtt 3360aaaggttttc tctaacttct caagtccctt gttttgtgtt
gtgtctgtgg ggaggggttg 3420ttttggggtt gtttttgttt tttcttgcca ggtagataaa
actgacatag agaaaaggct 3480ggagagagat tctgttgcat agactagtcc tatggaaaaa
accaagcttc gttagaatgt 3540ctgccttact ggtttcccca gggaaggaaa aatacacttc
cacccttttt tctaagtgtt 3600cgtctttagt tttgattttg gaaagatgtt aagcatttat
ttttagttaa aaataaaaac 3660taatttcata ctatttagat tttctttttt atcttgcact
tattgtcccc tttttagttt 3720tttttgtttg cctcttgtgg tgaggggtgt gggaagacca
aaggaaggaa cgctaacaat 3780ttctcatact tagaaacaaa aagagctttc cttctccagg
aatactgaac atgggagctc 3840ttgaaatatg tagtattaaa agttgcattt gaaattcttg
actttcttat gggcactttt 3900gtcttccaaa ttaaaactct accacaaata tacttaccca
agggctaata gtaatactcg 3960attaaaaatg cagatgcctt ctctaaaaaa aaaaaaaaaa a
400113339PRTArtificial SequenceRAD51 sequence 13Met
Ala Met Gln Met Gln Leu Glu Ala Asn Ala Asp Thr Ser Val Glu 1
5 10 15 Glu Glu Ser Phe Gly Pro
Gln Pro Ile Ser Arg Leu Glu Gln Cys Gly 20
25 30 Ile Asn Ala Asn Asp Val Lys Lys Leu Glu
Glu Ala Gly Phe His Thr 35 40
45 Val Glu Ala Val Ala Tyr Ala Pro Lys Lys Glu Leu Ile Asn
Ile Lys 50 55 60
Gly Ile Ser Glu Ala Lys Ala Asp Lys Ile Leu Ala Glu Ala Ala Lys 65
70 75 80 Leu Val Pro Met Gly
Phe Thr Thr Ala Thr Glu Phe His Gln Arg Arg 85
90 95 Ser Glu Ile Ile Gln Ile Thr Thr Gly Ser
Lys Glu Leu Asp Lys Leu 100 105
110 Leu Gln Gly Gly Ile Glu Thr Gly Ser Ile Thr Glu Met Phe Gly
Glu 115 120 125 Phe
Arg Thr Gly Lys Thr Gln Ile Cys His Thr Leu Ala Val Thr Cys 130
135 140 Gln Leu Pro Ile Asp Arg
Gly Gly Gly Glu Gly Lys Ala Met Tyr Ile 145 150
155 160 Asp Thr Glu Gly Thr Phe Arg Pro Glu Arg Leu
Leu Ala Val Ala Glu 165 170
175 Arg Tyr Gly Leu Ser Gly Ser Asp Val Leu Asp Asn Val Ala Tyr Ala
180 185 190 Arg Ala
Phe Asn Thr Asp His Gln Thr Gln Leu Leu Tyr Gln Ala Ser 195
200 205 Ala Met Met Val Glu Ser Arg
Tyr Ala Leu Leu Ile Val Asp Ser Ala 210 215
220 Thr Ala Leu Tyr Arg Thr Asp Tyr Ser Gly Arg Gly
Glu Leu Ser Ala 225 230 235
240 Arg Gln Met His Leu Ala Arg Phe Leu Arg Met Leu Leu Arg Leu Ala
245 250 255 Asp Glu Phe
Gly Val Ala Val Val Ile Thr Asn Gln Val Val Ala Gln 260
265 270 Val Asp Gly Ala Ala Met Phe Ala
Ala Asp Pro Lys Lys Pro Ile Gly 275 280
285 Gly Asn Ile Ile Ala His Ala Ser Thr Thr Arg Leu Tyr
Leu Arg Lys 290 295 300
Gly Arg Gly Glu Thr Arg Ile Cys Lys Ile Tyr Asp Ser Pro Cys Leu 305
310 315 320 Pro Glu Ala Glu
Ala Met Phe Ala Ile Asn Ala Asp Gly Val Gly Asp 325
330 335 Ala Lys Asp
14 2299 DNA Artificial Sequence RAD51 sequence 14
gttacgtcga cgcgggcgtg accctgggcg agagggtttg
gcgggaattc tgaaagccgc 60tggcggaccg cgcgcagcgg ccagagaccg agccctaagg
agagtgcggc gcttcccgag 120gcgtgcagct gggaactgca actcatctgg gttgtgcgca
gaaggctggg gcaagcgagt 180agagaagtgg agcgtaagcc aggggcgttg ggggccgtgc
gggtcgggcg cgtgccacgc 240ccgcggggtg aagtcggagc gcggggcctg ctggagagag
gagcgctgcg gaccgagtaa 300tggcaatgca gatgcagctt gaagcaaatg cagatacttc
agtggaagaa gaaagctttg 360gcccacaacc catttcacgg ttagagcagt gtggcataaa
tgccaacgat gtgaagaaat 420tggaagaagc tggattccat actgtggagg ctgttgccta
tgcgccaaag aaggagctaa 480taaatattaa gggaattagt gaagccaaag ctgataaaat
tctggctgag gcagctaaat 540tagttccaat gggtttcacc actgcaactg aattccacca
aaggcggtca gagatcatac 600agattactac tggctccaaa gagcttgaca aactacttca
aggtggaatt gagactggat 660ctatcacaga aatgtttgga gaattccgaa ctgggaagac
ccagatctgt catacgctag 720ctgtcacctg ccagcttccc attgaccggg gtggaggtga
aggaaaggcc atgtacattg 780acactgaggg tacctttagg ccagaacggc tgctggcagt
ggctgagagg tatggtctct 840ctggcagtga tgtcctggat aatgtagcat atgctcgagc
gttcaacaca gaccaccaga 900cccagctcct ttatcaagca tcagccatga tggtagaatc
taggtatgca ctgcttattg 960tagacagtgc caccgccctt tacagaacag actactcggg
tcgaggtgag ctttcagcca 1020ggcagatgca cttggccagg tttctgcgga tgcttctgcg
actcgctgat gagtttggtg 1080tagcagtggt aatcactaat caggtggtag ctcaagtgga
tggagcagcg atgtttgctg 1140ctgatcccaa aaaacctatt ggaggaaata tcatcgccca
tgcatcaaca accagattgt 1200atctgaggaa aggaagaggg gaaaccagaa tctgcaaaat
ctacgactct ccctgtcttc 1260ctgaagctga agctatgttc gccattaatg cagatggagt
gggagatgcc aaagactgaa 1320tcattgggtt tttcctctgt taaaaacctt aagtgctgca
gcctaatgag agtgcactgc 1380tccctggggt tctctacagg cctcttcctg ttgtgactgc
caggataaag cttccgggaa 1440aacagctatt atatcagctt ttctgatggt ataaacagga
gacaggtcag tagtcacaaa 1500ctgatctaaa atgtttattc cttctgtagt gtattaatct
ctgtgtgttt tctttggttt 1560tggaggaggg gtatgaagta tctttgacat ggtgccttag
gaatgacttg ggtttaacaa 1620gctgtctact ggacaatctt atgtttccaa gagaactaaa
gctggagaga cctgaccctt 1680ctctcacttc taaattaatg gtaaaataaa atgcctcagc
tatgtagcaa agggaatggg 1740tctgcacaga ttcttttttt ctgtcagtaa aactctcaag
caggttttta agttgtctgt 1800ctgaatgatc ttgtgtaagg ttttggttat ggagtcttgt
gccaaaccta ctaggccatt 1860agcccttcac catctacctg cttggtcttt cattgctaag
actaactcaa gataatccta 1920gagtcttaaa gcatttcagg ccagtgtggt gtcttgcgcc
tgtactccca gcactttggg 1980aggccgaggc aggtggatcg cttgagccca ggagttttaa
gtccagcttg gccaaggtgg 2040tgaaatccca tctctacaaa aaatgcagaa cttaatctgg
acacactgtt acacgtgcct 2100gtagtcccag ctactcgata gcctgaggtg ggagaatcac
ttaagcctgg aaggtggaag 2160ttgcagtgag tcgagattgc actgctgcat tccagccagg
gtgacagagt gagaccatgt 2220ttcaaacaag aaacatttca gagggtaagt aaacagattt
gattgtgagg cttctaataa 2280agtagttatt agtagtgaa
2299
15 708 PRT Artificial Sequence MRE11A sequence 15
Met Ser Thr Ala Asp Ala Leu Asp Asp Glu Asn Thr Phe Lys Ile Leu 1
5 10 15 Val Ala Thr Asp
Ile His Leu Gly Phe Met Glu Lys Asp Ala Val Arg 20
25 30 Gly Asn Asp Thr Phe Val Thr Leu Asp
Glu Ile Leu Arg Leu Ala Gln 35 40
45 Glu Asn Glu Val Asp Phe Ile Leu Leu Gly Gly Asp Leu Phe
His Glu 50 55 60
Asn Lys Pro Ser Arg Lys Thr Leu His Thr Cys Leu Glu Leu Leu Arg 65
70 75 80 Lys Tyr Cys Met Gly
Asp Arg Pro Val Gln Phe Glu Ile Leu Ser Asp 85
90 95 Gln Ser Val Asn Phe Gly Phe Ser Lys Phe
Pro Trp Val Asn Tyr Gln 100 105
110 Asp Gly Asn Leu Asn Ile Ser Ile Pro Val Phe Ser Ile His Gly
Asn 115 120 125 His
Asp Asp Pro Thr Gly Ala Asp Ala Leu Cys Ala Leu Asp Ile Leu 130
135 140 Ser Cys Ala Gly Phe Val
Asn His Phe Gly Arg Ser Met Ser Val Glu 145 150
155 160 Lys Ile Asp Ile Ser Pro Val Leu Leu Gln Lys
Gly Ser Thr Lys Ile 165 170
175 Ala Leu Tyr Gly Leu Gly Ser Ile Pro Asp Glu Arg Leu Tyr Arg Met
180 185 190 Phe Val
Asn Lys Lys Val Thr Met Leu Arg Pro Lys Glu Asp Glu Asn 195
200 205 Ser Trp Phe Asn Leu Phe Val
Ile His Gln Asn Arg Ser Lys His Gly 210 215
220 Ser Thr Asn Phe Ile Pro Glu Gln Phe Leu Asp Asp
Phe Ile Asp Leu 225 230 235
240 Val Ile Trp Gly His Glu His Glu Cys Lys Ile Ala Pro Thr Lys Asn
245 250 255 Glu Gln Gln
Leu Phe Tyr Ile Ser Gln Pro Gly Ser Ser Val Val Thr 260
265 270 Ser Leu Ser Pro Gly Glu Ala Val
Lys Lys His Val Gly Leu Leu Arg 275 280
285 Ile Lys Gly Arg Lys Met Asn Met His Lys Ile Pro Leu
His Thr Val 290 295 300
Arg Gln Phe Phe Met Glu Asp Ile Val Leu Ala Asn His Pro Asp Ile 305
310 315 320 Phe Asn Pro Asp
Asn Pro Lys Val Thr Gln Ala Ile Gln Ser Phe Cys 325
330 335 Leu Glu Lys Ile Glu Glu Met Leu Glu
Asn Ala Glu Arg Glu Arg Leu 340 345
350 Gly Asn Ser His Gln Pro Glu Lys Pro Leu Val Arg Leu Arg
Val Asp 355 360 365
Tyr Ser Gly Gly Phe Glu Pro Phe Ser Val Leu Arg Phe Ser Gln Lys 370
375 380 Phe Val Asp Arg Val
Ala Asn Pro Lys Asp Ile Ile His Phe Phe Arg 385 390
395 400 His Arg Glu Gln Lys Glu Lys Thr Gly Glu
Glu Ile Asn Phe Gly Lys 405 410
415 Leu Ile Thr Lys Pro Ser Glu Gly Thr Thr Leu Arg Val Glu Asp
Leu 420 425 430 Val
Lys Gln Tyr Phe Gln Thr Ala Glu Lys Asn Val Gln Leu Ser Leu 435
440 445 Leu Thr Glu Arg Gly Met
Gly Glu Ala Val Gln Glu Phe Val Asp Lys 450 455
460 Glu Glu Lys Asp Ala Ile Glu Glu Leu Val Lys
Tyr Gln Leu Glu Lys 465 470 475
480 Thr Gln Arg Phe Leu Lys Glu Arg His Ile Asp Ala Leu Glu Asp Lys
485 490 495 Ile Asp
Glu Glu Val Arg Arg Phe Arg Glu Thr Arg Gln Lys Asn Thr 500
505 510 Asn Glu Glu Asp Asp Glu Val
Arg Glu Ala Met Thr Arg Ala Arg Ala 515 520
525 Leu Arg Ser Gln Ser Glu Glu Ser Ala Ser Ala Phe
Ser Ala Asp Asp 530 535 540
Leu Met Ser Ile Asp Leu Ala Glu Gln Met Ala Asn Asp Ser Asp Asp 545
550 555 560 Ser Ile Ser
Ala Ala Thr Asn Lys Gly Arg Gly Arg Gly Arg Gly Arg 565
570 575 Arg Gly Gly Arg Gly Gln Asn Ser
Ala Ser Arg Gly Gly Ser Gln Arg 580 585
590 Gly Arg Ala Asp Thr Gly Leu Glu Thr Ser Thr Arg Ser
Arg Asn Ser 595 600 605
Lys Thr Ala Val Ser Ala Ser Arg Asn Met Ser Ile Ile Asp Ala Phe 610
615 620 Lys Ser Thr Arg
Gln Gln Pro Ser Arg Asn Val Thr Thr Lys Asn Tyr 625 630
635 640 Ser Glu Val Ile Glu Val Asp Glu Ser
Asp Val Glu Glu Asp Ile Phe 645 650
655 Pro Thr Thr Ser Lys Thr Asp Gln Arg Trp Ser Ser Thr Ser
Ser Ser 660 665 670
Lys Ile Met Ser Gln Ser Gln Val Ser Lys Gly Val Asp Phe Glu Ser
675 680 685 Ser Glu Asp Asp
Asp Asp Asp Pro Phe Met Asn Thr Ser Ser Leu Arg 690
695 700 Arg Asn Arg Arg 705
<210> 16
<211> 5141
<212> DNA
<213> Artificial Sequence
<220>
<223> MRE11A sequence
<400> 16
acgttatcca tgaagtgtcg
cgagagaaac ggacgccgtt ctctcccgcg gaattcaggt 60ttacggccct gcgggttctc
agagaatttc tagaatttgg aatcgagtgc attttctgac 120atttgagtac agtacccagg
ggttcttgga gaagaacctg gtcccagagg agcttgactg 180accataaaaa tgagtactgc
agatgcactt gatgatgaaa acacatttaa aatattagtt 240gcaacagata ttcatcttgg
atttatggag aaagatgcag tcagaggaaa tgatacgttt 300gtaacactcg atgaaatttt
aagacttgcc caggaaaatg aagtggattt tattttgtta 360ggtggtgatc tttttcatga
aaataagccc tcaaggaaaa cattacatac ctgcctcgag 420ttattaagaa aatattgtat
gggtgatcgg cctgtccagt ttgaaattct cagtgatcag 480tcagtcaact ttggttttag
taagtttcca tgggtgaact atcaagatgg caacctcaac 540atttcaattc cagtgtttag
tattcatggc aatcatgacg atcccacagg ggcagatgca 600ctttgtgcct tggacatttt
aagttgtgct ggatttgtaa atcactttgg acgttcaatg 660tctgtggaga agatagacat
tagtccggtt ttgcttcaaa aaggaagcac aaagattgcg 720ctatatggtt taggatccat
tccagatgaa aggctctatc gaatgtttgt caataaaaaa 780gtaacaatgt tgagaccaaa
ggaagatgag aactcttggt ttaacttatt tgtgattcat 840cagaacagga gtaaacatgg
aagtactaac ttcattccag aacaattttt ggatgacttc 900attgatcttg ttatctgggg
ccatgaacat gagtgtaaaa tagctccaac caaaaatgaa 960caacagctgt tttatatctc
acaacctgga agctcagtgg ttacttctct ttccccagga 1020gaagctgtaa agaaacatgt
tggtttgctg cgtattaaag ggaggaagat gaatatgcat 1080aaaattcctc ttcacacagt
gcggcagttt ttcatggagg atattgttct agctaatcat 1140ccagacattt ttaacccaga
taatcctaaa gtaacccaag ccatacaaag cttctgtttg 1200gagaagattg aagaaatgct
tgaaaatgct gaacgggaac gtctgggtaa ttctcaccag 1260ccagagaagc ctcttgtacg
actgcgagtg gactatagtg gaggttttga acctttcagt 1320gttcttcgct ttagccagaa
atttgtggat cgggtagcta atccaaaaga cattatccat 1380tttttcaggc atagagaaca
aaaggaaaaa acaggagaag agatcaactt tgggaaactt 1440atcacaaagc cttcagaagg
aacaacttta agggtagaag atcttgtaaa acagtacttt 1500caaaccgcag agaagaatgt
gcagctctca ctgctaacag aaagagggat gggtgaagca 1560gtacaagaat ttgtggacaa
ggaggagaaa gatgccattg aggaattagt gaaataccag 1620ttggaaaaaa cacagcgatt
tcttaaagaa cgtcatattg atgccctcga agacaaaatc 1680gatgaggagg tacgtcgttt
cagagaaacc agacaaaaaa atactaatga agaagatgat 1740gaagtccgtg aggctatgac
cagggccaga gcactcagat ctcagtcaga ggagtctgct 1800tctgccttta gtgctgatga
ccttatgagt atagatttag cagaacagat ggctaatgac 1860tctgatgata gcatctcagc
agcaaccaac aaaggaagag gccgaggaag aggtcgaaga 1920ggtggaagag ggcagaattc
agcatcgaga ggagggtctc aaagaggaag agcagacact 1980ggtctggaga cttctacccg
tagcaggaac tcaaagactg ctgtgtcagc atctagaaat 2040atgtctatta tagatgcctt
taaatctaca agacagcagc cttcccgaaa tgtcactact 2100aagaattatt cagaggtgat
tgaggtagat gaatcagatg tggaagaaga catttttcct 2160accacttcaa agacagatca
aaggtggtcc agcacatcat ccagcaaaat catgtcccag 2220agtcaagtat cgaaaggggt
tgattttgaa tcaagtgagg atgatgatga tgatcctttt 2280atgaacacta gttctttaag
aagaaataga agataatata tttaatggca ctgagaaaca 2340tgcaagatac aggaaaaatg
aaaatgttac aagctaagag tttacagttt aagattttaa 2400gtattgtttc ctgagcataa
ctccataagt aagaaatttc tagttcacag acatacaata 2460gcattgattc accttgtttt
tttaacctgg ttgttgtagt aagagctttg tttcaatatc 2520actcttgagt aaagattaaa
ataaagctac cattttacat ttctatttca taatgaaaaa 2580ctatgtcagt attttaatat
ggttacattt agccaaagtt gagggaaaga gcttataaaa 2640tttaacttct tcataatttt
agtaatttcc tagaggttct gggttttctg aaagtaaaac 2700aatttatgcg aacctatgtc
taaattcact gtttgttact atgtatgttt ttttccaatg 2760cttcttataa gactaaatga
ttagaagtac ctaatagttt gaacagatat gtttttattt 2820aaaagagtag aataaccttt
cagaattact gagtttttta ttccagttgt agcaaagatt 2880tcaaaagatt gtgttcccat
taagtggtag taatttcctt tattattctg tatccttaat 2940ggtgttctct ctctctctct
ctctctctct ctctccctct cccccccgtt ccccactctt 3000cctttctcct ttgctttttc
ttctctttca tacatatatg cgtgcctagt tctaggagga 3060aacgggttaa aaattgtttt
aaactacatc ttgaaaatat tgaagaattt gttttaggta 3120gagtggtcag ttgaacctta
cagtaaagta tagaaatata tttaatgtgg aatgtcaatg 3180ccaggatttc tcattaacaa
tattttatct caactttggt tcctgtgata catttctgaa 3240tgggcaattc cagaaatctt
agtagcccat gttaagcttc tattttttac ttgttttcgg 3300ggagaaataa gaattagaca
tcttcagatt taagttaaat aatcccattc tttataatcc 3360tctgtaaaaa gatccctgag
attattcctt cttctagttt tatgcgacag ctttacttta 3420aaattcaagt tatacatctt
gggagtacaa tggcccgaca tttcttcata ggtagaaaca 3480aatacttgac tcagtgatac
tcatgaccat tagaatagtc atacctggaa tgtgtcaaat 3540tataagagac agacacttgg
ttagtggctg cctcatatag cacttttgaa gaggcctaag 3600tcaaaacttg caatataaca
ttctattgac tttcttaaaa atattttttc tgtacctaac 3660ttgagcataa gggttatttg
agcaagtaac attaactcag tggaaggcat tgtcctgtga 3720aatattctta ggcagatctg
cccacatctt tattgaactt gaaatctaat atttctagta 3780tttgaacaaa gcagaaggtt
aagtcaggga agagcagtgc tgtccatgat gtaatggaag 3840ctaccagggg aggcagtgtc
tggatgatgc tgtgctacct acccctgcac aagccatgct 3900ggctcagtct gagctgtggg
ccacatcagc tagtggctct tctcatgcat cagttaggtg 3960ggtctgggtg agagttatag
tgagggaatg gtcactaaag tatcctgaca agttcctagg 4020aaaaaaggaa taaagttttt
ttccttaaaa aaaaaaaaat tgctcttggc tgtgaaaaga 4080ggtactaaat gcgattcagt
tcaccgctaa ggaaagtgat gacatagcag ttacagaggg 4140tgataaatct ctccagctaa
ttcaggtcat tttgtgaata ctatgtatca agccctgaaa 4200atatggtaaa taaaacgtga
cagggaaacc tttttttgat tgaatattgt tacatagtta 4260aatgtgctat atatccttaa
tattttatat tgatcctgca aaatctgttg gttttagggg 4320agttttgttt tttgtttcta
acaattttca gacctgttgg tataggaatg tagaagtctt 4380tcagatgatt tgaaagcagc
tgcatttgct cttggaggct ttgggagagc aggaatgaaa 4440acattcagag gaagacatct
gtagggaatt cttctgttac ttaccaaaga ataagtgtct 4500ttctggtgtt ttatttccta
tcataaaaat acaacagtgc atttacaagg ttaaagattc 4560ctcgaagttc taggaaattc
ttgaaaatat aagtggtgct tagaaaattc aagcatttag 4620gaatgtgacc tttaattcag
gtatgtaaaa gacttttttc ccaaactttt aaaagtagga 4680aatacaataa atacagaaaa
gtcatatggt tgaataaata attataaatt gagcactgat 4740ggaatccctc tacaggtcaa
gaaatagcgc agtgtcctgg atgcccatta tattgttttc 4800tcctttctgg gtaacaagcc
ctaacttctg taatttaaaa gctcctactt ttgccacaag 4860gtggtgcttc tgccattaga
cgcagttagg aggatgcaac tgcaaatcta aaattacgaa 4920gttagtgtag ttgcaataaa
cttagaacat atgcattaat actaaaccta tgcagtaata 4980ccataattag ccttctaatc
atgtaatttg ctttacttag gtatttcatt tggttcagcc 5040tgttatggaa tttaccagct
tgataaattt gcctataaag ttttataaag aaaaggaata 5100ttttgttttc ataaagagga
aaatccattc ttagaaaaaa a 5141
<210> 17
<211> 3056
<212> PRT
<213> Artificial Sequence
<220>
<223> ATM sequence
<400> 17
Met Ser Leu Val Leu Asn Asp Leu Leu Ile Cys Cys
Arg Gln Leu Glu 1 5 10
15 His Asp Arg Ala Thr Glu Arg Lys Lys Glu Val Glu Lys Phe Lys Arg
20 25 30 Leu Ile Arg
Asp Pro Glu Thr Ile Lys His Leu Asp Arg His Ser Asp 35
40 45 Ser Lys Gln Gly Lys Tyr Leu Asn
Trp Asp Ala Val Phe Arg Phe Leu 50 55
60 Gln Lys Tyr Ile Gln Lys Glu Thr Glu Cys Leu Arg Ile
Ala Lys Pro 65 70 75
80 Asn Val Ser Ala Ser Thr Gln Ala Ser Arg Gln Lys Lys Met Gln Glu
85 90 95 Ile Ser Ser Leu
Val Lys Tyr Phe Ile Lys Cys Ala Asn Arg Arg Ala 100
105 110 Pro Arg Leu Lys Cys Gln Glu Leu Leu
Asn Tyr Ile Met Asp Thr Val 115 120
125 Lys Asp Ser Ser Asn Gly Ala Ile Tyr Gly Ala Asp Cys Ser
Asn Ile 130 135 140
Leu Leu Lys Asp Ile Leu Ser Val Arg Lys Tyr Trp Cys Glu Ile Ser 145
150 155 160 Gln Gln Gln Trp Leu
Glu Leu Phe Ser Val Tyr Phe Arg Leu Tyr Leu 165
170 175 Lys Pro Ser Gln Asp Val His Arg Val Leu
Val Ala Arg Ile Ile His 180 185
190 Ala Val Thr Lys Gly Cys Cys Ser Gln Thr Asp Gly Leu Asn Ser
Lys 195 200 205 Phe
Leu Asp Phe Phe Ser Lys Ala Ile Gln Cys Ala Arg Gln Glu Lys 210
215 220 Ser Ser Ser Gly Leu Asn
His Ile Leu Ala Ala Leu Thr Ile Phe Leu 225 230
235 240 Lys Thr Leu Ala Val Asn Phe Arg Ile Arg Val
Cys Glu Leu Gly Asp 245 250
255 Glu Ile Leu Pro Thr Leu Leu Tyr Ile Trp Thr Gln His Arg Leu Asn
260 265 270 Asp Ser
Leu Lys Glu Val Ile Ile Glu Leu Phe Gln Leu Gln Ile Tyr 275
280 285 Ile His His Pro Lys Gly Ala
Lys Thr Gln Glu Lys Gly Ala Tyr Glu 290 295
300 Ser Thr Lys Trp Arg Ser Ile Leu Tyr Asn Leu Tyr
Asp Leu Leu Val 305 310 315
320 Asn Glu Ile Ser His Ile Gly Ser Arg Gly Lys Tyr Ser Ser Gly Phe
325 330 335 Arg Asn Ile
Ala Val Lys Glu Asn Leu Ile Glu Leu Met Ala Asp Ile 340
345 350 Cys His Gln Val Phe Asn Glu Asp
Thr Arg Ser Leu Glu Ile Ser Gln 355 360
365 Ser Tyr Thr Thr Thr Gln Arg Glu Ser Ser Asp Tyr Ser
Val Pro Cys 370 375 380
Lys Arg Lys Lys Ile Glu Leu Gly Trp Glu Val Ile Lys Asp His Leu 385
390 395 400 Gln Lys Ser Gln
Asn Asp Phe Asp Leu Val Pro Trp Leu Gln Ile Ala 405
410 415 Thr Gln Leu Ile Ser Lys Tyr Pro Ala
Ser Leu Pro Asn Cys Glu Leu 420 425
430 Ser Pro Leu Leu Met Ile Leu Ser Gln Leu Leu Pro Gln Gln
Arg His 435 440 445
Gly Glu Arg Thr Pro Tyr Val Leu Arg Cys Leu Thr Glu Val Ala Leu 450
455 460 Cys Gln Asp Lys Arg
Ser Asn Leu Glu Ser Ser Gln Lys Ser Asp Leu 465 470
475 480 Leu Lys Leu Trp Asn Lys Ile Trp Cys Ile
Thr Phe Arg Gly Ile Ser 485 490
495 Ser Glu Gln Ile Gln Ala Glu Asn Phe Gly Leu Leu Gly Ala Ile
Ile 500 505 510 Gln
Gly Ser Leu Val Glu Val Asp Arg Glu Phe Trp Lys Leu Phe Thr 515
520 525 Gly Ser Ala Cys Arg Pro
Ser Cys Pro Ala Val Cys Cys Leu Thr Leu 530 535
540 Ala Leu Thr Thr Ser Ile Val Pro Gly Thr Val
Lys Met Gly Ile Glu 545 550 555
560 Gln Asn Met Cys Glu Val Asn Arg Ser Phe Ser Leu Lys Glu Ser Ile
565 570 575 Met Lys
Trp Leu Leu Phe Tyr Gln Leu Glu Gly Asp Leu Glu Asn Ser 580
585 590 Thr Glu Val Pro Pro Ile Leu
His Ser Asn Phe Pro His Leu Val Leu 595 600
605 Glu Lys Ile Leu Val Ser Leu Thr Met Lys Asn Cys
Lys Ala Ala Met 610 615 620
Asn Phe Phe Gln Ser Val Pro Glu Cys Glu His His Gln Lys Asp Lys 625
630 635 640 Glu Glu Leu
Ser Phe Ser Glu Val Glu Glu Leu Phe Leu Gln Thr Thr 645
650 655 Phe Asp Lys Met Asp Phe Leu Thr
Ile Val Arg Glu Cys Gly Ile Glu 660 665
670 Lys His Gln Ser Ser Ile Gly Phe Ser Val His Gln Asn
Leu Lys Glu 675 680 685
Ser Leu Asp Arg Cys Leu Leu Gly Leu Ser Glu Gln Leu Leu Asn Asn 690
695 700 Tyr Ser Ser Glu
Ile Thr Asn Ser Glu Thr Leu Val Arg Cys Ser Arg 705 710
715 720 Leu Leu Val Gly Val Leu Gly Cys Tyr
Cys Tyr Met Gly Val Ile Ala 725 730
735 Glu Glu Glu Ala Tyr Lys Ser Glu Leu Phe Gln Lys Ala Lys
Ser Leu 740 745 750
Met Gln Cys Ala Gly Glu Ser Ile Thr Leu Phe Lys Asn Lys Thr Asn
755 760 765 Glu Glu Phe Arg
Ile Gly Ser Leu Arg Asn Met Met Gln Leu Cys Thr 770
775 780 Arg Cys Leu Ser Asn Cys Thr Lys
Lys Ser Pro Asn Lys Ile Ala Ser 785 790
795 800 Gly Phe Phe Leu Arg Leu Leu Thr Ser Lys Leu Met
Asn Asp Ile Ala 805 810
815 Asp Ile Cys Lys Ser Leu Ala Ser Phe Ile Lys Lys Pro Phe Asp Arg
820 825 830 Gly Glu Val
Glu Ser Met Glu Asp Asp Thr Asn Gly Asn Leu Met Glu 835
840 845 Val Glu Asp Gln Ser Ser Met Asn
Leu Phe Asn Asp Tyr Pro Asp Ser 850 855
860 Ser Val Ser Asp Ala Asn Glu Pro Gly Glu Ser Gln Ser
Thr Ile Gly 865 870 875
880 Ala Ile Asn Pro Leu Ala Glu Glu Tyr Leu Ser Lys Gln Asp Leu Leu
885 890 895 Phe Leu Asp Met
Leu Lys Phe Leu Cys Leu Cys Val Thr Thr Ala Gln 900
905 910 Thr Asn Thr Val Ser Phe Arg Ala Ala
Asp Ile Arg Arg Lys Leu Leu 915 920
925 Met Leu Ile Asp Ser Ser Thr Leu Glu Pro Thr Lys Ser Leu
His Leu 930 935 940
His Met Tyr Leu Met Leu Leu Lys Glu Leu Pro Gly Glu Glu Tyr Pro 945
950 955 960 Leu Pro Met Glu Asp
Val Leu Glu Leu Leu Lys Pro Leu Ser Asn Val 965
970 975 Cys Ser Leu Tyr Arg Arg Asp Gln Asp Val
Cys Lys Thr Ile Leu Asn 980 985
990 His Val Leu His Val Val Lys Asn Leu Gly Gln Ser Asn Met
Asp Ser 995 1000 1005
Glu Asn Thr Arg Asp Ala Gln Gly Gln Phe Leu Thr Val Ile Gly 1010
1015 1020 Ala Phe Trp His Leu
Thr Lys Glu Arg Lys Tyr Ile Phe Ser Val 1025 1030
1035 Arg Met Ala Leu Val Asn Cys Leu Lys Thr
Leu Leu Glu Ala Asp 1040 1045 1050
Pro Tyr Ser Lys Trp Ala Ile Leu Asn Val Met Gly Lys Asp Phe
1055 1060 1065 Pro Val
Asn Glu Val Phe Thr Gln Phe Leu Ala Asp Asn His His 1070
1075 1080 Gln Val Arg Met Leu Ala Ala
Glu Ser Ile Asn Arg Leu Phe Gln 1085 1090
1095 Asp Thr Lys Gly Asp Ser Ser Arg Leu Leu Lys Ala
Leu Pro Leu 1100 1105 1110
Lys Leu Gln Gln Thr Ala Phe Glu Asn Ala Tyr Leu Lys Ala Gln 1115
1120 1125 Glu Gly Met Arg Glu
Met Ser His Ser Ala Glu Asn Pro Glu Thr 1130 1135
1140 Leu Asp Glu Ile Tyr Asn Arg Lys Ser Val
Leu Leu Thr Leu Ile 1145 1150 1155
Ala Val Val Leu Ser Cys Ser Pro Ile Cys Glu Lys Gln Ala Leu
1160 1165 1170 Phe Ala
Leu Cys Lys Ser Val Lys Glu Asn Gly Leu Glu Pro His 1175
1180 1185 Leu Val Lys Lys Val Leu Glu
Lys Val Ser Glu Thr Phe Gly Tyr 1190 1195
1200 Arg Arg Leu Glu Asp Phe Met Ala Ser His Leu Asp
Tyr Leu Val 1205 1210 1215
Leu Glu Trp Leu Asn Leu Gln Asp Thr Glu Tyr Asn Leu Ser Ser 1220
1225 1230 Phe Pro Phe Ile Leu
Leu Asn Tyr Thr Asn Ile Glu Asp Phe Tyr 1235 1240
1245 Arg Ser Cys Tyr Lys Val Leu Ile Pro His
Leu Val Ile Arg Ser 1250 1255 1260
His Phe Asp Glu Val Lys Ser Ile Ala Asn Gln Ile Gln Glu Asp
1265 1270 1275 Trp Lys
Ser Leu Leu Thr Asp Cys Phe Pro Lys Ile Leu Val Asn 1280
1285 1290 Ile Leu Pro Tyr Phe Ala Tyr
Glu Gly Thr Arg Asp Ser Gly Met 1295 1300
1305 Ala Gln Gln Arg Glu Thr Ala Thr Lys Val Tyr Asp
Met Leu Lys 1310 1315 1320
Ser Glu Asn Leu Leu Gly Lys Gln Ile Asp His Leu Phe Ile Ser 1325
1330 1335 Asn Leu Pro Glu Ile
Val Val Glu Leu Leu Met Thr Leu His Glu 1340 1345
1350 Pro Ala Asn Ser Ser Ala Ser Gln Ser Thr
Asp Leu Cys Asp Phe 1355 1360 1365
Ser Gly Asp Leu Asp Pro Ala Pro Asn Pro Pro His Phe Pro Ser
1370 1375 1380 His Val
Ile Lys Ala Thr Phe Ala Tyr Ile Ser Asn Cys His Lys 1385
1390 1395 Thr Lys Leu Lys Ser Ile Leu
Glu Ile Leu Ser Lys Ser Pro Asp 1400 1405
1410 Ser Tyr Gln Lys Ile Leu Leu Ala Ile Cys Glu Gln
Ala Ala Glu 1415 1420 1425
Thr Asn Asn Val Tyr Lys Lys His Arg Ile Leu Lys Ile Tyr His 1430
1435 1440 Leu Phe Val Ser Leu
Leu Leu Lys Asp Ile Lys Ser Gly Leu Gly 1445 1450
1455 Gly Ala Trp Ala Phe Val Leu Arg Asp Val
Ile Tyr Thr Leu Ile 1460 1465 1470
His Tyr Ile Asn Gln Arg Pro Ser Cys Ile Met Asp Val Ser Leu
1475 1480 1485 Arg Ser
Phe Ser Leu Cys Cys Asp Leu Leu Ser Gln Val Cys Gln 1490
1495 1500 Thr Ala Val Thr Tyr Cys Lys
Asp Ala Leu Glu Asn His Leu His 1505 1510
1515 Val Ile Val Gly Thr Leu Ile Pro Leu Val Tyr Glu
Gln Val Glu 1520 1525 1530
Val Gln Lys Gln Val Leu Asp Leu Leu Lys Tyr Leu Val Ile Asp 1535
1540 1545 Asn Lys Asp Asn Glu
Asn Leu Tyr Ile Thr Ile Lys Leu Leu Asp 1550 1555
1560 Pro Phe Pro Asp His Val Val Phe Lys Asp
Leu Arg Ile Thr Gln 1565 1570 1575
Gln Lys Ile Lys Tyr Ser Arg Gly Pro Phe Ser Leu Leu Glu Glu
1580 1585 1590 Ile Asn
His Phe Leu Ser Val Ser Val Tyr Asp Ala Leu Pro Leu 1595
1600 1605 Thr Arg Leu Glu Gly Leu Lys
Asp Leu Arg Arg Gln Leu Glu Leu 1610 1615
1620 His Lys Asp Gln Met Val Asp Ile Met Arg Ala Ser
Gln Asp Asn 1625 1630 1635
Pro Gln Asp Gly Ile Met Val Lys Leu Val Val Asn Leu Leu Gln 1640
1645 1650 Leu Ser Lys Met Ala
Ile Asn His Thr Gly Glu Lys Glu Val Leu 1655 1660
1665 Glu Ala Val Gly Ser Cys Leu Gly Glu Val
Gly Pro Ile Asp Phe 1670 1675 1680
Ser Thr Ile Ala Ile Gln His Ser Lys Asp Ala Ser Tyr Thr Lys
1685 1690 1695 Ala Leu
Lys Leu Phe Glu Asp Lys Glu Leu Gln Trp Thr Phe Ile 1700
1705 1710 Met Leu Thr Tyr Leu Asn Asn
Thr Leu Val Glu Asp Cys Val Lys 1715 1720
1725 Val Arg Ser Ala Ala Val Thr Cys Leu Lys Asn Ile
Leu Ala Thr 1730 1735 1740
Lys Thr Gly His Ser Phe Trp Glu Ile Tyr Lys Met Thr Thr Asp 1745
1750 1755 Pro Met Leu Ala Tyr
Leu Gln Pro Phe Arg Thr Ser Arg Lys Lys 1760 1765
1770 Phe Leu Glu Val Pro Arg Phe Asp Lys Glu
Asn Pro Phe Glu Gly 1775 1780 1785
Leu Asp Asp Ile Asn Leu Trp Ile Pro Leu Ser Glu Asn His Asp
1790 1795 1800 Ile Trp
Ile Lys Thr Leu Thr Cys Ala Phe Leu Asp Ser Gly Gly 1805
1810 1815 Thr Lys Cys Glu Ile Leu Gln
Leu Leu Lys Pro Met Cys Glu Val 1820 1825
1830 Lys Thr Asp Phe Cys Gln Thr Val Leu Pro Tyr Leu
Ile His Asp 1835 1840 1845
Ile Leu Leu Gln Asp Thr Asn Glu Ser Trp Arg Asn Leu Leu Ser 1850
1855 1860 Thr His Val Gln Gly
Phe Phe Thr Ser Cys Leu Arg His Phe Ser 1865 1870
1875 Gln Thr Ser Arg Ser Thr Thr Pro Ala Asn
Leu Asp Ser Glu Ser 1880 1885 1890
Glu His Phe Phe Arg Cys Cys Leu Asp Lys Lys Ser Gln Arg Thr
1895 1900 1905 Met Leu
Ala Val Val Asp Tyr Met Arg Arg Gln Lys Arg Pro Ser 1910
1915 1920 Ser Gly Thr Ile Phe Asn Asp
Ala Phe Trp Leu Asp Leu Asn Tyr 1925 1930
1935 Leu Glu Val Ala Lys Val Ala Gln Ser Cys Ala Ala
His Phe Thr 1940 1945 1950
Ala Leu Leu Tyr Ala Glu Ile Tyr Ala Asp Lys Lys Ser Met Asp 1955
1960 1965 Asp Gln Glu Lys Arg
Ser Leu Ala Phe Glu Glu Gly Ser Gln Ser 1970 1975
1980 Thr Thr Ile Ser Ser Leu Ser Glu Lys Ser
Lys Glu Glu Thr Gly 1985 1990 1995
Ile Ser Leu Gln Asp Leu Leu Leu Glu Ile Tyr Arg Ser Ile Gly
2000 2005 2010 Glu Pro
Asp Ser Leu Tyr Gly Cys Gly Gly Gly Lys Met Leu Gln 2015
2020 2025 Pro Ile Thr Arg Leu Arg Thr
Tyr Glu His Glu Ala Met Trp Gly 2030 2035
2040 Lys Ala Leu Val Thr Tyr Asp Leu Glu Thr Ala Ile
Pro Ser Ser 2045 2050 2055
Thr Arg Gln Ala Gly Ile Ile Gln Ala Leu Gln Asn Leu Gly Leu 2060
2065 2070 Cys His Ile Leu Ser
Val Tyr Leu Lys Gly Leu Asp Tyr Glu Asn 2075 2080
2085 Lys Asp Trp Cys Pro Glu Leu Glu Glu Leu
His Tyr Gln Ala Ala 2090 2095 2100
Trp Arg Asn Met Gln Trp Asp His Cys Thr Ser Val Ser Lys Glu
2105 2110 2115 Val Glu
Gly Thr Ser Tyr His Glu Ser Leu Tyr Asn Ala Leu Gln 2120
2125 2130 Ser Leu Arg Asp Arg Glu Phe
Ser Thr Phe Tyr Glu Ser Leu Lys 2135 2140
2145 Tyr Ala Arg Val Lys Glu Val Glu Glu Met Cys Lys
Arg Ser Leu 2150 2155 2160
Glu Ser Val Tyr Ser Leu Tyr Pro Thr Leu Ser Arg Leu Gln Ala 2165
2170 2175 Ile Gly Glu Leu Glu
Ser Ile Gly Glu Leu Phe Ser Arg Ser Val 2180 2185
2190 Thr His Arg Gln Leu Ser Glu Val Tyr Ile
Lys Trp Gln Lys His 2195 2200 2205
Ser Gln Leu Leu Lys Asp Ser Asp Phe Ser Phe Gln Glu Pro Ile
2210 2215 2220 Met Ala
Leu Arg Thr Val Ile Leu Glu Ile Leu Met Glu Lys Glu 2225
2230 2235 Met Asp Asn Ser Gln Arg Glu
Cys Ile Lys Asp Ile Leu Thr Lys 2240 2245
2250 His Leu Val Glu Leu Ser Ile Leu Ala Arg Thr Phe
Lys Asn Thr 2255 2260 2265
Gln Leu Pro Glu Arg Ala Ile Phe Gln Ile Lys Gln Tyr Asn Ser 2270
2275 2280 Val Ser Cys Gly Val
Ser Glu Trp Gln Leu Glu Glu Ala Gln Val 2285 2290
2295 Phe Trp Ala Lys Lys Glu Gln Ser Leu Ala
Leu Ser Ile Leu Lys 2300 2305 2310
Gln Met Ile Lys Lys Leu Asp Ala Ser Cys Ala Ala Asn Asn Pro
2315 2320 2325 Ser Leu
Lys Leu Thr Tyr Thr Glu Cys Leu Arg Val Cys Gly Asn 2330
2335 2340 Trp Leu Ala Glu Thr Cys Leu
Glu Asn Pro Ala Val Ile Met Gln 2345 2350
2355 Thr Tyr Leu Glu Lys Ala Val Glu Val Ala Gly Asn
Tyr Asp Gly 2360 2365 2370
Glu Ser Ser Asp Glu Leu Arg Asn Gly Lys Met Lys Ala Phe Leu 2375
2380 2385 Ser Leu Ala Arg Phe
Ser Asp Thr Gln Tyr Gln Arg Ile Glu Asn 2390 2395
2400 Tyr Met Lys Ser Ser Glu Phe Glu Asn Lys
Gln Ala Leu Leu Lys 2405 2410 2415
Arg Ala Lys Glu Glu Val Gly Leu Leu Arg Glu His Lys Ile Gln
2420 2425 2430 Thr Asn
Arg Tyr Thr Val Lys Val Gln Arg Glu Leu Glu Leu Asp 2435
2440 2445 Glu Leu Ala Leu Arg Ala Leu
Lys Glu Asp Arg Lys Arg Phe Leu 2450 2455
2460 Cys Lys Ala Val Glu Asn Tyr Ile Asn Cys Leu Leu
Ser Gly Glu 2465 2470 2475
Glu His Asp Met Trp Val Phe Arg Leu Cys Ser Leu Trp Leu Glu 2480
2485 2490 Asn Ser Gly Val Ser
Glu Val Asn Gly Met Met Lys Arg Asp Gly 2495 2500
2505 Met Lys Ile Pro Thr Tyr Lys Phe Leu Pro
Leu Met Tyr Gln Leu 2510 2515 2520
Ala Ala Arg Met Gly Thr Lys Met Met Gly Gly Leu Gly Phe His
2525 2530 2535 Glu Val
Leu Asn Asn Leu Ile Ser Arg Ile Ser Met Asp His Pro 2540
2545 2550 His His Thr Leu Phe Ile Ile
Leu Ala Leu Ala Asn Ala Asn Arg 2555 2560
2565 Asp Glu Phe Leu Thr Lys Pro Glu Val Ala Arg Arg
Ser Arg Ile 2570 2575 2580
Thr Lys Asn Val Pro Lys Gln Ser Ser Gln Leu Asp Glu Asp Arg 2585
2590 2595 Thr Glu Ala Ala Asn
Arg Ile Ile Cys Thr Ile Arg Ser Arg Arg 2600 2605
2610 Pro Gln Met Val Arg Ser Val Glu Ala Leu
Cys Asp Ala Tyr Ile 2615 2620 2625
Ile Leu Ala Asn Leu Asp Ala Thr Gln Trp Lys Thr Gln Arg Lys
2630 2635 2640 Gly Ile
Asn Ile Pro Ala Asp Gln Pro Ile Thr Lys Leu Lys Asn 2645
2650 2655 Leu Glu Asp Val Val Val Pro
Thr Met Glu Ile Lys Val Asp His 2660 2665
2670 Thr Gly Glu Tyr Gly Asn Leu Val Thr Ile Gln Ser
Phe Lys Ala 2675 2680 2685
Glu Phe Arg Leu Ala Gly Gly Val Asn Leu Pro Lys Ile Ile Asp 2690
2695 2700 Cys Val Gly Ser Asp
Gly Lys Glu Arg Arg Gln Leu Val Lys Gly 2705 2710
2715 Arg Asp Asp Leu Arg Gln Asp Ala Val Met
Gln Gln Val Phe Gln 2720 2725 2730
Met Cys Asn Thr Leu Leu Gln Arg Asn Thr Glu Thr Arg Lys Arg
2735 2740 2745 Lys Leu
Thr Ile Cys Thr Tyr Lys Val Val Pro Leu Ser Gln Arg 2750
2755 2760 Ser Gly Val Leu Glu Trp Cys
Thr Gly Thr Val Pro Ile Gly Glu 2765 2770
2775 Phe Leu Val Asn Asn Glu Asp Gly Ala His Lys Arg
Tyr Arg Pro 2780 2785 2790
Asn Asp Phe Ser Ala Phe Gln Cys Gln Lys Lys Met Met Glu Val 2795
2800 2805 Gln Lys Lys Ser Phe
Glu Glu Lys Tyr Glu Val Phe Met Asp Val 2810 2815
2820 Cys Gln Asn Phe Gln Pro Val Phe Arg Tyr
Phe Cys Met Glu Lys 2825 2830 2835
Phe Leu Asp Pro Ala Ile Trp Phe Glu Lys Arg Leu Ala Tyr Thr
2840 2845 2850 Arg Ser
Val Ala Thr Ser Ser Ile Val Gly Tyr Ile Leu Gly Leu 2855
2860 2865 Gly Asp Arg His Val Gln Asn
Ile Leu Ile Asn Glu Gln Ser Ala 2870 2875
2880 Glu Leu Val His Ile Asp Leu Gly Val Ala Phe Glu
Gln Gly Lys 2885 2890 2895
Ile Leu Pro Thr Pro Glu Thr Val Pro Phe Arg Leu Thr Arg Asp 2900
2905 2910 Ile Val Asp Gly Met
Gly Ile Thr Gly Val Glu Gly Val Phe Arg 2915 2920
2925 Arg Cys Cys Glu Lys Thr Met Glu Val Met
Arg Asn Ser Gln Glu 2930 2935 2940
Thr Leu Leu Thr Ile Val Glu Val Leu Leu Tyr Asp Pro Leu Phe
2945 2950 2955 Asp Trp
Thr Met Asn Pro Leu Lys Ala Leu Tyr Leu Gln Gln Arg 2960
2965 2970 Pro Glu Asp Glu Thr Glu Leu
His Pro Thr Leu Asn Ala Asp Asp 2975 2980
2985 Gln Glu Cys Lys Arg Asn Leu Ser Asp Ile Asp Gln
Ser Phe Asn 2990 2995 3000
Lys Val Ala Glu Arg Val Leu Met Arg Leu Gln Glu Lys Leu Lys 3005
3010 3015 Gly Val Glu Glu Gly
Thr Val Leu Ser Val Gly Gly Gln Val Asn 3020 3025
3030 Leu Leu Ile Gln Gln Ala Ile Asp Pro Lys
Asn Leu Ser Arg Leu 3035 3040 3045
Phe Pro Gly Trp Lys Ala Trp Val 3050 3055
<210> 18
<211> 13147
<212> DNA
<213> Artificial Sequence
<220>
<223> ATM sequence
<400> 18
ccggagcccg agccgaaggg
cgagccgcaa acgctaagtc gctggccatt ggtggacatg 60gcgcaggcgc gtttgctccg
acgggccgaa tgttttgggg cagtgttttg agcgcggaga 120ccgcgtgata ctggatgcgc
atgggcatac cgtgctctgc ggctgcttgg cgttgcttct 180tcctccagaa gtgggcgctg
ggcagtcacg cagggtttga accggaagcg ggagtaggta 240gctgcgtggc taacggagaa
aagaagccgt ggccgcggga ggaggcgaga ggagtcggga 300tctgcgctgc agccaccgcc
gcggttgata ctactttgac cttccgagtg cagtgacagt 360gatgtgtgtt ctgaaattgt
gaaccatgag tctagtactt aatgatctgc ttatctgctg 420ccgtcaacta gaacatgata
gagctacaga acgaaagaaa gaagttgaga aatttaagcg 480cctgattcga gatcctgaaa
caattaaaca tctagatcgg cattcagatt ccaaacaagg 540aaaatatttg aattgggatg
ctgtttttag atttttacag aaatatattc agaaagaaac 600agaatgtctg agaatagcaa
aaccaaatgt atcagcctca acacaagcct ccaggcagaa 660aaagatgcag gaaatcagta
gtttggtcaa atacttcatc aaatgtgcaa acagaagagc 720acctaggcta aaatgtcaag
aactcttaaa ttatatcatg gatacagtga aagattcatc 780taatggtgct atttacggag
ctgattgtag caacatacta ctcaaagaca ttctttctgt 840gagaaaatac tggtgtgaaa
tatctcagca acagtggtta gaattgttct ctgtgtactt 900caggctctat ctgaaacctt
cacaagatgt tcatagagtt ttagtggcta gaataattca 960tgctgttacc aaaggatgct
gttctcagac tgacggatta aattccaaat ttttggactt 1020tttttccaag gctattcagt
gtgcgagaca agaaaagagc tcttcaggtc taaatcatat 1080cttagcagct cttactatct
tcctcaagac tttggctgtc aactttcgaa ttcgagtgtg 1140tgaattagga gatgaaattc
ttcccacttt gctttatatt tggactcaac ataggcttaa 1200tgattcttta aaagaagtca
ttattgaatt atttcaactg caaatttata tccatcatcc 1260gaaaggagcc aaaacccaag
aaaaaggtgc ttatgaatca acaaaatgga gaagtatttt 1320atacaactta tatgatctgc
tagtgaatga gataagtcat ataggaagta gaggaaagta 1380ttcttcagga tttcgtaata
ttgccgtcaa agaaaatttg attgaattga tggcagatat 1440ctgtcaccag gtttttaatg
aagataccag atccttggag atttctcaat cttacactac 1500tacacaaaga gaatctagtg
attacagtgt cccttgcaaa aggaagaaaa tagaactagg 1560ctgggaagta ataaaagatc
accttcagaa gtcacagaat gattttgatc ttgtgccttg 1620gctacagatt gcaacccaat
taatatcaaa gtatcctgca agtttaccta actgtgagct 1680gtctccatta ctgatgatac
tatctcagct tctaccccaa cagcgacatg gggaacgtac 1740accatatgtg ttacgatgcc
ttacggaagt tgcattgtgt caagacaaga ggtcaaacct 1800agaaagctca caaaagtcag
atttattaaa actctggaat aaaatttggt gtattacctt 1860tcgtggtata agttctgagc
aaatacaagc tgaaaacttt ggcttacttg gagccataat 1920tcagggtagt ttagttgagg
ttgacagaga attctggaag ttatttactg ggtcagcctg 1980cagaccttca tgtcctgcag
tatgctgttt gactttggca ctgaccacca gtatagttcc 2040aggaacggta aaaatgggaa
tagagcaaaa tatgtgtgaa gtaaatagaa gcttttcttt 2100aaaggaatca ataatgaaat
ggctcttatt ctatcagtta gagggtgact tagaaaatag 2160cacagaagtg cctccaattc
ttcacagtaa ttttcctcat cttgtactgg agaaaattct 2220tgtgagtctc actatgaaaa
actgtaaagc tgcaatgaat tttttccaaa gcgtgccaga 2280atgtgaacac caccaaaaag
ataaagaaga actttcattc tcagaagtag aagaactatt 2340tcttcagaca acttttgaca
agatggactt tttaaccatt gtgagagaat gtggtataga 2400aaagcaccag tccagtattg
gcttctctgt ccaccagaat ctcaaggaat cactggatcg 2460ctgtcttctg ggattatcag
aacagcttct gaataattac tcatctgaga ttacaaattc 2520agaaactctt gtccggtgtt
cacgtctttt ggtgggtgtc cttggctgct actgttacat 2580gggtgtaata gctgaagagg
aagcatataa gtcagaatta ttccagaaag ccaagtctct 2640aatgcaatgt gcaggagaaa
gtatcactct gtttaaaaat aagacaaatg aggaattcag 2700aattggttcc ttgagaaata
tgatgcagct atgtacacgt tgcttgagca actgtaccaa 2760gaagagtcca aataagattg
catctggctt tttcctgcga ttgttaacat caaagctaat 2820gaatgacatt gcagatattt
gtaaaagttt agcatccttc atcaaaaagc catttgaccg 2880tggagaagta gaatcaatgg
aagatgatac taatggaaat ctaatggagg tggaggatca 2940gtcatccatg aatctattta
acgattaccc tgatagtagt gttagtgatg caaacgaacc 3000tggagagagc caaagtacca
taggtgccat taatccttta gctgaagaat atctgtcaaa 3060gcaagatcta cttttcttag
acatgctcaa gttcttgtgt ttgtgtgtaa ctactgctca 3120gaccaatact gtgtccttta
gggcagctga tattcggagg aaattgttaa tgttaattga 3180ttctagcacg ctagaaccta
ccaaatccct ccacctgcat atgtatctaa tgcttttaaa 3240ggagcttcct ggagaagagt
accccttgcc aatggaagat gttcttgaac ttctgaaacc 3300actatccaat gtgtgttctt
tgtatcgtcg tgaccaagat gtttgtaaaa ctattttaaa 3360ccatgtcctt catgtagtga
aaaacctagg tcaaagcaat atggactctg agaacacaag 3420ggatgctcaa ggacagtttc
ttacagtaat tggagcattt tggcatctaa caaaggagag 3480gaaatatata ttctctgtaa
gaatggccct agtaaattgc cttaaaactt tgcttgaggc 3540tgatccttat tcaaaatggg
ccattcttaa tgtaatggga aaagactttc ctgtaaatga 3600agtatttaca caatttcttg
ctgacaatca tcaccaagtt cgcatgttgg ctgcagagtc 3660aatcaataga ttgttccagg
acacgaaggg agattcttcc aggttactga aagcacttcc 3720tttgaagctt cagcaaacag
cttttgaaaa tgcatacttg aaagctcagg aaggaatgag 3780agaaatgtcc catagtgctg
agaaccctga aactttggat gaaatttata atagaaaatc 3840tgttttactg acgttgatag
ctgtggtttt atcctgtagc cctatctgcg aaaaacaggc 3900tttgtttgcc ctgtgtaaat
ctgtgaaaga gaatggatta gaacctcacc ttgtgaaaaa 3960ggttttagag aaagtttctg
aaacttttgg atatagacgt ttagaagact ttatggcatc 4020tcatttagat tatctggttt
tggaatggct aaatcttcaa gatactgaat acaacttatc 4080ttcttttcct tttattttat
taaactacac aaatattgag gatttctata gatcttgtta 4140taaggttttg attccacatc
tggtgattag aagtcatttt gatgaggtga agtccattgc 4200taatcagatt caagaggact
ggaaaagtct tctaacagac tgctttccaa agattcttgt 4260aaatattctt ccttattttg
cctatgaggg taccagagac agtgggatgg cacagcaaag 4320agagactgct accaaggtct
atgatatgct taaaagtgaa aacttattgg gaaaacagat 4380tgatcactta ttcattagta
atttaccaga gattgtggtg gagttattga tgacgttaca 4440tgagccagca aattctagtg
ccagtcagag cactgacctc tgtgactttt caggggattt 4500ggatcctgct cctaatccac
ctcattttcc atcgcatgtg attaaagcaa catttgccta 4560tatcagcaat tgtcataaaa
ccaagttaaa aagcatttta gaaattcttt ccaaaagccc 4620tgattcctat cagaaaattc
ttcttgccat atgtgagcaa gcagctgaaa caaataatgt 4680ttataagaag cacagaattc
ttaaaatata tcacctgttt gttagtttat tactgaaaga 4740tataaaaagt ggcttaggag
gagcttgggc ctttgttctt cgagacgtta tttatacttt 4800gattcactat atcaaccaaa
ggccttcttg tatcatggat gtgtcattac gtagcttctc 4860cctttgttgt gacttattaa
gtcaggtttg ccagacagcc gtgacttact gtaaggatgc 4920tctagaaaac catcttcatg
ttattgttgg tacacttata ccccttgtgt atgagcaggt 4980ggaggttcag aaacaggtat
tggacttgtt gaaatactta gtgatagata acaaggataa 5040tgaaaacctc tatatcacga
ttaagctttt agatcctttt cctgaccatg ttgtttttaa 5100ggatttgcgt attactcagc
aaaaaatcaa atacagtaga ggaccctttt cactcttgga 5160ggaaattaac cattttctct
cagtaagtgt ttatgatgca cttccattga caagacttga 5220aggactaaag gatcttcgaa
gacaactgga actacataaa gatcagatgg tggacattat 5280gagagcttct caggataatc
cgcaagatgg gattatggtg aaactagttg tcaatttgtt 5340gcagttatcc aagatggcaa
taaaccacac tggtgaaaaa gaagttctag aggctgttgg 5400aagctgcttg ggagaagtgg
gtcctataga tttctctacc atagctatac aacatagtaa 5460agatgcatct tataccaagg
cccttaagtt atttgaagat aaagaacttc agtggacctt 5520cataatgctg acctacctga
ataacacact ggtagaagat tgtgtcaaag ttcgatcagc 5580agctgttacc tgtttgaaaa
acattttagc cacaaagact ggacatagtt tctgggagat 5640ttataagatg acaacagatc
caatgctggc ctatctacag ccttttagaa catcaagaaa 5700aaagttttta gaagtaccca
gatttgacaa agaaaaccct tttgaaggcc tggatgatat 5760aaatctgtgg attcctctaa
gtgaaaatca tgacatttgg ataaagacac tgacttgtgc 5820ttttttggac agtggaggca
caaaatgtga aattcttcaa ttattaaagc caatgtgtga 5880agtgaaaact gacttttgtc
agactgtact tccatacttg attcatgata ttttactcca 5940agatacaaat gaatcatgga
gaaatctgct ttctacacat gttcagggat ttttcaccag 6000ctgtcttcga cacttctcgc
aaacgagccg atccacaacc cctgcaaact tggattcaga 6060gtcagagcac tttttccgat
gctgtttgga taaaaaatca caaagaacaa tgcttgctgt 6120tgtggactac atgagaagac
aaaagagacc ttcttcagga acaattttta atgatgcttt 6180ctggctggat ttaaattatc
tagaagttgc caaggtagct cagtcttgtg ctgctcactt 6240tacagcttta ctctatgcag
aaatctatgc agataagaaa agtatggatg atcaagagaa 6300aagaagtctt gcatttgaag
aaggaagcca gagtacaact atttctagct tgagtgaaaa 6360aagtaaagaa gaaactggaa
taagtttaca ggatcttctc ttagaaatct acagaagtat 6420aggggagcca gatagtttgt
atggctgtgg tggagggaag atgttacaac ccattactag 6480actacgaaca tatgaacacg
aagcaatgtg gggcaaagcc ctagtaacat atgacctcga 6540aacagcaatc ccctcatcaa
cacgccaggc aggaatcatt caggccttgc agaatttggg 6600actctgccat attctttccg
tctatttaaa aggattggat tatgaaaata aagactggtg 6660tcctgaacta gaagaacttc
attaccaagc agcatggagg aatatgcagt gggaccattg 6720cacttccgtc agcaaagaag
tagaaggaac cagttaccat gaatcattgt acaatgctct 6780acaatctcta agagacagag
aattctctac attttatgaa agtctcaaat atgccagagt 6840aaaagaagtg gaagagatgt
gtaagcgcag ccttgagtct gtgtattcgc tctatcccac 6900acttagcagg ttgcaggcca
ttggagagct ggaaagcatt ggggagcttt tctcaagatc 6960agtcacacat agacaactct
ctgaagtata tattaagtgg cagaaacact cccagcttct 7020caaggacagt gattttagtt
ttcaggagcc tatcatggct ctacgcacag tcattttgga 7080gatcctgatg gaaaaggaaa
tggacaactc acaaagagaa tgtattaagg acattctcac 7140caaacacctt gtagaactct
ctatactggc cagaactttc aagaacactc agctccctga 7200aagggcaata tttcaaatta
aacagtacaa ttcagttagc tgtggagtct ctgagtggca 7260gctggaagaa gcacaagtat
tctgggcaaa aaaggagcag agtcttgccc tgagtattct 7320caagcaaatg atcaagaagt
tggatgccag ctgtgcagcg aacaatccca gcctaaaact 7380tacatacaca gaatgtctga
gggtttgtgg caactggtta gcagaaacgt gcttagaaaa 7440tcctgcggtc atcatgcaga
cctatctaga aaaggcagta gaagttgctg gaaattatga 7500tggagaaagt agtgatgagc
taagaaatgg aaaaatgaag gcatttctct cattagcccg 7560gttttcagat actcaatacc
aaagaattga aaactacatg aaatcatcgg aatttgaaaa 7620caagcaagct ctcctgaaaa
gagccaaaga ggaagtaggt ctccttaggg aacataaaat 7680tcagacaaac agatacacag
taaaggttca gcgagagctg gagttggatg aattagccct 7740gcgtgcactg aaagaggatc
gtaaacgctt cttatgtaaa gcagttgaaa attatatcaa 7800ctgcttatta agtggagaag
aacatgatat gtgggtattc cgactttgtt ccctctggct 7860tgaaaattct ggagtttctg
aagtcaatgg catgatgaag agagacggaa tgaagattcc 7920aacatataaa tttttgcctc
ttatgtacca attggctgct agaatgggga ccaagatgat 7980gggaggccta ggatttcatg
aagtcctcaa taatctaatc tctagaattt caatggatca 8040cccccatcac actttgttta
ttatactggc cttagcaaat gcaaacagag atgaatttct 8100gactaaacca gaggtagcca
gaagaagcag aataactaaa aatgtgccta aacaaagctc 8160tcagcttgat gaggatcgaa
cagaggctgc aaatagaata atatgtacta tcagaagtag 8220gagacctcag atggtcagaa
gtgttgaggc actttgtgat gcttatatta tattagcaaa 8280cttagatgcc actcagtgga
agactcagag aaaaggcata aatattccag cagaccagcc 8340aattactaaa cttaagaatt
tagaagatgt tgttgtccct actatggaaa ttaaggtgga 8400ccacacagga gaatatggaa
atctggtgac tatacagtca tttaaagcag aatttcgctt 8460agcaggaggt gtaaatttac
caaaaataat agattgtgta ggttccgatg gcaaggagag 8520gagacagctt gttaagggcc
gtgatgacct gagacaagat gctgtcatgc aacaggtctt 8580ccagatgtgt aatacattac
tgcagagaaa cacggaaact aggaagagga aattaactat 8640ctgtacttat aaggtggttc
ccctctctca gcgaagtggt gttcttgaat ggtgcacagg 8700aactgtcccc attggtgaat
ttcttgttaa caatgaagat ggtgctcata aaagatacag 8760gccaaatgat ttcagtgcct
ttcagtgcca aaagaaaatg atggaggtgc aaaaaaagtc 8820ttttgaagag aaatatgaag
tcttcatgga tgtttgccaa aattttcaac cagttttccg 8880ttacttctgc atggaaaaat
tcttggatcc agctatttgg tttgagaagc gattggctta 8940tacgcgcagt gtagctactt
cttctattgt tggttacata cttggacttg gtgatagaca 9000tgtacagaat atcttgataa
atgagcagtc agcagaactt gtacatatag atctaggtgt 9060tgcttttgaa cagggcaaaa
tccttcctac tcctgagaca gttcctttta gactcaccag 9120agatattgtg gatggcatgg
gcattacggg tgttgaaggt gtcttcagaa gatgctgtga 9180gaaaaccatg gaagtgatga
gaaactctca ggaaactctg ttaaccattg tagaggtcct 9240tctatatgat ccactctttg
actggaccat gaatcctttg aaagctttgt atttacagca 9300gaggccggaa gatgaaactg
agcttcaccc tactctgaat gcagatgacc aagaatgcaa 9360acgaaatctc agtgatattg
accagagttt caacaaagta gctgaacgtg tcttaatgag 9420actacaagag aaactgaaag
gagtggaaga aggcactgtg ctcagtgttg gtggacaagt 9480gaatttgctc atacagcagg
ccatagaccc caaaaatctc agccgacttt tcccaggatg 9540gaaagcttgg gtgtgatctt
cagtatatga attacccttt cattcagcct ttagaaatta 9600tattttagcc tttattttta
acctgccaac atactttaag tagggattaa tatttaagtg 9660aactattgtg ggtttttttg
aatgttggtt ttaatacttg atttaatcac cactcaaaaa 9720tgttttgatg gtcttaagga
acatctctgc tttcactctt tagaaataat ggtcattcgg 9780gctgggcgca gcggctcacg
cctgtaatcc cagcactttg ggaggccgag gtgagcggat 9840cacaaggtca ggagttcgag
accagcctgg ccaagagacc agcctggcca gtatggtgaa 9900accctgtctc tactaaaaat
acaaaaatta gccgagcatg gtggcgggca cctgtaatcc 9960cagctactcg agaggctgag
gcaggagaat ctcttgaacc tgggaggtga aggttgctgt 10020gggccaaaat catgccattg
cactccagcc tgggtgacaa gagcgaaact ccatctcaaa 10080aaaaaaaaaa aaaaaacaga
aacgtatttg gatttttcct agtaagatca ctcagtgtta 10140ctaaataatg aagttgttat
ggagaacaaa tttcaaagac acagttagtg tagttactat 10200ttttttaagt gtgtattaaa
acttctcatt ctattctctt tatcttttaa gcccttctgt 10260actgtccatg tatgttatct
ttctgtgata acttcataga ttgccttcta gttcatgaat 10320tctcttgtca gatgtatata
atctctttta ccctatccat tgggcttctt ctttcagaaa 10380ttgtttttca tttctaatta
tgcatcattt ttcagatctc tgtttcttga tgtcattttt 10440aatgtttttt taatgttttt
tatgtcacta attattttaa atgtctgtac ttgatagaca 10500ctgtaatagt tctattaaat
ttagttcctg ctgtttatat ctgttgattt ttgtatttga 10560taggctgttc atccagtttt
gtctttttga aaagtgagtt tattttcagc aaggctttat 10620ctatgggaat cttgagtgtc
tgtttatgtc atattcccag ggctgttgct gcacacaagc 10680ccattcttat tttaatttct
tggctttagg gtttccatac ctgaagtgta gcataaatac 10740tgataggaga tttcccaggc
caaggcaaac acacttcctc ctcatctcct tgtgctagtg 10800ggcagaatat ttgattgatg
cctttttcac tgagagtata agcttccatg tgtcccacct 10860ttatggcagg ggtggaagga
ggtacattta attcccactg cctgcctttg gcaagccctg 10920ggttctttgc tccccatata
gatgtctaag ctaaaagccg tgggttaatg agactggcaa 10980attgttccag gacagctaca
gcatcagctc acatattcac ctctctggtt tttcattccc 11040ctcatttttt tctgagacag
agtcttgctc tgtcacccag gctggagtgc agtggcatga 11100tctcagctca ctgaaacctc
tgcctcctgg gttcaagcaa ttctcctgcc tcagcctccc 11160gagtagctgg gactacaggc
gtgtgccaac acgcccggct aattttttgt atttttatta 11220gagacggagt ttcaccgtgt
tagccaggat ggtctcgatc gcttgacctc gtgatccacc 11280ctcctcggcc tcccaaagtg
ctgggattac aggtgtgagc caccgcgccc ggcctcattc 11340ccctcatttt tgaccgtaag
gatttcccct ttcttgtaag ttctgctatg tatttaaaag 11400aatgttttct acattttatc
cagcatttct ctgtgttctg ttggaaggga agggcttagg 11460tatctagttt gatacatagg
tagaagtgga acatttctct gtcccccagc tgtcatcata 11520taagataaac atcagataaa
aagccacctg aaagtaaaac tactgactcg tgtattagtg 11580agtataatct cttctccatc
cttaggaaaa tgttcatccc agctgcggag attaacaaat 11640gggtgattga gctttctcct
cgtatttgga ccttgaaggt tatataaatt tttttcttat 11700gaagagttgg catttctttt
tattgccaat ggcaggcact cattcatatt tgatctcctc 11760accttcccct cccctaaaac
caatctccag aactttttgg actataaatt tcttggtttg 11820acttctggag aactgttcag
aatattactt tgcatttcaa attacaaact taccttggtg 11880tatctttttc ttacaagctg
cctaaatgaa tatttggtat atattggtag ttttattact 11940atagtaaatc aaggaaatgc
agtaaactta aaatgtcttt aagaaagccc tgaaatcttc 12000atgggtgaaa ttagaaatta
tcaactagat aatagtatag ataaatgaat ttgtagctaa 12060ttcttgctag ttgttgcatc
cagagagctt tgaataacat cattaatcta ctctttagcc 12120ttgcatggta tgctatgagg
ctcctgttct gttcaagtat tctaatcaat ggctttgaaa 12180agtttatcaa atttacatac
agatcacaag cctaggagaa ataactaatt cacagatgac 12240agaattaaga ttataaaaga
tttttttttt gtaattttag tagagacagg gttgccattg 12300tattccagcc ttggcgacag
agcaagactc tgcctcaaaa aaaaaaaaaa aaaggttttg 12360gcaagctgga actctttctg
caaatgacta agatagaaaa ctgccaagga caaatgagga 12420gtagttagat tttgaaaata
ttaatcatag aatagttgtt gtatgctaag tcactgaccc 12480atattatgta cagcatttct
gatctttact ttgcaagatt agtgatacta tcccaataca 12540ctgctggaga aatcagaatt
tggagaaata agttgtccaa ggcaagaaga tagtaaatta 12600taagtacaag tgtaatatgg
acagtatcta acttgaaaag atttcaggcg aaaagaatct 12660ggggtttgcc agtcagttgc
tcaaaaggtc aatgaaaacc aaatagtgaa gctatcagag 12720aagctaataa attatagact
gcttgaacag ttgtgtccag attaagggag ataatagctt 12780tcccacccta ctttgtgcag
gtcatacctc cccaaagtgt ttacctaatc agtaggttca 12840caaactcttg gtcattatag
tatatgccta aaatgtatgc acttaggaat gctaaaaatt 12900taaatatggt ctaaagcaaa
taaaagcaaa gaggaaaaac tttggacagc gtaaagacta 12960gaatagtctt ttaaaaagaa
agccagtata ttggtttgaa atatagagat gtgtcccaat 13020ttcaagtatt ttaattgcac
cttaatgaaa ttatctattt tctatagatt ttagtactat 13080tgaatgtatt actttactgt
tacctgaatt tattataaag tgtttttgaa taaataattc 13140taaaagc
13147
SEQ ID NO: 19; length: 2644; type: PRT; organism: Artificial Sequence; description: ATR sequence
Met Gly Glu His Gly Leu Glu Leu Ala Ser Met Ile
Pro Ala Leu Arg 1 5 10
15 Glu Leu Gly Ser Ala Thr Pro Glu Glu Tyr Asn Thr Val Val Gln Lys
20 25 30 Pro Arg Gln
Ile Leu Cys Gln Phe Ile Asp Arg Ile Leu Thr Asp Val 35
40 45 Asn Val Val Ala Val Glu Leu Val
Lys Lys Thr Asp Ser Gln Pro Thr 50 55
60 Ser Val Met Leu Leu Asp Phe Ile Gln His Ile Met Lys
Ser Ser Pro 65 70 75
80 Leu Met Phe Val Asn Val Ser Gly Ser His Glu Ala Lys Gly Ser Cys
85 90 95 Ile Glu Phe Ser
Asn Trp Ile Ile Thr Arg Leu Leu Arg Ile Ala Ala 100
105 110 Thr Pro Ser Cys His Leu Leu His Lys
Lys Ile Cys Glu Val Ile Cys 115 120
125 Ser Leu Leu Phe Leu Phe Lys Ser Lys Ser Pro Ala Ile Phe
Gly Val 130 135 140
Leu Thr Lys Glu Leu Leu Gln Leu Phe Glu Asp Leu Val Tyr Leu His 145
150 155 160 Arg Arg Asn Val Met
Gly His Ala Val Glu Trp Pro Val Val Met Ser 165
170 175 Arg Phe Leu Ser Gln Leu Asp Glu His Met
Gly Tyr Leu Gln Ser Ala 180 185
190 Pro Leu Gln Leu Met Ser Met Gln Asn Leu Glu Phe Ile Glu Val
Thr 195 200 205 Leu
Leu Met Val Leu Thr Arg Ile Ile Ala Ile Val Phe Phe Arg Arg 210
215 220 Gln Glu Leu Leu Leu Trp
Gln Ile Gly Cys Val Leu Leu Glu Tyr Gly 225 230
235 240 Ser Pro Lys Ile Lys Ser Leu Ala Ile Ser Phe
Leu Thr Glu Leu Phe 245 250
255 Gln Leu Gly Gly Leu Pro Ala Gln Pro Ala Ser Thr Phe Phe Ser Ser
260 265 270 Phe Leu
Glu Leu Leu Lys His Leu Val Glu Met Asp Thr Asp Gln Leu 275
280 285 Lys Leu Tyr Glu Glu Pro Leu
Ser Lys Leu Ile Lys Thr Leu Phe Pro 290 295
300 Phe Glu Ala Glu Ala Tyr Arg Asn Ile Glu Pro Val
Tyr Leu Asn Met 305 310 315
320 Leu Leu Glu Lys Leu Cys Val Met Phe Glu Asp Gly Val Leu Met Arg
325 330 335 Leu Lys Ser
Asp Leu Leu Lys Ala Ala Leu Cys His Leu Leu Gln Tyr 340
345 350 Phe Leu Lys Phe Val Pro Ala Gly
Tyr Glu Ser Ala Leu Gln Val Arg 355 360
365 Lys Val Tyr Val Arg Asn Ile Cys Lys Ala Leu Leu Asp
Val Leu Gly 370 375 380
Ile Glu Val Asp Ala Glu Tyr Leu Leu Gly Pro Leu Tyr Ala Ala Leu 385
390 395 400 Lys Met Glu Ser
Met Glu Ile Ile Glu Glu Ile Gln Cys Gln Thr Gln 405
410 415 Gln Glu Asn Leu Ser Ser Asn Ser Asp
Gly Ile Ser Pro Lys Arg Arg 420 425
430 Arg Leu Ser Ser Ser Leu Asn Pro Ser Lys Arg Ala Pro Lys
Gln Thr 435 440 445
Glu Glu Ile Lys His Val Asp Met Asn Gln Lys Ser Ile Leu Trp Ser 450
455 460 Ala Leu Lys Gln Lys
Ala Glu Ser Leu Gln Ile Ser Leu Glu Tyr Ser 465 470
475 480 Gly Leu Lys Asn Pro Val Ile Glu Met Leu
Glu Gly Ile Ala Val Val 485 490
495 Leu Gln Leu Thr Ala Leu Cys Thr Val His Cys Ser His Gln Asn
Met 500 505 510 Asn
Cys Arg Thr Phe Lys Asp Cys Gln His Lys Ser Lys Lys Lys Pro 515
520 525 Ser Val Val Ile Thr Trp
Met Ser Leu Asp Phe Tyr Thr Lys Val Leu 530 535
540 Lys Ser Cys Arg Ser Leu Leu Glu Ser Val Gln
Lys Leu Asp Leu Glu 545 550 555
560 Ala Thr Ile Asp Lys Val Val Lys Ile Tyr Asp Ala Leu Ile Tyr Met
565 570 575 Gln Val
Asn Ser Ser Phe Glu Asp His Ile Leu Glu Asp Leu Cys Gly 580
585 590 Met Leu Ser Leu Pro Trp Ile
Tyr Ser His Ser Asp Asp Gly Cys Leu 595 600
605 Lys Leu Thr Thr Phe Ala Ala Asn Leu Leu Thr Leu
Ser Cys Arg Ile 610 615 620
Ser Asp Ser Tyr Ser Pro Gln Ala Gln Ser Arg Cys Val Phe Leu Leu 625
630 635 640 Thr Leu Phe
Pro Arg Arg Ile Phe Leu Glu Trp Arg Thr Ala Val Tyr 645
650 655 Asn Trp Ala Leu Gln Ser Ser His
Glu Val Ile Arg Ala Ser Cys Val 660 665
670 Ser Gly Phe Phe Ile Leu Leu Gln Gln Gln Asn Ser Cys
Asn Arg Val 675 680 685
Pro Lys Ile Leu Ile Asp Lys Val Lys Asp Asp Ser Asp Ile Val Lys 690
695 700 Lys Glu Phe Ala
Ser Ile Leu Gly Gln Leu Val Cys Thr Leu His Gly 705 710
715 720 Met Phe Tyr Leu Thr Ser Ser Leu Thr
Glu Pro Phe Ser Glu His Gly 725 730
735 His Val Asp Leu Phe Cys Arg Asn Leu Lys Ala Thr Ser Gln
His Glu 740 745 750
Cys Ser Ser Ser Gln Leu Lys Ala Ser Val Cys Lys Pro Phe Leu Phe
755 760 765 Leu Leu Lys Lys
Lys Ile Pro Ser Pro Val Lys Leu Ala Phe Ile Asp 770
775 780 Asn Leu His His Leu Cys Lys His
Leu Asp Phe Arg Glu Asp Glu Thr 785 790
795 800 Asp Val Lys Ala Val Leu Gly Thr Leu Leu Asn Leu
Met Glu Asp Pro 805 810
815 Asp Lys Asp Val Arg Val Ala Phe Ser Gly Asn Ile Lys His Ile Leu
820 825 830 Glu Ser Leu
Asp Ser Glu Asp Gly Phe Ile Lys Glu Leu Phe Val Leu 835
840 845 Arg Met Lys Glu Ala Tyr Thr His
Ala Gln Ile Ser Arg Asn Asn Glu 850 855
860 Leu Lys Asp Thr Leu Ile Leu Thr Thr Gly Asp Ile Gly
Arg Ala Ala 865 870 875
880 Lys Gly Asp Leu Val Pro Phe Ala Leu Leu His Leu Leu His Cys Leu
885 890 895 Leu Ser Lys Ser
Ala Ser Val Ser Gly Ala Ala Tyr Thr Glu Ile Arg 900
905 910 Ala Leu Val Ala Ala Lys Ser Val Lys
Leu Gln Ser Phe Phe Ser Gln 915 920
925 Tyr Lys Lys Pro Ile Cys Gln Phe Leu Val Glu Ser Leu His
Ser Ser 930 935 940
Gln Met Thr Ala Leu Pro Asn Thr Pro Cys Gln Asn Ala Asp Val Arg 945
950 955 960 Lys Gln Asp Val Ala
His Gln Arg Glu Met Ala Leu Asn Thr Leu Ser 965
970 975 Glu Ile Ala Asn Val Phe Asp Phe Pro Asp
Leu Asn Arg Phe Leu Thr 980 985
990 Arg Thr Leu Gln Val Leu Leu Pro Asp Leu Ala Ala Lys Ala
Ser Pro 995 1000 1005
Ala Ala Ser Ala Leu Ile Arg Thr Leu Gly Lys Gln Leu Asn Val 1010
1015 1020 Asn Arg Arg Glu Ile
Leu Ile Asn Asn Phe Lys Tyr Ile Phe Ser 1025 1030
1035 His Leu Val Cys Ser Cys Ser Lys Asp Glu
Leu Glu Arg Ala Leu 1040 1045 1050
His Tyr Leu Lys Asn Glu Thr Glu Ile Glu Leu Gly Ser Leu Leu
1055 1060 1065 Arg Gln
Asp Phe Gln Gly Leu His Asn Glu Leu Leu Leu Arg Ile 1070
1075 1080 Gly Glu His Tyr Gln Gln Val
Phe Asn Gly Leu Ser Ile Leu Ala 1085 1090
1095 Ser Phe Ala Ser Ser Asp Asp Pro Tyr Gln Gly Pro
Arg Asp Ile 1100 1105 1110
Ile Ser Pro Glu Leu Met Ala Asp Tyr Leu Gln Pro Lys Leu Leu 1115
1120 1125 Gly Ile Leu Ala Phe
Phe Asn Met Gln Leu Leu Ser Ser Ser Val 1130 1135
1140 Gly Ile Glu Asp Lys Lys Met Ala Leu Asn
Ser Leu Met Ser Leu 1145 1150 1155
Met Lys Leu Met Gly Pro Lys His Val Ser Ser Val Arg Val Lys
1160 1165 1170 Met Met
Thr Thr Leu Arg Thr Gly Leu Arg Phe Lys Asp Asp Phe 1175
1180 1185 Pro Glu Leu Cys Cys Arg Ala
Trp Asp Cys Phe Val Arg Cys Leu 1190 1195
1200 Asp His Ala Cys Leu Gly Ser Leu Leu Ser His Val
Ile Val Ala 1205 1210 1215
Leu Leu Pro Leu Ile His Ile Gln Pro Lys Glu Thr Ala Ala Ile 1220
1225 1230 Phe His Tyr Leu Ile
Ile Glu Asn Arg Asp Ala Val Gln Asp Phe 1235 1240
1245 Leu His Glu Ile Tyr Phe Leu Pro Asp His
Pro Glu Leu Lys Lys 1250 1255 1260
Ile Lys Ala Val Leu Gln Glu Tyr Arg Lys Glu Thr Ser Glu Ser
1265 1270 1275 Thr Asp
Leu Gln Thr Thr Leu Gln Leu Ser Met Lys Ala Ile Gln 1280
1285 1290 His Glu Asn Val Asp Val Arg
Ile His Ala Leu Thr Ser Leu Lys 1295 1300
1305 Glu Thr Leu Tyr Lys Asn Gln Glu Lys Leu Ile Lys
Tyr Ala Thr 1310 1315 1320
Asp Ser Glu Thr Val Glu Pro Ile Ile Ser Gln Leu Val Thr Val 1325
1330 1335 Leu Leu Lys Gly Cys
Gln Asp Ala Asn Ser Gln Ala Arg Leu Leu 1340 1345
1350 Cys Gly Glu Cys Leu Gly Glu Leu Gly Ala
Ile Asp Pro Gly Arg 1355 1360 1365
Leu Asp Phe Ser Thr Thr Glu Thr Gln Gly Lys Asp Phe Thr Phe
1370 1375 1380 Val Thr
Gly Val Glu Asp Ser Ser Phe Ala Tyr Gly Leu Leu Met 1385
1390 1395 Glu Leu Thr Arg Ala Tyr Leu
Ala Tyr Ala Asp Asn Ser Arg Ala 1400 1405
1410 Gln Asp Ser Ala Ala Tyr Ala Ile Gln Glu Leu Leu
Ser Ile Tyr 1415 1420 1425
Asp Cys Arg Glu Met Glu Thr Asn Gly Pro Gly His Gln Leu Trp 1430
1435 1440 Arg Arg Phe Pro Glu
His Val Arg Glu Ile Leu Glu Pro His Leu 1445 1450
1455 Asn Thr Arg Tyr Lys Ser Ser Gln Lys Ser
Thr Asp Trp Ser Gly 1460 1465 1470
Val Lys Lys Pro Ile Tyr Leu Ser Lys Leu Gly Ser Asn Phe Ala
1475 1480 1485 Glu Trp
Ser Ala Ser Trp Ala Gly Tyr Leu Ile Thr Lys Val Arg 1490
1495 1500 His Asp Leu Ala Ser Lys Ile
Phe Thr Cys Cys Ser Ile Met Met 1505 1510
1515 Lys His Asp Phe Lys Val Thr Ile Tyr Leu Leu Pro
His Ile Leu 1520 1525 1530
Val Tyr Val Leu Leu Gly Cys Asn Gln Glu Asp Gln Gln Glu Val 1535
1540 1545 Tyr Ala Glu Ile Met
Ala Val Leu Lys His Asp Asp Gln His Thr 1550 1555
1560 Ile Asn Thr Gln Asp Ile Ala Ser Asp Leu
Cys Gln Leu Ser Thr 1565 1570 1575
Gln Thr Val Phe Ser Met Leu Asp His Leu Thr Gln Trp Ala Arg
1580 1585 1590 His Lys
Phe Gln Ala Leu Lys Ala Glu Lys Cys Pro His Ser Lys 1595
1600 1605 Ser Asn Arg Asn Lys Val Asp
Ser Met Val Ser Thr Val Asp Tyr 1610 1615
1620 Glu Asp Tyr Gln Ser Val Thr Arg Phe Leu Asp Leu
Ile Pro Gln 1625 1630 1635
Asp Thr Leu Ala Val Ala Ser Phe Arg Ser Lys Ala Tyr Thr Arg 1640
1645 1650 Ala Val Met His Phe
Glu Ser Phe Ile Thr Glu Lys Lys Gln Asn 1655 1660
1665 Ile Gln Glu His Leu Gly Phe Leu Gln Lys
Leu Tyr Ala Ala Met 1670 1675 1680
His Glu Pro Asp Gly Val Ala Gly Val Ser Ala Ile Arg Lys Ala
1685 1690 1695 Glu Pro
Ser Leu Lys Glu Gln Ile Leu Glu His Glu Ser Leu Gly 1700
1705 1710 Leu Leu Arg Asp Ala Thr Ala
Cys Tyr Asp Arg Ala Ile Gln Leu 1715 1720
1725 Glu Pro Asp Gln Ile Ile His Tyr His Gly Val Val
Lys Ser Met 1730 1735 1740
Leu Gly Leu Gly Gln Leu Ser Thr Val Ile Thr Gln Val Asn Gly 1745
1750 1755 Val His Ala Asn Arg
Ser Glu Trp Thr Asp Glu Leu Asn Thr Tyr 1760 1765
1770 Arg Val Glu Ala Ala Trp Lys Leu Ser Gln
Trp Asp Leu Val Glu 1775 1780 1785
Asn Tyr Leu Ala Ala Asp Gly Lys Ser Thr Thr Trp Ser Val Arg
1790 1795 1800 Leu Gly
Gln Leu Leu Leu Ser Ala Lys Lys Arg Asp Ile Thr Ala 1805
1810 1815 Phe Tyr Asp Ser Leu Lys Leu
Val Arg Ala Glu Gln Ile Val Pro 1820 1825
1830 Leu Ser Ala Ala Ser Phe Glu Arg Gly Ser Tyr Gln
Arg Gly Tyr 1835 1840 1845
Glu Tyr Ile Val Arg Leu His Met Leu Cys Glu Leu Glu His Ser 1850
1855 1860 Ile Lys Pro Leu Phe
Gln His Ser Pro Gly Asp Ser Ser Gln Glu 1865 1870
1875 Asp Ser Leu Asn Trp Val Ala Arg Leu Glu
Met Thr Gln Asn Ser 1880 1885 1890
Tyr Arg Ala Lys Glu Pro Ile Leu Ala Leu Arg Arg Ala Leu Leu
1895 1900 1905 Ser Leu
Asn Lys Arg Pro Asp Tyr Asn Glu Met Val Gly Glu Cys 1910
1915 1920 Trp Leu Gln Ser Ala Arg Val
Ala Arg Lys Ala Gly His His Gln 1925 1930
1935 Thr Ala Tyr Asn Ala Leu Leu Asn Ala Gly Glu Ser
Arg Leu Ala 1940 1945 1950
Glu Leu Tyr Val Glu Arg Ala Lys Trp Leu Trp Ser Lys Gly Asp 1955
1960 1965 Val His Gln Ala Leu
Ile Val Leu Gln Lys Gly Val Glu Leu Cys 1970 1975
1980 Phe Pro Glu Asn Glu Thr Pro Pro Glu Gly
Lys Asn Met Leu Ile 1985 1990 1995
His Gly Arg Ala Met Leu Leu Val Gly Arg Phe Met Glu Glu Thr
2000 2005 2010 Ala Asn
Phe Glu Ser Asn Ala Ile Met Lys Lys Tyr Lys Asp Val 2015
2020 2025 Thr Ala Cys Leu Pro Glu Trp
Glu Asp Gly His Phe Tyr Leu Ala 2030 2035
2040 Lys Tyr Tyr Asp Lys Leu Met Pro Met Val Thr Asp
Asn Lys Met 2045 2050 2055
Glu Lys Gln Gly Asp Leu Ile Arg Tyr Ile Val Leu His Phe Gly 2060
2065 2070 Arg Ser Leu Gln Tyr
Gly Asn Gln Phe Ile Tyr Gln Ser Met Pro 2075 2080
2085 Arg Met Leu Thr Leu Trp Leu Asp Tyr Gly
Thr Lys Ala Tyr Glu 2090 2095 2100
Trp Glu Lys Ala Gly Arg Ser Asp Arg Val Gln Met Arg Asn Asp
2105 2110 2115 Leu Gly
Lys Ile Asn Lys Val Ile Thr Glu His Thr Asn Tyr Leu 2120
2125 2130 Ala Pro Tyr Gln Phe Leu Thr
Ala Phe Ser Gln Leu Ile Ser Arg 2135 2140
2145 Ile Cys His Ser His Asp Glu Val Phe Val Val Leu
Met Glu Ile 2150 2155 2160
Ile Ala Lys Val Phe Leu Ala Tyr Pro Gln Gln Ala Met Trp Met 2165
2170 2175 Met Thr Ala Val Ser
Lys Ser Ser Tyr Pro Met Arg Val Asn Arg 2180 2185
2190 Cys Lys Glu Ile Leu Asn Lys Ala Ile His
Met Lys Lys Ser Leu 2195 2200 2205
Glu Lys Phe Val Gly Asp Ala Thr Arg Leu Thr Asp Lys Leu Leu
2210 2215 2220 Glu Leu
Cys Asn Lys Pro Val Asp Gly Ser Ser Ser Thr Leu Ser 2225
2230 2235 Met Ser Thr His Phe Lys Met
Leu Lys Lys Leu Val Glu Glu Ala 2240 2245
2250 Thr Phe Ser Glu Ile Leu Ile Pro Leu Gln Ser Val
Met Ile Pro 2255 2260 2265
Thr Leu Pro Ser Ile Leu Gly Thr His Ala Asn His Ala Ser His 2270
2275 2280 Glu Pro Phe Pro Gly
His Trp Ala Tyr Ile Ala Gly Phe Asp Asp 2285 2290
2295 Met Val Glu Ile Leu Ala Ser Leu Gln Lys
Pro Lys Lys Ile Ser 2300 2305 2310
Leu Lys Gly Ser Asp Gly Lys Phe Tyr Ile Met Met Cys Lys Pro
2315 2320 2325 Lys Asp
Asp Leu Arg Lys Asp Cys Arg Leu Met Glu Phe Asn Ser 2330
2335 2340 Leu Ile Asn Lys Cys Leu Arg
Lys Asp Ala Glu Ser Arg Arg Arg 2345 2350
2355 Glu Leu His Ile Arg Thr Tyr Ala Val Ile Pro Leu
Asn Asp Glu 2360 2365 2370
Cys Gly Ile Ile Glu Trp Val Asn Asn Thr Ala Gly Leu Arg Pro 2375
2380 2385 Ile Leu Thr Lys Leu
Tyr Lys Glu Lys Gly Val Tyr Met Thr Gly 2390 2395
2400 Lys Glu Leu Arg Gln Cys Met Leu Pro Lys
Ser Ala Ala Leu Ser 2405 2410 2415
Glu Lys Leu Lys Val Phe Arg Glu Phe Leu Leu Pro Arg His Pro
2420 2425 2430 Pro Ile
Phe His Glu Trp Phe Leu Arg Thr Phe Pro Asp Pro Thr 2435
2440 2445 Ser Trp Tyr Ser Ser Arg Ser
Ala Tyr Cys Arg Ser Thr Ala Val 2450 2455
2460 Met Ser Met Val Gly Tyr Ile Leu Gly Leu Gly Asp
Arg His Gly 2465 2470 2475
Glu Asn Ile Leu Phe Asp Ser Leu Thr Gly Glu Cys Val His Val 2480
2485 2490 Asp Phe Asn Cys Leu
Phe Asn Lys Gly Glu Thr Phe Glu Val Pro 2495 2500
2505 Glu Ile Val Pro Phe Arg Leu Thr His Asn
Met Val Asn Gly Met 2510 2515 2520
Gly Pro Met Gly Thr Glu Gly Leu Phe Arg Arg Ala Cys Glu Val
2525 2530 2535 Thr Met
Arg Leu Met Arg Asp Gln Arg Glu Pro Leu Met Ser Val 2540
2545 2550 Leu Lys Thr Phe Leu His Asp
Pro Leu Val Glu Trp Ser Lys Pro 2555 2560
2565 Val Lys Gly His Ser Lys Ala Pro Leu Asn Glu Thr
Gly Glu Val 2570 2575 2580
Val Asn Glu Lys Ala Lys Thr His Val Leu Asp Ile Glu Gln Arg 2585
2590 2595 Leu Gln Gly Val Ile
Lys Thr Arg Asn Arg Val Thr Gly Leu Pro 2600 2605
2610 Leu Ser Ile Glu Gly His Val His Tyr Leu
Ile Gln Glu Ala Thr 2615 2620 2625
Asp Glu Asn Leu Leu Cys Gln Met Tyr Leu Gly Trp Thr Pro Tyr
2630 2635 2640 Met
SEQ ID NO: 20; length: 8258; type: DNA; organism: Artificial Sequence; description: ATR sequence
ttccgggagg agttttggcc
tccacacggc tccgtcgggc gccgcgctct tccggcagcg 60gtagctttgg agacgccggg
aacccgcgtt ggcgtggttg actagtgcct cgcagcctca 120gcatggggga acatggcctg
gagctggctt ccatgatccc cgccctgcgg gagctgggca 180gtgccacacc agaggaatat
aatacagttg tacagaagcc aagacaaatt ctgtgtcaat 240tcattgaccg gatacttaca
gatgtaaatg ttgttgctgt agaacttgta aagaaaactg 300actctcagcc aacctccgtg
atgttgcttg atttcatcca gcatatcatg aaatcctccc 360cacttatgtt tgtaaatgtg
agtggaagcc atgaggccaa aggcagttgt attgaattca 420gtaattggat cataacgaga
cttctgcgga ttgcagcaac tccctcctgt catttgttac 480acaagaaaat ctgtgaagtc
atctgttcat tattatttct ttttaaaagc aagagtcctg 540ctatttttgg ggtactcaca
aaagaattat tacaactttt tgaagacttg gtttacctcc 600atagaagaaa tgtgatgggt
catgctgtgg aatggccagt ggtcatgagc cgatttttaa 660gtcaattaga tgaacacatg
ggatatttac aatcagctcc tttgcagttg atgagtatgc 720aaaatttaga atttattgaa
gtcactttat taatggttct tactcgtatt attgcaattg 780tgttttttag aaggcaagaa
ctcttacttt ggcagatagg ttgtgttctg ctagagtatg 840gtagtccaaa aattaaatcc
ctagcaatta gctttttaac agaacttttt cagcttggag 900gactaccagc acaaccagct
agcacttttt tcagctcatt tttggaatta ttaaaacacc 960ttgtagaaat ggatactgac
caattgaaac tctatgaaga gccattatca aagctgataa 1020agacactatt tccctttgaa
gcagaagctt atagaaatat tgaacctgtc tatttaaata 1080tgctgctgga aaaactctgt
gtcatgtttg aagacggtgt gctcatgcgg cttaagtctg 1140atttgctaaa agcagctttg
tgccatttac tgcagtattt ccttaaattt gtgccagctg 1200ggtatgaatc tgctttacaa
gtcaggaagg tctatgtgag aaatatttgt aaagctcttt 1260tggatgtgct tggaattgag
gtagatgcag agtacttgtt gggcccactt tatgcagctt 1320tgaaaatgga aagtatggaa
atcattgagg agattcaatg ccaaactcaa caggaaaacc 1380tcagcagtaa tagtgatgga
atatcaccca aaaggcgtcg tctcagctcg tctctaaacc 1440cttctaaaag agcaccaaaa
cagactgagg aaattaaaca tgtggacatg aaccaaaaga 1500gcatattatg gagtgcactg
aaacagaaag ctgaatccct tcagatttcc cttgaataca 1560gtggcctaaa gaatcctgtt
attgagatgt tagaaggaat tgctgttgtc ttacaactga 1620ctgctctgtg tactgttcat
tgttctcatc aaaacatgaa ctgccgtact ttcaaggact 1680gtcaacataa atccaagaag
aaaccttctg tagtgataac ttggatgtca ttggattttt 1740acacaaaagt gcttaagagc
tgtagaagtt tgttagaatc tgttcagaaa ctggacctgg 1800aggcaaccat tgataaggtg
gtgaaaattt atgatgcttt gatttatatg caagtaaaca 1860gttcatttga agatcatatc
ctggaagatt tatgtggtat gctctcactt ccatggattt 1920attcccattc tgatgatggc
tgtttaaagt tgaccacatt tgccgctaat cttctaacat 1980taagctgtag gatttcagat
agctattcac cacaggcaca atcacgatgt gtgtttcttc 2040tgactctgtt tccaagaaga
atattccttg agtggagaac agcagtttac aactgggccc 2100tgcagagctc ccatgaagta
atccgggcta gttgtgttag tggatttttt atcttattgc 2160agcagcagaa ttcttgtaac
agagttccca agattcttat agataaagtc aaagatgatt 2220ctgacattgt caagaaagaa
tttgcttcta tacttggtca acttgtctgt actcttcacg 2280gcatgtttta tctgacaagt
tctttaacag aacctttctc tgaacacgga catgtggacc 2340tcttctgtag gaacttgaaa
gccacttctc aacatgaatg ttcatcttct caactaaaag 2400cttctgtctg caagccattc
cttttcctac tgaaaaaaaa aatacctagt ccagtaaaac 2460ttgctttcat agataatcta
catcatcttt gtaagcatct tgattttaga gaagatgaaa 2520cagatgtaaa agcagttctt
ggaactttat taaatttaat ggaagatcca gacaaagatg 2580ttagagtggc ttttagtgga
aatatcaagc acatattgga atccttggac tctgaagatg 2640gatttataaa ggagcttttt
gtcttaagaa tgaaggaagc atatacacat gcccaaatat 2700caagaaataa tgagctgaag
gataccttga ttcttacaac aggggatatt ggaagggccg 2760caaaaggaga tttggtacca
tttgcactct tacacttatt gcattgtttg ttatccaagt 2820cagcatctgt ctctggagca
gcatacacag aaattagagc tctggttgca gctaaaagtg 2880ttaaactgca aagttttttc
agccagtata agaaacccat ctgtcagttt ttggtagaat 2940cccttcactc tagtcagatg
acagcacttc cgaatactcc atgccagaat gctgacgtgc 3000gaaaacaaga tgtggctcac
cagagagaaa tggctttaaa tacgttgtct gaaattgcca 3060acgttttcga ctttcctgat
cttaatcgtt ttcttactag gacattacaa gttctactac 3120ctgatcttgc tgccaaagca
agccctgcag cttctgctct cattcgaact ttaggaaaac 3180aattaaatgt caatcgtaga
gagattttaa taaacaactt caaatatatt ttttctcatt 3240tggtctgttc ttgttccaaa
gatgaattag aacgtgccct tcattatctg aagaatgaaa 3300cagaaattga actggggagc
ctgttgagac aagatttcca aggattgcat aatgaattat 3360tgctgcgtat tggagaacac
tatcaacagg tttttaatgg tttgtcaata cttgcctcat 3420ttgcatccag tgatgatcca
tatcagggcc cgagagatat catatcacct gaactgatgg 3480ctgattattt acaacccaaa
ttgttgggca ttttggcttt ttttaacatg cagttactga 3540gctctagtgt tggcattgaa
gataagaaaa tggccttgaa cagtttgatg tctttgatga 3600agttaatggg acccaaacat
gtcagttctg tgagggtgaa gatgatgacc acactgagaa 3660ctggccttcg attcaaggat
gattttcctg aattgtgttg cagagcttgg gactgctttg 3720ttcgctgcct ggatcatgct
tgtctgggct cccttctcag tcatgtaata gtagctttgt 3780tacctcttat acacatccag
cctaaagaaa ctgcagctat cttccactac ctcataattg 3840aaaacaggga tgctgtgcaa
gattttcttc atgaaatata ttttttacct gatcatccag 3900aattaaaaaa gataaaagcc
gttctccagg aatacagaaa ggagacctct gagagcactg 3960atcttcagac aactcttcag
ctctctatga aggccattca acatgaaaat gtcgatgttc 4020gtattcatgc tcttacaagc
ttgaaggaaa ccttgtataa aaatcaggaa aaactgataa 4080agtatgcaac agacagtgaa
acagtagaac ctattatctc acagttggtg acagtgcttt 4140tgaaaggttg ccaagatgca
aactctcaag ctcggttgct ctgtggggaa tgtttagggg 4200aattgggggc gatagatcca
ggtcgattag atttctcaac aactgaaact caaggaaaag 4260attttacatt tgtgactgga
gtagaagatt caagctttgc ctatggatta ttgatggagc 4320taacaagagc ttaccttgcg
tatgctgata atagccgagc tcaagattca gctgcctatg 4380ccattcagga gttgctttct
atttatgact gtagagagat ggagaccaac ggcccaggtc 4440accaattgtg gaggagattt
cctgagcatg ttcgggaaat actagaacct catctaaata 4500ccagatacaa gagttctcag
aagtcaaccg attggtctgg agtaaagaag ccaatttact 4560taagtaaatt gggtagtaac
tttgcagaat ggtcagcatc ttgggcaggt tatcttatta 4620caaaggttcg acatgatctt
gccagtaaaa ttttcacctg ctgtagcatt atgatgaagc 4680atgatttcaa agtgaccatc
tatcttcttc cacatattct ggtgtatgtc ttactgggtt 4740gtaatcaaga agatcagcag
gaggtttatg cagaaattat ggcagttcta aagcatgacg 4800atcagcatac cataaatacc
caagacattg catctgatct gtgtcaactc agtacacaga 4860ctgtgttctc catgcttgac
catctcacac agtgggcaag gcacaaattt caggcactga 4920aagctgagaa atgtccacac
agcaaatcaa acagaaataa ggtagactca atggtatcta 4980ctgtggatta tgaagactat
cagagtgtaa cccgttttct agacctcata ccccaggata 5040ctctggcagt agcttccttt
cgctccaaag catacacacg agctgtaatg cactttgaat 5100catttattac agaaaagaag
caaaatattc aggaacatct tggattttta cagaaattgt 5160atgctgctat gcatgaacct
gatggagtgg ccggagtcag tgcaattaga aaggcagaac 5220catctctaaa agaacagatc
cttgaacatg aaagccttgg cttgctgagg gatgccactg 5280cttgttatga cagggctatt
cagctagaac cagaccagat cattcattat catggtgtag 5340taaagtccat gttaggtctt
ggtcagctgt ctactgttat cactcaggtg aatggagtgc 5400atgctaacag gtccgagtgg
acagatgaat taaacacgta cagagtggaa gcagcttgga 5460aattgtcaca gtgggatttg
gtggaaaact atttggcagc agatggaaaa tctacaacat 5520ggagtgtcag actgggacag
ctattattat cagccaaaaa aagagatatc acagcttttt 5580atgactcact gaaactagtg
agagcagaac aaattgtacc tctttcagct gcaagctttg 5640aaagaggctc ctaccaacga
ggatatgaat atattgtgag attgcacatg ttatgtgagt 5700tggagcatag catcaaacca
cttttccagc attctccagg tgacagttct caagaagatt 5760ctctaaactg ggtagctcga
ctagaaatga cccagaattc ctacagagcc aaggagccta 5820tcctggctct ccggagggct
ttactaagcc tcaacaaaag accagattac aatgaaatgg 5880ttggagaatg ctggctgcag
agtgccaggg tagctagaaa ggctggtcac caccagacag 5940cctacaatgc tctccttaat
gcaggggaat cacgactcgc tgaactgtac gtggaaaggg 6000caaagtggct ctggtccaag
ggtgatgttc accaggcact aattgttctt caaaaaggtg 6060ttgaattatg ttttcctgaa
aatgaaaccc cacctgaggg taagaacatg ttaatccatg 6120gtcgagctat gctactagtg
ggccgattta tggaagaaac agctaacttt gaaagcaatg 6180caattatgaa aaaatataag
gatgtgaccg cgtgcctgcc agaatgggag gatgggcatt 6240tttaccttgc caagtactat
gacaaattga tgcccatggt cacagacaac aaaatggaaa 6300agcaaggtga tctcatccgg
tatatagttc ttcattttgg cagatctcta caatatggaa 6360atcagttcat atatcagtca
atgccacgaa tgttaactct atggcttgat tatggtacaa 6420aggcatatga atgggaaaaa
gctggccgct ccgatcgtgt acaaatgagg aatgatttgg 6480gtaaaataaa caaggttatc
acagagcata caaactattt agctccatat caatttttga 6540ctgctttttc acaattgatc
tctcgaattt gtcattctca cgatgaagtt tttgttgtct 6600tgatggaaat aatagccaaa
gtatttctag cctatcctca acaagcaatg tggatgatga 6660cagctgtgtc aaagtcatct
tatcccatgc gtgtgaacag atgcaaggaa atcctcaata 6720aagctattca tatgaaaaaa
tccttagaga agtttgttgg agatgcaact cgcctaacag 6780ataagcttct agaattgtgc
aataaaccgg ttgatggaag tagttccaca ttaagcatga 6840gcactcattt taaaatgctt
aaaaagctgg tagaagaagc aacatttagt gaaatcctca 6900ttcctctaca atcagtcatg
atacctacac ttccatcaat tctgggtacc catgctaacc 6960atgctagcca tgaaccattt
cctggacatt gggcctatat tgcagggttt gatgatatgg 7020tggaaattct tgcttctctt
cagaaaccaa agaagatttc tttaaaaggc tcagatggaa 7080agttctacat catgatgtgt
aagccaaaag atgacctgag aaaggattgt agactaatgg 7140aattcaattc cttgattaat
aagtgcttaa gaaaagatgc agagtctcgt agaagagaac 7200ttcatattcg aacatatgca
gttattccac taaatgatga atgtgggatt attgaatggg 7260tgaacaacac tgctggtttg
agacctattc tgaccaaact atataaagaa aagggagtgt 7320atatgacagg aaaagaactt
cgccagtgta tgctaccaaa gtcagcagct ttatctgaaa 7380aactcaaagt attccgagaa
tttctcctgc ccaggcatcc tcctattttt catgagtggt 7440ttctgagaac attccctgat
cctacatcat ggtacagtag tagatcagct tactgccgtt 7500ccactgcagt aatgtcaatg
gttggttata ttctggggct tggagaccgt catggtgaaa 7560atattctctt tgattctttg
actggtgaat gcgtacatgt agatttcaat tgtcttttca 7620ataagggaga aacctttgaa
gttccagaaa ttgtgccatt tcgcctgact cataatatgg 7680ttaatggaat gggtcctatg
ggaacagagg gtctttttcg aagagcatgt gaagttacaa 7740tgaggctgat gcgtgatcag
cgagagcctt taatgagtgt cttaaagact tttctacatg 7800atcctcttgt ggaatggagt
aaaccagtga aagggcattc caaagcgcca ctgaatgaaa 7860ctggagaagt tgtcaatgaa
aaggccaaga cccatgttct tgacattgag cagcgactac 7920aaggtgtaat caagactcga
aatagagtga caggactgcc gttatctatt gaaggacatg 7980tgcattacct tatacaggaa
gctactgatg aaaacttact atgccagatg tatcttggtt 8040ggactccata tatgtgaaat
gaaattatgt aaaagaatat gttaataatc taaaagtaat 8100gcatttggta tgaatctgtg
gttgtatctg ttcaattcta aagtacaaca taaatttacg 8160ttctcagcaa ctgttatttc
tctctgatca ttaattatat gtaaaataat atacattcag 8220ttattaagaa ataaactgct
ttcttaatac aaaaaaaa 8258
SEQ ID NO: 21; length: 403; type: PRT; organism: Artificial Sequence; description: PTEN sequence
Met Thr Ala Ile Ile Lys Glu Ile Val Ser Arg Asn
Lys Arg Arg Tyr 1 5 10
15 Gln Glu Asp Gly Phe Asp Leu Asp Leu Thr Tyr Ile Tyr Pro Asn Ile
20 25 30 Ile Ala Met
Gly Phe Pro Ala Glu Arg Leu Glu Gly Val Tyr Arg Asn 35
40 45 Asn Ile Asp Asp Val Val Arg Phe
Leu Asp Ser Lys His Lys Asn His 50 55
60 Tyr Lys Ile Tyr Asn Leu Cys Ala Glu Arg His Tyr Asp
Thr Ala Lys 65 70 75
80 Phe Asn Cys Arg Val Ala Gln Tyr Pro Phe Glu Asp His Asn Pro Pro
85 90 95 Gln Leu Glu Leu
Ile Lys Pro Phe Cys Glu Asp Leu Asp Gln Trp Leu 100
105 110 Ser Glu Asp Asp Asn His Val Ala Ala
Ile His Cys Lys Ala Gly Lys 115 120
125 Gly Arg Thr Gly Val Met Ile Cys Ala Tyr Leu Leu His Arg
Gly Lys 130 135 140
Phe Leu Lys Ala Gln Glu Ala Leu Asp Phe Tyr Gly Glu Val Arg Thr 145
150 155 160 Arg Asp Lys Lys Gly
Val Thr Ile Pro Ser Gln Arg Arg Tyr Val Tyr 165
170 175 Tyr Tyr Ser Tyr Leu Leu Lys Asn His Leu
Asp Tyr Arg Pro Val Ala 180 185
190 Leu Leu Phe His Lys Met Met Phe Glu Thr Ile Pro Met Phe Ser
Gly 195 200 205 Gly
Thr Cys Asn Pro Gln Phe Val Val Cys Gln Leu Lys Val Lys Ile 210
215 220 Tyr Ser Ser Asn Ser Gly
Pro Thr Arg Arg Glu Asp Lys Phe Met Tyr 225 230
235 240 Phe Glu Phe Pro Gln Pro Leu Pro Val Cys Gly
Asp Ile Lys Val Glu 245 250
255 Phe Phe His Lys Gln Asn Lys Met Leu Lys Lys Asp Lys Met Phe His
260 265 270 Phe Trp
Val Asn Thr Phe Phe Ile Pro Gly Pro Glu Glu Thr Ser Glu 275
280 285 Lys Val Glu Asn Gly Ser Leu
Cys Asp Gln Glu Ile Asp Ser Ile Cys 290 295
300 Ser Ile Glu Arg Ala Asp Asn Asp Lys Glu Tyr Leu
Val Leu Thr Leu 305 310 315
320 Thr Lys Asn Asp Leu Asp Lys Ala Asn Lys Asp Lys Ala Asn Arg Tyr
325 330 335 Phe Ser Pro
Asn Phe Lys Val Lys Leu Tyr Phe Thr Lys Thr Val Glu 340
345 350 Glu Pro Ser Asn Pro Glu Ala Ser
Ser Ser Thr Ser Val Thr Pro Asp 355 360
365 Val Ser Asp Asn Glu Pro Asp His Tyr Arg Tyr Ser Asp
Thr Thr Asp 370 375 380
Ser Asp Pro Glu Asn Glu Pro Phe Asp Glu Asp Gln His Thr Gln Ile 385
390 395 400 Thr Lys Val
SEQ ID NO: 22 (length: 5572; type: DNA; organism: Artificial Sequence; description: PTEN sequence)
cctcccctcg cccggcgcgg
tcccgtccgc ctctcgctcg cctcccgcct cccctcggtc 60ttccgaggcg cccgggctcc
cggcgcggcg gcggaggggg cgggcaggcc ggcgggcggt 120gatgtggcgg gactctttat
gcgctgcggc aggatacgcg ctcggcgctg ggacgcgact 180gcgctcagtt ctctcctctc
ggaagctgca gccatgatgg aagtttgaga gttgagccgc 240tgtgaggcga ggccgggctc
aggcgaggga gatgagagac ggcggcggcc gcggcccgga 300gcccctctca gcgcctgtga
gcagccgcgg gggcagcgcc ctcggggagc cggccggcct 360gcggcggcgg cagcggcggc
gtttctcgcc tcctcttcgt cttttctaac cgtgcagcct 420cttcctcggc ttctcctgaa
agggaaggtg gaagccgtgg gctcgggcgg gagccggctg 480aggcgcggcg gcggcggcgg
cacctcccgc tcctggagcg ggggggagaa gcggcggcgg 540cggcggccgc ggcggctgca
gctccaggga gggggtctga gtcgcctgtc accatttcca 600gggctgggaa cgccggagag
ttggtctctc cccttctact gcctccaaca cggcggcggc 660ggcggcggca catccaggga
cccgggccgg ttttaaacct cccgtccgcc gccgccgcac 720cccccgtggc ccgggctccg
gaggccgccg gcggaggcag ccgttcggag gattattcgt 780cttctcccca ttccgctgcc
gccgctgcca ggcctctggc tgctgaggag aagcaggccc 840agtcgctgca accatccagc
agccgccgca gcagccatta cccggctgcg gtccagagcc 900aagcggcggc agagcgaggg
gcatcagcta ccgccaagtc cagagccatt tccatcctgc 960agaagaagcc ccgccaccag
cagcttctgc catctctctc ctcctttttc ttcagccaca 1020ggctcccaga catgacagcc
atcatcaaag agatcgttag cagaaacaaa aggagatatc 1080aagaggatgg attcgactta
gacttgacct atatttatcc aaacattatt gctatgggat 1140ttcctgcaga aagacttgaa
ggcgtataca ggaacaatat tgatgatgta gtaaggtttt 1200tggattcaaa gcataaaaac
cattacaaga tatacaatct ttgtgctgaa agacattatg 1260acaccgccaa atttaattgc
agagttgcac aatatccttt tgaagaccat aacccaccac 1320agctagaact tatcaaaccc
ttttgtgaag atcttgacca atggctaagt gaagatgaca 1380atcatgttgc agcaattcac
tgtaaagctg gaaagggacg aactggtgta atgatatgtg 1440catatttatt acatcggggc
aaatttttaa aggcacaaga ggccctagat ttctatgggg 1500aagtaaggac cagagacaaa
aagggagtaa ctattcccag tcagaggcgc tatgtgtatt 1560attatagcta cctgttaaag
aatcatctgg attatagacc agtggcactg ttgtttcaca 1620agatgatgtt tgaaactatt
ccaatgttca gtggcggaac ttgcaatcct cagtttgtgg 1680tctgccagct aaaggtgaag
atatattcct ccaattcagg acccacacga cgggaagaca 1740agttcatgta ctttgagttc
cctcagccgt tacctgtgtg tggtgatatc aaagtagagt 1800tcttccacaa acagaacaag
atgctaaaaa aggacaaaat gtttcacttt tgggtaaata 1860cattcttcat accaggacca
gaggaaacct cagaaaaagt agaaaatgga agtctatgtg 1920atcaagaaat cgatagcatt
tgcagtatag agcgtgcaga taatgacaag gaatatctag 1980tacttacttt aacaaaaaat
gatcttgaca aagcaaataa agacaaagcc aaccgatact 2040tttctccaaa ttttaaggtg
aagctgtact tcacaaaaac agtagaggag ccgtcaaatc 2100cagaggctag cagttcaact
tctgtaacac cagatgttag tgacaatgaa cctgatcatt 2160atagatattc tgacaccact
gactctgatc cagagaatga accttttgat gaagatcagc 2220atacacaaat tacaaaagtc
tgaatttttt tttatcaaga gggataaaac accatgaaaa 2280taaacttgaa taaactgaaa
atggaccttt ttttttttaa tggcaatagg acattgtgtc 2340agattaccag ttataggaac
aattctcttt tcctgaccaa tcttgtttta ccctatacat 2400ccacagggtt ttgacacttg
ttgtccagtt gaaaaaaggt tgtgtagctg tgtcatgtat 2460ataccttttt gtgtcaaaag
gacatttaaa attcaattag gattaataaa gatggcactt 2520tcccgtttta ttccagtttt
ataaaaagtg gagacagact gatgtgtata cgtaggaatt 2580ttttcctttt gtgttctgtc
accaactgaa gtggctaaag agctttgtga tatactggtt 2640cacatcctac ccctttgcac
ttgtggcaac agataagttt gcagttggct aagagaggtt 2700tccgaagggt tttgctacat
tctaatgcat gtattcgggt taggggaatg gagggaatgc 2760tcagaaagga aataatttta
tgctggactc tggaccatat accatctcca gctatttaca 2820cacacctttc tttagcatgc
tacagttatt aatctggaca ttcgaggaat tggccgctgt 2880cactgcttgt tgtttgcgca
ttttttttta aagcatattg gtgctagaaa aggcagctaa 2940aggaagtgaa tctgtattgg
ggtacaggaa tgaaccttct gcaacatctt aagatccaca 3000aatgaaggga tataaaaata
atgtcatagg taagaaacac agcaacaatg acttaaccat 3060ataaatgtgg aggctatcaa
caaagaatgg gcttgaaaca ttataaaaat tgacaatgat 3120ttattaaata tgttttctca
attgtaacga cttctccatc tcctgtgtaa tcaaggccag 3180tgctaaaatt cagatgctgt
tagtacctac atcagtcaac aacttacact tattttacta 3240gttttcaatc ataatacctg
ctgtggatgc ttcatgtgct gcctgcaagc ttcttttttc 3300tcattaaata taaaatattt
tgtaatgctg cacagaaatt ttcaatttga gattctacag 3360taagcgtttt ttttctttga
agatttatga tgcacttatt caatagctgt cagccgttcc 3420acccttttga ccttacacat
tctattacaa tgaattttgc agttttgcac attttttaaa 3480tgtcattaac tgttagggaa
ttttacttga atactgaata catataatgt ttatattaaa 3540aaggacattt gtgttaaaaa
ggaaattaga gttgcagtaa actttcaatg ctgcacacaa 3600aaaaaagaca tttgattttt
cagtagaaat tgtcctacat gtgctttatt gatttgctat 3660tgaaagaata gggttttttt
tttttttttt tttttttttt ttaaatgtgc agtgttgaat 3720catttcttca tagtgctccc
ccgagttggg actagggctt caatttcact tcttaaaaaa 3780aatcatcata tatttgatat
gcccagactg catacgattt taagcggagt acaactacta 3840ttgtaaagct aatgtgaaga
tattattaaa aaggtttttt tttccagaaa tttggtgtct 3900tcaaattata ccttcacctt
gacatttgaa tatccagcca ttttgtttct taatggtata 3960aaattccatt ttcaataact
tattggtgct gaaattgttc actagctgtg gtctgaccta 4020gttaatttac aaatacagat
tgaataggac ctactagagc agcatttata gagtttgatg 4080gcaaatagat taggcagaac
ttcatctaaa atattcttag taaataatgt tgacacgttt 4140tccatacctt gtcagtttca
ttcaacaatt tttaaatttt taacaaagct cttaggattt 4200acacatttat atttaaacat
tgatatatag agtattgatt gattgctcat aagttaaatt 4260ggtaaagtta gagacaacta
ttctaacacc tcaccattga aatttatatg ccaccttgtc 4320tttcataaaa gctgaaaatt
gttacctaaa atgaaaatca acttcatgtt ttgaagatag 4380ttataaatat tgttctttgt
tacaatttcg ggcaccgcat attaaaacgt aactttattg 4440ttccaatatg taacatggag
ggccaggtca taaataatga cattataatg ggcttttgca 4500ctgttattat ttttcctttg
gaatgtgaag gtctgaatga gggttttgat tttgaatgtt 4560tcaatgtttt tgagaagcct
tgcttacatt ttatggtgta gtcattggaa atggaaaaat 4620ggcattatat atattatata
tataaatata tattatacat actctcctta ctttatttca 4680gttaccatcc ccatagaatt
tgacaagaat tgctatgact gaaaggtttt cgagtcctaa 4740ttaaaacttt atttatggca
gtattcataa ttagcctgaa atgcattctg taggtaatct 4800ctgagtttct ggaatatttt
cttagacttt ttggatgtgc agcagcttac atgtctgaag 4860ttacttgaag gcatcacttt
taagaaagct tacagttggg ccctgtacca tcccaagtcc 4920tttgtagctc ctcttgaaca
tgtttgccat acttttaaaa gggtagttga ataaatagca 4980tcaccattct ttgctgtggc
acaggttata aacttaagtg gagtttaccg gcagcatcaa 5040atgtttcagc tttaaaaaat
aaaagtaggg tacaagttta atgtttagtt ctagaaattt 5100tgtgcaatat gttcataacg
atggctgtgg ttgccacaaa gtgcctcgtt tacctttaaa 5160tactgttaat gtgtcatgca
tgcagatgga aggggtggaa ctgtgcacta aagtgggggc 5220tttaactgta gtatttggca
gagttgcctt ctacctgcca gttcaaaagt tcaacctgtt 5280ttcatataga atatatatac
taaaaaattt cagtctgtta aacagcctta ctctgattca 5340gcctcttcag atactcttgt
gctgtgcagc agtggctctg tgtgtaaatg ctatgcactg 5400aggatacaca aaaataccaa
tatgatgtgt acaggataat gcctcatccc aatcagatgt 5460ccatttgtta ttgtgtttgt
taacaaccct ttatctctta gtgttataaa ctccacttaa 5520aactgattaa agtctcattc
ttgtcaaaaa aaaaaaaaaa aaaaaaaaaa aa 5572
SEQ ID NO: 23 (length: 297; type: PRT; organism: Artificial Sequence; description: ERCC1 sequence)
Met Asp Pro Gly Lys Asp Lys Glu Gly Val Pro Gln
Pro Ser Gly Pro 1 5 10
15 Pro Ala Arg Lys Lys Phe Val Ile Pro Leu Asp Glu Asp Glu Val Pro
20 25 30 Pro Gly Val
Ala Lys Pro Leu Phe Arg Ser Thr Gln Ser Leu Pro Thr 35
40 45 Val Asp Thr Ser Ala Gln Ala Ala
Pro Gln Thr Tyr Ala Glu Tyr Ala 50 55
60 Ile Ser Gln Pro Leu Glu Gly Ala Gly Ala Thr Cys Pro
Thr Gly Ser 65 70 75
80 Glu Pro Leu Ala Gly Glu Thr Pro Asn Gln Ala Leu Lys Pro Gly Ala
85 90 95 Lys Ser Asn Ser
Ile Ile Val Ser Pro Arg Gln Arg Gly Asn Pro Val 100
105 110 Leu Lys Phe Val Arg Asn Val Pro Trp
Glu Phe Gly Asp Val Ile Pro 115 120
125 Asp Tyr Val Leu Gly Gln Ser Thr Cys Ala Leu Phe Leu Ser
Leu Arg 130 135 140
Tyr His Asn Leu His Pro Asp Tyr Ile His Gly Arg Leu Gln Ser Leu 145
150 155 160 Gly Lys Asn Phe Ala
Leu Arg Val Leu Leu Val Gln Val Asp Val Lys 165
170 175 Asp Pro Gln Gln Ala Leu Lys Glu Leu Ala
Lys Met Cys Ile Leu Ala 180 185
190 Asp Cys Thr Leu Ile Leu Ala Trp Ser Pro Glu Glu Ala Gly Arg
Tyr 195 200 205 Leu
Glu Thr Tyr Lys Ala Tyr Glu Gln Lys Pro Ala Asp Leu Leu Met 210
215 220 Glu Lys Leu Glu Gln Asp
Phe Val Ser Arg Val Thr Glu Cys Leu Thr 225 230
235 240 Thr Val Lys Ser Val Asn Lys Thr Asp Ser Gln
Thr Leu Leu Thr Thr 245 250
255 Phe Gly Ser Leu Glu Gln Leu Ile Ala Ala Ser Arg Glu Asp Leu Ala
260 265 270 Leu Cys
Pro Gly Leu Gly Pro Gln Lys Ala Arg Arg Leu Phe Asp Val 275
280 285 Leu His Glu Pro Phe Leu Lys
Val Pro 290 295
SEQ ID NO: 24 (length: 3400; type: DNA; organism: Artificial Sequence; description: ERCC1 sequence)
ccggaagtgc tgcgagccct gggccacgct ggccgtgctg
gcagtgggcc gcctcgatcc 60ctctgcagtc tttcccttga ggctccaaga ccagcaggtg
aggcctcgcg gcgctgaaac 120cgtgaggccc ggaccacagg ctccagatgg accctgggaa
ggacaaagag ggggtgcccc 180agccctcagg gccgccagca aggaagaaat ttgtgatacc
cctcgacgag gatgaggtcc 240ctcctggagt ggccaagccc ttattccgat ctacacagag
ccttcccact gtggacacct 300cggcccaggc ggcccctcag acctacgccg aatatgccat
ctcacagcct ctggaagggg 360ctggggccac gtgccccaca gggtcagagc ccctggcagg
agagacgccc aaccaggccc 420tgaaacccgg ggcaaaatcc aacagcatca ttgtgagccc
tcggcagagg ggcaatcccg 480tactgaagtt cgtgcgcaat gtgccctggg aatttggcga
cgtaattccc gactatgtgc 540tgggccagag cacctgtgcc ctgttcctca gcctccgcta
ccacaacctg cacccagact 600acatccatgg gcggctgcag agcctgggga agaacttcgc
cttgcgggtc ctgcttgtcc 660aggtggatgt gaaagatccc cagcaggccc tcaaggagct
ggctaagatg tgtatcctgg 720ccgactgcac attgatcctc gcctggagcc ccgaggaagc
tgggcggtac ctggagacct 780acaaggccta tgagcagaaa ccagcggacc tcctgatgga
gaagctagag caggacttcg 840tctcccgggt gactgaatgt ctgaccaccg tgaagtcagt
caacaaaacg gacagtcaga 900ccctcctgac cacatttgga tctctggaac agctcatcgc
cgcatcaaga gaagatctgg 960ccttatgccc aggcctgggc cctcagaaag cccggaggct
gtttgatgtc ctgcacgagc 1020ccttcttgaa agtaccctga tgaccccagc tgccaaggaa
acccccagtg taataataaa 1080tcgtcctccc aggccaggct cctgctggct gcgctggtgc
agtctctggg gagggattct 1140gggggtgtca ccttctggtg gcccaggtgg gcaccttcag
ctttctttag ttcctcagtt 1200tcccgggggc agactacaca ggctgctgct gctgctgctt
ccgcttcttg tcccggcctg 1260tgggagcctc ctccccagac tctgaattca gtggcggccc
tggcatctcc tcttggggca 1320ctgtctctgg catccggctt tcctgactct gcttcttcct
cttcttggtg gatcccggag 1380ttgccctggc ttcaggctgt ccctcccctg gcagttcagg
ctctagtggc tgaattggct 1440cagtcactgt gtgacctctc tctttcttct tcttcttctt
cttggtggat gtgggagctg 1500cctgaggctc aaggtcatcc ggcagctcag gccccaccac
ctctgtctct ggctccactg 1560tggcatcttg ctgtttttct ttcttcgtct tctttttggg
agctgccaga gctgcctggg 1620cctgaggctt cgctccttct ggctgttgag gcgccatggt
cccccctggg gactccagag 1680gcttcatctc cggctccact ggctccatcg cctccgtccc
tggctccatc attgccatct 1740gtcccttttc ttttttcctc ttcttcgtag ggggcagagg
gatggcttcc tccagtggct 1800ccaccttcac ctgtggctga gactcaactg tcaccccctc
ctctggctcc atcccttccg 1860tccccttttg cctctttctc tttttggtcg gggacaggac
tgtgtcttct agaggctcag 1920tgttaatctg ttcctgcttc actgtcttgt cttctggctc
gaaggtttct ttccctttgg 1980gcttcttcct cttcttggtg gtggacggga acagcactcc
cagaggctcc agtgtctcca 2040ctgtgggctc tgtccccaca ggccctgctg cctctggttc
tttcagctgc tgattttttt 2100tcttcttctt cttccgcaca tccatttctg gcgaccccaa
agccatgtcc acctccaggg 2160ccccgtgccc attcactgcc tcctgagtga ctggggcctc
tgtcacctgc atctcctttt 2220tcttcttccc tgaggtgagc aggttggggg ccaaggctga
cctaggccct gtgactggtg 2280ggttgccccc aaaggcacag aaccgaggcc tcaggccagg
agggatctgt ggtgggggac 2340ttgctgggat gggctgcaga gggctccctg acagggattg
ctggggaccc tcaaggatcc 2400ttagggtgcc ctggggggct gaggcacagg tgagtccacc
tcctgcctcc gttgaggggg 2460ccagcagggt cgcttctcca gcttggggac agctgctgag
gactcgatag cggtgccgct 2520tgcctgccaa tttgcccttg acgatctggg agccagagag
aggcacatgc cgcccattga 2580agctacagag agaaacaggg agggcagagg cttaagtgga
acaggagagg gaaggttttt 2640tgattttttt tttgtttttt tttgagagag tcttgctctg
ttgcctaggc tggagtgcag 2700tggcatgatc tcggctcact gcaatgtcca cctcctgggt
tcaagcgatt ctcctgcctc 2760agcctctcaa gtagctggga ttacaggcac ctgccaccac
gcccagccaa tttttgtatt 2820tttagtagag acaatttcac tatgttggcc aggctggtct
tgaactcctg acctcaagtg 2880atctgctcgc ctcggcctcc caaaggatgg gattacaggc
accagccact gcgcctggct 2940ggcctctggt ttttaataaa acatgactag agtgactcca
tcttaaagtg agtagctagg 3000cacttacaag gttcatgctt atggcctgaa aataaccaca
tcccaggctg accaccaatt 3060ataattacag aatatttatg gccatacaga acatgttcca
ccaagcctgc agaatgtcca 3120aatgtcctaa gaatgcagcc cccattactt aaatataaca
taaatgagca agcttaggtt 3180gcaggattaa tggtcgtgga taacaccaat agcccctacc
tttagtgagc ttatctgcac 3240actccaagtt taactatagt tccttatagt ttcttataag
tagaaatact aacaaagggc 3300tgtgggtttc tccccctgct ttctgaggac actctactct
gtaaaggagt agtttccaat 3360aaacttgttt ctttcactgt gcaaaaaaaa aaaaaaaaaa
3400
SEQ ID NO: 25 (length: 3418; type: PRT; organism: Artificial Sequence; description: BRCA2 sequence)
Met Pro Ile Gly Ser Lys Glu Arg Pro Thr Phe Phe Glu Ile Phe Lys 1
5 10 15 Thr Arg Cys Asn
Lys Ala Asp Leu Gly Pro Ile Ser Leu Asn Trp Phe 20
25 30 Glu Glu Leu Ser Ser Glu Ala Pro Pro
Tyr Asn Ser Glu Pro Ala Glu 35 40
45 Glu Ser Glu His Lys Asn Asn Asn Tyr Glu Pro Asn Leu Phe
Lys Thr 50 55 60
Pro Gln Arg Lys Pro Ser Tyr Asn Gln Leu Ala Ser Thr Pro Ile Ile 65
70 75 80 Phe Lys Glu Gln Gly
Leu Thr Leu Pro Leu Tyr Gln Ser Pro Val Lys 85
90 95 Glu Leu Asp Lys Phe Lys Leu Asp Leu Gly
Arg Asn Val Pro Asn Ser 100 105
110 Arg His Lys Ser Leu Arg Thr Val Lys Thr Lys Met Asp Gln Ala
Asp 115 120 125 Asp
Val Ser Cys Pro Leu Leu Asn Ser Cys Leu Ser Glu Ser Pro Val 130
135 140 Val Leu Gln Cys Thr His
Val Thr Pro Gln Arg Asp Lys Ser Val Val 145 150
155 160 Cys Gly Ser Leu Phe His Thr Pro Lys Phe Val
Lys Gly Arg Gln Thr 165 170
175 Pro Lys His Ile Ser Glu Ser Leu Gly Ala Glu Val Asp Pro Asp Met
180 185 190 Ser Trp
Ser Ser Ser Leu Ala Thr Pro Pro Thr Leu Ser Ser Thr Val 195
200 205 Leu Ile Val Arg Asn Glu Glu
Ala Ser Glu Thr Val Phe Pro His Asp 210 215
220 Thr Thr Ala Asn Val Lys Ser Tyr Phe Ser Asn His
Asp Glu Ser Leu 225 230 235
240 Lys Lys Asn Asp Arg Phe Ile Ala Ser Val Thr Asp Ser Glu Asn Thr
245 250 255 Asn Gln Arg
Glu Ala Ala Ser His Gly Phe Gly Lys Thr Ser Gly Asn 260
265 270 Ser Phe Lys Val Asn Ser Cys Lys
Asp His Ile Gly Lys Ser Met Pro 275 280
285 Asn Val Leu Glu Asp Glu Val Tyr Glu Thr Val Val Asp
Thr Ser Glu 290 295 300
Glu Asp Ser Phe Ser Leu Cys Phe Ser Lys Cys Arg Thr Lys Asn Leu 305
310 315 320 Gln Lys Val Arg
Thr Ser Lys Thr Arg Lys Lys Ile Phe His Glu Ala 325
330 335 Asn Ala Asp Glu Cys Glu Lys Ser Lys
Asn Gln Val Lys Glu Lys Tyr 340 345
350 Ser Phe Val Ser Glu Val Glu Pro Asn Asp Thr Asp Pro Leu
Asp Ser 355 360 365
Asn Val Ala Asn Gln Lys Pro Phe Glu Ser Gly Ser Asp Lys Ile Ser 370
375 380 Lys Glu Val Val Pro
Ser Leu Ala Cys Glu Trp Ser Gln Leu Thr Leu 385 390
395 400 Ser Gly Leu Asn Gly Ala Gln Met Glu Lys
Ile Pro Leu Leu His Ile 405 410
415 Ser Ser Cys Asp Gln Asn Ile Ser Glu Lys Asp Leu Leu Asp Thr
Glu 420 425 430 Asn
Lys Arg Lys Lys Asp Phe Leu Thr Ser Glu Asn Ser Leu Pro Arg 435
440 445 Ile Ser Ser Leu Pro Lys
Ser Glu Lys Pro Leu Asn Glu Glu Thr Val 450 455
460 Val Asn Lys Arg Asp Glu Glu Gln His Leu Glu
Ser His Thr Asp Cys 465 470 475
480 Ile Leu Ala Val Lys Gln Ala Ile Ser Gly Thr Ser Pro Val Ala Ser
485 490 495 Ser Phe
Gln Gly Ile Lys Lys Ser Ile Phe Arg Ile Arg Glu Ser Pro 500
505 510 Lys Glu Thr Phe Asn Ala Ser
Phe Ser Gly His Met Thr Asp Pro Asn 515 520
525 Phe Lys Lys Glu Thr Glu Ala Ser Glu Ser Gly Leu
Glu Ile His Thr 530 535 540
Val Cys Ser Gln Lys Glu Asp Ser Leu Cys Pro Asn Leu Ile Asp Asn 545
550 555 560 Gly Ser Trp
Pro Ala Thr Thr Thr Gln Asn Ser Val Ala Leu Lys Asn 565
570 575 Ala Gly Leu Ile Ser Thr Leu Lys
Lys Lys Thr Asn Lys Phe Ile Tyr 580 585
590 Ala Ile His Asp Glu Thr Ser Tyr Lys Gly Lys Lys Ile
Pro Lys Asp 595 600 605
Gln Lys Ser Glu Leu Ile Asn Cys Ser Ala Gln Phe Glu Ala Asn Ala 610
615 620 Phe Glu Ala Pro
Leu Thr Phe Ala Asn Ala Asp Ser Gly Leu Leu His 625 630
635 640 Ser Ser Val Lys Arg Ser Cys Ser Gln
Asn Asp Ser Glu Glu Pro Thr 645 650
655 Leu Ser Leu Thr Ser Ser Phe Gly Thr Ile Leu Arg Lys Cys
Ser Arg 660 665 670
Asn Glu Thr Cys Ser Asn Asn Thr Val Ile Ser Gln Asp Leu Asp Tyr
675 680 685 Lys Glu Ala Lys
Cys Asn Lys Glu Lys Leu Gln Leu Phe Ile Thr Pro 690
695 700 Glu Ala Asp Ser Leu Ser Cys Leu
Gln Glu Gly Gln Cys Glu Asn Asp 705 710
715 720 Pro Lys Ser Lys Lys Val Ser Asp Ile Lys Glu Glu
Val Leu Ala Ala 725 730
735 Ala Cys His Pro Val Gln His Ser Lys Val Glu Tyr Ser Asp Thr Asp
740 745 750 Phe Gln Ser
Gln Lys Ser Leu Leu Tyr Asp His Glu Asn Ala Ser Thr 755
760 765 Leu Ile Leu Thr Pro Thr Ser Lys
Asp Val Leu Ser Asn Leu Val Met 770 775
780 Ile Ser Arg Gly Lys Glu Ser Tyr Lys Met Ser Asp Lys
Leu Lys Gly 785 790 795
800 Asn Asn Tyr Glu Ser Asp Val Glu Leu Thr Lys Asn Ile Pro Met Glu
805 810 815 Lys Asn Gln Asp
Val Cys Ala Leu Asn Glu Asn Tyr Lys Asn Val Glu 820
825 830 Leu Leu Pro Pro Glu Lys Tyr Met Arg
Val Ala Ser Pro Ser Arg Lys 835 840
845 Val Gln Phe Asn Gln Asn Thr Asn Leu Arg Val Ile Gln Lys
Asn Gln 850 855 860
Glu Glu Thr Thr Ser Ile Ser Lys Ile Thr Val Asn Pro Asp Ser Glu 865
870 875 880 Glu Leu Phe Ser Asp
Asn Glu Asn Asn Phe Val Phe Gln Val Ala Asn 885
890 895 Glu Arg Asn Asn Leu Ala Leu Gly Asn Thr
Lys Glu Leu His Glu Thr 900 905
910 Asp Leu Thr Cys Val Asn Glu Pro Ile Phe Lys Asn Ser Thr Met
Val 915 920 925 Leu
Tyr Gly Asp Thr Gly Asp Lys Gln Ala Thr Gln Val Ser Ile Lys 930
935 940 Lys Asp Leu Val Tyr Val
Leu Ala Glu Glu Asn Lys Asn Ser Val Lys 945 950
955 960 Gln His Ile Lys Met Thr Leu Gly Gln Asp Leu
Lys Ser Asp Ile Ser 965 970
975 Leu Asn Ile Asp Lys Ile Pro Glu Lys Asn Asn Asp Tyr Met Asn Lys
980 985 990 Trp Ala
Gly Leu Leu Gly Pro Ile Ser Asn His Ser Phe Gly Gly Ser 995
1000 1005 Phe Arg Thr Ala Ser
Asn Lys Glu Ile Lys Leu Ser Glu His Asn 1010 1015
1020 Ile Lys Lys Ser Lys Met Phe Phe Lys Asp
Ile Glu Glu Gln Tyr 1025 1030 1035
Pro Thr Ser Leu Ala Cys Val Glu Ile Val Asn Thr Leu Ala Leu
1040 1045 1050 Asp Asn
Gln Lys Lys Leu Ser Lys Pro Gln Ser Ile Asn Thr Val 1055
1060 1065 Ser Ala His Leu Gln Ser Ser
Val Val Val Ser Asp Cys Lys Asn 1070 1075
1080 Ser His Ile Thr Pro Gln Met Leu Phe Ser Lys Gln
Asp Phe Asn 1085 1090 1095
Ser Asn His Asn Leu Thr Pro Ser Gln Lys Ala Glu Ile Thr Glu 1100
1105 1110 Leu Ser Thr Ile Leu
Glu Glu Ser Gly Ser Gln Phe Glu Phe Thr 1115 1120
1125 Gln Phe Arg Lys Pro Ser Tyr Ile Leu Gln
Lys Ser Thr Phe Glu 1130 1135 1140
Val Pro Glu Asn Gln Met Thr Ile Leu Lys Thr Thr Ser Glu Glu
1145 1150 1155 Cys Arg
Asp Ala Asp Leu His Val Ile Met Asn Ala Pro Ser Ile 1160
1165 1170 Gly Gln Val Asp Ser Ser Lys
Gln Phe Glu Gly Thr Val Glu Ile 1175 1180
1185 Lys Arg Lys Phe Ala Gly Leu Leu Lys Asn Asp Cys
Asn Lys Ser 1190 1195 1200
Ala Ser Gly Tyr Leu Thr Asp Glu Asn Glu Val Gly Phe Arg Gly 1205
1210 1215 Phe Tyr Ser Ala His
Gly Thr Lys Leu Asn Val Ser Thr Glu Ala 1220 1225
1230 Leu Gln Lys Ala Val Lys Leu Phe Ser Asp
Ile Glu Asn Ile Ser 1235 1240 1245
Glu Glu Thr Ser Ala Glu Val His Pro Ile Ser Leu Ser Ser Ser
1250 1255 1260 Lys Cys
His Asp Ser Val Val Ser Met Phe Lys Ile Glu Asn His 1265
1270 1275 Asn Asp Lys Thr Val Ser Glu
Lys Asn Asn Lys Cys Gln Leu Ile 1280 1285
1290 Leu Gln Asn Asn Ile Glu Met Thr Thr Gly Thr Phe
Val Glu Glu 1295 1300 1305
Ile Thr Glu Asn Tyr Lys Arg Asn Thr Glu Asn Glu Asp Asn Lys 1310
1315 1320 Tyr Thr Ala Ala Ser
Arg Asn Ser His Asn Leu Glu Phe Asp Gly 1325 1330
1335 Ser Asp Ser Ser Lys Asn Asp Thr Val Cys
Ile His Lys Asp Glu 1340 1345 1350
Thr Asp Leu Leu Phe Thr Asp Gln His Asn Ile Cys Leu Lys Leu
1355 1360 1365 Ser Gly
Gln Phe Met Lys Glu Gly Asn Thr Gln Ile Lys Glu Asp 1370
1375 1380 Leu Ser Asp Leu Thr Phe Leu
Glu Val Ala Lys Ala Gln Glu Ala 1385 1390
1395 Cys His Gly Asn Thr Ser Asn Lys Glu Gln Leu Thr
Ala Thr Lys 1400 1405 1410
Thr Glu Gln Asn Ile Lys Asp Phe Glu Thr Ser Asp Thr Phe Phe 1415
1420 1425 Gln Thr Ala Ser Gly
Lys Asn Ile Ser Val Ala Lys Glu Ser Phe 1430 1435
1440 Asn Lys Ile Val Asn Phe Phe Asp Gln Lys
Pro Glu Glu Leu His 1445 1450 1455
Asn Phe Ser Leu Asn Ser Glu Leu His Ser Asp Ile Arg Lys Asn
1460 1465 1470 Lys Met
Asp Ile Leu Ser Tyr Glu Glu Thr Asp Ile Val Lys His 1475
1480 1485 Lys Ile Leu Lys Glu Ser Val
Pro Val Gly Thr Gly Asn Gln Leu 1490 1495
1500 Val Thr Phe Gln Gly Gln Pro Glu Arg Asp Glu Lys
Ile Lys Glu 1505 1510 1515
Pro Thr Leu Leu Gly Phe His Thr Ala Ser Gly Lys Lys Val Lys 1520
1525 1530 Ile Ala Lys Glu Ser
Leu Asp Lys Val Lys Asn Leu Phe Asp Glu 1535 1540
1545 Lys Glu Gln Gly Thr Ser Glu Ile Thr Ser
Phe Ser His Gln Trp 1550 1555 1560
Ala Lys Thr Leu Lys Tyr Arg Glu Ala Cys Lys Asp Leu Glu Leu
1565 1570 1575 Ala Cys
Glu Thr Ile Glu Ile Thr Ala Ala Pro Lys Cys Lys Glu 1580
1585 1590 Met Gln Asn Ser Leu Asn Asn
Asp Lys Asn Leu Val Ser Ile Glu 1595 1600
1605 Thr Val Val Pro Pro Lys Leu Leu Ser Asp Asn Leu
Cys Arg Gln 1610 1615 1620
Thr Glu Asn Leu Lys Thr Ser Lys Ser Ile Phe Leu Lys Val Lys 1625
1630 1635 Val His Glu Asn Val
Glu Lys Glu Thr Ala Lys Ser Pro Ala Thr 1640 1645
1650 Cys Tyr Thr Asn Gln Ser Pro Tyr Ser Val
Ile Glu Asn Ser Ala 1655 1660 1665
Leu Ala Phe Tyr Thr Ser Cys Ser Arg Lys Thr Ser Val Ser Gln
1670 1675 1680 Thr Ser
Leu Leu Glu Ala Lys Lys Trp Leu Arg Glu Gly Ile Phe 1685
1690 1695 Asp Gly Gln Pro Glu Arg Ile
Asn Thr Ala Asp Tyr Val Gly Asn 1700 1705
1710 Tyr Leu Tyr Glu Asn Asn Ser Asn Ser Thr Ile Ala
Glu Asn Asp 1715 1720 1725
Lys Asn His Leu Ser Glu Lys Gln Asp Thr Tyr Leu Ser Asn Ser 1730
1735 1740 Ser Met Ser Asn Ser
Tyr Ser Tyr His Ser Asp Glu Val Tyr Asn 1745 1750
1755 Asp Ser Gly Tyr Leu Ser Lys Asn Lys Leu
Asp Ser Gly Ile Glu 1760 1765 1770
Pro Val Leu Lys Asn Val Glu Asp Gln Lys Asn Thr Ser Phe Ser
1775 1780 1785 Lys Val
Ile Ser Asn Val Lys Asp Ala Asn Ala Tyr Pro Gln Thr 1790
1795 1800 Val Asn Glu Asp Ile Cys Val
Glu Glu Leu Val Thr Ser Ser Ser 1805 1810
1815 Pro Cys Lys Asn Lys Asn Ala Ala Ile Lys Leu Ser
Ile Ser Asn 1820 1825 1830
Ser Asn Asn Phe Glu Val Gly Pro Pro Ala Phe Arg Ile Ala Ser 1835
1840 1845 Gly Lys Ile Val Cys
Val Ser His Glu Thr Ile Lys Lys Val Lys 1850 1855
1860 Asp Ile Phe Thr Asp Ser Phe Ser Lys Val
Ile Lys Glu Asn Asn 1865 1870 1875
Glu Asn Lys Ser Lys Ile Cys Gln Thr Lys Ile Met Ala Gly Cys
1880 1885 1890 Tyr Glu
Ala Leu Asp Asp Ser Glu Asp Ile Leu His Asn Ser Leu 1895
1900 1905 Asp Asn Asp Glu Cys Ser Thr
His Ser His Lys Val Phe Ala Asp 1910 1915
1920 Ile Gln Ser Glu Glu Ile Leu Gln His Asn Gln Asn
Met Ser Gly 1925 1930 1935
Leu Glu Lys Val Ser Lys Ile Ser Pro Cys Asp Val Ser Leu Glu 1940
1945 1950 Thr Ser Asp Ile Cys
Lys Cys Ser Ile Gly Lys Leu His Lys Ser 1955 1960
1965 Val Ser Ser Ala Asn Thr Cys Gly Ile Phe
Ser Thr Ala Ser Gly 1970 1975 1980
Lys Ser Val Gln Val Ser Asp Ala Ser Leu Gln Asn Ala Arg Gln
1985 1990 1995 Val Phe
Ser Glu Ile Glu Asp Ser Thr Lys Gln Val Phe Ser Lys 2000
2005 2010 Val Leu Phe Lys Ser Asn Glu
His Ser Asp Gln Leu Thr Arg Glu 2015 2020
2025 Glu Asn Thr Ala Ile Arg Thr Pro Glu His Leu Ile
Ser Gln Lys 2030 2035 2040
Gly Phe Ser Tyr Asn Val Val Asn Ser Ser Ala Phe Ser Gly Phe 2045
2050 2055 Ser Thr Ala Ser Gly
Lys Gln Val Ser Ile Leu Glu Ser Ser Leu 2060 2065
2070 His Lys Val Lys Gly Val Leu Glu Glu Phe
Asp Leu Ile Arg Thr 2075 2080 2085
Glu His Ser Leu His Tyr Ser Pro Thr Ser Arg Gln Asn Val Ser
2090 2095 2100 Lys Ile
Leu Pro Arg Val Asp Lys Arg Asn Pro Glu His Cys Val 2105
2110 2115 Asn Ser Glu Met Glu Lys Thr
Cys Ser Lys Glu Phe Lys Leu Ser 2120 2125
2130 Asn Asn Leu Asn Val Glu Gly Gly Ser Ser Glu Asn
Asn His Ser 2135 2140 2145
Ile Lys Val Ser Pro Tyr Leu Ser Gln Phe Gln Gln Asp Lys Gln 2150
2155 2160 Gln Leu Val Leu Gly
Thr Lys Val Ser Leu Val Glu Asn Ile His 2165 2170
2175 Val Leu Gly Lys Glu Gln Ala Ser Pro Lys
Asn Val Lys Met Glu 2180 2185 2190
Ile Gly Lys Thr Glu Thr Phe Ser Asp Val Pro Val Lys Thr Asn
2195 2200 2205 Ile Glu
Val Cys Ser Thr Tyr Ser Lys Asp Ser Glu Asn Tyr Phe 2210
2215 2220 Glu Thr Glu Ala Val Glu Ile
Ala Lys Ala Phe Met Glu Asp Asp 2225 2230
2235 Glu Leu Thr Asp Ser Lys Leu Pro Ser His Ala Thr
His Ser Leu 2240 2245 2250
Phe Thr Cys Pro Glu Asn Glu Glu Met Val Leu Ser Asn Ser Arg 2255
2260 2265 Ile Gly Lys Arg Arg
Gly Glu Pro Leu Ile Leu Val Gly Glu Pro 2270 2275
2280 Ser Ile Lys Arg Asn Leu Leu Asn Glu Phe
Asp Arg Ile Ile Glu 2285 2290 2295
Asn Gln Glu Lys Ser Leu Lys Ala Ser Lys Ser Thr Pro Asp Gly
2300 2305 2310 Thr Ile
Lys Asp Arg Arg Leu Phe Met His His Val Ser Leu Glu 2315
2320 2325 Pro Ile Thr Cys Val Pro Phe
Arg Thr Thr Lys Glu Arg Gln Glu 2330 2335
2340 Ile Gln Asn Pro Asn Phe Thr Ala Pro Gly Gln Glu
Phe Leu Ser 2345 2350 2355
Lys Ser His Leu Tyr Glu His Leu Thr Leu Glu Lys Ser Ser Ser 2360
2365 2370 Asn Leu Ala Val Ser
Gly His Pro Phe Tyr Gln Val Ser Ala Thr 2375 2380
2385 Arg Asn Glu Lys Met Arg His Leu Ile Thr
Thr Gly Arg Pro Thr 2390 2395 2400
Lys Val Phe Val Pro Pro Phe Lys Thr Lys Ser His Phe His Arg
2405 2410 2415 Val Glu
Gln Cys Val Arg Asn Ile Asn Leu Glu Glu Asn Arg Gln 2420
2425 2430 Lys Gln Asn Ile Asp Gly His
Gly Ser Asp Asp Ser Lys Asn Lys 2435 2440
2445 Ile Asn Asp Asn Glu Ile His Gln Phe Asn Lys Asn
Asn Ser Asn 2450 2455 2460
Gln Ala Ala Ala Val Thr Phe Thr Lys Cys Glu Glu Glu Pro Leu 2465
2470 2475 Asp Leu Ile Thr Ser
Leu Gln Asn Ala Arg Asp Ile Gln Asp Met 2480 2485
2490 Arg Ile Lys Lys Lys Gln Arg Gln Arg Val
Phe Pro Gln Pro Gly 2495 2500 2505
Ser Leu Tyr Leu Ala Lys Thr Ser Thr Leu Pro Arg Ile Ser Leu
2510 2515 2520 Lys Ala
Ala Val Gly Gly Gln Val Pro Ser Ala Cys Ser His Lys 2525
2530 2535 Gln Leu Tyr Thr Tyr Gly Val
Ser Lys His Cys Ile Lys Ile Asn 2540 2545
2550 Ser Lys Asn Ala Glu Ser Phe Gln Phe His Thr Glu
Asp Tyr Phe 2555 2560 2565
Gly Lys Glu Ser Leu Trp Thr Gly Lys Gly Ile Gln Leu Ala Asp 2570
2575 2580 Gly Gly Trp Leu Ile
Pro Ser Asn Asp Gly Lys Ala Gly Lys Glu 2585 2590
2595 Glu Phe Tyr Arg Ala Leu Cys Asp Thr Pro
Gly Val Asp Pro Lys 2600 2605 2610
Leu Ile Ser Arg Ile Trp Val Tyr Asn His Tyr Arg Trp Ile Ile
2615 2620 2625 Trp Lys
Leu Ala Ala Met Glu Cys Ala Phe Pro Lys Glu Phe Ala 2630
2635 2640 Asn Arg Cys Leu Ser Pro Glu
Arg Val Leu Leu Gln Leu Lys Tyr 2645 2650
2655 Arg Tyr Asp Thr Glu Ile Asp Arg Ser Arg Arg Ser
Ala Ile Lys 2660 2665 2670
Lys Ile Met Glu Arg Asp Asp Thr Ala Ala Lys Thr Leu Val Leu 2675
2680 2685 Cys Val Ser Asp Ile
Ile Ser Leu Ser Ala Asn Ile Ser Glu Thr 2690 2695
2700 Ser Ser Asn Lys Thr Ser Ser Ala Asp Thr
Gln Lys Val Ala Ile 2705 2710 2715
Ile Glu Leu Thr Asp Gly Trp Tyr Ala Val Lys Ala Gln Leu Asp
2720 2725 2730 Pro Pro
Leu Leu Ala Val Leu Lys Asn Gly Arg Leu Thr Val Gly 2735
2740 2745 Gln Lys Ile Ile Leu His Gly
Ala Glu Leu Val Gly Ser Pro Asp 2750 2755
2760 Ala Cys Thr Pro Leu Glu Ala Pro Glu Ser Leu Met
Leu Lys Ile 2765 2770 2775
Ser Ala Asn Ser Thr Arg Pro Ala Arg Trp Tyr Thr Lys Leu Gly 2780
2785 2790 Phe Phe Pro Asp Pro
Arg Pro Phe Pro Leu Pro Leu Ser Ser Leu 2795 2800
2805 Phe Ser Asp Gly Gly Asn Val Gly Cys Val
Asp Val Ile Ile Gln 2810 2815 2820
Arg Ala Tyr Pro Ile Gln Trp Met Glu Lys Thr Ser Ser Gly Leu
2825 2830 2835 Tyr Ile
Phe Arg Asn Glu Arg Glu Glu Glu Lys Glu Ala Ala Lys 2840
2845 2850 Tyr Val Glu Ala Gln Gln Lys
Arg Leu Glu Ala Leu Phe Thr Lys 2855 2860
2865 Ile Gln Glu Glu Phe Glu Glu His Glu Glu Asn Thr
Thr Lys Pro 2870 2875 2880
Tyr Leu Pro Ser Arg Ala Leu Thr Arg Gln Gln Val Arg Ala Leu 2885
2890 2895 Gln Asp Gly Ala Glu
Leu Tyr Glu Ala Val Lys Asn Ala Ala Asp 2900 2905
2910 Pro Ala Tyr Leu Glu Gly Tyr Phe Ser Glu
Glu Gln Leu Arg Ala 2915 2920 2925
Leu Asn Asn His Arg Gln Met Leu Asn Asp Lys Lys Gln Ala Gln
2930 2935 2940 Ile Gln
Leu Glu Ile Arg Lys Ala Met Glu Ser Ala Glu Gln Lys 2945
2950 2955 Glu Gln Gly Leu Ser Arg Asp
Val Thr Thr Val Trp Lys Leu Arg 2960 2965
2970 Ile Val Ser Tyr Ser Lys Lys Glu Lys Asp Ser Val
Ile Leu Ser 2975 2980 2985
Ile Trp Arg Pro Ser Ser Asp Leu Tyr Ser Leu Leu Thr Glu Gly 2990
2995 3000 Lys Arg Tyr Arg Ile
Tyr His Leu Ala Thr Ser Lys Ser Lys Ser 3005 3010
3015 Lys Ser Glu Arg Ala Asn Ile Gln Leu Ala
Ala Thr Lys Lys Thr 3020 3025 3030
Gln Tyr Gln Gln Leu Pro Val Ser Asp Glu Ile Leu Phe Gln Ile
3035 3040 3045 Tyr Gln
Pro Arg Glu Pro Leu His Phe Ser Lys Phe Leu Asp Pro 3050
3055 3060 Asp Phe Gln Pro Ser Cys Ser
Glu Val Asp Leu Ile Gly Phe Val 3065 3070
3075 Val Ser Val Val Lys Lys Thr Gly Leu Ala Pro Phe
Val Tyr Leu 3080 3085 3090
Ser Asp Glu Cys Tyr Asn Leu Leu Ala Ile Lys Phe Trp Ile Asp 3095
3100 3105 Leu Asn Glu Asp Ile
Ile Lys Pro His Met Leu Ile Ala Ala Ser 3110 3115
3120 Asn Leu Gln Trp Arg Pro Glu Ser Lys Ser
Gly Leu Leu Thr Leu 3125 3130 3135
Phe Ala Gly Asp Phe Ser Val Phe Ser Ala Ser Pro Lys Glu Gly
3140 3145 3150 His Phe
Gln Glu Thr Phe Asn Lys Met Lys Asn Thr Val Glu Asn 3155
3160 3165 Ile Asp Ile Leu Cys Asn Glu
Ala Glu Asn Lys Leu Met His Ile 3170 3175
3180 Leu His Ala Asn Asp Pro Lys Trp Ser Thr Pro Thr
Lys Asp Cys 3185 3190 3195
Thr Ser Gly Pro Tyr Thr Ala Gln Ile Ile Pro Gly Thr Gly Asn 3200
3205 3210 Lys Leu Leu Met Ser
Ser Pro Asn Cys Glu Ile Tyr Tyr Gln Ser 3215 3220
3225 Pro Leu Ser Leu Cys Met Ala Lys Arg Lys
Ser Val Ser Thr Pro 3230 3235 3240
Val Ser Ala Gln Met Thr Ser Lys Ser Cys Lys Gly Glu Lys Glu
3245 3250 3255 Ile Asp
Asp Gln Lys Asn Cys Lys Lys Arg Arg Ala Leu Asp Phe 3260
3265 3270 Leu Ser Arg Leu Pro Leu Pro
Pro Pro Val Ser Pro Ile Cys Thr 3275 3280
3285 Phe Val Ser Pro Ala Ala Gln Lys Ala Phe Gln Pro
Pro Arg Ser 3290 3295 3300
Cys Gly Thr Lys Tyr Glu Thr Pro Ile Lys Lys Lys Glu Leu Asn 3305
3310 3315 Ser Pro Gln Met Thr
Pro Phe Lys Lys Phe Asn Glu Ile Ser Leu 3320 3325
3330 Leu Glu Ser Asn Ser Ile Ala Asp Glu Glu
Leu Ala Leu Ile Asn 3335 3340 3345
Thr Gln Ala Leu Leu Ser Gly Ser Thr Gly Glu Lys Gln Phe Ile
3350 3355 3360 Ser Val
Ser Glu Ser Thr Arg Thr Ala Pro Thr Ser Ser Glu Asp 3365
3370 3375 Tyr Leu Arg Leu Lys Arg Arg
Cys Thr Thr Ser Leu Ile Lys Glu 3380 3385
3390 Gln Glu Ser Ser Gln Ala Ser Thr Glu Glu Cys Glu
Lys Asn Lys 3395 3400 3405
Gln Asp Thr Ile Thr Thr Lys Lys Tyr Ile 3410 3415
SEQ ID NO: 26 (length: 11386; type: DNA; organism: Artificial Sequence; description: BRCA2 sequence)
gtggcgcgag
cttctgaaac taggcggcag aggcggagcc gctgtggcac tgctgcgcct 60ctgctgcgcc
tcgggtgtct tttgcggcgg tgggtcgccg ccgggagaag cgtgagggga 120cagatttgtg
accggcgcgg tttttgtcag cttactccgg ccaaaaaaga actgcacctc 180tggagcggac
ttatttacca agcattggag gaatatcgta ggtaaaaatg cctattggat 240ccaaagagag
gccaacattt tttgaaattt ttaagacacg ctgcaacaaa gcagatttag 300gaccaataag
tcttaattgg tttgaagaac tttcttcaga agctccaccc tataattctg 360aacctgcaga
agaatctgaa cataaaaaca acaattacga accaaaccta tttaaaactc 420cacaaaggaa
accatcttat aatcagctgg cttcaactcc aataatattc aaagagcaag 480ggctgactct
gccgctgtac caatctcctg taaaagaatt agataaattc aaattagact 540taggaaggaa
tgttcccaat agtagacata aaagtcttcg cacagtgaaa actaaaatgg 600atcaagcaga
tgatgtttcc tgtccacttc taaattcttg tcttagtgaa agtcctgttg 660ttctacaatg
tacacatgta acaccacaaa gagataagtc agtggtatgt gggagtttgt 720ttcatacacc
aaagtttgtg aagggtcgtc agacaccaaa acatatttct gaaagtctag 780gagctgaggt
ggatcctgat atgtcttggt caagttcttt agctacacca cccaccctta 840gttctactgt
gctcatagtc agaaatgaag aagcatctga aactgtattt cctcatgata 900ctactgctaa
tgtgaaaagc tatttttcca atcatgatga aagtctgaag aaaaatgata 960gatttatcgc
ttctgtgaca gacagtgaaa acacaaatca aagagaagct gcaagtcatg 1020gatttggaaa
aacatcaggg aattcattta aagtaaatag ctgcaaagac cacattggaa 1080agtcaatgcc
aaatgtccta gaagatgaag tatatgaaac agttgtagat acctctgaag 1140aagatagttt
ttcattatgt ttttctaaat gtagaacaaa aaatctacaa aaagtaagaa 1200ctagcaagac
taggaaaaaa attttccatg aagcaaacgc tgatgaatgt gaaaaatcta 1260aaaaccaagt
gaaagaaaaa tactcatttg tatctgaagt ggaaccaaat gatactgatc 1320cattagattc
aaatgtagca aatcagaagc cctttgagag tggaagtgac aaaatctcca 1380aggaagttgt
accgtctttg gcctgtgaat ggtctcaact aaccctttca ggtctaaatg 1440gagcccagat
ggagaaaata cccctattgc atatttcttc atgtgaccaa aatatttcag 1500aaaaagacct
attagacaca gagaacaaaa gaaagaaaga ttttcttact tcagagaatt 1560ctttgccacg
tatttctagc ctaccaaaat cagagaagcc attaaatgag gaaacagtgg 1620taaataagag
agatgaagag cagcatcttg aatctcatac agactgcatt cttgcagtaa 1680agcaggcaat
atctggaact tctccagtgg cttcttcatt tcagggtatc aaaaagtcta 1740tattcagaat
aagagaatca cctaaagaga ctttcaatgc aagtttttca ggtcatatga 1800ctgatccaaa
ctttaaaaaa gaaactgaag cctctgaaag tggactggaa atacatactg 1860tttgctcaca
gaaggaggac tccttatgtc caaatttaat tgataatgga agctggccag 1920ccaccaccac
acagaattct gtagctttga agaatgcagg tttaatatcc actttgaaaa 1980agaaaacaaa
taagtttatt tatgctatac atgatgaaac atcttataaa ggaaaaaaaa 2040taccgaaaga
ccaaaaatca gaactaatta actgttcagc ccagtttgaa gcaaatgctt 2100ttgaagcacc
acttacattt gcaaatgctg attcaggttt attgcattct tctgtgaaaa 2160gaagctgttc
acagaatgat tctgaagaac caactttgtc cttaactagc tcttttggga 2220caattctgag
gaaatgttct agaaatgaaa catgttctaa taatacagta atctctcagg 2280atcttgatta
taaagaagca aaatgtaata aggaaaaact acagttattt attaccccag 2340aagctgattc
tctgtcatgc ctgcaggaag gacagtgtga aaatgatcca aaaagcaaaa 2400aagtttcaga
tataaaagaa gaggtcttgg ctgcagcatg tcacccagta caacattcaa 2460aagtggaata
cagtgatact gactttcaat cccagaaaag tcttttatat gatcatgaaa 2520atgccagcac
tcttatttta actcctactt ccaaggatgt tctgtcaaac ctagtcatga 2580tttctagagg
caaagaatca tacaaaatgt cagacaagct caaaggtaac aattatgaat 2640ctgatgttga
attaaccaaa aatattccca tggaaaagaa tcaagatgta tgtgctttaa 2700atgaaaatta
taaaaacgtt gagctgttgc cacctgaaaa atacatgaga gtagcatcac 2760cttcaagaaa
ggtacaattc aaccaaaaca caaatctaag agtaatccaa aaaaatcaag 2820aagaaactac
ttcaatttca aaaataactg tcaatccaga ctctgaagaa cttttctcag 2880acaatgagaa
taattttgtc ttccaagtag ctaatgaaag gaataatctt gctttaggaa 2940atactaagga
acttcatgaa acagacttga cttgtgtaaa cgaacccatt ttcaagaact 3000ctaccatggt
tttatatgga gacacaggtg ataaacaagc aacccaagtg tcaattaaaa 3060aagatttggt
ttatgttctt gcagaggaga acaaaaatag tgtaaagcag catataaaaa 3120tgactctagg
tcaagattta aaatcggaca tctccttgaa tatagataaa ataccagaaa 3180aaaataatga
ttacatgaac aaatgggcag gactcttagg tccaatttca aatcacagtt 3240ttggaggtag
cttcagaaca gcttcaaata aggaaatcaa gctctctgaa cataacatta 3300agaagagcaa
aatgttcttc aaagatattg aagaacaata tcctactagt ttagcttgtg 3360ttgaaattgt
aaataccttg gcattagata atcaaaagaa actgagcaag cctcagtcaa 3420ttaatactgt
atctgcacat ttacagagta gtgtagttgt ttctgattgt aaaaatagtc 3480atataacccc
tcagatgtta ttttccaagc aggattttaa ttcaaaccat aatttaacac 3540ctagccaaaa
ggcagaaatt acagaacttt ctactatatt agaagaatca ggaagtcagt 3600ttgaatttac
tcagtttaga aaaccaagct acatattgca gaagagtaca tttgaagtgc 3660ctgaaaacca
gatgactatc ttaaagacca cttctgagga atgcagagat gctgatcttc 3720atgtcataat
gaatgcccca tcgattggtc aggtagacag cagcaagcaa tttgaaggta 3780cagttgaaat
taaacggaag tttgctggcc tgttgaaaaa tgactgtaac aaaagtgctt 3840ctggttattt
aacagatgaa aatgaagtgg ggtttagggg cttttattct gctcatggca 3900caaaactgaa
tgtttctact gaagctctgc aaaaagctgt gaaactgttt agtgatattg 3960agaatattag
tgaggaaact tctgcagagg tacatccaat aagtttatct tcaagtaaat 4020gtcatgattc
tgttgtttca atgtttaaga tagaaaatca taatgataaa actgtaagtg 4080aaaaaaataa
taaatgccaa ctgatattac aaaataatat tgaaatgact actggcactt 4140ttgttgaaga
aattactgaa aattacaaga gaaatactga aaatgaagat aacaaatata 4200ctgctgccag
tagaaattct cataacttag aatttgatgg cagtgattca agtaaaaatg 4260atactgtttg
tattcataaa gatgaaacgg acttgctatt tactgatcag cacaacatat 4320gtcttaaatt
atctggccag tttatgaagg agggaaacac tcagattaaa gaagatttgt 4380cagatttaac
ttttttggaa gttgcgaaag ctcaagaagc atgtcatggt aatacttcaa 4440ataaagaaca
gttaactgct actaaaacgg agcaaaatat aaaagatttt gagacttctg 4500atacattttt
tcagactgca agtgggaaaa atattagtgt cgccaaagag tcatttaata 4560aaattgtaaa
tttctttgat cagaaaccag aagaattgca taacttttcc ttaaattctg 4620aattacattc
tgacataaga aagaacaaaa tggacattct aagttatgag gaaacagaca 4680tagttaaaca
caaaatactg aaagaaagtg tcccagttgg tactggaaat caactagtga 4740ccttccaggg
acaacccgaa cgtgatgaaa agatcaaaga acctactcta ttgggttttc 4800atacagctag
cgggaaaaaa gttaaaattg caaaggaatc tttggacaaa gtgaaaaacc 4860tttttgatga
aaaagagcaa ggtactagtg aaatcaccag ttttagccat caatgggcaa 4920agaccctaaa
gtacagagag gcctgtaaag accttgaatt agcatgtgag accattgaga 4980tcacagctgc
cccaaagtgt aaagaaatgc agaattctct caataatgat aaaaaccttg 5040tttctattga
gactgtggtg ccacctaagc tcttaagtga taatttatgt agacaaactg 5100aaaatctcaa
aacatcaaaa agtatctttt tgaaagttaa agtacatgaa aatgtagaaa 5160aagaaacagc
aaaaagtcct gcaacttgtt acacaaatca gtccccttat tcagtcattg 5220aaaattcagc
cttagctttt tacacaagtt gtagtagaaa aacttctgtg agtcagactt 5280cattacttga
agcaaaaaaa tggcttagag aaggaatatt tgatggtcaa ccagaaagaa 5340taaatactgc
agattatgta ggaaattatt tgtatgaaaa taattcaaac agtactatag 5400ctgaaaatga
caaaaatcat ctctccgaaa aacaagatac ttatttaagt aacagtagca 5460tgtctaacag
ctattcctac cattctgatg aggtatataa tgattcagga tatctctcaa 5520aaaataaact
tgattctggt attgagccag tattgaagaa tgttgaagat caaaaaaaca 5580ctagtttttc
caaagtaata tccaatgtaa aagatgcaaa tgcataccca caaactgtaa 5640atgaagatat
ttgcgttgag gaacttgtga ctagctcttc accctgcaaa aataaaaatg 5700cagccattaa
attgtccata tctaatagta ataattttga ggtagggcca cctgcattta 5760ggatagccag
tggtaaaatc gtttgtgttt cacatgaaac aattaaaaaa gtgaaagaca 5820tatttacaga
cagtttcagt aaagtaatta aggaaaacaa cgagaataaa tcaaaaattt 5880gccaaacgaa
aattatggca ggttgttacg aggcattgga tgattcagag gatattcttc 5940ataactctct
agataatgat gaatgtagca cgcattcaca taaggttttt gctgacattc 6000agagtgaaga
aattttacaa cataaccaaa atatgtctgg attggagaaa gtttctaaaa 6060tatcaccttg
tgatgttagt ttggaaactt cagatatatg taaatgtagt atagggaagc 6120ttcataagtc
agtctcatct gcaaatactt gtgggatttt tagcacagca agtggaaaat 6180ctgtccaggt
atcagatgct tcattacaaa acgcaagaca agtgttttct gaaatagaag 6240atagtaccaa
gcaagtcttt tccaaagtat tgtttaaaag taacgaacat tcagaccagc 6300tcacaagaga
agaaaatact gctatacgta ctccagaaca tttaatatcc caaaaaggct 6360tttcatataa
tgtggtaaat tcatctgctt tctctggatt tagtacagca agtggaaagc 6420aagtttccat
tttagaaagt tccttacaca aagttaaggg agtgttagag gaatttgatt 6480taatcagaac
tgagcatagt cttcactatt cacctacgtc tagacaaaat gtatcaaaaa 6540tacttcctcg
tgttgataag agaaacccag agcactgtgt aaactcagaa atggaaaaaa 6600cctgcagtaa
agaatttaaa ttatcaaata acttaaatgt tgaaggtggt tcttcagaaa 6660ataatcactc
tattaaagtt tctccatatc tctctcaatt tcaacaagac aaacaacagt 6720tggtattagg
aaccaaagtg tcacttgttg agaacattca tgttttggga aaagaacagg 6780cttcacctaa
aaacgtaaaa atggaaattg gtaaaactga aactttttct gatgttcctg 6840tgaaaacaaa
tatagaagtt tgttctactt actccaaaga ttcagaaaac tactttgaaa 6900cagaagcagt
agaaattgct aaagctttta tggaagatga tgaactgaca gattctaaac 6960tgccaagtca
tgccacacat tctcttttta catgtcccga aaatgaggaa atggttttgt 7020caaattcaag
aattggaaaa agaagaggag agccccttat cttagtggga gaaccctcaa 7080tcaaaagaaa
cttattaaat gaatttgaca ggataataga aaatcaagaa aaatccttaa 7140aggcttcaaa
aagcactcca gatggcacaa taaaagatcg aagattgttt atgcatcatg 7200tttctttaga
gccgattacc tgtgtaccct ttcgcacaac taaggaacgt caagagatac 7260agaatccaaa
ttttaccgca cctggtcaag aatttctgtc taaatctcat ttgtatgaac 7320atctgacttt
ggaaaaatct tcaagcaatt tagcagtttc aggacatcca ttttatcaag 7380tttctgctac
aagaaatgaa aaaatgagac acttgattac tacaggcaga ccaaccaaag 7440tctttgttcc
accttttaaa actaaatcac attttcacag agttgaacag tgtgttagga 7500atattaactt
ggaggaaaac agacaaaagc aaaacattga tggacatggc tctgatgata 7560gtaaaaataa
gattaatgac aatgagattc atcagtttaa caaaaacaac tccaatcaag 7620cagcagctgt
aactttcaca aagtgtgaag aagaaccttt agatttaatt acaagtcttc 7680agaatgccag
agatatacag gatatgcgaa ttaagaagaa acaaaggcaa cgcgtctttc 7740cacagccagg
cagtctgtat cttgcaaaaa catccactct gcctcgaatc tctctgaaag 7800cagcagtagg
aggccaagtt ccctctgcgt gttctcataa acagctgtat acgtatggcg 7860tttctaaaca
ttgcataaaa attaacagca aaaatgcaga gtcttttcag tttcacactg 7920aagattattt
tggtaaggaa agtttatgga ctggaaaagg aatacagttg gctgatggtg 7980gatggctcat
accctccaat gatggaaagg ctggaaaaga agaattttat agggctctgt 8040gtgacactcc
aggtgtggat ccaaagctta tttctagaat ttgggtttat aatcactata 8100gatggatcat
atggaaactg gcagctatgg aatgtgcctt tcctaaggaa tttgctaata 8160gatgcctaag
cccagaaagg gtgcttcttc aactaaaata cagatatgat acggaaattg 8220atagaagcag
aagatcggct ataaaaaaga taatggaaag ggatgacaca gctgcaaaaa 8280cacttgttct
ctgtgtttct gacataattt cattgagcgc aaatatatct gaaacttcta 8340gcaataaaac
tagtagtgca gatacccaaa aagtggccat tattgaactt acagatgggt 8400ggtatgctgt
taaggcccag ttagatcctc ccctcttagc tgtcttaaag aatggcagac 8460tgacagttgg
tcagaagatt attcttcatg gagcagaact ggtgggctct cctgatgcct 8520gtacacctct
tgaagcccca gaatctctta tgttaaagat ttctgctaac agtactcggc 8580ctgctcgctg
gtataccaaa cttggattct ttcctgaccc tagacctttt cctctgccct 8640tatcatcgct
tttcagtgat ggaggaaatg ttggttgtgt tgatgtaatt attcaaagag 8700cataccctat
acagtggatg gagaagacat catctggatt atacatattt cgcaatgaaa 8760gagaggaaga
aaaggaagca gcaaaatatg tggaggccca acaaaagaga ctagaagcct 8820tattcactaa
aattcaggag gaatttgaag aacatgaaga aaacacaaca aaaccatatt 8880taccatcacg
tgcactaaca agacagcaag ttcgtgcttt gcaagatggt gcagagcttt 8940atgaagcagt
gaagaatgca gcagacccag cttaccttga gggttatttc agtgaagagc 9000agttaagagc
cttgaataat cacaggcaaa tgttgaatga taagaaacaa gctcagatcc 9060agttggaaat
taggaaggcc atggaatctg ctgaacaaaa ggaacaaggt ttatcaaggg 9120atgtcacaac
cgtgtggaag ttgcgtattg taagctattc aaaaaaagaa aaagattcag 9180ttatactgag
tatttggcgt ccatcatcag atttatattc tctgttaaca gaaggaaaga 9240gatacagaat
ttatcatctt gcaacttcaa aatctaaaag taaatctgaa agagctaaca 9300tacagttagc
agcgacaaaa aaaactcagt atcaacaact accggtttca gatgaaattt 9360tatttcagat
ttaccagcca cgggagcccc ttcacttcag caaattttta gatccagact 9420ttcagccatc
ttgttctgag gtggacctaa taggatttgt cgtttctgtt gtgaaaaaaa 9480caggacttgc
ccctttcgtc tatttgtcag acgaatgtta caatttactg gcaataaagt 9540tttggataga
ccttaatgag gacattatta agcctcatat gttaattgct gcaagcaacc 9600tccagtggcg
accagaatcc aaatcaggcc ttcttacttt atttgctgga gatttttctg 9660tgttttctgc
tagtccaaaa gagggccact ttcaagagac attcaacaaa atgaaaaata 9720ctgttgagaa
tattgacata ctttgcaatg aagcagaaaa caagcttatg catatactgc 9780atgcaaatga
tcccaagtgg tccaccccaa ctaaagactg tacttcaggg ccgtacactg 9840ctcaaatcat
tcctggtaca ggaaacaagc ttctgatgtc ttctcctaat tgtgagatat 9900attatcaaag
tcctttatca ctttgtatgg ccaaaaggaa gtctgtttcc acacctgtct 9960cagcccagat
gacttcaaag tcttgtaaag gggagaaaga gattgatgac caaaagaact 10020gcaaaaagag
aagagccttg gatttcttga gtagactgcc tttacctcca cctgttagtc 10080ccatttgtac
atttgtttct ccggctgcac agaaggcatt tcagccacca aggagttgtg 10140gcaccaaata
cgaaacaccc ataaagaaaa aagaactgaa ttctcctcag atgactccat 10200ttaaaaaatt
caatgaaatt tctcttttgg aaagtaattc aatagctgac gaagaacttg 10260cattgataaa
tacccaagct cttttgtctg gttcaacagg agaaaaacaa tttatatctg 10320tcagtgaatc
cactaggact gctcccacca gttcagaaga ttatctcaga ctgaaacgac 10380gttgtactac
atctctgatc aaagaacagg agagttccca ggccagtacg gaagaatgtg 10440agaaaaataa
gcaggacaca attacaacta aaaaatatat ctaagcattt gcaaaggcga 10500caataaatta
ttgacgctta acctttccag tttataagac tggaatataa tttcaaacca 10560cacattagta
cttatgttgc acaatgagaa aagaaattag tttcaaattt acctcagcgt 10620ttgtgtatcg
ggcaaaaatc gttttgcccg attccgtatt ggtatacttt tgcttcagtt 10680gcatatctta
aaactaaatg taatttatta actaatcaag aaaaacatct ttggctgagc 10740tcggtggctc
atgcctgtaa tcccaacact ttgagaagct gaggtgggag gagtgcttga 10800ggccaggagt
tcaagaccag cctgggcaac atagggagac ccccatcttt acaaagaaaa 10860aaaaaagggg
aaaagaaaat cttttaaatc tttggatttg atcactacaa gtattatttt 10920acaagtgaaa
taaacatacc attttctttt agattgtgtc attaaatgga atgaggtctc 10980ttagtacagt
tattttgatg cagataattc cttttagttt agctactatt ttaggggatt 11040ttttttagag
gtaactcact atgaaatagt tctccttaat gcaaatatgt tggttctgct 11100atagttccat
cctgttcaaa agtcaggatg aatatgaaga gtggtgtttc cttttgagca 11160attcttcatc
cttaagtcag catgattata agaaaaatag aaccctcagt gtaactctaa 11220ttccttttta
ctattccagt gtgatctctg aaattaaatt acttcaacta aaaattcaaa 11280tactttaaat
cagaagattt catagttaat ttattttttt tttcaacaaa atggtcatcc 11340aaactcaaac
ttgagaaaat atcttgcttt caaattggca ctgatt
11386
27 633 PRT Artificial Sequence XRCC1 sequence 27
Met Pro Glu Ile Arg Leu
Arg His Val Val Ser Cys Ser Ser Gln Asp 1 5
10 15 Ser Thr His Cys Ala Glu Asn Leu Leu Lys Ala
Asp Thr Tyr Arg Lys 20 25
30 Trp Arg Ala Ala Lys Ala Gly Glu Lys Thr Ile Ser Val Val Leu
Gln 35 40 45 Leu
Glu Lys Glu Glu Gln Ile His Ser Val Asp Ile Gly Asn Asp Gly 50
55 60 Ser Ala Phe Val Glu Val
Leu Val Gly Ser Ser Ala Gly Gly Ala Gly 65 70
75 80 Glu Gln Asp Tyr Glu Val Leu Leu Val Thr Ser
Ser Phe Met Ser Pro 85 90
95 Ser Glu Ser Arg Ser Gly Ser Asn Pro Asn Arg Val Arg Met Phe Gly
100 105 110 Pro Asp
Lys Leu Val Arg Ala Ala Ala Glu Lys Arg Trp Asp Arg Val 115
120 125 Lys Ile Val Cys Ser Gln Pro
Tyr Ser Lys Asp Ser Pro Phe Gly Leu 130 135
140 Ser Phe Val Arg Phe His Ser Pro Pro Asp Lys Asp
Glu Ala Glu Ala 145 150 155
160 Pro Ser Gln Lys Val Thr Val Thr Lys Leu Gly Gln Phe Arg Val Lys
165 170 175 Glu Glu Asp
Glu Ser Ala Asn Ser Leu Arg Pro Gly Ala Leu Phe Phe 180
185 190 Ser Arg Ile Asn Lys Thr Ser Pro
Val Thr Ala Ser Asp Pro Ala Gly 195 200
205 Pro Ser Tyr Ala Ala Ala Thr Leu Gln Ala Ser Ser Ala
Ala Ser Ser 210 215 220
Ala Ser Pro Val Ser Arg Ala Ile Gly Ser Thr Ser Lys Pro Gln Glu 225
230 235 240 Ser Pro Lys Gly
Lys Arg Lys Leu Asp Leu Asn Gln Glu Glu Lys Lys 245
250 255 Thr Pro Ser Lys Pro Pro Ala Gln Leu
Ser Pro Ser Val Pro Lys Arg 260 265
270 Pro Lys Leu Pro Ala Pro Thr Arg Thr Pro Ala Thr Ala Pro
Val Pro 275 280 285
Ala Arg Ala Gln Gly Ala Val Thr Gly Lys Pro Arg Gly Glu Gly Thr 290
295 300 Glu Pro Arg Arg Pro
Arg Ala Gly Pro Glu Glu Leu Gly Lys Ile Leu 305 310
315 320 Gln Gly Val Val Val Val Leu Ser Gly Phe
Gln Asn Pro Phe Arg Ser 325 330
335 Glu Leu Arg Asp Lys Ala Leu Glu Leu Gly Ala Lys Tyr Arg Pro
Asp 340 345 350 Trp
Thr Arg Asp Ser Thr His Leu Ile Cys Ala Phe Ala Asn Thr Pro 355
360 365 Lys Tyr Ser Gln Val Leu
Gly Leu Gly Gly Arg Ile Val Arg Lys Glu 370 375
380 Trp Val Leu Asp Cys His Arg Met Arg Arg Arg
Leu Pro Ser Gln Arg 385 390 395
400 Tyr Leu Met Ala Gly Pro Gly Ser Ser Ser Glu Glu Asp Glu Ala Ser
405 410 415 His Ser
Gly Gly Ser Gly Asp Glu Ala Pro Lys Leu Pro Gln Lys Gln 420
425 430 Pro Gln Thr Lys Thr Lys Pro
Thr Gln Ala Ala Gly Pro Ser Ser Pro 435 440
445 Gln Lys Pro Pro Thr Pro Glu Glu Thr Lys Ala Ala
Ser Pro Val Leu 450 455 460
Gln Glu Asp Ile Asp Ile Glu Gly Val Gln Ser Glu Gly Gln Asp Asn 465
470 475 480 Gly Ala Glu
Asp Ser Gly Asp Thr Glu Asp Glu Leu Arg Arg Val Ala 485
490 495 Glu Gln Lys Glu His Arg Leu Pro
Pro Gly Gln Glu Glu Asn Gly Glu 500 505
510 Asp Pro Tyr Ala Gly Ser Thr Asp Glu Asn Thr Asp Ser
Glu Glu His 515 520 525
Gln Glu Pro Pro Asp Leu Pro Val Pro Glu Leu Pro Asp Phe Phe Gln 530
535 540 Gly Lys His Phe
Phe Leu Tyr Gly Glu Phe Pro Gly Asp Glu Arg Arg 545 550
555 560 Lys Leu Ile Arg Tyr Val Thr Ala Phe
Asn Gly Glu Leu Glu Asp Tyr 565 570
575 Met Ser Asp Arg Val Gln Phe Val Ile Thr Ala Gln Glu Trp
Asp Pro 580 585 590
Ser Phe Glu Glu Ala Leu Met Asp Asn Pro Ser Leu Ala Phe Val Arg
595 600 605 Pro Arg Trp Ile
Tyr Ser Cys Asn Glu Lys Gln Lys Leu Leu Pro His 610
615 620 Gln Leu Tyr Gly Val Val Pro Gln
Ala 625 630
28 2102 DNA Artificial Sequence XRCC1 sequence 28
ctcgcgcgct tgcgcacttt agccagcgca gggcgcaccc cgccccctcc
cactctccct 60gcccctcgga ccccatactc tacctcatcc ttctggccag gcgaagccca
cgacgttgac 120atgccggaga tccgcctccg ccatgtcgtg tcctgcagca gccaggactc
gactcactgt 180gcagaaaatc ttctcaaggc agacacttac cgaaaatggc gggcagccaa
ggcaggcgag 240aagaccatct ctgtggtcct acagttggag aaggaggagc agatacacag
tgtggacatt 300gggaatgatg gctcagcttt cgtggaggtg ctggtgggca gttcagctgg
aggcgctggg 360gagcaagact atgaggtcct tctggtcacc tcatctttca tgtccccttc
cgagagccgc 420agtggctcaa accccaaccg cgttcgcatg tttgggcctg acaagctggt
ccgggcagcc 480gccgagaagc gctgggaccg ggtcaaaatt gtttgcagcc agccctacag
caaggactcc 540ccctttggct tgagttttgt acggtttcat agccccccag acaaagatga
ggcagaggcc 600ccgtcccaga aggtgacagt gaccaagctt ggccagttcc gtgtgaagga
ggaggatgag 660agcgccaact ctctgaggcc gggggctctc ttcttcagcc ggatcaacaa
gacatcccca 720gtcacagcca gcgacccagc aggacctagc tatgcagctg ctaccctcca
ggcttctagt 780gctgcctcct cagcctctcc agtctccagg gccataggca gcacctccaa
gccccaggag 840tctcccaaag ggaagaggaa gttggatttg aaccaagaag aaaagaagac
ccccagcaaa 900ccaccagccc agctgtcgcc atctgttccc aagagaccta aattgccagc
tccaactcgt 960accccagcca cagccccagt ccctgcccga gcacaggggg cagtgacagg
caaaccccga 1020ggagaaggca ccgagcccag acgaccccga gctggcccag aggagctggg
gaagatcctt 1080cagggtgtgg tagtggtgct gagtggcttc cagaacccct tccgctccga
gctgcgagat 1140aaggccctag agcttggggc caagtatcgg ccagactgga cccgggacag
cacgcacctc 1200atctgtgcct ttgccaacac ccccaagtac agccaggtcc taggcctggg
aggccgcatc 1260gtgcgtaagg agtgggtgct ggactgtcac cgcatgcgtc ggcggctgcc
ctcccagagg 1320tacctcatgg cagggccagg ttccagcagt gaggaggatg aggcctctca
cagcggtggc 1380agcggagatg aagcccccaa gcttcctcag aagcaacccc agaccaaaac
caagcccact 1440caggcagctg gacccagctc accccagaag cccccaaccc ctgaagagac
caaagcagcc 1500tcaccagtgc tccaggaaga tatagacatt gagggggtac agtcagaagg
acaggacaat 1560ggggcggaag attctgggga cacagaggat gagctgagga gggtggcaga
gcagaaggaa 1620cacagactgc cccctggcca ggaggagaat ggggaagacc cgtatgcagg
ctccacggat 1680gagaacacgg acagtgagga acaccaggag cctcctgatc tgccagtccc
tgagctccca 1740gatttcttcc agggcaagca cttctttctt tacggggagt tccctgggga
cgagcggcgg 1800aaactcatcc gatacgtcac agccttcaat ggggagctcg aggactatat
gagtgaccgg 1860gttcagtttg tgatcacagc acaggaatgg gatcccagct ttgaggaggc
cctgatggac 1920aacccctccc tggcattcgt tcgtccccga tggatctaca gttgcaatga
gaagcagaag 1980ttacttcctc accagctcta tggggtggtg ccgcaagcct gaagtatgtg
ctatacacac 2040acacacacac acacacacac acacacacac acgatgcatt taataaagat
gagttggttc 2100tc
2102
29 189 PRT Artificial Sequence KRAS sequence 29
Met Thr Glu Tyr
Lys Leu Val Val Val Gly Ala Gly Gly Val Gly Lys 1 5
10 15 Ser Ala Leu Thr Ile Gln Leu Ile Gln
Asn His Phe Val Asp Glu Tyr 20 25
30 Asp Pro Thr Ile Glu Asp Ser Tyr Arg Lys Gln Val Val Ile
Asp Gly 35 40 45
Glu Thr Cys Leu Leu Asp Ile Leu Asp Thr Ala Gly Gln Glu Glu Tyr 50
55 60 Ser Ala Met Arg Asp
Gln Tyr Met Arg Thr Gly Glu Gly Phe Leu Cys 65 70
75 80 Val Phe Ala Ile Asn Asn Thr Lys Ser Phe
Glu Asp Ile His His Tyr 85 90
95 Arg Glu Gln Ile Lys Arg Val Lys Asp Ser Glu Asp Val Pro Met
Val 100 105 110 Leu
Val Gly Asn Lys Cys Asp Leu Pro Ser Arg Thr Val Asp Thr Lys 115
120 125 Gln Ala Gln Asp Leu Ala
Arg Ser Tyr Gly Ile Pro Phe Ile Glu Thr 130 135
140 Ser Ala Lys Thr Arg Gln Arg Val Glu Asp Ala
Phe Tyr Thr Leu Val 145 150 155
160 Arg Glu Ile Arg Gln Tyr Arg Leu Lys Lys Ile Ser Lys Glu Glu Lys
165 170 175 Thr Pro
Gly Cys Val Lys Ile Lys Lys Cys Ile Ile Met 180
185
30 5436 DNA Artificial Sequence KRAS sequence 30
ggccgcggcg gcggaggcag cagcggcggc ggcagtggcg gcggcgaagg tggcggcggc
60tcggccagta ctcccggccc ccgccatttc ggactgggag cgagcgcggc gcaggcactg
120aaggcggcgg cggggccaga ggctcagcgg ctcccaggtg cgggagagag gcctgctgaa
180aatgactgaa tataaacttg tggtagttgg agctggtggc gtaggcaaga gtgccttgac
240gatacagcta attcagaatc attttgtgga cgaatatgat ccaacaatag aggattccta
300caggaagcaa gtagtaattg atggagaaac ctgtctcttg gatattctcg acacagcagg
360tcaagaggag tacagtgcaa tgagggacca gtacatgagg actggggagg gctttctttg
420tgtatttgcc ataaataata ctaaatcatt tgaagatatt caccattata gagaacaaat
480taaaagagtt aaggactctg aagatgtacc tatggtccta gtaggaaata aatgtgattt
540gccttctaga acagtagaca caaaacaggc tcaggactta gcaagaagtt atggaattcc
600ttttattgaa acatcagcaa agacaagaca gagagtggag gatgcttttt atacattggt
660gagggagatc cgacaataca gattgaaaaa aatcagcaaa gaagaaaaga ctcctggctg
720tgtgaaaatt aaaaaatgca ttataatgta atctgggtgt tgatgatgcc ttctatacat
780tagttcgaga aattcgaaaa cataaagaaa agatgagcaa agatggtaaa aagaagaaaa
840agaagtcaaa gacaaagtgt gtaattatgt aaatacaatt tgtacttttt tcttaaggca
900tactagtaca agtggtaatt tttgtacatt acactaaatt attagcattt gttttagcat
960tacctaattt ttttcctgct ccatgcagac tgttagcttt taccttaaat gcttatttta
1020aaatgacagt ggaagttttt ttttcctcta agtgccagta ttcccagagt tttggttttt
1080gaactagcaa tgcctgtgaa aaagaaactg aatacctaag atttctgtct tggggttttt
1140ggtgcatgca gttgattact tcttattttt cttaccaatt gtgaatgttg gtgtgaaaca
1200aattaatgaa gcttttgaat catccctatt ctgtgtttta tctagtcaca taaatggatt
1260aattactaat ttcagttgag accttctaat tggtttttac tgaaacattg agggaacaca
1320aatttatggg cttcctgatg atgattcttc taggcatcat gtcctatagt ttgtcatccc
1380tgatgaatgt aaagttacac tgttcacaaa ggttttgtct cctttccact gctattagtc
1440atggtcactc tccccaaaat attatatttt ttctataaaa agaaaaaaat ggaaaaaaat
1500tacaaggcaa tggaaactat tataaggcca tttccttttc acattagata aattactata
1560aagactccta atagcttttc ctgttaaggc agacccagta tgaaatgggg attattatag
1620caaccatttt ggggctatat ttacatgcta ctaaattttt ataataattg aaaagatttt
1680aacaagtata aaaaattctc ataggaatta aatgtagtct ccctgtgtca gactgctctt
1740tcatagtata actttaaatc ttttcttcaa cttgagtctt tgaagatagt tttaattctg
1800cttgtgacat taaaagatta tttgggccag ttatagctta ttaggtgttg aagagaccaa
1860ggttgcaagg ccaggccctg tgtgaacctt tgagctttca tagagagttt cacagcatgg
1920actgtgtccc cacggtcatc cagtgttgtc atgcattggt tagtcaaaat ggggagggac
1980tagggcagtt tggatagctc aacaagatac aatctcactc tgtggtggtc ctgctgacaa
2040atcaagagca ttgcttttgt ttcttaagaa aacaaactct tttttaaaaa ttacttttaa
2100atattaactc aaaagttgag attttggggt ggtggtgtgc caagacatta attttttttt
2160taaacaatga agtgaaaaag ttttacaatc tctaggtttg gctagttctc ttaacactgg
2220ttaaattaac attgcataaa cacttttcaa gtctgatcca tatttaataa tgctttaaaa
2280taaaaataaa aacaatcctt ttgataaatt taaaatgtta cttattttaa aataaatgaa
2340gtgagatggc atggtgaggt gaaagtatca ctggactagg aagaaggtga cttaggttct
2400agataggtgt cttttaggac tctgattttg aggacatcac ttactatcca tttcttcatg
2460ttaaaagaag tcatctcaaa ctcttagttt ttttttttta caactatgta atttatattc
2520catttacata aggatacact tatttgtcaa gctcagcaca atctgtaaat ttttaaccta
2580tgttacacca tcttcagtgc cagtcttggg caaaattgtg caagaggtga agtttatatt
2640tgaatatcca ttctcgtttt aggactcttc ttccatatta gtgtcatctt gcctccctac
2700cttccacatg ccccatgact tgatgcagtt ttaatacttg taattcccct aaccataaga
2760tttactgctg ctgtggatat ctccatgaag ttttcccact gagtcacatc agaaatgccc
2820tacatcttat ttcctcaggg ctcaagagaa tctgacagat accataaagg gatttgacct
2880aatcactaat tttcaggtgg tggctgatgc tttgaacatc tctttgctgc ccaatccatt
2940agcgacagta ggatttttca aacctggtat gaatagacag aaccctatcc agtggaagga
3000gaatttaata aagatagtgc tgaaagaatt ccttaggtaa tctataacta ggactactcc
3060tggtaacagt aatacattcc attgttttag taaccagaaa tcttcatgca atgaaaaata
3120ctttaattca tgaagcttac tttttttttt tggtgtcaga gtctcgctct tgtcacccag
3180gctggaatgc agtggcgcca tctcagctca ctgcaacctc catctcccag gttcaagcga
3240ttctcgtgcc tcggcctcct gagtagctgg gattacaggc gtgtgccact acactcaact
3300aatttttgta tttttaggag agacggggtt tcaccctgtt ggccaggctg gtctcgaact
3360cctgacctca agtgattcac ccaccttggc ctcataaacc tgttttgcag aactcattta
3420ttcagcaaat atttattgag tgcctaccag atgccagtca ccgcacaagg cactgggtat
3480atggtatccc caaacaagag acataatccc ggtccttagg tagtgctagt gtggtctgta
3540atatcttact aaggcctttg gtatacgacc cagagataac acgatgcgta ttttagtttt
3600gcaaagaagg ggtttggtct ctgtgccagc tctataattg ttttgctacg attccactga
3660aactcttcga tcaagctact ttatgtaaat cacttcattg ttttaaagga ataaacttga
3720ttatattgtt tttttatttg gcataactgt gattctttta ggacaattac tgtacacatt
3780aaggtgtatg tcagatattc atattgaccc aaatgtgtaa tattccagtt ttctctgcat
3840aagtaattaa aatatactta aaaattaata gttttatctg ggtacaaata aacaggtgcc
3900tgaactagtt cacagacaag gaaacttcta tgtaaaaatc actatgattt ctgaattgct
3960atgtgaaact acagatcttt ggaacactgt ttaggtaggg tgttaagact tacacagtac
4020ctcgtttcta cacagagaaa gaaatggcca tacttcagga actgcagtgc ttatgagggg
4080atatttaggc ctcttgaatt tttgatgtag atgggcattt ttttaaggta gtggttaatt
4140acctttatgt gaactttgaa tggtttaaca aaagatttgt ttttgtagag attttaaagg
4200gggagaattc tagaaataaa tgttacctaa ttattacagc cttaaagaca aaaatccttg
4260ttgaagtttt tttaaaaaaa gctaaattac atagacttag gcattaacat gtttgtggaa
4320gaatatagca gacgtatatt gtatcatttg agtgaatgtt cccaagtagg cattctaggc
4380tctatttaac tgagtcacac tgcataggaa tttagaacct aacttttata ggttatcaaa
4440actgttgtca ccattgcaca attttgtcct aatatataca tagaaacttt gtggggcatg
4500ttaagttaca gtttgcacaa gttcatctca tttgtattcc attgattttt tttttcttct
4560aaacattttt tcttcaaaca gtatataact ttttttaggg gatttttttt tagacagcaa
4620aaactatctg aagatttcca tttgtcaaaa agtaatgatt tcttgataat tgtgtagtaa
4680tgttttttag aacccagcag ttaccttaaa gctgaattta tatttagtaa cttctgtgtt
4740aatactggat agcatgaatt ctgcattgag aaactgaata gctgtcataa aatgaaactt
4800tctttctaaa gaaagatact cacatgagtt cttgaagaat agtcataact agattaagat
4860ctgtgtttta gtttaatagt ttgaagtgcc tgtttgggat aatgataggt aatttagatg
4920aatttagggg aaaaaaaagt tatctgcaga tatgttgagg gcccatctct ccccccacac
4980ccccacagag ctaactgggt tacagtgttt tatccgaaag tttccaattc cactgtcttg
5040tgttttcatg ttgaaaatac ttttgcattt ttcctttgag tgccaatttc ttactagtac
5100tatttcttaa tgtaacatgt ttacctggaa tgtattttaa ctatttttgt atagtgtaaa
5160ctgaaacatg cacattttgt acattgtgct ttcttttgtg ggacatatgc agtgtgatcc
5220agttgttttc catcatttgg ttgcgctgac ctaggaatgt tggtcatatc aaacattaaa
5280aatgaccact cttttaattg aaattaactt ttaaatgttt ataggagtat gtgctgtgaa
5340gtgatctaaa atttgtaata tttttgtcat gaactgtact actcctaatt attgtaatgt
5400aataaaaata gttacagtga caaaaaaaaa aaaaaa
5436
31 766 PRT Artificial Sequence BRAF sequence 31
Met Ala Ala Leu Ser Gly
Gly Gly Gly Gly Gly Ala Glu Pro Gly Gln 1 5
10 15 Ala Leu Phe Asn Gly Asp Met Glu Pro Glu Ala
Gly Ala Gly Ala Gly 20 25
30 Ala Ala Ala Ser Ser Ala Ala Asp Pro Ala Ile Pro Glu Glu Val
Trp 35 40 45 Asn
Ile Lys Gln Met Ile Lys Leu Thr Gln Glu His Ile Glu Ala Leu 50
55 60 Leu Asp Lys Phe Gly Gly
Glu His Asn Pro Pro Ser Ile Tyr Leu Glu 65 70
75 80 Ala Tyr Glu Glu Tyr Thr Ser Lys Leu Asp Ala
Leu Gln Gln Arg Glu 85 90
95 Gln Gln Leu Leu Glu Ser Leu Gly Asn Gly Thr Asp Phe Ser Val Ser
100 105 110 Ser Ser
Ala Ser Met Asp Thr Val Thr Ser Ser Ser Ser Ser Ser Leu 115
120 125 Ser Val Leu Pro Ser Ser Leu
Ser Val Phe Gln Asn Pro Thr Asp Val 130 135
140 Ala Arg Ser Asn Pro Lys Ser Pro Gln Lys Pro Ile
Val Arg Val Phe 145 150 155
160 Leu Pro Asn Lys Gln Arg Thr Val Val Pro Ala Arg Cys Gly Val Thr
165 170 175 Val Arg Asp
Ser Leu Lys Lys Ala Leu Met Met Arg Gly Leu Ile Pro 180
185 190 Glu Cys Cys Ala Val Tyr Arg Ile
Gln Asp Gly Glu Lys Lys Pro Ile 195 200
205 Gly Trp Asp Thr Asp Ile Ser Trp Leu Thr Gly Glu Glu
Leu His Val 210 215 220
Glu Val Leu Glu Asn Val Pro Leu Thr Thr His Asn Phe Val Arg Lys 225
230 235 240 Thr Phe Phe Thr
Leu Ala Phe Cys Asp Phe Cys Arg Lys Leu Leu Phe 245
250 255 Gln Gly Phe Arg Cys Gln Thr Cys Gly
Tyr Lys Phe His Gln Arg Cys 260 265
270 Ser Thr Glu Val Pro Leu Met Cys Val Asn Tyr Asp Gln Leu
Asp Leu 275 280 285
Leu Phe Val Ser Lys Phe Phe Glu His His Pro Ile Pro Gln Glu Glu 290
295 300 Ala Ser Leu Ala Glu
Thr Ala Leu Thr Ser Gly Ser Ser Pro Ser Ala 305 310
315 320 Pro Ala Ser Asp Ser Ile Gly Pro Gln Ile
Leu Thr Ser Pro Ser Pro 325 330
335 Ser Lys Ser Ile Pro Ile Pro Gln Pro Phe Arg Pro Ala Asp Glu
Asp 340 345 350 His
Arg Asn Gln Phe Gly Gln Arg Asp Arg Ser Ser Ser Ala Pro Asn 355
360 365 Val His Ile Asn Thr Ile
Glu Pro Val Asn Ile Asp Asp Leu Ile Arg 370 375
380 Asp Gln Gly Phe Arg Gly Asp Gly Gly Ser Thr
Thr Gly Leu Ser Ala 385 390 395
400 Thr Pro Pro Ala Ser Leu Pro Gly Ser Leu Thr Asn Val Lys Ala Leu
405 410 415 Gln Lys
Ser Pro Gly Pro Gln Arg Glu Arg Lys Ser Ser Ser Ser Ser 420
425 430 Glu Asp Arg Asn Arg Met Lys
Thr Leu Gly Arg Arg Asp Ser Ser Asp 435 440
445 Asp Trp Glu Ile Pro Asp Gly Gln Ile Thr Val Gly
Gln Arg Ile Gly 450 455 460
Ser Gly Ser Phe Gly Thr Val Tyr Lys Gly Lys Trp His Gly Asp Val 465
470 475 480 Ala Val Lys
Met Leu Asn Val Thr Ala Pro Thr Pro Gln Gln Leu Gln 485
490 495 Ala Phe Lys Asn Glu Val Gly Val
Leu Arg Lys Thr Arg His Val Asn 500 505
510 Ile Leu Leu Phe Met Gly Tyr Ser Thr Lys Pro Gln Leu
Ala Ile Val 515 520 525
Thr Gln Trp Cys Glu Gly Ser Ser Leu Tyr His His Leu His Ile Ile 530
535 540 Glu Thr Lys Phe
Glu Met Ile Lys Leu Ile Asp Ile Ala Arg Gln Thr 545 550
555 560 Ala Gln Gly Met Asp Tyr Leu His Ala
Lys Ser Ile Ile His Arg Asp 565 570
575 Leu Lys Ser Asn Asn Ile Phe Leu His Glu Asp Leu Thr Val
Lys Ile 580 585 590
Gly Asp Phe Gly Leu Ala Thr Val Lys Ser Arg Trp Ser Gly Ser His
595 600 605 Gln Phe Glu Gln
Leu Ser Gly Ser Ile Leu Trp Met Ala Pro Glu Val 610
615 620 Ile Arg Met Gln Asp Lys Asn Pro
Tyr Ser Phe Gln Ser Asp Val Tyr 625 630
635 640 Ala Phe Gly Ile Val Leu Tyr Glu Leu Met Thr Gly
Gln Leu Pro Tyr 645 650
655 Ser Asn Ile Asn Asn Arg Asp Gln Ile Ile Phe Met Val Gly Arg Gly
660 665 670 Tyr Leu Ser
Pro Asp Leu Ser Lys Val Arg Ser Asn Cys Pro Lys Ala 675
680 685 Met Lys Arg Leu Met Ala Glu Cys
Leu Lys Lys Lys Arg Asp Glu Arg 690 695
700 Pro Leu Phe Pro Gln Ile Leu Ala Ser Ile Glu Leu Leu
Ala Arg Ser 705 710 715
720 Leu Pro Lys Ile His Arg Ser Ala Ser Glu Pro Ser Leu Asn Arg Ala
725 730 735 Gly Phe Gln Thr
Glu Asp Phe Ser Leu Tyr Ala Cys Ala Ser Pro Lys 740
745 750 Thr Pro Ile Gln Ala Gly Gly Tyr Gly
Ala Phe Pro Val His 755 760 765
32 2949 DNA Artificial Sequence BRAF sequence 32
cgcctccctt ccccctcccc
gcccgacagc ggccgctcgg gccccggctc tcggttataa 60gatggcggcg ctgagcggtg
gcggtggtgg cggcgcggag ccgggccagg ctctgttcaa 120cggggacatg gagcccgagg
ccggcgccgg cgccggcgcc gcggcctctt cggctgcgga 180ccctgccatt ccggaggagg
tgtggaatat caaacaaatg attaagttga cacaggaaca 240tatagaggcc ctattggaca
aatttggtgg ggagcataat ccaccatcaa tatatctgga 300ggcctatgaa gaatacacca
gcaagctaga tgcactccaa caaagagaac aacagttatt 360ggaatctctg gggaacggaa
ctgatttttc tgtttctagc tctgcatcaa tggataccgt 420tacatcttct tcctcttcta
gcctttcagt gctaccttca tctctttcag tttttcaaaa 480tcccacagat gtggcacgga
gcaaccccaa gtcaccacaa aaacctatcg ttagagtctt 540cctgcccaac aaacagagga
cagtggtacc tgcaaggtgt ggagttacag tccgagacag 600tctaaagaaa gcactgatga
tgagaggtct aatcccagag tgctgtgctg tttacagaat 660tcaggatgga gagaagaaac
caattggttg ggacactgat atttcctggc ttactggaga 720agaattgcat gtggaagtgt
tggagaatgt tccacttaca acacacaact ttgtacgaaa 780aacgtttttc accttagcat
tttgtgactt ttgtcgaaag ctgcttttcc agggtttccg 840ctgtcaaaca tgtggttata
aatttcacca gcgttgtagt acagaagttc cactgatgtg 900tgttaattat gaccaacttg
atttgctgtt tgtctccaag ttctttgaac accacccaat 960accacaggaa gaggcgtcct
tagcagagac tgccctaaca tctggatcat ccccttccgc 1020acccgcctcg gactctattg
ggccccaaat tctcaccagt ccgtctcctt caaaatccat 1080tccaattcca cagcccttcc
gaccagcaga tgaagatcat cgaaatcaat ttgggcaacg 1140agaccgatcc tcatcagctc
ccaatgtgca tataaacaca atagaacctg tcaatattga 1200tgacttgatt agagaccaag
gatttcgtgg tgatggagga tcaaccacag gtttgtctgc 1260taccccccct gcctcattac
ctggctcact aactaacgtg aaagccttac agaaatctcc 1320aggacctcag cgagaaagga
agtcatcttc atcctcagaa gacaggaatc gaatgaaaac 1380acttggtaga cgggactcga
gtgatgattg ggagattcct gatgggcaga ttacagtggg 1440acaaagaatt ggatctggat
catttggaac agtctacaag ggaaagtggc atggtgatgt 1500ggcagtgaaa atgttgaatg
tgacagcacc tacacctcag cagttacaag ccttcaaaaa 1560tgaagtagga gtactcagga
aaacacgaca tgtgaatatc ctactcttca tgggctattc 1620cacaaagcca caactggcta
ttgttaccca gtggtgtgag ggctccagct tgtatcacca 1680tctccatatc attgagacca
aatttgagat gatcaaactt atagatattg cacgacagac 1740tgcacagggc atggattact
tacacgccaa gtcaatcatc cacagagacc tcaagagtaa 1800taatatattt cttcatgaag
acctcacagt aaaaataggt gattttggtc tagctacagt 1860gaaatctcga tggagtgggt
cccatcagtt tgaacagttg tctggatcca ttttgtggat 1920ggcaccagaa gtcatcagaa
tgcaagataa aaatccatac agctttcagt cagatgtata 1980tgcatttgga attgttctgt
atgaattgat gactggacag ttaccttatt caaacatcaa 2040caacagggac cagataattt
ttatggtggg acgaggatac ctgtctccag atctcagtaa 2100ggtacggagt aactgtccaa
aagccatgaa gagattaatg gcagagtgcc tcaaaaagaa 2160aagagatgag agaccactct
ttccccaaat tctcgcctct attgagctgc tggcccgctc 2220attgccaaaa attcaccgca
gtgcatcaga accctccttg aatcgggctg gtttccaaac 2280agaggatttt agtctatatg
cttgtgcttc tccaaaaaca cccatccagg cagggggata 2340tggtgcgttt cctgtccact
gaaacaaatg agtgagagag ttcaggagag tagcaacaaa 2400aggaaaataa atgaacatat
gtttgcttat atgttaaatt gaataaaata ctctcttttt 2460ttttaaggtg aaccaaagaa
cacttgtgtg gttaaagact agatataatt tttccccaaa 2520ctaaaattta tacttaacat
tggattttta acatccaagg gttaaaatac atagacattg 2580ctaaaaattg gcagagcctc
ttctagaggc tttactttct gttccgggtt tgtatcattc 2640acttggttat tttaagtagt
aaacttcagt ttctcatgca acttttgttg ccagctatca 2700catgtccact agggactcca
gaagaagacc ctacctatgc ctgtgtttgc aggtgagaag 2760ttggcagtcg gttagcctgg
gttagataag gcaaactgaa cagatctaat ttaggaagtc 2820agtagaattt aataattcta
ttattattct taataatttt tctataacta tttcttttta 2880taacaatttg gaaaatgtgg
atgtctttta tttccttgaa gcaataaact aagtttcttt 2940ttataaaaa
2949
33 1312 PRT Artificial Sequence RAD50 sequence 33
Met Ser Arg Ile Glu Lys Met Ser Ile Leu Gly Val
Arg Ser Phe Gly 1 5 10
15 Ile Glu Asp Lys Asp Lys Gln Ile Ile Thr Phe Phe Ser Pro Leu Thr
20 25 30 Ile Leu Val
Gly Pro Asn Gly Ala Gly Lys Thr Thr Ile Ile Glu Cys 35
40 45 Leu Lys Tyr Ile Cys Thr Gly Asp
Phe Pro Pro Gly Thr Lys Gly Asn 50 55
60 Thr Phe Val His Asp Pro Lys Val Ala Gln Glu Thr Asp
Val Arg Ala 65 70 75
80 Gln Ile Arg Leu Gln Phe Arg Asp Val Asn Gly Glu Leu Ile Ala Val
85 90 95 Gln Arg Ser Met
Val Cys Thr Gln Lys Ser Lys Lys Thr Glu Phe Lys 100
105 110 Thr Leu Glu Gly Val Ile Thr Arg Thr
Lys His Gly Glu Lys Val Ser 115 120
125 Leu Ser Ser Lys Cys Ala Glu Ile Asp Arg Glu Met Ile Ser
Ser Leu 130 135 140
Gly Val Ser Lys Ala Val Leu Asn Asn Val Ile Phe Cys His Gln Glu 145
150 155 160 Asp Ser Asn Trp Pro
Leu Ser Glu Gly Lys Ala Leu Lys Gln Lys Phe 165
170 175 Asp Glu Ile Phe Ser Ala Thr Arg Tyr Ile
Lys Ala Leu Glu Thr Leu 180 185
190 Arg Gln Val Arg Gln Thr Gln Gly Gln Lys Val Lys Glu Tyr Gln
Met 195 200 205 Glu
Leu Lys Tyr Leu Lys Gln Tyr Lys Glu Lys Ala Cys Glu Ile Arg 210
215 220 Asp Gln Ile Thr Ser Lys
Glu Ala Gln Leu Thr Ser Ser Lys Glu Ile 225 230
235 240 Val Lys Ser Tyr Glu Asn Glu Leu Asp Pro Leu
Lys Asn Arg Leu Lys 245 250
255 Glu Ile Glu His Asn Leu Ser Lys Ile Met Lys Leu Asp Asn Glu Ile
260 265 270 Lys Ala
Leu Asp Ser Arg Lys Lys Gln Met Glu Lys Asp Asn Ser Glu 275
280 285 Leu Glu Glu Lys Met Glu Lys
Val Phe Gln Gly Thr Asp Glu Gln Leu 290 295
300 Asn Asp Leu Tyr His Asn His Gln Arg Thr Val Arg
Glu Lys Glu Arg 305 310 315
320 Lys Leu Val Asp Cys His Arg Glu Leu Glu Lys Leu Asn Lys Glu Ser
325 330 335 Arg Leu Leu
Asn Gln Glu Lys Ser Glu Leu Leu Val Glu Gln Gly Arg 340
345 350 Leu Gln Leu Gln Ala Asp Arg His
Gln Glu His Ile Arg Ala Arg Asp 355 360
365 Ser Leu Ile Gln Ser Leu Ala Thr Gln Leu Glu Leu Asp
Gly Phe Glu 370 375 380
Arg Gly Pro Phe Ser Glu Arg Gln Ile Lys Asn Phe His Lys Leu Val 385
390 395 400 Arg Glu Arg Gln
Glu Gly Glu Ala Lys Thr Ala Asn Gln Leu Met Asn 405
410 415 Asp Phe Ala Glu Lys Glu Thr Leu Lys
Gln Lys Gln Ile Asp Glu Ile 420 425
430 Arg Asp Lys Lys Thr Gly Leu Gly Arg Ile Ile Glu Leu Lys
Ser Glu 435 440 445
Ile Leu Ser Lys Lys Gln Asn Glu Leu Lys Asn Val Lys Tyr Glu Leu 450
455 460 Gln Gln Leu Glu Gly
Ser Ser Asp Arg Ile Leu Glu Leu Asp Gln Glu 465 470
475 480 Leu Ile Lys Ala Glu Arg Glu Leu Ser Lys
Ala Glu Lys Asn Ser Asn 485 490
495 Val Glu Thr Leu Lys Met Glu Val Ile Ser Leu Gln Asn Glu Lys
Ala 500 505 510 Asp
Leu Asp Arg Thr Leu Arg Lys Leu Asp Gln Glu Met Glu Gln Leu 515
520 525 Asn His His Thr Thr Thr
Arg Thr Gln Met Glu Met Leu Thr Lys Asp 530 535
540 Lys Ala Asp Lys Asp Glu Gln Ile Arg Lys Ile
Lys Ser Arg His Ser 545 550 555
560 Asp Glu Leu Thr Ser Leu Leu Gly Tyr Phe Pro Asn Lys Lys Gln Leu
565 570 575 Glu Asp
Trp Leu His Ser Lys Ser Lys Glu Ile Asn Gln Thr Arg Asp 580
585 590 Arg Leu Ala Lys Leu Asn Lys
Glu Leu Ala Ser Ser Glu Gln Asn Lys 595 600
605 Asn His Ile Asn Asn Glu Leu Lys Arg Lys Glu Glu
Gln Leu Ser Ser 610 615 620
Tyr Glu Asp Lys Leu Phe Asp Val Cys Gly Ser Gln Asp Phe Glu Ser 625
630 635 640 Asp Leu Asp
Arg Leu Lys Glu Glu Ile Glu Lys Ser Ser Lys Gln Arg 645
650 655 Ala Met Leu Ala Gly Ala Thr Ala
Val Tyr Ser Gln Phe Ile Thr Gln 660 665
670 Leu Thr Asp Glu Asn Gln Ser Cys Cys Pro Val Cys Gln
Arg Val Phe 675 680 685
Gln Thr Glu Ala Glu Leu Gln Glu Val Ile Ser Asp Leu Gln Ser Lys 690
695 700 Leu Arg Leu Ala
Pro Asp Lys Leu Lys Ser Thr Glu Ser Glu Leu Lys 705 710
715 720 Lys Lys Glu Lys Arg Arg Asp Glu Met
Leu Gly Leu Val Pro Met Arg 725 730
735 Gln Ser Ile Ile Asp Leu Lys Glu Lys Glu Ile Pro Glu Leu
Arg Asn 740 745 750
Lys Leu Gln Asn Val Asn Arg Asp Ile Gln Arg Leu Lys Asn Asp Ile
755 760 765 Glu Glu Gln Glu
Thr Leu Leu Gly Thr Ile Met Pro Glu Glu Glu Ser 770
775 780 Ala Lys Val Cys Leu Thr Asp Val
Thr Ile Met Glu Arg Phe Gln Met 785 790
795 800 Glu Leu Lys Asp Val Glu Arg Lys Ile Ala Gln Gln
Ala Ala Lys Leu 805 810
815 Gln Gly Ile Asp Leu Asp Arg Thr Val Gln Gln Val Asn Gln Glu Lys
820 825 830 Gln Glu Lys
Gln His Lys Leu Asp Thr Val Ser Ser Lys Ile Glu Leu 835
840 845 Asn Arg Lys Leu Ile Gln Asp Gln
Gln Glu Gln Ile Gln His Leu Lys 850 855
860 Ser Thr Thr Asn Glu Leu Lys Ser Glu Lys Leu Gln Ile
Ser Thr Asn 865 870 875
880 Leu Gln Arg Arg Gln Gln Leu Glu Glu Gln Thr Val Glu Leu Ser Thr
885 890 895 Glu Val Gln Ser
Leu Tyr Arg Glu Ile Lys Asp Ala Lys Glu Gln Val 900
905 910 Ser Pro Leu Glu Thr Thr Leu Glu Lys
Phe Gln Gln Glu Lys Glu Glu 915 920
925 Leu Ile Asn Lys Lys Asn Thr Ser Asn Lys Ile Ala Gln Asp
Lys Leu 930 935 940
Asn Asp Ile Lys Glu Lys Val Lys Asn Ile His Gly Tyr Met Lys Asp 945
950 955 960 Ile Glu Asn Tyr Ile
Gln Asp Gly Lys Asp Asp Tyr Lys Lys Gln Lys 965
970 975 Glu Thr Glu Leu Asn Lys Val Ile Ala Gln
Leu Ser Glu Cys Glu Lys 980 985
990 His Lys Glu Lys Ile Asn Glu Asp Met Arg Leu Met Arg Gln
Asp Ile 995 1000 1005
Asp Thr Gln Lys Ile Gln Glu Arg Trp Leu Gln Asp Asn Leu Thr 1010
1015 1020 Leu Arg Lys Arg Asn
Glu Glu Leu Lys Glu Val Glu Glu Glu Arg 1025 1030
1035 Lys Gln His Leu Lys Glu Met Gly Gln Met
Gln Val Leu Gln Met 1040 1045 1050
Lys Ser Glu His Gln Lys Leu Glu Glu Asn Ile Asp Asn Ile Lys
1055 1060 1065 Arg Asn
His Asn Leu Ala Leu Gly Arg Gln Lys Gly Tyr Glu Glu 1070
1075 1080 Glu Ile Ile His Phe Lys Lys
Glu Leu Arg Glu Pro Gln Phe Arg 1085 1090
1095 Asp Ala Glu Glu Lys Tyr Arg Glu Met Met Ile Val
Met Arg Thr 1100 1105 1110
Thr Glu Leu Val Asn Lys Asp Leu Asp Ile Tyr Tyr Lys Thr Leu 1115
1120 1125 Asp Gln Ala Ile Met
Lys Phe His Ser Met Lys Met Glu Glu Ile 1130 1135
1140 Asn Lys Ile Ile Arg Asp Leu Trp Arg Ser
Thr Tyr Arg Gly Gln 1145 1150 1155
Asp Ile Glu Tyr Ile Glu Ile Arg Ser Asp Ala Asp Glu Asn Val
1160 1165 1170 Ser Ala
Ser Asp Lys Arg Arg Asn Tyr Asn Tyr Arg Val Val Met 1175
1180 1185 Leu Lys Gly Asp Thr Ala Leu
Asp Met Arg Gly Arg Cys Ser Ala 1190 1195
1200 Gly Gln Lys Val Leu Ala Ser Leu Ile Ile Arg Leu
Ala Leu Ala 1205 1210 1215
Glu Thr Phe Cys Leu Asn Cys Gly Ile Ile Ala Leu Asp Glu Pro 1220
1225 1230 Thr Thr Asn Leu Asp
Arg Glu Asn Ile Glu Ser Leu Ala His Ala 1235 1240
1245 Leu Val Glu Ile Ile Lys Ser Arg Ser Gln
Gln Arg Asn Phe Gln 1250 1255 1260
Leu Leu Val Ile Thr His Asp Glu Asp Phe Val Glu Leu Leu Gly
1265 1270 1275 Arg Ser
Glu Tyr Val Glu Lys Phe Tyr Arg Ile Lys Lys Asn Ile 1280
1285 1290 Asp Gln Cys Ser Glu Ile Val
Lys Cys Ser Val Ser Ser Leu Gly 1295 1300
1305 Phe Asn Val His 1310
SEQ ID NO: 34
Length: 6597
Type: DNA
Organism: Artificial Sequence
Other information: RAD50 sequence
Sequence 34:
tttcccggcg tgccccagga
gagcggcgtg gacgcgtgcg ggcctagagg cccacgtgat 60ccgcagggcg gccgaggcag
gaagctgtga gtgcgcggtt gcggggtcgc attgtggcta 120cggctttgcg tccccggcgg
gcagccccag gctggtcccc gcctccgctc tccccaccgg 180cggggaaagc agctggtgtg
ggaggaaagg ctccatcccc cgccccctct ctcccgctgt 240tggctggcag gatcttttgg
cagtcctgtg gcctcgctcc ccgcccggat cctcctgacc 300ctgagattcg cgggtctcac
gtcccgtgca cgccttgctt cggcctcagt taagcctttg 360tggactccag gtccctggtg
agattagaaa cgtttgcaaa catgtcccgg atcgaaaaga 420tgagcattct gggcgtgcgg
agttttggaa tagaggacaa agataagcaa attatcactt 480tcttcagccc ccttacaatt
ttggttggac ccaatggggc gggaaagacg accatcattg 540aatgtctaaa atatatttgt
actggagatt tccctcctgg aaccaaagga aatacatttg 600tacacgatcc caaggttgct
caagaaacag atgtgagagc ccagattcgt ctgcaatttc 660gtgatgtcaa tggagaactt
atagctgtgc aaagatctat ggtgtgtact cagaaaagca 720aaaagacaga atttaaaact
ctggaaggag tcattactag aacaaagcat ggtgaaaagg 780tcagtctgag ctctaagtgt
gcagaaattg accgagaaat gatcagttct cttggggttt 840ccaaggctgt gctaaataat
gtcattttct gtcatcaaga agattctaat tggcctttaa 900gtgaaggaaa ggctttgaag
caaaagtttg atgagatttt ttcagcaaca agatacatta 960aagccttaga aacacttcgg
caggtacgtc agacacaagg tcagaaagta aaagaatatc 1020aaatggaact aaaatatctg
aagcaatata aggaaaaagc ttgtgagatt cgtgatcaga 1080ttacaagtaa ggaagcccag
ttaacatctt caaaggaaat tgtcaaatcc tatgagaatg 1140aacttgatcc attgaagaat
cgtctaaaag aaattgaaca taatctctct aaaataatga 1200aacttgacaa tgaaattaaa
gccttggata gccgaaagaa gcaaatggag aaagataata 1260gtgaactgga agagaaaatg
gaaaaggttt ttcaagggac tgatgagcaa ctaaatgact 1320tatatcacaa tcaccagaga
acagtaaggg agaaagaaag gaaattggta gactgtcatc 1380gtgaactgga aaaactaaat
aaagaatcta ggcttctcaa tcaggaaaaa tcagaactgc 1440ttgttgaaca gggtcgtcta
cagctgcaag cagatcgcca tcaagaacat atccgagcta 1500gagattcatt aattcagtct
ttggcaacac agctagaatt ggatggcttt gagcgtggac 1560cattcagtga aagacagatt
aaaaattttc acaaacttgt gagagagaga caagaagggg 1620aagcaaaaac tgccaaccaa
ctgatgaatg actttgcaga aaaagagact ctgaaacaaa 1680aacagataga tgagataaga
gataagaaaa ctggactggg aagaataatt gagttaaaat 1740cagaaatcct aagtaagaag
cagaatgagc tgaaaaatgt gaagtatgaa ttacagcagt 1800tggaaggatc ttcagacagg
attcttgaac tggaccagga gctcataaaa gctgaacgtg 1860agttaagcaa ggctgagaaa
aacagcaatg tagaaacctt aaaaatggaa gtaataagtc 1920tccaaaatga aaaagcagac
ttagacagga ccctgcgtaa acttgaccag gagatggagc 1980agttaaacca tcatacaaca
acacgtaccc aaatggagat gctgaccaaa gacaaagctg 2040acaaagatga acaaatcaga
aaaataaaat ctaggcacag tgatgaatta acctcactgt 2100tgggatattt tcccaacaaa
aaacagcttg aagactggct acatagtaaa tcaaaagaaa 2160ttaatcagac cagggacaga
cttgccaaat tgaacaagga actagcttca tctgagcaga 2220ataaaaatca tataaataat
gaactaaaaa gaaaggaaga gcagttgtcc agttacgaag 2280acaagctgtt tgatgtttgt
ggtagccagg attttgaaag tgatttagac aggcttaaag 2340aggaaattga aaaatcatca
aaacagcgag ccatgctggc tggagccaca gcagtttact 2400cccagttcat tactcagcta
acagacgaaa accagtcatg ttgccccgtt tgtcagagag 2460tttttcagac agaggctgag
ttacaagaag tcatcagtga tttgcagtct aaactgcgac 2520ttgctccaga taaactcaag
tcaacagaat cagagctaaa aaaaaaggaa aagcggcgtg 2580atgaaatgct gggacttgtg
cccatgaggc aaagcataat tgatttgaag gagaaggaaa 2640taccagaatt aagaaacaaa
ctgcagaatg tcaatagaga catacagcgc ctaaagaacg 2700acatagaaga acaagaaaca
ctcttgggta caataatgcc tgaagaagaa agtgccaaag 2760tatgcctgac agatgttaca
attatggaga ggttccagat ggaacttaaa gatgttgaaa 2820gaaaaattgc acaacaagca
gctaagctac aaggaataga cttagatcga actgtccaac 2880aagtcaacca ggagaaacaa
gagaaacagc acaagttaga cacagtttct agtaagattg 2940aattgaatcg taagcttata
caggaccagc aggaacagat tcaacatcta aaaagtacaa 3000caaatgagct aaaatctgag
aaacttcaga tatccactaa tttgcaacgt cgtcagcaac 3060tggaggagca gactgtggaa
ttatccactg aagttcagtc tttgtacaga gagataaagg 3120atgctaaaga gcaggtaagc
cctttggaaa caacattgga aaagttccag caagaaaaag 3180aagaattaat caacaaaaaa
aatacaagca acaaaatagc acaggataaa ctgaatgata 3240ttaaagagaa ggttaaaaat
attcatggct atatgaaaga cattgagaat tatattcaag 3300atgggaaaga cgactataag
aagcaaaaag aaactgaact taataaagta atagctcaac 3360taagtgaatg cgagaaacac
aaagaaaaga taaatgaaga tatgagactc atgagacaag 3420atattgatac acagaagata
caagaaaggt ggctacaaga taaccttact ttaagaaaaa 3480gaaatgagga actaaaagaa
gttgaagaag aaagaaaaca acatttgaag gaaatgggtc 3540aaatgcaggt tttgcaaatg
aaaagtgaac atcagaagtt ggaagagaac atagacaata 3600taaaaagaaa tcataatttg
gcattagggc gacagaaagg ttatgaagaa gaaattattc 3660attttaagaa agaacttcga
gaaccacaat ttcgggatgc tgaggaaaag tatagagaaa 3720tgatgattgt tatgaggaca
acagaacttg tgaacaagga tctggatatt tattataaga 3780ctcttgacca agcaataatg
aaatttcaca gtatgaaaat ggaagaaatc aataaaatta 3840tacgtgacct gtggcgaagt
acctatcgtg gacaagatat tgaatacata gaaatacggt 3900ctgatgccga tgaaaatgta
tcagcttctg ataaaaggcg gaattataac taccgagtgg 3960tgatgctgaa gggagacaca
gccttggata tgcgaggacg atgcagtgct ggacaaaagg 4020tattagcctc actcatcatt
cgcctggccc tggctgaaac gttctgcctc aactgtggca 4080tcattgcctt ggatgagcca
acaacaaatc ttgaccgaga aaacattgaa tctcttgcac 4140atgctctggt tgagataata
aaaagtcgct cacagcagcg taacttccag cttctggtaa 4200tcactcatga tgaagatttt
gtggagcttt taggacgttc tgaatatgtg gagaaattct 4260acaggattaa aaagaacatc
gatcagtgct cagagattgt gaaatgcagt gttagctccc 4320tgggattcaa tgttcattaa
aaatatccaa gatttaaatg ccatagaaat gtaggtcctc 4380agaaagtgta taataagaaa
cttatttctc atatcaactt agtcaataag aaaatatatt 4440ctttcaaagg aacattgtgt
ctaggatttt ggatgttgag aggttctaaa atcatgaaac 4500ttgtttcact gaaaattgga
cagattgcct gtttctgatt tgctgctctt catcccattc 4560caggcagcct ctgtcaggcc
ttcagggttc agcagtacag ccgagactcg actctgtgcc 4620tccctcccca gtgcaaatgc
atgcttcttc tcaaagcact gttgagaagg agataattac 4680tgccttgaaa atttatggtt
ttggtatttt tttaaatcat agttaaatgt tacctctgaa 4740tttacttcct tgcatgtggt
ttgaaaaact gagtattaat atctgaggat gaccagaaat 4800ggtgagatgt atgtttggct
ctgcttttaa ctttataaat ccagtgacct ctctctctgg 4860gacttggttt ccccaactaa
aatttgaagt agttgaatgg ggtctcaaag tttgacagga 4920accttaagta atcatctaag
tcagtaccca ccaccttctt ctcctacata tcccttccag 4980atggtcatcc agactcagag
ctctctctac agagaggaaa ttctccactg tgcacaccca 5040cctttggaaa gctctgacca
cttgaggcct gatctgccca tcgtgaagaa gcctgtaaca 5100ctcctctgcg tctatcctgt
gtagcatact ggcttcacca tcaatcctga ttcctctcta 5160agtgggcatt gccatgtgga
aggcaagcca ggctcactca cagagtcaag gcctgctccc 5220tgtagggtcc aaccagacct
ggaagaacag gcctctccat ttgctcttca gatgccactt 5280ctaagaaaag cctaatcaca
gtttttcctg gaattgccag ctgacatctt gaatccttcc 5340attccacaca gaatgcaacc
aagtcacacg cttttgaatt atgctttgta gagttttgtc 5400attcagagtc agccaggacc
ataccgggtc ttgattcagt cacatggcat ggttttgtgc 5460catctgtagc tataatgagc
atgtttgcct agacagcttt tctcaactgg gtccagaaga 5520gaattaagcc ctaaggtcct
aaggcatcta tctgtgctag gttaaatggt tggcccccaa 5580agatagacag gtcctgattt
ctagaacccg tgactgttac tttatacagc aaaggaaact 5640ttgcagatgt gattaaagct
aaggacctta agacagagta tcctgggggt ggtggtgggg 5700tggggggggg tcctaaatgt
aatcacgagt aagattaaga gcaaatcaat tctagtcata 5760tattaaacat ccacaataac
caagatattt ttatcccaag aatgcaagat ttcagaaaat 5820gaaaaatctg ttgataaatc
catcactata ataaaaccga aggtgaaaaa aattctgaaa 5880aaattctagc agctatattt
gataaaattc aacatctcct agctttagca aactcacagt 5940tttgcaaata atattttctt
aatgttatct gttgctaaat caaaattaaa cagtcatctt 6000aactgcaaaa taaaacattt
ctcagtaaat attaaagcca gttaccttct atcaacatgt 6060taatgaaagt gctagttgtt
gcagcaaaga ataacaaagg caatacacga tcaatatagg 6120cagtgaaaca aaagtatcat
ttgcaagtta aaacagactt cccaatttta aatctggttt 6180ccccctgaat atgtggcatc
cttggcagca cttctgagag tggctgcttt cattccaaga 6240agcccatggg tttggaggtg
ggataggtgc ctttctggct tctcattgct gcttctagat 6300cagtctccaa atatccccct
tccccacatt ggaatgaata gccatcacag catggatgga 6360ggttagaatg agccagactg
cctgggctca aatcctagca caccactcac tagctgggga 6420ccttgagcaa gttatttgtc
ctgttttctg tttccttata tgtaaaagtg ggtaaaatgg 6480tacatatttt gtagggttgt
tatgaagatt gaatgacatt atttacaaac tgcttagaac 6540tgcttgccac ctactaaata
ctgtgtaagt gttcaagaaa aagctgtctt catttca 6597
SEQ ID NO: 35
Length: 418
Type: PRT
Organism: Artificial Sequence
Other information: RAD51 sequence
Sequence 35:
Met Ser Gly Thr Glu Glu Ala Ile Leu Gly Gly Arg
Asp Ser His Pro 1 5 10
15 Ala Ala Gly Gly Gly Ser Val Leu Cys Phe Gly Gln Cys Gln Tyr Thr
20 25 30 Ala Glu Glu
Tyr Gln Ala Ile Gln Lys Ala Leu Arg Gln Arg Leu Gly 35
40 45 Pro Glu Tyr Ile Ser Ser Arg Met
Ala Gly Gly Gly Gln Lys Val Cys 50 55
60 Tyr Ile Glu Gly His Arg Val Ile Asn Leu Ala Asn Glu
Met Phe Gly 65 70 75
80 Tyr Asn Gly Trp Ala His Ser Ile Thr Gln Gln Asn Val Asp Phe Val
85 90 95 Asp Leu Asn Asn
Gly Lys Phe Tyr Val Gly Val Cys Ala Phe Val Arg 100
105 110 Val Gln Leu Lys Asp Gly Ser Tyr His
Glu Asp Val Gly Tyr Gly Val 115 120
125 Ser Glu Gly Leu Lys Ser Lys Ala Leu Ser Leu Glu Lys Ala
Arg Lys 130 135 140
Glu Ala Val Thr Asp Gly Leu Lys Arg Ala Leu Arg Ser Phe Gly Asn 145
150 155 160 Ala Leu Gly Asn Cys
Ile Leu Asp Lys Asp Tyr Leu Arg Ser Leu Asn 165
170 175 Lys Leu Pro Arg Gln Leu Pro Leu Glu Val
Asp Leu Thr Lys Ala Lys 180 185
190 Arg Gln Asp Leu Glu Pro Ser Val Glu Glu Ala Arg Tyr Asn Ser
Cys 195 200 205 Arg
Pro Asn Met Ala Leu Gly His Pro Gln Leu Gln Gln Val Thr Ser 210
215 220 Pro Ser Arg Pro Ser His
Ala Val Ile Pro Ala Asp Gln Asp Cys Ser 225 230
235 240 Ser Arg Ser Leu Ser Ser Ser Ala Val Glu Ser
Glu Ala Thr His Gln 245 250
255 Arg Lys Leu Arg Gln Lys Gln Leu Gln Gln Gln Phe Arg Glu Arg Met
260 265 270 Glu Lys
Gln Gln Val Arg Val Ser Thr Pro Ser Ala Glu Lys Ser Glu 275
280 285 Ala Ala Pro Pro Ala Pro Pro
Val Thr His Ser Thr Pro Val Thr Val 290 295
300 Ser Glu Pro Leu Leu Glu Lys Asp Phe Leu Ala Gly
Val Thr Gln Glu 305 310 315
320 Leu Ile Lys Thr Leu Glu Asp Asn Ser Glu Lys Trp Ala Val Thr Pro
325 330 335 Asp Ala Gly
Asp Gly Val Val Lys Pro Ser Ser Arg Ala Asp Pro Ala 340
345 350 Gln Thr Ser Asp Thr Leu Ala Leu
Asn Asn Gln Met Val Thr Gln Asn 355 360
365 Arg Thr Pro His Ser Val Cys His Gln Lys Pro Gln Ala
Lys Ser Gly 370 375 380
Ser Trp Asp Leu Gln Thr Tyr Ser Ala Asp Gln Arg Thr Thr Gly Asn 385
390 395 400 Trp Glu Ser His
Arg Lys Ser Gln Asp Met Lys Lys Arg Lys Tyr Asp 405
410 415 Pro Ser
SEQ ID NO: 36
Length: 2673
Type: DNA
Organism: Artificial Sequence
Other information: RAD51 sequence
Sequence 36:
cccattctcc tctgcgcggc ctccatctaa gatctcttcc
ccttgtccat agcctagatc 60gagctccctg tgtgcaccgc gcgctgcccg aggcgcaggt
caaccagaat caagatgtct 120gggactgagg aagcaattct tggaggacgt gacagccatc
ctgctgctgg cggcggctca 180gtgttatgct ttggacagtg ccagtacaca gcagaagagt
accaggccat ccagaaggcc 240ctgaggcaga ggctgggccc agaatacata agtagccgca
tggctggcgg aggccagaag 300gtgtgctaca ttgagggtca tcgggtaatt aatctggcca
atgagatgtt tggttacaat 360ggctgggcac actccatcac gcagcagaat gtggattttg
ttgacctcaa caatggcaag 420ttctacgtgg gagtctgtgc atttgtgagg gtccagctga
aggatggttc atatcatgaa 480gatgttggtt atggtgttag tgagggcctc aagtccaagg
ctttatcttt ggagaaggca 540aggaaggagg cggtgacaga cgggctgaag cgagccctca
ggagttttgg gaatgcactt 600ggaaactgta ttctggacaa agactacctg agatcactaa
ataagcttcc acgccagttg 660cctcttgaag tggatttaac taaagcgaag agacaagatc
ttgaaccgtc tgtggaggag 720gcaagataca acagctgccg accgaacatg gccctgggac
acccacagct gcagcaggtg 780acctcccctt ccagacccag ccatgctgtg ataccggcgg
accaggactg cagctcccga 840agcctgagct catccgccgt ggagagcgag gccacgcacc
agcggaagct ccggcagaag 900cagctgcagc agcagttccg ggagcggatg gagaagcagc
aggttcgagt ctccacgccg 960tcagctgaga agagtgaggc agcgcctccg gcccctcctg
tgacgcacag cactcctgta 1020actgtctcag aaccactcct ggagaaagac ttccttgcag
gagtgactca agaattaatc 1080aagactcttg aagacaactc tgaaaagtgg gctgtgactc
ccgatgcagg ggatggtgtg 1140gtcaagccct cgtctagagc agacccagcc cagacctctg
acacattagc cttgaacaac 1200cagatggtga cccagaacag gactccacac agcgtttgcc
accagaaacc acaagcaaaa 1260tctggatctt gggacctcca aacttatagc gctgaccaac
gcacaacagg aaactgggaa 1320tctcatagga agagccagga catgaagaaa aggaaatatg
atccatctta actgaggctc 1380aggccacata attggactct gtcacaaagg gactttggaa
aactactttt tggtcatgaa 1440attgttcatc gctgctggag aatgaacgtc attgcgattt
atcttgcttc attctgaacc 1500ttatcaagag gatctgactg agagcccact gcagttagag
ctgagcactt ttgaaaagct 1560tgtccatcac tctagtaggg agaggctctg gacagatgaa
taccttttct tcggcttgtg 1620aggcttccca ctatttatta ctgaactatt atgttaatga
agatggacat tttaggaatc 1680accaatggct ccttgccctc aagcaatata ggccagactt
ggtcctaagc acctgcctca 1740gcaattgtct acattcagtt gttttgcata acgtctgcct
tctttccttt acggtccatg 1800cctttaatgt tgcccacatt aagcactgtg gatcacgaca
ggaaaaaggt tggagcagtg 1860cttttcacta ctttgtatca atccaggcta caatcttcat
ttaatataaa taatttatgg 1920atttatgaca ttacaatcct gcattgtttc aagactgaca
ttttttccta aggaaggaaa 1980taatcatcta agaccacgaa aaaaggctgt tttttgtttt
tttttttttt tttttttttg 2040agacggggtc tggctgtgtt gccctgactg gagttcagtg
gtgcaaacac agctctctcc 2100acaacctctt gggcccaagt gatactccca cctctgcctt
acaaaataca gggattactg 2160gtgtgagcca ctgtgtctgg ccagaaaagg catttttgag
aaagcaaatc gtatacctta 2220ttaacaaaat agaatatata tatattgctt atctgaaatg
cttgaaacca gaattgtttt 2280gcattttttg aatatttgta tacacataat gagaccttgg
ggatgggacc caagtctgaa 2340cgtggaattc acctgtgttt cgtgtatatg cctcatacac
ataattttgt gcatgaaaca 2400gagtttttgt ataagaagat acactgcagc tgaagagggc
tgggtttttt tttctcttag 2460ggtcgctgca taaactgttg tatgcctggt gctttgcgac
ttgtcacacg aggtcacgtg 2520tggaattttc cacttctggc atcacgtcag tgctcagaaa
ttttctgatc tcagagcatt 2580tcaattaggg atgctcaaac gcaactgttt ctacttcccc
atttcaggtg tgagatgtaa 2640cccaccttga ccataaattg gcttttcata gtg
2673