Patent application title: Endothelium-Specific Nucleic Acid Regulatory Elements and Methods and Use Thereof
Inventors:
IPC8 Class: AA61K3923FI
USPC Class:
Class name:
Publication date: 2020-07-30
Patent application number: 20200237900
Abstract:
The present invention relates to nucleic acid regulatory elements that
are able to enhance endothelial cell-specific expression of genes,
methods employing these regulatory elements and uses of these elements.
Expression cassettes and vectors containing these nucleic acid regulatory
elements are also disclosed. The present invention is particularly useful
for applications using gene therapy, more particularly endothelial
cell-directed gene therapy, and for vaccination purposes.
Claims:
1. A nucleic acid regulatory element for enhancing endothelial
cell-specific gene expression comprising a sequence selected from the
group consisting of: SEQ ID NO:1 to 33, or a sequence having at least
80% identity to any of these sequences.
2. The nucleic acid regulatory element according to claim 1, having a maximal length of about 1000 nucleotides, still comprising said regulatory element.
3. A nucleic acid regulatory element for enhancing gene expression in endothelial cells comprising a complement of a sequence as defined by any one of SEQ ID Nos:1 to 33 or hybridizing under stringent conditions to a sequence as defined by any one of SEQ ID Nos:1 to 33.
4. (canceled)
5. A nucleic acid expression cassette comprising one or more nucleic acid regulatory elements according to claim 1, operably linked to a promoter.
6. The nucleic acid expression cassette according to claim 5, wherein the nucleic acid regulatory element is operably linked to a promoter and a transgene.
7. The nucleic acid expression cassette according to claim 5, wherein the promoter is an endothelial cell-specific promoter selected from the group consisting of: IFI27, ICAM2, VWF, EDN1, ENG, ECSCR, CDH5, PECAM1, HHIP, TIE1 or HYAL2.
8. The nucleic acid expression cassette according to claim 5, wherein the transgene encodes a therapeutic protein or an immunogenic protein.
9. The nucleic acid expression cassette according to claim 5, wherein the transgene encodes hepatocyte growth factor (HGF), coagulation factor VIII (FVIII), coagulation factor VII (FVII), coagulation factor IX (FIX), coagulation factor XI (FXI), tissue factor (TF), tissue factor pathway inhibitor (TFPI), von Willebrand factor (vWF), ADAMTS13, VEGF, PLGF, FGF, sFLT1, α1-antitrypsin (AAT), apolipoprotein A-1 (apoA-1), matrix metalloproteinase-3 (TIMP-3), insulin, nitric oxide synthase (NOS), growth factors, or antibodies directed against any one of said transgenes, factors and their cognate receptors or against any secreted protein or viral protein, small interfering RNA, guide RNA, endonuclease, and Cas9.
10. The nucleic acid expression cassette according to claim 5, further comprising a polyadenylation signal.
11. A vector comprising the nucleic acid regulatory element according to claim 1.
12. The vector according to claim 11, wherein the vector is a lentiviral vector, an adeno-associated viral (AAV) vector, or an adenoviral vector.
13. The vector according to claim 11, wherein the vector is a plasmid, a minicircle, an episomal vector, or a transposon-based vector.
14. A pharmaceutical composition comprising a nucleic acid expression cassette or a vector each comprising the nucleic acid regulatory element according to claim 1, and a pharmaceutically acceptable carrier.
15. (canceled)
16. (canceled)
17. A method for expressing a transgene product in endothelial cells, comprising: i) introducing a nucleic acid expression cassette or a vector each comprising the nucleic acid regulatory element according to claim 1 operably linked to a promoter and a transgene, into the cells; and ii) expressing a transgene product in the cells.
18. A method for treating or preventing an endothelial cell-related disease or disorder, the method comprising: administering a therapeutically effective amount of a nucleic acid expression cassette, a vector, or a pharmaceutical composition, each comprising the nucleic acid regulatory element according to claim 1, to a subject in need thereof, wherein said endothelial cell-related disease or disorder is liver diseases, hemophilia A, von Willebrand disease, microvascular thrombosis, thrombotic thrombocytopenic purpura, peripheral vascular disease, coronary artery diseases, atherosclerotic diseases, stroke, heart disease, diabetes, insulin resistance, chronic kidney failure, tumor growth, metastasis, venous thrombosis, ischemia, tumour growth, tumour vascularisation, cancer, Ebola, Dengue or Dengue hemorrhagic fever.
19. (canceled)
20. A method for enhancing transgene expression in endothelial cells in a subject undergoing endothelial cell-directed gene therapy, the method comprising: i) administering to the subject a nucleic acid expression cassette, a vector, or a pharmaceutical composition, each comprising the nucleic acid regulatory element according to claim 1, operably linked to a promoter and a transgene, and ii) expressing a therapeutically effective amount of the transgene product in endothelial cells of the subject.
21. The nucleic acid regulatory element according to claim 1, having a maximal length of about 610 nucleotides, still comprising said regulatory element.
22. The nucleic acid expression cassette according to claim 10, wherein the polyadenylation signal is the Simian Virus 40 (SV40) polyadenylation signal.
23. The vector according to claim 11, wherein the transposon-based vector is a PiggyBac-based vector or a Sleeping Beauty-based vector.
24. A method for vaccinating a subject, the method comprising: i) administering to the subject a nucleic acid expression cassette, a vector, or a pharmaceutical composition each comprising the nucleic acid regulatory element according to claim 1 operably linked to a promoter and a transgene, and ii) expressing an immunologically effective amount of the transgene product in the endothelial cells of the subject.
Description:
FIELD
[0001] The present invention relates to nucleic acid regulatory elements that are able to enhance endothelial-specific expression of genes, methods employing these regulatory elements and use thereof. The invention further encompasses expression cassettes, vectors and pharmaceutical compositions comprising these regulatory elements. The present invention is particularly useful for applications using gene therapy, more particularly endothelial-directed gene therapy, and for vaccination purposes.
BACKGROUND
[0002] Endothelial cells form a single cell layer that lines all blood vessels and regulate exchanges between the bloodstream and the surrounding tissues. Signals from endothelial cells organize the growth and development of connective tissue cells that form the surrounding layers of the blood-vessel wall. New blood vessels can develop from the walls of existing small vessels by the outgrowth of endothelial cells, which have the capacity to form hollow capillary tubes even when isolated in culture. Endothelial cells of developing arteries and veins express different cell-surface proteins, which may control the way in which they link up to create a capillary bed (Molecular Biology of the Cell, 4th Edition). A homeostatic mechanism ensures that blood vessels permeate every region of the body. Cells that are short of oxygen increase their concentration of hypoxia-inducible factor 1 (HIF-1), which stimulates the production of vascular endothelial growth factor (VEGF). VEGF acts on endothelial cells, causing them to proliferate and invade the hypoxic tissue to supply it with new blood vessels (Molecular Biology of the Cell, 4th Edition).
[0003] Endothelial cell phenotypes vary between different organs, between different segments of the vascular loop within the same organ, and between neighbouring endothelial cells of the same organ and blood vessel type. In addition to differences in structure, endothelial cells show remarkable heterogeneity in function. For example, the endothelial cells in the liver, called liver sinusoidal endothelial cells (LSEC), form a continuous lining of the liver capillaries, or sinusoids, separating parenchymal cells and fat-storing cells from sinusoidal blood. LSECs represent unique, highly specialized endothelial cells in the body. LSECs differ in fine structure from endothelial cells lining larger blood vessels and from other capillary endothelia in that they lack a distinct basement membrane and also contain open pores, or fenestrae, in the thin cytoplasmic projections which constitute the sinusoidal wall. This distinctive morphology supports the protective role played by liver endothelium, the cells forming a general barrier against pathogenic agents and serving as a selective sieve for substances passing from the blood to parenchymal and fat-storing cells, and vice versa. Sinusoidal endothelial cells, furthermore, significantly participate in the metabolic and clearance functions of the liver. They have been shown to be involved in the endocytosis and metabolism of a wide range of macromolecules, including glycoproteins, lipoproteins, extracellular matrix components, and inert colloids, establishing endothelial cells as a vital link in the complex network of cellular interactions and cooperation in the liver.
[0004] In addition, LSECs have long been noted to contribute to liver regeneration after liver injury. In normal liver, the major cellular source of hepatocyte growth factor (HGF) is the hepatic stellate cell, but after liver injury, HGF expression has been thought to increase markedly in proliferating LSECs (DeLeve et al. Liver sinusoidal endothelial cells and liver regeneration J Clin. Invest. 2013). Another unexpected function of LSEC was recently reported (Shahani et al., J. Thromb. Hemost 2014), demonstrating that LSECs and not hepatocytes express coagulation factor VIII (FVIII). Moreover, endothelial cells, including LSECs, also express von Willebrand factor (vWF). It is known that secreted FVIII would be relatively unstable unless it is associated with vWF. Deficiency of FVIII, a co-factor in the intrinsic coagulation pathway, results in hemophilia A. Liver transplantation in both FVIII-deficient dogs and patients with hemophilia A corrects these disorders. Although the liver is known to be the main site of FVIII production, other organs are probably also important for the regulation of FVIII secretion. Recent studies have shown that lung endothelial cells can synthesize FVIII. Microvascular endothelial cells from lung, heart, intestine, and skin as well as endothelial cells from pulmonary artery constitutively secreted FVIII and released it after treatment with phorbol-myristate acetate and epinephrine. By contrast, endothelial cells from the aorta, umbilical artery and umbilical vein did not constitutively secrete FVIII or release it after treatment with agonists, probably because of a lack of FVIII synthesis. Extrahepatic endothelial cells from certain vascular beds therefore appear to be an important FVIII production and storage site with the potential to regulate FVIII secretion in chronic and acute conditions (Shahani et al. Blood. 2010 Jun. 10; 115(23):4902-9). 
In addition, LSECs have also been reported to induce immunosuppressive IL-10-producing Th1 cells via the Notch pathway (Neumann et al. Eur J Immunol. 2015 July; 45(7):2008-16) suggesting an important immune-modulatory role. Therefore, LSEC dysfunction has been regarded as a key event in multiple liver disorders. Future studies are likely to disclose more fully the role of LSEC in the regulation of liver hemodynamics, in liver metabolism and blood clearance, in the maintenance of hepatic structure, in the pathogenesis of various liver diseases, and in the aging process in the liver (De Leeuw et al. J Electron Microsc Tech. 1990 March; 14(3):218-36).
[0005] The endothelium is involved in many disease states, either as a primary determinant of pathophysiology or as a secondary target. In particular, endothelial cell dysfunction can be caused by acquired, complex multifactorial, genetic or infectious diseases. In some cases, the underlying endothelial cell defect can be life-threatening, for which no effective cure is presently available. Dysfunction of the vascular endothelium is a hallmark of many human diseases. The endothelium is directly involved in many different diseases including peripheral vascular disease, stroke, heart disease, diabetes, insulin resistance, chronic kidney failure, tumor growth, metastasis, venous thrombosis, and severe viral infectious diseases. Consequently, the endothelium has substantial untapped potential as a therapeutic target. In particular, endothelial cells are attractive target cells for gene therapy to enable robust and/or sustained expression of the cognate therapeutic genes.
[0006] In particular, endothelial dysfunction is one of the major pathophysiological mechanisms that lead to coronary artery disease and other atherosclerotic diseases. Atherosclerosis is a progressive vascular disease characterized by the accumulation of lipids, inflammatory cells, and fibrous elements. In Western societies, it is the underlying cause of approximately 50% of all deaths. Dysfunction or injury of vascular endothelial cells is critical for the development of atherosclerosis. The endothelium functions as a selectively permeable barrier between blood and tissues as it can regulate transcytosis and generate effector molecules such as nitric oxide (NO) that regulate thrombosis, inflammation, vascular tone and vascular remodeling. For example, overexpression of STAMP2 suppresses atherosclerosis and stabilizes plaques in diabetic mice. Similarly, it has been reported that overexpression of ABCG1 by somatic gene transfer to the atherosclerotic vessel wall results in a significant improvement of plaque morphology and composition, and of vascular function in vivo (Heart Int. 2012 Jun. 5; 7(2): e12.).
[0007] Endothelial cells also play a key role in angiogenesis and vasculogenesis. Angiogenesis is the physiological process through which new blood vessels form from pre-existing vessels. This is distinct from vasculogenesis, which is the de novo formation of endothelial cells from mesoderm cell precursors. Angiogenesis and vasculogenesis are normal and vital processes in growth and development. However, they also represent a fundamental step in cancer progression, justifying the use of angiogenesis or vasculogenesis inhibitors in the treatment of cancer. Conversely, promoting angiogenesis and vasculogenesis may benefit the treatment of ischemia, which is associated with a decrease in blood supply to certain organs or tissues. Consequently, the delivery of genes into endothelial cells that either promote or inhibit angiogenesis and/or vasculogenesis opens new perspectives for the treatment of cardiovascular disease and cancer, respectively. This includes a plethora of therapeutic genes including VEGF, PLGF, FGF, sFLT1, antibodies directed against these factors and their cognate receptors, cytokines, chemokines, etc.
[0008] Furthermore, endothelial cells are also promising targets for gene therapy to express therapeutic proteins that are missing or defective in genetic disorders that result from mutations in the respective genes. For example, this includes FVIII, vWF or ADAMTS13 (a disintegrin and metalloproteinase with a thrombospondin type 1 motif, member 13). As mentioned above, hemophilia A is due to a deficiency in FVIII. Moreover, deficiency in vWF causes a bleeding diathesis in patients suffering from von Willebrand disease (VWD). Conversely, a deficiency in ADAMTS13 is linked to the development of microvascular thrombosis characteristic of thrombotic thrombocytopenic purpura (TTP). Consequently, to establish an effective cure for these genetic diseases, robust expression of FVIII, vWF or ADAMTS13 in the endothelial cells is required. In addition, given their proximity to the blood, endothelial cells are also ideally suited to express other therapeutic proteins that are normally not expressed by endothelial cells but that can be directly secreted in the blood. This includes, but is not limited to, other coagulation factors (e.g. factor VII, IX, XI etc.) and serum proteins (α1-antitrypsin (AAT), antibodies, growth factors etc.).
[0009] Finally, endothelial cells also play a key role in viral infection. For example, the Ebola virus is an aggressive pathogen that causes a highly lethal hemorrhagic fever syndrome in humans and nonhuman primates. The virus infects microvascular endothelial cells and compromises vascular integrity. Infection of endothelial cells also induces a cytopathic effect and damage to the endothelial barrier. Similarly, Dengue virus causes leakage of the vascular endothelium, resulting in dengue hemorrhagic fever and dengue shock syndrome. The endothelial cell lining of the vasculature regulates capillary permeability and is altered by immune and chemokine responses which affect fluid barrier functions of the endothelium. Human endothelial cells are susceptible to infection by dengue virus. Following attachment to human endothelial cell receptors, dengue virus causes a productive infection that has the potential to increase viral dissemination and viremia. This provides the potential for dengue virus-infected endothelial cells to directly alter barrier functions of the endothelium, contribute to enhancement of immune cell activation, and serve as potential targets of immune responses which play a central role in dengue pathogenesis.
[0010] Hence, there is a need to establish effective cures by gene therapy to enable robust expression of the cognate therapeutic genes in the endothelial cells. This requires the development of potent expression cassettes containing the genes of interest. Consequently, there is a need to identify robust nucleic acid regulatory elements capable of substantially increasing transcription in the endothelium.
SUMMARY OF THE INVENTION
[0011] To achieve robust and specific expression in endothelial cells, the inventors have developed a computational approach to identify robust nucleic acid regulatory elements such as cis-regulatory elements (CREs) that are capable of substantially increasing transcription in endothelial cells (also called EC-CREs or EC-REs) when combined with an endothelial-specific promoter. Endothelial-specific nucleic acid regulatory elements were identified in silico and subsequently validated in vitro in cell lines and in vivo in mice.
[0012] These nucleic acid regulatory elements are critically important for the regulation of gene expression in an endothelial cell type-specific manner. They are typically composed of clusters of transcription factor binding site (TFBS) motifs. The types and arrangement of TFBS and epigenetic modification patterns influence gene expression levels and specificity. Conventional methods of vector design relied on haphazard trial-and-error approaches whereby transcriptional enhancers were combined with promoters to boost expression levels. Though this could sometimes be effective, it often led to non-productive combinations that yielded either a modest or no increase in expression of the gene of interest and/or a loss of tissue specificity. Moreover, these conventional approaches did not take into account the importance of including evolutionarily conserved regulatory motifs in the expression modules. The development of nucleic acid regulatory elements that can lead to robust and specific expression in endothelial cells will be very useful for achieving safe and efficient gene delivery to endothelial cells for the treatment of disorders related to endothelial cell dysfunction.
[0013] The present inventors have relied on a computational approach (cf. FIG. 1) to identify robust nucleic acid regulatory elements that boost gene expression at the transcriptional level in endothelial cells (designated herein as "EC"). This approach comprises the following computational steps: (1) endothelial cell-specific genes were identified that are highly and specifically expressed based on expression data from endothelial cells; (2) publicly available databases (ENSEMBL) were used for extracting the sequences upstream of the Transcriptional Start Site (TSS) of the selected genes; and (3) these sequences were then submitted to the UCSC Genome Browser Database for locating the transcription start site in the human genome. To extract the corresponding endothelial cell nucleic acid regulatory elements, defined herein as the nucleic acid regulatory elements, the sequences were selected based on the following criteria: a) rich TFBS content, b) epigenetic signatures associated with high DNase hypersensitivity or chromatin accessibility (i.e. histone modifications), and c) evolutionarily conserved clusters of TFBS associated with highly expressed endothelial cell-specific genes.
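The three selection criteria listed above can be pictured as a simple filter over candidate upstream regions. The following is a minimal sketch under stated assumptions, not the inventors' pipeline: the `CandidateRegion` record, its field names, the placeholder gene labels and all threshold values are hypothetical and do not appear in the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CandidateRegion:
    """Hypothetical record for a sequence upstream of a TSS."""
    gene: str                   # placeholder label, not a real gene
    tfbs_count: int             # number of predicted TFBS motifs (criterion a)
    dnase_hypersensitive: bool  # open-chromatin signature present (criterion b)
    conservation: float         # 0..1 cross-species conservation score (criterion c)


def select_candidates(
    regions: List[CandidateRegion],
    min_tfbs: int = 5,              # illustrative threshold only
    min_conservation: float = 0.7,  # illustrative threshold only
) -> List[CandidateRegion]:
    """Keep only regions that satisfy all three criteria at once."""
    return [
        r for r in regions
        if r.tfbs_count >= min_tfbs
        and r.dnase_hypersensitive
        and r.conservation >= min_conservation
    ]


# Toy input values, invented purely to exercise the filter.
regions = [
    CandidateRegion("geneA", tfbs_count=8, dnase_hypersensitive=True, conservation=0.9),
    CandidateRegion("geneB", tfbs_count=2, dnase_hypersensitive=True, conservation=0.95),
    CandidateRegion("geneC", tfbs_count=9, dnase_hypersensitive=False, conservation=0.8),
]
selected = select_candidates(regions)
print([r.gene for r in selected])
```

In this toy input only `geneA` passes, because `geneB` has too few motifs and `geneC` lacks the open-chromatin signature.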
[0014] As shown in the experimental section, the inventors identified nucleic acid regulatory elements that will specifically enhance gene expression in endothelial cells.
[0015] The endothelial cell regulatory elements will subsequently be validated in vivo, yielding efficient and tissue-specific gene expression. This approach hence allows for the use of lower and thus safer vector doses, while maximizing therapeutic efficacy.
[0016] The invention therefore provides the following aspects:
[0017] Aspect 1. A nucleic acid regulatory element for enhancing endothelial cell-specific gene expression comprising, consisting essentially of, or consisting of the sequence selected from the group consisting of: SEQ ID NO:1 to 33, or a sequence having at least 80%, preferably at least 85%, more preferably at least 90%, even more preferably at least 95%, such as 95%, 96%, 97%, 98%, or 99%, identity to any of these sequences. In a preferred embodiment of said aspect, said nucleic acid regulatory element for enhancing endothelial cell-specific gene expression comprises, consists essentially of, or consists of the sequence of SEQ ID NO:22, or a sequence having at least 80%, preferably at least 85%, more preferably at least 90%, even more preferably at least 95%, such as 95%, 96%, 97%, 98%, or 99%, identity to this sequence.
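This passage does not state how percent identity is computed. As an illustration only, the sketch below assumes a global (Needleman-Wunsch) alignment with hypothetical scoring (match +1, mismatch -1, gap -1) and defines identity as the number of matching columns divided by the alignment length. Other definitions and established alignment tools exist and may give different values; the function names are not taken from the application.

```python
from typing import Tuple


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -1) -> Tuple[str, str]:
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Matching columns / alignment length, as a percentage (assumed definition)."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    al_a, al_b = global_align(a.upper(), b.upper())
    # A gap column never has '-' on both sides, so x == y implies a true match.
    matches = sum(1 for x, y in zip(al_a, al_b) if x == y)
    return 100.0 * matches / len(al_a)


def meets_identity_threshold(a: str, b: str, threshold: float = 80.0) -> bool:
    """True if the assumed identity measure is at least `threshold` percent."""
    return percent_identity(a, b) >= threshold


print(percent_identity("ACGTACGTAC", "ACGTACGTAG"))  # one mismatch in ten columns
print(meets_identity_threshold("ACGT", "ACT"))       # one gap in four columns
```

The quadratic-time table is adequate for elements of the lengths discussed here (up to about 1000 nucleotides); a dedicated alignment library would be the usual choice for larger inputs.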
[0018] Aspect 2. The nucleic acid regulatory element according to aspect 1, having a maximal length of 1000 nucleotides, preferably 800 nucleotides, more preferably 700 nucleotides, most preferably of 610 nucleotides, still comprising the regulatory element defined by any one of SEQ ID Nos: 1 to 33.
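The length condition in this aspect combines two checks: the element may not exceed a maximal length, and it must still contain the defined regulatory sequence. A minimal sketch follows, assuming exact substring containment (the aspect itself does not prescribe a test); the function name and the toy sequences are hypothetical.

```python
def within_length_limit(element: str, core: str, max_len: int = 1000) -> bool:
    """True if `element` is at most `max_len` nt long and still contains `core`.

    Containment is checked as an exact, case-insensitive substring match,
    which is an assumption made for this illustration.
    """
    return len(element) <= max_len and core.upper() in element.upper()


# Toy sequences, invented for illustration; not taken from the sequence listing.
core = "ACGTAC"
element = "TT" + core + "GG"

print(within_length_limit(element, core))             # default limit of 1000 nt
print(within_length_limit(element, core, max_len=5))  # stricter, illustrative limit
```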
[0019] Aspect 3. A nucleic acid regulatory element for enhancing gene expression in endothelial cells comprising, consisting essentially of, or consisting of the complement of a sequence as defined by any one of SEQ ID Nos: 1 to 33, or hybridizing under stringent conditions to a sequence as defined by any one of SEQ ID Nos: 1 to 33.
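The complement referred to in this aspect can be derived mechanically from a DNA string. The sketch below is illustrative only and assumes an alphabet limited to A, C, G, T and N; the function names are hypothetical. Hybridization under stringent conditions is an experimental property and is not modeled here.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")
_ALLOWED = set("ACGTNacgtn")


def complement(seq: str) -> str:
    """Base-by-base complement, in the same left-to-right order as `seq`."""
    unexpected = set(seq) - _ALLOWED
    if unexpected:
        raise ValueError(f"unsupported characters: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```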
[0020] Aspect 4. Use of the nucleic acid regulatory element according to any one of aspects 1 to 3 in a nucleic acid expression cassette, or a vector, more particularly for enhancing gene expression in endothelial cells of said nucleic acid expression cassette or vector.
[0021] Aspect 5. A nucleic acid expression cassette comprising at least one, such as one, two, three, four, five or more, nucleic acid regulatory elements according to any one of aspects 1 to 3, operably linked to a promoter.
[0022] Aspect 6. The nucleic acid expression cassette according to aspect 5, wherein the nucleic acid regulatory element is operably linked to a promoter and a transgene.
[0023] Aspect 7. The nucleic acid expression cassette according to any one of aspects 5 or 6, wherein the promoter is an endothelial cell-specific promoter, such as the promoter of any one of the genes selected from the group comprising: IFI27, ICAM2, VWF, EDN1, ENG, ECSCR, CDH5 (vascular endothelial cadherin promoter, cadherin 5 type 2), PECAM1, HHIP, TIE1 and HYAL2.
[0024] Aspect 8. The nucleic acid expression cassette according to any one of aspects 5 to 7, wherein the transgene encodes a therapeutic protein or an immunogenic protein.
[0025] Aspect 9. The nucleic acid expression cassette according to any one of aspects 5 to 8, wherein the transgene encodes a secretable protein or a structural protein.
[0026] Aspect 10. The nucleic acid expression cassette according to aspect 8 or 9, wherein said transgene is selected from the group comprising: hepatocyte growth factor (HGF), coagulation factor VIII (FVIII), coagulation factor VII (FVII), tissue factor (TF), tissue factor pathway inhibitor (TFPI), coagulation factor IX (FIX), coagulation factor XI (FXI), von Willebrand factor (vWF), ADAMTS13, VEGF, PLGF, FGF, sFLT1, α1-antitrypsin (AAT), apolipoprotein A-I (apoA-I), matrix metalloproteinases including but not limited to matrix metalloproteinase-3 (TIMP-3), nitric oxide synthase (NOS), antibodies, growth factors, cytokines, chemokines and antibodies, including but not limited to antibodies directed against any one of said transgenes, factors and their cognate receptors or against any secreted protein or viral protein, small interfering RNA, guide RNA, endonuclease, and Cas9.
[0027] Aspect 11. The nucleic acid expression cassette according to any one of aspects 5 to 10, further comprising a polyadenylation signal, preferably the Simian Virus 40 (SV40) polyadenylation signal, a synthetic polyadenylation signal or a bovine growth hormone polyadenylation signal.
[0028] Aspect 12. A vector comprising the nucleic acid regulatory element according to any one of aspects 1 to 3, or the nucleic acid expression cassette according to any one of aspects 5 to 11.
[0029] Aspect 13. The vector according to aspect 12, which is a viral vector, preferably a lentiviral vector (LV), an adeno-associated viral (AAV) vector, or an adenoviral vector (AV). In specific examples of said aspect, the vector is a self- or non-self inactivating lentiviral vector, preferably a self inactivating lentiviral vector.
[0030] In specific examples the LV has the following components: EC-CRE-PM-TG, with EC-CRE being one of the newly identified regulatory sequences as defined in SEQ ID Nos 1-33; PM being an endothelial cell-specific promoter such as, but not limited to, those referred to in aspect 7, and TG being a transgene such as, but not limited to, the transgenes identified herein and in particular those defined in aspect 10.
[0031] Taking the example with the endothelial cell-specific promoter ICAM2, and using the self-inactivating lentiviral vector backbone pCDH, the vector can be:
[0032] pCDH-CDH5-EC-CRE1a-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0033] pCDH-CDH5-EC-CRE1b-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0034] pCDH-CDH5-EC-CRE1c-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0035] pCDH-CDH5-EC-CRE1d-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0036] pCDH-CDH5-EC-CRE1e-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0037] pCDH-HYAL2-EC-CRE1f-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0038] pCDH-ECSCR-EC-CRE1a-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0039] pCDH-ECSCR-EC-CRE1b-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0040] pCDH-EDN1-EC-CRE1a-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0041] pCDH-ENG-EC-CRE1a-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0042] pCDH-ENG-EC-CRE1b-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0043] pCDH-ENG-EC-CRE1c-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0044] pCDH-HHIP-EC-CRE1a-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0045] pCDH-HHIP-EC-CRE1b-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0046] pCDH-HYAL2-EC-CRE1a-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0047] pCDH-HYAL2-EC-CRE1b-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0048] pCDH-HYAL2-EC-CRE1c-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0049] pCDH-ICAM2-EC-CRE1a-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0050] pCDH-ICAM2-EC-CRE1b-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0051] pCDH-ICAM2-EC-CRE1c-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0052] pCDH-IFI27-EC-CRE1a-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0053] pCDH-IFI27-EC-CRE1b-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0054] pCDH-PECAM1-EC-CRE1a-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0055] pCDH-TIE1-EC-CRE1a-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0056] pCDH-TIE1-EC-CRE1b-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10,
[0057] pCDH-VWF-EC-CRE1a-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10, or
[0058] pCDH-VWF-EC-CRE1b-ICAM2-TG, such as wherein the TG is any one of those identified in aspect 10.
[0059] In any one of said LV constructs, the ICAM2 promoter can be replaced by another endothelial cell-specific promoter, such as, but not limited to, those exemplified in aspect 7. In any one of said LV constructs, the vector backbone can be exchanged for another suitable backbone, such as those known to the person skilled in the art.
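The construct names listed above follow one pattern: vector backbone, source gene of the regulatory element, EC-CRE identifier, promoter, and transgene placeholder. The sketch below shows how such names could be assembled and how the promoter or backbone could be swapped. It is illustrative only: the `LVConstruct` class and its field names are hypothetical, and the backbone label used in the last line is an invented placeholder.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LVConstruct:
    """Hypothetical description of one lentiviral construct name."""
    cre_gene: str             # gene from which the EC-CRE was derived, e.g. "CDH5"
    cre_variant: str          # element letter, e.g. "a" for EC-CRE1a
    promoter: str = "ICAM2"   # endothelial cell-specific promoter
    backbone: str = "pCDH"    # vector backbone
    transgene: str = "TG"     # placeholder for the transgene

    def name(self) -> str:
        """Assemble the name in the order backbone, CRE, promoter, transgene."""
        return (f"{self.backbone}-{self.cre_gene}-EC-CRE1{self.cre_variant}"
                f"-{self.promoter}-{self.transgene}")


construct = LVConstruct(cre_gene="CDH5", cre_variant="a")
print(construct.name())  # pCDH-CDH5-EC-CRE1a-ICAM2-TG

# Swapping the promoter or the backbone leaves the rest of the design unchanged.
print(replace(construct, promoter="VWF").name())
print(replace(construct, backbone="otherBackbone").name())  # invented label
```

Keeping the record immutable and deriving variants with `replace` mirrors the modular layout described in the text, where each component can be exchanged independently.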
[0060] Aspect 14. The vector according to aspect 12, which is a non-viral vector, preferably a plasmid, a minicircle, an episomal vector, or a transposon-based vector, such as a PiggyBac-based vector or a Sleeping Beauty-based vector.
[0061] Aspect 15. A pharmaceutical composition comprising the nucleic acid expression cassette according to any one of aspects 5 to 11, or the vector according to any one of aspects 12 to 14, and a pharmaceutically acceptable carrier.
[0062] Aspect 16. The nucleic acid regulatory element according to any one of aspects 1 to 3, the nucleic acid expression cassette according to any one of aspects 5 to 11, the vector according to any one of aspects 12 to 14, or the pharmaceutical composition according to aspect 15 for use in medicine, more preferably for use in gene therapy, in particular for use in treating endothelial cell dysfunction, preferably such as any one of the diseases or disorders selected from the group comprising: liver diseases, hemophilia A, von Willebrand disease, microvascular thrombosis, thrombotic thrombocytopenic purpura, peripheral vascular disease, coronary artery diseases, atherosclerotic diseases, stroke, heart disease, diabetes, insulin resistance, chronic kidney failure, tumor growth, metastasis, venous thrombosis, ischemia, tumour growth, tumour vascularisation, cancer and viral infectious diseases such as Ebola, Dengue and Dengue hemorrhagic fever.
[0063] Aspect 17. The nucleic acid regulatory element according to any one of aspects 1 to 3, the nucleic acid expression cassette according to any one of aspects 4 to 11, the vector according to any one of aspects 12 to 14, or the pharmaceutical composition according to aspect 15 for use as a vaccine, preferably a prophylactic vaccine, or for use in vaccination therapy, preferably prophylactic vaccination. Alternatively, said nucleic acid regulatory element according to any one of aspects 1 to 3, the nucleic acid expression cassette according to any one of aspects 4 to 11, the vector according to any one of aspects 12 to 14, or the pharmaceutical composition according to aspect 15 can be for use in induction of immunotolerance to the transgene.
[0064] Aspect 18. A method, preferably an in vivo method, for expressing a transgene product in endothelial cells, comprising:
[0065] introducing the nucleic acid expression cassette according to any one of aspects 4 to 11, or the vector according to any one of aspects 12 to 14, comprising the nucleic acid regulatory element according to any one of aspects 1 to 3, into the cells; and expressing the transgene product in the cells.
[0066] Aspect 19. A method, preferably an in vitro or ex vivo method, for expressing a transgene product in endothelial cells, comprising:
[0067] introducing the nucleic acid expression cassette according to any one of aspects 4 to 11, or the vector according to any one of aspects 12 to 14, comprising the nucleic acid regulatory element according to any one of aspects 1 to 3, into the cells; and
[0068] expressing the transgene product in the cells.
[0069] Aspect 20. A method for treating an endothelial cell-related disease or disorder comprising the administration of a therapeutically effective amount of the nucleic acid expression cassette according to any one of aspects 5 to 11, the vector according to any one of aspects 12 to 14, or the pharmaceutical composition according to aspect 15, each comprising the nucleic acid regulatory element according to any one of aspects 1 to 3, to a subject in need thereof.
[0070] Aspect 21. The method according to aspect 20, wherein said endothelial cell-related disease or disorder is selected from the group comprising: endothelial cell dysfunction, preferably such as any one of the diseases or disorders selected from the group comprising: liver diseases, hemophilia A, von Willebrand disease, microvascular thrombosis, thrombotic thrombocytopenic purpura, peripheral vascular disease, coronary artery diseases, atherosclerotic diseases, stroke, heart disease, diabetes, insulin resistance, chronic kidney failure, tumor growth, metastasis, venous thrombosis, ischemia, tumour growth, tumour vascularisation, cancer and viral infectious diseases such as Ebola, Dengue and Dengue hemorrhagic fever.
BRIEF DESCRIPTION OF DRAWINGS
[0071] FIG. 1: The selection strategies of endothelial-specific CREs
[0072] As shown in the experimental section (Example 1), the inventors identified nucleic acid regulatory elements that specifically enhance gene expression in endothelial cells. The endothelial-specific regulatory elements were subsequently validated in in vitro and in vivo assays in mice. The details of the in vitro and in vivo validation are described in Examples 2 to 6 below. The successful use of the endothelial CREs will hence allow for the use of lower and thus safer vector doses, while maximizing therapeutic efficacy.
[0073] FIG. 2: Lentiviral vector design
[0074] Lentiviral vectors were produced as described previously (VandenDriessche et al., Blood 2002). Briefly, the lentiviral vector-containing plasmids were cotransfected with a VSV-G expression plasmid, a gag-pol expression plasmid and a Rev expression plasmid. Lentiviruses were produced by transient co-transfection of HEK293T (293T) cells in Dulbecco modified Eagle medium (Invitrogen) supplemented with 10% heat-inactivated fetal bovine serum (Invitrogen) and 1% penicillin/streptomycin. For transfection of one double-tray culture chamber, the following plasmid amounts were used: 60 μg lentiviral plasmid, 30 μg pRSV-REV, 30 μg pMDLg/pRRE and 30 μg pCMV-VSV-G. The plasmids were pre-complexed with calcium phosphate (Calcium phosphate transfection kit, Invitrogen) for 30 minutes at room temperature. Transfection medium was added to the cells for 16 hours and then replaced by fresh medium containing NU-serum (Invitrogen) and sodium butyrate (Sigma). Viral supernatant was harvested 48 and 72 hours after transfection and concentrated using a Centricon concentrator (Millipore) (2000 rpm for 1 hour at 4°C). Aliquots of viruses were stored at -80°C. The physical titer in nanograms per microliter of all LVs was determined using a p24 colorimetric enzyme-linked immunosorbent assay (ELISA) kit (Cell Biolabs) according to the manufacturer's instructions. This value was then used to calculate an estimated vector titer equivalent in transducing units (TU) per milliliter. Polybrene (8 μg/mL) was added to the concentrated vectors to enhance transduction.
[0075] FIG. 3: Flow cytometry analysis of HUVECs and LSECs transduced with LV CMV-GFP vector (MOI 50) at 72 hr timepoint post transduction.
[0076] FIG. 4: FVIII antigen expression (in ng/ml) in HUVECs and LSECs transduced with different lentiviral vector designs. FVIII antigen expression levels were determined at 24, 48 and 72 hrs after transduction.
[0077] FIG. 5: The FVIII expression levels of each EC-CRE-ICAM2-FVIII construct in HUVECs. Expression is shown as a percentage relative to the ICAM2 construct without EC-CREs. The expression values are represented as mean with standard error of the mean (SEM; n=3 for each group).
[0078] FIG. 6: FVIII expression after lentiviral in vivo transduction. FVIII protein expression was determined using a human FVIII-specific ELISA on the plasma samples collected 5 weeks after lentiviral vector injection. The mouse cohorts (n=4 mice per cohort) include ICAM2-FVIII (no CRE control), HYAL2-EC-CRE1a-ICAM2-FVIII, HYAL2-EC-CRE1b-ICAM2-FVIII and IFI27-EC-CRE1b-ICAM2-FVIII.
[0079] FIG. 7: mRNA analysis of the CD146-positive endothelial cells isolated from liver and spleen. The human FVIII expression encoded by the lentiviral vector was normalized to endogenous mouse GAPDH expression.
[0080] FIG. 8: Plasmid maps of lentiviral vectors
[0081] All endothelial CREs were cloned into a self-inactivating lentiviral vector called pCDH, as shown in the maps. A: pCDH-HYAL2-EC-CRE1a-ICAM2-FVIII (SEQ ID NO. 50); B: pCDH-ICAM2-FVIII (SEQ ID NO. 49); C: pCDH-IFI27-EC-CRE1b-ICAM2-FVIII (SEQ ID NO. 52); D: pCDH-HYAL2-EC-CRE1b-ICAM2-FVIII (SEQ ID NO. 51).
DESCRIPTION
[0082] As used herein, the singular forms "a", "an", and "the" include both singular and plural referents unless the context clearly dictates otherwise.
[0083] The terms "comprising", "comprises" and "comprised of" as used herein are synonymous with "including", "includes" or "containing", "contains", and are inclusive or open-ended and do not exclude additional, non-recited members, elements or method steps. The terms also encompass "consisting of" and "consisting essentially of", which enjoy well-established meanings in patent terminology.
[0084] The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints.
[0085] The terms "about" or "approximately" as used herein when referring to a measurable value such as a parameter, an amount, a temporal duration, and the like, are meant to encompass variations of and from the specified value, such as variations of +/-10% or less, preferably +/-5% or less, more preferably +/-1% or less, and still more preferably +/-0.1% or less of and from the specified value, insofar as such variations are appropriate to perform in the disclosed invention. It is to be understood that the value to which the modifier "about" refers is itself also specifically, and preferably, disclosed.
[0086] Whereas the terms "one or more" or "at least one", such as one or more members or at least one member of a group of members, are clear per se, by means of further exemplification, the terms encompass inter alia a reference to any one of said members, or to any two or more of said members, such as, e.g., any ≥3, ≥4, ≥5, ≥6 or ≥7 etc. of said members, and up to all said members. In another example, "one or more" or "at least one" may refer to 1, 2, 3, 4, 5, 6, 7 or more.
[0087] The discussion of the background to the invention herein is included to explain the context of the invention. This is not to be taken as an admission that any of the material referred to was published, known, or part of the common general knowledge in any country as of the priority date of any of the claims.
[0088] Throughout this disclosure, various publications, patents and published patent specifications are referenced by an identifying citation. All documents cited in the present specification are hereby incorporated by reference in their entirety. In particular, the teachings or sections of such documents herein specifically referred to are incorporated by reference.
[0089] Unless otherwise defined, all terms used in disclosing the invention, including technical and scientific terms, have the meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. By means of further guidance, term definitions are included to better appreciate the teaching of the invention. When specific terms are defined in connection with a particular aspect of the invention or a particular embodiment of the invention, such connotation is meant to apply throughout this specification, i.e., also in the context of other aspects or embodiments of the invention, unless otherwise defined.
[0090] In the following passages, different aspects or embodiments of the invention are defined in more detail. Each aspect or embodiment so defined may be combined with any other aspect(s) or embodiment(s) unless clearly indicated to the contrary. In particular, any feature indicated as being preferred or advantageous may be combined with any other feature or features indicated as being preferred or advantageous.
[0091] Reference throughout this specification to "one embodiment", "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to a person skilled in the art from this disclosure, in one or more embodiments. Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention, and form different embodiments, as would be understood by those in the art. For example, in the appended claims, any of the claimed embodiments can be used in any combination.
[0092] For general methods relating to the invention, reference is made inter alia to well-known textbooks, including, e.g., "Molecular Cloning: A Laboratory Manual, 2nd Ed." (Sambrook et al., 1989), "Current Protocols in Molecular Biology" (Ausubel et al., 1987).
[0093] In one aspect, the invention relates to a nucleic acid regulatory element for enhancing gene expression in endothelial cells or tissue comprising, consisting essentially of (i.e., the regulatory element may for instance additionally comprise sequences used for cloning purposes, but the indicated sequences make up the essential part of the regulatory element, e.g. they do not form part of a larger regulatory region such as a promoter), or consisting of: a sequence selected from the group consisting of: SEQ ID NO:1 to 33, a sequence having at least 80%, preferably at least 85%, more preferably at least 90%, even more preferably at least 95%, such as 95%, 96%, 97%, 98%, or 99%, identity to any of these sequences, or a functional fragment of a sequence selected from the group consisting of: SEQ ID NO:1 to 33.
[0094] Tables 2 and 3 below depict the core nucleotide sequence of the different nucleic acid regulatory elements for enhancing gene expression in endothelial cells or tissue. Table 1 lists the corresponding genes and lengths.
[0095] A `nucleic acid regulatory element`, `cis-acting regulatory element`, `CRE` or `regulatory element` as used herein refers to a transcriptional control element, in particular a non-coding cis-acting transcriptional control element, capable of regulating and/or controlling transcription of a gene, in particular tissue-specific transcription of a gene. Regulatory elements comprise at least one transcription factor binding site (TFBS), more in particular at least one binding site for a tissue-specific transcription factor, most particularly at least one binding site for an endothelial cell-specific transcription factor. Typically, regulatory elements as used herein increase or enhance promoter-driven gene expression when compared to the transcription of the gene from the promoter alone, without the regulatory elements. Thus, regulatory elements particularly comprise enhancer sequences, although it is to be understood that the regulatory elements enhancing transcription are not limited to typical far upstream enhancer sequences, but may occur at any distance from the gene they regulate. Indeed, it is known in the art that sequences regulating transcription may be situated either upstream (e.g. in the promoter region) or downstream (e.g. in the 3'UTR) of the gene they regulate in vivo, and may be located in the immediate vicinity of the gene or further away. Of note, although regulatory elements as disclosed herein typically comprise naturally occurring sequences, combinations of (parts of) such regulatory elements or several copies of a regulatory element, i.e. regulatory elements comprising non-naturally occurring sequences, are themselves also envisaged as regulatory elements. Regulatory elements as used herein may comprise part of a larger sequence involved in transcriptional control, e.g. part of a promoter sequence. However, regulatory elements alone are typically not sufficient to initiate transcription, but require a promoter to this end. 
The regulatory elements disclosed herein are provided as nucleic acid molecules, i.e. isolated nucleic acids, or isolated nucleic acid molecules. Said nucleic acid regulatory elements hence have a sequence which is only a small part of the naturally occurring genomic sequence and hence is not naturally occurring as such, but is isolated therefrom.
[0096] The term "nucleic acid" as used herein typically refers to an oligomer or polymer (preferably a linear polymer) of any length composed essentially of nucleotides. A nucleotide unit commonly includes a heterocyclic base, a sugar group, and at least one, e.g. one, two, or three, phosphate groups, including modified or substituted phosphate groups. Heterocyclic bases may include inter alia purine and pyrimidine bases such as adenine (A), guanine (G), cytosine (C), thymine (T) and uracil (U) which are widespread in naturally-occurring nucleic acids, other naturally-occurring bases (e.g., xanthine, inosine, hypoxanthine) as well as chemically or biochemically modified (e.g., methylated), non-natural or derivatised bases. Sugar groups may include inter alia pentose (pentofuranose) groups such as preferably ribose and/or 2-deoxyribose common in naturally-occurring nucleic acids, or arabinose, 2-deoxyarabinose, threose or hexose sugar groups, as well as modified or substituted sugar groups. Nucleic acids as intended herein may include naturally occurring nucleotides, modified nucleotides or mixtures thereof. A modified nucleotide may include a modified heterocyclic base, a modified sugar moiety, a modified phosphate group or a combination thereof. Modifications of phosphate groups or sugars may be introduced to improve stability, resistance to enzymatic degradation, or some other useful property. The term "nucleic acid" further preferably encompasses DNA, RNA and DNA/RNA hybrid molecules, specifically including hnRNA, pre-mRNA, mRNA, cDNA, genomic DNA, amplification products, oligonucleotides, and synthetic (e.g., chemically synthesised) DNA, RNA or DNA/RNA hybrids. A nucleic acid can be naturally occurring, e.g., present in or isolated from nature; or can be non-naturally occurring, e.g., recombinant, i.e., produced by recombinant DNA technology, and/or partly or entirely, chemically or biochemically synthesised. 
A "nucleic acid" can be double-stranded, partly double stranded, or single-stranded. Where single-stranded, the nucleic acid can be the sense strand or the antisense strand. In addition, nucleic acid can be circular or linear.
[0097] As used herein "transcription factor binding site", "transcription factor binding sequence" or "TFBS" refers to a sequence of a nucleic acid region to which transcription factors bind. Non-limiting examples of TFBS include binding sites for or such as: POLR2A, Pol2(b), GATA2, GATA-2, FOS, c-Fos, NR3C1, Freac-2, MAX, FOX2P, MYC, SRY, SOX9, SRF, JUN, TEAD4, EZH2, TBP, TCF7L2, SPI1, RELA, JUND, MXI1, JUNB, BHLHE40, RCOR1, TCF12, TAL1, EP300, HDAC2, GTF2F1, SIN3AK20, FOSL2, ETS1, CTBP2, GATA3, CEBPB, FOXA1, YY1, RFX5, TAF1, REST, ELF1, CTCF, SMC3, FOXP2, RUNX3, NRF1, HDAC6, IRF4, PAX5, RAD21, WRNIP1, ERalpha_a, PU.1, TCF4, TAL1, HDAC2, GATA3, Mxi1, GTF2F1, ELF1, NRSF, CTCF, SMC3, Ini1, IRF4, PAX5, CTCF, Pol2-4H8, YY1, CTCF, FOXO1, FOXJ2, GATA-X, Gfi-1, Hand1/E47, MAZ, USF1, REST, TFAP2A, TFAP2C, CHD2, ZNF274, BACH1, EBF, EBF1, ATF2:c-Jun, CREB1, ATF, Tax/CREB, CREB1, EGR1, NF-kappaB, c-Rel, Pax-3, FOXO4, SOX5, GR, ZNF263, Lmo2 complex, AP-4, HEN1, E2F6, PML, TRIM28, SMARCA4, RBBP5, NRF2F, TBL1XR1, STAT5A, MAFF, REST, JUND, IRF1, MAFK, eGFP-JunDATF1, ARID3A, ATF3, E2F6, GATA2, GATA-1, Brg1, TAL1, JunB, NR2F2, HDAC8, BCL3, ATF2, CBX3, HNF4, FOXA2, KAP1, UBTF, GABP, GABPA, BCLAF1, SP1, FOXM1, MEF2A, ZNF143, ZBTB7A, NANOG, CTCFL, NFKB, CCNT2, EBF1, FOXA1, Max, c-Myc, STAT1, STAT2, MZF1, SMARCC1, E2F4, FOSL1, STAT3, P300, AP2gamma, MafF1, JunD, AP2alpha, FOXA2, HMGN3, ZBTB33, P300, Nkx2-2, Nkx2-5, SRF, YY1, HTF, CHX10, HNF1, OCT, Ncx, AP-2rep, Lmo2 complex, SOX5, GATA-1, CDP CR1, Cart-1, NFIV, RXRA, SREBP1, MYBL2, HNF4G, HNF4A, HEY1, ZEB1, PHF8, CHD1, PU-1, RSRFC4, MEF-2, and/or Lyf-1. The nucleic acid regulatory elements described herein can comprise any one or more of said TFBS, or combinations thereof. Transcription factor binding sites may be found in databases such as Transfac®.
[0098] Sequences disclosed herein may be part of sequences of regulatory elements capable of controlling transcription of endothelial cell-specific genes in vivo. Particular examples of endothelial-specific regulatory elements are those controlling the following genes: IFI27, ICAM2, VWF, EDN1, ENG, ECSCR, CDH5, PECAM1, HHIP, TIE1 or HYAL2.
[0099] Accordingly, in embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence from CDH5 regulatory elements, i.e. regulatory elements that control expression of the CDH5 gene (Cadherin 5 or VE-cadherin gene) in vivo, e.g. regulatory elements comprising any one or more of SEQ ID NOs: 1 to 6, or functional fragments thereof. In other embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence from ECSCR regulatory elements, i.e. regulatory elements that control expression of the ECSCR gene (Endothelial Cell-Specific Chemotaxis Regulator gene) in vivo, e.g. regulatory elements comprising any one or more of SEQ ID NOs: 7 or 8, or functional fragments thereof. In other embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence from EDN1 regulatory elements, i.e. regulatory elements that control expression of the EDN1 gene (Endothelin 1 gene) in vivo, e.g. regulatory elements comprising SEQ ID NO: 9, or functional fragments thereof. In other embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence from ENG regulatory elements, i.e. regulatory elements that control expression of the ENG gene (Endoglin gene) in vivo, e.g. regulatory elements comprising any one or more of SEQ ID NOs: 10 to 12, or functional fragments thereof. In other embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence from HHIP regulatory elements, i.e. regulatory elements that control expression of the HHIP gene (Hedgehog Interacting Protein gene) in vivo, e.g. regulatory elements comprising any one or more of SEQ ID NOs: 13 or 14, or functional fragments thereof. In other embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence from HYAL2 regulatory elements, i.e. regulatory elements that control expression of the HYAL2 gene (Hyaluronoglucosaminidase 2 gene) in vivo, e.g. 
regulatory elements comprising any one or more of SEQ ID NOs: 15 to 17, or functional fragments thereof. In other embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence from ICAM2 regulatory elements, i.e. regulatory elements that control expression of the ICAM2 gene (Intercellular Adhesion Molecule 2 gene) in vivo, e.g. regulatory elements comprising any one or more of SEQ ID NOs: 18 to 20, or functional fragments thereof. In other embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence from IFI27 regulatory elements, i.e. regulatory elements that control expression of the IFI27 gene (Interferon, Alpha-Inducible Protein 27 gene) in vivo, e.g. regulatory elements comprising any one or more of SEQ ID NOs: 21 to 23, or functional fragments thereof. In other embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence from PECAM1 regulatory elements, i.e. regulatory elements that control expression of the PECAM1 gene (Platelet/Endothelial Cell Adhesion Molecule 1 gene) in vivo, e.g. regulatory elements comprising SEQ ID NO: 24, or functional fragments thereof. In other embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence from TIE1 regulatory elements, i.e. regulatory elements that control expression of the TIE1 gene (Tyrosine Kinase With Immunoglobulin-Like And EGF-Like Domains 1 gene) in vivo, e.g. regulatory elements comprising any one or more of SEQ ID NOs: 25 or 26, or functional fragments thereof. In other embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence from VWF regulatory elements, i.e. regulatory elements that control expression of the VWF gene (Von Willebrand Factor gene) in vivo, e.g. regulatory elements comprising any one or more of SEQ ID NOs: 27 or 28, or functional fragments thereof. 
In other embodiments, the nucleic acid regulatory elements disclosed herein comprise a sequence comprising any one or more of SEQ ID NOs: 29 to 33, or functional fragments thereof.
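For orientation, the gene-to-sequence assignments recited in the preceding paragraph can be collected in a small lookup table. The following Python sketch is illustrative only; the names `SEQ_ID_RANGES` and `gene_for_seq_id` are hypothetical, and SEQ ID NOs: 29 to 33 are returned without a gene because the text above does not assign them to one.

```python
# Gene-to-SEQ ID NO assignments as recited in the description above.
# Each range is half-open, so range(1, 7) covers SEQ ID NOs: 1 to 6.
SEQ_ID_RANGES = {
    "CDH5": range(1, 7),
    "ECSCR": range(7, 9),
    "EDN1": range(9, 10),
    "ENG": range(10, 13),
    "HHIP": range(13, 15),
    "HYAL2": range(15, 18),
    "ICAM2": range(18, 21),
    "IFI27": range(21, 24),
    "PECAM1": range(24, 25),
    "TIE1": range(25, 27),
    "VWF": range(27, 29),
}


def gene_for_seq_id(seq_id):
    """Return the gene associated with a SEQ ID NO, or None for 29 to 33."""
    if not 1 <= seq_id <= 33:
        raise ValueError("SEQ ID NO must be between 1 and 33")
    for gene, ids in SEQ_ID_RANGES.items():
        if seq_id in ids:
            return gene
    return None  # SEQ ID NOs: 29 to 33 have no gene assigned in the text
```

For example, `gene_for_seq_id(16)` yields `"HYAL2"`, while `gene_for_seq_id(30)` yields `None`.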
[0100] As used herein, the terms "identity" and "identical" and the like refer to the sequence similarity between two polymeric molecules, e.g., between two nucleic acid molecules, e.g., two DNA molecules. Sequence alignments and determination of sequence identity can be done, e.g., using the Basic Local Alignment Search Tool (BLAST) originally described by Altschul et al. 1990 (J Mol Biol 215: 403-10), such as the "Blast 2 sequences" algorithm described by Tatusova and Madden 1999 (FEMS Microbiol Lett 174: 247-250). Typically, the percentage sequence identity is calculated over the entire length of the sequence. As used herein, the term "substantially identical" denotes at least 90%, preferably at least 95%, such as 95%, 96%, 97%, 98% or 99%, sequence identity.
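By way of illustration only, the percentage identity referred to above can be computed from an existing pairwise alignment as in the following Python sketch. It does not perform the alignment itself (which, as stated above, may be done with BLAST); the function names are hypothetical, and the choice of the ungapped reference length as the denominator is an assumption reflecting calculation over the entire length of the sequence.

```python
def percent_identity(aligned_ref, aligned_query):
    """Percent identity of a query to a reference, given a pairwise alignment.

    Both arguments are the two rows of one alignment: equal length, with '-'
    marking a gap. The denominator is the ungapped reference length (an
    assumption, see the accompanying text).
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned rows must have equal length")
    ref_len = sum(1 for base in aligned_ref if base != "-")
    if ref_len == 0:
        raise ValueError("reference row contains no nucleotides")
    matches = sum(
        1
        for r, q in zip(aligned_ref.upper(), aligned_query.upper())
        if r != "-" and r == q
    )
    return 100.0 * matches / ref_len


def meets_identity_threshold(aligned_ref, aligned_query, threshold=80.0):
    """True if the identity is at least `threshold` percent (default 80%)."""
    return percent_identity(aligned_ref, aligned_query) >= threshold
```

With the toy rows `"ACGTACGTAC"` and `"ACGTACGTAA"` (not sequences of the invention), nine of ten reference positions match, so the identity is 90% and the 80% threshold is met.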
[0101] The term `functional fragment` as used in the application refers to fragments of the regulatory element sequences disclosed herein that retain the capability of regulating endothelial cell-specific expression, i.e. they can still confer tissue specificity and they are capable of regulating expression of a (trans)gene in the same way (although possibly not to the same extent) as the sequence from which they are derived. Functional fragments may preferably comprise at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 120, at least 150, at least 200, at least 250, at least 300, at least 350, or at least 400 contiguous nucleotides from the sequence from which they are derived. Also preferably, functional fragments may comprise at least 1, more preferably at least 2, at least 3, or at least 4, even more preferably at least 5, at least 10, or at least 15, of the transcription factor binding sites (TFBS) that are present in the sequence from which they are derived.
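The structural part of this definition (a minimum number of contiguous nucleotides taken from the parent sequence) can be checked computationally, as in the following Python sketch. The function name is hypothetical, and the check is necessary but not sufficient: whether a fragment retains regulatory activity has to be established experimentally.

```python
def is_candidate_fragment(fragment, parent, min_length=20):
    """Check the structural criteria for a fragment of a regulatory element.

    Returns True if `fragment` is a contiguous stretch of `parent` and has
    at least `min_length` nucleotides. Functional activity is NOT assessed.
    """
    fragment = fragment.upper()
    parent = parent.upper()
    return len(fragment) >= min_length and fragment in parent
```

For instance, with a 40-nucleotide toy parent sequence, a 25-nucleotide slice of it passes, whereas a 15-nucleotide slice, or a 25-nucleotide sequence not found in the parent, does not.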
[0102] "Endothelial cell-specific expression" as used in the application refers to the preferential or predominant expression of a (trans)gene (as RNA and/or polypeptide) in endothelial cells and tissue comprising or built from endothelial cells, as compared to other (i.e. non-endothelial) cells or tissues. According to particular embodiments, at least 50%, more particularly at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% of the (trans)gene expression occurs within endothelial cells or endothelial tissue.
[0103] The term "endothelial cell" as used herein encompasses all endothelial cell types, such as the cells forming a single cell layer that lines all blood vessels and regulates exchanges between the bloodstream and the surrounding tissues. Many endothelial cell types exist and their phenotypes vary between different organs, between different segments of the vascular loop within the same organ, and between neighbouring endothelial cells of the same organ and blood vessel type. Non-limiting examples of such endothelial cells are: liver sinusoidal endothelial cells (LSEC), (micro)vascular endothelial cells from e.g. lung, heart, intestine, skin, retina, arterial endothelial cells, such as endothelial cells from pulmonary artery, the aorta, umbilical artery and umbilical vein, extrahepatic endothelial cells from certain vascular beds, blood-brain barrier ECs, bone marrow ECs, and high endothelial venule cells (HEVs).
[0104] According to a particular embodiment, endothelial cell specific expression entails that there is less than 10%, less than 5%, less than 2% or even less than 1% `leakage` of expressed gene product to other organs or tissue than those comprising or built by endothelial cells, such as muscle, heart, lung, liver, brain, kidney and/or spleen.
[0105] The same applies mutatis mutandis for endothelial progenitor cell (EPC)-specific expression, which may be considered as a particular form of endothelial cell-specific expression. Hence, throughout the application, where endothelial cell-specific is mentioned in the context of expression, endothelial progenitor cell (EPC)-specific expression is also explicitly envisaged.
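The percentage criteria given above for endothelial cell-specific expression and for leakage amount to simple arithmetic on per-tissue expression measurements. The following Python sketch is a non-authoritative illustration; the function names, the tissue labels and the example values are hypothetical, and the two thresholds stem from separate embodiments, so they are left as adjustable parameters.

```python
def expression_fractions(expression_by_tissue, endothelial_tissues):
    """Return (endothelial fraction, leakage fraction) of total expression."""
    total = sum(expression_by_tissue.values())
    if total <= 0:
        raise ValueError("total expression must be positive")
    endothelial = sum(
        level
        for tissue, level in expression_by_tissue.items()
        if tissue in endothelial_tissues
    )
    return endothelial / total, (total - endothelial) / total


def is_endothelial_specific(expression_by_tissue, endothelial_tissues,
                            min_fraction=0.50, max_leakage=0.10):
    """Apply 'at least 50% endothelial' and 'less than 10% leakage'."""
    fraction, leakage = expression_fractions(
        expression_by_tissue, endothelial_tissues
    )
    return fraction >= min_fraction and leakage < max_leakage


# Hypothetical measurements in arbitrary units, for illustration only.
measured = {
    "liver_endothelium": 80.0,
    "spleen_endothelium": 15.0,
    "muscle": 3.0,
    "kidney": 2.0,
}
endothelial = {"liver_endothelium", "spleen_endothelium"}
specific = is_endothelial_specific(measured, endothelial)
```

In this made-up example, 95 of 100 units of expression are endothelial and 5 units are leakage, so both default criteria are met.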
[0106] In embodiments, the invention relates to a nucleic acid regulatory element for enhancing gene expression in endothelial cells or tissue derived therefrom comprising, consisting essentially of, or consisting of a sequence selected from the group consisting of: SEQ ID NO:1 to 33; a sequence having at least 80%, preferably at least 85%, more preferably at least 90%, even more preferably at least 95%, such as 95%, 96%, 97%, 98%, or 99%, identity to any of the sequences selected from the group consisting of SEQ ID NO: 1 to 33; or a functional fragment thereof, wherein said functional fragment comprises at least 20, preferably at least 25, more preferably at least 50, at least 100, at least 200 or at least 250, contiguous nucleotides from the sequence from which it is derived, and wherein said functional fragment comprises at least 1, preferably at least 2, 3, 4, or 5, more preferably at least 10 or at least 15 transcription factor binding sites (TFBS) such as those TFBS that are present in the sequence from which it is derived.
[0107] It is also possible to make nucleic acid regulatory elements that comprise an artificial sequence by combining two or more identical or different sequences disclosed herein or functional fragments thereof. Accordingly, in certain embodiments a nucleic acid regulatory element for enhancing gene expression in endothelial cells is provided comprising at least two sequences selected from the group consisting of: SEQ ID NO:1-33.
[0108] For example, disclosed herein is a nucleic acid regulatory element comprising, consisting essentially of, or consisting of 2, 3, 4, or 5 repeats, e.g. tandem repeats, of any one of SEQ ID NOs:1 to 33, or combinations thereof.
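As a purely illustrative Python sketch, artificial elements of this kind can be assembled by string concatenation. The function names are hypothetical, the toy sequences are not sequences of the invention, and the sketch assumes that the units are joined directly, without any linker or cloning sequence between them.

```python
def tandem_repeat(element, copies):
    """Return 2 to 5 head-to-tail copies of one regulatory element sequence."""
    if not 2 <= copies <= 5:
        raise ValueError("copies must be 2, 3, 4 or 5")
    return element * copies


def combine_elements(elements):
    """Concatenate two or more identical or different element sequences."""
    if len(elements) < 2:
        raise ValueError("at least two sequences are required")
    return "".join(elements)
```

For example, `tandem_repeat("ACGT", 3)` gives `"ACGTACGTACGT"`, and `combine_elements(["AAA", "CCC"])` gives `"AAACCC"`.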
[0109] Particular examples of nucleic acid regulatory elements that comprise an artificial sequence include the regulatory elements that are obtained by rearranging the transcription factor binding sites (TFBS) that are present in the sequences disclosed herein. Said rearrangement may encompass changing the order of the TFBSs and/or changing the position of one or more TFBSs relative to the other TFBSs and/or changing the copy number of one or more of the TFBSs. For example, also disclosed herein is a nucleic acid regulatory element for enhancing endothelial cell-specific gene expression comprising binding sites for e.g. Sp1, EGR-1, ETS and GATA. Further for example, also disclosed herein is a nucleic acid regulatory element for enhancing endothelial cell-specific gene expression, in particular comprising binding sites for one or more of: Sp1, EGR-1, ETS and GATA and combinations thereof. In some embodiments, these nucleic acid regulatory elements comprise at least two, such as 2, 3, 4, or more copies of any one or more of the recited TFBSs.
[0110] In some embodiments, the vector used is a lentiviral vector. In other embodiments, the vector used is an adeno-associated viral vector. In yet other embodiments, the vector used is an adenoviral vector. In case a lentiviral vector is used, it can be a self-inactivating or a non-self-inactivating lentiviral vector. A self-inactivating lentiviral vector is sometimes preferred for clinical use since it is considered safer.
[0111] In case the regulatory element is provided as a single stranded nucleic acid, e.g. when using a single-stranded AAV vector, the complement strand is considered equivalent to the disclosed sequences. Hence, also disclosed herein is a nucleic acid regulatory element for enhancing endothelial cell-specific gene expression comprising, consisting essentially of, or consisting of the complement of a sequence described herein, in particular a sequence selected from the group consisting of: SEQ ID NOs:1 to 33; a sequence having at least 80%, preferably at least 85%, more preferably at least 90%, even more preferably at least 95%, such as 95%, 96%, 97%, 98%, or 99%, identity to any of these sequences; or a functional fragment thereof as defined herein.
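For illustration, the complement of a disclosed sequence can be derived as in the following Python sketch. The function name is hypothetical; the sketch assumes a plain DNA sequence of A, C, G and T and returns the complementary strand written in the conventional 5' to 3' direction, i.e. the reverse complement.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the complementary strand of a DNA sequence, read 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]
```

For example, `reverse_complement("ATGC")` gives `"GCAT"`, and applying the function twice returns the original sequence.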
[0112] Also disclosed herein is a nucleic acid regulatory element for enhancing endothelial cell-specific gene expression hybridizing under stringent conditions to a nucleic acid regulatory element described herein, in particular to the nucleic acid regulatory element comprising, consisting essentially of, or consisting of a sequence selected from the group consisting of: SEQ ID NOs:1 to 33; a sequence having at least 90%, preferably at least 95%, such as 95%, 96%, 97%, 98%, or 99%, identity to any of these sequences; a functional fragment thereof as defined herein; or to its complement. Said nucleic acid regulatory elements do not need to be of equal length as the sequence they hybridize to. In preferred embodiments, the size of said hybridizing nucleic acid regulatory element does not differ more than 25% in length, in particular 20% in length, more in particular 15% in length, most in particular 10% in length from the sequence it hybridizes to.
[0113] The expression `hybridize under stringent conditions` refers to the ability of a nucleic acid molecule to hybridize to a target nucleic acid molecule under defined conditions of temperature and salt concentration. Typically, stringent hybridization conditions are no more than 25°C to 30°C (for example, 20°C, 15°C, 10°C or 5°C) below the melting temperature (Tm) of the native duplex. Methods of calculating Tm are well known in the art. By way of non-limiting example, representative salt and temperature conditions for achieving stringent hybridization are: 1× SSC, 0.5% SDS at 65°C. The abbreviation SSC refers to a buffer used in nucleic acid hybridization solutions. One liter of the 20× (twenty times concentrate) stock SSC buffer solution (pH 7.0) contains 175.3 g sodium chloride and 88.2 g sodium citrate. A representative time period for achieving hybridization is 12 hours.
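The salt concentrations implied by the SSC recipe above can be worked out with the following Python sketch, given by way of illustration only. The function name is hypothetical, and the molar masses are assumptions not stated in the text: 58.44 g/mol for sodium chloride and 294.10 g/mol for sodium citrate, the latter taking the salt to be trisodium citrate dihydrate.

```python
# Masses per liter of 20x SSC stock, taken from the description above.
NACL_GRAMS_PER_LITER_20X = 175.3
CITRATE_GRAMS_PER_LITER_20X = 88.2

# Molar masses (assumptions: NaCl; trisodium citrate dihydrate).
NACL_GRAMS_PER_MOL = 58.44
CITRATE_GRAMS_PER_MOL = 294.10


def ssc_molarity(fold):
    """Return (NaCl mol/L, citrate mol/L) for an SSC buffer of given fold.

    `fold` is the concentration relative to 1x, e.g. 1.0 for 1x SSC or
    20.0 for the stock solution.
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    scale = fold / 20.0  # dilution relative to the 20x stock
    nacl = NACL_GRAMS_PER_LITER_20X / NACL_GRAMS_PER_MOL * scale
    citrate = CITRATE_GRAMS_PER_LITER_20X / CITRATE_GRAMS_PER_MOL * scale
    return nacl, citrate
```

Under these assumptions the 20× stock is approximately 3 M sodium chloride and 0.3 M sodium citrate, so the 1× SSC used in the representative stringent conditions is approximately 0.15 M sodium chloride and 0.015 M sodium citrate.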
[0114] Preferably the regulatory elements as described herein are fully functional while being only of limited length. This allows their use in vectors or nucleic acid expression cassettes without unduly restricting their payload capacity. Accordingly, in embodiments, the regulatory element disclosed herein is a nucleic acid of 1500 nucleotides or less, 1000 nucleotides or less, 900 nucleotides or less, 800 nucleotides or less, 700 nucleotides or less, more preferably 610 nucleotides or less, such as 550 nucleotides or less, 500 nucleotides or less, 450 nucleotides or less, 400 nucleotides or less, 350 nucleotides or less, or 300 nucleotides or less (i.e. the nucleic acid regulatory element has a maximal length of 1500 nucleotides, 1000 nucleotides, 900 nucleotides, 800 nucleotides, 700 nucleotides, preferably 610 nucleotides, such as 550 nucleotides, 500 nucleotides, 450 nucleotides, 400 nucleotides, 350 nucleotides, or 300 nucleotides).
[0115] However, it is to be understood that the disclosed nucleic acid regulatory elements retain regulatory activity (i.e. with regard to specificity and/or activity of transcription) and thus they particularly have a minimum length of 20 nucleotides, 25 nucleotides, 30 nucleotides, 35 nucleotides, 40 nucleotides, 45 nucleotides, 50 nucleotides, 100 nucleotides, 150 nucleotides, 200 nucleotides, 250 nucleotides, 300 nucleotides, 350 nucleotides or 400 nucleotides.
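Purely as an illustration, the maximal and minimal lengths listed in the two preceding paragraphs can be captured in a simple Python check. The helper names and the default bounds are hypothetical choices for this sketch; any of the listed values could be substituted.

```python
# Illustrative sketch; the tuples repeat the lengths listed in the text.
MAX_LENGTHS = (1500, 1000, 900, 800, 700, 610, 550, 500, 450, 400, 350, 300)
MIN_LENGTHS = (20, 25, 30, 35, 40, 45, 50, 100, 150, 200, 250, 300, 350, 400)


def within_length_window(length: int, min_len: int = 20, max_len: int = 1000) -> bool:
    """True if the element length lies between the chosen minimum and maximum."""
    return min_len <= length <= max_len


def tightest_max_length(length: int):
    """Return the smallest listed maximal length that accommodates the element."""
    fitting = [m for m in MAX_LENGTHS if length <= m]
    return min(fitting) if fitting else None
```

For instance, a 480-nucleotide element satisfies the "500 nucleotides or less" embodiment, while an element longer than 1500 nucleotides satisfies none of the listed maxima.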
[0116] In certain embodiments, the invention provides for a nucleic acid regulatory element of 1000 nucleotides or less, preferably 900 nucleotides or less, preferably 800 nucleotides or less, preferably 700 nucleotides or less of a sequence selected from the group consisting of: SEQ ID NOs:1 to 33; a sequence having at least 80%, preferably at least 85%, more preferably at least 90%, even more preferably at least 95%, such as 95%, 96%, 97%, 98%, or 99%, identity to any of said sequences; or a functional fragment thereof as defined herein.
[0117] The nucleic acid regulatory elements disclosed herein may be used in a nucleic acid expression cassette. Accordingly, in an aspect the invention provides for the use of the nucleic acid regulatory elements as described herein in a nucleic acid expression cassette.
[0118] In an aspect the invention provides a nucleic acid expression cassette comprising a nucleic acid regulatory element as described herein, operably linked to a promoter. In embodiments, the nucleic acid expression cassette does not contain a transgene. Such nucleic acid expression cassette may be used to drive expression of an endogenous gene. In preferred embodiments, the nucleic acid expression cassette comprises a nucleic acid regulatory element as described herein, operably linked to a promoter and a transgene.
[0119] As used herein, the term `nucleic acid expression cassette` refers to nucleic acid molecules that include one or more transcriptional control elements (such as, but not limited to promoters, enhancers and/or regulatory elements, polyadenylation sequences, and introns) that direct (trans)gene expression in one or more desired cell types, tissues or organs. Typically, they will also contain a transgene, although it is also envisaged that a nucleic acid expression cassette directs expression of an endogenous gene in a cell into which the nucleic acid cassette is inserted.
[0120] The term `operably linked` as used herein refers to the arrangement of various nucleic acid molecule elements relative to each other such that the elements are functionally connected and are able to interact with each other. Such elements may include, without limitation, a promoter, an enhancer and/or a regulatory element, a polyadenylation sequence, one or more introns and/or exons, and a coding sequence of a gene of interest to be expressed (i.e., the transgene). The nucleic acid sequence elements, when properly oriented or operably linked, act together to modulate the activity of one another, and ultimately may affect the level of expression of the transgene. By "modulate" is meant increasing, decreasing, or maintaining the level of activity of a particular element. The position of each element relative to other elements may be expressed in terms of the 5' terminus and the 3' terminus of each element, and the distance between any particular elements may be referenced by the number of intervening nucleotides, or base pairs, between the elements. As understood by the skilled person, operably linked implies functional activity, and is not necessarily related to a natural positional link. Indeed, when used in nucleic acid expression cassettes, the regulatory elements will typically be located immediately upstream of the promoter (although this is generally the case, it should definitely not be interpreted as a limitation or exclusion of positions within the nucleic acid expression cassette), but this need not be the case in vivo. E.g., a regulatory element sequence naturally occurring downstream of a gene whose transcription it affects is able to function in the same way when located upstream of the promoter. Hence, according to a specific embodiment, the regulatory or enhancing effect of the regulatory element is position-independent.
[0121] In particular embodiments, the nucleic acid expression cassette comprises one nucleic acid regulatory element as described herein. In alternative embodiments, the nucleic acid expression cassette comprises two or more, such as, e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10, nucleic acid regulatory elements as described herein, i.e. they are combined modularly to enhance their regulatory (and/or enhancing) effect. In further embodiments, at least two of the two or more nucleic acid regulatory elements are identical or substantially identical. In yet further embodiments, all of the two or more regulatory elements are identical or substantially identical. The copies of the identical or substantially identical nucleic acid regulatory elements may be provided as tandem repeats in the nucleic acid expression cassette. In alternative further embodiments, at least two of the two or more nucleic acid regulatory elements are different from each other, that is to say, are defined by a different SEQ ID NO:. The nucleic acid expression cassette may also comprise a combination of identical and substantially identical nucleic acid regulatory elements and non-identical nucleic acid regulatory elements.
[0122] For example, the nucleic acid expression cassette may comprise a nucleic acid regulatory element comprising SEQ ID NO:1, and a nucleic acid regulatory element comprising any one or more of SEQ ID NOs:2 to 33. The same applies to each of the remaining regulatory elements defined by SEQ ID NOs:2 to 33, which can be combined with any one or more of the other regulatory elements.
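The modular combinations described above can be enumerated with a brief Python sketch. Only sequence identifiers are handled, not the sequences themselves, and the helper names are hypothetical; the sketch is restricted to pairs of different elements and to tandem repeats of a single element, which are two of the arrangements mentioned.

```python
# Illustrative sketch; works on SEQ ID labels only, not on actual sequences.
from itertools import combinations

SEQ_IDS = [f"SEQ ID NO:{i}" for i in range(1, 34)]  # SEQ ID NOs:1 to 33


def element_pairs(ids):
    """All unordered pairs of two different regulatory elements."""
    return list(combinations(ids, 2))


def tandem_repeat(label: str, copies: int):
    """A list representing `copies` identical elements arranged in tandem."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    return [label] * copies


pairs = element_pairs(SEQ_IDS)
pairs_with_first = [p for p in pairs if "SEQ ID NO:1" in p]
triple = tandem_repeat("SEQ ID NO:1", 3)
```

With 33 elements there are 528 distinct pairs, 32 of which include SEQ ID NO:1; larger combinations follow in the same way by changing the group size passed to `combinations`.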
[0123] As used in the application, the term `promoter` refers to nucleic acid sequences that regulate, either directly or indirectly, the transcription of corresponding nucleic acid coding sequences to which they are operably linked (e.g. a transgene or endogenous gene). A promoter may function alone to regulate transcription or may act in concert with one or more other regulatory sequences (e.g. enhancers or silencers, or regulatory elements). In the context of the present application, a promoter is typically operably linked to a regulatory element as disclosed herein to regulate transcription of a (trans)gene. When a regulatory element as described herein is operably linked to both a promoter and a transgene, the regulatory element can (1) confer a significant degree of endothelial cell-specific expression of the transgene in vivo (and/or in vitro in cell lines derived from endothelial cells or tissue), and/or (2) increase the level of expression of the transgene in endothelial cells (and/or in vitro in cell lines derived from endothelial cells or tissue).
[0124] The promoter may be homologous (i.e. from the same species as the animal, in particular mammal, to be transfected with the nucleic acid expression cassette) or heterologous (i.e. from a source other than the species of the animal, in particular mammal, to be transfected with the expression cassette). As such, the source of the promoter may be any virus, any unicellular prokaryotic or eukaryotic organism, any vertebrate or invertebrate organism, or any plant, or may even be a synthetic promoter (i.e. having a non-naturally occurring sequence), provided that the promoter is functional in combination with the regulatory elements described herein. In preferred embodiments, the promoter is a mammalian promoter, in particular a murine or human promoter.
[0125] The promoter may be an inducible or constitutive promoter.
[0126] Non-limiting exemplary endothelial cell-specific promoters are: the promotors of the genes depicted in Table 1 below, more preferably the
[0127] In particularly preferred embodiments, the promoter is a mammalian promoter, in particular a murine or human promoter.
[0128] In preferred embodiments, the promoter is from the vascular-endothelial cadherin gene, in particular the murine or human cadherin-5 gene, such as the promoter as defined in SEQ ID NO: 34 (cf. Table 4).
[0129] In preferred embodiments, the promoter is from the endothelin-1 gene, in particular the murine or human endothelin-1 gene, such as the promoter as defined in SEQ ID NO: 35 (cf. Table 4).
[0130] In preferred embodiments, the promoter is from the endoglin gene, in particular the murine or human endoglin gene, such as the promoter as defined in SEQ ID NO: 36 (cf. Table 4).
[0131] In preferred embodiments, the promoter is from the Fms-Related Tyrosine Kinase 1 gene, in particular the murine or human Fms-Related Tyrosine Kinase 1 gene, such as the promoter as defined in SEQ ID NO: 37 (cf. Table 4).
[0132] In preferred embodiments, the promoter is from the Intercellular Adhesion Molecule 2 gene (ICAM2), in particular the murine or human Intercellular Adhesion Molecule 2 gene, such as the promoter as defined in SEQ ID NO: 38 (cf. Table 4).
[0133] Furthermore, the promoter does not need to be the promoter of the transgene in the nucleic acid expression cassette, although it is possible that the transgene is transcribed from its own promoter.
[0134] To minimize the length of the nucleic acid expression cassette, the regulatory elements may be linked to minimal promoters, or shortened versions of the promoters described herein. A `minimal promoter` (also referred to as basal promoter or core promoter) as used herein is part of a full-size promoter still capable of driving expression, but lacking at least part of the sequence that contributes to regulating (e.g. tissue-specific) expression. This definition covers both promoters from which (tissue-specific) regulatory elements have been deleted that are capable of driving expression of a gene but have lost their ability to express that gene in a tissue-specific fashion, and promoters from which (tissue-specific) regulatory elements have been deleted that are capable of driving (possibly decreased) expression of a gene but have not necessarily lost their ability to express that gene in a tissue-specific fashion. Preferably, the promoter contained in the nucleic acid expression cassette disclosed herein is 1000 nucleotides or less in length, 900 nucleotides or less, 800 nucleotides or less, 700 nucleotides or less, 600 nucleotides or less, 500 nucleotides or less, 400 nucleotides or less, 300 nucleotides or less, or 250 nucleotides or less. One particular non-limiting example of such a minimal promoter is the EDN1mini promoter (cf. Table 4).
[0135] The term `transgene` as used herein refers to particular nucleic acid sequences encoding a polypeptide or a portion of a polypeptide to be expressed in a cell into which the nucleic acid sequence is introduced. However, it is also possible that transgenes are expressed as RNA, typically to control (e.g. lower) the amount of a particular polypeptide in a cell into which the nucleic acid sequence is inserted. These RNA molecules include but are not limited to molecules that exert their function through RNA interference (shRNA, RNAi), micro-RNA regulation (miR) (which can be used to control expression of specific genes), catalytic RNA, antisense RNA, RNA aptamers, ZFN, TALEN, CRISPR/Cas9 or similar DNA or RNA cutters, etc.
[0136] How the nucleic acid sequence is introduced into a cell is not essential to the invention, it may for instance be through integration in the genome or as an episomal plasmid. Of note, expression of the transgene may be restricted to a subset of the cells into which the nucleic acid sequence is introduced. The term `transgene` is meant to include (1) a nucleic acid sequence that is not naturally found in the cell (i.e., a heterologous nucleic acid sequence); (2) a nucleic acid sequence that is a mutant form of a nucleic acid sequence naturally found in the cell into which it has been introduced; (3) a nucleic acid sequence that serves to add additional copies of the same (i.e., homologous) or a similar nucleic acid sequence naturally occurring in the cell into which it has been introduced; or (4) a silent naturally occurring or homologous nucleic acid sequence whose expression is induced in the cell into which it has been introduced.
[0137] The transgene may be homologous or heterologous to the promoter (and/or to the animal, in particular mammal, in which it is introduced, e.g. in cases where the nucleic acid expression cassette is used for gene therapy).
[0138] The transgene may be a full length cDNA or genomic DNA sequence, or any fragment, subunit or mutant thereof that has at least some biological activity. In particular, the transgene may be a minigene, i.e. a gene sequence lacking part, most or all of its intronic sequences. The transgene thus optionally may contain intron sequences. Optionally, the transgene may be a hybrid nucleic acid sequence, i.e., one constructed from homologous and/or heterologous cDNA and/or genomic DNA fragments. By `mutant form` is meant a nucleic acid sequence that contains one or more nucleotides that are different from the wild-type or naturally occurring sequence, i.e., the mutant nucleic acid sequence contains one or more nucleotide substitutions, deletions, and/or insertions. The nucleotide substitution, deletion, and/or insertion can give rise to a gene product (i.e., protein or nucleic acid) that is different in its amino acid/nucleic acid sequence from the wild type amino acid/nucleic acid sequence. Preparation of such mutants is well known in the art. In some cases, the transgene may also include a sequence encoding a leader peptide or signal sequence such that the transgene product will be secreted from the cell.
[0139] The transgene that may be contained in the nucleic acid expression cassettes described herein typically encodes a gene product such as RNA or a polypeptide (protein).
[0140] In embodiments, the transgene encodes a therapeutic protein. The therapeutic protein may be a secretable protein. Non-limiting examples of secretable proteins, in particular secretable therapeutic proteins, include hepatocyte growth factor (HGF), coagulation factor VIII (FVIII), coagulation factor VII (FVII), coagulation factor IX (FIX), coagulation factor XI (FXI), tissue factor (TF), tissue factor pathway inhibitor (TFPI), von Willebrand factor (vWF), ADAMTS13, VEGF, PLGF, FGF, sFLT1, α1-antitrypsin (AAT), apolipoprotein A-I (apoA-I), matrix metalloproteinases including but not limited to matrix metalloproteinase-3 (TIMP-3), insulin, erythropoietin, lipoprotein lipase, nitric oxide synthase (NOS), antibodies or nanobodies, including but not limited to antibodies directed against any one of said transgenes, factors and their cognate receptors or against any secreted protein or viral protein, small interfering RNA, guide RNA, endonuclease, and Cas9, growth factors, cytokines, chemokines, plasma factors etc. The therapeutic protein may also be a structural protein. Non-limiting examples of structural proteins, in particular structural therapeutic proteins, include proteins modulating vascular relaxation, vasoconstriction or atherosclerosis. In preferred embodiments, the transgene comprises the nitric oxide synthase (NOS).
[0141] In embodiments, the transgene encodes an immunogenic protein. Non-limiting examples of immunogenic proteins include epitopes and antigens derived from a pathogen.
[0142] As used herein, the term "immunogenic" refers to a substance or composition capable of eliciting an immune response.
[0143] Other sequences may be incorporated in the nucleic acid expression cassette disclosed herein as well, typically to further increase or stabilize the expression of the transgene product (e.g. introns and/or polyadenylation sequences).
[0144] Any intron can be utilized in the expression cassettes described herein, but may not be necessary. The term "intron" encompasses any portion of a whole intron that is large enough to be recognized and spliced by the nuclear splicing apparatus. Typically, short, functional, intron sequences are preferred in order to keep the size of the expression cassette as small as possible, which facilitates the construction and manipulation of the expression cassette. In some embodiments, the intron is obtained from a gene that encodes the protein that is encoded by the coding sequence within the expression cassette. The intron can be located 5' to the coding sequence, 3' to the coding sequence, or within the coding sequence. An advantage of locating the intron 5' to the coding sequence is to minimize the chance of the intron interfering with the function of the polyadenylation signal. In embodiments, the nucleic acid expression cassette disclosed herein further comprises an intron. Non-limiting examples of suitable introns are Minute Virus of Mice (MVM) intron, beta-globin intron (betaIVS-II), factor IX (FIX) intron A, Simian virus 40 (SV40) small-t intron, and beta-actin intron. Preferably, the intron is MVM intron.
[0145] Any polyadenylation signal that directs the synthesis of a polyA tail is useful in the expression cassettes described herein; examples of those are well known to one of skill in the art. Exemplary polyadenylation signals include, but are not limited to, polyA sequences derived from the Simian virus 40 (SV40) late gene, the bovine growth hormone (BGH) polyadenylation signal, the minimal rabbit β-globin (mRBG) gene, and the synthetic polyA (SPA) site as described in Levitt et al. (1989, Genes Dev 3:1019-1025). Preferably, the polyadenylation signal is derived from SV40 (i.e. SV40 pA).
[0146] In particular embodiments, the invention provides a nucleic acid expression cassette comprising, consisting essentially of, or consisting of a nucleic acid regulatory element selected from the group consisting of SEQ ID NO: 1 to 33 or a sequence having 95% identity to said sequence, operably linked to a promoter, preferably a promoter selected from the group consisting of the promoter from the cadherin-5, endothelin-1, endoglin, Fms-Related Tyrosine Kinase 1, or Intercellular Adhesion Molecule 1 gene, and a transgene, preferably a transgene encoding a luciferase. In yet further embodiments, the nucleic acid expression cassette further comprises a polyadenylation signal, preferably a polyadenylation signal derived from SV40.
[0147] In particular embodiments, the invention provides a nucleic acid expression cassette comprising, consisting essentially of, or consisting of a nucleic acid regulatory element selected from the group consisting of SEQ ID NO: 1 to 33 or a sequence having 95% identity to said sequence, operably linked to a promoter, preferably the promoter from the cadherin-5, endothelin-1, endoglin, Fms-Related Tyrosine Kinase 1, or Intercellular Adhesion Molecule 1 gene, and a transgene, preferably a transgene encoding a therapeutic or structural protein as defined herein. In yet further embodiments, the nucleic acid expression cassette further comprises a polyadenylation signal. In particular embodiments, any one of the following transgenes can be introduced: secretable proteins, in particular secretable therapeutic proteins, including hepatocyte growth factor (HGF), coagulation factor VIII (FVIII), coagulation factor VII (FVII), coagulation factor IX (FIX), coagulation factor XI (FXI), tissue factor (TF), tissue factor pathway inhibitor (TFPI), von Willebrand factor (vWF), ADAMTS13, VEGF, PLGF, FGF, sFLT1, α1-antitrypsin (AAT), matrix metalloproteinases including but not limited to matrix metalloproteinase-3 (TIMP-3), insulin, erythropoietin, lipoprotein lipase, antibodies or nanobodies, growth factors, cytokines, chemokines, plasma factors etc. The therapeutic protein may also be a structural protein. Non-limiting examples of structural proteins, in particular structural therapeutic proteins, include proteins modulating vascular relaxation, vasoconstriction or atherosclerosis. In preferred embodiments, the transgene comprises the nitric oxide synthase (NOS).
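The composition of such an expression cassette can be modelled, by way of a non-limiting sketch, as an ordered list of parts in Python. The class and field names are hypothetical, the sequences are short placeholders rather than real elements, and the 5' to 3' order used here (regulatory element or elements, promoter, intron, transgene, polyadenylation signal) is only one arrangement consistent with the description; as noted above, the regulatory effect may be position-independent.

```python
# Illustrative sketch; names, order and placeholder sequences are assumptions.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Part:
    name: str
    sequence: str


@dataclass
class Cassette:
    regulatory_elements: List[Part]
    promoter: Part
    intron: Optional[Part] = None
    transgene: Optional[Part] = None   # optional: a cassette may lack a transgene
    polya: Optional[Part] = None

    def parts(self) -> List[Part]:
        """Parts in the assumed 5'->3' order, skipping absent optional parts."""
        ordered = list(self.regulatory_elements) + [self.promoter]
        for optional in (self.intron, self.transgene, self.polya):
            if optional is not None:
                ordered.append(optional)
        return ordered

    def layout(self) -> str:
        return " - ".join(p.name for p in self.parts())

    def length(self) -> int:
        return sum(len(p.sequence) for p in self.parts())


cassette = Cassette(
    regulatory_elements=[Part("regulatory element", "ACGT" * 5)],
    promoter=Part("promoter", "TATA" * 10),
    intron=Part("MVM intron", "GT" + "A" * 10 + "AG"),
    transgene=Part("luciferase", "ATG" + "C" * 30 + "TAA"),
    polya=Part("SV40 pA", "AATAAA"),
)
```

Summing the part lengths in this way gives a quick check of how much of a vector's payload capacity a given combination of regulatory elements, promoter and transgene would occupy.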
[0148] The nucleic acid regulatory element and the nucleic acid expression cassette disclosed herein may be used as such, or typically, they may be part of a nucleic acid vector. Accordingly, a further aspect relates to the use of a nucleic acid regulatory element as described herein or a nucleic acid expression cassette as described herein in a vector, in particular a nucleic acid vector.
[0149] In an aspect, the invention also provides a vector comprising a nucleic acid regulatory element as disclosed herein. In further embodiments, the vector comprises a nucleic acid expression cassette as disclosed herein.
[0150] The term `vector` as used in the application refers to nucleic acid molecules, e.g. double-stranded DNA, which may have inserted into it another nucleic acid molecule (the insert nucleic acid molecule) such as, but not limited to, a cDNA molecule. The vector is used to transport the insert nucleic acid molecule into a suitable host cell. A vector may contain the necessary elements that permit transcribing the insert nucleic acid molecule, and, optionally, translating the transcript into a polypeptide. The insert nucleic acid molecule may be derived from the host cell, or may be derived from a different cell or organism. Once in the host cell, the vector can replicate independently of, or coincidental with, the host chromosomal DNA, and several copies of the vector and its inserted nucleic acid molecule may be generated. The vectors can be episomal vectors (i.e., that do not integrate into the genome of a host cell), or can be vectors that integrate into the host cell genome. The term `vector` may thus also be defined as a gene delivery vehicle that facilitates gene transfer into a target cell. This definition includes both non-viral and viral vectors. Non-viral vectors include but are not limited to cationic lipids, liposomes, nanoparticles, PEG, PEI, plasmid vectors (e.g. pUC vectors, bluescript vectors (pBS) and pBR322 or derivatives thereof that are devoid of bacterial sequences (minicircles)), transposon-based vectors (e.g. PiggyBac (PB) vectors or Sleeping Beauty (SB) vectors), etc. Viral vectors are derived from viruses and include but are not limited to retroviral, lentiviral, adeno-associated viral, adenoviral, herpes viral, hepatitis viral vectors or the like. Typically, but not necessarily, viral vectors are replication-deficient as they have lost the ability to propagate in a given cell since viral genes essential for replication have been eliminated from the viral vector.
However, some viral vectors can also be adapted to replicate specifically in a given cell, such as e.g. a cancer cell, and are typically used to trigger the (cancer) cell-specific (onco)lysis. Virosomes are a non-limiting example of a vector that comprises both viral and non-viral elements, in particular they combine liposomes with an inactivated HIV or influenza virus (Yamada et al., 2003). Another example encompasses viral vectors mixed with cationic lipids.
[0151] In preferred embodiments, the vector is a viral vector, such as a retroviral, lentiviral, adenoviral, or adeno-associated viral (AAV) vector, more preferably a lentiviral vector. Lentiviral vectors are preferably derived from the human immunodeficiency virus (HIV), though lentiviral vectors based on other lentiviruses could also be used (including but not limited to Equine infectious anemia virus). Lentiviral vectors can transduce endothelial cells. Production of lentiviral vectors can be achieved by transient co-transfection of lentiviral vector plasmids encoding the gene of interest with the gag-pol, rev and env-encoding helper constructs (VandenDriessche et al., J. Thromb Hemostasis, 2007). Typically, a heterologous envelope is used, such as the vesicular stomatitis virus G glycoprotein (VSV-G) or an endotheliotropic envelope, including but not limited to envelopes that confer antibody- or nanobody (i.e. single chain antibody)-mediated retargeting to specific endothelial cell surface markers (VandenDriessche & Chuah, Blood. 2013 Sep. 19; 122(12):1993-4; Abel et al., Blood. 2013 Sep. 19; 122(12):2030-8; Buchholz et al. Trends Biotechnol. 2015 December; 33(12):777-90; Munch et al., Mol Ther. 2011 April; 19(4):686-93; Anliker et al., Nat Methods. 2010 November; 7(11):929-35).
[0152] In another embodiment the vector is an adeno-associated viral (AAV) vector. AAV vectors are preferably used as self-complementary, double-stranded AAV vectors (scAAV) in order to overcome one of the limiting steps in AAV transduction (i.e. single-stranded to double-stranded AAV conversion) (McCarty, 2001, 2003; Nathwani et al, 2002, 2006, 2011; Wu et al., 2008), although the use of single-stranded AAV vectors (ssAAV) is also encompassed herein.
[0153] Production of AAV vector particles can e.g. be achieved by transient co-transfection of AAV-vector and AAV helper constructs, encoding AAV capsids, into HEK293 cells, followed by a purification step based on cesium chloride (CsCl) density gradient ultracentrifugation, as described (Vanden Driessche et al., 2007). Capsids can also be derived from different serotypes or be specifically modified to enhance endothelial cell transduction, either by evolution or selection, antibody (nanobody) engineering or the use of DARPin (Work et al., Mol Ther. 2006 April; 13(4):683-93; Munch et al., Nat Commun. 2015 Feb. 10; 6:6246; Buchholz et al., Trends Biotechnol. 2015 December; 33(12):777-90; White et al. Circulation. 2004 Feb. 3; 109(4):513-9).
[0154] In yet another embodiment the vector is an adenoviral vector. Adenoviral vectors are preferably derived from the human adenovirus 5 serotype or from other serotypes that display increased tropism to endothelial cells, including but not limited to Ad5T*F35++ (White et al., J Cardiothorac Surg. 2013 Aug. 9; 8:183. doi: 10.1186/1749-8090-8-183). Alternatively, the capsid can be engineered to enhance the endotheliotropic properties of the adenoviral vectors, including but not limited to as described in the references below (Nicol et al., FEBS Lett. 2009 Jun. 18; 583(12):2100-7; Nicklin and Baker, Mol Ther. 2008 December; 16(12):1904-5; Work et al., Methods Mol Med. 2005; 108:395-413; Work et al., Genet Vaccines Ther. 2004 Oct. 8; 2(1):14). They can be derived from either early-generation or helper-dependent adenoviral vectors (Mol Ther. 2010 December; 18(12):2121-9). Production of these vectors after transfection of adenoviral vector and helper constructs in HEK293T cells has been described previously (Mol Ther. 2010 December; 18(12):2121-9).
[0155] Since the nucleic acid regulatory elements are de facto modular, combinations of the best endothelial cell-specific nucleic acid regulatory elements with any other endothelial cell-specific nucleic acid regulatory elements are also tested to maximize expression in the desired target tissue. Consequently, this can lead to the generation of a versatile endothelial cell-specific nucleic acid regulatory element platform tailor-made for diseases that affect endothelial cells and tissues encompassing those. Furthermore, the endothelial cell-specific nucleic acid regulatory elements can also be combined with other promoters or nucleic acid regulatory elements active in other target tissues.
[0156] In other embodiments, the vector is a non-viral vector, preferably a plasmid, a minicircle, or a transposon-based vector, such as a Sleeping Beauty(SB)-based vector or piggyBac(PB)-based vector.
[0157] In yet other embodiments, the vector comprises viral and non-viral elements.
[0158] In particular embodiments, the invention provides a vector comprising a nucleic acid expression cassette comprising a nucleic acid regulatory element comprising, consisting essentially of, or consisting of a nucleic acid regulatory element selected from the group consisting of SEQ ID NO:1 to 33, a promoter, preferably the promoter from the cadherin-5, endothelin-1, endoglin, Fms-Related Tyrosine Kinase 1, or Intercellular Adhesion Molecule 1 gene, a transgene, preferably a transgene encoding a therapeutic structural or secretable protein, and a polyadenylation signal. In particular, any one of the following transgenes can be introduced: secretable proteins, in particular secretable therapeutic proteins, including hepatocyte growth factor (HGF), coagulation factor VIII (FVIII), coagulation factor VII (FVII), coagulation factor IX (FIX), coagulation factor XI (FXI), tissue factor (TF), tissue factor pathway inhibitor (TFPI), von Willebrand factor (vWF), ADAMTS13, VEGF, PLGF, FGF, sFLT1, α1-antitrypsin (AAT), apolipoprotein A-I (apoA-I), matrix metalloproteinases including but not limited to matrix metalloproteinase-3 (TIMP-3), insulin, erythropoietin, lipoprotein lipase, antibodies or nanobodies, growth factors, cytokines, chemokines, plasma factors etc. The therapeutic protein may also be a structural protein. Non-limiting examples of structural proteins, in particular structural therapeutic proteins, include proteins modulating vascular relaxation, vasoconstriction or atherosclerosis. In preferred embodiments, the transgene comprises the nitric oxide synthase (NOS).
[0159] In particular embodiments, the invention provides a vector comprising a nucleic acid expression cassette comprising a nucleic acid regulatory element comprising, consisting essentially of, or consisting of a nucleic acid regulatory element selected from the group consisting of SEQ ID NO:1 to 33, a promoter, preferably the promoter from the cadherin-5, endothelin-1, endoglin, Fms-Related Tyrosine Kinase 1, or Intercellular Adhesion Molecule 1 gene, a transgene, preferably a transgene encoding secretable proteins, in particular secretable therapeutic proteins, including hepatocyte growth factor (HGF), coagulation factor VIII (FVIII), coagulation factor VII (FVII), coagulation factor IX (FIX), coagulation factor XI (FXI), tissue factor (TF), tissue factor pathway inhibitor (TFPI), von Willebrand factor (vWF), ADAMTS13, VEGF, PLGF, FGF, sFLT1, α1-antitrypsin (AAT), apolipoprotein A-I (apoA-I), matrix metalloproteinases including but not limited to matrix metalloproteinase-3 (TIMP-3), insulin, erythropoietin, lipoprotein lipase, antibodies or nanobodies, growth factors, cytokines, chemokines, plasma factors etc. The therapeutic protein may also be a structural protein. Non-limiting examples of structural proteins, in particular structural therapeutic proteins, include proteins modulating vascular relaxation, vasoconstriction or atherosclerosis. In preferred embodiments, the transgene comprises the nitric oxide synthase (NOS).
[0160] The nucleic acid expression cassettes and vectors disclosed herein may be used, for example, to express proteins that are normally expressed and utilized in endothelial cells (i.e. structural proteins), or to express proteins that are expressed in endothelial cells and that are then exported to the blood stream for transport to other portions of the body (i.e. secretable proteins). For example, the expression cassettes and vectors disclosed herein may be used to express a therapeutic amount of a gene product (such as a polypeptide, in particular a therapeutic protein, or RNA) for therapeutic purposes, in particular for gene therapy. Typically, the gene product is encoded by the transgene within the expression cassette or vector, although in principle it is also possible to increase expression of an endogenous gene for therapeutic purposes. In an alternative example, the expression cassettes and vectors disclosed herein may be used to express an immunological amount of a gene product (such as a polypeptide, in particular an immunogenic protein, or RNA) for vaccination purposes.
[0161] The nucleic acid expression cassettes and vectors as taught herein may be formulated in a pharmaceutical composition with a pharmaceutically acceptable excipient, i.e., one or more pharmaceutically acceptable carrier substances and/or additives, e.g., buffers, carriers, excipients, stabilisers, etc. The pharmaceutical composition may be provided in the form of a kit.
[0162] The term "pharmaceutically acceptable" as used herein is consistent with the art and means compatible with the other ingredients of the pharmaceutical composition and not deleterious to the recipient thereof.
[0163] Accordingly, a further aspect of the invention relates to a pharmaceutical composition comprising a nucleic acid expression cassette or a vector described herein.
[0164] The use of nucleic acid regulatory elements described herein for the manufacture of these pharmaceutical compositions is also disclosed herein.
[0165] In embodiments, the pharmaceutical composition may be a vaccine. The vaccine may further comprise one or more adjuvants for enhancing the immune response. Suitable adjuvants include, for example, but without limitation, saponin, mineral gels such as aluminium hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil or hydrocarbon emulsions, bacille Calmette-Guérin (BCG), Corynebacterium parvum, and the synthetic adjuvant QS-21. Optionally, the vaccine may further comprise one or more immunostimulatory molecules. Non-limiting examples of immunostimulatory molecules include various cytokines, lymphokines and chemokines with immunostimulatory, immunopotentiating, and pro-inflammatory activities, such as interleukins (e.g., IL-1, IL-2, IL-3, IL-4, IL-12, IL-13); growth factors (e.g., granulocyte-macrophage (GM)-colony stimulating factor (CSF)); and other immunostimulatory molecules, such as macrophage inflammatory factor, Flt3 ligand, B7.1, B7.2, etc.
[0166] In a further aspect, the invention relates to the nucleic acid regulatory elements, the nucleic acid expression cassettes, the vectors, or the pharmaceutical compositions described herein for use in medicine.
[0167] As used herein, the terms "treat" or "treatment" refer to both therapeutic treatment and prophylactic or preventative measures. Beneficial or desired clinical results include, but are not limited to, prevention of an undesired clinical state or disorder, reducing the incidence of a disorder, alleviation of symptoms associated with a disorder, diminishment of extent of a disorder, stabilized (i.e., not worsening) state of a disorder, delay or slowing of progression of a disorder, amelioration or palliation of the state of a disorder, remission (whether partial or total), whether detectable or undetectable, or combinations thereof. "Treatment" can also mean prolonging survival as compared to expected survival if not receiving treatment.
[0168] As used herein, the terms "therapeutic treatment" or "therapy" and the like, refer to treatments wherein the object is to bring a subject's body or an element thereof from an undesired physiological change or disorder to a desired state, such as a less severe or unpleasant state (e.g., amelioration or palliation), or back to its normal, healthy state (e.g., restoring the health, the physical integrity and the physical well-being of a subject), to keep it at said undesired physiological change or disorder (e.g., stabilization, or not worsening), or to prevent or slow down progression to a more severe or worse state compared to said undesired physiological change or disorder such as a disease or disorder related to endothelial cells.
[0169] As used herein the terms "prevention", "preventive treatment" or "prophylactic treatment" and the like encompass preventing the onset of a disease or disorder, including reducing the severity of a disease or disorder or symptoms associated therewith prior to affliction with said disease or disorder. Such prevention or reduction prior to affliction refers to administration of the nucleic acid regulatory elements, the nucleic acid expression cassettes, the vectors, or the pharmaceutical compositions described herein to a patient that is not at the time of administration afflicted with clear symptoms of the disease or disorder. "Preventing" also encompasses preventing the recurrence or relapse of a disease or disorder, for instance after a period of improvement. In embodiments, the nucleic acid regulatory elements according to any one of SEQ ID Nos: 1-33, the nucleic acid expression cassettes, the vectors, or the pharmaceutical compositions described herein may be for use in gene therapy, in particular endothelial cell-directed gene therapy.
[0170] Also disclosed herein is the use of the nucleic acid regulatory elements according to any one of SEQ ID Nos: 1-33, the nucleic acid expression cassettes, the vectors, or the pharmaceutical compositions described herein for the manufacture of a medicament for gene therapy, in particular endothelial cell-directed gene therapy.
[0171] Also disclosed herein is a method for gene therapy, in particular endothelial cell-directed gene therapy in a subject in need of said gene therapy comprising:
[0172] introducing in the subject, in particular in endothelial cells or tissue of the subject, a nucleic acid expression cassette, a vector or a pharmaceutical composition described herein, wherein the nucleic acid expression cassette, the vector or the pharmaceutical composition comprises a nucleic acid regulatory element according to any one of SEQ ID Nos: 1-33, operably linked to a promoter and a transgene; and
[0173] expressing a therapeutically effective amount of the transgene product in the subject, in particular in endothelial tissue or cells of the subject.
[0174] The transgene product may be any one of the following: secretable proteins, in particular secretable therapeutic proteins, including hepatocyte growth factor (HGF), coagulation factor VIII (FVIII), coagulation factor VII (FVII), coagulation factor IX (FIX), coagulation factor XI (FXI), tissue factor (TF), tissue factor pathway inhibitor (TFPI), von Willebrand factor (vWF), ADAMTS13, VEGF, PLGF, FGF, sFLT1, α1-antitrypsin (AAT), apolipoprotein A-I (apoA-I), insulin, erythropoietin, lipoprotein lipase, antibodies or nanobodies, growth factors, cytokines, chemokines, plasma factors etc. The therapeutic protein may also be a structural protein. Non-limiting examples of structural proteins, in particular structural therapeutic proteins, include proteins modulating vascular relaxation, vasoconstriction or atherosclerosis. In preferred embodiments, the transgene comprises the nitric oxide synthase (NOS).
[0175] Alternatively, the transgene product may be RNA, such as siRNA, or a nuclease such as ZFN, TALEN, CRISPR/Cas9 or similar DNA or RNA editing systems.
[0176] Exemplary diseases and disorders that may benefit from gene therapy using the nucleic acid regulatory elements, the nucleic acid expression cassettes, the vectors, or the pharmaceutical compositions described herein include: liver diseases, hemophilia A, von Willebrand disease, microvascular thrombosis, thrombotic thrombocytopenic purpura, peripheral vascular disease, coronary artery diseases, atherosclerotic diseases, stroke, heart disease, diabetes, insulin resistance, chronic kidney failure, tumour growth, metastasis, venous thrombosis, ischemia, tumour vascularisation, cancer and viral infectious diseases such as Ebola, Dengue fever and dengue hemorrhagic fever.
[0177] Gene therapy protocols have been extensively described in the art. These include, but are not limited to, intramuscular injection of plasmid (naked or in liposomes), hydrodynamic gene delivery in various tissues, including muscle, interstitial injection, instillation in airways, application to endothelium, intra-hepatic parenchyma, and intravenous or intra-arterial administration. Various devices have been developed for enhancing the availability of DNA to the target cell. A simple approach is to contact the target cell physically with catheters or implantable materials containing DNA. Another approach is to utilize needle-free, jet injection devices which project a column of liquid directly into the target tissue under high pressure. These delivery paradigms can also be used to deliver vectors. Another approach to targeted gene delivery is the use of molecular conjugates, which consist of protein or synthetic ligands to which a nucleic acid- or DNA-binding agent has been attached for the specific targeting of nucleic acids to cells (Cristiano et al., 1993). In embodiments, the nucleic acid regulatory elements, the nucleic acid expression cassettes, the vectors, or the pharmaceutical compositions described herein may be for use as a vaccine, more particularly for use as a prophylactic vaccine.
[0178] Also disclosed herein is the use of the nucleic acid regulatory elements, the nucleic acid expression cassettes, the vectors, or the pharmaceutical compositions described herein for the manufacture of a medicament or a vaccine, in particular for the manufacture of a prophylactic vaccine.
[0179] Also disclosed herein is a method of vaccination, in particular prophylactic vaccination, of a subject in need of said vaccination comprising:
[0180] introducing in the subject, in particular in endothelial tissue or cells of the subject, a nucleic acid expression cassette, a vector or a pharmaceutical composition described herein, wherein the nucleic acid expression cassette, the vector or the pharmaceutical composition comprises a nucleic acid regulatory element according to any one of SEQ ID Nos: 1 to 33, operably linked to a promoter and a transgene; and
[0181] expressing an immunologically effective amount of the transgene product in the subject, in particular in endothelial cells or tissue of the subject.
[0182] As used herein, a phrase such as "a subject in need of treatment" includes subjects that would benefit from treatment of a recited disease or disorder. Such subjects may include, without limitation, those that have been diagnosed with said disease or disorder, those prone to contract or develop said disease or disorder and/or those in whom said disease or disorder is to be prevented.
[0183] The terms "subject" and "patient" are used interchangeably herein and refer to animals, preferably vertebrates, more preferably mammals, and specifically include human patients and non-human mammals. "Mammalian" subjects include, but are not limited to, humans, domestic animals, commercial animals, farm animals, zoo animals, sport animals, pet and experimental animals such as dogs, cats, guinea pigs, rabbits, rats, mice, horses, cattle, cows; primates such as apes, monkeys, orang-utans, and chimpanzees; canids such as dogs and wolves; felids such as cats, lions, and tigers; equids such as horses, donkeys, and zebras; food animals such as cows, pigs, and sheep; ungulates such as deer and giraffes; rodents such as mice, rats, hamsters and guinea pigs; and so on. Preferred patients or subjects are human subjects.
[0184] A `therapeutic amount` or `therapeutically effective amount` as used herein refers to the amount of gene product effective to treat a disease or disorder in a subject, i.e., to obtain a desired local or systemic effect. The term thus refers to the quantity of gene product that elicits the biological or medicinal response in a tissue, system, animal, or human that is being sought by a researcher, veterinarian, medical doctor or other clinician. Such amount will typically depend on the gene product and the severity of the disease, but can be decided by the skilled person, possibly through routine experimentation.
[0185] An "immunologically effective amount" as used herein refers to the amount of (trans)gene product effective to enhance the immune response of a subject against a subsequent exposure to the immunogen encoded by the (trans)gene. Levels of induced immunity can be determined, e.g. by measuring amounts of neutralizing secretory and/or serum antibodies, e.g., by plaque neutralization, complement fixation, enzyme-linked immunosorbent, or microneutralization assay.
[0186] Typically, the amount of (trans)gene product expressed when using an expression cassette or vector as described herein (i.e., with at least one nucleic acid regulatory element) is higher than when an identical expression cassette or vector is used but without a nucleic acid regulatory element therein. More particularly, the expression is at least double as high, at least five times as high, at least ten times as high, at least 20 times as high, at least 30 times as high, at least 40 times as high, at least 50 times as high, or even at least 60 times as high as when compared to the same nucleic acid expression cassette or vector without nucleic acid regulatory element. Preferably, the higher expression remains specific to endothelial tissues or cells. Furthermore, the expression cassettes and vectors described herein direct the expression of a therapeutic amount of the gene product for an extended period. Typically, therapeutic expression is envisaged to last at least 20 days, at least 50 days, at least 100 days, at least 200 days, and in some instances 300 days or more. Expression of the gene product (e.g. polypeptide) can be measured by any art-recognized means, such as by antibody-based assays, e.g. a Western Blot or an ELISA assay, for instance to evaluate whether therapeutic expression of the gene product is achieved. Expression of the gene product may also be measured in a bioassay that detects an enzymatic or biological activity of the gene product.
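The fold-increase comparison described above can be illustrated with a short calculation. The following Python sketch is illustrative only: the function names `fold_enhancement` and `highest_threshold_met` and the example expression values are assumptions made for this illustration and are not data or methods from this application; only the list of thresholds is taken from the text.

```python
# Illustrative only: function names and example values are hypothetical.
# The thresholds are the fold increases recited in the paragraph above.
FOLD_THRESHOLDS = (2, 5, 10, 20, 30, 40, 50, 60)


def fold_enhancement(with_element: float, without_element: float) -> float:
    """Ratio of expression with a regulatory element to expression without it."""
    if without_element <= 0:
        raise ValueError("control expression must be positive")
    return with_element / without_element


def highest_threshold_met(fold: float):
    """Return the largest recited threshold reached by `fold`, or None."""
    met = [t for t in FOLD_THRESHOLDS if fold >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Made-up measurements in arbitrary units (e.g. from an ELISA readout).
    fold = fold_enhancement(with_element=120.0, without_element=10.0)
    print(fold, highest_threshold_met(fold))
```

Both constructs would need to be measured with the same assay and in the same units for such a ratio to be meaningful.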
[0187] Also disclosed herein is the use of the nucleic acid regulatory elements according to SEQ ID Nos: 1 to 33, or the nucleic acid expression cassettes, or the vectors disclosed herein comprising said nucleic acid regulatory elements, for transfecting or transducing endothelial cells.
[0188] Further disclosed herein is the use of the nucleic acid expression cassettes or the vectors disclosed herein comprising the nucleic acid regulatory elements according to SEQ ID Nos: 1 to 33, for expressing a transgene product in endothelial cells, wherein the nucleic acid expression cassette or the vector comprises said nucleic acid regulatory element disclosed herein operably linked to a promoter and a transgene.
[0189] Further disclosed herein is a method for expressing a transgene product in endothelial cells, comprising:
[0190] transfecting or transducing said cells with a nucleic acid expression cassette or a vector disclosed herein, wherein the nucleic acid expression cassette or the vector comprises a nucleic acid regulatory element according to any one of SEQ ID Nos: 1 to 33, operably linked to a promoter and a transgene; and
[0191] expressing the transgene product in said cells.
[0192] Non-viral transfection or viral vector-mediated transduction of endothelial cells may be performed by in vitro, ex vivo or in vivo procedures. The in vitro approach requires the in vitro transfection or transduction of endothelial cells, e.g. cells previously harvested from a subject, cell lines or cells differentiated from e.g. induced pluripotent stem cells or embryonic cells. The ex vivo approach requires harvesting of the endothelial cells from a subject, in vitro transfection or transduction, and optionally re-introduction of the transfected cells into the subject. The in vivo approach requires the administration of the nucleic acid expression cassette or the vector disclosed herein into a subject. In preferred embodiments, the transfection of the endothelial cells is performed in vitro or ex vivo.
[0193] It is understood by the skilled person that the use of the nucleic acid regulatory elements, the nucleic acid expression cassettes and vectors disclosed herein has implications beyond gene therapy, e.g. coaxed differentiation of stem cells into endothelial cell precursors or endothelial cells, transgenic models for over-expression of proteins in endothelial cells or their precursors, etc.
[0194] The invention is further explained by the following non-limiting examples.
EXAMPLES
Example 1: Identification of Endothelial Cell-Specific Nucleic Acid Regulatory Elements
[0195] To identify genes that are highly expressed in endothelial cells, we followed several steps. First, we obtained a list of genes that are highly expressed in endothelial cells from the publication of Bhasin et al., 2010 (Genomics 2010, 11:342), which identified 104 endothelial-restricted genes. Subsequently, the specificity and robustness of expression of these endothelial-restricted genes were compared across 6 types of endothelial cells (i.e. LSEC: Liver Sinusoid Endothelial cells, HCAEC: Coronary Artery Endothelial cells, HMVEC: Dermal Microvascular Endothelial cells, HUVEC: Human Umbilical Vein Endothelial cells, lEn: Iliac Artery Endothelial cells, RE: Retinal Endothelial cells) based on the Reference Database of Gene Expression Analysis (RefExA). We identified 11 genes (Table 1) from this endothelial-restricted gene list that are highly and specifically expressed among these quintessential endothelial cell types. These 11 genes were then used for designing the endothelial-specific cis-regulatory elements (CREs; Table 2).
TABLE 1 Highly expressed genes in endothelial cells
Gene      ENSEMBL Acc. No
IFI27     ENSG00000165949
ICAM2     ENSG00000108622
VWF       ENSG00000110799
EDN1      ENSG00000078401
ENG       ENSG00000106991
ECSCR     ENSG00000279686
CDH5      ENSG00000179776
PECAM1    ENSG00000261371
HHIP      ENSG00000164161
TIE1      ENSG00000066056
HYAL2     ENSG00000068001
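The filtering step that narrows the endothelial-restricted gene list to genes that are highly expressed across all six endothelial cell types could be sketched as follows. This is a minimal Python sketch under stated assumptions: the dictionary layout, the function name `consistently_high`, the cutoff, and the toy expression values are all hypothetical and do not reflect the RefExA data format or the values actually used.

```python
# Hypothetical sketch: the expression values and the cutoff are invented for
# illustration and do not come from RefExA or from this application.
CELL_TYPES = (
    "liver_sinusoid",
    "coronary_artery",
    "dermal_microvascular",
    "umbilical_vein",
    "iliac_artery",
    "retinal",
)


def consistently_high(expression, cutoff):
    """Return genes whose expression reaches `cutoff` in every cell type."""
    return sorted(
        gene
        for gene, levels in expression.items()
        if all(levels.get(cell, 0.0) >= cutoff for cell in CELL_TYPES)
    )


if __name__ == "__main__":
    toy = {
        "CDH5": {cell: 900.0 for cell in CELL_TYPES},
        # GENE_X is a made-up gene that is low in one cell type.
        "GENE_X": {**{cell: 900.0 for cell in CELL_TYPES}, "retinal": 5.0},
    }
    print(consistently_high(toy, cutoff=100.0))
```

Requiring the cutoff in every cell type, rather than on average, is one way to capture robustness across endothelial cell types; other definitions are possible.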
[0196] Candidate CREs were selected using the University of California Santa Cruz (UCSC) Genome Browser database based on i) high DNase hypersensitivity sites; ii) high content of epigenetic markers associated with open chromatin (acetylation, methylation); iii) high content of transcription factor binding sites; iv) strong evolutionary conservation. The ideal CREs were expected to exhibit co-existence of predicted motifs together with DNase clusters, high conservation level in vertebrates, and explicit histone modification patterns. Therefore, 28 potential CRE sequences were selected based on those criteria (FIG. 1 and Table 2).
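The four selection criteria listed above can be pictured as a conjunction of per-candidate checks. The following Python sketch is illustrative only: the `Candidate` fields, the numeric thresholds, and the example regions are assumptions introduced here, since the text does not state quantitative cutoffs and the selection was made by inspecting UCSC Genome Browser tracks.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of browser annotations for one candidate region."""
    name: str
    dnase_hypersensitive: bool   # criterion i
    open_chromatin_marks: bool   # criterion ii (acetylation, methylation)
    tf_binding_sites: int        # criterion iii
    conservation: float          # criterion iv, scaled 0..1 (assumed scale)


def meets_all_criteria(c: Candidate, min_tfbs: int = 3,
                       min_conservation: float = 0.5) -> bool:
    """True when all four criteria co-exist; the thresholds are assumptions."""
    return (
        c.dnase_hypersensitive
        and c.open_chromatin_marks
        and c.tf_binding_sites >= min_tfbs
        and c.conservation >= min_conservation
    )


def select_candidates(candidates):
    """Names of the candidates that satisfy every criterion."""
    return [c.name for c in candidates if meets_all_criteria(c)]


if __name__ == "__main__":
    regions = [
        Candidate("region_A", True, True, 6, 0.8),   # made-up values
        Candidate("region_B", True, False, 6, 0.8),  # lacks chromatin marks
    ]
    print(select_candidates(regions))
```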
TABLE-US-00002 TABLE 2 The EC-CREs sequence for highly expression in endothelial cells. Size Gene EC-CRE (bp) Sequence CDH5 EC-CRE1a 76 GCCCTCACAAAGGAACAATAACAGGAAACCATCCCA GGGGGAAGTGGGCCAGGGCCAGCTGGAAAACCTGA AGGGG (SEQ ID NO: 1) CDH5 EC-CRE1b 277 GGCCGAGGCAGCCGCCCACCGCAGGGCCTGCCTAT CTGCAGCCAGCCCAGCCCTCACAAAGGAACAATAAC AGGAAACCATCCCAGGGGGAAGTGGGCCAGGGCCA GCTGGAAAACCTGAAGGGGAGGCAGCCAGGCCTCC CTCGCCAGCGGGGTGTGGCTCCCCTCCAAAGACGGT CGGCTGACAGGCTCCACAGAGCTCCACTCACGCTCA GCCCTGGACGGACAGGCAGTCCAACGGAACAGAAAC ATCCCTCAGCCCACAGGCACGGTGAGTG (SEQ ID NO: 2) CDH5 EC-CRE1c 408 GCTTCCTCCTCTGCTACTAATCTGGTCTCACAGACCA TCCCATTTCCTGCTAGCCCACCAGCCGCCTTCCTTGC TCCCAATGACACTTCCTGGCCTTGTGCCCTCCTGTTA CCTCCTTTGCCTCCAGAGAGGTTGGAGCAGAGGCTG GGCAGTGCCAGAAATCAGGCATGAAATCCTCAGGGG GACCAAGGAGGCACCAGCCTCCCTCCCACAGTCTCA GCTACCTCTGCTACGGTGACCCCCAGCCCCACCCCT GGGGCCCACAGCTCATGCCTGGCTCACCATTCCTTT GTTTATGGACCACAGGAACAGTCGTTTTCAGGGCAGA GTCAACTTCCTCATGGACTGGGAGTACAAAGGGAATT GGCAGATGGTGCCAGGACAGGCCCTGTCCCCATCTG CCACAGC (SEQ ID NO: 3) CDH5 EC-CRE1d 173 GCTTCCTCCTCTGCTACTAATCTGGTCTCACAGACCA TCCCATTTCCTGCTAGCCCACCAGCCGCCTTCCTTGC TCCCAATGACACTTCCTGGCCTTGTGCCCTCCTGTTA CCTCCTTTGCCTCCAGAGAGGTTGGAGCAGAGGCTG GGCAGTGCCAGAAATCAGGCATGAAA (SEQ ID NO: 4) CDH5 EC-CRE1e 307 CCCAGCTGAGGGCTGGTGCCAGAGCCGTGTCTGCTT GCCCCATCAAGAGGTGGGAGGGATTGATCCACCTTC CTGCCCCACAGATGGTGCAGCCTCCAACCTATTGTTT TCCAGGACGCTTCGGTGGAGAGCACAAGGAATGTAG GGTCTAGAAACAGGAAGCCCTGGCTTCCGCTGGACA AGGTTTCCTCCAGACTCAGGCCTGCCCTCCAGACAA CAAGGCAGGGCCCTTGGTCCCACCCTGCCCTGCCTG GCTCACTGGGCCACCCCAAGGAAGGCCTTGCCCTCT CTGGGCTTCTGCATGTGA (SEQ ID NO: 5) CDH5 EC-CRE1f 450 AGGTATCCACCAAGGGGCCCAAGAGCTGCTGAGCCC CTGAGCAGCCCTACCAATTTCAGCTTATGGTGGTGAG GGGTAGGGGAGGTGTATACTGGCCTGGAAGGGGTTA AGCTGCCCGCCTGCAGCCTCAGCCTGAGCTATTGTG TTGCCAAACAAGGGCCGGACATGAGGGCAGGAAGCC AGCAGGGGCCACACATTTTCTGCAAAGTTGGATGATT CACTGCTGACTTGGGGACACCCAGGGGACAGAGGG GACACCATCCCAGGAAGAATCTTAGGCTCATTTTGCC CACATGGACCCATGACTGTTCCCTGTATCCTCTCTCT GCACCCCCTCAGTCACACTGAAGCAACTATGAGAATT CCCATTTGACAGATGGGACCATCGAGGCTGAGGGAA 
GCTGTGCAGCCAGTCCAAGGTCACACAACCAACACA AGGTAGAAGCAGGG (SEQ ID NO: 6) ECSCR EC-CRE1a 123 AGGGCCCCTGGAGCTGGTCCCAATGTGTTTCCTTCTA TTCTTTTGACAGGAAGCTCCTGGAGAGCCAGTCCCCA CCCCCATCCCGCCCCAGCACTCCCTCTCTCTTCTCCA CTATGGACAGAG (SEQ ID NO: 7) ECSCR EC-CRE1b 499 TAGGGCCTCTCTGAAAGATGTGGGGAGTCCTATCTG CATTGGGATCCCTGAGGAGGGAGAGGAATGTGGAGA ATTCAGGGTCCAGGGAGCATGGGTGACTGGTGGGCT GGGCTTCCAGGCTGAATCATGGGAAAGGAGAACCTG GTCTGAAACAGTACTGGGCGGGATTGGTGTTAGATTC CAGGAAAACCCCCAGGCGGTCTGTGGTGGAACCTGA TGGACCCTCAGAAGGGAAGAGAATGGGGATGGGGC CAGGTTGCCATGGTTGGTCATTGTGCATAGGCACTAG AGGCCATGCTGGGTGGGCACAGTCGCTGCTGCAGC CTCACATCCTCATCTGGACATGGCTGAGCAGGGCCC CTGGAGCTGGTCCCAATGTGTTTCCTTCTATTCTTTTG ACAGGAAGCTCCTGGAGAGCCAGTCCCCACCCCCAT CCCGCCCCAGCACTCCCTCTCTCTTCTCCACTATGGA CAGAGCCTCCACTGAGCTGCTGCCTGCC (SEQ ID NO: 8) EDN1 EC-CRE1a 455 GAGACATAAAAGGAAAATGAAGCGAGCAACAATTAAA AAAAATTCCCCGCACACAACAATACAATCTATTTAAAC TGTGGCTCATACTTTTCATACCAATGGTATGACTTTTT TTCTGGAGTCCCCTCTTCTGATTCTTGAACTCCGGGG CTGGCAGCTTGCAAAGGGGAAGCGGACTCCAGCACT GCACGGGCAGGTTTAGCAAAGGTCTCTAATGGGTATT TTCTTTTTCTTAGCCCTGCCCCCGAATTGTCAGACGG CGGGCGTCTGCCTCTGAAGTTAGCAGTGATTTCCTTT CGGGCCTGGCCTTATCTCCGGCTGCACGTTGCCTGT TGGTGACTAATAACACAATAACATTGTCTGGGGCTGG AATAAAGTCGGAGCTGTTTACCCCCACTCTAATAGGG GTTCAATATAAAAAGCCGGCAGAGAGCTGTCCAAGTC AGACGCGCCTC (SEQ ID NO: 9) ENG EC-CRE1a 153 TGTCCACTTCTCCTGACCCCTCGGCCGCCACCCCAG AAGGCTGGAGCAGGGACGCCGTCGCTCCGGCCGCC TGCTCCCCTCGGGTCCCCGTGCGAGCCCACGCCGG CCCCGGTGCCCGCCCGCAGCCCTGCCACTGGACAC AGGATAAGGCCC (SEQ ID NO: 10) ENG EC-CRE1b 125 GGGCCCCCCACCCAGTGACAAAGCCCGTGGCACTTC CTCTACCCGGTTGGCAGGCGGCCTGGCCCAGCCCCT TCTCTAAGGAAGCGCATTTCCTGCCTCCCTGGGCCG GCCGGGCTGGATGAGCC (SEQ ID NO: 11) ENG EC-CRE1c 498 GGGATGGGAGGGTGGGGTGCTTGGGGAGACAAGCC TAGAGCCTGGGCCCTCCCACCCCACTGCCTCCCCCC ATCCCAGGGCCCCCCACCCAGTGACAAAGCCCGTGG CACTTCCTCTACCCGGTTGGCAGGCGGCCTGGCCCA GCCCCTTCTCTAAGGAAGCGCATTTCCTGCCTCCCTG GGCCGGCCGGGCTGGATGAGCCAGGAGCTCCCTGC TGCCGGTCATACCACAGCCTTCATCTGCGCCCTGGG GCCAGGACTGCTGCTGTCACTGCCATCCATTGGAGC CCAGCACCCCCTCCCCGCCCATCCTTCGGACAGCAA 
CTCCAGCCCAGCCCCGCGTCCCTGTGTCCACTTCTC CTGACCCCTCGGCCGCCACCCCAGAAGGCTGGAGC AGGGACGCCGTCGCTCCGGCCGCCTGCTCCCCTCG GGTCCCCGTGCGAGCCCACGCCGGCCCCGGTGCCC GCCCGCAGCCCTGCCACTGGACACAGGATAAGGC (SEQ ID NO: 12) HHIP EC-CRE1a 136 AGCGGTGACGTCAAGGGGCGCGCTGTGGCAGCACC TCCCCGCGCGCTAGTTAAAAAGAAGAAGAAAAGAGG GAACGAAACATGAGAGGCTGTGTGAGAAGCTGCAGC CGCCGGCAGAGGAGACCTCAGCATCATCT (SEQ ID NO: 13) HHIP EC-CRE1b 574 CTGGGCGGGGGCGCGCGAGAAGCGGTGACGTCAAG GGGCGCGCTGTGGCAGCACCTCCCCGCGCGCTAGT TAAAAAGAAGAAGAAAAGAGGGAACGAAACATGAGA GGCTGTGTGAGAAGCTGCAGCCGCCGGCAGAGGAG ACCTCAGCATCATCTAGAGCCCAGCGCTGGCCCTGC CTCCGCCTGCCCCGCCGCCGCCGTCGCCGTTTCTGT TCCTGCTACTGTCCCACCTAAACAACTCCCGTTACAC GGACAAGTGAACATCTGTGGCTGTCCTCTCCTTTTCT TCCTCCTCTTCCAACTCCTTCTCCTCCTCCCACTTCC CAGCCGCAGCAGAAAGCCCCCAACCCAACTGACACT GGCACAACTGCAAACGGTGTCATCCGCACAACTTTAT CTCGCTCCTCGGGCTCCCCTAAGGCATTGGACCCAT CGCCGCGTCTTTTATTTTTTGCAAAGTTGCATCGCTG TACATATTTTTGTCCCCGCCACCTCCCTCTGTCTCTG GAGTGCCCTACAGCCCCGCAAACTCCTCCTGGAGCT GCGCCCTAGTGCCCCTGCTGGGCAGTGGCGT (SEQ ID NO: 14) HYAL2 EC-CRE1a 170 AGGGAACTCCCTGTGCTGGGCCTACCCAGCTGACCC CATCGCTGGAAACAATGGGGGTCAGGCAACACTTCC CCACTCTCTCCCGCCGGGCTGTGCTCACTTCCTTCCT GCTGGCTGCCTGAGGAAGTGTCCCTGCCCTGGGACA GTCTGGCCTAGCCTTTGTTTCCCCG (SEQ ID NO: 15) HYAL2 EC-CRE1b 470 GACAGGCTTCTGAGTGTAGGGAGCTGGTCTGCCAGT CTTTCGGAGGTTTGAACTTGTCAAGGCTAGGGCAGG ATCACCATATCCAGCCTGGACTTGCAGTTCTGTGGGG TGCCTCCCCATACCCCCATAAGATGCCAAACATGAGG CCCTGTCATCCTCCATGGTCCCCCTCTACTGGCTGTT CAAGGCCCAGGGCTCTCCCATGCCAGATAGCATCCT GTCTCCTACCACCACTGTCCCAGCCTGAGGGAACTC CCTGTGCTGGGCCTACCCAGCTGACCCCATCGCTGG AAACAATGGGGGTCAGGCAACACTTCCCCACTCTCTC CCGCCGGGCTGTGCTCACTTCCTTCCTGCTGGCTGC CTGAGGAAGTGTCCCTGCCCTGGGACAGTCTGGCCT AGCCTTTGTTTCCCCGGGGGTCCCCACCCATGGAGC TTTCAAGGCTTCTGGCCCCTGTGAAGCCAGCACA (SEQ ID NO: 16) HYAL2 EC-CRE1c 602 CAGTGGAAAAAAACGGACTCAGCTACTGGAAGTCCC CCCGACCCTCCCCCCAAGGCTAGTTCCCTTCTTGGG CACCTGCTCTGGGGGACCATCAGCTGAACGACCCCC AAGTATTTTGACTCCCAAAAGCACCACCACCTGACCC CATCCTCTCACACCCTACTGGATTTGAGGATGGGCCC CAATCCTAGGGAAGGAGTGAAGAGGTTCCCTAGTGT 
TGGAAGCTGTGGGTGTGGGGGAGATTGGCACCTGAT CCTGAGCCCATAGCCTTCCTGTCACCTGGCGCAGCT GGCGGGGCCAGATCCTACTCGGGAAGGGTGGGGAG GGCAGCCAGCCAGCAGGGCATTCTGGAGGGAAACA GGGTCAAGGCGATCTCCTCCCCCACGCCTGTTCCTG GCCCTTTCCTCTCAGGGGGCAGCAGGAAGTGAGGAG AAAGGGCTGGGATGGGAGGCGGGAGCGGATGGGAG GGAATGGGGTTTATCAAGTCCTCGGCGAGCTGCCCA ACGGGCAGCAGCTGGCGCAAGTAGCCTAGCTGGAG AGGCTCACCCCAGGAAGGAGGGAGGCCACCGACCT ACTGGGCCGACGGACTCCCACACAGGTGA (SEQ ID NO: 17) ICAM2 EC-CRE1a 265 CTTGCATAGATGGCCAGCGTTCATACTTTCTGCTTGT TTGTACAAAGTCATTCTTCTAGAGTAATTGTTGTAAAA TTGCTAGGCAAGGTGGCAGGTCTGATAAGATTTGATG ACGTAATGGCTCTTAGTCGCTAATAAGAGGCTTTTGT GGAGTGGCGTGTCACAGCCAGCGAAGGCTCAGCTCT GTGATCTTCGCCTGCCTCACTTGGGGGACCAGAAGG CAGCTTGTCTTGGAACTGCCTCATTCACAGAAGACCC CATTGAG (SEQ ID NO: 18) ICAM2 EC-CRE1b 545 TTTACCTAGCATGATCTTGGCAGTTCAAAGAGGAATG TGCCAGAAACCAAGCAAAGAAGAAAAAGAAATAAATG GAAATGGAAAGTGATCTGCTCAGAGGCCACAAAGTT GAGGGAGGAGGTTTCCAGAGTGGGATTTGGCCCAAA TGTTGCCTGAGGAAGTACGTAAAGGGGTCTCAACTCT GGCTACACAACAGAACAGCAGGACTGTGTGTGCAGC TCACGAAGTGGGTACACAGGGTAATCTGCAAGTTCTG GGTGGATCTAGCACCTGGATTGTTAAAACTTGCATAG ATGGCCAGCGTTCATACTTTCTGCTTGTTTGTACAAA GTCATTCTTCTAGAGTAATTGTTGTAAAATTGCTAGGC AAGGTGGCAGGTCTGATAAGATTTGATGACGTAATGG CTCTTAGTCGCTAATAAGAGGCTTTTGTGGAGTGGCG TGTCACAGCCAGCGAAGGCTCAGCTCTGTGATCTTC GCCTGCCTCACTTGGGGGACCAGAAGGCAGCTTGTC TTGGAACTGCCTCATTCACAGAAGACCCCAT (SEQ ID NO: 19) ICAM2 EC-CRE1c 554 ATGGCAGCTGGCAGGTGCCTTCACGTCCAGGGTTTC CAGAGAGAAAGCATCTCTCCTCCGCAGAGACCCTCC CACGCTCTCCCTCCCTCAAATTAGTGCATCTACATAG ACCGCCCTCCTTATAAACAGTCTCTCAGGGGATCCTA GCCCATTCCAAATCTACCTGTGATTGCAGAATCGCAA GGAATGTGATTTACCGCAGATCGCGGGGCGTCGTGT CTTTTAGGGGACCTGCTCACTTTGGCCACTAGGTGG CGGGCAGTGCAGCCCCTGCTCCTGTCGACCCTGAGC GTTCAGCGTTTCCGCCGCCTCCGCCCCACTCCGTAG GGGGAGCTGATGAGATGAGGTTGAGGTCCAGGAAGA CGTCAAGGGCTTGGTTTTGTAAACAACTCCATTCCTC GCTCGCTGATAAGTTTTCTAAGTGATGCATATTCACAA CCTTGTCCCATCCAAGGACCCAAGAATTAACACATTA CATAATATGGACAGCCCCCTCCTGTCCAACGGGCAT GATTTTGGGGTCTGATATTCTGTGGATCTGTGCAATA GTCAAC (SEQ ID NO: 20) IFI27 EC-CRE1a 450 AGACTTTTTTTGAAAAACGGAACATCTGCCTATCGCA 
AGGACTACTATTATTCTGAAAATCACCTTCTTCATTAG AAAGTAATATTTATCATTTTATTATAGAACTTTGATCTT ACTTCTTGTGACTTCATTCTGCGTAGAGCACACTCCC ATCCTTGAATTAAATGACAAAGCATTTTATATTAACTG ACAATGACTGATGCCATGGGCAAATCCTATTTCTGTA AATAACTGAATTTTCTTCTGGACTGCGCATGAGGGGA GAAAGATGTCTGCAGTTTCGGTTTCCTGGAAAATGAA ACCTATCTCATTTGTTGCCTGTGTCAAGGGGCAGTGC TTCAGTCGGGGTGGAGCTGCTTAAAAGGCCTGGGAT CACACCCTTTGGGAACACATCCAAGCTTAAGACGGTG AGGTCAGCTTCACATTCTCAGGAACTCTCCTTCTTTG GGT (SEQ ID NO: 21) IFI27 EC-CRE1b 570 GAGACTTTTTTTGAAAAACGGAACATCTGCCTATCGC AAGGACTACTATTATTCTGAAAATCACCTTCTTCATTA GAAAGTAATATTTATCATTTTATTATAGAACTTTGATCT TACTTCTTGTGACTTCATTCTGCGTAGAGCACACTCC
CATCCTTGAATTAAATGACAAAGCATTTTATATTAACT GACAATGACTGATGCCATGGGCAAATCCTATTTCTGT AAATAACTGAATTTTCTTCTGGACTGCGCATGAGGGG AGAAAGATGTCTGCAGTTTCGGTTTCCTGGAAAATGA AACCTATCTCATTTGTTGCCTGTGTCAAGGGGCAGTG CTTCAGTCGGGGTGGAGCTGCTTAAAAGGCCTGGGA TCACACCCTTTGGGAACACATCCAAGCTTAAGACGGT GAGGTCAGCTTCACATTCTCAGGAACTCTCCTTCTTT GGGTAAGACTGGGAGGGTGGGCAGGAGCTACCCTT CCCGTGGCCCCGGACCTTGGGTGGGCTGTGGGCTC AGGGAGCGGAGGGGAGGCCTTAAGCATCCACTCTCT GCCCGGTGTTTTTGTTC (SEQ ID NO: 22) IFI27 EC-CRE1c 513 AGGTGGGGATGAGGGGCTAAGTATGAACCAAGGAGC TAGAAATACAGCACTGGAAGCTGGAAGCAGGGGGCT TGGAGACTGGGAGCTGGAGTGCGTGTGGGCAGGGT GTGGCAGCAGCCGGCAGAGGCCATTTCCCCTTGGCA GAACATTCACCATGTGACCCTGAGCATGTCTTTGAAC TCCTCTGAGCTCCTGTTTCCTCTCCAGAGAAAAGGCT GGTAATGCCCATTCAGGGTTATGGTCAGGATTGCATA GGGTGAAACAATAGAGATTGAACACAGTAGACATGAA AGAGATGCCAGGGCTCAGCTCCCTTTGGTTTAGTTGC TTCCAGTGTGCTCTGTGGCAACACCACGGAGCCCTA GAGCTGTCTCTTTGAGCCGCTCTGAATGTGCCTCTTA CATAATCTCCTGGGCAACATCTGCTCCCCTAATGAGA TTTGCTCCCCAGCAAAGATAAGAAACTTGCCAACCAC TCCCCTGGTCCAGCATTTGGCCAAGGCAGACACTGA GG (SEQ ID NO: 23) PECAM1 EC-CRE1a 217 ACCTCACTCAATGCATGGAAGTTGACACAATGGCTCA ACATTAGCGTTGGGCTGATTCATCATTTGGCTGTTGA CACCAGCCTCTGGCCCAGCCAGGACAGAAAAAGGGC CCCTGAGGAACTTCTGGCTCTGTTCCCTCTATGGGG GAGGGGCAGTGGACTTGTGATAAGACAGGGTGTTAG GGTGAGGTGGACTTGGGGAAACAGGATATTTCTAA (SEQ ID NO: 24) TIE1 EC-CRE1a 97 GGGGGGAGGGGAGACCCCAGAACAATGTCCCCCAC CCCACCCCCCTCCTCAATAGGCGGAAGCCACTGGCT TCCTCCCTTTCCTGCCTCCTGCCTCC (SEQ ID NO: 25) TIE1 EC-CRE1b 427 GTGTGTTTGTGCCGGGGGGAGGGGAGACCCCAGAA CAATGTCCCCCACCCCACCCCCCTCCTCAATAGGCG GAAGCCACTGGCTTCCTCCCTTTCCTGCCTCCTGCCT CCTTTGTGCCAGCAAGACTGAGTACTGGAGAGAGAC AGGGGATGGGAAAAATCAGTCCAGCTGTCCCCAGGT CTGCCCTTACCATAACCTTCCCCCCACCTCAAGTGAC TCCTCCCAGGCCACACCCATCCCCAGCCTTGTGGGG GCCAGATTGGGGGGCCTAGAGGCTCAAAGGCAGAAT GAGTCCTCCCACCCCCTACCCTGCCACCCCTCCCAC CCAAGCCACCTCATTTCCTCTTCCTCCCCAGCACCGA CCCACACTGACCAACACAGGCTGAGCAGTCAGGCCC ACAGCATCTGACCCCAGGCCCAGCTCGTC (SEQ ID NO: 26) VWF EC-CRE1a 119 CTACAAAGCTTTATCAGCTTGGAGGTACTTCTAATAC CATTTCCTTTCATTGTTTCCTTTTGGTAATTAAAAGGA
GGCCAATCCCCTGTTGTGGCAGCTCACAGCTATTGT GGTGGGAA (SEQ ID NO: 27) VWF EC-CRE1b 385 CTGCCAGGAGGTCTCCCTCCAAACTCTACAAAGCTTT ATCAGCTTGGAGGTACTTCTAATACCATTTCCTTTCAT TGTTTCCTTTTGGTAATTAAAAGGAGGCCAATCCCCT GTTGTGGCAGCTCACAGCTATTGTGGTGGGAAAGGG AGGGTGGTTGGTGGATGTCACAGCTTGGGCTTTATCT CCCCCAGCAGTGGGGACTCCACAGCCCCTGGGCTAC ATAACAGCAAGACAGTCCGGAGCTGTAGCAGACCTG ATTGAGCCTTTGCAGCAGCTGAGAGCATGGCCTAGG GTGGGCGGCACCATTGTCCAGCAGCTGAGTTTCCCA GGGACCTTGGAGATAGCCGCAGCCCTCATTTGCAGG GGAAGGTATGGCCTTTGGAA (SEQ ID NO: 28)
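Because the sequences in Table 2 are printed with whitespace from line wrapping inside the table cells, a reader transcribing them may wish to rejoin each entry and compare its length with the stated size. The following Python sketch shows one way to do so for SEQ ID NO: 1 (CDH5 EC-CRE1a, stated size 76 bp). The helper names `join_wrapped`, `gc_fraction` and `matches_stated_size` are hypothetical and are not part of this disclosure.

```python
# SEQ ID NO: 1 exactly as printed in Table 2, including wrapping whitespace.
# Helper names are hypothetical and used only for this illustration.
CDH5_EC_CRE1A = (
    "GCCCTCACAAAGGAACAATAACAGGAAACCATCCCA "
    "GGGGGAAGTGGGCCAGGGCCAGCTGGAAAACCTGA "
    "AGGGG"
)


def join_wrapped(cell_text: str) -> str:
    """Remove whitespace introduced by line wrapping inside a table cell."""
    return "".join(cell_text.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an unwrapped sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def matches_stated_size(cell_text: str, stated_bp: int) -> bool:
    """Check that the entry contains only A/C/G/T and has the stated length."""
    seq = join_wrapped(cell_text)
    stray = set(seq) - set("ACGT")
    if stray:
        raise ValueError(f"unexpected characters in sequence: {sorted(stray)}")
    return len(seq) == stated_bp


if __name__ == "__main__":
    print(matches_stated_size(CDH5_EC_CRE1A, 76))
    print(round(gc_fraction(join_wrapped(CDH5_EC_CRE1A)), 2))
```

The character check is deliberately strict, so that table text accidentally mixed into a sequence is reported rather than silently counted as bases.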
[0197] Alternatively, the VISTA Enhancer Browser (http://enhancer.lbl.gov), a central resource for experimentally validated human and mouse non-coding fragments with gene enhancer activity, was also used. This provided predicted DNA elements associated with high expression in blood vessels. The predicted sequences from VISTA were selected based on validated data from mouse embryonic staining. Up to 3 VISTA sequences were selected from these validated data. However, since the DNA fragments were too large to be accommodated in a viral vector, the selected sequences were further trimmed down or separated into sub-fragments with the UCSC Genome Browser using the aforementioned criteria. This resulted in 5 CREs derived from the 3 selected VISTA sequences (Table 3).
[0198] All of the endothelial-specific CRE sequences were further validated both in vitro and in vivo to investigate their specificity and robustness in endothelial cells.
TABLE 3 The EC-CRE sequences for high expression in endothelial cells, from the VISTA Enhancer Browser.
VISTA code  EC-CRE  Size (bp)  Sequence
Hs1859  EC-CRE-V1  517  GCCATTGGCTGGTCCTTCACTGACAGCAGAAACT TGGCCAATGGCAATCAATCAGGGGGCCCGCGCT GCCTTAAATACCAGCAGAGCAAACAGCCTCAGAC AAAGCTGCGCCGTGTATCAATTACCGAGAGGCT GCGTGCTCCTCTGGGCGGAGGGAGCCGGAGCG AGCGGCCAGGGCTGCTGCCCCAGCTGATAAGG GCCCGCATTGTTCGGGGACAGCTGGCAGCCCGA TAAGGGCCTGCTCGCCCGAGATAATGGCAGTGG GCAGGCGCCTCGCGGCAGTTTAGAATTTCTTGG GTCTCCAAGAAAGGTTCTATTAAGCCCACTGACC CCAATTGAATATTAATTAGCTAATTAACGGATTTA TTGTTCCACGCCATTTCTGGAGAGGCCATTTTTTT TCGAGTGCCATTATTTTTGTAAATGATTTTTCGCA TTGTTCATAATTGAATCTTTGCAGCTGCCAGCATC TTCTGCATGATTTGGCAAAAAAAAGGAAGCAGAA GCACTTAGGGTT (SEQ ID NO: 29)
Hs2179  EC-CRE-V2  511  CCCCTAATAAACAGGAAGGCATCCGCGCCATTA GTATCCATCCTTTTCAGAGCATCTGAGACCTGTC TGGACCATCAAAGCCATCCCCAGCCCCCAGGAG CCTACTGGAGGAGACACCAGCCTCGCCAAAACA ATTCTCCATTGTGTTCTTCCCCTTAGAAATCATGG GTTTGTAAACAGGCCCTTACATTTCAGCAGGTCC TGCCCTGGCTTTGTGCTGGTGTGTTGTTTTTTCTT CCCTGAACAATGTCCTTTCCAGTAGGGCCAGCC GTTCACACCATTGTCTGAGACCCTTGGACTACAG GAAACATCACCAGATTCTTATCAGTTGGGGGCAG GAGTGGGGGGGTGAACAGATGAGATCATGTCCA CAGAGCAAGTGGCTGGTGTGCCACGTCATTCCC CATGCCTTCATCTGTGAGAGCAGAGCCCGCTCG CCCTCCATCAATCTGGGCTTCATGTGTCCAGAGT CCAGTCCCCTCTATTTGGTGGTAGACACCTGGAC TCTGTT (SEQ ID NO: 30)
Hs2179  EC-CRE-V3  390  TCCTACAGTTATGGGTCTGTCTTTCCTGCTAGAT CAGAAGCTCCAGGGACCTGTCTGTCTTATTAACC TTTCTTCCCCTCATGATGCCTGGCCCAGGACTCC ACGTTCAGAGGCAGTTTAATGTCTACAGAACTGA TGGATGCTCCATACCCTGTATTCATAAGCCTGTG TTTTGCTGCCAAACACCAGAGGGCACTGTTAGCA TGTCGATGAAGATTATAAACCCTCAGACCTGGAA GGCTGGGAAAGGCTTATGAAAATCTTGCTTCTGT TTTGGGATTACATAAGACTTATTGCGCCTCAATTG TTCTAAACACATCTGTTCGAGTTTATTCATGAGGC ACGTTCCTGTTGGGGTTAGAGATGAGTTTGAAAG CTCCCCGTCACAGG (SEQ ID NO: 31)
Hs1882  EC-CRE-V4  573  AGGGACTGTTGGGCAGCCCCAGACTGGCACAGG TGGATCGGGTGCCTAGGCAGGGGGTGGTGAGTT ATGGCGCAGCTGTCTTGGTGGCTGGGGGGAGCA GGGATAAGGGTGGACTTCTTAGTGACCGCTCTCT GCCCCAGGAGGTAGAGTCCTGGGGGCTGGGCT GGCCTGAGAGACGCCCCCTCATCCTTTCCAGGG TGAGGTACGAGGGCTCCGCCCCCTCCTGATATC ACCAGGCCTAGGGCAGCATCCTGATGGGGGAG GGGCAAGTGACCCGGGCCCTGGACTGCAGGAA CAGCCCCTCCTCCACTGGTGGAGTTCCCACTTC CTGCGGAAGGAACTATGTTAGAAGTTGTGTATAT GGGGTGGGGGTTGGGTGTGGGTGGCGGGGGG CCTGGGTGGGGTCCACTGAGTCGCCTCCCCTGT CTCCCTGCACTTCCTCCTGGAGGAAATGGGGAC AACAGGATGAAGTGAGGGCCTGCTGAGCCCAGG GCTGCCACCTGGGAGTGAAGCCGGGGCAGGCT GCAGGGTCCGGGCCCTTCTGTGTGGGCAGGTG GAAGTGGTGGGGATGCA (SEQ ID NO: 32)
Hs1882  EC-CRE-V5  441  GGGGAGAGGGTGTGGGGTGGGGTGGGGAGAG GGTGTGGGGTGGGGTGGGGAGAGGGGATGGGA TGGCATGGGGGGATGTGGCAGTGAGGAGGCTG GGCCCTTGGAGCTGCCGAGTGCAGGGGCCTGG AGGACTCCGGGAAGGCGTCCTAGTGCATCAAGC GTGGGCTTGGCCTGCTTGGGTCTCCCCTCCTGG CCCCCCTAGCAATGGGCGGACTTGGGCCCGCTC TGGGAGGATTCCAGGAACGGCTCCTGCCTGGTT ATAAATAGACTTCTCCGAAAGGCCTGGGGCTGTG CCAGCTGCAGCAGGTGCCTCCCAGGCCCGGCC AGAGGGCCCCAGGCAAGGGGGTGGAGCCCGGG TGGGGGTGATGAGGATGCTGGGGTCCACTTTTG TAGCGCCAGAGGCGACGGGCTCTGTCTGGTTGT AGCATCACAGAGCTTGAT (SEQ ID NO: 33)
Example 2: Lentiviral Transduction of Human Endothelial Cells In Vitro
[0199] To validate the potential endothelial cell-specific CREs, we first validated a robust endothelial-specific promoter. The selected endothelial cell-specific CREs were cloned upstream of this promoter. We identified several human endothelial-specific promoters, such as the human CDH5 promoter (1,303 bp) and the human EDN1 promoter (455 bp). The sequence of the CDH5 promoter was obtained from Genecopoeia (http://www.genecopoeia.com); for the EDN1mini promoter, we selected the promoter sequence using the same approach as used to select the CREs from UCSC. In addition, we identified several endothelial promoters that are commercially available (Invivogen, USA), such as the ENG promoter (888 bp), the FLT1 promoter (1,037 bp), and the ICAM2 promoter (399 bp). The sequences of these promoters are provided in Table 4. The endothelial-specific promoters were cloned into the lentiviral vector plasmid upstream of the FVIII or GFP reporter gene (FIG. 2). Nine different lentiviral constructs were generated, designated as:
[0200] 1) pLVX-CMV-GFP (SEQ ID NO:40),
[0201] 2) pLVX-CMV-FVIII (SEQ ID NO:41),
[0202] 3) pLVX-hEDN1mini-FVIII (SEQ ID NO:42),
[0203] 4) pLVX-hEDN1mini-Kozak-FVIII (SEQ ID NO:43),
[0204] 5) pLVX-hICAM2-FVIII (SEQ ID NO:44),
[0205] 6) pLVX-hICAM2-Kozak-FVIII (SEQ ID NO:45),
[0206] 7) pLVX-hICAM2-Kozak-Luc2 (SEQ ID NO:46),
[0207] 8) pLVX-EC-CRE-h(uman)ICAM2-Kozak-Luc2 (SEQ ID NO:47),
[0208] 9) pLVX-EC-CRE-h(uman)ICAM2-Kozak-FVIII (SEQ ID NO:48), by conventional cloning. pLVX can also be a pCDH backbone or another (lenti)viral backbone.
TABLE-US-00004 TABLE 4 Endothelial-specific promoter sequences Host Size Promoter Species (bp) Sequence Source CDH5 Human 1303 GCTTGCCCAGCTATATAATAAAACAAGTTTGGGACTTCC Genecopoeia CAACCATTCACCCATGGAAAAACAGAAGCAACTCTTCAA AGGACAGATTCCCAGGATCTGCCCTGGGAGATTCCAAA TCAGTTGATCTGGGGTGAGCCCAGTCCTCTGTAGTTTTT AGAAGCTCCTCCTATGTCTCTCCTGGTCAGCAGAATCTT GGCCCCTCCCTTCCCCCCAGCCTCTTGGTTCTTCTGGG CTCTGATCCAGCCTCAGCGTCACTGTCTTCCACGCCCC TCTTTGATTCTCGTTTATGTCAAAAGCCTTGTGAGGATG AGGCTGTGATTATCCCCATTTTACAGATGAGGAAACTGT GGCTCCAGGATGACACAACTGGCCAGAGGTCACATCAG AAGCAGAGCTGGGTCACTTGACTCCACCCAATATCCCT AAATGCAAACATCCCCTACAGACCGAGGCTGGCACCTT AGAGCTGGAGTCCATGCCCGCTCTGACCAGGAGAAGC CAACCTGGTCCTCCAGAGCCAAGAGCTTCTGTCCCTTT CCCATCTCCTGAAGCCTCCCTGTCACCTTTAAAGTCCAT TCCCACAAAGACATCATGGGATCACCACAGAAAATCAA GCTCTGGGGCTAGGCTGACCCCAGCTAGATTTTTGGCT CTTTTATACCCCAGCTGGGTGGACAAGCACCTTAAACC CGCTGAGCCTCAGCTTCCCGGGCTATAAAATGGGGGTG ATGACACCTGCCTGTAGCATTCCAAGGAGGGTTAAATG TGATGCTGCAGCCAAGGGTCCCCACAGCCAGGCTCTTT GCAGGTGCTGGGTTCAGAGTCCCAGAGCTGAGGCCGG GAGTAGGGGTTCAAGTGGGGTGCCCCAGGCAGGGTCC AGTGCCAGCCCTCTGTGGAGACAGCCATCCGGGGCCG AGGCAGCCGCCCACCGCAGGGCCTGCCTATCTGCAGC CAGCCCAGCCCTCACAAAGGAACAATAACAGGAAACCA TCCCAGGGGGAAGTGGGCCAGGGCCAGCTGGAAAACC TGAAGGGGAGGCAGCCAGGCCTCCCTCGCCAGCGGG GTGTGGCTCCCCTCCAAAGACGGTCGGCTGACAGGCT CCACAGAGCTCCACTCACGCTCAGCCCTGGACGGACA GGCAGTCCAACGGAACAGAAACATCCCTCAGCCCACAG GCACGGTGAGTGGGGGCTCCCACACTCCCCTCCACCC CAAACCCGCCACCCTGCGCCCAAGATGGGAGGGTCCT CAGCTTCCCCATCTGTAGAATGGGCATCGTCCCACTCC CATGACAGAGAGGCTC (SEQ ID NO: 34) EDN1 Human 455 GAGACATAAAAGGAAAATGAAGCGAGCAACAATTAAAAA UCSC AAATTCCCCGCACACAACAATACAATCTATTTAAACTGT GGCTCATACTTTTCATACCAATGGTATGACTTTTTTTCTG GAGTCCCCTCTTCTGATTCTTGAACTCCGGGGCTGGCA GCTTGCAAAGGGGAAGCGGACTCCAGCACTGCACGGG CAGGTTTAGCAAAGGTCTCTAATGGGTATTTTCTTTTTCT TAGCCCTGCCCCCGAATTGTCAGACGGCGGGCGTCTG CCTCTGAAGTTAGCAGTGATTTCCTTTCGGGCCTGGCC TTATCTCCGGCTGCACGTTGCCTGTTGGTGACTAATAAC ACAATAACATTGTCTGGGGCTGGAATAAAGTCGGAGCT GTTTACCCCCACTCTAATAGGGGTTCAATATAAAAAGCC GGCAGAGAGCTGTCCAAGTCAGACGCGCCTC (SEQ ID NO: 35) ENG 
Human 888 CGCCTTGCTGTGCCACTTTGGGACTTCCCTCCCTAGCC Invivogen TGAGCTTCAGTTTTCCTGCCTGTTAGGCAGCCCCATGT Inc. CAACTGCACTTAGTAGGCCGGGTTTGATGCCCGACAAG ACGTGAAGTGGTGGAGGTGGGCAGGATCCCAGCGCTA CCATCTTCTTGAACCAGTGATCTCAACACATCGGATTTC TGTTTCCTCATCTGCAAAATGGGATCAGTGAGCTCAGGT GGGTCACAAATTCTACAGGAACTACTTTAGCCAAGCCC GGCCCCCTGAAAGTTCCCCTCGGTGGGCTGTTAGGGT GATTGTTTTCATCTGTGGGGCTCCCTGATGCGTCCCAC CCACCAGCCTTGGAGAGGGTGGGATGGGAGGGTGGG GTGCTTGGGGAGACAAGCCTAGAGCCTGGGCCCTCCC ACCCCACTGCCTCCCCCCATCCCAGGGCCCCCCACCC AGTGACAAAGCCCGTGGCACTTCCTCTACCCGGTTGGC AGGCGGCCTGGCCCAGCCCCTTCTCTAAGGAAGCGCA TTTCCTGCCTCCCTGGGCCGGCCGGGCTGGATGAGCC GGGAGCTCCCTGCTGCCGGTCATACCACAGCCTTCATC TGCGCCCTGGGGCCAGGACTGCTGCTGTCACTGCCAT CCATTGGAGCCCAGCACCCCCTCCCCGCCCATCCTTCG GACAGCAACTCCAGCCCAGCCCCGCGTCCCTGTGTCC ACTTCTCCTGACCCCTCGGCCGCCACCCCAGAAGGCTG GAGCAGGGACGCCGTCGCTCCGGCCGCCTGCTCCCCT CGGGTCCCCGTGCGAGCCCACGCCGGCCCCGGTGCC CGCCCGCAGCCCTGCCACTGGACACAGGATAAGGCCC AGCGCACAGGCCCCCACGTGGACACC (SEQ ID NO: 36) FLT1 Human 1037 TTTGCTTCTAGGAAGCAGAAGACTGAGGAAATGACTTG Invivogen GGCGGGTGCATCAATGCGGCCAAAAAAGACACGGACA Inc. 
CGCTCCCCTGGGACCTGAGCTGGTTCGCAGTCTTCCCA AAGGTGCCAAGCAAGCGTCAGTTCCCCTCAGGCGCTCC AGGTTCAGTGCCTTGTGCCGAGGGTCTCCGGTGCCTTC CTAGACTTCTCGGGACAGTCTGAAGGGGTCAGGAGCG GCGGGACAGCGCGGGAAGAGCAGGCAAGGGGAGACA GCCGGACTGCGCCTCAGTCCTCCGTGCCAAGAACACC GTCGCGGAGGCGCGGCCAGCTTCCCTTGGATCGGACT TTCCGCCCCTAGGGCCAGGCGGCGGAGCTTCAGCCTT GTCCCTTCCCCAGTTTCGGGCGGCCCCCAGAGCTGAG TAAGCCGGGTGGAGGGAGTCTGCAAGGATTTCCTGAG CGCGATGGGCAGGAGGAGGGGCAAGGGCAAGAGGGC GCGGAGCAAAGACCCTGAACCTGCCGGGGCCGCGCTC CCGGGCCCGCGTCGCCAGCACCTCCCCACGCGCGCTC GGCCCCGGGCCACCCGCCCTCGTCGGCCCCCGCCCCT CTCCGTAGCCGCAGGGAAGCGAGCCTGGGAGGAAGAA GAGGGTAGGTGGGGAGGCGGATGAGGGGTGGGGGAC CCCTTGACGTCACCAGAAGGAGGTGCCGGGGTAGGAA GTGGGCTGGGGAAAGGTTATAAATCGCCCCCGCCCTC GGCTGCTCTTCATCGAGGTCCGCGGGAGGCTCGGAGC GCGCCAGGCGGACACTCCTCTCGGCTCCTCCCCGGCA GCGGCGGCGGCTCGGAGCGGGCTCCGGGGCTCGGGT GCAGCGGCCAGCGGGCGCCTGGCGGCGAGGATTACC CGGGGAAGTGGTTGTCTCCTGGCTGGAGCCGCGAGAC GGGCGCTCAGGGCGCGGGGCCGGCGGCGGCGAACAA GAGGACGGACTCTGGCGGCCGGGTCGTTGGCCGCGG GGAGCGCGGGCACCGGGCGAGCAGGCCGCGTCGCGC TCACC (SEQ ID NO: 37) ICAM2 Human 399 GTCTCCCAGGCATGACTCCAACAATGCATCCCATGGGA Invivogen TTTGGGGTTCCCCAGATCTGGGGCTTGTAGGCCTGACT Inc. CTCCCCTGTGCACACGTCTCATACACGCATGCGTGCAC CCATTGCCTGCCCCGCCCCTTGCACAGGGAGTCAGCA GGGAGGACTGGGTTATGCCCTGCTTATCAGCAGCTTCC CAGCTTCCTCTGCCTGGATTCTTAGAGGCCTGGGGTCC TAGAACGAGCTGGTGCACGTGGCTTCCCAAAGATCTCT CAGATAATGAGAGGAAATGCAGTCATCAGTTTGCAGAA GGCTAGGGATTCTGGGCCATAGCTCAGACCTGCGCCC ACCATCTCCCTCCAGGCAGCCCTTGGCTGGTCCCTGCG AGCCCGTGGAGACTGCCAGTC (SEQ ID NO: 38)
[0209] The Kozak consensus sequence is present in eukaryotic mRNA and is known to improve expression by enhancing translation initiation. Consequently, we introduced the Kozak consensus sequence (i.e. GCCACC, SEQ ID NO:39) upstream of the FVIII or Luc2 gene within the lentiviral vector plasmids.
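The placement of the Kozak consensus sequence can be illustrated with a minimal sketch. It assumes that the GCCACC motif is placed directly upstream of the ATG start codon of the transgene; the function names and the short placeholder sequences are hypothetical and are not taken from the constructs listed above.

```python
# Kozak consensus as given in the text (SEQ ID NO:39).
KOZAK = "GCCACC"


def add_kozak(promoter: str, cds: str) -> str:
    """Join promoter and coding sequence with the Kozak motif
    placed directly upstream of the start codon (assumed layout)."""
    promoter = promoter.upper()
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must begin with an ATG start codon")
    return promoter + KOZAK + cds


def has_kozak(cassette: str) -> bool:
    """Return True if a Kozak motif immediately precedes an ATG."""
    return (KOZAK + "ATG") in cassette.upper()


if __name__ == "__main__":
    # Placeholder sequences for illustration only (hypothetical).
    toy_promoter = "ttgacc"
    toy_cds = "ATGCAA"
    cassette = add_kozak(toy_promoter, toy_cds)
    print(cassette, has_kozak(cassette))
```

The check in `has_kozak` only looks for the six-nucleotide motif followed by ATG; it does not assess the wider sequence context around the start codon.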
[0210] HUVECs or LSECs were transduced at a multiplicity of infection (MOI) of 50. Culture medium was collected at 24, 48, and 72 hrs and FVIII levels were subsequently measured in the conditioned medium using a human FVIII-specific ELISA, according to the manufacturer's instructions (Asserachrome). Flow cytometry showed that nearly 90% of HUVECs and LSECs were transduced, compared to non-transduced HUVECs (FIG. 3). Relatively robust FVIII expression could be achieved in transduced HUVECs and LSECs with the CMV, ICAM2 and EDN1mini promoters. The Kozak translational consensus sequence significantly enhanced FVIII expression levels (FIG. 4). In particular, the Kozak consensus-optimized ICAM2 construct yielded the highest FVIII expression in both HUVECs and LSECs. These results confirmed the robustness of the selected promoters in endothelial cells.
Example 3: In Vitro Validation of Endothelial-Specific (EC) CREs in Transfected HUVECs
[0211] To validate whether the different EC-CREs identified by genome-wide computational analysis led to enhanced FVIII expression when coupled to an EC-specific human ICAM2 promoter, human umbilical vein endothelial cells (HUVECs) were transfected in vitro with the corresponding lentiviral vector constructs: pCDH-EC-CRE-ICAM2-FVIII, with EC-CRE representing the respective regulatory elements named in FIG. 5, which were cloned upstream of the ICAM2 promoter driving the human FVIII gene in a pCDH lentiviral self-inactivating backbone. FIG. 8 shows 4 examples of such expression cassettes comprising HYAL2-EC-CRE1a (FIG. 8a--SEQ ID NO. 50), HYAL2-EC-CRE1b (FIG. 8d--SEQ ID NO. 51) and IFI27-EC-CRE1b (FIG. 8c--SEQ ID NO. 52). The control vector (without EC-CRE) is depicted in FIG. 8b (SEQ ID NO. 49). Analogously, the person skilled in the art would be capable of cloning the other EC-CREs in a similar manner in the expression vector backbone.
[0212] HUVECs were seeded at 1.5×10⁵ cells/well in 6-well plates and transfected 24 hr later with cationic lipid-based Lipofectamine 3000 (Invitrogen, USA). For HUVECs, 2.5 microgram of each plasmid was mixed with 3.75 microliter of Lipofectamine reagent. The P3000 reagent was mixed with each plasmid at 2 microliter per 1 microgram of plasmid and incubated at room temperature for 5 mins before adding to the HUVECs. Sixteen hours after transfection, the cell culture medium was removed and replaced with fresh medium. Seventy-two hours later, 100 microliter of the culture medium was collected and stored at -80°C for FVIII quantification using a human FVIII-specific ELISA (Asserachrome). The results showed that about 70% of the EC-CREs (i.e. 23 out of 32) resulted in increased FVIII expression in transfected HUVECs in vitro, relative to the control lentiviral vector without CRE (FIG. 5).
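The per-well reagent amounts and the reported hit rate follow from simple arithmetic, sketched below. The constants are the values stated in the paragraph above; the function names are hypothetical and the calculation is an illustration only, not the manufacturer's protocol.

```python
# Values stated in the text (per well of a 6-well plate).
PLASMID_UG = 2.5            # microgram of plasmid
LIPOFECTAMINE_UL = 3.75     # microliter of Lipofectamine reagent
P3000_UL_PER_UG = 2.0       # microliter of P3000 per microgram of plasmid


def p3000_volume_ul(plasmid_ug: float) -> float:
    """P3000 volume needed for a given plasmid mass."""
    return plasmid_ug * P3000_UL_PER_UG


def lipofectamine_ratio_ul_per_ug(reagent_ul: float, plasmid_ug: float) -> float:
    """Lipofectamine volume used per microgram of plasmid."""
    return reagent_ul / plasmid_ug


def percent(hits: int, total: int) -> float:
    """Share of elements that increased expression, in percent."""
    return 100.0 * hits / total


if __name__ == "__main__":
    print(p3000_volume_ul(PLASMID_UG))                                  # P3000 per well
    print(lipofectamine_ratio_ul_per_ug(LIPOFECTAMINE_UL, PLASMID_UG))  # microliter per microgram
    print(percent(23, 32))                                              # the "about 70%" in the text
```

With the stated amounts, 2.5 microgram of plasmid corresponds to 5 microliter of P3000, and 23 of 32 elements is roughly 72%, in line with the "about 70%" reported.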
Example 4: In Vivo Validation of Endothelial-Specific (EC) CREs Following Lentiviral Transduction in Mice
[0213] Next, it was validated whether the EC-CREs identified by genome-wide computational analysis led to enhanced FVIII expression in mice in vivo. Self-inactivating lentiviral vectors were used to express the human codon usage optimized B-domain deleted FVIII from an EC-specific human ICAM2 promoter. To test the impact of the EC-CRE on FVIII expression, the HYAL2-EC-CRE1a (SEQ ID NO. 50), HYAL2-EC-CRE1b (SEQ ID NO. 51) and IFI27-EC-CRE1b (SEQ ID NO. 52) elements were cloned upstream of the ICAM2 promoter driving the human FVIII gene. Said expression cassettes comprise respectively the EC-CREs HYAL2-EC-CRE1a, HYAL2-EC-CRE1b and IFI27-EC-CRE1b. Analogously, the person skilled in the art would be capable of cloning the other EC-CREs in a similar manner in an expression vector backbone. In these examples the pCDH-ICAM2-FVIII backbone is used. A lentiviral vector identical in design but without any upstream EC-CRE was used as control (SEQ ID NO. 49) to compare FVIII expression levels. Lentiviral vector particles were manufactured by transient cotransfection of HEK293 packaging cells with lentiviral vector and helper plasmids (Cyagen, USA). Vector titer was determined and expressed in Transducing Units per ml (TU/ml). Lentiviral vectors were retro-orbitally injected in 2-day-old neonatal CB17-SCID mice (Taconic). The vector preparation was supplemented with 40 microgram/ml polybrene in a total volume of 80 microliter. A total vector dose of 1×10⁸ TU was used. Plasma was collected 5 weeks post-injection and FVIII was measured using a human FVIII-specific ELISA.
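The dosing figures above imply a minimum vector concentration, which can be worked out as follows. This is a sketch under the assumption that the full 1×10⁸ TU dose is delivered in the 80 microliter injection volume; the stock titer used in the last function call is a hypothetical value chosen for illustration and is not reported in the text.

```python
# Values stated in the text.
TOTAL_DOSE_TU = 1e8           # transducing units per animal
INJECTION_VOLUME_UL = 80.0    # total injected volume, microliter
POLYBRENE_UG_PER_ML = 40.0    # polybrene concentration in the preparation


def required_titer_tu_per_ml(dose_tu: float, volume_ul: float) -> float:
    """Concentration the injected preparation must have to carry the dose."""
    return dose_tu / (volume_ul / 1000.0)


def polybrene_mass_ug(conc_ug_per_ml: float, volume_ul: float) -> float:
    """Mass of polybrene contained in the injected volume."""
    return conc_ug_per_ml * volume_ul / 1000.0


def stock_volume_ul(dose_tu: float, stock_titer_tu_per_ml: float) -> float:
    """Volume of a vector stock that contains the requested dose."""
    return dose_tu / stock_titer_tu_per_ml * 1000.0


if __name__ == "__main__":
    print(required_titer_tu_per_ml(TOTAL_DOSE_TU, INJECTION_VOLUME_UL))
    print(polybrene_mass_ug(POLYBRENE_UG_PER_ML, INJECTION_VOLUME_UL))
    # Hypothetical stock titer of 5e9 TU/ml, for illustration only.
    print(stock_volume_ul(TOTAL_DOSE_TU, 5e9))
```

Under these assumptions the injected preparation needs a titer of about 1.25×10⁹ TU/ml and contains about 3.2 microgram of polybrene.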
[0214] A significant increase in FVIII expression was detected in vivo when the EC-CREs were present, compared to a control lentiviral vector without EC-CRE (FIG. 6). In particular, a 27-fold increase in FVIII could be detected following in vivo transduction with lentiviral vectors containing the IFI27-EC-CRE1b compared to a control lentiviral vector without EC-CRE (ICAM2-FVIII). Similarly, the HYAL2-EC-CRE1a also boosted FVIII expression, but to a lesser extent (5-fold). This is consistent with the increased FVIII expression following in vitro HUVEC transfection with the IFI27-EC-CRE1b and HYAL2-EC-CRE1a vectors compared to controls without EC-CRE (ICAM2-FVIII) (FIG. 5). HYAL2-EC-CRE1b did not increase FVIII expression in vivo, consistent with the lower levels of FVIII expression in transfected HUVECs in vitro (FIG. 5).
Example 5: Confirmation of Increased Gene Expression by EC-CRE in Organ-Derived Endothelial Cells Isolated from Lentivirally Transduced Mice
[0215] The mice were injected with the lentiviral vectors containing the ICAM2-FVIII (no CRE control--SEQ ID NO. 49) or IFI27-EC-CRE1b-ICAM2-FVIII (SEQ ID NO. 52) expression cassette. After euthanasia, the liver and spleen were processed to obtain their respective endothelial populations (i.e. liver sinusoidal endothelial cells, splenic endothelial cells). First, a single cell suspension was obtained from the liver and spleen tissue using the GentleMACS dissociator (Product no--130-093-235, Miltenyi Biotec) according to the manufacturer's protocols for liver (Liver Dissociation Kit: Product code: 130-105-807; Miltenyi Biotec.--http://www.miltenyibiotec.com/en/products-and-services/macs-sample-preparation/sample-dissociation/tissue-dissociation-kits/liver-dissociation-kit-mouse.aspx) and spleen (Spleen Dissociation Kit: Product code: 130-095-926; Miltenyi Biotec.--http://www.miltenyibiotec.com/en/products-and-services/macs-sample-preparation/sample-dissociation/tissue-dissociation-kits/liver-dissociation-kit-mouse.aspx).
[0216] Subsequently, the single cell suspension from each organ was subjected to MACS cell separation technology (Miltenyi Biotec) to sort out the respective endothelial populations. The single cell suspensions obtained from liver and spleen were therefore tagged with CD146 microbeads, according to the manufacturer's instructions (Product code: 130-092-007; Miltenyi Biotec.--http://www.miltenyibiotec.com/en/products-and-services/macs-sample-preparation/sample-dissociation/tissue-dissociation-kits/liver-dissociation-kit-mouse.aspx), allowing positive selection of the respective liver-derived and splenic endothelial cells. Between 1.0×10⁶ and 1.6×10⁶ CD146-positive endothelial cells were obtained from the liver and 6.9-8.1×10⁵ CD146-positive endothelial cells were obtained from the spleen. The isolated cells were plated at a density of 25,000 cells/well in 48-well plates in 200 microliter of Endothelial Basal Medium supplemented with growth factors.
[0217] Total RNA was isolated from the cells using the RNeasy Micro Kit (Qiagen) according to the manufacturer's instructions. Isolated RNA concentrations were measured using a Nanodrop 1000 (Thermo Scientific, MA, USA). Complementary DNA (cDNA) was synthesized from 35-75 ng of isolated RNA using the Superscript III First-Strand synthesis system (Invitrogen) according to the manufacturer's instructions. The qRT-PCR was performed using SYBR Green qPCR mix (Life Technologies) in a qPCR ABI Prism 7900HT (Applied Biosystems, Foster City/CA, USA) using the FVIII-specific primers 5'-AACGGCTACGTGAACAGAAG-3' (forward--SEQ ID NO. 53) and 5'-GATAGGGCTGATTTCCAGGC-3' (reverse--SEQ ID NO. 54). The expression levels were normalized to GAPDH (glyceraldehyde-3-phosphate dehydrogenase) mRNA expression, obtained by using the forward primer 5'-GAAGGTGAAGGTCGGAGTC-3' (SEQ ID NO. 55) and reverse primer 5'-GAAGATGGTGATGGGATTTC-3' (SEQ ID NO. 56).
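Normalization of FVIII mRNA to GAPDH can be sketched as below. The text does not state which calculation was used; this example assumes the widely used comparative Ct approach (2^-ΔΔCt) with roughly 100% amplification efficiency for both primer pairs. The function names and all Ct values are hypothetical and are not data from the experiments described here.

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Ct of the target gene (FVIII) minus Ct of the reference gene (GAPDH)."""
    return ct_target - ct_reference


def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of the sample over the control by the 2^-ddCt method
    (assumed method; assumes ~100% PCR efficiency)."""
    ddct = (delta_ct(ct_target_sample, ct_ref_sample)
            - delta_ct(ct_target_control, ct_ref_control))
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical Ct values for illustration only.
    fold = relative_expression(
        ct_target_sample=24.0, ct_ref_sample=18.0,    # EC-CRE vector
        ct_target_control=27.0, ct_ref_control=18.0,  # control vector without EC-CRE
    )
    print(fold)
```

In this toy case the sample reaches threshold three cycles earlier than the control at equal GAPDH Ct, which corresponds to an eight-fold higher normalized FVIII mRNA level.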
[0218] The results showed that the IFI27-EC-CRE1b element enhanced FVIII expression in CD146-positive endothelial cells obtained from liver or spleen, as reflected by increased FVIII mRNA levels in mice injected with the lentiviral vector containing the IFI27-EC-CRE1b-ICAM2-FVIII cassette compared to the ICAM2-FVIII control (FIG. 7).
Sequence Listing
56176DNAHomo sapiens 1gccctcacaa aggaacaata acaggaaacc atcccagggg
gaagtgggcc agggccagct 60ggaaaacctg aagggg
762277DNAHomo sapiens 2ggccgaggca gccgcccacc
gcagggcctg cctatctgca gccagcccag ccctcacaaa 60ggaacaataa caggaaacca
tcccaggggg aagtgggcca gggccagctg gaaaacctga 120aggggaggca gccaggcctc
cctcgccagc ggggtgtggc tcccctccaa agacggtcgg 180ctgacaggct ccacagagct
ccactcacgc tcagccctgg acggacaggc agtccaacgg 240aacagaaaca tccctcagcc
cacaggcacg gtgagtg 2773408DNAHomo sapiens
3gcttcctcct ctgctactaa tctggtctca cagaccatcc catttcctgc tagcccacca
60gccgccttcc ttgctcccaa tgacacttcc tggccttgtg ccctcctgtt acctcctttg
120cctccagaga ggttggagca gaggctgggc agtgccagaa atcaggcatg aaatcctcag
180ggggaccaag gaggcaccag cctccctccc acagtctcag ctacctctgc tacggtgacc
240cccagcccca cccctggggc ccacagctca tgcctggctc accattcctt tgtttatgga
300ccacaggaac agtcgttttc agggcagagt caacttcctc atggactggg agtacaaagg
360gaattggcag atggtgccag gacaggccct gtccccatct gccacagc
4084173DNAHomo sapiens 4gcttcctcct ctgctactaa tctggtctca cagaccatcc
catttcctgc tagcccacca 60gccgccttcc ttgctcccaa tgacacttcc tggccttgtg
ccctcctgtt acctcctttg 120cctccagaga ggttggagca gaggctgggc agtgccagaa
atcaggcatg aaa 1735307DNAHomo sapiens 5cccagctgag ggctggtgcc
agagccgtgt ctgcttgccc catcaagagg tgggagggat 60tgatccacct tcctgcccca
cagatggtgc agcctccaac ctattgtttt ccaggacgct 120tcggtggaga gcacaaggaa
tgtagggtct agaaacagga agccctggct tccgctggac 180aaggtttcct ccagactcag
gcctgccctc cagacaacaa ggcagggccc ttggtcccac 240cctgccctgc ctggctcact
gggccacccc aaggaaggcc ttgccctctc tgggcttctg 300catgtga
3076450DNAHomo sapiens
6aggtatccac caaggggccc aagagctgct gagcccctga gcagccctac caatttcagc
60ttatggtggt gaggggtagg ggaggtgtat actggcctgg aaggggttaa gctgcccgcc
120tgcagcctca gcctgagcta ttgtgttgcc aaacaagggc cggacatgag ggcaggaagc
180cagcaggggc cacacatttt ctgcaaagtt ggatgattca ctgctgactt ggggacaccc
240aggggacaga ggggacacca tcccaggaag aatcttaggc tcattttgcc cacatggacc
300catgactgtt ccctgtatcc tctctctgca ccccctcagt cacactgaag caactatgag
360aattcccatt tgacagatgg gaccatcgag gctgagggaa gctgtgcagc cagtccaagg
420tcacacaacc aacacaaggt agaagcaggg
4507123DNAHomo sapiens 7agggcccctg gagctggtcc caatgtgttt ccttctattc
ttttgacagg aagctcctgg 60agagccagtc cccaccccca tcccgcccca gcactccctc
tctcttctcc actatggaca 120gag
1238499DNAHomo sapiens 8tagggcctct ctgaaagatg
tggggagtcc tatctgcatt gggatccctg aggagggaga 60ggaatgtgga gaattcaggg
tccagggagc atgggtgact ggtgggctgg gcttccaggc 120tgaatcatgg gaaaggagaa
cctggtctga aacagtactg ggcgggattg gtgttagatt 180ccaggaaaac ccccaggcgg
tctgtggtgg aacctgatgg accctcagaa gggaagagaa 240tggggatggg gccaggttgc
catggttggt cattgtgcat aggcactaga ggccatgctg 300ggtgggcaca gtcgctgctg
cagcctcaca tcctcatctg gacatggctg agcagggccc 360ctggagctgg tcccaatgtg
tttccttcta ttcttttgac aggaagctcc tggagagcca 420gtccccaccc ccatcccgcc
ccagcactcc ctctctcttc tccactatgg acagagcctc 480cactgagctg ctgcctgcc
4999455DNAHomo sapiens
9gagacataaa aggaaaatga agcgagcaac aattaaaaaa aattccccgc acacaacaat
60acaatctatt taaactgtgg ctcatacttt tcataccaat ggtatgactt tttttctgga
120gtcccctctt ctgattcttg aactccgggg ctggcagctt gcaaagggga agcggactcc
180agcactgcac gggcaggttt agcaaaggtc tctaatgggt attttctttt tcttagccct
240gcccccgaat tgtcagacgg cgggcgtctg cctctgaagt tagcagtgat ttcctttcgg
300gcctggcctt atctccggct gcacgttgcc tgttggtgac taataacaca ataacattgt
360ctggggctgg aataaagtcg gagctgttta cccccactct aataggggtt caatataaaa
420agccggcaga gagctgtcca agtcagacgc gcctc
45510153DNAHomo sapiens 10tgtccacttc tcctgacccc tcggccgcca ccccagaagg
ctggagcagg gacgccgtcg 60ctccggccgc ctgctcccct cgggtccccg tgcgagccca
cgccggcccc ggtgcccgcc 120cgcagccctg ccactggaca caggataagg ccc
15311125DNAHomo sapiens 11gggcccccca cccagtgaca
aagcccgtgg cacttcctct acccggttgg caggcggcct 60ggcccagccc cttctctaag
gaagcgcatt tcctgcctcc ctgggccggc cgggctggat 120gagcc
12512498DNAHomo sapiens
12gggatgggag ggtggggtgc ttggggagac aagcctagag cctgggccct cccaccccac
60tgcctccccc catcccaggg ccccccaccc agtgacaaag cccgtggcac ttcctctacc
120cggttggcag gcggcctggc ccagcccctt ctctaaggaa gcgcatttcc tgcctccctg
180ggccggccgg gctggatgag ccaggagctc cctgctgccg gtcataccac agccttcatc
240tgcgccctgg ggccaggact gctgctgtca ctgccatcca ttggagccca gcaccccctc
300cccgcccatc cttcggacag caactccagc ccagccccgc gtccctgtgt ccacttctcc
360tgacccctcg gccgccaccc cagaaggctg gagcagggac gccgtcgctc cggccgcctg
420ctcccctcgg gtccccgtgc gagcccacgc cggccccggt gcccgcccgc agccctgcca
480ctggacacag gataaggc
49813136DNAHomo sapiens 13agcggtgacg tcaaggggcg cgctgtggca gcacctcccc
gcgcgctagt taaaaagaag 60aagaaaagag ggaacgaaac atgagaggct gtgtgagaag
ctgcagccgc cggcagagga 120gacctcagca tcatct
13614574DNAHomo sapiens 14ctgggcgggg gcgcgcgaga
agcggtgacg tcaaggggcg cgctgtggca gcacctcccc 60gcgcgctagt taaaaagaag
aagaaaagag ggaacgaaac atgagaggct gtgtgagaag 120ctgcagccgc cggcagagga
gacctcagca tcatctagag cccagcgctg gccctgcctc 180cgcctgcccc gccgccgccg
tcgccgtttc tgttcctgct actgtcccac ctaaacaact 240cccgttacac ggacaagtga
acatctgtgg ctgtcctctc cttttcttcc tcctcttcca 300actccttctc ctcctcccac
ttcccagccg cagcagaaag cccccaaccc aactgacact 360ggcacaactg caaacggtgt
catccgcaca actttatctc gctcctcggg ctcccctaag 420gcattggacc catcgccgcg
tcttttattt tttgcaaagt tgcatcgctg tacatatttt 480tgtccccgcc acctccctct
gtctctggag tgccctacag ccccgcaaac tcctcctgga 540gctgcgccct agtgcccctg
ctgggcagtg gcgt 57415170DNAHomo sapiens
15agggaactcc ctgtgctggg cctacccagc tgaccccatc gctggaaaca atgggggtca
60ggcaacactt ccccactctc tcccgccggg ctgtgctcac ttccttcctg ctggctgcct
120gaggaagtgt ccctgccctg ggacagtctg gcctagcctt tgtttccccg
17016470DNAHomo sapiens 16gacaggcttc tgagtgtagg gagctggtct gccagtcttt
cggaggtttg aacttgtcaa 60ggctagggca ggatcaccat atccagcctg gacttgcagt
tctgtggggt gcctccccat 120acccccataa gatgccaaac atgaggccct gtcatcctcc
atggtccccc tctactggct 180gttcaaggcc cagggctctc ccatgccaga tagcatcctg
tctcctacca ccactgtccc 240agcctgaggg aactccctgt gctgggccta cccagctgac
cccatcgctg gaaacaatgg 300gggtcaggca acacttcccc actctctccc gccgggctgt
gctcacttcc ttcctgctgg 360ctgcctgagg aagtgtccct gccctgggac agtctggcct
agcctttgtt tccccggggg 420tccccaccca tggagctttc aaggcttctg gcccctgtga
agccagcaca 47017602DNAHomo sapiens 17cagtggaaaa aaacggactc
agctactgga agtccccccg accctccccc caaggctagt 60tcccttcttg ggcacctgct
ctgggggacc atcagctgaa cgacccccaa gtattttgac 120tcccaaaagc accaccacct
gaccccatcc tctcacaccc tactggattt gaggatgggc 180cccaatccta gggaaggagt
gaagaggttc cctagtgttg gaagctgtgg gtgtggggga 240gattggcacc tgatcctgag
cccatagcct tcctgtcacc tggcgcagct ggcggggcca 300gatcctactc gggaagggtg
gggagggcag ccagccagca gggcattctg gagggaaaca 360gggtcaaggc gatctcctcc
cccacgcctg ttcctggccc tttcctctca gggggcagca 420ggaagtgagg agaaagggct
gggatgggag gcgggagcgg atgggaggga atggggttta 480tcaagtcctc ggcgagctgc
ccaacgggca gcagctggcg caagtagcct agctggagag 540gctcacccca ggaaggaggg
aggccaccga cctactgggc cgacggactc ccacacaggt 600ga
60218265DNAHomo sapiens
18cttgcataga tggccagcgt tcatactttc tgcttgtttg tacaaagtca ttcttctaga
60gtaattgttg taaaattgct aggcaaggtg gcaggtctga taagatttga tgacgtaatg
120gctcttagtc gctaataaga ggcttttgtg gagtggcgtg tcacagccag cgaaggctca
180gctctgtgat cttcgcctgc ctcacttggg ggaccagaag gcagcttgtc ttggaactgc
240ctcattcaca gaagacccca ttgag
26519545DNAHomo sapiens 19tttacctagc atgatcttgg cagttcaaag aggaatgtgc
cagaaaccaa gcaaagaaga 60aaaagaaata aatggaaatg gaaagtgatc tgctcagagg
ccacaaagtt gagggaggag 120gtttccagag tgggatttgg cccaaatgtt gcctgaggaa
gtacgtaaag gggtctcaac 180tctggctaca caacagaaca gcaggactgt gtgtgcagct
cacgaagtgg gtacacaggg 240taatctgcaa gttctgggtg gatctagcac ctggattgtt
aaaacttgca tagatggcca 300gcgttcatac tttctgcttg tttgtacaaa gtcattcttc
tagagtaatt gttgtaaaat 360tgctaggcaa ggtggcaggt ctgataagat ttgatgacgt
aatggctctt agtcgctaat 420aagaggcttt tgtggagtgg cgtgtcacag ccagcgaagg
ctcagctctg tgatcttcgc 480ctgcctcact tgggggacca gaaggcagct tgtcttggaa
ctgcctcatt cacagaagac 540cccat
54520554DNAHomo sapiens 20atggcagctg gcaggtgcct
tcacgtccag ggtttccaga gagaaagcat ctctcctccg 60cagagaccct cccacgctct
ccctccctca aattagtgca tctacataga ccgccctcct 120tataaacagt ctctcagggg
atcctagccc attccaaatc tacctgtgat tgcagaatcg 180caaggaatgt gatttaccgc
agatcgcggg gcgtcgtgtc ttttagggga cctgctcact 240ttggccacta ggtggcgggc
agtgcagccc ctgctcctgt cgaccctgag cgttcagcgt 300ttccgccgcc tccgccccac
tccgtagggg gagctgatga gatgaggttg aggtccagga 360agacgtcaag ggcttggttt
tgtaaacaac tccattcctc gctcgctgat aagttttcta 420agtgatgcat attcacaacc
ttgtcccatc caaggaccca agaattaaca cattacataa 480tatggacagc cccctcctgt
ccaacgggca tgattttggg gtctgatatt ctgtggatct 540gtgcaatagt caac
55421450DNAHomo sapiens
21agactttttt tgaaaaacgg aacatctgcc tatcgcaagg actactatta ttctgaaaat
60caccttcttc attagaaagt aatatttatc attttattat agaactttga tcttacttct
120tgtgacttca ttctgcgtag agcacactcc catccttgaa ttaaatgaca aagcatttta
180tattaactga caatgactga tgccatgggc aaatcctatt tctgtaaata actgaatttt
240cttctggact gcgcatgagg ggagaaagat gtctgcagtt tcggtttcct ggaaaatgaa
300acctatctca tttgttgcct gtgtcaaggg gcagtgcttc agtcggggtg gagctgctta
360aaaggcctgg gatcacaccc tttgggaaca catccaagct taagacggtg aggtcagctt
420cacattctca ggaactctcc ttctttgggt
45022570DNAHomo sapiens 22gagacttttt ttgaaaaacg gaacatctgc ctatcgcaag
gactactatt attctgaaaa 60tcaccttctt cattagaaag taatatttat cattttatta
tagaactttg atcttacttc 120ttgtgacttc attctgcgta gagcacactc ccatccttga
attaaatgac aaagcatttt 180atattaactg acaatgactg atgccatggg caaatcctat
ttctgtaaat aactgaattt 240tcttctggac tgcgcatgag gggagaaaga tgtctgcagt
ttcggtttcc tggaaaatga 300aacctatctc atttgttgcc tgtgtcaagg ggcagtgctt
cagtcggggt ggagctgctt 360aaaaggcctg ggatcacacc ctttgggaac acatccaagc
ttaagacggt gaggtcagct 420tcacattctc aggaactctc cttctttggg taagactggg
agggtgggca ggagctaccc 480ttcccgtggc cccggacctt gggtgggctg tgggctcagg
gagcggaggg gaggccttaa 540gcatccactc tctgcccggt gtttttgttc
57023513DNAHomo sapiens 23aggtggggat gaggggctaa
gtatgaacca aggagctaga aatacagcac tggaagctgg 60aagcaggggg cttggagact
gggagctgga gtgcgtgtgg gcagggtgtg gcagcagccg 120gcagaggcca tttccccttg
gcagaacatt caccatgtga ccctgagcat gtctttgaac 180tcctctgagc tcctgtttcc
tctccagaga aaaggctggt aatgcccatt cagggttatg 240gtcaggattg catagggtga
aacaatagag attgaacaca gtagacatga aagagatgcc 300agggctcagc tccctttggt
ttagttgctt ccagtgtgct ctgtggcaac accacggagc 360cctagagctg tctctttgag
ccgctctgaa tgtgcctctt acataatctc ctgggcaaca 420tctgctcccc taatgagatt
tgctccccag caaagataag aaacttgcca accactcccc 480tggtccagca tttggccaag
gcagacactg agg 51324217DNAHomo sapiens
24acctcactca atgcatggaa gttgacacaa tggctcaaca ttagcgttgg gctgattcat
60catttggctg ttgacaccag cctctggccc agccaggaca gaaaaagggc ccctgaggaa
120cttctggctc tgttccctct atgggggagg ggcagtggac ttgtgataag acagggtgtt
180agggtgaggt ggacttgggg aaacaggata tttctaa
2172597DNAHomo sapiens 25ggggggaggg gagaccccag aacaatgtcc cccaccccac
ccccctcctc aataggcgga 60agccactggc ttcctccctt tcctgcctcc tgcctcc
9726427DNAHomo sapiens 26gtgtgtttgt gccgggggga
ggggagaccc cagaacaatg tcccccaccc cacccccctc 60ctcaataggc ggaagccact
ggcttcctcc ctttcctgcc tcctgcctcc tttgtgccag 120caagactgag tactggagag
agacagggga tgggaaaaat cagtccagct gtccccaggt 180ctgcccttac cataaccttc
cccccacctc aagtgactcc tcccaggcca cacccatccc 240cagccttgtg ggggccagat
tggggggcct agaggctcaa aggcagaatg agtcctccca 300ccccctaccc tgccacccct
cccacccaag ccacctcatt tcctcttcct ccccagcacc 360gacccacact gaccaacaca
ggctgagcag tcaggcccac agcatctgac cccaggccca 420gctcgtc
42727119DNAHomo sapiens
27ctacaaagct ttatcagctt ggaggtactt ctaataccat ttcctttcat tgtttccttt
60tggtaattaa aaggaggcca atcccctgtt gtggcagctc acagctattg tggtgggaa
11928385DNAHomo sapiens 28ctgccaggag gtctccctcc aaactctaca aagctttatc
agcttggagg tacttctaat 60accatttcct ttcattgttt ccttttggta attaaaagga
ggccaatccc ctgttgtggc 120agctcacagc tattgtggtg ggaaagggag ggtggttggt
ggatgtcaca gcttgggctt 180tatctccccc agcagtgggg actccacagc ccctgggcta
cataacagca agacagtccg 240gagctgtagc agacctgatt gagcctttgc agcagctgag
agcatggcct agggtgggcg 300gcaccattgt ccagcagctg agtttcccag ggaccttgga
gatagccgca gccctcattt 360gcaggggaag gtatggcctt tggaa
38529517DNAHomo sapiens 29gccattggct ggtccttcac
tgacagcaga aacttggcca atggcaatca atcagggggc 60ccgcgctgcc ttaaatacca
gcagagcaaa cagcctcaga caaagctgcg ccgtgtatca 120attaccgaga ggctgcgtgc
tcctctgggc ggagggagcc ggagcgagcg gccagggctg 180ctgccccagc tgataagggc
ccgcattgtt cggggacagc tggcagcccg ataagggcct 240gctcgcccga gataatggca
gtgggcaggc gcctcgcggc agtttagaat ttcttgggtc 300tccaagaaag gttctattaa
gcccactgac cccaattgaa tattaattag ctaattaacg 360gatttattgt tccacgccat
ttctggagag gccatttttt ttcgagtgcc attatttttg 420taaatgattt ttcgcattgt
tcataattga atctttgcag ctgccagcat cttctgcatg 480atttggcaaa aaaaaggaag
cagaagcact tagggtt 51730511DNAHomo sapiens
30cccctaataa acaggaaggc atccgcgcca ttagtatcca tccttttcag agcatctgag
60acctgtctgg accatcaaag ccatccccag cccccaggag cctactggag gagacaccag
120cctcgccaaa acaattctcc attgtgttct tccccttaga aatcatgggt ttgtaaacag
180gcccttacat ttcagcaggt cctgccctgg ctttgtgctg gtgtgttgtt ttttcttccc
240tgaacaatgt cctttccagt agggccagcc gttcacacca ttgtctgaga cccttggact
300acaggaaaca tcaccagatt cttatcagtt gggggcagga gtgggggggt gaacagatga
360gatcatgtcc acagagcaag tggctggtgt gccacgtcat tccccatgcc ttcatctgtg
420agagcagagc ccgctcgccc tccatcaatc tgggcttcat gtgtccagag tccagtcccc
480tctatttggt ggtagacacc tggactctgt t
51131390DNAHomo sapiens 31tcctacagtt atgggtctgt ctttcctgct agatcagaag
ctccagggac ctgtctgtct 60tattaacctt tcttcccctc atgatgcctg gcccaggact
ccacgttcag aggcagttta 120atgtctacag aactgatgga tgctccatac cctgtattca
taagcctgtg ttttgctgcc 180aaacaccaga gggcactgtt agcatgtcga tgaagattat
aaaccctcag acctggaagg 240ctgggaaagg cttatgaaaa tcttgcttct gttttgggat
tacataagac ttattgcgcc 300tcaattgttc taaacacatc tgttcgagtt tattcatgag
gcacgttcct gttggggtta 360gagatgagtt tgaaagctcc ccgtcacagg
39032573DNAHomo sapiens 32agggactgtt gggcagcccc
agactggcac aggtggatcg ggtgcctagg cagggggtgg 60tgagttatgg cgcagctgtc
ttggtggctg gggggagcag ggataagggt ggacttctta 120gtgaccgctc tctgccccag
gaggtagagt cctgggggct gggctggcct gagagacgcc 180ccctcatcct ttccagggtg
aggtacgagg gctccgcccc ctcctgatat caccaggcct 240agggcagcat cctgatgggg
gaggggcaag tgacccgggc cctggactgc aggaacagcc 300cctcctccac tggtggagtt
cccacttcct gcggaaggaa ctatgttaga agttgtgtat 360atggggtggg ggttgggtgt
gggtggcggg gggcctgggt ggggtccact gagtcgcctc 420
ccctgtctcc ctgcacttcc tcctggagga aatggggaca acaggatgaa gtgagggcct 480
gctgagccca gggctgccac ctgggagtga agccggggca ggctgcaggg tccgggccct 540
tctgtgtggg caggtggaag tggtggggat gca 573
SEQ ID NO: 33 (length 441, DNA, Homo sapiens)
ggggagaggg tgtggggtgg ggtggggaga gggtgtgggg tggggtgggg agaggggatg 60
ggatggcatg gggggatgtg gcagtgagga ggctgggccc ttggagctgc cgagtgcagg 120
ggcctggagg actccgggaa ggcgtcctag tgcatcaagc gtgggcttgg cctgcttggg 180
tctcccctcc tggcccccct agcaatgggc ggacttgggc ccgctctggg aggattccag 240
gaacggctcc tgcctggtta taaatagact tctccgaaag gcctggggct gtgccagctg 300
cagcaggtgc ctcccaggcc cggccagagg gccccaggca agggggtgga gcccgggtgg 360
gggtgatgag gatgctgggg tccacttttg tagcgccaga ggcgacgggc tctgtctggt 420
tgtagcatca cagagcttga t 441
SEQ ID NO: 34 (length 1303, DNA, Homo sapiens)
gcttgcccag ctatataata aaacaagttt gggacttccc aaccattcac ccatggaaaa 60
acagaagcaa ctcttcaaag gacagattcc caggatctgc cctgggagat tccaaatcag 120
ttgatctggg gtgagcccag tcctctgtag tttttagaag ctcctcctat gtctctcctg 180
gtcagcagaa tcttggcccc tcccttcccc ccagcctctt ggttcttctg ggctctgatc 240
cagcctcagc gtcactgtct tccacgcccc tctttgattc tcgtttatgt caaaagcctt 300
gtgaggatga ggctgtgatt atccccattt tacagatgag gaaactgtgg ctccaggatg 360
acacaactgg ccagaggtca catcagaagc agagctgggt cacttgactc cacccaatat 420
ccctaaatgc aaacatcccc tacagaccga ggctggcacc ttagagctgg agtccatgcc 480
cgctctgacc aggagaagcc aacctggtcc tccagagcca agagcttctg tccctttccc 540
atctcctgaa gcctccctgt cacctttaaa gtccattccc acaaagacat catgggatca 600
ccacagaaaa tcaagctctg gggctaggct gaccccagct agatttttgg ctcttttata 660
ccccagctgg gtggacaagc accttaaacc cgctgagcct cagcttcccg ggctataaaa 720
tgggggtgat gacacctgcc tgtagcattc caaggagggt taaatgtgat gctgcagcca 780
agggtcccca cagccaggct ctttgcaggt gctgggttca gagtcccaga gctgaggccg 840
ggagtagggg ttcaagtggg gtgccccagg cagggtccag tgccagccct ctgtggagac 900
agccatccgg ggccgaggca gccgcccacc gcagggcctg cctatctgca gccagcccag 960
ccctcacaaa ggaacaataa caggaaacca tcccaggggg aagtgggcca gggccagctg 1020
gaaaacctga aggggaggca gccaggcctc cctcgccagc ggggtgtggc tcccctccaa 1080
agacggtcgg ctgacaggct ccacagagct ccactcacgc tcagccctgg acggacaggc 1140
agtccaacgg aacagaaaca tccctcagcc cacaggcacg gtgagtgggg gctcccacac 1200
tcccctccac cccaaacccg ccaccctgcg cccaagatgg gagggtcctc agcttcccca 1260
tctgtagaat gggcatcgtc ccactcccat gacagagagg ctc 1303
SEQ ID NO: 35 (length 455, DNA, Homo sapiens)
gagacataaa aggaaaatga agcgagcaac aattaaaaaa aattccccgc acacaacaat 60
acaatctatt taaactgtgg ctcatacttt tcataccaat ggtatgactt tttttctgga 120
gtcccctctt ctgattcttg aactccgggg ctggcagctt gcaaagggga agcggactcc 180
agcactgcac gggcaggttt agcaaaggtc tctaatgggt attttctttt tcttagccct 240
gcccccgaat tgtcagacgg cgggcgtctg cctctgaagt tagcagtgat ttcctttcgg 300
gcctggcctt atctccggct gcacgttgcc tgttggtgac taataacaca ataacattgt 360
ctggggctgg aataaagtcg gagctgttta cccccactct aataggggtt caatataaaa 420
agccggcaga gagctgtcca agtcagacgc gcctc 455
SEQ ID NO: 36 (length 888, DNA, Homo sapiens)
cgccttgctg tgccactttg ggacttccct ccctagcctg agcttcagtt ttcctgcctg 60
ttaggcagcc ccatgtcaac tgcacttagt aggccgggtt tgatgcccga caagacgtga 120
agtggtggag gtgggcagga tcccagcgct accatcttct tgaaccagtg atctcaacac 180
atcggatttc tgtttcctca tctgcaaaat gggatcagtg agctcaggtg ggtcacaaat 240
tctacaggaa ctactttagc caagcccggc cccctgaaag ttcccctcgg tgggctgtta 300
gggtgattgt tttcatctgt ggggctccct gatgcgtccc acccaccagc cttggagagg 360
gtgggatggg agggtggggt gcttggggag acaagcctag agcctgggcc ctcccacccc 420
actgcctccc cccatcccag ggccccccac ccagtgacaa agcccgtggc acttcctcta 480
cccggttggc aggcggcctg gcccagcccc ttctctaagg aagcgcattt cctgcctccc 540
tgggccggcc gggctggatg agccgggagc tccctgctgc cggtcatacc acagccttca 600
tctgcgccct ggggccagga ctgctgctgt cactgccatc cattggagcc cagcaccccc 660
tccccgccca tccttcggac agcaactcca gcccagcccc gcgtccctgt gtccacttct 720
cctgacccct cggccgccac cccagaaggc tggagcaggg acgccgtcgc tccggccgcc 780
tgctcccctc gggtccccgt gcgagcccac gccggccccg gtgcccgccc gcagccctgc 840
cactggacac aggataaggc ccagcgcaca ggcccccacg tggacacc 888
SEQ ID NO: 37 (length 1037, DNA, Homo sapiens)
tttgcttcta ggaagcagaa gactgaggaa atgacttggg cgggtgcatc aatgcggcca 60
aaaaagacac ggacacgctc ccctgggacc tgagctggtt cgcagtcttc ccaaaggtgc 120
caagcaagcg tcagttcccc tcaggcgctc caggttcagt gccttgtgcc gagggtctcc 180
ggtgccttcc tagacttctc gggacagtct gaaggggtca ggagcggcgg gacagcgcgg 240
gaagagcagg caaggggaga cagccggact gcgcctcagt cctccgtgcc aagaacaccg 300
tcgcggaggc gcggccagct tcccttggat cggactttcc gcccctaggg ccaggcggcg 360
gagcttcagc cttgtccctt ccccagtttc gggcggcccc cagagctgag taagccgggt 420
ggagggagtc tgcaaggatt tcctgagcgc gatgggcagg aggaggggca agggcaagag 480
ggcgcggagc aaagaccctg aacctgccgg ggccgcgctc ccgggcccgc gtcgccagca 540
cctccccacg cgcgctcggc cccgggccac ccgccctcgt cggcccccgc ccctctccgt 600
agccgcaggg aagcgagcct gggaggaaga agagggtagg tggggaggcg gatgaggggt 660
gggggacccc ttgacgtcac cagaaggagg tgccggggta ggaagtgggc tggggaaagg 720
ttataaatcg cccccgccct cggctgctct tcatcgaggt ccgcgggagg ctcggagcgc 780
gccaggcgga cactcctctc ggctcctccc cggcagcggc ggcggctcgg agcgggctcc 840
ggggctcggg tgcagcggcc agcgggcgcc tggcggcgag gattacccgg ggaagtggtt 900
gtctcctggc tggagccgcg agacgggcgc tcagggcgcg gggccggcgg cggcgaacaa 960
gaggacggac tctggcggcc gggtcgttgg ccgcggggag cgcgggcacc gggcgagcag 1020
gccgcgtcgc gctcacc 1037
SEQ ID NO: 38 (length 399, DNA, Homo sapiens)
gtctcccagg catgactcca acaatgcatc ccatgggatt tggggttccc cagatctggg 60
gcttgtaggc ctgactctcc cctgtgcaca cgtctcatac acgcatgcgt gcacccattg 120
cctgccccgc cccttgcaca gggagtcagc agggaggact gggttatgcc ctgcttatca 180
gcagcttccc agcttcctct gcctggattc ttagaggcct ggggtcctag aacgagctgg 240
tgcacgtggc ttcccaaaga tctctcagat aatgagagga aatgcagtca tcagtttgca 300
gaaggctagg gattctgggc catagctcag acctgcgccc accatctccc tccaggcagc 360
ccttggctgg tccctgcgag cccgtggaga ctgccagtc 399
SEQ ID NO: 39 (length 6, DNA, Homo sapiens)
gccacc 6
SEQ ID NO: 40 (length 8834, DNA, Artificial Sequence; vector)
tggaagggct aattcactcc caaagaagac
aagatatcct tgatctgtgg atctaccaca 60cacaaggcta cttccctgat tagcagaact
acacaccagg gccaggggtc agatatccac 120tgacctttgg atggtgctac aagctagtac
cagttgagcc agataaggta gaagaggcca 180ataaaggaga gaacaccagc ttgttacacc
ctgtgagcct gcatgggatg gatgacccgg 240agagagaagt gttagagtgg aggtttgaca
gccgcctagc atttcatcac gtggcccgag 300agctgcatcc ggagtacttc aagaactgct
gatatcgagc ttgctacaag ggactttccg 360ctggggactt tccagggagg cgtggcctgg
gcgggactgg ggagtggcga gccctcagat 420cctgcatata agcagctgct ttttgcctgt
actgggtctc tctggttaga ccagatctga 480gcctgggagc tctctggcta actagggaac
ccactgctta agcctcaata aagcttgcct 540tgagtgcttc aagtagtgtg tgcccgtctg
ttgtgtgact ctggtaacta gagatccctc 600agaccctttt agtcagtgtg gaaaatctct
agcagtggcg cccgaacagg gacttgaaag 660cgaaagggaa accagaggag ctctctcgac
gcaggactcg gcttgctgaa gcgcgcacgg 720caagaggcga ggggcggcga ctggtgagta
cgccaaaaat tttgactagc ggaggctaga 780aggagagaga tgggtgcgag agcgtcagta
ttaagcgggg gagaattaga tcgcgatggg 840aaaaaattcg gttaaggcca gggggaaaga
aaaaatataa attaaaacat atagtatggg 900caagcaggga gctagaacga ttcgcagtta
atcctggcct gttagaaaca tcagaaggct 960gtagacaaat actgggacag ctacaaccat
cccttcagac aggatcagaa gaacttagat 1020cattatataa tacagtagca accctctatt
gtgtgcatca aaggatagag ataaaagaca 1080ccaaggaagc tttagacaag atagaggaag
agcaaaacaa aagtaagacc accgcacagc 1140aagcggccgg ccgctgatct tcagacctgg
aggaggagat atgagggaca attggagaag 1200tgaattatat aaatataaag tagtaaaaat
tgaaccatta ggagtagcac ccaccaaggc 1260aaagagaaga gtggtgcaga gagaaaaaag
agcagtggga ataggagctt tgttccttgg 1320gttcttggga gcagcaggaa gcactatggg
cgcagcgtca atgacgctga cggtacaggc 1380cagacaatta ttgtctggta tagtgcagca
gcagaacaat ttgctgaggg ctattgaggc 1440gcaacagcat ctgttgcaac tcacagtctg
gggcatcaag cagctccagg caagaatcct 1500ggctgtggaa agatacctaa aggatcaaca
gctcctgggg atttggggtt gctctggaaa 1560actcatttgc accactgctg tgccttggaa
tgctagttgg agtaataaat ctctggaaca 1620gatttggaat cacacgacct ggatggagtg
ggacagagaa attaacaatt acacaagctt 1680aatacactcc ttaattgaag aatcgcaaaa
ccagcaagaa aagaatgaac aagaattatt 1740ggaattagat aaatgggcaa gtttgtggaa
ttggtttaac ataacaaatt ggctgtggta 1800tataaaatta ttcataatga tagtaggagg
cttggtaggt ttaagaatag tttttgctgt 1860actttctata gtgaatagag ttaggcaggg
atattcacca ttatcgtttc agacccacct 1920cccaaccccg aggggacccg acaggcccga
aggaatagaa gaagaaggtg gagagagaga 1980cagagacaga tccattcgat tagtgaacgg
atctcgacgg tatcgccttt aaaagaaaag 2040gggggattgg ggggtacagt gcaggggaaa
gaatagtaga cataatagca acagacatac 2100aaactaaaga actacaaaaa caaattacaa
aaattcaaaa ttttcgggtt tattacaggg 2160acagcagaga tccagtttat cgataagctt
gggagttccg cgttacataa cttacggtaa 2220atggcccgcc tggctgaccg cccaacgacc
cccgcccatt gacgtcaata atgacgtatg 2280ttcccatagt aacgccaata gggactttcc
attgacgtca atgggtggag tatttacggt 2340aaactgccca cttggcagta catcaagtgt
atcatatgcc aagtacgccc cctattgacg 2400tcaatgacgg taaatggccc gcctggcatt
atgcccagta catgacctta cgggactttc 2460ctacttggca gtacatctac gtattagtca
tcgctattac catggtgatg cggttttggc 2520agtacaccaa tgggcgtgga tagcggtttg
actcacgggg atttccaagt ctccacccca 2580ttgacgtcaa tgggagtttg ttttggcacc
aaaatcaacg ggactttcca aaatgtcgta 2640acaactccgc cccattgacg caaatgggcg
gtaggcgtgt acggtgggag gtctatataa 2700gcagagctcg tttagtgaac cgtcagatcg
cctggagacg ccatccacgc tgttttgacc 2760tccatagaag acaccgactc tagctagagg
atctaccggt cgccaccatg gtgagcaagg 2820gcgccgagct gttcaccggc atcgtgccca
tcctgatcga gctgaatggc gatgtgaatg 2880gccacaagtt cagcgtgagc ggcgagggcg
agggcgatgc cacctacggc aagctgaccc 2940tgaagttcat ctgcaccacc ggcaagctgc
ctgtgccctg gcccaccctg gtgaccaccc 3000tgagctacgg cgtgcagtgc ttctcacgct
accccgatca catgaagcag cacgacttct 3060tcaagagcgc catgcctgag ggctacatcc
aggagcgcac catcttcttc gaggatgacg 3120gcaactacaa gtcgcgcgcc gaggtgaagt
tcgagggcga taccctggtg aatcgcatcg 3180agctgaccgg caccgatttc aaggaggatg
gcaacatcct gggcaataag atggagtaca 3240actacaacgc ccacaatgtg tacatcatga
ccgacaaggc caagaatggc atcaaggtga 3300acttcaagat ccgccacaac atcgaggatg
gcagcgtgca gctggccgac cactaccagc 3360agaatacccc catcggcgat ggccctgtgc
tgctgcccga taaccactac ctgtccaccc 3420agagcgccct gtccaaggac cccaacgaga
agcgcgatca catgatctac ttcggcttcg 3480tgaccgccgc cgccatcacc cacggcatgg
atgagctgta caagtccgga ctcagatctc 3540gagctcaagc ttcgaattct gcagtcgacg
gtaccgcggg cccgggatcc accggatcta 3600gataactgat cataattcta ccgggtaggg
gaggcgcttt tcccaaggca gtctggagca 3660tgcgctttag cagccccgct gggcacttgg
cgctacacaa gtggcctctg gcctcgcaca 3720cattccacat ccaccggtag gcgccaaccg
gctccgttct ttggtggccc cttcgcgcca 3780ccttctactc ctcccctagt caggaagttc
ccccccgccc cgcagctcgc gtcgtgcagg 3840acgtgacaaa tggaagtagc acgtctcact
agtctcgtgc agatggacag caccgctgag 3900caatggaagc gggtaggcct ttggggcagc
ggccaatagc agctttgctc cttcgctttc 3960tgggctcaga ggctgggaag gggtgggtcc
gggggcgggc tcaggggcgg gctcaggggc 4020ggggcgggcg cccgaaggtc ctccggaggc
ccggcattct gcacgcttca aaagcgcacg 4080tctgccgcgc tgttctcctc ttcctcatct
ccgggccttt cgacctgcag cccaagctta 4140ccatgaccga gtacaagccc acggtgcgcc
tcgccacccg cgacgacgtc cccagggccg 4200tacgcaccct cgccgccgcg ttcgccgact
accccgccac gcgccacacc gtcgatccgg 4260accgccacat cgagcgggtc accgagctgc
aagaactctt cctcacgcgc gtcgggctcg 4320acatcggcaa ggtgtgggtc gcggacgacg
gcgccgcggt ggcggtctgg accacgccgg 4380agagcgtcga agcgggggcg gtgttcgccg
agatcggccc gcgcatggcc gagttgagcg 4440gttcccggct ggccgcgcag caacagatgg
aaggcctcct ggcgccgcac cggcccaagg 4500agcccgcgtg gttcctggcc accgtcggcg
tctcgcccga ccaccagggc aagggtctgg 4560gcagcgccgt cgtgctcccc ggagtggagg
cggccgagcg cgccggggtg cccgccttcc 4620tggagacctc cgcgccccgc aacctcccct
tctacgagcg gctcggcttc accgtcaccg 4680ccgacgtcga ggtgcccgaa ggaccgcgca
cctggtgcat gacccgcaag cccggtgcct 4740gaccgcgtct ggaacaatca acctctggat
tacaaaattt gtgaaagatt gactggtatt 4800cttaactatg ttgctccttt tacgctatgt
ggatacgctg ctttaatgcc tttgtatcat 4860gctattgctt cccgtatggc tttcattttc
tcctccttgt ataaatcctg gttgctgtct 4920ctttatgagg agttgtggcc cgttgtcagg
caacgtggcg tggtgtgcac tgtgtttgct 4980gacgcaaccc ccactggttg gggcattgcc
accacctgtc agctcctttc cgggactttc 5040gctttccccc tccctattgc cacggcggaa
ctcatcgccg cctgccttgc ccgctgctgg 5100acaggggctc ggctgttggg cactgacaat
tccgtggtgt tgtcggggaa gctgacgtcc 5160tttccatggc tgctcgcctg tgttgccacc
tggattctgc gcgggacgtc cttctgctac 5220gtcccttcgg ccctcaatcc agcggacctt
ccttcccgcg gcctgctgcc ggctctgcgg 5280cctcttccgc gtcttcgcct tcgccctcag
acgagtcgga tctccctttg ggccgcctcc 5340ccgcctggaa ttaattctgc agtcgagacc
tagaaaaaca tggagcaatc acaagtagca 5400atacagcagc taccaatgct gattgtgcct
ggctagaagc acaagaggag gaggaggtgg 5460gtttttccag tcacacctca ggtaccttta
agaccaatga cttacaaggc agctgtagat 5520cttagccact ttttaaaaga aaagagggga
ctggaagggc taattcactc ccaacgaaga 5580caagatatcc ttgatctgtg gatctaccac
acacaaggct acttccctga ttagcagaac 5640tacacaccag ggccaggggt cagatatcca
ctgacctttg gatggtgcta caagctagta 5700ccagttgagc cagataaggt agaagaggcc
aataaaggag agaacaccag cttgttacac 5760cctgtgagcc tgcatgggat ggatgacccg
gagagagaag tgttagagtg gaggtttgac 5820agccgcctag catttcatca cgtggcccga
gagctgcatc cggagtactt caagaactgc 5880tgatatcgag cttgctacaa gggactttcc
gctggggact ttccagggag gcgtggcctg 5940ggcgggactg gggagtggcg agccctcaga
tcctgcatat aagcagctgc tttttgcctg 6000tactgggtct ctctggttag accagatctg
agcctgggag ctctctggct aactagggaa 6060cccactgctt aagcctcaat aaagcttgcc
ttgagtgctt caagtagtgt gtgcccgtct 6120gttgtgtgac tctggtaact agagatccct
cagacccttt tagtcagtgt ggaaaatctc 6180tagcagtagt agttcatgtc atcttattat
tcagtattta taacttgcaa agaaatgaat 6240atcagagagt gagaggcctt gacattgcta
gcgttttacc gtcgacctct agctagagct 6300tggcgtaatc atggtcatag ctgtttcctg
tgtgaaattg ttatccgctc acaattccac 6360acaacatacg agccggaagc ataaagtgta
aagcctgggg tgcctaatga gtgagctaac 6420tcacattaat tgcgttgcgc tcactgcccg
ctttccagtc gggaaacctg tcgtgccagc 6480tgcattaatg aatcggccaa cgcgcgggga
gaggcggttt gcgtattggg cgctcttccg 6540cttcctcgct cactgactcg ctgcgctcgg
tcgttcggct gcggcgagcg gtatcagctc 6600actcaaaggc ggtaatacgg ttatccacag
aatcagggga taacgcagga aagaacatgt 6660gagcaaaagg ccagcaaaag gccaggaacc
gtaaaaaggc cgcgttgctg gcgtttttcc 6720ataggctccg cccccctgac gagcatcaca
aaaatcgacg ctcaagtcag aggtggcgaa 6780acccgacagg actataaaga taccaggcgt
ttccccctgg aagctccctc gtgcgctctc 6840ctgttccgac cctgccgctt accggatacc
tgtccgcctt tctcccttcg ggaagcgtgg 6900cgctttctca tagctcacgc tgtaggtatc
tcagttcggt gtaggtcgtt cgctccaagc 6960tgggctgtgt gcacgaaccc cccgttcagc
ccgaccgctg cgccttatcc ggtaactatc 7020gtcttgagtc caacccggta agacacgact
tatcgccact ggcagcagcc actggtaaca 7080ggattagcag agcgaggtat gtaggcggtg
ctacagagtt cttgaagtgg tggcctaact 7140acggctacac tagaagaaca gtatttggta
tctgcgctct gctgaagcca gttaccttcg 7200gaaaaagagt tggtagctct tgatccggca
aacaaaccac cgctggtagc ggtttttttg 7260tttgcaagca gcagattacg cgcagaaaaa
aaggatctca agaagatcct ttgatctttt 7320ctacggggtc tgacgctcag tggaacgaaa
actcacgtta agggattttg gtcatgagat 7380tatcaaaaag gatcttcacc tagatccttt
taaattaaaa atgaagtttt aaatcaatct 7440aaagtatata tgagtaaact tggtctgaca
gttaccaatg cttaatcagt gaggcaccta 7500tctcagcgat ctgtctattt cgttcatcca
tagttgcctg actccccgtc gtgtagataa 7560ctacgatacg ggagggctta ccatctggcc
ccagtgctgc aatgataccg cgagacccac 7620gctcaccggc tccagattta tcagcaataa
accagccagc cggaagggcc gagcgcagaa 7680gtggtcctgc aactttatcc gcctccatcc
agtctattaa ttgttgccgg gaagctagag 7740taagtagttc gccagttaat agtttgcgca
acgttgttgc cattgctaca ggcatcgtgg 7800tgtcacgctc gtcgtttggt atggcttcat
tcagctccgg ttcccaacga tcaaggcgag 7860ttacatgatc ccccatgttg tgcaaaaaag
cggttagctc cttcggtcct ccgatcgttg 7920tcagaagtaa gttggccgca gtgttatcac
tcatggttat ggcagcactg cataattctc 7980ttactgtcat gccatccgta agatgctttt
ctgtgactgg tgagtactca accaagtcat 8040tctgagaata gtgtatgcgg cgaccgagtt
gctcttgccc ggcgtcaata cgggataata 8100ccgcgccaca tagcagaact ttaaaagtgc
tcatcattgg aaaacgttct tcggggcgaa 8160aactctcaag gatcttaccg ctgttgagat
ccagttcgat gtaacccact cgtgcaccca 8220actgatcttc agcatctttt actttcacca
gcgtttctgg gtgagcaaaa acaggaaggc 8280aaaatgccgc aaaaaaggga ataagggcga
cacggaaatg ttgaatactc atactcttcc 8340tttttcaata ttattgaagc atttatcagg
gttattgtct catgagcgga tacatatttg 8400aatgtattta gaaaaataaa caaatagggg
ttccgcgcac atttccccga aaagtgccac 8460ctgacgtcga cggatcggga gatcaacttg
tttattgcag cttataatgg ttacaaataa 8520agcaatagca tcacaaattt cacaaataaa
gcattttttt cactgcattc tagttgtggt 8580ttgtccaaac tcatcaatgt atcttatcat
gtctggatca actggataac tcaagctaac 8640caaaatcatc ccaaacttcc caccccatac
cctattacca ctgccaatta cctagtggtt 8700tcatttactc taaacctgtg attcctctga
attattttca ttttaaagaa attgtatttg 8760ttaaatatgt actacaaact tagtagtttt
taaagaaatt gtatttgtta aatatgtact 8820acaaacttag tagt 8834
SEQ ID NO: 41 (length 11329, DNA, Artificial Sequence; vector)
tggaagggct aattcactcc caaagaagac aagatatcct tgatctgtgg atctaccaca
60cacaaggcta cttccctgat tagcagaact acacaccagg gccaggggtc agatatccac
120tgacctttgg atggtgctac aagctagtac cagttgagcc agataaggta gaagaggcca
180ataaaggaga gaacaccagc ttgttacacc ctgtgagcct gcatgggatg gatgacccgg
240agagagaagt gttagagtgg aggtttgaca gccgcctagc atttcatcac gtggcccgag
300agctgcatcc ggagtacttc aagaactgct gatatcgagc ttgctacaag ggactttccg
360ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga gccctcagat
420cctgcatata agcagctgct ttttgcctgt actgggtctc tctggttaga ccagatctga
480gcctgggagc tctctggcta actagggaac ccactgctta agcctcaata aagcttgcct
540tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact ctggtaacta gagatccctc
600agaccctttt agtcagtgtg gaaaatctct agcagtggcg cccgaacagg gacttgaaag
660cgaaagggaa accagaggag ctctctcgac gcaggactcg gcttgctgaa gcgcgcacgg
720caagaggcga ggggcggcga ctggtgagta cgccaaaaat tttgactagc ggaggctaga
780aggagagaga tgggtgcgag agcgtcagta ttaagcgggg gagaattaga tcgcgatggg
840aaaaaattcg gttaaggcca gggggaaaga aaaaatataa attaaaacat atagtatggg
900caagcaggga gctagaacga ttcgcagtta atcctggcct gttagaaaca tcagaaggct
960gtagacaaat actgggacag ctacaaccat cccttcagac aggatcagaa gaacttagat
1020cattatataa tacagtagca accctctatt gtgtgcatca aaggatagag ataaaagaca
1080ccaaggaagc tttagacaag atagaggaag agcaaaacaa aagtaagacc accgcacagc
1140aagcggccgg ccgctgatct tcagacctgg aggaggagat atgagggaca attggagaag
1200tgaattatat aaatataaag tagtaaaaat tgaaccatta ggagtagcac ccaccaaggc
1260aaagagaaga gtggtgcaga gagaaaaaag agcagtggga ataggagctt tgttccttgg
1320gttcttggga gcagcaggaa gcactatggg cgcagcgtca atgacgctga cggtacaggc
1380cagacaatta ttgtctggta tagtgcagca gcagaacaat ttgctgaggg ctattgaggc
1440gcaacagcat ctgttgcaac tcacagtctg gggcatcaag cagctccagg caagaatcct
1500ggctgtggaa agatacctaa aggatcaaca gctcctgggg atttggggtt gctctggaaa
1560actcatttgc accactgctg tgccttggaa tgctagttgg agtaataaat ctctggaaca
1620gatttggaat cacacgacct ggatggagtg ggacagagaa attaacaatt acacaagctt
1680aatacactcc ttaattgaag aatcgcaaaa ccagcaagaa aagaatgaac aagaattatt
1740ggaattagat aaatgggcaa gtttgtggaa ttggtttaac ataacaaatt ggctgtggta
1800tataaaatta ttcataatga tagtaggagg cttggtaggt ttaagaatag tttttgctgt
1860actttctata gtgaatagag ttaggcaggg atattcacca ttatcgtttc agacccacct
1920cccaaccccg aggggacccg acaggcccga aggaatagaa gaagaaggtg gagagagaga
1980cagagacaga tccattcgat tagtgaacgg atctcgacgg tatcgccttt aaaagaaaag
2040gggggattgg ggggtacagt gcaggggaaa gaatagtaga cataatagca acagacatac
2100aaactaaaga attacaaaaa caaattacaa aaattcaaaa ttttcgggtt tattacaggg
2160acagcagaga tccagtttat cgataagctt gggagttccg cgttacataa cttacggtaa
2220atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata atgacgtatg
2280ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggag tatttacggt
2340aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc cctattgacg
2400tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta tgggactttc
2460ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg cggttttggc
2520agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt ctccacccca
2580ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca aaatgtcgta
2640acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag gtctatataa
2700gcagagctcg tttagtgaac cgtcagatcg cctggagacg ccatccacgc tgttttgacc
2760tccatagaag acaccgactc tactagagga tcgctagcgc taccggactc agatctcgag
2820ctcaagcttc gaattctgca gtcgacggta ccgcgggccc ggatgcagat cgagctgtcc
2880acctgctttt ttctgtgcct gctgcggttc tgcttcagcg ccacccggcg gtactacctg
2940ggcgccgtgg agctgtcctg ggactacatg cagagcgacc tgggcgagct gcccgtggac
3000gcccggttcc cccccagagt gcccaagagc ttccccttca acaccagcgt ggtgtacaag
3060aaaaccctgt tcgtggagtt caccgaccac ctgttcaata tcgccaagcc caggcccccc
3120tggatgggcc tgctgggccc caccatccag gccgaggtgt acgacaccgt ggtgatcacc
3180ctgaagaaca tggccagcca ccccgtgagc ctgcacgccg tgggcgtgag ctactggaag
3240gccagcgagg gcgccgagta cgacgaccag accagccagc gggagaaaga agatgacaag
3300gtgttccctg gcggcagcca cacctacgtg tggcaggtgc tgaaagaaaa cggccccatg
3360gcctccgacc ccctgtgcct gacctacagc tacctgagcc acgtggacct ggtgaaggac
3420ctgaacagcg gcctgatcgg cgctctgctc gtctgccggg agggcagcct ggccaaagag
3480aaaacccaga ccctgcacaa gttcatcctg ctgttcgccg tgttcgacga gggcaagagc
3540tggcacagcg agacaaagaa cagcctgatg caggaccggg acgccgcctc tgccagagcc
3600tggcccaaga tgcacaccgt gaacggctac gtgaacagaa gcctgcccgg cctgattggc
3660tgccaccgga agagcgtgta ctggcacgtg atcggcatgg gcaccacacc cgaggtgcac
3720agcatctttc tggaagggca cacctttctg gtccggaacc accggcaggc cagcctggaa
3780atcagcccta tcaccttcct gaccgcccag acactgctga tggacctggg ccagttcctg
3840ctgttttgcc acatcagctc tcaccagcac gacggcatgg aagcctacgt gaaggtggac
3900tcttgccccg aggaacccca gctgcggatg aagaacaacg aggaagccga ggactacgac
3960gacgacctga ccgacagcga gatggacgtg gtgcggttcg acgacgacaa cagccccagc
4020ttcatccaga tcagaagcgt ggccaagaag caccccaaga cctgggtgca ctatatcgcc
4080gccgaggaag aggactggga ctacgccccc ctggtgctgg cccccgacga cagaagctac
4140aagagccagt acctgaacaa tggcccccag cggatcggcc ggaagtacaa gaaagtgcgg
4200ttcatggcct acaccgacga gacattcaag acccgggagg ccatccagca cgagagcggc
4260atcctgggcc ccctgctgta cggcgaagtg ggcgacacac tgctgatcat cttcaagaac
4320caggctagcc ggccctacaa catctacccc cacggcatca ccgacgtgcg gcccctgtac
4380agcaggcggc tgcccaaggg cgtgaagcac ctgaaggact tccccatcct gcccggcgag
4440atcttcaagt acaagtggac cgtgaccgtg gaggacggcc ccaccaagag cgaccccaga
4500tgcctgaccc ggtactacag cagcttcgtg aacatggaac gggacctggc ctccgggctg
4560atcggacctc tgctgatctg ctacaaagaa agcgtggacc agcggggcaa ccagatcatg
4620agcgacaagc ggaacgtgat cctgttcagc gtgttcgatg agaaccggtc ctggtatctg
4680accgagaaca tccagcggtt tctgcccaac cctgccggcg tgcagctgga agatcccgag
4740ttccaggcca gcaacatcat gcactccatc aatggctacg tgttcgactc tctgcagctc
4800tccgtgtgtc tgcacgaggt ggcctactgg tacatcctga gcatcggcgc ccagaccgac
4860ttcctgagcg tgttcttcag cggctacacc ttcaagcaca agatggtgta cgaggacacc
4920ctgaccctgt tccctttcag cggcgagaca gtgttcatga gcatggaaaa ccccggcctg
4980tggattctgg gctgccacaa cagcgacttc cggaaccggg gcatgaccgc cctgctgaag
5040gtgtccagct gcgacaagaa caccggcgac tactacgagg acagctacga ggatatcagc
5100gcctacctgc tgtccaagaa caacgccatc gaaccccgga gcttcagcca gaaccccccc
5160gtgctgacgc gtcaccagcg ggagatcacc cggacaaccc tgcagtccga ccaggaagag
5220atcgattacg acgacaccat cagcgtggag atgaagaaag aggatttcga tatctacgac
5280gaggacgaga accagagccc cagaagcttc cagaagaaaa cccggcacta cttcattgcc
5340gccgtggaga ggctgtggga ctacggcatg agttctagcc cccacgtgct gcggaaccgg
5400gcccagagcg gcagcgtgcc ccagttcaag aaagtggtgt tccaggaatt cacagacggc
5460agcttcaccc agcctctgta tagaggcgag ctgaacgagc acctggggct gctggggccc
5520tacatcaggg ccgaagtgga ggacaacatc atggtgacct tccggaatca ggccagcaga
5580ccctactcct tctacagcag cctgatcagc tacgaagagg accagcggca gggcgccgaa
5640ccccggaaga acttcgtgaa gcccaacgaa accaagacct acttctggaa agtgcagcac
5700cacatggccc ccaccaagga cgagttcgac tgcaaggcct gggcctactt cagcgacgtg
5760gatctggaaa aggacgtgca ctctggactg attggcccac tcctggtctg ccacactaac
5820accctcaacc ccgcccacgg ccgccaggtg accgtgcagg aattcgccct gttcttcacc
5880atcttcgacg agacaaagtc ctggtacttc accgagaata tggaacggaa ctgcagagcc
5940ccctgcaaca tccagatgga agatcctacc ttcaaagaga actaccggtt ccacgccatc
6000aacggctaca tcatggacac cctgcctggc ctggtgatgg cccaggacca gagaatccgg
6060tggtatctgc tgtccatggg cagcaacgag aatatccaca gcatccactt cagcggccac
6120gtgttcaccg tgcggaagaa agaagagtac aagatggccc tgtacaacct gtaccccggc
6180gtgttcgaga cagtggagat gctgcccagc aaggccggca tctggcgggt ggagtgtctg
6240atcggcgagc acctgcacgc tggcatgagc accctgtttc tggtgtacag caacaagtgc
6300cagaccccac tgggcatggc ctctggccac atccgggact tccagatcac cgcctccggc
6360cagtacggcc agtgggcccc caagctggcc agactgcact acagcggcag catcaacgcc
6420tggtccacca aagagccctt cagctggatc aaggtggacc tgctggcccc tatgatcatc
6480cacggcatta agacccaggg cgccaggcag aagttcagca gcctgtacat cagccagttc
6540atcatcatgt acagcctgga cggcaagaag tggcagacct accggggcaa cagcaccggc
6600accctgatgg tgttcttcgg caatgtggac agcagcggca tcaagcacaa catcttcaac
6660ccccccatca ttgcccggta catccggctg caccccaccc actacagcat tagatccaca
6720ctgagaatgg aactgatggg ctgcgacctg aactcctgca gcatgcctct gggcatggaa
6780agcaaggcca tcagcgacgc ccagatcaca gccagcagct acttcaccaa catgttcgcc
6840acctggtccc cctccaaggc caggctgcac ctgcagggcc ggtccaacgc ctggcggcct
6900caggtcaaca accccaaaga atggctgcag gtggactttc agaaaaccat gaaggtgacc
6960ggcgtgacca cccagggcgt gaaaagcctg ctgaccagca tgtacgtgaa agagtttctg
7020atcagcagct ctcaggatgg ccaccagtgg accctgttct ttcagaacgg caaggtgaaa
7080gtgttccagg gcaaccagga ctccttcacc cccgtggtga actccctgga cccccccctg
7140ctgacccgct acctgagaat ccacccccag tcttgggtgc accagatcgc cctcaggatg
7200gaagtcctgg gatgtgaggc ccaggatctg tactgatgac gtctggaaca atcaacctct
7260ggattacaaa atttgtgaaa gattgactgg tattcttaac tatgttgctc cttttacgct
7320atgtggatac gctgctttaa tgcctttgta tcatgctatt gcttcccgta tggctttcat
7380tttctcctcc ttgtataaat cctggttgct gtctctttat gaggagttgt ggcccgttgt
7440caggcaacgt ggcgtggtgt gcactgtgtt tgctgacgca acccccactg gttggggcat
7500tgccaccacc tgtcagctcc tttccgggac tttcgctttc cccctcccta ttgccacggc
7560ggaactcatc gccgcctgcc ttgcccgctg ctggacaggg gctcggctgt tgggcactga
7620caattccgtg gtgttgtcgg ggaagctgac gtcctttcca tggctgctcg cctgtgttgc
7680cacctggatt ctgcgcggga cgtccttctg ctacgtccct tcggccctca atccagcgga
7740ccttccttcc cgcggcctgc tgccggctct gcggcctctt ccgcgtcttc gccttcgccc
7800tcagacgagt cggatctccc tttgggccgc ctccccgcct ggaattaatt ctgcagtcga
7860gacctagaaa aacatggagc aatcacaagt agcaatacag cagctaccaa tgctgattgt
7920gcctggctag aagcacaaga ggaggaggag gtgggttttc cagtcacacc tcaggtacct
7980ttaagaccaa tgacttacaa ggcagctgta gatcttagcc actttttaaa agaaaagagg
8040ggactggaag ggctaattca ctcccaacga agacaagata tccttgatct gtggatctac
8100cacacacaag gctacttccc tgattagcag aactacacac cagggccagg ggtcagatat
8160ccactgacct ttggatggtg ctacaagcta gtaccagttg agccagataa ggtagaagag
8220gccaataaag gagagaacac cagcttgtta caccctgtga gcctgcatgg gatggatgac
8280ccggagagag aagtgttaga gtggaggttt gacagccgcc tagcatttca tcacgtggcc
8340cgagagctgc atccggagta cttcaagaac tgctgatatc gagcttgcta caagggactt
8400tccgctgggg actttccagg gaggcgtggc ctgggcggga ctggggagtg gcgagccctc
8460agatcctgca tataagcagc tgctttttgc ctgtactggg tctctctggt tagaccagat
8520ctgagcctgg gagctctctg gctaactagg gaacccactg cttaagcctc aataaagctt
8580gccttgagtg cttcaagtag tgtgtgcccg tctgttgtgt gactctggta actagagatc
8640cctcagaccc ttttagtcag tgtggaaaat ctctagcagt agtagttcat gtcatcttat
8700tattcagtat ttataacttg caaagaaatg aatatcagag agtgagaggc cttgacattg
8760ctagcgtttt accgtcgacc tctagctaga gcttggcgta atcatggtca tagctgtttc
8820ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga agcataaagt
8880gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg cgctcactgc
8940ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg
9000ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac tcgctgcgct
9060cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca
9120cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga
9180accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc
9240acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg
9300cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat
9360acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt
9420atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc
9480agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg
9540acttatcgcc actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg
9600gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg
9660gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg
9720gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca
9780gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga
9840acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga
9900tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt
9960ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt
10020catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag ggcttaccat
10080ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca gatttatcag
10140caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact ttatccgcct
10200ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca gttaatagtt
10260tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg
10320cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc atgttgtgca
10380aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt
10440tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca tccgtaagat
10500gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt atgcggcgac
10560cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc agaactttaa
10620aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt
10680tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca tcttttactt
10740tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa
10800gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat tgaagcattt
10860atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa
10920taggggttcc gcgcacattt ccccgaaaag tgccacctga cgtcgacgga tcgggagatc
10980aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca
11040aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct
11100tatcatgtct ggatcaactg gataactcaa gctaaccaaa atcatcccaa acttcccacc
11160ccatacccta ttaccactgc caattacctg tggtttcatt tactctaaac ctgtgattcc
11220tctgaattat tttcatttta aagaaattgt atttgttaaa tatgtactac aaacttagta
11280gtttttaaag aaattgtatt tgttaaatat gtactacaaa cttagtagt 11329
SEQ ID NO: 42 (length 11220, DNA, Artificial Sequence; vector)
tggaagggct aattcactcc
caaagaagac aagatatcct tgatctgtgg atctaccaca 60cacaaggcta cttccctgat
tagcagaact acacaccagg gccaggggtc agatatccac 120tgacctttgg atggtgctac
aagctagtac cagttgagcc agataaggta gaagaggcca 180ataaaggaga gaacaccagc
ttgttacacc ctgtgagcct gcatgggatg gatgacccgg 240agagagaagt gttagagtgg
aggtttgaca gccgcctagc atttcatcac gtggcccgag 300agctgcatcc ggagtacttc
aagaactgct gatatcgagc ttgctacaag ggactttccg 360ctggggactt tccagggagg
cgtggcctgg gcgggactgg ggagtggcga gccctcagat 420cctgcatata agcagctgct
ttttgcctgt actgggtctc tctggttaga ccagatctga 480gcctgggagc tctctggcta
actagggaac ccactgctta agcctcaata aagcttgcct 540tgagtgcttc aagtagtgtg
tgcccgtctg ttgtgtgact ctggtaacta gagatccctc 600agaccctttt agtcagtgtg
gaaaatctct agcagtggcg cccgaacagg gacttgaaag 660cgaaagggaa accagaggag
ctctctcgac gcaggactcg gcttgctgaa gcgcgcacgg 720caagaggcga ggggcggcga
ctggtgagta cgccaaaaat tttgactagc ggaggctaga 780aggagagaga tgggtgcgag
agcgtcagta ttaagcgggg gagaattaga tcgcgatggg 840aaaaaattcg gttaaggcca
gggggaaaga aaaaatataa attaaaacat atagtatggg 900caagcaggga gctagaacga
ttcgcagtta atcctggcct gttagaaaca tcagaaggct 960gtagacaaat actgggacag
ctacaaccat cccttcagac aggatcagaa gaacttagat 1020cattatataa tacagtagca
accctctatt gtgtgcatca aaggatagag ataaaagaca 1080ccaaggaagc tttagacaag
atagaggaag agcaaaacaa aagtaagacc accgcacagc 1140aagcggccgg ccgctgatct
tcagacctgg aggaggagat atgagggaca attggagaag 1200tgaattatat aaatataaag
tagtaaaaat tgaaccatta ggagtagcac ccaccaaggc 1260aaagagaaga gtggtgcaga
gagaaaaaag agcagtggga ataggagctt tgttccttgg 1320gttcttggga gcagcaggaa
gcactatggg cgcagcgtca atgacgctga cggtacaggc 1380cagacaatta ttgtctggta
tagtgcagca gcagaacaat ttgctgaggg ctattgaggc 1440gcaacagcat ctgttgcaac
tcacagtctg gggcatcaag cagctccagg caagaatcct 1500ggctgtggaa agatacctaa
aggatcaaca gctcctgggg atttggggtt gctctggaaa 1560actcatttgc accactgctg
tgccttggaa tgctagttgg agtaataaat ctctggaaca 1620gatttggaat cacacgacct
ggatggagtg ggacagagaa attaacaatt acacaagctt 1680aatacactcc ttaattgaag
aatcgcaaaa ccagcaagaa aagaatgaac aagaattatt 1740ggaattagat aaatgggcaa
gtttgtggaa ttggtttaac ataacaaatt ggctgtggta 1800tataaaatta ttcataatga
tagtaggagg cttggtaggt ttaagaatag tttttgctgt 1860actttctata gtgaatagag
ttaggcaggg atattcacca ttatcgtttc agacccacct 1920cccaaccccg aggggggacc
cgacaggccc gaaggaatag aagaagaagg tggagagaga 1980gacagagaca gatccattcg
attagtgaac ggatctcgac ggtcgccaaa tggcagtatt 2040catccacaat tttaaaagaa
aaggggggat tggggggtac agtgcagggg aaagaatagt 2100agacataata gcaacagaca
tacaaactaa agaattacaa aaacaaatta caaaaattca 2160aaattttcgg gtttattaca
gggacagcag agatccagtt tggatcgata agcttgatat 2220cgaattcctg cgcggccgct
tcgaacgcgc gcgatgcatc atatgcgtac gcggtctaga 2280actagtgaga cataaaagga
aaatgaagcg agcaacaatt aaaaaaaatt ccccgcacac 2340aacaatacaa tctatttaaa
ctgtggctca tacttttcat accaatggta tgactttttt 2400tctggagtcc cctcttctga
ttcttgaact ccggggctgg cagcttgcaa aggggaagcg 2460gactccagca ctgcacgggc
aggtttagca aaggtctcta atgggtattt tctttttctt 2520agccctgccc ccgaattgtc
agacggcggg cgtctgcctc tgaagttagc agtgatttcc 2580tttcgggcct ggccttatct
ccggctgcac gttgcctgtt ggtgactaat aacacaataa 2640cattgtctgg ggctggaata
aagtcggagc tgtttacccc cactctaata ggggttcaat 2700ataaaaagcc ggcagagagc
tgtccaagtc agacgcgcct cagcgctgga tccatgcaga 2760tcgagctgtc cacctgcttt
tttctgtgcc tgctgcggtt ctgcttcagc gccacccggc 2820ggtactacct gggcgccgtg
gagctgtcct gggactacat gcagagcgac ctgggcgagc 2880tgcccgtgga cgcccggttc
ccccccagag tgcccaagag cttccccttc aacaccagcg 2940tggtgtacaa gaaaaccctg
ttcgtggagt tcaccgacca cctgttcaat atcgccaagc 3000ccaggccccc ctggatgggc
ctgctgggcc ccaccatcca ggccgaggtg tacgacaccg 3060tggtgatcac cctgaagaac
atggccagcc accccgtgag cctgcacgcc gtgggcgtga 3120gctactggaa ggccagcgag
ggcgccgagt acgacgacca gaccagccag cgggagaaag 3180aagatgacaa ggtgttccct
ggcggcagcc acacctacgt gtggcaggtg ctgaaagaaa 3240acggccccat ggcctccgac
cccctgtgcc tgacctacag ctacctgagc cacgtggacc 3300tggtgaagga cctgaacagc
ggcctgatcg gcgctctgct cgtctgccgg gagggcagcc 3360tggccaaaga gaaaacccag
accctgcaca agttcatcct gctgttcgcc gtgttcgacg 3420agggcaagag ctggcacagc
gagacaaaga acagcctgat gcaggaccgg gacgccgcct 3480ctgccagagc ctggcccaag
atgcacaccg tgaacggcta cgtgaacaga agcctgcccg 3540gcctgattgg ctgccaccgg
aagagcgtgt actggcacgt gatcggcatg ggcaccacac 3600ccgaggtgca cagcatcttt
ctggaagggc acacctttct ggtccggaac caccggcagg 3660ccagcctgga aatcagccct
atcaccttcc tgaccgccca gacactgctg atggacctgg 3720gccagttcct gctgttttgc
cacatcagct ctcaccagca cgacggcatg gaagcctacg 3780tgaaggtgga ctcttgcccc
gaggaacccc agctgcggat gaagaacaac gaggaagccg 3840aggactacga cgacgacctg
accgacagcg agatggacgt ggtgcggttc gacgacgaca 3900acagccccag cttcatccag
atcagaagcg tggccaagaa gcaccccaag acctgggtgc 3960actatatcgc cgccgaggaa
gaggactggg actacgcccc cctggtgctg gcccccgacg 4020acagaagcta caagagccag
tacctgaaca atggccccca gcggatcggc cggaagtaca 4080agaaagtgcg gttcatggcc
tacaccgacg agacattcaa gacccgggag gccatccagc 4140acgagagcgg catcctgggc
cccctgctgt acggcgaagt gggcgacaca ctgctgatca 4200tcttcaagaa ccaggctagc
cggccctaca acatctaccc ccacggcatc accgacgtgc 4260ggcccctgta cagcaggcgg
ctgcccaagg gcgtgaagca cctgaaggac ttccccatcc 4320tgcccggcga gatcttcaag
tacaagtgga ccgtgaccgt ggaggacggc cccaccaaga 4380gcgaccccag atgcctgacc
cggtactaca gcagcttcgt gaacatggaa cgggacctgg 4440cctccgggct gatcggacct
ctgctgatct gctacaaaga aagcgtggac cagcggggca 4500accagatcat gagcgacaag
cggaacgtga tcctgttcag cgtgttcgat gagaaccggt 4560cctggtatct gaccgagaac
atccagcggt ttctgcccaa ccctgccggc gtgcagctgg 4620aagatcccga gttccaggcc
agcaacatca tgcactccat caatggctac gtgttcgact 4680ctctgcagct ctccgtgtgt
ctgcacgagg tggcctactg gtacatcctg agcatcggcg 4740cccagaccga cttcctgagc
gtgttcttca gcggctacac cttcaagcac aagatggtgt 4800acgaggacac cctgaccctg
ttccctttca gcggcgagac agtgttcatg agcatggaaa 4860accccggcct gtggattctg
ggctgccaca acagcgactt ccggaaccgg ggcatgaccg 4920ccctgctgaa ggtgtccagc
tgcgacaaga acaccggcga ctactacgag gacagctacg 4980aggatatcag cgcctacctg
ctgtccaaga acaacgccat cgaaccccgg agcttcagcc 5040agaacccccc cgtgctgacg
cgtcaccagc gggagatcac ccggacaacc ctgcagtccg 5100accaggaaga gatcgattac
gacgacacca tcagcgtgga gatgaagaaa gaggatttcg 5160atatctacga cgaggacgag
aaccagagcc ccagaagctt ccagaagaaa acccggcact 5220acttcattgc cgccgtggag
aggctgtggg actacggcat gagttctagc ccccacgtgc 5280tgcggaaccg ggcccagagc
ggcagcgtgc cccagttcaa gaaagtggtg ttccaggaat 5340tcacagacgg cagcttcacc
cagcctctgt atagaggcga gctgaacgag cacctggggc 5400tgctggggcc ctacatcagg
gccgaagtgg aggacaacat catggtgacc ttccggaatc 5460aggccagcag accctactcc
ttctacagca gcctgatcag ctacgaagag gaccagcggc 5520agggcgccga accccggaag
aacttcgtga agcccaacga aaccaagacc tacttctgga 5580aagtgcagca ccacatggcc
cccaccaagg acgagttcga ctgcaaggcc tgggcctact 5640tcagcgacgt ggatctggaa
aaggacgtgc actctggact gattggccca ctcctggtct 5700gccacactaa caccctcaac
cccgcccacg gccgccaggt gaccgtgcag gaattcgccc 5760tgttcttcac catcttcgac
gagacaaagt cctggtactt caccgagaat atggaacgga 5820actgcagagc cccctgcaac
atccagatgg aagatcctac cttcaaagag aactaccggt 5880tccacgccat caacggctac
atcatggaca ccctgcctgg cctggtgatg gcccaggacc 5940agagaatccg gtggtatctg
ctgtccatgg gcagcaacga gaatatccac agcatccact 6000tcagcggcca cgtgttcacc
gtgcggaaga aagaagagta caagatggcc ctgtacaacc 6060tgtaccccgg cgtgttcgag
acagtggaga tgctgcccag caaggccggc atctggcggg 6120tggagtgtct gatcggcgag
cacctgcacg ctggcatgag caccctgttt ctggtgtaca 6180gcaacaagtg ccagacccca
ctgggcatgg cctctggcca catccgggac ttccagatca 6240ccgcctccgg ccagtacggc
cagtgggccc ccaagctggc cagactgcac tacagcggca 6300gcatcaacgc ctggtccacc
aaagagccct tcagctggat caaggtggac ctgctggccc 6360ctatgatcat ccacggcatt
aagacccagg gcgccaggca gaagttcagc agcctgtaca 6420tcagccagtt catcatcatg
tacagcctgg acggcaagaa gtggcagacc taccggggca 6480acagcaccgg caccctgatg
gtgttcttcg gcaatgtgga cagcagcggc atcaagcaca 6540acatcttcaa cccccccatc
attgcccggt acatccggct gcaccccacc cactacagca 6600ttagatccac actgagaatg
gaactgatgg gctgcgacct gaactcctgc agcatgcctc 6660tgggcatgga aagcaaggcc
atcagcgacg cccagatcac agccagcagc tacttcacca 6720acatgttcgc cacctggtcc
ccctccaagg ccaggctgca cctgcagggc cggtccaacg 6780cctggcggcc tcaggtcaac
aaccccaaag aatggctgca ggtggacttt cagaaaacca 6840tgaaggtgac cggcgtgacc
acccagggcg tgaaaagcct gctgaccagc atgtacgtga 6900aagagtttct gatcagcagc
tctcaggatg gccaccagtg gaccctgttc tttcagaacg 6960gcaaggtgaa agtgttccag
ggcaaccagg actccttcac ccccgtggtg aactccctgg 7020acccccccct gctgacccgc
tacctgagaa tccaccccca gtcttgggtg caccagatcg 7080ccctcaggat ggaagtcctg
ggatgtgagg cccaggatct gtactgatga cgtctggaac 7140aatcaacctc tggattacaa
aatttgtgaa agattgactg gtattcttaa ctatgttgct 7200ccttttacgc tatgtggata
cgctgcttta atgcctttgt atcatgctat tgcttcccgt 7260atggctttca ttttctcctc
cttgtataaa tcctggttgc tgtctcttta tgaggagttg 7320tggcccgttg tcaggcaacg
tggcgtggtg tgcactgtgt ttgctgacgc aacccccact 7380ggttggggca ttgccaccac
ctgtcagctc ctttccggga ctttcgcttt ccccctccct 7440attgccacgg cggaactcat
cgccgcctgc cttgcccgct gctggacagg ggctcggctg 7500ttgggcactg acaattccgt
ggtgttgtcg gggaagctga cgtcctttcc atggctgctc 7560gcctgtgttg ccacctggat
tctgcgcggg acgtccttct gctacgtccc ttcggccctc 7620aatccagcgg accttccttc
ccgcggcctg ctgccggctc tgcggcctct tccgcgtctt 7680cgccttcgcc ctcagacgag
tcggatctcc ctttgggccg cctccccgcc tggaattaat 7740tctgcagtcg agacctagaa
aaacatggag caatcacaag tagcaataca gcagctacca 7800atgctgattg tgcctggcta
gaagcacaag aggaggagga ggtgggtttt ccagtcacac 7860ctcaggtacc tttaagacca
atgacttaca aggcagctgt agatcttagc cactttttaa 7920aagaaaagag gggactggaa
gggctaattc actcccaacg aagacaagat atccttgatc 7980tgtggatcta ccacacacaa
ggctacttcc ctgattagca gaactacaca ccagggccag 8040gggtcagata tccactgacc
tttggatggt gctacaagct agtaccagtt gagccagata 8100aggtagaaga ggccaataaa
ggagagaaca ccagcttgtt acaccctgtg agcctgcatg 8160ggatggatga cccggagaga
gaagtgttag agtggaggtt tgacagccgc ctagcatttc 8220atcacgtggc ccgagagctg
catccggagt acttcaagaa ctgctgatat cgagcttgct 8280acaagggact ttccgctggg
gactttccag ggaggcgtgg cctgggcggg actggggagt 8340ggcgagccct cagatcctgc
atataagcag ctgctttttg cctgtactgg gtctctctgg 8400ttagaccaga tctgagcctg
ggagctctct ggctaactag ggaacccact gcttaagcct 8460caataaagct tgccttgagt
gcttcaagta gtgtgtgccc gtctgttgtg tgactctggt 8520aactagagat ccctcagacc
cttttagtca gtgtggaaaa tctctagcag tagtagttca 8580tgtcatctta ttattcagta
tttataactt gcaaagaaat gaatatcaga gagtgagagg 8640ccttgacatt gctagcgttt
taccgtcgac ctctagctag agcttggcgt aatcatggtc 8700atagctgttt cctgtgtgaa
attgttatcc gctcacaatt ccacacaaca tacgagccgg 8760aagcataaag tgtaaagcct
ggggtgccta atgagtgagc taactcacat taattgcgtt 8820gcgctcactg cccgctttcc
agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg 8880ccaacgcgcg gggagaggcg
gtttgcgtat tgggcgctct tccgcttcct cgctcactga 8940ctcgctgcgc tcggtcgttc
ggctgcggcg agcggtatca gctcactcaa aggcggtaat 9000acggttatcc acagaatcag
gggataacgc aggaaagaac atgtgagcaa aaggccagca 9060aaaggccagg aaccgtaaaa
aggccgcgtt gctggcgttt ttccataggc tccgcccccc 9120tgacgagcat cacaaaaatc
gacgctcaag tcagaggtgg cgaaacccga caggactata 9180aagataccag gcgtttcccc
ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 9240gcttaccgga tacctgtccg
cctttctccc ttcgggaagc gtggcgcttt ctcatagctc 9300acgctgtagg tatctcagtt
cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 9360accccccgtt cagcccgacc
gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 9420ggtaagacac gacttatcgc
cactggcagc agccactggt aacaggatta gcagagcgag 9480gtatgtaggc ggtgctacag
agttcttgaa gtggtggcct aactacggct acactagaag 9540aacagtattt ggtatctgcg
ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 9600ctcttgatcc ggcaaacaaa
ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 9660gattacgcgc agaaaaaaag
gatctcaaga agatcctttg atcttttcta cggggtctga 9720cgctcagtgg aacgaaaact
cacgttaagg gattttggtc atgagattat caaaaaggat 9780cttcacctag atccttttaa
attaaaaatg aagttttaaa tcaatctaaa gtatatatga 9840gtaaacttgg tctgacagtt
accaatgctt aatcagtgag gcacctatct cagcgatctg 9900tctatttcgt tcatccatag
ttgcctgact ccccgtcgtg tagataacta cgatacggga 9960gggcttacca tctggcccca
gtgctgcaat gataccgcga gacccacgct caccggctcc 10020agatttatca gcaataaacc
agccagccgg aagggccgag cgcagaagtg gtcctgcaac 10080tttatccgcc tccatccagt
ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 10140agttaatagt ttgcgcaacg
ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 10200gtttggtatg gcttcattca
gctccggttc ccaacgatca aggcgagtta catgatcccc 10260catgttgtgc aaaaaagcgg
ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 10320ggccgcagtg ttatcactca
tggttatggc agcactgcat aattctctta ctgtcatgcc 10380atccgtaaga tgcttttctg
tgactggtga gtactcaacc aagtcattct gagaatagtg 10440tatgcggcga ccgagttgct
cttgcccggc gtcaatacgg gataataccg cgccacatag 10500cagaacttta aaagtgctca
tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 10560cttaccgctg ttgagatcca
gttcgatgta acccactcgt gcacccaact gatcttcagc 10620atcttttact ttcaccagcg
tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 10680aaagggaata agggcgacac
ggaaatgttg aatactcata ctcttccttt ttcaatatta 10740ttgaagcatt tatcagggtt
attgtctcat gagcggatac atatttgaat gtatttagaa 10800aaataaacaa ataggggttc
cgcgcacatt tccccgaaaa gtgccacctg acgtcgacgg 10860atcgggagat caacttgttt
attgcagctt ataatggtta caaataaagc aatagcatca 10920caaatttcac aaataaagca
tttttttcac tgcattctag ttgtggtttg tccaaactca 10980tcaatgtatc ttatcatgtc
tggatcaact ggataactca agctaaccaa aatcatccca 11040aacttcccac cccataccct
attaccactg ccaattacct gtggtttcat ttactctaaa 11100cctgtgattc ctctgaatta
ttttcatttt aaagaaattg tatttgttaa atatgtacta 11160caaacttagt agtttttaaa
gaaattgtat ttgttaaata tgtactacaa acttagtagt 11220
SEQ ID NO: 43; length: 11236; type: DNA; organism: Artificial Sequence; other information: vector
tggaagggct aattcactcc caaagaagac aagatatcct tgatctgtgg
atctaccaca 60cacaaggcta cttccctgat tagcagaact acacaccagg gccaggggtc
agatatccac 120tgacctttgg atggtgctac aagctagtac cagttgagcc agataaggta
gaagaggcca 180ataaaggaga gaacaccagc ttgttacacc ctgtgagcct gcatgggatg
gatgacccgg 240agagagaagt gttagagtgg aggtttgaca gccgcctagc atttcatcac
gtggcccgag 300agctgcatcc ggagtacttc aagaactgct gatatcgagc ttgctacaag
ggactttccg 360ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga
gccctcagat 420cctgcatata agcagctgct ttttgcctgt actgggtctc tctggttaga
ccagatctga 480gcctgggagc tctctggcta actagggaac ccactgctta agcctcaata
aagcttgcct 540tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact ctggtaacta
gagatccctc 600agaccctttt agtcagtgtg gaaaatctct agcagtggcg cccgaacagg
gacttgaaag 660cgaaagggaa accagaggag ctctctcgac gcaggactcg gcttgctgaa
gcgcgcacgg 720caagaggcga ggggcggcga ctggtgagta cgccaaaaat tttgactagc
ggaggctaga 780aggagagaga tgggtgcgag agcgtcagta ttaagcgggg gagaattaga
tcgcgatggg 840aaaaaattcg gttaaggcca gggggaaaga aaaaatataa attaaaacat
atagtatggg 900caagcaggga gctagaacga ttcgcagtta atcctggcct gttagaaaca
tcagaaggct 960gtagacaaat actgggacag ctacaaccat cccttcagac aggatcagaa
gaacttagat 1020cattatataa tacagtagca accctctatt gtgtgcatca aaggatagag
ataaaagaca 1080ccaaggaagc tttagacaag atagaggaag agcaaaacaa aagtaagacc
accgcacagc 1140aagcggccgg ccgctgatct tcagacctgg aggaggagat atgagggaca
attggagaag 1200tgaattatat aaatataaag tagtaaaaat tgaaccatta ggagtagcac
ccaccaaggc 1260aaagagaaga gtggtgcaga gagaaaaaag agcagtggga ataggagctt
tgttccttgg 1320gttcttggga gcagcaggaa gcactatggg cgcagcgtca atgacgctga
cggtacaggc 1380cagacaatta ttgtctggta tagtgcagca gcagaacaat ttgctgaggg
ctattgaggc 1440gcaacagcat ctgttgcaac tcacagtctg gggcatcaag cagctccagg
caagaatcct 1500ggctgtggaa agatacctaa aggatcaaca gctcctgggg atttggggtt
gctctggaaa 1560actcatttgc accactgctg tgccttggaa tgctagttgg agtaataaat
ctctggaaca 1620gatttggaat cacacgacct ggatggagtg ggacagagaa attaacaatt
acacaagctt 1680aatacactcc ttaattgaag aatcgcaaaa ccagcaagaa aagaatgaac
aagaattatt 1740ggaattagat aaatgggcaa gtttgtggaa ttggtttaac ataacaaatt
ggctgtggta 1800tataaaatta ttcataatga tagtaggagg cttggtaggt ttaagaatag
tttttgctgt 1860actttctata gtgaatagag ttaggcaggg atattcacca ttatcgtttc
agacccacct 1920cccaaccccg aggggggacc cgacaggccc gaaggaatag aagaagaagg
tggagagaga 1980gacagagaca gatccattcg attagtgaac ggatctcgac ggtcgccaaa
tggcagtatt 2040catccacaat tttaaaagaa aaggggggat tggggggtac agtgcagggg
aaagaatagt 2100agacataata gcaacagaca tacaaactaa agaattacaa aaacaaatta
caaaaattca 2160aaattttcgg gtttattaca gggacagcag agatccagtt tggatcgata
agcttgatat 2220cgaattcctg cgcggccgct tcgaacgcgc gcgatgcatc atatgcgtac
gcggtctaga 2280actagtgaga cataaaagga aaatgaagcg agcaacaatt aaaaaaaatt
ccccgcacac 2340aacaatacaa tctatttaaa ctgtggctca tacttttcat accaatggta
tgactttttt 2400tctggagtcc cctcttctga ttcttgaact ccggggctgg cagcttgcaa
aggggaagcg 2460gactccagca ctgcacgggc aggtttagca aaggtctcta atgggtattt
tctttttctt 2520agccctgccc ccgaattgtc agacggcggg cgtctgcctc tgaagttagc
agtgatttcc 2580tttcgggcct ggccttatct ccggctgcac gttgcctgtt ggtgactaat
aacacaataa 2640cattgtctgg ggctggaata aagtcggagc tgtttacccc cactctaata
ggggttcaat 2700ataaaaagcc ggcagagagc tgtccaagtc agacgcgcct cagcgctgga
tctcgggctc 2760gaggccacca tgcagatcga gctgtccacc tgcttttttc tgtgcctgct
gcggttctgc 2820ttcagcgcca cccggcggta ctacctgggc gccgtggagc tgtcctggga
ctacatgcag 2880agcgacctgg gcgagctgcc cgtggacgcc cggttccccc ccagagtgcc
caagagcttc 2940cccttcaaca ccagcgtggt gtacaagaaa accctgttcg tggagttcac
cgaccacctg 3000ttcaatatcg ccaagcccag gcccccctgg atgggcctgc tgggccccac
catccaggcc 3060gaggtgtacg acaccgtggt gatcaccctg aagaacatgg ccagccaccc
cgtgagcctg 3120cacgccgtgg gcgtgagcta ctggaaggcc agcgagggcg ccgagtacga
cgaccagacc 3180agccagcggg agaaagaaga tgacaaggtg ttccctggcg gcagccacac
ctacgtgtgg 3240caggtgctga aagaaaacgg ccccatggcc tccgaccccc tgtgcctgac
ctacagctac 3300ctgagccacg tggacctggt gaaggacctg aacagcggcc tgatcggcgc
tctgctcgtc 3360tgccgggagg gcagcctggc caaagagaaa acccagaccc tgcacaagtt
catcctgctg 3420ttcgccgtgt tcgacgaggg caagagctgg cacagcgaga caaagaacag
cctgatgcag 3480gaccgggacg ccgcctctgc cagagcctgg cccaagatgc acaccgtgaa
cggctacgtg 3540aacagaagcc tgcccggcct gattggctgc caccggaaga gcgtgtactg
gcacgtgatc 3600ggcatgggca ccacacccga ggtgcacagc atctttctgg aagggcacac
ctttctggtc 3660cggaaccacc ggcaggccag cctggaaatc agccctatca ccttcctgac
cgcccagaca 3720ctgctgatgg acctgggcca gttcctgctg ttttgccaca tcagctctca
ccagcacgac 3780ggcatggaag cctacgtgaa ggtggactct tgccccgagg aaccccagct
gcggatgaag 3840aacaacgagg aagccgagga ctacgacgac gacctgaccg acagcgagat
ggacgtggtg 3900cggttcgacg acgacaacag ccccagcttc atccagatca gaagcgtggc
caagaagcac 3960cccaagacct gggtgcacta tatcgccgcc gaggaagagg actgggacta
cgcccccctg 4020gtgctggccc ccgacgacag aagctacaag agccagtacc tgaacaatgg
cccccagcgg 4080atcggccgga agtacaagaa agtgcggttc atggcctaca ccgacgagac
attcaagacc 4140cgggaggcca tccagcacga gagcggcatc ctgggccccc tgctgtacgg
cgaagtgggc 4200gacacactgc tgatcatctt caagaaccag gctagccggc cctacaacat
ctacccccac 4260ggcatcaccg acgtgcggcc cctgtacagc aggcggctgc ccaagggcgt
gaagcacctg 4320aaggacttcc ccatcctgcc cggcgagatc ttcaagtaca agtggaccgt
gaccgtggag 4380gacggcccca ccaagagcga ccccagatgc ctgacccggt actacagcag
cttcgtgaac 4440atggaacggg acctggcctc cgggctgatc ggacctctgc tgatctgcta
caaagaaagc 4500gtggaccagc ggggcaacca gatcatgagc gacaagcgga acgtgatcct
gttcagcgtg 4560ttcgatgaga accggtcctg gtatctgacc gagaacatcc agcggtttct
gcccaaccct 4620gccggcgtgc agctggaaga tcccgagttc caggccagca acatcatgca
ctccatcaat 4680ggctacgtgt tcgactctct gcagctctcc gtgtgtctgc acgaggtggc
ctactggtac 4740atcctgagca tcggcgccca gaccgacttc ctgagcgtgt tcttcagcgg
ctacaccttc 4800aagcacaaga tggtgtacga ggacaccctg accctgttcc ctttcagcgg
cgagacagtg 4860ttcatgagca tggaaaaccc cggcctgtgg attctgggct gccacaacag
cgacttccgg 4920aaccggggca tgaccgccct gctgaaggtg tccagctgcg acaagaacac
cggcgactac 4980tacgaggaca gctacgagga tatcagcgcc tacctgctgt ccaagaacaa
cgccatcgaa 5040ccccggagct tcagccagaa cccccccgtg ctgacgcgtc accagcggga
gatcacccgg 5100acaaccctgc agtccgacca ggaagagatc gattacgacg acaccatcag
cgtggagatg 5160aagaaagagg atttcgatat ctacgacgag gacgagaacc agagccccag
aagcttccag 5220aagaaaaccc ggcactactt cattgccgcc gtggagaggc tgtgggacta
cggcatgagt 5280tctagccccc acgtgctgcg gaaccgggcc cagagcggca gcgtgcccca
gttcaagaaa 5340gtggtgttcc aggaattcac agacggcagc ttcacccagc ctctgtatag
aggcgagctg 5400aacgagcacc tggggctgct ggggccctac atcagggccg aagtggagga
caacatcatg 5460gtgaccttcc ggaatcaggc cagcagaccc tactccttct acagcagcct
gatcagctac 5520gaagaggacc agcggcaggg cgccgaaccc cggaagaact tcgtgaagcc
caacgaaacc 5580aagacctact tctggaaagt gcagcaccac atggccccca ccaaggacga
gttcgactgc 5640aaggcctggg cctacttcag cgacgtggat ctggaaaagg acgtgcactc
tggactgatt 5700ggcccactcc tggtctgcca cactaacacc ctcaaccccg cccacggccg
ccaggtgacc 5760gtgcaggaat tcgccctgtt cttcaccatc ttcgacgaga caaagtcctg
gtacttcacc 5820gagaatatgg aacggaactg cagagccccc tgcaacatcc agatggaaga
tcctaccttc 5880aaagagaact accggttcca cgccatcaac ggctacatca tggacaccct
gcctggcctg 5940gtgatggccc aggaccagag aatccggtgg tatctgctgt ccatgggcag
caacgagaat 6000atccacagca tccacttcag cggccacgtg ttcaccgtgc ggaagaaaga
agagtacaag 6060atggccctgt acaacctgta ccccggcgtg ttcgagacag tggagatgct
gcccagcaag 6120gccggcatct ggcgggtgga gtgtctgatc ggcgagcacc tgcacgctgg
catgagcacc 6180ctgtttctgg tgtacagcaa caagtgccag accccactgg gcatggcctc
tggccacatc 6240cgggacttcc agatcaccgc ctccggccag tacggccagt gggcccccaa
gctggccaga 6300ctgcactaca gcggcagcat caacgcctgg tccaccaaag agcccttcag
ctggatcaag 6360gtggacctgc tggcccctat gatcatccac ggcattaaga cccagggcgc
caggcagaag 6420ttcagcagcc tgtacatcag ccagttcatc atcatgtaca gcctggacgg
caagaagtgg 6480cagacctacc ggggcaacag caccggcacc ctgatggtgt tcttcggcaa
tgtggacagc 6540agcggcatca agcacaacat cttcaacccc cccatcattg cccggtacat
ccggctgcac 6600cccacccact acagcattag atccacactg agaatggaac tgatgggctg
cgacctgaac 6660tcctgcagca tgcctctggg catggaaagc aaggccatca gcgacgccca
gatcacagcc 6720agcagctact tcaccaacat gttcgccacc tggtccccct ccaaggccag
gctgcacctg 6780cagggccggt ccaacgcctg gcggcctcag gtcaacaacc ccaaagaatg
gctgcaggtg 6840gactttcaga aaaccatgaa ggtgaccggc gtgaccaccc agggcgtgaa
aagcctgctg 6900accagcatgt acgtgaaaga gtttctgatc agcagctctc aggatggcca
ccagtggacc 6960ctgttctttc agaacggcaa ggtgaaagtg ttccagggca accaggactc
cttcaccccc 7020gtggtgaact ccctggaccc ccccctgctg acccgctacc tgagaatcca
cccccagtct 7080tgggtgcacc agatcgccct caggatggaa gtcctgggat gtgaggccca
ggatctgtac 7140tgatgacgtc tggaacaatc aacctctgga ttacaaaatt tgtgaaagat
tgactggtat 7200tcttaactat gttgctcctt ttacgctatg tggatacgct gctttaatgc
ctttgtatca 7260tgctattgct tcccgtatgg ctttcatttt ctcctccttg tataaatcct
ggttgctgtc 7320tctttatgag gagttgtggc ccgttgtcag gcaacgtggc gtggtgtgca
ctgtgtttgc 7380tgacgcaacc cccactggtt ggggcattgc caccacctgt cagctccttt
ccgggacttt 7440cgctttcccc ctccctattg ccacggcgga actcatcgcc gcctgccttg
cccgctgctg 7500gacaggggct cggctgttgg gcactgacaa ttccgtggtg ttgtcgggga
agctgacgtc 7560ctttccatgg ctgctcgcct gtgttgccac ctggattctg cgcgggacgt
ccttctgcta 7620cgtcccttcg gccctcaatc cagcggacct tccttcccgc ggcctgctgc
cggctctgcg 7680gcctcttccg cgtcttcgcc ttcgccctca gacgagtcgg atctcccttt
gggccgcctc 7740cccgcctgga attaattctg cagtcgagac ctagaaaaac atggagcaat
cacaagtagc 7800aatacagcag ctaccaatgc tgattgtgcc tggctagaag cacaagagga
ggaggaggtg 7860ggttttccag tcacacctca ggtaccttta agaccaatga cttacaaggc
agctgtagat 7920cttagccact ttttaaaaga aaagagggga ctggaagggc taattcactc
ccaacgaaga 7980caagatatcc ttgatctgtg gatctaccac acacaaggct acttccctga
ttagcagaac 8040tacacaccag ggccaggggt cagatatcca ctgacctttg gatggtgcta
caagctagta 8100ccagttgagc cagataaggt agaagaggcc aataaaggag agaacaccag
cttgttacac 8160cctgtgagcc tgcatgggat ggatgacccg gagagagaag tgttagagtg
gaggtttgac 8220agccgcctag catttcatca cgtggcccga gagctgcatc cggagtactt
caagaactgc 8280tgatatcgag cttgctacaa gggactttcc gctggggact ttccagggag
gcgtggcctg 8340ggcgggactg gggagtggcg agccctcaga tcctgcatat aagcagctgc
tttttgcctg 8400tactgggtct ctctggttag accagatctg agcctgggag ctctctggct
aactagggaa 8460cccactgctt aagcctcaat aaagcttgcc ttgagtgctt caagtagtgt
gtgcccgtct 8520gttgtgtgac tctggtaact agagatccct cagacccttt tagtcagtgt
ggaaaatctc 8580tagcagtagt agttcatgtc atcttattat tcagtattta taacttgcaa
agaaatgaat 8640atcagagagt gagaggcctt gacattgcta gcgttttacc gtcgacctct
agctagagct 8700tggcgtaatc atggtcatag ctgtttcctg tgtgaaattg ttatccgctc
acaattccac 8760acaacatacg agccggaagc ataaagtgta aagcctgggg tgcctaatga
gtgagctaac 8820tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg
tcgtgccagc 8880tgcattaatg aatcggccaa cgcgcgggga gaggcggttt gcgtattggg
cgctcttccg 8940cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg
gtatcagctc 9000actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga
aagaacatgt 9060gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg
gcgtttttcc 9120ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag
aggtggcgaa 9180acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc
gtgcgctctc 9240ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg
ggaagcgtgg 9300cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt
cgctccaagc 9360tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc
ggtaactatc 9420gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc
actggtaaca 9480ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg
tggcctaact 9540acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca
gttaccttcg 9600gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc
ggtggttttt 9660ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat
cctttgatct 9720tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt
ttggtcatga 9780gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt
tttaaatcaa 9840tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc
agtgaggcac 9900ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc
gtcgtgtaga 9960taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata
ccgcgagacc 10020cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg
gccgagcgca 10080gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc
cgggaagcta 10140gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct
acaggcatcg 10200tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa
cgatcaaggc 10260gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt
cctccgatcg 10320ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca
ctgcataatt 10380ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac
tcaaccaagt 10440cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca
atacgggata 10500ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt
tcttcggggc 10560gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc
actcgtgcac 10620ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca
aaaacaggaa 10680ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata
ctcatactct 10740tcctttttca atattattga agcatttatc agggttattg tctcatgagc
ggatacatat 10800ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc
cgaaaagtgc 10860cacctgacgt cgacggatcg ggagatcaac ttgtttattg cagcttataa
tggttacaaa 10920taaagcaata gcatcacaaa tttcacaaat aaagcatttt tttcactgca
ttctagttgt 10980ggtttgtcca aactcatcaa tgtatcttat catgtctgga tcaactggat
aactcaagct 11040aaccaaaatc atcccaaact tcccacccca taccctatta ccactgccaa
ttacctgtgg 11100tttcatttac tctaaacctg tgattcctct gaattatttt cattttaaag
aaattgtatt 11160tgttaaatat gtactacaaa cttagtagtt tttaaagaaa ttgtatttgt
taaatatgta 11220ctacaaactt agtagt 11236
SEQ ID NO: 44; length: 11177; type: DNA; organism: Artificial Sequence; other information: vector
tggaagggct
aattcactcc caaagaagac aagatatcct tgatctgtgg atctaccaca 60cacaaggcta
cttccctgat tagcagaact acacaccagg gccaggggtc agatatccac 120tgacctttgg
atggtgctac aagctagtac cagttgagcc agataaggta gaagaggcca 180ataaaggaga
gaacaccagc ttgttacacc ctgtgagcct gcatgggatg gatgacccgg 240agagagaagt
gttagagtgg aggtttgaca gccgcctagc atttcatcac gtggcccgag 300agctgcatcc
ggagtacttc aagaactgct gatatcgagc ttgctacaag ggactttccg 360ctggggactt
tccagggagg cgtggcctgg gcgggactgg ggagtggcga gccctcagat 420cctgcatata
agcagctgct ttttgcctgt actgggtctc tctggttaga ccagatctga 480gcctgggagc
tctctggcta actagggaac ccactgctta agcctcaata aagcttgcct 540tgagtgcttc
aagtagtgtg tgcccgtctg ttgtgtgact ctggtaacta gagatccctc 600agaccctttt
agtcagtgtg gaaaatctct agcagtggcg cccgaacagg gacttgaaag 660cgaaagggaa
accagaggag ctctctcgac gcaggactcg gcttgctgaa gcgcgcacgg 720caagaggcga
ggggcggcga ctggtgagta cgccaaaaat tttgactagc ggaggctaga 780aggagagaga
tgggtgcgag agcgtcagta ttaagcgggg gagaattaga tcgcgatggg 840aaaaaattcg
gttaaggcca gggggaaaga aaaaatataa attaaaacat atagtatggg 900caagcaggga
gctagaacga ttcgcagtta atcctggcct gttagaaaca tcagaaggct 960gtagacaaat
actgggacag ctacaaccat cccttcagac aggatcagaa gaacttagat 1020cattatataa
tacagtagca accctctatt gtgtgcatca aaggatagag ataaaagaca 1080ccaaggaagc
tttagacaag atagaggaag agcaaaacaa aagtaagacc accgcacagc 1140aagcggccgg
ccgctgatct tcagacctgg aggaggagat atgagggaca attggagaag 1200tgaattatat
aaatataaag tagtaaaaat tgaaccatta ggagtagcac ccaccaaggc 1260aaagagaaga
gtggtgcaga gagaaaaaag agcagtggga ataggagctt tgttccttgg 1320gttcttggga
gcagcaggaa gcactatggg cgcagcgtca atgacgctga cggtacaggc 1380cagacaatta
ttgtctggta tagtgcagca gcagaacaat ttgctgaggg ctattgaggc 1440gcaacagcat
ctgttgcaac tcacagtctg gggcatcaag cagctccagg caagaatcct 1500ggctgtggaa
agatacctaa aggatcaaca gctcctgggg atttggggtt gctctggaaa 1560actcatttgc
accactgctg tgccttggaa tgctagttgg agtaataaat ctctggaaca 1620gatttggaat
cacacgacct ggatggagtg ggacagagaa attaacaatt acacaagctt 1680aatacactcc
ttaattgaag aatcgcaaaa ccagcaagaa aagaatgaac aagaattatt 1740ggaattagat
aaatgggcaa gtttgtggaa ttggtttaac ataacaaatt ggctgtggta 1800tataaaatta
ttcataatga tagtaggagg cttggtaggt ttaagaatag tttttgctgt 1860actttctata
gtgaatagag ttaggcaggg atattcacca ttatcgtttc agacccacct 1920cccaaccccg
aggggggacc cgacaggccc gaaggaatag aagaagaagg tggagagaga 1980gacagagaca
gatccattcg attagtgaac ggatctcgac ggtcgccaaa tggcagtatt 2040catccacaat
tttaaaagaa aaggggggat tggggggtac agtgcagggg aaagaatagt 2100agacataata
gcaacagaca tacaaactaa agaattacaa aaacaaatta caaaaattca 2160aaattttcgg
gtttattaca gggacagcag agatccagtt tggatcgata agcttgatat 2220cgaattcctg
cgcggccgct tcgaacgcgc gcgatgcatc atatgcgtac gcggtctaga 2280attcctgcag
ggcccactag tctcccaggc atgactccaa caatgcatcc catgggattt 2340ggggttcccc
agatctgggg cttgtaggcc tgactctccc ctgtgcacac gtctcataca 2400cgcatgcgtg
cacccattgc ctgccccgcc ccttgcacag ggagtcagca gggaggactg 2460ggttatgccc
tgcttatcag cagcttccca gcttcctctg cctggattct tagaggcctg 2520gggtcctaga
acgagctggt gcacgtggct tcccaaagat ctctcagata atgagaggaa 2580atgcagtcat
cagtttgcag aaggctaggg attctgggcc atagctcaga cctgcgccca 2640ccatctccct
ccaggcagcc cttggctggt ccctgcgagc ccgtggagac tgccagtcag 2700cgctggatcc
atgcagatcg agctgtccac ctgctttttt ctgtgcctgc tgcggttctg 2760cttcagcgcc
acccggcggt actacctggg cgccgtggag ctgtcctggg actacatgca 2820gagcgacctg
ggcgagctgc ccgtggacgc ccggttcccc cccagagtgc ccaagagctt 2880ccccttcaac
accagcgtgg tgtacaagaa aaccctgttc gtggagttca ccgaccacct 2940gttcaatatc
gccaagccca ggcccccctg gatgggcctg ctgggcccca ccatccaggc 3000cgaggtgtac
gacaccgtgg tgatcaccct gaagaacatg gccagccacc ccgtgagcct 3060gcacgccgtg
ggcgtgagct actggaaggc cagcgagggc gccgagtacg acgaccagac 3120cagccagcgg
gagaaagaag atgacaaggt gttccctggc ggcagccaca cctacgtgtg 3180gcaggtgctg
aaagaaaacg gccccatggc ctccgacccc ctgtgcctga cctacagcta 3240cctgagccac
gtggacctgg tgaaggacct gaacagcggc ctgatcggcg ctctgctcgt 3300ctgccgggag
ggcagcctgg ccaaagagaa aacccagacc ctgcacaagt tcatcctgct 3360gttcgccgtg
ttcgacgagg gcaagagctg gcacagcgag acaaagaaca gcctgatgca 3420ggaccgggac
gccgcctctg ccagagcctg gcccaagatg cacaccgtga acggctacgt 3480gaacagaagc
ctgcccggcc tgattggctg ccaccggaag agcgtgtact ggcacgtgat 3540cggcatgggc
accacacccg aggtgcacag catctttctg gaagggcaca cctttctggt 3600ccggaaccac
cggcaggcca gcctggaaat cagccctatc accttcctga ccgcccagac 3660actgctgatg
gacctgggcc agttcctgct gttttgccac atcagctctc accagcacga 3720cggcatggaa
gcctacgtga aggtggactc ttgccccgag gaaccccagc tgcggatgaa 3780gaacaacgag
gaagccgagg actacgacga cgacctgacc gacagcgaga tggacgtggt 3840gcggttcgac
gacgacaaca gccccagctt catccagatc agaagcgtgg ccaagaagca 3900ccccaagacc
tgggtgcact atatcgccgc cgaggaagag gactgggact acgcccccct 3960ggtgctggcc
cccgacgaca gaagctacaa gagccagtac ctgaacaatg gcccccagcg 4020gatcggccgg
aagtacaaga aagtgcggtt catggcctac accgacgaga cattcaagac 4080ccgggaggcc
atccagcacg agagcggcat cctgggcccc ctgctgtacg gcgaagtggg 4140cgacacactg
ctgatcatct tcaagaacca ggctagccgg ccctacaaca tctaccccca 4200cggcatcacc
gacgtgcggc ccctgtacag caggcggctg cccaagggcg tgaagcacct 4260gaaggacttc
cccatcctgc ccggcgagat cttcaagtac aagtggaccg tgaccgtgga 4320ggacggcccc
accaagagcg accccagatg cctgacccgg tactacagca gcttcgtgaa 4380catggaacgg
gacctggcct ccgggctgat cggacctctg ctgatctgct acaaagaaag 4440cgtggaccag
cggggcaacc agatcatgag cgacaagcgg aacgtgatcc tgttcagcgt 4500gttcgatgag
aaccggtcct ggtatctgac cgagaacatc cagcggtttc tgcccaaccc 4560tgccggcgtg
cagctggaag atcccgagtt ccaggccagc aacatcatgc actccatcaa 4620tggctacgtg
ttcgactctc tgcagctctc cgtgtgtctg cacgaggtgg cctactggta 4680catcctgagc
atcggcgccc agaccgactt cctgagcgtg ttcttcagcg gctacacctt 4740caagcacaag
atggtgtacg aggacaccct gaccctgttc cctttcagcg gcgagacagt 4800gttcatgagc
atggaaaacc ccggcctgtg gattctgggc tgccacaaca gcgacttccg 4860gaaccggggc
atgaccgccc tgctgaaggt gtccagctgc gacaagaaca ccggcgacta 4920ctacgaggac
agctacgagg atatcagcgc ctacctgctg tccaagaaca acgccatcga 4980accccggagc
ttcagccaga acccccccgt gctgacgcgt caccagcggg agatcacccg 5040gacaaccctg
cagtccgacc aggaagagat cgattacgac gacaccatca gcgtggagat 5100gaagaaagag
gatttcgata tctacgacga ggacgagaac cagagcccca gaagcttcca 5160gaagaaaacc
cggcactact tcattgccgc cgtggagagg ctgtgggact acggcatgag 5220ttctagcccc
cacgtgctgc ggaaccgggc ccagagcggc agcgtgcccc agttcaagaa 5280agtggtgttc
caggaattca cagacggcag cttcacccag cctctgtata gaggcgagct 5340gaacgagcac
ctggggctgc tggggcccta catcagggcc gaagtggagg acaacatcat 5400ggtgaccttc
cggaatcagg ccagcagacc ctactccttc tacagcagcc tgatcagcta 5460cgaagaggac
cagcggcagg gcgccgaacc ccggaagaac ttcgtgaagc ccaacgaaac 5520caagacctac
ttctggaaag tgcagcacca catggccccc accaaggacg agttcgactg 5580caaggcctgg
gcctacttca gcgacgtgga tctggaaaag gacgtgcact ctggactgat 5640tggcccactc
ctggtctgcc acactaacac cctcaacccc gcccacggcc gccaggtgac 5700cgtgcaggaa
ttcgccctgt tcttcaccat cttcgacgag acaaagtcct ggtacttcac 5760cgagaatatg
gaacggaact gcagagcccc ctgcaacatc cagatggaag atcctacctt 5820caaagagaac
taccggttcc acgccatcaa cggctacatc atggacaccc tgcctggcct 5880ggtgatggcc
caggaccaga gaatccggtg gtatctgctg tccatgggca gcaacgagaa 5940tatccacagc
atccacttca gcggccacgt gttcaccgtg cggaagaaag aagagtacaa 6000gatggccctg
tacaacctgt accccggcgt gttcgagaca gtggagatgc tgcccagcaa 6060ggccggcatc
tggcgggtgg agtgtctgat cggcgagcac ctgcacgctg gcatgagcac 6120cctgtttctg
gtgtacagca acaagtgcca gaccccactg ggcatggcct ctggccacat 6180ccgggacttc
cagatcaccg cctccggcca gtacggccag tgggccccca agctggccag 6240actgcactac
agcggcagca tcaacgcctg gtccaccaaa gagcccttca gctggatcaa 6300ggtggacctg
ctggccccta tgatcatcca cggcattaag acccagggcg ccaggcagaa 6360gttcagcagc
ctgtacatca gccagttcat catcatgtac agcctggacg gcaagaagtg 6420gcagacctac
cggggcaaca gcaccggcac cctgatggtg ttcttcggca atgtggacag 6480cagcggcatc
aagcacaaca tcttcaaccc ccccatcatt gcccggtaca tccggctgca 6540ccccacccac
tacagcatta gatccacact gagaatggaa ctgatgggct gcgacctgaa 6600ctcctgcagc
atgcctctgg gcatggaaag caaggccatc agcgacgccc agatcacagc 6660cagcagctac
ttcaccaaca tgttcgccac ctggtccccc tccaaggcca ggctgcacct 6720gcagggccgg
tccaacgcct ggcggcctca ggtcaacaac cccaaagaat ggctgcaggt 6780ggactttcag
aaaaccatga aggtgaccgg cgtgaccacc cagggcgtga aaagcctgct 6840gaccagcatg
tacgtgaaag agtttctgat cagcagctct caggatggcc accagtggac 6900cctgttcttt
cagaacggca aggtgaaagt gttccagggc aaccaggact ccttcacccc 6960cgtggtgaac
tccctggacc cccccctgct gacccgctac ctgagaatcc acccccagtc 7020ttgggtgcac
cagatcgccc tcaggatgga agtcctggga tgtgaggccc aggatctgta 7080ctgatgacgt
ctggaacaat caacctctgg attacaaaat ttgtgaaaga ttgactggta 7140ttcttaacta
tgttgctcct tttacgctat gtggatacgc tgctttaatg cctttgtatc 7200atgctattgc
ttcccgtatg gctttcattt tctcctcctt gtataaatcc tggttgctgt 7260ctctttatga
ggagttgtgg cccgttgtca ggcaacgtgg cgtggtgtgc actgtgtttg 7320ctgacgcaac
ccccactggt tggggcattg ccaccacctg tcagctcctt tccgggactt 7380tcgctttccc
cctccctatt gccacggcgg aactcatcgc cgcctgcctt gcccgctgct 7440ggacaggggc
tcggctgttg ggcactgaca attccgtggt gttgtcgggg aagctgacgt 7500cctttccatg
gctgctcgcc tgtgttgcca cctggattct gcgcgggacg tccttctgct 7560acgtcccttc
ggccctcaat ccagcggacc ttccttcccg cggcctgctg ccggctctgc 7620ggcctcttcc
gcgtcttcgc cttcgccctc agacgagtcg gatctccctt tgggccgcct 7680ccccgcctgg
aattaattct gcagtcgaga cctagaaaaa catggagcaa tcacaagtag 7740caatacagca
gctaccaatg ctgattgtgc ctggctagaa gcacaagagg aggaggaggt 7800gggttttcca
gtcacacctc aggtaccttt aagaccaatg acttacaagg cagctgtaga 7860tcttagccac
tttttaaaag aaaagagggg actggaaggg ctaattcact cccaacgaag 7920acaagatatc
cttgatctgt ggatctacca cacacaaggc tacttccctg attagcagaa 7980ctacacacca
gggccagggg tcagatatcc actgaccttt ggatggtgct acaagctagt 8040accagttgag
ccagataagg tagaagaggc caataaagga gagaacacca gcttgttaca 8100ccctgtgagc
ctgcatggga tggatgaccc ggagagagaa gtgttagagt ggaggtttga 8160cagccgccta
gcatttcatc acgtggcccg agagctgcat ccggagtact tcaagaactg 8220ctgatatcga
gcttgctaca agggactttc cgctggggac tttccaggga ggcgtggcct 8280gggcgggact
ggggagtggc gagccctcag atcctgcata taagcagctg ctttttgcct 8340gtactgggtc
tctctggtta gaccagatct gagcctggga gctctctggc taactaggga 8400acccactgct
taagcctcaa taaagcttgc cttgagtgct tcaagtagtg tgtgcccgtc 8460tgttgtgtga
ctctggtaac tagagatccc tcagaccctt ttagtcagtg tggaaaatct 8520ctagcagtag
tagttcatgt catcttatta ttcagtattt ataacttgca aagaaatgaa 8580tatcagagag
tgagaggcct tgacattgct agcgttttac cgtcgacctc tagctagagc 8640ttggcgtaat
catggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca 8700cacaacatac
gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa 8760ctcacattaa
ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag 8820ctgcattaat
gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc 8880gcttcctcgc
tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct 8940cactcaaagg
cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg 9000tgagcaaaag
gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc 9060cataggctcc
gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga 9120aacccgacag
gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct 9180cctgttccga
ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg 9240gcgctttctc
atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag 9300ctgggctgtg
tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat 9360cgtcttgagt
ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac 9420aggattagca
gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac 9480tacggctaca
ctagaagaac agtatttggt atctgcgctc tgctgaagcc agttaccttc 9540ggaaaaagag
ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt 9600tttgtttgca
agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc 9660ttttctacgg
ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg 9720agattatcaa
aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca 9780atctaaagta
tatatgagta aacttggtct gacagttacc aatgcttaat cagtgaggca 9840cctatctcag
cgatctgtct atttcgttca tccatagttg cctgactccc cgtcgtgtag 9900ataactacga
tacgggaggg cttaccatct ggccccagtg ctgcaatgat accgcgagac 9960ccacgctcac
cggctccaga tttatcagca ataaaccagc cagccggaag ggccgagcgc 10020agaagtggtc
ctgcaacttt atccgcctcc atccagtcta ttaattgttg ccgggaagct 10080agagtaagta
gttcgccagt taatagtttg cgcaacgttg ttgccattgc tacaggcatc 10140gtggtgtcac
gctcgtcgtt tggtatggct tcattcagct ccggttccca acgatcaagg 10200cgagttacat
gatcccccat gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc 10260gttgtcagaa
gtaagttggc cgcagtgtta tcactcatgg ttatggcagc actgcataat 10320tctcttactg
tcatgccatc cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag 10380tcattctgag
aatagtgtat gcggcgaccg agttgctctt gcccggcgtc aatacgggat 10440aataccgcgc
cacatagcag aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg 10500cgaaaactct
caaggatctt accgctgttg agatccagtt cgatgtaacc cactcgtgca 10560cccaactgat
cttcagcatc ttttactttc accagcgttt ctgggtgagc aaaaacagga 10620aggcaaaatg
ccgcaaaaaa gggaataagg gcgacacgga aatgttgaat actcatactc 10680ttcctttttc
aatattattg aagcatttat cagggttatt gtctcatgag cggatacata 10740tttgaatgta
tttagaaaaa taaacaaata ggggttccgc gcacatttcc ccgaaaagtg 10800ccacctgacg
tcgacggatc gggagatcaa cttgtttatt gcagcttata atggttacaa 10860ataaagcaat
agcatcacaa atttcacaaa taaagcattt ttttcactgc attctagttg 10920tggtttgtcc
aaactcatca atgtatctta tcatgtctgg atcaactgga taactcaagc 10980taaccaaaat
catcccaaac ttcccacccc ataccctatt accactgcca attacctgtg 11040gtttcattta
ctctaaacct gtgattcctc tgaattattt tcattttaaa gaaattgtat 11100ttgttaaata
tgtactacaa acttagtagt ttttaaagaa attgtatttg ttaaatatgt 11160actacaaact
tagtagt 11177
SEQ ID NO: 45 (length: 11196; type: DNA; organism: Artificial Sequence; other information: vector)
tggaagggct aattcactcc
caaagaagac aagatatcct tgatctgtgg atctaccaca 60cacaaggcta cttccctgat
tagcagaact acacaccagg gccaggggtc agatatccac 120tgacctttgg atggtgctac
aagctagtac cagttgagcc agataaggta gaagaggcca 180ataaaggaga gaacaccagc
ttgttacacc ctgtgagcct gcatgggatg gatgacccgg 240agagagaagt gttagagtgg
aggtttgaca gccgcctagc atttcatcac gtggcccgag 300agctgcatcc ggagtacttc
aagaactgct gatatcgagc ttgctacaag ggactttccg 360ctggggactt tccagggagg
cgtggcctgg gcgggactgg ggagtggcga gccctcagat 420cctgcatata agcagctgct
ttttgcctgt actgggtctc tctggttaga ccagatctga 480gcctgggagc tctctggcta
actagggaac ccactgctta agcctcaata aagcttgcct 540tgagtgcttc aagtagtgtg
tgcccgtctg ttgtgtgact ctggtaacta gagatccctc 600agaccctttt agtcagtgtg
gaaaatctct agcagtggcg cccgaacagg gacttgaaag 660cgaaagggaa accagaggag
ctctctcgac gcaggactcg gcttgctgaa gcgcgcacgg 720caagaggcga ggggcggcga
ctggtgagta cgccaaaaat tttgactagc ggaggctaga 780aggagagaga tgggtgcgag
agcgtcagta ttaagcgggg gagaattaga tcgcgatggg 840aaaaaattcg gttaaggcca
gggggaaaga aaaaatataa attaaaacat atagtatggg 900caagcaggga gctagaacga
ttcgcagtta atcctggcct gttagaaaca tcagaaggct 960gtagacaaat actgggacag
ctacaaccat cccttcagac aggatcagaa gaacttagat 1020cattatataa tacagtagca
accctctatt gtgtgcatca aaggatagag ataaaagaca 1080ccaaggaagc tttagacaag
atagaggaag agcaaaacaa aagtaagacc accgcacagc 1140aagcggccgg ccgctgatct
tcagacctgg aggaggagat atgagggaca attggagaag 1200tgaattatat aaatataaag
tagtaaaaat tgaaccatta ggagtagcac ccaccaaggc 1260aaagagaaga gtggtgcaga
gagaaaaaag agcagtggga ataggagctt tgttccttgg 1320gttcttggga gcagcaggaa
gcactatggg cgcagcgtca atgacgctga cggtacaggc 1380cagacaatta ttgtctggta
tagtgcagca gcagaacaat ttgctgaggg ctattgaggc 1440gcaacagcat ctgttgcaac
tcacagtctg gggcatcaag cagctccagg caagaatcct 1500ggctgtggaa agatacctaa
aggatcaaca gctcctgggg atttggggtt gctctggaaa 1560actcatttgc accactgctg
tgccttggaa tgctagttgg agtaataaat ctctggaaca 1620gatttggaat cacacgacct
ggatggagtg ggacagagaa attaacaatt acacaagctt 1680aatacactcc ttaattgaag
aatcgcaaaa ccagcaagaa aagaatgaac aagaattatt 1740ggaattagat aaatgggcaa
gtttgtggaa ttggtttaac ataacaaatt ggctgtggta 1800tataaaatta ttcataatga
tagtaggagg cttggtaggt ttaagaatag tttttgctgt 1860actttctata gtgaatagag
ttaggcaggg atattcacca ttatcgtttc agacccacct 1920cccaaccccg aggggggacc
cgacaggccc gaaggaatag aagaagaagg tggagagaga 1980gacagagaca gatccattcg
attagtgaac ggatctcgac ggtcgccaaa tggcagtatt 2040catccacaat tttaaaagaa
aaggggggat tggggggtac agtgcagggg aaagaatagt 2100agacataata gcaacagaca
tacaaactaa agaattacaa aaacaaatta caaaaattca 2160aaattttcgg gtttattaca
gggacagcag agatccagtt tggatcgata agcttgatat 2220cgaattcctg cgcggccgct
tcgaacgcgc gcgatgcatc atatgcgtac gcggtctaga 2280attcctgcag ggcccactag
tctcccaggc atgactccaa caatgcatcc catgggattt 2340ggggttcccc agatctgggg
cttgtaggcc tgactctccc ctgtgcacac gtctcataca 2400cgcatgcgtg cacccattgc
ctgccccgcc ccttgcacag ggagtcagca gggaggactg 2460ggttatgccc tgcttatcag
cagcttccca gcttcctctg cctggattct tagaggcctg 2520gggtcctaga acgagctggt
gcacgtggct tcccaaagat ctctcagata atgagaggaa 2580atgcagtcat cagtttgcag
aaggctaggg attctgggcc atagctcaga cctgcgccca 2640ccatctccct ccaggcagcc
cttggctggt ccctgcgagc ccgtggagac tgccagtcag 2700cgctgctgga tctcgggctc
gaggccacca tgcagatcga gctgtccacc tgcttttttc 2760tgtgcctgct gcggttctgc
ttcagcgcca cccggcggta ctacctgggc gccgtggagc 2820tgtcctggga ctacatgcag
agcgacctgg gcgagctgcc cgtggacgcc cggttccccc 2880ccagagtgcc caagagcttc
cccttcaaca ccagcgtggt gtacaagaaa accctgttcg 2940tggagttcac cgaccacctg
ttcaatatcg ccaagcccag gcccccctgg atgggcctgc 3000tgggccccac catccaggcc
gaggtgtacg acaccgtggt gatcaccctg aagaacatgg 3060ccagccaccc cgtgagcctg
cacgccgtgg gcgtgagcta ctggaaggcc agcgagggcg 3120ccgagtacga cgaccagacc
agccagcggg agaaagaaga tgacaaggtg ttccctggcg 3180gcagccacac ctacgtgtgg
caggtgctga aagaaaacgg ccccatggcc tccgaccccc 3240tgtgcctgac ctacagctac
ctgagccacg tggacctggt gaaggacctg aacagcggcc 3300tgatcggcgc tctgctcgtc
tgccgggagg gcagcctggc caaagagaaa acccagaccc 3360tgcacaagtt catcctgctg
ttcgccgtgt tcgacgaggg caagagctgg cacagcgaga 3420caaagaacag cctgatgcag
gaccgggacg ccgcctctgc cagagcctgg cccaagatgc 3480acaccgtgaa cggctacgtg
aacagaagcc tgcccggcct gattggctgc caccggaaga 3540gcgtgtactg gcacgtgatc
ggcatgggca ccacacccga ggtgcacagc atctttctgg 3600aagggcacac ctttctggtc
cggaaccacc ggcaggccag cctggaaatc agccctatca 3660ccttcctgac cgcccagaca
ctgctgatgg acctgggcca gttcctgctg ttttgccaca 3720tcagctctca ccagcacgac
ggcatggaag cctacgtgaa ggtggactct tgccccgagg 3780aaccccagct gcggatgaag
aacaacgagg aagccgagga ctacgacgac gacctgaccg 3840acagcgagat ggacgtggtg
cggttcgacg acgacaacag ccccagcttc atccagatca 3900gaagcgtggc caagaagcac
cccaagacct gggtgcacta tatcgccgcc gaggaagagg 3960actgggacta cgcccccctg
gtgctggccc ccgacgacag aagctacaag agccagtacc 4020tgaacaatgg cccccagcgg
atcggccgga agtacaagaa agtgcggttc atggcctaca 4080ccgacgagac attcaagacc
cgggaggcca tccagcacga gagcggcatc ctgggccccc 4140tgctgtacgg cgaagtgggc
gacacactgc tgatcatctt caagaaccag gctagccggc 4200cctacaacat ctacccccac
ggcatcaccg acgtgcggcc cctgtacagc aggcggctgc 4260ccaagggcgt gaagcacctg
aaggacttcc ccatcctgcc cggcgagatc ttcaagtaca 4320agtggaccgt gaccgtggag
gacggcccca ccaagagcga ccccagatgc ctgacccggt 4380actacagcag cttcgtgaac
atggaacggg acctggcctc cgggctgatc ggacctctgc 4440tgatctgcta caaagaaagc
gtggaccagc ggggcaacca gatcatgagc gacaagcgga 4500acgtgatcct gttcagcgtg
ttcgatgaga accggtcctg gtatctgacc gagaacatcc 4560agcggtttct gcccaaccct
gccggcgtgc agctggaaga tcccgagttc caggccagca 4620acatcatgca ctccatcaat
ggctacgtgt tcgactctct gcagctctcc gtgtgtctgc 4680acgaggtggc ctactggtac
atcctgagca tcggcgccca gaccgacttc ctgagcgtgt 4740tcttcagcgg ctacaccttc
aagcacaaga tggtgtacga ggacaccctg accctgttcc 4800ctttcagcgg cgagacagtg
ttcatgagca tggaaaaccc cggcctgtgg attctgggct 4860gccacaacag cgacttccgg
aaccggggca tgaccgccct gctgaaggtg tccagctgcg 4920acaagaacac cggcgactac
tacgaggaca gctacgagga tatcagcgcc tacctgctgt 4980ccaagaacaa cgccatcgaa
ccccggagct tcagccagaa cccccccgtg ctgacgcgtc 5040accagcggga gatcacccgg
acaaccctgc agtccgacca ggaagagatc gattacgacg 5100acaccatcag cgtggagatg
aagaaagagg atttcgatat ctacgacgag gacgagaacc 5160agagccccag aagcttccag
aagaaaaccc ggcactactt cattgccgcc gtggagaggc 5220tgtgggacta cggcatgagt
tctagccccc acgtgctgcg gaaccgggcc cagagcggca 5280gcgtgcccca gttcaagaaa
gtggtgttcc aggaattcac agacggcagc ttcacccagc 5340ctctgtatag aggcgagctg
aacgagcacc tggggctgct ggggccctac atcagggccg 5400aagtggagga caacatcatg
gtgaccttcc ggaatcaggc cagcagaccc tactccttct 5460acagcagcct gatcagctac
gaagaggacc agcggcaggg cgccgaaccc cggaagaact 5520tcgtgaagcc caacgaaacc
aagacctact tctggaaagt gcagcaccac atggccccca 5580ccaaggacga gttcgactgc
aaggcctggg cctacttcag cgacgtggat ctggaaaagg 5640acgtgcactc tggactgatt
ggcccactcc tggtctgcca cactaacacc ctcaaccccg 5700cccacggccg ccaggtgacc
gtgcaggaat tcgccctgtt cttcaccatc ttcgacgaga 5760caaagtcctg gtacttcacc
gagaatatgg aacggaactg cagagccccc tgcaacatcc 5820agatggaaga tcctaccttc
aaagagaact accggttcca cgccatcaac ggctacatca 5880tggacaccct gcctggcctg
gtgatggccc aggaccagag aatccggtgg tatctgctgt 5940ccatgggcag caacgagaat
atccacagca tccacttcag cggccacgtg ttcaccgtgc 6000ggaagaaaga agagtacaag
atggccctgt acaacctgta ccccggcgtg ttcgagacag 6060tggagatgct gcccagcaag
gccggcatct ggcgggtgga gtgtctgatc ggcgagcacc 6120tgcacgctgg catgagcacc
ctgtttctgg tgtacagcaa caagtgccag accccactgg 6180gcatggcctc tggccacatc
cgggacttcc agatcaccgc ctccggccag tacggccagt 6240gggcccccaa gctggccaga
ctgcactaca gcggcagcat caacgcctgg tccaccaaag 6300agcccttcag ctggatcaag
gtggacctgc tggcccctat gatcatccac ggcattaaga 6360cccagggcgc caggcagaag
ttcagcagcc tgtacatcag ccagttcatc atcatgtaca 6420gcctggacgg caagaagtgg
cagacctacc ggggcaacag caccggcacc ctgatggtgt 6480tcttcggcaa tgtggacagc
agcggcatca agcacaacat cttcaacccc cccatcattg 6540cccggtacat ccggctgcac
cccacccact acagcattag atccacactg agaatggaac 6600tgatgggctg cgacctgaac
tcctgcagca tgcctctggg catggaaagc aaggccatca 6660gcgacgccca gatcacagcc
agcagctact tcaccaacat gttcgccacc tggtccccct 6720ccaaggccag gctgcacctg
cagggccggt ccaacgcctg gcggcctcag gtcaacaacc 6780ccaaagaatg gctgcaggtg
gactttcaga aaaccatgaa ggtgaccggc gtgaccaccc 6840agggcgtgaa aagcctgctg
accagcatgt acgtgaaaga gtttctgatc agcagctctc 6900aggatggcca ccagtggacc
ctgttctttc agaacggcaa ggtgaaagtg ttccagggca 6960accaggactc cttcaccccc
gtggtgaact ccctggaccc ccccctgctg acccgctacc 7020tgagaatcca cccccagtct
tgggtgcacc agatcgccct caggatggaa gtcctgggat 7080gtgaggccca ggatctgtac
tgatgacgtc tggaacaatc aacctctgga ttacaaaatt 7140tgtgaaagat tgactggtat
tcttaactat gttgctcctt ttacgctatg tggatacgct 7200gctttaatgc ctttgtatca
tgctattgct tcccgtatgg ctttcatttt ctcctccttg 7260tataaatcct ggttgctgtc
tctttatgag gagttgtggc ccgttgtcag gcaacgtggc 7320gtggtgtgca ctgtgtttgc
tgacgcaacc cccactggtt ggggcattgc caccacctgt 7380cagctccttt ccgggacttt
cgctttcccc ctccctattg ccacggcgga actcatcgcc 7440gcctgccttg cccgctgctg
gacaggggct cggctgttgg gcactgacaa ttccgtggtg 7500ttgtcgggga agctgacgtc
ctttccatgg ctgctcgcct gtgttgccac ctggattctg 7560cgcgggacgt ccttctgcta
cgtcccttcg gccctcaatc cagcggacct tccttcccgc 7620ggcctgctgc cggctctgcg
gcctcttccg cgtcttcgcc ttcgccctca gacgagtcgg 7680atctcccttt gggccgcctc
cccgcctgga attaattctg cagtcgagac ctagaaaaac 7740atggagcaat cacaagtagc
aatacagcag ctaccaatgc tgattgtgcc tggctagaag 7800cacaagagga ggaggaggtg
ggttttccag tcacacctca ggtaccttta agaccaatga 7860cttacaaggc agctgtagat
cttagccact ttttaaaaga aaagagggga ctggaagggc 7920taattcactc ccaacgaaga
caagatatcc ttgatctgtg gatctaccac acacaaggct 7980acttccctga ttagcagaac
tacacaccag ggccaggggt cagatatcca ctgacctttg 8040gatggtgcta caagctagta
ccagttgagc cagataaggt agaagaggcc aataaaggag 8100agaacaccag cttgttacac
cctgtgagcc tgcatgggat ggatgacccg gagagagaag 8160tgttagagtg gaggtttgac
agccgcctag catttcatca cgtggcccga gagctgcatc 8220cggagtactt caagaactgc
tgatatcgag cttgctacaa gggactttcc gctggggact 8280ttccagggag gcgtggcctg
ggcgggactg gggagtggcg agccctcaga tcctgcatat 8340aagcagctgc tttttgcctg
tactgggtct ctctggttag accagatctg agcctgggag 8400ctctctggct aactagggaa
cccactgctt aagcctcaat aaagcttgcc ttgagtgctt 8460caagtagtgt gtgcccgtct
gttgtgtgac tctggtaact agagatccct cagacccttt 8520tagtcagtgt ggaaaatctc
tagcagtagt agttcatgtc atcttattat tcagtattta 8580taacttgcaa agaaatgaat
atcagagagt gagaggcctt gacattgcta gcgttttacc 8640gtcgacctct agctagagct
tggcgtaatc atggtcatag ctgtttcctg tgtgaaattg 8700ttatccgctc acaattccac
acaacatacg agccggaagc ataaagtgta aagcctgggg 8760tgcctaatga gtgagctaac
tcacattaat tgcgttgcgc tcactgcccg ctttccagtc 8820gggaaacctg tcgtgccagc
tgcattaatg aatcggccaa cgcgcgggga gaggcggttt 8880gcgtattggg cgctcttccg
cttcctcgct cactgactcg ctgcgctcgg tcgttcggct 8940gcggcgagcg gtatcagctc
actcaaaggc ggtaatacgg ttatccacag aatcagggga 9000taacgcagga aagaacatgt
gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc 9060cgcgttgctg gcgtttttcc
ataggctccg cccccctgac gagcatcaca aaaatcgacg 9120ctcaagtcag aggtggcgaa
acccgacagg actataaaga taccaggcgt ttccccctgg 9180aagctccctc gtgcgctctc
ctgttccgac cctgccgctt accggatacc tgtccgcctt 9240tctcccttcg ggaagcgtgg
cgctttctca tagctcacgc tgtaggtatc tcagttcggt 9300gtaggtcgtt cgctccaagc
tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg 9360cgccttatcc ggtaactatc
gtcttgagtc caacccggta agacacgact tatcgccact 9420ggcagcagcc actggtaaca
ggattagcag agcgaggtat gtaggcggtg ctacagagtt 9480cttgaagtgg tggcctaact
acggctacac tagaagaaca gtatttggta tctgcgctct 9540gctgaagcca gttaccttcg
gaaaaagagt tggtagctct tgatccggca aacaaaccac 9600cgctggtagc ggtggttttt
ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc 9660tcaagaagat cctttgatct
tttctacggg gtctgacgct cagtggaacg aaaactcacg 9720ttaagggatt ttggtcatga
gattatcaaa aaggatcttc acctagatcc ttttaaatta 9780aaaatgaagt tttaaatcaa
tctaaagtat atatgagtaa acttggtctg acagttacca 9840atgcttaatc agtgaggcac
ctatctcagc gatctgtcta tttcgttcat ccatagttgc 9900ctgactcccc gtcgtgtaga
taactacgat acgggagggc ttaccatctg gccccagtgc 9960tgcaatgata ccgcgagacc
cacgctcacc ggctccagat ttatcagcaa taaaccagcc 10020agccggaagg gccgagcgca
gaagtggtcc tgcaacttta tccgcctcca tccagtctat 10080taattgttgc cgggaagcta
gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt 10140tgccattgct acaggcatcg
tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc 10200cggttcccaa cgatcaaggc
gagttacatg atcccccatg ttgtgcaaaa aagcggttag 10260ctccttcggt cctccgatcg
ttgtcagaag taagttggcc gcagtgttat cactcatggt 10320tatggcagca ctgcataatt
ctcttactgt catgccatcc gtaagatgct tttctgtgac 10380tggtgagtac tcaaccaagt
cattctgaga atagtgtatg cggcgaccga gttgctcttg 10440cccggcgtca atacgggata
ataccgcgcc acatagcaga actttaaaag tgctcatcat 10500tggaaaacgt tcttcggggc
gaaaactctc aaggatctta ccgctgttga gatccagttc 10560gatgtaaccc actcgtgcac
ccaactgatc ttcagcatct tttactttca ccagcgtttc 10620tgggtgagca aaaacaggaa
ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa 10680atgttgaata ctcatactct
tcctttttca atattattga agcatttatc agggttattg 10740tctcatgagc ggatacatat
ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg 10800cacatttccc cgaaaagtgc
cacctgacgt cgacggatcg ggagatcaac ttgtttattg 10860cagcttataa tggttacaaa
taaagcaata gcatcacaaa tttcacaaat aaagcatttt 10920tttcactgca ttctagttgt
ggtttgtcca aactcatcaa tgtatcttat catgtctgga 10980tcaactggat aactcaagct
aaccaaaatc atcccaaact tcccacccca taccctatta 11040ccactgccaa ttacctgtgg
tttcatttac tctaaacctg tgattcctct gaattatttt 11100cattttaaag aaattgtatt
tgttaaatat gtactacaaa cttagtagtt tttaaagaaa 11160ttgtatttgt taaatatgta
ctacaaactt agtagt 11196
SEQ ID NO: 46 (length: 8303; type: DNA; organism: Artificial Sequence; other information: vector)
tggaagggct aattcactcc caaagaagac aagatatcct tgatctgtgg
atctaccaca 60cacaaggcta cttccctgat tagcagaact acacaccagg gccaggggtc
agatatccac 120tgacctttgg atggtgctac aagctagtac cagttgagcc agataaggta
gaagaggcca 180ataaaggaga gaacaccagc ttgttacacc ctgtgagcct gcatgggatg
gatgacccgg 240agagagaagt gttagagtgg aggtttgaca gccgcctagc atttcatcac
gtggcccgag 300agctgcatcc ggagtacttc aagaactgct gatatcgagc ttgctacaag
ggactttccg 360ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga
gccctcagat 420cctgcatata agcagctgct ttttgcctgt actgggtctc tctggttaga
ccagatctga 480gcctgggagc tctctggcta actagggaac ccactgctta agcctcaata
aagcttgcct 540tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact ctggtaacta
gagatccctc 600agaccctttt agtcagtgtg gaaaatctct agcagtggcg cccgaacagg
gacttgaaag 660cgaaagggaa accagaggag ctctctcgac gcaggactcg gcttgctgaa
gcgcgcacgg 720caagaggcga ggggcggcga ctggtgagta cgccaaaaat tttgactagc
ggaggctaga 780aggagagaga tgggtgcgag agcgtcagta ttaagcgggg gagaattaga
tcgcgatggg 840aaaaaattcg gttaaggcca gggggaaaga aaaaatataa attaaaacat
atagtatggg 900caagcaggga gctagaacga ttcgcagtta atcctggcct gttagaaaca
tcagaaggct 960gtagacaaat actgggacag ctacaaccat cccttcagac aggatcagaa
gaacttagat 1020cattatataa tacagtagca accctctatt gtgtgcatca aaggatagag
ataaaagaca 1080ccaaggaagc tttagacaag atagaggaag agcaaaacaa aagtaagacc
accgcacagc 1140aagcggccgg ccgctgatct tcagacctgg aggaggagat atgagggaca
attggagaag 1200tgaattatat aaatataaag tagtaaaaat tgaaccatta ggagtagcac
ccaccaaggc 1260aaagagaaga gtggtgcaga gagaaaaaag agcagtggga ataggagctt
tgttccttgg 1320gttcttggga gcagcaggaa gcactatggg cgcagcgtca atgacgctga
cggtacaggc 1380cagacaatta ttgtctggta tagtgcagca gcagaacaat ttgctgaggg
ctattgaggc 1440gcaacagcat ctgttgcaac tcacagtctg gggcatcaag cagctccagg
caagaatcct 1500ggctgtggaa agatacctaa aggatcaaca gctcctgggg atttggggtt
gctctggaaa 1560actcatttgc accactgctg tgccttggaa tgctagttgg agtaataaat
ctctggaaca 1620gatttggaat cacacgacct ggatggagtg ggacagagaa attaacaatt
acacaagctt 1680aatacactcc ttaattgaag aatcgcaaaa ccagcaagaa aagaatgaac
aagaattatt 1740ggaattagat aaatgggcaa gtttgtggaa ttggtttaac ataacaaatt
ggctgtggta 1800tataaaatta ttcataatga tagtaggagg cttggtaggt ttaagaatag
tttttgctgt 1860actttctata gtgaatagag ttaggcaggg atattcacca ttatcgtttc
agacccacct 1920cccaaccccg aggggggacc cgacaggccc gaaggaatag aagaagaagg
tggagagaga 1980gacagagaca gatccattcg attagtgaac ggatctcgac ggtcgccaaa
tggcagtatt 2040catccacaat tttaaaagaa aaggggggat tggggggtac agtgcagggg
aaagaatagt 2100agacataata gcaacagaca tacaaactaa agaattacaa aaacaaatta
caaaaattca 2160aaattttcgg gtttattaca gggacagcag agatccagtt tggatcgata
agcttgatat 2220cgaattcctg cgcggccgct tcgaacgcgc gcgatgcatc atatgcgtac
gcggtctaga 2280attcctgcag ggcccactag tctcccaggc atgactccaa caatgcatcc
catgggattt 2340ggggttcccc agatctgggg cttgtaggcc tgactctccc ctgtgcacac
gtctcataca 2400cgcatgcgtg cacccattgc ctgccccgcc ccttgcacag ggagtcagca
gggaggactg 2460ggttatgccc tgcttatcag cagcttccca gcttcctctg cctggattct
tagaggcctg 2520gggtcctaga acgagctggt gcacgtggct tcccaaagat ctctcagata
atgagaggaa 2580atgcagtcat cagtttgcag aaggctaggg attctgggcc atagctcaga
cctgcgccca 2640ccatctccct ccaggcagcc cttggctggt ccctgcgagc ccgtggagac
tgccagtcag 2700cgctgctgga tctcgggctc gaggccacca tggaagatgc caaaaacatt
aagaagggcc 2760cagcgccatt ctacccactc gaagacggga ccgccggcga gcagctgcac
aaagccatga 2820agcgctacgc cctggtgccc ggcaccatcg cctttaccga cgcacatatc
gaggtggaca 2880ttacctacgc cgagtacttc gagatgagcg ttcggctggc agaagctatg
aagcgctatg 2940ggctgaatac aaaccatcgg atcgtggtgt gcagcgagaa tagcttgcag
ttcttcatgc 3000ccgtgttggg tgccctgttc atcggtgtgg ctgtggcccc agctaacgac
atctacaacg 3060agcgcgagct gctgaacagc atgggcatca gccagcccac cgtcgtattc
gtgagcaaga 3120aagggctgca aaagatcctc aacgtgcaaa agaagctacc gatcatacaa
aagatcatca 3180tcatggatag caagaccgac taccagggct tccaaagcat gtacaccttc
gtgacttccc 3240atttgccacc cggcttcaac gagtacgact tcgtgcccga gagcttcgac
cgggacaaaa 3300ccatcgccct gatcatgaac agtagtggca gtaccggatt gcccaagggc
gtagccctac 3360cgcaccgcac cgcttgtgtc cgattcagtc atgcccgcga ccccatcttc
ggcaaccaga 3420tcatccccga caccgctatc ctcagcgtgg tgccatttca ccacggcttc
ggcatgttca 3480ccacgctggg ctacttgatc tgcggctttc gggtcgtgct catgtaccgc
ttcgaggagg 3540agctattctt gcgcagcttg caagactata agattcaatc tgccctgctg
gtgcccacac 3600tatttagctt cttcgctaag agcactctca tcgacaagta cgacctaagc
aacttgcacg 3660agatcgccag cggcggggcg ccgctcagca aggaggtagg tgaggccgtg
gccaaacgct 3720tccacctacc aggcatccgc cagggctacg gcctgacaga aacaaccagc
gccattctga 3780tcacccccga aggggacgac aagcctggcg cagtaggcaa ggtggtgccc
ttcttcgagg 3840ctaaggtggt ggacttggac accggtaaga cactgggtgt gaaccagcgc
ggcgagctgt 3900gcgtccgtgg ccccatgatc atgagcggct acgttaacaa ccccgaggct
acaaacgctc 3960tcatcgacaa ggacggctgg ctgcacagcg gcgacatcgc ctactgggac
gaggacgagc 4020acttcttcat cgtggaccgg ctgaagagcc tgatcaaata caagggctac
caggtagccc 4080cagccgaact ggagagcatc ctgctgcaac accccaacat cttcgacgcc
ggggtcgccg 4140gcctgcccga cgacgatgcc ggcgagctgc ccgccgcagt cgtcgtgctg
gaacacggta 4200aaaccatgac cgagaaggag atcgtggact atgtggccag ccaggttaca
accgccaaga 4260agctgcgcgg tggtgttgtg ttcgtggacg aggtgcctaa aggactgacc
ggcaagttgg 4320acgcccgcaa gatccgcgag attctcatta aggccaagaa gggcggcaag
atcgccgtgt 4380aaatgcagcc tagggtatac gatatcaagc ttatcgtcga caatcaacct
ctggattaca 4440aaatttgtga aagattgact ggtattctta actatgttgc tccttttacg
ctatgtggat 4500acgctgcttt aatgcctttg tatcatgcta ttgcttcccg tatggctttc
attttctcct 4560ccttgtataa atcctggttg ctgtctcttt atgaggagtt gtggcccgtt
gtcaggcaac 4620gtggcgtggt gtgcactgtg tttgctgacg caacccccac tggttggggc
attgccacca 4680cctgtcagct cctttccggg actttcgctt tccccctccc tattgccacg
gcggaactca 4740tcgccgcctg ccttgcccgc tgctggacag gggctcggct gttgggcact
gacaattccg 4800tggtgttgtc ggggaagctg acgtcctttc catggctgct cgcctgtgtt
gccacctgga 4860ttctgcgcgg gacgtccttc tgctacgtcc cttcggccct caatccagcg
gaccttcctt 4920cccgcggcct gctgccggct caagataggt acctttaaga ccaatgactt
acaaggcagc 4980tgtagatctt agccactttt taaaagaaaa gaggggactg gaagggctaa
ttcactccca 5040acgaagacaa gatatccttg atctgtggat ctaccacaca caaggctact
tccctgatta 5100gcagaactac acaccagggc caggggtcag atatccactg acctttggat
ggtgctacaa 5160gctagtacca gttgagccag ataaggtaga agaggccaat aaaggagaga
acaccagctt 5220gttacaccct gtgagcctgc atgggatgga tgacccggag agagaagtgt
tagagtggag 5280gtttgacagc cgcctagcat ttcatcacgt ggcccgagag ctgcatccgg
agtacttcaa 5340gaactgctga tatcgagctt gctacaaggg actttccgct ggggactttc
cagggaggcg 5400tggcctgggc gggactgggg agtggcgagc cctcagatcc tgcatataag
cagctgcttt 5460ttgcctgtac tgggtctctc tggttagacc agatctgagc ctgggagctc
tctggctaac 5520tagggaaccc actgcttaag cctcaataaa gcttgccttg agtgcttcaa
gtagtgtgtg 5580cccgtctgtt gtgtgactct ggtaactaga gatccctcag acccttttag
tcagtgtgga 5640aaatctctag cagtagtagt tcatgtcatc ttattattca gtatttataa
cttgcaaaga 5700aatgaatatc agagagtgag aggccttgac attgctagcg ttttaccgtc
gacctctagc 5760tagagcttgg cgtaatcatg gtcatagctg tttcctgtgt gaaattgtta
tccgctcaca 5820attccacaca acatacgagc cggaagcata aagtgtaaag cctggggtgc
ctaatgagtg 5880agctaactca cattaattgc gttgcgctca ctgcccgctt tccagtcggg
aaacctgtcg 5940tgccagctgc attaatgaat cggccaacgc gcggggagag gcggtttgcg
tattgggcgc 6000tcttccgctt cctcgctcac tgactcgctg cgctcggtcg ttcggctgcg
gcgagcggta 6060tcagctcact caaaggcggt aatacggtta tccacagaat caggggataa
cgcaggaaag 6120aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc
gttgctggcg 6180tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc
aagtcagagg 6240tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag
ctccctcgtg 6300cgctctcctg ttccgaccct gccgcttacc ggatacctgt ccgcctttct
cccttcggga 6360agcgtggcgc tttctcatag ctcacgctgt aggtatctca gttcggtgta
ggtcgttcgc 6420tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc
cttatccggt 6480aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc
agcagccact 6540ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt
gaagtggtgg 6600cctaactacg gctacactag aagaacagta tttggtatct gcgctctgct
gaagccagtt 6660accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc
tggtagcggt 6720ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca
agaagatcct 6780ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta
agggattttg 6840gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa
atgaagtttt 6900aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg
cttaatcagt 6960gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg
actccccgtc 7020gtgtagataa ctacgatacg ggagggctta ccatctggcc ccagtgctgc
aatgataccg 7080cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc
cggaagggcc 7140gagcgcagaa gtggtcctgc aactttatcc gcctccatcc agtctattaa
ttgttgccgg 7200gaagctagag taagtagttc gccagttaat agtttgcgca acgttgttgc
cattgctaca 7260ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat tcagctccgg
ttcccaacga 7320tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag cggttagctc
cttcggtcct 7380ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat
ggcagcactg 7440cataattctc ttactgtcat gccatccgta agatgctttt ctgtgactgg
tgagtactca 7500accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc
ggcgtcaata 7560cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg
aaaacgttct 7620tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat
gtaacccact 7680cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg
gtgagcaaaa 7740acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg
ttgaatactc 7800atactcttcc tttttcaata ttattgaagc atttatcagg gttattgtct
catgagcgga 7860tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac
atttccccga 7920aaagtgccac ctgacgtcga cggatcggga gatcaacttg tttattgcag
cttataatgg 7980ttacaaataa agcaatagca tcacaaattt cacaaataaa gcattttttt
cactgcattc 8040tagttgtggt ttgtccaaac tcatcaatgt atcttatcat gtctggatca
actggataac 8100tcaagctaac caaaatcatc ccaaacttcc caccccatac cctattacca
ctgccaatta 8160cctgtggttt catttactct aaacctgtga ttcctctgaa ttattttcat
tttaaagaaa 8220ttgtatttgt taaatatgta ctacaaactt agtagttttt aaagaaattg
tatttgttaa 8280atatgtacta caaacttagt agt 8303
SEQ ID NO: 47 (length: 8379; type: DNA; organism: Artificial Sequence; other information: vector)
tggaagggct aattcactcc
caaagaagac aagatatcct tgatctgtgg atctaccaca 60cacaaggcta cttccctgat
tagcagaact acacaccagg gccaggggtc agatatccac 120tgacctttgg atggtgctac
aagctagtac cagttgagcc agataaggta gaagaggcca 180ataaaggaga gaacaccagc
ttgttacacc ctgtgagcct gcatgggatg gatgacccgg 240agagagaagt gttagagtgg
aggtttgaca gccgcctagc atttcatcac gtggcccgag 300agctgcatcc ggagtacttc
aagaactgct gatatcgagc ttgctacaag ggactttccg 360ctggggactt tccagggagg
cgtggcctgg gcgggactgg ggagtggcga gccctcagat 420cctgcatata agcagctgct
ttttgcctgt actgggtctc tctggttaga ccagatctga 480gcctgggagc tctctggcta
actagggaac ccactgctta agcctcaata aagcttgcct 540tgagtgcttc aagtagtgtg
tgcccgtctg ttgtgtgact ctggtaacta gagatccctc 600agaccctttt agtcagtgtg
gaaaatctct agcagtggcg cccgaacagg gacttgaaag 660cgaaagggaa accagaggag
ctctctcgac gcaggactcg gcttgctgaa gcgcgcacgg 720caagaggcga ggggcggcga
ctggtgagta cgccaaaaat tttgactagc ggaggctaga 780aggagagaga tgggtgcgag
agcgtcagta ttaagcgggg gagaattaga tcgcgatggg 840aaaaaattcg gttaaggcca
gggggaaaga aaaaatataa attaaaacat atagtatggg 900caagcaggga gctagaacga
ttcgcagtta atcctggcct gttagaaaca tcagaaggct 960gtagacaaat actgggacag
ctacaaccat cccttcagac aggatcagaa gaacttagat 1020cattatataa tacagtagca
accctctatt gtgtgcatca aaggatagag ataaaagaca 1080ccaaggaagc tttagacaag
atagaggaag agcaaaacaa aagtaagacc accgcacagc 1140aagcggccgg ccgctgatct
tcagacctgg aggaggagat atgagggaca attggagaag 1200tgaattatat aaatataaag
tagtaaaaat tgaaccatta ggagtagcac ccaccaaggc 1260aaagagaaga gtggtgcaga
gagaaaaaag agcagtggga ataggagctt tgttccttgg 1320gttcttggga gcagcaggaa
gcactatggg cgcagcgtca atgacgctga cggtacaggc 1380cagacaatta ttgtctggta
tagtgcagca gcagaacaat ttgctgaggg ctattgaggc 1440gcaacagcat ctgttgcaac
tcacagtctg gggcatcaag cagctccagg caagaatcct 1500ggctgtggaa agatacctaa
aggatcaaca gctcctgggg atttggggtt gctctggaaa 1560actcatttgc accactgctg
tgccttggaa tgctagttgg agtaataaat ctctggaaca 1620gatttggaat cacacgacct
ggatggagtg ggacagagaa attaacaatt acacaagctt 1680aatacactcc ttaattgaag
aatcgcaaaa ccagcaagaa aagaatgaac aagaattatt 1740ggaattagat aaatgggcaa
gtttgtggaa ttggtttaac ataacaaatt ggctgtggta 1800tataaaatta ttcataatga
tagtaggagg cttggtaggt ttaagaatag tttttgctgt 1860actttctata gtgaatagag
ttaggcaggg atattcacca ttatcgtttc agacccacct 1920cccaaccccg aggggggacc
cgacaggccc gaaggaatag aagaagaagg tggagagaga 1980gacagagaca gatccattcg
attagtgaac ggatctcgac ggtcgccaaa tggcagtatt 2040catccacaat tttaaaagaa
aaggggggat tggggggtac agtgcagggg aaagaatagt 2100agacataata gcaacagaca
tacaaactaa agaattacaa aaacaaatta caaaaattca 2160aaattttcgg gtttattaca
gggacagcag agatccagtt tggatcgata agcttgatat 2220cgaattcctg cgcggccgct
tcgaacgcgc gcgatgcatc atatggccct cacaaaggaa 2280caataacagg aaaccatccc
agggggaagt gggccagggc cagctggaaa acctgaaggg 2340gcgtacgcgg tctagaattc
ctgcagggcc cactagtctc ccaggcatga ctccaacaat 2400gcatcccatg ggatttgggg
ttccccagat ctggggcttg taggcctgac tctcccctgt 2460gcacacgtct catacacgca
tgcgtgcacc cattgcctgc cccgcccctt gcacagggag 2520tcagcaggga ggactgggtt
atgccctgct tatcagcagc ttcccagctt cctctgcctg 2580gattcttaga ggcctggggt
cctagaacga gctggtgcac gtggcttccc aaagatctct 2640cagataatga gaggaaatgc
agtcatcagt ttgcagaagg ctagggattc tgggccatag 2700ctcagacctg cgcccaccat
ctccctccag gcagcccttg gctggtccct gcgagcccgt 2760ggagactgcc agtcagcgct
gctggatctc gggctcgagg ccaccatgga agatgccaaa 2820aacattaaga agggcccagc
gccattctac ccactcgaag acgggaccgc cggcgagcag 2880ctgcacaaag ccatgaagcg
ctacgccctg gtgcccggca ccatcgcctt taccgacgca 2940catatcgagg tggacattac
ctacgccgag tacttcgaga tgagcgttcg gctggcagaa 3000gctatgaagc gctatgggct
gaatacaaac catcggatcg tggtgtgcag cgagaatagc 3060ttgcagttct tcatgcccgt
gttgggtgcc ctgttcatcg gtgtggctgt ggccccagct 3120aacgacatct acaacgagcg
cgagctgctg aacagcatgg gcatcagcca gcccaccgtc 3180gtattcgtga gcaagaaagg
gctgcaaaag atcctcaacg tgcaaaagaa gctaccgatc 3240atacaaaaga tcatcatcat
ggatagcaag accgactacc agggcttcca aagcatgtac 3300accttcgtga cttcccattt
gccacccggc ttcaacgagt acgacttcgt gcccgagagc 3360ttcgaccggg acaaaaccat
cgccctgatc atgaacagta gtggcagtac cggattgccc 3420aagggcgtag ccctaccgca
ccgcaccgct tgtgtccgat tcagtcatgc ccgcgacccc 3480atcttcggca accagatcat
ccccgacacc gctatcctca gcgtggtgcc atttcaccac 3540ggcttcggca tgttcaccac
gctgggctac ttgatctgcg gctttcgggt cgtgctcatg 3600taccgcttcg aggaggagct
attcttgcgc agcttgcaag actataagat tcaatctgcc 3660ctgctggtgc ccacactatt
tagcttcttc gctaagagca ctctcatcga caagtacgac 3720ctaagcaact tgcacgagat
cgccagcggc ggggcgccgc tcagcaagga ggtaggtgag 3780gccgtggcca aacgcttcca
cctaccaggc atccgccagg gctacggcct gacagaaaca 3840accagcgcca ttctgatcac
ccccgaaggg gacgacaagc ctggcgcagt aggcaaggtg 3900gtgcccttct tcgaggctaa
ggtggtggac ttggacaccg gtaagacact gggtgtgaac 3960cagcgcggcg agctgtgcgt
ccgtggcccc atgatcatga gcggctacgt taacaacccc 4020gaggctacaa acgctctcat
cgacaaggac ggctggctgc acagcggcga catcgcctac 4080tgggacgagg acgagcactt
cttcatcgtg gaccggctga agagcctgat caaatacaag 4140ggctaccagg tagccccagc
cgaactggag agcatcctgc tgcaacaccc caacatcttc 4200gacgccgggg tcgccggcct
gcccgacgac gatgccggcg agctgcccgc cgcagtcgtc 4260gtgctggaac acggtaaaac
catgaccgag aaggagatcg tggactatgt ggccagccag 4320gttacaaccg ccaagaagct
gcgcggtggt gttgtgttcg tggacgaggt gcctaaagga 4380ctgaccggca agttggacgc
ccgcaagatc cgcgagattc tcattaaggc caagaagggc 4440ggcaagatcg ccgtgtaaat
gcagcctagg gtatacgata tcaagcttat cgtcgacaat 4500caacctctgg attacaaaat
ttgtgaaaga ttgactggta ttcttaacta tgttgctcct 4560tttacgctat gtggatacgc
tgctttaatg cctttgtatc atgctattgc ttcccgtatg 4620gctttcattt tctcctcctt
gtataaatcc tggttgctgt ctctttatga ggagttgtgg 4680cccgttgtca ggcaacgtgg
cgtggtgtgc actgtgtttg ctgacgcaac ccccactggt 4740tggggcattg ccaccacctg
tcagctcctt tccgggactt tcgctttccc cctccctatt 4800gccacggcgg aactcatcgc
cgcctgcctt gcccgctgct ggacaggggc tcggctgttg 4860ggcactgaca attccgtggt
gttgtcgggg aagctgacgt cctttccatg gctgctcgcc 4920tgtgttgcca cctggattct
gcgcgggacg tccttctgct acgtcccttc ggccctcaat 4980ccagcggacc ttccttcccg
cggcctgctg ccggctcaag ataggtacct ttaagaccaa 5040tgacttacaa ggcagctgta
gatcttagcc actttttaaa agaaaagagg ggactggaag 5100ggctaattca ctcccaacga
agacaagata tccttgatct gtggatctac cacacacaag 5160gctacttccc tgattagcag
aactacacac cagggccagg ggtcagatat ccactgacct 5220ttggatggtg ctacaagcta
gtaccagttg agccagataa ggtagaagag gccaataaag 5280gagagaacac cagcttgtta
caccctgtga gcctgcatgg gatggatgac ccggagagag 5340aagtgttaga gtggaggttt
gacagccgcc tagcatttca tcacgtggcc cgagagctgc 5400atccggagta cttcaagaac
tgctgatatc gagcttgcta caagggactt tccgctgggg 5460actttccagg gaggcgtggc
ctgggcggga ctggggagtg gcgagccctc agatcctgca 5520tataagcagc tgctttttgc
ctgtactggg tctctctggt tagaccagat ctgagcctgg 5580gagctctctg gctaactagg
gaacccactg cttaagcctc aataaagctt gccttgagtg 5640cttcaagtag tgtgtgcccg
tctgttgtgt gactctggta actagagatc cctcagaccc 5700ttttagtcag tgtggaaaat
ctctagcagt agtagttcat gtcatcttat tattcagtat 5760ttataacttg caaagaaatg
aatatcagag agtgagaggc cttgacattg ctagcgtttt 5820accgtcgacc tctagctaga
gcttggcgta atcatggtca tagctgtttc ctgtgtgaaa 5880ttgttatccg ctcacaattc
cacacaacat acgagccgga agcataaagt gtaaagcctg 5940gggtgcctaa tgagtgagct
aactcacatt aattgcgttg cgctcactgc ccgctttcca 6000gtcgggaaac ctgtcgtgcc
agctgcatta atgaatcggc caacgcgcgg ggagaggcgg 6060tttgcgtatt gggcgctctt
ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 6120gctgcggcga gcggtatcag
ctcactcaaa ggcggtaata cggttatcca cagaatcagg 6180ggataacgca ggaaagaaca
tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 6240ggccgcgttg ctggcgtttt
tccataggct ccgcccccct gacgagcatc acaaaaatcg 6300acgctcaagt cagaggtggc
gaaacccgac aggactataa agataccagg cgtttccccc 6360tggaagctcc ctcgtgcgct
ctcctgttcc gaccctgccg cttaccggat acctgtccgc 6420ctttctccct tcgggaagcg
tggcgctttc tcatagctca cgctgtaggt atctcagttc 6480ggtgtaggtc gttcgctcca
agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 6540ctgcgcctta tccggtaact
atcgtcttga gtccaacccg gtaagacacg acttatcgcc 6600actggcagca gccactggta
acaggattag cagagcgagg tatgtaggcg gtgctacaga 6660gttcttgaag tggtggccta
actacggcta cactagaaga acagtatttg gtatctgcgc 6720tctgctgaag ccagttacct
tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 6780caccgctggt agcggtggtt
tttttgtttg caagcagcag attacgcgca gaaaaaaagg 6840atctcaagaa gatcctttga
tcttttctac ggggtctgac gctcagtgga acgaaaactc 6900acgttaaggg attttggtca
tgagattatc aaaaaggatc ttcacctaga tccttttaaa 6960ttaaaaatga agttttaaat
caatctaaag tatatatgag taaacttggt ctgacagtta 7020ccaatgctta atcagtgagg
cacctatctc agcgatctgt ctatttcgtt catccatagt 7080tgcctgactc cccgtcgtgt
agataactac gatacgggag ggcttaccat ctggccccag 7140tgctgcaatg ataccgcgag
acccacgctc accggctcca gatttatcag caataaacca 7200gccagccgga agggccgagc
gcagaagtgg tcctgcaact ttatccgcct ccatccagtc 7260tattaattgt tgccgggaag
ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt 7320tgttgccatt gctacaggca
tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag 7380ctccggttcc caacgatcaa
ggcgagttac atgatccccc atgttgtgca aaaaagcggt 7440tagctccttc ggtcctccga
tcgttgtcag aagtaagttg gccgcagtgt tatcactcat 7500ggttatggca gcactgcata
attctcttac tgtcatgcca tccgtaagat gcttttctgt 7560gactggtgag tactcaacca
agtcattctg agaatagtgt atgcggcgac cgagttgctc 7620ttgcccggcg tcaatacggg
ataataccgc gccacatagc agaactttaa aagtgctcat 7680cattggaaaa cgttcttcgg
ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag 7740ttcgatgtaa cccactcgtg
cacccaactg atcttcagca tcttttactt tcaccagcgt 7800ttctgggtga gcaaaaacag
gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg 7860gaaatgttga atactcatac
tcttcctttt tcaatattat tgaagcattt atcagggtta 7920ttgtctcatg agcggataca
tatttgaatg tatttagaaa aataaacaaa taggggttcc 7980gcgcacattt ccccgaaaag
tgccacctga cgtcgacgga tcgggagatc aacttgttta 8040ttgcagctta taatggttac
aaataaagca atagcatcac aaatttcaca aataaagcat 8100ttttttcact gcattctagt
tgtggtttgt ccaaactcat caatgtatct tatcatgtct 8160ggatcaactg gataactcaa
gctaaccaaa atcatcccaa acttcccacc ccatacccta 8220ttaccactgc caattacctg
tggtttcatt tactctaaac ctgtgattcc tctgaattat 8280tttcatttta aagaaattgt
atttgttaaa tatgtactac aaacttagta gtttttaaag 8340aaattgtatt tgttaaatat
gtactacaaa cttagtagt 8379
SEQ ID NO: 48 (length 11272, DNA, Artificial Sequence, vector)
tggaagggct aattcactcc caaagaagac aagatatcct tgatctgtgg
atctaccaca 60cacaaggcta cttccctgat tagcagaact acacaccagg gccaggggtc
agatatccac 120tgacctttgg atggtgctac aagctagtac cagttgagcc agataaggta
gaagaggcca 180ataaaggaga gaacaccagc ttgttacacc ctgtgagcct gcatgggatg
gatgacccgg 240agagagaagt gttagagtgg aggtttgaca gccgcctagc atttcatcac
gtggcccgag 300agctgcatcc ggagtacttc aagaactgct gatatcgagc ttgctacaag
ggactttccg 360ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga
gccctcagat 420cctgcatata agcagctgct ttttgcctgt actgggtctc tctggttaga
ccagatctga 480gcctgggagc tctctggcta actagggaac ccactgctta agcctcaata
aagcttgcct 540tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact ctggtaacta
gagatccctc 600agaccctttt agtcagtgtg gaaaatctct agcagtggcg cccgaacagg
gacttgaaag 660cgaaagggaa accagaggag ctctctcgac gcaggactcg gcttgctgaa
gcgcgcacgg 720caagaggcga ggggcggcga ctggtgagta cgccaaaaat tttgactagc
ggaggctaga 780aggagagaga tgggtgcgag agcgtcagta ttaagcgggg gagaattaga
tcgcgatggg 840aaaaaattcg gttaaggcca gggggaaaga aaaaatataa attaaaacat
atagtatggg 900caagcaggga gctagaacga ttcgcagtta atcctggcct gttagaaaca
tcagaaggct 960gtagacaaat actgggacag ctacaaccat cccttcagac aggatcagaa
gaacttagat 1020cattatataa tacagtagca accctctatt gtgtgcatca aaggatagag
ataaaagaca 1080ccaaggaagc tttagacaag atagaggaag agcaaaacaa aagtaagacc
accgcacagc 1140aagcggccgg ccgctgatct tcagacctgg aggaggagat atgagggaca
attggagaag 1200tgaattatat aaatataaag tagtaaaaat tgaaccatta ggagtagcac
ccaccaaggc 1260aaagagaaga gtggtgcaga gagaaaaaag agcagtggga ataggagctt
tgttccttgg 1320gttcttggga gcagcaggaa gcactatggg cgcagcgtca atgacgctga
cggtacaggc 1380cagacaatta ttgtctggta tagtgcagca gcagaacaat ttgctgaggg
ctattgaggc 1440gcaacagcat ctgttgcaac tcacagtctg gggcatcaag cagctccagg
caagaatcct 1500ggctgtggaa agatacctaa aggatcaaca gctcctgggg atttggggtt
gctctggaaa 1560actcatttgc accactgctg tgccttggaa tgctagttgg agtaataaat
ctctggaaca 1620gatttggaat cacacgacct ggatggagtg ggacagagaa attaacaatt
acacaagctt 1680aatacactcc ttaattgaag aatcgcaaaa ccagcaagaa aagaatgaac
aagaattatt 1740ggaattagat aaatgggcaa gtttgtggaa ttggtttaac ataacaaatt
ggctgtggta 1800tataaaatta ttcataatga tagtaggagg cttggtaggt ttaagaatag
tttttgctgt 1860actttctata gtgaatagag ttaggcaggg atattcacca ttatcgtttc
agacccacct 1920cccaaccccg aggggggacc cgacaggccc gaaggaatag aagaagaagg
tggagagaga 1980gacagagaca gatccattcg attagtgaac ggatctcgac ggtcgccaaa
tggcagtatt 2040catccacaat tttaaaagaa aaggggggat tggggggtac agtgcagggg
aaagaatagt 2100agacataata gcaacagaca tacaaactaa agaattacaa aaacaaatta
caaaaattca 2160aaattttcgg gtttattaca gggacagcag agatccagtt tggatcgata
agcttgatat 2220cgaattcctg cgcggccgct tcgaacgcgc gcgatgcatc atatggccct
cacaaaggaa 2280caataacagg aaaccatccc agggggaagt gggccagggc cagctggaaa
acctgaaggg 2340gcgtacgcgg tctagaattc ctgcagggcc cactagtctc ccaggcatga
ctccaacaat 2400gcatcccatg ggatttgggg ttccccagat ctggggcttg taggcctgac
tctcccctgt 2460gcacacgtct catacacgca tgcgtgcacc cattgcctgc cccgcccctt
gcacagggag 2520tcagcaggga ggactgggtt atgccctgct tatcagcagc ttcccagctt
cctctgcctg 2580gattcttaga ggcctggggt cctagaacga gctggtgcac gtggcttccc
aaagatctct 2640cagataatga gaggaaatgc agtcatcagt ttgcagaagg ctagggattc
tgggccatag 2700ctcagacctg cgcccaccat ctccctccag gcagcccttg gctggtccct
gcgagcccgt 2760ggagactgcc agtcagcgct gctggatctc gggctcgagg ccaccatgca
gatcgagctg 2820tccacctgct tttttctgtg cctgctgcgg ttctgcttca gcgccacccg
gcggtactac 2880ctgggcgccg tggagctgtc ctgggactac atgcagagcg acctgggcga
gctgcccgtg 2940gacgcccggt tcccccccag agtgcccaag agcttcccct tcaacaccag
cgtggtgtac 3000aagaaaaccc tgttcgtgga gttcaccgac cacctgttca atatcgccaa
gcccaggccc 3060ccctggatgg gcctgctggg ccccaccatc caggccgagg tgtacgacac
cgtggtgatc 3120accctgaaga acatggccag ccaccccgtg agcctgcacg ccgtgggcgt
gagctactgg 3180aaggccagcg agggcgccga gtacgacgac cagaccagcc agcgggagaa
agaagatgac 3240aaggtgttcc ctggcggcag ccacacctac gtgtggcagg tgctgaaaga
aaacggcccc 3300atggcctccg accccctgtg cctgacctac agctacctga gccacgtgga
cctggtgaag 3360gacctgaaca gcggcctgat cggcgctctg ctcgtctgcc gggagggcag
cctggccaaa 3420gagaaaaccc agaccctgca caagttcatc ctgctgttcg ccgtgttcga
cgagggcaag 3480agctggcaca gcgagacaaa gaacagcctg atgcaggacc gggacgccgc
ctctgccaga 3540gcctggccca agatgcacac cgtgaacggc tacgtgaaca gaagcctgcc
cggcctgatt 3600ggctgccacc ggaagagcgt gtactggcac gtgatcggca tgggcaccac
acccgaggtg 3660cacagcatct ttctggaagg gcacaccttt ctggtccgga accaccggca
ggccagcctg 3720gaaatcagcc ctatcacctt cctgaccgcc cagacactgc tgatggacct
gggccagttc 3780ctgctgtttt gccacatcag ctctcaccag cacgacggca tggaagccta
cgtgaaggtg 3840gactcttgcc ccgaggaacc ccagctgcgg atgaagaaca acgaggaagc
cgaggactac 3900gacgacgacc tgaccgacag cgagatggac gtggtgcggt tcgacgacga
caacagcccc 3960agcttcatcc agatcagaag cgtggccaag aagcacccca agacctgggt
gcactatatc 4020gccgccgagg aagaggactg ggactacgcc cccctggtgc tggcccccga
cgacagaagc 4080tacaagagcc agtacctgaa caatggcccc cagcggatcg gccggaagta
caagaaagtg 4140cggttcatgg cctacaccga cgagacattc aagacccggg aggccatcca
gcacgagagc 4200ggcatcctgg gccccctgct gtacggcgaa gtgggcgaca cactgctgat
catcttcaag 4260aaccaggcta gccggcccta caacatctac ccccacggca tcaccgacgt
gcggcccctg 4320tacagcaggc ggctgcccaa gggcgtgaag cacctgaagg acttccccat
cctgcccggc 4380gagatcttca agtacaagtg gaccgtgacc gtggaggacg gccccaccaa
gagcgacccc 4440agatgcctga cccggtacta cagcagcttc gtgaacatgg aacgggacct
ggcctccggg 4500ctgatcggac ctctgctgat ctgctacaaa gaaagcgtgg accagcgggg
caaccagatc 4560atgagcgaca agcggaacgt gatcctgttc agcgtgttcg atgagaaccg
gtcctggtat 4620ctgaccgaga acatccagcg gtttctgccc aaccctgccg gcgtgcagct
ggaagatccc 4680gagttccagg ccagcaacat catgcactcc atcaatggct acgtgttcga
ctctctgcag 4740ctctccgtgt gtctgcacga ggtggcctac tggtacatcc tgagcatcgg
cgcccagacc 4800gacttcctga gcgtgttctt cagcggctac accttcaagc acaagatggt
gtacgaggac 4860accctgaccc tgttcccttt cagcggcgag acagtgttca tgagcatgga
aaaccccggc 4920ctgtggattc tgggctgcca caacagcgac ttccggaacc ggggcatgac
cgccctgctg 4980aaggtgtcca gctgcgacaa gaacaccggc gactactacg aggacagcta
cgaggatatc 5040agcgcctacc tgctgtccaa gaacaacgcc atcgaacccc ggagcttcag
ccagaacccc 5100cccgtgctga cgcgtcacca gcgggagatc acccggacaa ccctgcagtc
cgaccaggaa 5160gagatcgatt acgacgacac catcagcgtg gagatgaaga aagaggattt
cgatatctac 5220gacgaggacg agaaccagag ccccagaagc ttccagaaga aaacccggca
ctacttcatt 5280gccgccgtgg agaggctgtg ggactacggc atgagttcta gcccccacgt
gctgcggaac 5340cgggcccaga gcggcagcgt gccccagttc aagaaagtgg tgttccagga
attcacagac 5400ggcagcttca cccagcctct gtatagaggc gagctgaacg agcacctggg
gctgctgggg 5460ccctacatca gggccgaagt ggaggacaac atcatggtga ccttccggaa
tcaggccagc 5520agaccctact ccttctacag cagcctgatc agctacgaag aggaccagcg
gcagggcgcc 5580gaaccccgga agaacttcgt gaagcccaac gaaaccaaga cctacttctg
gaaagtgcag 5640caccacatgg cccccaccaa ggacgagttc gactgcaagg cctgggccta
cttcagcgac 5700gtggatctgg aaaaggacgt gcactctgga ctgattggcc cactcctggt
ctgccacact 5760aacaccctca accccgccca cggccgccag gtgaccgtgc aggaattcgc
cctgttcttc 5820accatcttcg acgagacaaa gtcctggtac ttcaccgaga atatggaacg
gaactgcaga 5880gccccctgca acatccagat ggaagatcct accttcaaag agaactaccg
gttccacgcc 5940atcaacggct acatcatgga caccctgcct ggcctggtga tggcccagga
ccagagaatc 6000cggtggtatc tgctgtccat gggcagcaac gagaatatcc acagcatcca
cttcagcggc 6060cacgtgttca ccgtgcggaa gaaagaagag tacaagatgg ccctgtacaa
cctgtacccc 6120ggcgtgttcg agacagtgga gatgctgccc agcaaggccg gcatctggcg
ggtggagtgt 6180ctgatcggcg agcacctgca cgctggcatg agcaccctgt ttctggtgta
cagcaacaag 6240tgccagaccc cactgggcat ggcctctggc cacatccggg acttccagat
caccgcctcc 6300ggccagtacg gccagtgggc ccccaagctg gccagactgc actacagcgg
cagcatcaac 6360gcctggtcca ccaaagagcc cttcagctgg atcaaggtgg acctgctggc
ccctatgatc 6420atccacggca ttaagaccca gggcgccagg cagaagttca gcagcctgta
catcagccag 6480ttcatcatca tgtacagcct ggacggcaag aagtggcaga cctaccgggg
caacagcacc 6540ggcaccctga tggtgttctt cggcaatgtg gacagcagcg gcatcaagca
caacatcttc 6600aaccccccca tcattgcccg gtacatccgg ctgcacccca cccactacag
cattagatcc 6660acactgagaa tggaactgat gggctgcgac ctgaactcct gcagcatgcc
tctgggcatg 6720gaaagcaagg ccatcagcga cgcccagatc acagccagca gctacttcac
caacatgttc 6780gccacctggt ccccctccaa ggccaggctg cacctgcagg gccggtccaa
cgcctggcgg 6840cctcaggtca acaaccccaa agaatggctg caggtggact ttcagaaaac
catgaaggtg 6900accggcgtga ccacccaggg cgtgaaaagc ctgctgacca gcatgtacgt
gaaagagttt 6960ctgatcagca gctctcagga tggccaccag tggaccctgt tctttcagaa
cggcaaggtg 7020aaagtgttcc agggcaacca ggactccttc acccccgtgg tgaactccct
ggaccccccc 7080ctgctgaccc gctacctgag aatccacccc cagtcttggg tgcaccagat
cgccctcagg 7140atggaagtcc tgggatgtga ggcccaggat ctgtactgat gacgtctgga
acaatcaacc 7200tctggattac aaaatttgtg aaagattgac tggtattctt aactatgttg
ctccttttac 7260gctatgtgga tacgctgctt taatgccttt gtatcatgct attgcttccc
gtatggcttt 7320cattttctcc tccttgtata aatcctggtt gctgtctctt tatgaggagt
tgtggcccgt 7380tgtcaggcaa cgtggcgtgg tgtgcactgt gtttgctgac gcaaccccca
ctggttgggg 7440cattgccacc acctgtcagc tcctttccgg gactttcgct ttccccctcc
ctattgccac 7500ggcggaactc atcgccgcct gccttgcccg ctgctggaca ggggctcggc
tgttgggcac 7560tgacaattcc gtggtgttgt cggggaagct gacgtccttt ccatggctgc
tcgcctgtgt 7620tgccacctgg attctgcgcg ggacgtcctt ctgctacgtc ccttcggccc
tcaatccagc 7680ggaccttcct tcccgcggcc tgctgccggc tctgcggcct cttccgcgtc
ttcgccttcg 7740ccctcagacg agtcggatct ccctttgggc cgcctccccg cctggaatta
attctgcagt 7800cgagacctag aaaaacatgg agcaatcaca agtagcaata cagcagctac
caatgctgat 7860tgtgcctggc tagaagcaca agaggaggag gaggtgggtt ttccagtcac
acctcaggta 7920cctttaagac caatgactta caaggcagct gtagatctta gccacttttt
aaaagaaaag 7980aggggactgg aagggctaat tcactcccaa cgaagacaag atatccttga
tctgtggatc 8040taccacacac aaggctactt ccctgattag cagaactaca caccagggcc
aggggtcaga 8100tatccactga cctttggatg gtgctacaag ctagtaccag ttgagccaga
taaggtagaa 8160gaggccaata aaggagagaa caccagcttg ttacaccctg tgagcctgca
tgggatggat 8220gacccggaga gagaagtgtt agagtggagg tttgacagcc gcctagcatt
tcatcacgtg 8280gcccgagagc tgcatccgga gtacttcaag aactgctgat atcgagcttg
ctacaaggga 8340ctttccgctg gggactttcc agggaggcgt ggcctgggcg ggactgggga
gtggcgagcc 8400ctcagatcct gcatataagc agctgctttt tgcctgtact gggtctctct
ggttagacca 8460gatctgagcc tgggagctct ctggctaact agggaaccca ctgcttaagc
ctcaataaag 8520cttgccttga gtgcttcaag tagtgtgtgc ccgtctgttg tgtgactctg
gtaactagag 8580atccctcaga cccttttagt cagtgtggaa aatctctagc agtagtagtt
catgtcatct 8640tattattcag tatttataac ttgcaaagaa atgaatatca gagagtgaga
ggccttgaca 8700ttgctagcgt tttaccgtcg acctctagct agagcttggc gtaatcatgg
tcatagctgt 8760ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa catacgagcc
ggaagcataa 8820agtgtaaagc ctggggtgcc taatgagtga gctaactcac attaattgcg
ttgcgctcac 8880tgcccgcttt ccagtcggga aacctgtcgt gccagctgca ttaatgaatc
ggccaacgcg 8940cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact
gactcgctgc 9000gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta
atacggttat 9060ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag
caaaaggcca 9120ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc
cctgacgagc 9180atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta
taaagatacc 9240aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg
ccgcttaccg 9300gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcatagc
tcacgctgta 9360ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac
gaaccccccg 9420ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac
ccggtaagac 9480acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg
aggtatgtag 9540gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga
agaacagtat 9600ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt
agctcttgat 9660ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag
cagattacgc 9720gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct
gacgctcagt 9780ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt atcaaaaagg
atcttcacct 9840agatcctttt aaattaaaaa tgaagtttta aatcaatcta aagtatatat
gagtaaactt 9900ggtctgacag ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc
tgtctatttc 9960gttcatccat agttgcctga ctccccgtcg tgtagataac tacgatacgg
gagggcttac 10020catctggccc cagtgctgca atgataccgc gagacccacg ctcaccggct
ccagatttat 10080cagcaataaa ccagccagcc ggaagggccg agcgcagaag tggtcctgca
actttatccg 10140cctccatcca gtctattaat tgttgccggg aagctagagt aagtagttcg
ccagttaata 10200gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt gtcacgctcg
tcgtttggta 10260tggcttcatt cagctccggt tcccaacgat caaggcgagt tacatgatcc
cccatgttgt 10320gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag
ttggccgcag 10380tgttatcact catggttatg gcagcactgc ataattctct tactgtcatg
ccatccgtaa 10440gatgcttttc tgtgactggt gagtactcaa ccaagtcatt ctgagaatag
tgtatgcggc 10500gaccgagttg ctcttgcccg gcgtcaatac gggataatac cgcgccacat
agcagaactt 10560taaaagtgct catcattgga aaacgttctt cggggcgaaa actctcaagg
atcttaccgc 10620tgttgagatc cagttcgatg taacccactc gtgcacccaa ctgatcttca
gcatctttta 10680ctttcaccag cgtttctggg tgagcaaaaa caggaaggca aaatgccgca
aaaaagggaa 10740taagggcgac acggaaatgt tgaatactca tactcttcct ttttcaatat
tattgaagca 10800tttatcaggg ttattgtctc atgagcggat acatatttga atgtatttag
aaaaataaac 10860aaataggggt tccgcgcaca tttccccgaa aagtgccacc tgacgtcgac
ggatcgggag 10920atcaacttgt ttattgcagc ttataatggt tacaaataaa gcaatagcat
cacaaatttc 10980acaaataaag catttttttc actgcattct agttgtggtt tgtccaaact
catcaatgta 11040tcttatcatg tctggatcaa ctggataact caagctaacc aaaatcatcc
caaacttccc 11100accccatacc ctattaccac tgccaattac ctgtggtttc atttactcta
aacctgtgat 11160tcctctgaat tattttcatt ttaaagaaat tgtatttgtt aaatatgtac
tacaaactta 11220gtagttttta aagaaattgt atttgttaaa tatgtactac aaacttagta
gt 11272
SEQ ID NO: 49 (length 10725, DNA, Artificial Sequence, Vector)
acgcgtgtag
tcttatgcaa tactcttgta gtcttgcaac atggtaacga tgagttagca 60acatgcctta
caaggagaga aaaagcaccg tgcatgccga ttggtggaag taaggtggta 120cgatcgtgcc
ttattaggaa ggcaacagac gggtctgaca tggattggac gaaccactga 180attgccgcat
tgcagagata ttgtatttaa gtgcctagct cgatacaata aacgggtctc 240tctggttaga
ccagatctga gcctgggagc tctctggcta actagggaac ccactgctta 300agcctcaata
aagcttgcct tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact 360ctggtaacta
gagatccctc agaccctttt agtcagtgtg gaaaatctct agcagtggcg 420cccgaacagg
gacctgaaag cgaaagggaa accagagctc tctcgacgca ggactcggct 480tgctgaagcg
cgcacggcaa gaggcgaggg gcggcgactg gtgagtacgc caaaaatttt 540gactagcgga
ggctagaagg agagagatgg gtgcgagagc gtcagtatta agcgggggag 600aattagatcg
cgatgggaaa aaattcggtt aaggccaggg ggaaagaaaa aatataaatt 660aaaacatata
gtatgggcaa gcagggagct agaacgattc gcagttaatc ctggcctgtt 720agaaacatca
gaaggctgta gacaaatact gggacagcta caaccatccc ttcagacagg 780atcagaagaa
cttagatcat tatataatac agtagcaacc ctctattgtg tgcatcaaag 840gatagagata
aaagacacca aggaagcttt agacaagata gaggaagagc aaaacaaaag 900taagaccacc
gcacagcaag cggccactga tcttcagacc tggaggagga gatatgaggg 960acaattggag
aagtgaatta tataaatata aagtagtaaa aattgaacca ttaggagtag 1020cacccaccaa
ggcaaagaga agagtggtgc agagagaaaa aagagcagtg ggaataggag 1080ctttgttcct
tgggttcttg ggagcagcag gaagcactat gggcgcagcc tcaatgacgc 1140tgacggtaca
ggccagacaa ttattgtctg gtatagtgca gcagcagaac aatttgctga 1200gggctattga
ggcgcaacag catctgttgc aactcacagt ctggggcatc aagcagctcc 1260aggcaagaat
cctggctgtg gaaagatacc taaaggatca acagctcctg gggatttggg 1320gttgctctgg
aaaactcatt tgcaccactg ctgtgccttg gaatgctagt tggagtaata 1380aatctctgga
acagattgga atcacacgac ctggatggag tgggacagag aaattaacaa 1440ttacacaagc
ttaatacact ccttaattga agaatcgcaa aaccagcaag aaaagaatga 1500acaagaatta
ttggaattag ataaatgggc aagtttgtgg aattggttta acataacaaa 1560ttggctgtgg
tatataaaat tattcataat gatagtagga ggcttggtag gtttaagaat 1620agtttttgct
gtactttcta tagtgaatag agttaggcag ggatattcac cattatcgtt 1680tcagacccac
ctcccaaccc cgaggggacc cgacaggccc gaaggaatag aagaagaagg 1740tggagagaga
gacagagaca gatccattcg attagtgaac ggatctcgac ggttaacttt 1800taaaagaaaa
ggggggattg gggggtacag tgcaggggaa agaatagtag acataatagc 1860aacagacata
caaactaaag aattacaaaa acaaattaca aaaattcaaa attttatcga 1920tagcgcggcc
gcttcgaacg cgcgcgatgc atcatatgcg tacgcggtct agaattcctg 1980cagggcccac
tagtctccca ggcatgactc caacaatgca tcccatggga tttggggttc 2040cccagatctg
gggcttgtag gcctgactct cccctgtgca cacgtctcat acacgcatgc 2100gtgcacccat
tgcctgcccc gccccttgca cagggagtca gcagggagga ctgggttatg 2160ccctgcttat
cagcagcttc ccagcttcct ctgcctggat tcttagaggc ctggggtcct 2220agaacgagct
ggtgcacgtg gcttcccaaa gatctctcag ataatgagag gaaatgcagt 2280catcagtttg
cagaaggcta gggattctgg gccatagctc agacctgcgc ccaccatctc 2340cctccaggca
gcccttggct ggtccctgcg agcccgtgga gactgccagt cagcgctgct 2400ggatctcggg
ctcgaggcca ccatgcagat cgagctgtcc acctgctttt ttctgtgcct 2460gctgcggttc
tgcttcagcg ccacccggcg gtactacctg ggcgccgtgg agctgtcctg 2520ggactacatg
cagagcgacc tgggcgagct gcccgtggac gcccggttcc cccccagagt 2580gcccaagagc
ttccccttca acaccagcgt ggtgtacaag aaaaccctgt tcgtggagtt 2640caccgaccac
ctgttcaata tcgccaagcc caggcccccc tggatgggcc tgctgggccc 2700caccatccag
gccgaggtgt acgacaccgt ggtgatcacc ctgaagaaca tggccagcca 2760ccccgtgagc
ctgcacgccg tgggcgtgag ctactggaag gccagcgagg gcgccgagta 2820cgacgaccag
accagccagc gggagaaaga agatgacaag gtgttccctg gcggcagcca 2880cacctacgtg
tggcaggtgc tgaaagaaaa cggccccatg gcctccgacc ccctgtgcct 2940gacctacagc
tacctgagcc acgtggacct ggtgaaggac ctgaacagcg gcctgatcgg 3000cgctctgctc
gtctgccggg agggcagcct ggccaaagag aaaacccaga ccctgcacaa 3060gttcatcctg
ctgttcgccg tgttcgacga gggcaagagc tggcacagcg agacaaagaa 3120cagcctgatg
caggaccggg acgccgcctc tgccagagcc tggcccaaga tgcacaccgt 3180gaacggctac
gtgaacagaa gcctgcccgg cctgattggc tgccaccgga agagcgtgta 3240ctggcacgtg
atcggcatgg gcaccacacc cgaggtgcac agcatctttc tggaagggca 3300cacctttctg
gtccggaacc accggcaggc cagcctggaa atcagcccta tcaccttcct 3360gaccgcccag
acactgctga tggacctggg ccagttcctg ctgttttgcc acatcagctc 3420tcaccagcac
gacggcatgg aagcctacgt gaaggtggac tcttgccccg aggaacccca 3480gctgcggatg
aagaacaacg aggaagccga ggactacgac gacgacctga ccgacagcga 3540gatggacgtg
gtgcggttcg acgacgacaa cagccccagc ttcatccaga tcagaagcgt 3600ggccaagaag
caccccaaga cctgggtgca ctatatcgcc gccgaggaag aggactggga 3660ctacgccccc
ctggtgctgg cccccgacga cagaagctac aagagccagt acctgaacaa 3720tggcccccag
cggatcggcc ggaagtacaa gaaagtgcgg ttcatggcct acaccgacga 3780gacattcaag
acccgggagg ccatccagca cgagagcggc atcctgggcc ccctgctgta 3840cggcgaagtg
ggcgacacac tgctgatcat cttcaagaac caggctagcc ggccctacaa 3900catctacccc
cacggcatca ccgacgtgcg gcccctgtac agcaggcggc tgcccaaggg 3960cgtgaagcac
ctgaaggact tccccatcct gcccggcgag atcttcaagt acaagtggac 4020cgtgaccgtg
gaggacggcc ccaccaagag cgaccccaga tgcctgaccc ggtactacag 4080cagcttcgtg
aacatggaac gggacctggc ctccgggctg atcggacctc tgctgatctg 4140ctacaaagaa
agcgtggacc agcggggcaa ccagatcatg agcgacaagc ggaacgtgat 4200cctgttcagc
gtgttcgatg agaaccggtc ctggtatctg accgagaaca tccagcggtt 4260tctgcccaac
cctgccggcg tgcagctgga agatcccgag ttccaggcca gcaacatcat 4320gcactccatc
aatggctacg tgttcgactc tctgcagctc tccgtgtgtc tgcacgaggt 4380ggcctactgg
tacatcctga gcatcggcgc ccagaccgac ttcctgagcg tgttcttcag 4440cggctacacc
ttcaagcaca agatggtgta cgaggacacc ctgaccctgt tccctttcag 4500cggcgagaca
gtgttcatga gcatggaaaa ccccggcctg tggattctgg gctgccacaa 4560cagcgacttc
cggaaccggg gcatgaccgc cctgctgaag gtgtccagct gcgacaagaa 4620caccggcgac
tactacgagg acagctacga ggatatcagc gcctacctgc tgtccaagaa 4680caacgccatc
gaaccccgga gcttcagcca gaaccccccc gtgctgacgc gtcaccagcg 4740ggagatcacc
cggacaaccc tgcagtccga ccaggaagag atcgattacg acgacaccat 4800cagcgtggag
atgaagaaag aggatttcga tatctacgac gaggacgaga accagagccc 4860cagaagcttc
cagaagaaaa cccggcacta cttcattgcc gccgtggaga ggctgtggga 4920ctacggcatg
agttctagcc cccacgtgct gcggaaccgg gcccagagcg gcagcgtgcc 4980ccagttcaag
aaagtggtgt tccaggaatt cacagacggc agcttcaccc agcctctgta 5040tagaggcgag
ctgaacgagc acctggggct gctggggccc tacatcaggg ccgaagtgga 5100ggacaacatc
atggtgacct tccggaatca ggccagcaga ccctactcct tctacagcag 5160cctgatcagc
tacgaagagg accagcggca gggcgccgaa ccccggaaga acttcgtgaa 5220gcccaacgaa
accaagacct acttctggaa agtgcagcac cacatggccc ccaccaagga 5280cgagttcgac
tgcaaggcct gggcctactt cagcgacgtg gatctggaaa aggacgtgca 5340ctctggactg
attggcccac tcctggtctg ccacactaac accctcaacc ccgcccacgg 5400ccgccaggtg
accgtgcagg aattcgccct gttcttcacc atcttcgacg agacaaagtc 5460ctggtacttc
accgagaata tggaacggaa ctgcagagcc ccctgcaaca tccagatgga 5520agatcctacc
ttcaaagaga actaccggtt ccacgccatc aacggctaca tcatggacac 5580cctgcctggc
ctggtgatgg cccaggacca gagaatccgg tggtatctgc tgtccatggg 5640cagcaacgag
aatatccaca gcatccactt cagcggccac gtgttcaccg tgcggaagaa 5700agaagagtac
aagatggccc tgtacaacct gtaccccggc gtgttcgaga cagtggagat 5760gctgcccagc
aaggccggca tctggcgggt ggagtgtctg atcggcgagc acctgcacgc 5820tggcatgagc
accctgtttc tggtgtacag caacaagtgc cagaccccac tgggcatggc 5880ctctggccac
atccgggact tccagatcac cgcctccggc cagtacggcc agtgggcccc 5940caagctggcc
agactgcact acagcggcag catcaacgcc tggtccacca aagagccctt 6000cagctggatc
aaggtggacc tgctggcccc tatgatcatc cacggcatta agacccaggg 6060cgccaggcag
aagttcagca gcctgtacat cagccagttc atcatcatgt acagcctgga 6120cggcaagaag
tggcagacct accggggcaa cagcaccggc accctgatgg tgttcttcgg 6180caatgtggac
agcagcggca tcaagcacaa catcttcaac ccccccatca ttgcccggta 6240catccggctg
caccccaccc actacagcat tagatccaca ctgagaatgg aactgatggg 6300ctgcgacctg
aactcctgca gcatgcctct gggcatggaa agcaaggcca tcagcgacgc 6360ccagatcaca
gccagcagct acttcaccaa catgttcgcc acctggtccc cctccaaggc 6420caggctgcac
ctgcagggcc ggtccaacgc ctggcggcct caggtcaaca accccaaaga 6480atggctgcag
gtggactttc agaaaaccat gaaggtgacc ggcgtgacca cccagggcgt 6540gaaaagcctg
ctgaccagca tgtacgtgaa agagtttctg atcagcagct ctcaggatgg 6600ccaccagtgg
accctgttct ttcagaacgg caaggtgaaa gtgttccagg gcaaccagga 6660ctccttcacc
cccgtggtga actccctgga cccccccctg ctgacccgct acctgagaat 6720ccacccccag
tcttgggtgc accagatcgc cctcaggatg gaagtcctgg gatgtgaggc 6780ccaggatctg
tactgatgac gtctggaacg cgtcgacaat caacctctgg attacaaaat 6840ttgtgaaaga
ttgactggta ttcttaacta tgttgctcct tttacgctat gtggatacgc 6900tgctttaatg
cctttgtatc atgctattgc ttcccgtatg gctttcattt tctcctcctt 6960gtataaatcc
tggttgctgt ctctttatga ggagttgtgg cccgttgtca ggcaacgtgg 7020cgtggtgtgc
actgtgtttg ctgacgcaac ccccactggt tggggcattg ccaccacctg 7080tcagctcctt
tccgggactt tcgctttccc cctccctatt gccacggcgg aactcatcgc 7140cgcctgcctt
gcccgctgct ggacaggggc tcggctgttg ggcactgaca attccgtggt 7200gttgtcgggg
aaatcatcgt cctttccttg gctgctcgcc tgtgttgcca cctggattct 7260gcgcgggacg
tccttctgct acgtcccttc ggccctcaat ccagcggacc ttccttcccg 7320cggcctgctg
ccggctctgc ggcctcttcc gcgtcttcgc cttcgccctc agacgagtcg 7380gatctccctt
tgggccgcct ccccgcctgg tacctttaag accaatgact tacaaggcag 7440ctgtagatct
tagccacttt ttaaaagaaa aggggggact ggaagggcta attcactccc 7500aacgaaaata
agatctgctt tttgcttgta ctgggtctct ctggttagac cagatctgag 7560cctgggagct
ctctggctaa ctagggaacc cactgcttaa gcctcaataa agcttgcctt 7620gagtgcttca
agtagtgtgt gcccgtctgt tgtgtgactc tggtaactag agatccctca 7680gaccctttta
gtcagtgtgg aaaatctcta gcagtagtag ttcatgtcat cttattattc 7740agtatttata
acttgcaaag aaatgaatat cagagagtga gaggaacttg tttattgcag 7800cttataatgg
ttacaaataa agcaatagca tcacaaattt cacaaataaa gcattttttt 7860cactgcattc
tagttgtggt ttgtccaaac tcatcaatgt atcttatcat gtctggctct 7920agctatcccg
cccctaactc cgcccagttc cgcccattct ccgccccatg gctgactaat 7980tttttttatt
tatgcagagg ccgaggccgc ctcggcctct gagctattcc agaagtagtg 8040aggaggcttt
tttggaggcc tagacttttg cagagacggc ccaaattcgt aatcatggtc 8100atagctgttt
cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 8160aagcataaag
tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt 8220gcgctcactg
cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg 8280ccaacgcgcg
gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 8340ctcgctgcgc
tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 8400acggttatcc
acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 8460aaaggccagg
aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 8520tgacgagcat
cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 8580aagataccag
gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 8640gcttaccgga
tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc 8700acgctgtagg
tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 8760accccccgtt
cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 8820ggtaagacac
gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 8880gtatgtaggc
ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 8940gacagtattt
ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 9000ctcttgatcc
ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 9060gattacgcgc
agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 9120cgctcagtgg
aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat 9180cttcacctag
atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga 9240gtaaacttgg
tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg 9300tctatttcgt
tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga 9360gggcttacca
tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc 9420agatttatca
gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac 9480tttatccgcc
tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 9540agttaatagt
ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 9600gtttggtatg
gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc 9660catgttgtgc
aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 9720ggccgcagtg
ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc 9780atccgtaaga
tgcttttctg tgactggtga gtactcaacc aagtcattct gagaatagtg 9840tatgcggcga
ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag 9900cagaacttta
aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 9960cttaccgctg
ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc 10020atcttttact
ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 10080aaagggaata
agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta 10140ttgaagcatt
tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 10200aaataaacaa
ataggggttc cgcgcacatt tccccgaaaa gtgccacctg acgtctaaga 10260aaccattatt
atcatgacat taacctataa aaataggcgt atcacgaggc cctttcgtct 10320cgcgcgtttc
ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac 10380agcttgtctg
taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt 10440tggcgggtgt
cggggctggc ttaactatgc ggcatcagag cagattgtac tgagagtgca 10500ccatatatgc
ggtgtgaaat accgcacaga tgcgtaagga gaaaataccg catcaggcgc 10560cattcgccat
tcaggctgcg caactgttgg gaagggcgat cggtgcgggc ctcttcgcta 10620ttacgccagc
tggcgaaagg gggatgtgct gcaaggcgat taagttgggt aacgccaggg 10680ttttcccagt
cacgacgttg taaaacgacg gccagtgcca agctg 10725
SEQ ID NO: 50 (length 10895, DNA, Artificial Sequence, Vector)
acgcgtgtag tcttatgcaa
tactcttgta gtcttgcaac atggtaacga tgagttagca 60acatgcctta caaggagaga
aaaagcaccg tgcatgccga ttggtggaag taaggtggta 120cgatcgtgcc ttattaggaa
ggcaacagac gggtctgaca tggattggac gaaccactga 180attgccgcat tgcagagata
ttgtatttaa gtgcctagct cgatacaata aacgggtctc 240tctggttaga ccagatctga
gcctgggagc tctctggcta actagggaac ccactgctta 300agcctcaata aagcttgcct
tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact 360ctggtaacta gagatccctc
agaccctttt agtcagtgtg gaaaatctct agcagtggcg 420cccgaacagg gacctgaaag
cgaaagggaa accagagctc tctcgacgca ggactcggct 480tgctgaagcg cgcacggcaa
gaggcgaggg gcggcgactg gtgagtacgc caaaaatttt 540gactagcgga ggctagaagg
agagagatgg gtgcgagagc gtcagtatta agcgggggag 600aattagatcg cgatgggaaa
aaattcggtt aaggccaggg ggaaagaaaa aatataaatt 660aaaacatata gtatgggcaa
gcagggagct agaacgattc gcagttaatc ctggcctgtt 720agaaacatca gaaggctgta
gacaaatact gggacagcta caaccatccc ttcagacagg 780atcagaagaa cttagatcat
tatataatac agtagcaacc ctctattgtg tgcatcaaag 840gatagagata aaagacacca
aggaagcttt agacaagata gaggaagagc aaaacaaaag 900taagaccacc gcacagcaag
cggccactga tcttcagacc tggaggagga gatatgaggg 960acaattggag aagtgaatta
tataaatata aagtagtaaa aattgaacca ttaggagtag 1020cacccaccaa ggcaaagaga
agagtggtgc agagagaaaa aagagcagtg ggaataggag 1080ctttgttcct tgggttcttg
ggagcagcag gaagcactat gggcgcagcc tcaatgacgc 1140tgacggtaca ggccagacaa
ttattgtctg gtatagtgca gcagcagaac aatttgctga 1200gggctattga ggcgcaacag
catctgttgc aactcacagt ctggggcatc aagcagctcc 1260aggcaagaat cctggctgtg
gaaagatacc taaaggatca acagctcctg gggatttggg 1320gttgctctgg aaaactcatt
tgcaccactg ctgtgccttg gaatgctagt tggagtaata 1380aatctctgga acagattgga
atcacacgac ctggatggag tgggacagag aaattaacaa 1440ttacacaagc ttaatacact
ccttaattga agaatcgcaa aaccagcaag aaaagaatga 1500acaagaatta ttggaattag
ataaatgggc aagtttgtgg aattggttta acataacaaa 1560ttggctgtgg tatataaaat
tattcataat gatagtagga ggcttggtag gtttaagaat 1620agtttttgct gtactttcta
tagtgaatag agttaggcag ggatattcac cattatcgtt 1680tcagacccac ctcccaaccc
cgaggggacc cgacaggccc gaaggaatag aagaagaagg 1740tggagagaga gacagagaca
gatccattcg attagtgaac ggatctcgac ggttaacttt 1800taaaagaaaa ggggggattg
gggggtacag tgcaggggaa agaatagtag acataatagc 1860aacagacata caaactaaag
aattacaaaa acaaattaca aaaattcaaa attttatcga 1920tagcgcggcc gcttcgaacg
cgcgcgatgc atcatatgag ggaactccct gtgctgggcc 1980tacccagctg accccatcgc
tggaaacaat gggggtcagg caacacttcc ccactctctc 2040ccgccgggct gtgctcactt
ccttcctgct ggctgcctga ggaagtgtcc ctgccctggg 2100acagtctggc ctagcctttg
tttccccgcg tacgcggtct agaattcctg cagggcccac 2160tagtctccca ggcatgactc
caacaatgca tcccatggga tttggggttc cccagatctg 2220gggcttgtag gcctgactct
cccctgtgca cacgtctcat acacgcatgc gtgcacccat 2280tgcctgcccc gccccttgca
cagggagtca gcagggagga ctgggttatg ccctgcttat 2340cagcagcttc ccagcttcct
ctgcctggat tcttagaggc ctggggtcct agaacgagct 2400ggtgcacgtg gcttcccaaa
gatctctcag ataatgagag gaaatgcagt catcagtttg 2460cagaaggcta gggattctgg
gccatagctc agacctgcgc ccaccatctc cctccaggca 2520gcccttggct ggtccctgcg
agcccgtgga gactgccagt cagcgctgct ggatctcggg 2580ctcgaggcca ccatgcagat
cgagctgtcc acctgctttt ttctgtgcct gctgcggttc 2640tgcttcagcg ccacccggcg
gtactacctg ggcgccgtgg agctgtcctg ggactacatg 2700cagagcgacc tgggcgagct
gcccgtggac gcccggttcc cccccagagt gcccaagagc 2760ttccccttca acaccagcgt
ggtgtacaag aaaaccctgt tcgtggagtt caccgaccac 2820ctgttcaata tcgccaagcc
caggcccccc tggatgggcc tgctgggccc caccatccag 2880gccgaggtgt acgacaccgt
ggtgatcacc ctgaagaaca tggccagcca ccccgtgagc 2940ctgcacgccg tgggcgtgag
ctactggaag gccagcgagg gcgccgagta cgacgaccag 3000accagccagc gggagaaaga
agatgacaag gtgttccctg gcggcagcca cacctacgtg 3060tggcaggtgc tgaaagaaaa
cggccccatg gcctccgacc ccctgtgcct gacctacagc 3120tacctgagcc acgtggacct
ggtgaaggac ctgaacagcg gcctgatcgg cgctctgctc 3180gtctgccggg agggcagcct
ggccaaagag aaaacccaga ccctgcacaa gttcatcctg 3240ctgttcgccg tgttcgacga
gggcaagagc tggcacagcg agacaaagaa cagcctgatg 3300caggaccggg acgccgcctc
tgccagagcc tggcccaaga tgcacaccgt gaacggctac 3360gtgaacagaa gcctgcccgg
cctgattggc tgccaccgga agagcgtgta ctggcacgtg 3420atcggcatgg gcaccacacc
cgaggtgcac agcatctttc tggaagggca cacctttctg 3480gtccggaacc accggcaggc
cagcctggaa atcagcccta tcaccttcct gaccgcccag 3540acactgctga tggacctggg
ccagttcctg ctgttttgcc acatcagctc tcaccagcac 3600gacggcatgg aagcctacgt
gaaggtggac tcttgccccg aggaacccca gctgcggatg 3660aagaacaacg aggaagccga
ggactacgac gacgacctga ccgacagcga gatggacgtg 3720gtgcggttcg acgacgacaa
cagccccagc ttcatccaga tcagaagcgt ggccaagaag 3780caccccaaga cctgggtgca
ctatatcgcc gccgaggaag aggactggga ctacgccccc 3840ctggtgctgg cccccgacga
cagaagctac aagagccagt acctgaacaa tggcccccag 3900cggatcggcc ggaagtacaa
gaaagtgcgg ttcatggcct acaccgacga gacattcaag 3960acccgggagg ccatccagca
cgagagcggc atcctgggcc ccctgctgta cggcgaagtg 4020ggcgacacac tgctgatcat
cttcaagaac caggctagcc ggccctacaa catctacccc 4080cacggcatca ccgacgtgcg
gcccctgtac agcaggcggc tgcccaaggg cgtgaagcac 4140ctgaaggact tccccatcct
gcccggcgag atcttcaagt acaagtggac cgtgaccgtg 4200gaggacggcc ccaccaagag
cgaccccaga tgcctgaccc ggtactacag cagcttcgtg 4260aacatggaac gggacctggc
ctccgggctg atcggacctc tgctgatctg ctacaaagaa 4320agcgtggacc agcggggcaa
ccagatcatg agcgacaagc ggaacgtgat cctgttcagc 4380gtgttcgatg agaaccggtc
ctggtatctg accgagaaca tccagcggtt tctgcccaac 4440cctgccggcg tgcagctgga
agatcccgag ttccaggcca gcaacatcat gcactccatc 4500aatggctacg tgttcgactc
tctgcagctc tccgtgtgtc tgcacgaggt ggcctactgg 4560tacatcctga gcatcggcgc
ccagaccgac ttcctgagcg tgttcttcag cggctacacc 4620ttcaagcaca agatggtgta
cgaggacacc ctgaccctgt tccctttcag cggcgagaca 4680gtgttcatga gcatggaaaa
ccccggcctg tggattctgg gctgccacaa cagcgacttc 4740cggaaccggg gcatgaccgc
cctgctgaag gtgtccagct gcgacaagaa caccggcgac 4800tactacgagg acagctacga
ggatatcagc gcctacctgc tgtccaagaa caacgccatc 4860gaaccccgga gcttcagcca
gaaccccccc gtgctgacgc gtcaccagcg ggagatcacc 4920cggacaaccc tgcagtccga
ccaggaagag atcgattacg acgacaccat cagcgtggag 4980atgaagaaag aggatttcga
tatctacgac gaggacgaga accagagccc cagaagcttc 5040cagaagaaaa cccggcacta
cttcattgcc gccgtggaga ggctgtggga ctacggcatg 5100agttctagcc cccacgtgct
gcggaaccgg gcccagagcg gcagcgtgcc ccagttcaag 5160aaagtggtgt tccaggaatt
cacagacggc agcttcaccc agcctctgta tagaggcgag 5220ctgaacgagc acctggggct
gctggggccc tacatcaggg ccgaagtgga ggacaacatc 5280atggtgacct tccggaatca
ggccagcaga ccctactcct tctacagcag cctgatcagc 5340tacgaagagg accagcggca
gggcgccgaa ccccggaaga acttcgtgaa gcccaacgaa 5400accaagacct acttctggaa
agtgcagcac cacatggccc ccaccaagga cgagttcgac 5460tgcaaggcct gggcctactt
cagcgacgtg gatctggaaa aggacgtgca ctctggactg 5520attggcccac tcctggtctg
ccacactaac accctcaacc ccgcccacgg ccgccaggtg 5580accgtgcagg aattcgccct
gttcttcacc atcttcgacg agacaaagtc ctggtacttc 5640accgagaata tggaacggaa
ctgcagagcc ccctgcaaca tccagatgga agatcctacc 5700ttcaaagaga actaccggtt
ccacgccatc aacggctaca tcatggacac cctgcctggc 5760ctggtgatgg cccaggacca
gagaatccgg tggtatctgc tgtccatggg cagcaacgag 5820aatatccaca gcatccactt
cagcggccac gtgttcaccg tgcggaagaa agaagagtac 5880aagatggccc tgtacaacct
gtaccccggc gtgttcgaga cagtggagat gctgcccagc 5940aaggccggca tctggcgggt
ggagtgtctg atcggcgagc acctgcacgc tggcatgagc 6000accctgtttc tggtgtacag
caacaagtgc cagaccccac tgggcatggc ctctggccac 6060atccgggact tccagatcac
cgcctccggc cagtacggcc agtgggcccc caagctggcc 6120agactgcact acagcggcag
catcaacgcc tggtccacca aagagccctt cagctggatc 6180aaggtggacc tgctggcccc
tatgatcatc cacggcatta agacccaggg cgccaggcag 6240aagttcagca gcctgtacat
cagccagttc atcatcatgt acagcctgga cggcaagaag 6300tggcagacct accggggcaa
cagcaccggc accctgatgg tgttcttcgg caatgtggac 6360agcagcggca tcaagcacaa
catcttcaac ccccccatca ttgcccggta catccggctg 6420caccccaccc actacagcat
tagatccaca ctgagaatgg aactgatggg ctgcgacctg 6480aactcctgca gcatgcctct
gggcatggaa agcaaggcca tcagcgacgc ccagatcaca 6540gccagcagct acttcaccaa
catgttcgcc acctggtccc cctccaaggc caggctgcac 6600ctgcagggcc ggtccaacgc
ctggcggcct caggtcaaca accccaaaga atggctgcag 6660gtggactttc agaaaaccat
gaaggtgacc ggcgtgacca cccagggcgt gaaaagcctg 6720ctgaccagca tgtacgtgaa
agagtttctg atcagcagct ctcaggatgg ccaccagtgg 6780accctgttct ttcagaacgg
caaggtgaaa gtgttccagg gcaaccagga ctccttcacc 6840cccgtggtga actccctgga
cccccccctg ctgacccgct acctgagaat ccacccccag 6900tcttgggtgc accagatcgc
cctcaggatg gaagtcctgg gatgtgaggc ccaggatctg 6960tactgatgac gtctggaacg
cgtcgacaat caacctctgg attacaaaat ttgtgaaaga 7020ttgactggta ttcttaacta
tgttgctcct tttacgctat gtggatacgc tgctttaatg 7080cctttgtatc atgctattgc
ttcccgtatg gctttcattt tctcctcctt gtataaatcc 7140tggttgctgt ctctttatga
ggagttgtgg cccgttgtca ggcaacgtgg cgtggtgtgc 7200actgtgtttg ctgacgcaac
ccccactggt tggggcattg ccaccacctg tcagctcctt 7260tccgggactt tcgctttccc
cctccctatt gccacggcgg aactcatcgc cgcctgcctt 7320gcccgctgct ggacaggggc
tcggctgttg ggcactgaca attccgtggt gttgtcgggg 7380aaatcatcgt cctttccttg
gctgctcgcc tgtgttgcca cctggattct gcgcgggacg 7440tccttctgct acgtcccttc
ggccctcaat ccagcggacc ttccttcccg cggcctgctg 7500ccggctctgc ggcctcttcc
gcgtcttcgc cttcgccctc agacgagtcg gatctccctt 7560tgggccgcct ccccgcctgg
tacctttaag accaatgact tacaaggcag ctgtagatct 7620tagccacttt ttaaaagaaa
aggggggact ggaagggcta attcactccc aacgaaaata 7680agatctgctt tttgcttgta
ctgggtctct ctggttagac cagatctgag cctgggagct 7740ctctggctaa ctagggaacc
cactgcttaa gcctcaataa agcttgcctt gagtgcttca 7800agtagtgtgt gcccgtctgt
tgtgtgactc tggtaactag agatccctca gaccctttta 7860gtcagtgtgg aaaatctcta
gcagtagtag ttcatgtcat cttattattc agtatttata 7920acttgcaaag aaatgaatat
cagagagtga gaggaacttg tttattgcag cttataatgg 7980ttacaaataa agcaatagca
tcacaaattt cacaaataaa gcattttttt cactgcattc 8040tagttgtggt ttgtccaaac
tcatcaatgt atcttatcat gtctggctct agctatcccg 8100cccctaactc cgcccagttc
cgcccattct ccgccccatg gctgactaat tttttttatt 8160tatgcagagg ccgaggccgc
ctcggcctct gagctattcc agaagtagtg aggaggcttt 8220tttggaggcc tagacttttg
cagagacggc ccaaattcgt aatcatggtc atagctgttt 8280cctgtgtgaa attgttatcc
gctcacaatt ccacacaaca tacgagccgg aagcataaag 8340tgtaaagcct ggggtgccta
atgagtgagc taactcacat taattgcgtt gcgctcactg 8400cccgctttcc agtcgggaaa
cctgtcgtgc cagctgcatt aatgaatcgg ccaacgcgcg 8460gggagaggcg gtttgcgtat
tgggcgctct tccgcttcct cgctcactga ctcgctgcgc 8520tcggtcgttc ggctgcggcg
agcggtatca gctcactcaa aggcggtaat acggttatcc 8580acagaatcag gggataacgc
aggaaagaac atgtgagcaa aaggccagca aaaggccagg 8640aaccgtaaaa aggccgcgtt
gctggcgttt ttccataggc tccgcccccc tgacgagcat 8700cacaaaaatc gacgctcaag
tcagaggtgg cgaaacccga caggactata aagataccag 8760gcgtttcccc ctggaagctc
cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga 8820tacctgtccg cctttctccc
ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg 8880tatctcagtt cggtgtaggt
cgttcgctcc aagctgggct gtgtgcacga accccccgtt 8940cagcccgacc gctgcgcctt
atccggtaac tatcgtcttg agtccaaccc ggtaagacac 9000gacttatcgc cactggcagc
agccactggt aacaggatta gcagagcgag gtatgtaggc 9060ggtgctacag agttcttgaa
gtggtggcct aactacggct acactagaag gacagtattt 9120ggtatctgcg ctctgctgaa
gccagttacc ttcggaaaaa gagttggtag ctcttgatcc 9180ggcaaacaaa ccaccgctgg
tagcggtggt ttttttgttt gcaagcagca gattacgcgc 9240agaaaaaaag gatctcaaga
agatcctttg atcttttcta cggggtctga cgctcagtgg 9300aacgaaaact cacgttaagg
gattttggtc atgagattat caaaaaggat cttcacctag 9360atccttttaa attaaaaatg
aagttttaaa tcaatctaaa gtatatatga gtaaacttgg 9420tctgacagtt accaatgctt
aatcagtgag gcacctatct cagcgatctg tctatttcgt 9480tcatccatag ttgcctgact
ccccgtcgtg tagataacta cgatacggga gggcttacca 9540tctggcccca gtgctgcaat
gataccgcga gacccacgct caccggctcc agatttatca 9600gcaataaacc agccagccgg
aagggccgag cgcagaagtg gtcctgcaac tttatccgcc 9660tccatccagt ctattaattg
ttgccgggaa gctagagtaa gtagttcgcc agttaatagt 9720ttgcgcaacg ttgttgccat
tgctacaggc atcgtggtgt cacgctcgtc gtttggtatg 9780gcttcattca gctccggttc
ccaacgatca aggcgagtta catgatcccc catgttgtgc 9840aaaaaagcgg ttagctcctt
cggtcctccg atcgttgtca gaagtaagtt ggccgcagtg 9900ttatcactca tggttatggc
agcactgcat aattctctta ctgtcatgcc atccgtaaga 9960tgcttttctg tgactggtga
gtactcaacc aagtcattct gagaatagtg tatgcggcga 10020ccgagttgct cttgcccggc
gtcaatacgg gataataccg cgccacatag cagaacttta 10080aaagtgctca tcattggaaa
acgttcttcg gggcgaaaac tctcaaggat cttaccgctg 10140ttgagatcca gttcgatgta
acccactcgt gcacccaact gatcttcagc atcttttact 10200ttcaccagcg tttctgggtg
agcaaaaaca ggaaggcaaa atgccgcaaa aaagggaata 10260agggcgacac ggaaatgttg
aatactcata ctcttccttt ttcaatatta ttgaagcatt 10320tatcagggtt attgtctcat
gagcggatac atatttgaat gtatttagaa aaataaacaa 10380ataggggttc cgcgcacatt
tccccgaaaa gtgccacctg acgtctaaga aaccattatt 10440atcatgacat taacctataa
aaataggcgt atcacgaggc cctttcgtct cgcgcgtttc 10500ggtgatgacg gtgaaaacct
ctgacacatg cagctcccgg agacggtcac agcttgtctg 10560taagcggatg ccgggagcag
acaagcccgt cagggcgcgt cagcgggtgt tggcgggtgt 10620cggggctggc ttaactatgc
ggcatcagag cagattgtac tgagagtgca ccatatatgc 10680ggtgtgaaat accgcacaga
tgcgtaagga gaaaataccg catcaggcgc cattcgccat 10740tcaggctgcg caactgttgg
gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc 10800tggcgaaagg gggatgtgct
gcaaggcgat taagttgggt aacgccaggg ttttcccagt 10860cacgacgttg taaaacgacg
gccagtgcca agctg 10895
SEQ ID NO: 51; Length: 11195; Type: DNA; Organism: Artificial Sequence; Other information: Vector
acgcgtgtag tcttatgcaa tactcttgta gtcttgcaac atggtaacga
tgagttagca 60acatgcctta caaggagaga aaaagcaccg tgcatgccga ttggtggaag
taaggtggta 120cgatcgtgcc ttattaggaa ggcaacagac gggtctgaca tggattggac
gaaccactga 180attgccgcat tgcagagata ttgtatttaa gtgcctagct cgatacaata
aacgggtctc 240tctggttaga ccagatctga gcctgggagc tctctggcta actagggaac
ccactgctta 300agcctcaata aagcttgcct tgagtgcttc aagtagtgtg tgcccgtctg
ttgtgtgact 360ctggtaacta gagatccctc agaccctttt agtcagtgtg gaaaatctct
agcagtggcg 420cccgaacagg gacctgaaag cgaaagggaa accagagctc tctcgacgca
ggactcggct 480tgctgaagcg cgcacggcaa gaggcgaggg gcggcgactg gtgagtacgc
caaaaatttt 540gactagcgga ggctagaagg agagagatgg gtgcgagagc gtcagtatta
agcgggggag 600aattagatcg cgatgggaaa aaattcggtt aaggccaggg ggaaagaaaa
aatataaatt 660aaaacatata gtatgggcaa gcagggagct agaacgattc gcagttaatc
ctggcctgtt 720agaaacatca gaaggctgta gacaaatact gggacagcta caaccatccc
ttcagacagg 780atcagaagaa cttagatcat tatataatac agtagcaacc ctctattgtg
tgcatcaaag 840gatagagata aaagacacca aggaagcttt agacaagata gaggaagagc
aaaacaaaag 900taagaccacc gcacagcaag cggccactga tcttcagacc tggaggagga
gatatgaggg 960acaattggag aagtgaatta tataaatata aagtagtaaa aattgaacca
ttaggagtag 1020cacccaccaa ggcaaagaga agagtggtgc agagagaaaa aagagcagtg
ggaataggag 1080ctttgttcct tgggttcttg ggagcagcag gaagcactat gggcgcagcc
tcaatgacgc 1140tgacggtaca ggccagacaa ttattgtctg gtatagtgca gcagcagaac
aatttgctga 1200gggctattga ggcgcaacag catctgttgc aactcacagt ctggggcatc
aagcagctcc 1260aggcaagaat cctggctgtg gaaagatacc taaaggatca acagctcctg
gggatttggg 1320gttgctctgg aaaactcatt tgcaccactg ctgtgccttg gaatgctagt
tggagtaata 1380aatctctgga acagattgga atcacacgac ctggatggag tgggacagag
aaattaacaa 1440ttacacaagc ttaatacact ccttaattga agaatcgcaa aaccagcaag
aaaagaatga 1500acaagaatta ttggaattag ataaatgggc aagtttgtgg aattggttta
acataacaaa 1560ttggctgtgg tatataaaat tattcataat gatagtagga ggcttggtag
gtttaagaat 1620agtttttgct gtactttcta tagtgaatag agttaggcag ggatattcac
cattatcgtt 1680tcagacccac ctcccaaccc cgaggggacc cgacaggccc gaaggaatag
aagaagaagg 1740tggagagaga gacagagaca gatccattcg attagtgaac ggatctcgac
ggttaacttt 1800taaaagaaaa ggggggattg gggggtacag tgcaggggaa agaatagtag
acataatagc 1860aacagacata caaactaaag aattacaaaa acaaattaca aaaattcaaa
attttatcga 1920tagcgcggcc gcttcgaacg cgcgcgatgc atcatatgga caggcttctg
agtgtaggga 1980gctggtctgc cagtctttcg gaggtttgaa cttgtcaagg ctagggcagg
atcaccatat 2040ccagcctgga cttgcagttc tgtggggtgc ctccccatac ccccataaga
tgccaaacat 2100gaggccctgt catcctccat ggtccccctc tactggctgt tcaaggccca
gggctctccc 2160atgccagata gcatcctgtc tcctaccacc actgtcccag cctgagggaa
ctccctgtgc 2220tgggcctacc cagctgaccc catcgctgga aacaatgggg gtcaggcaac
acttccccac 2280tctctcccgc cgggctgtgc tcacttcctt cctgctggct gcctgaggaa
gtgtccctgc 2340cctgggacag tctggcctag cctttgtttc cccgggggtc cccacccatg
gagctttcaa 2400ggcttctggc ccctgtgaag ccagcacacg tacgcggtct agaattcctg
cagggcccac 2460tagtctccca ggcatgactc caacaatgca tcccatggga tttggggttc
cccagatctg 2520gggcttgtag gcctgactct cccctgtgca cacgtctcat acacgcatgc
gtgcacccat 2580tgcctgcccc gccccttgca cagggagtca gcagggagga ctgggttatg
ccctgcttat 2640cagcagcttc ccagcttcct ctgcctggat tcttagaggc ctggggtcct
agaacgagct 2700ggtgcacgtg gcttcccaaa gatctctcag ataatgagag gaaatgcagt
catcagtttg 2760cagaaggcta gggattctgg gccatagctc agacctgcgc ccaccatctc
cctccaggca 2820gcccttggct ggtccctgcg agcccgtgga gactgccagt cagcgctgct
ggatctcggg 2880ctcgaggcca ccatgcagat cgagctgtcc acctgctttt ttctgtgcct
gctgcggttc 2940tgcttcagcg ccacccggcg gtactacctg ggcgccgtgg agctgtcctg
ggactacatg 3000cagagcgacc tgggcgagct gcccgtggac gcccggttcc cccccagagt
gcccaagagc 3060ttccccttca acaccagcgt ggtgtacaag aaaaccctgt tcgtggagtt
caccgaccac 3120ctgttcaata tcgccaagcc caggcccccc tggatgggcc tgctgggccc
caccatccag 3180gccgaggtgt acgacaccgt ggtgatcacc ctgaagaaca tggccagcca
ccccgtgagc 3240ctgcacgccg tgggcgtgag ctactggaag gccagcgagg gcgccgagta
cgacgaccag 3300accagccagc gggagaaaga agatgacaag gtgttccctg gcggcagcca
cacctacgtg 3360tggcaggtgc tgaaagaaaa cggccccatg gcctccgacc ccctgtgcct
gacctacagc 3420tacctgagcc acgtggacct ggtgaaggac ctgaacagcg gcctgatcgg
cgctctgctc 3480gtctgccggg agggcagcct ggccaaagag aaaacccaga ccctgcacaa
gttcatcctg 3540ctgttcgccg tgttcgacga gggcaagagc tggcacagcg agacaaagaa
cagcctgatg 3600caggaccggg acgccgcctc tgccagagcc tggcccaaga tgcacaccgt
gaacggctac 3660gtgaacagaa gcctgcccgg cctgattggc tgccaccgga agagcgtgta
ctggcacgtg 3720atcggcatgg gcaccacacc cgaggtgcac agcatctttc tggaagggca
cacctttctg 3780gtccggaacc accggcaggc cagcctggaa atcagcccta tcaccttcct
gaccgcccag 3840acactgctga tggacctggg ccagttcctg ctgttttgcc acatcagctc
tcaccagcac 3900gacggcatgg aagcctacgt gaaggtggac tcttgccccg aggaacccca
gctgcggatg 3960aagaacaacg aggaagccga ggactacgac gacgacctga ccgacagcga
gatggacgtg 4020gtgcggttcg acgacgacaa cagccccagc ttcatccaga tcagaagcgt
ggccaagaag 4080caccccaaga cctgggtgca ctatatcgcc gccgaggaag aggactggga
ctacgccccc 4140ctggtgctgg cccccgacga cagaagctac aagagccagt acctgaacaa
tggcccccag 4200cggatcggcc ggaagtacaa gaaagtgcgg ttcatggcct acaccgacga
gacattcaag 4260acccgggagg ccatccagca cgagagcggc atcctgggcc ccctgctgta
cggcgaagtg 4320ggcgacacac tgctgatcat cttcaagaac caggctagcc ggccctacaa
catctacccc 4380cacggcatca ccgacgtgcg gcccctgtac agcaggcggc tgcccaaggg
cgtgaagcac 4440ctgaaggact tccccatcct gcccggcgag atcttcaagt acaagtggac
cgtgaccgtg 4500gaggacggcc ccaccaagag cgaccccaga tgcctgaccc ggtactacag
cagcttcgtg 4560aacatggaac gggacctggc ctccgggctg atcggacctc tgctgatctg
ctacaaagaa 4620agcgtggacc agcggggcaa ccagatcatg agcgacaagc ggaacgtgat
cctgttcagc 4680gtgttcgatg agaaccggtc ctggtatctg accgagaaca tccagcggtt
tctgcccaac 4740cctgccggcg tgcagctgga agatcccgag ttccaggcca gcaacatcat
gcactccatc 4800aatggctacg tgttcgactc tctgcagctc tccgtgtgtc tgcacgaggt
ggcctactgg 4860tacatcctga gcatcggcgc ccagaccgac ttcctgagcg tgttcttcag
cggctacacc 4920ttcaagcaca agatggtgta cgaggacacc ctgaccctgt tccctttcag
cggcgagaca 4980gtgttcatga gcatggaaaa ccccggcctg tggattctgg gctgccacaa
cagcgacttc 5040cggaaccggg gcatgaccgc cctgctgaag gtgtccagct gcgacaagaa
caccggcgac 5100tactacgagg acagctacga ggatatcagc gcctacctgc tgtccaagaa
caacgccatc 5160gaaccccgga gcttcagcca gaaccccccc gtgctgacgc gtcaccagcg
ggagatcacc 5220cggacaaccc tgcagtccga ccaggaagag atcgattacg acgacaccat
cagcgtggag 5280atgaagaaag aggatttcga tatctacgac gaggacgaga accagagccc
cagaagcttc 5340cagaagaaaa cccggcacta cttcattgcc gccgtggaga ggctgtggga
ctacggcatg 5400agttctagcc cccacgtgct gcggaaccgg gcccagagcg gcagcgtgcc
ccagttcaag 5460aaagtggtgt tccaggaatt cacagacggc agcttcaccc agcctctgta
tagaggcgag 5520ctgaacgagc acctggggct gctggggccc tacatcaggg ccgaagtgga
ggacaacatc 5580atggtgacct tccggaatca ggccagcaga ccctactcct tctacagcag
cctgatcagc 5640tacgaagagg accagcggca gggcgccgaa ccccggaaga acttcgtgaa
gcccaacgaa 5700accaagacct acttctggaa agtgcagcac cacatggccc ccaccaagga
cgagttcgac 5760tgcaaggcct gggcctactt cagcgacgtg gatctggaaa aggacgtgca
ctctggactg 5820attggcccac tcctggtctg ccacactaac accctcaacc ccgcccacgg
ccgccaggtg 5880accgtgcagg aattcgccct gttcttcacc atcttcgacg agacaaagtc
ctggtacttc 5940accgagaata tggaacggaa ctgcagagcc ccctgcaaca tccagatgga
agatcctacc 6000ttcaaagaga actaccggtt ccacgccatc aacggctaca tcatggacac
cctgcctggc 6060ctggtgatgg cccaggacca gagaatccgg tggtatctgc tgtccatggg
cagcaacgag 6120aatatccaca gcatccactt cagcggccac gtgttcaccg tgcggaagaa
agaagagtac 6180aagatggccc tgtacaacct gtaccccggc gtgttcgaga cagtggagat
gctgcccagc 6240aaggccggca tctggcgggt ggagtgtctg atcggcgagc acctgcacgc
tggcatgagc 6300accctgtttc tggtgtacag caacaagtgc cagaccccac tgggcatggc
ctctggccac 6360atccgggact tccagatcac cgcctccggc cagtacggcc agtgggcccc
caagctggcc 6420agactgcact acagcggcag catcaacgcc tggtccacca aagagccctt
cagctggatc 6480aaggtggacc tgctggcccc tatgatcatc cacggcatta agacccaggg
cgccaggcag 6540aagttcagca gcctgtacat cagccagttc atcatcatgt acagcctgga
cggcaagaag 6600tggcagacct accggggcaa cagcaccggc accctgatgg tgttcttcgg
caatgtggac 6660agcagcggca tcaagcacaa catcttcaac ccccccatca ttgcccggta
catccggctg 6720caccccaccc actacagcat tagatccaca ctgagaatgg aactgatggg
ctgcgacctg 6780aactcctgca gcatgcctct gggcatggaa agcaaggcca tcagcgacgc
ccagatcaca 6840gccagcagct acttcaccaa catgttcgcc acctggtccc cctccaaggc
caggctgcac 6900ctgcagggcc ggtccaacgc ctggcggcct caggtcaaca accccaaaga
atggctgcag 6960gtggactttc agaaaaccat gaaggtgacc ggcgtgacca cccagggcgt
gaaaagcctg 7020ctgaccagca tgtacgtgaa agagtttctg atcagcagct ctcaggatgg
ccaccagtgg 7080accctgttct ttcagaacgg caaggtgaaa gtgttccagg gcaaccagga
ctccttcacc 7140cccgtggtga actccctgga cccccccctg ctgacccgct acctgagaat
ccacccccag 7200tcttgggtgc accagatcgc cctcaggatg gaagtcctgg gatgtgaggc
ccaggatctg 7260tactgatgac gtctggaacg cgtcgacaat caacctctgg attacaaaat
ttgtgaaaga 7320ttgactggta ttcttaacta tgttgctcct tttacgctat gtggatacgc
tgctttaatg 7380cctttgtatc atgctattgc ttcccgtatg gctttcattt tctcctcctt
gtataaatcc 7440tggttgctgt ctctttatga ggagttgtgg cccgttgtca ggcaacgtgg
cgtggtgtgc 7500actgtgtttg ctgacgcaac ccccactggt tggggcattg ccaccacctg
tcagctcctt 7560tccgggactt tcgctttccc cctccctatt gccacggcgg aactcatcgc
cgcctgcctt 7620gcccgctgct ggacaggggc tcggctgttg ggcactgaca attccgtggt
gttgtcgggg 7680aaatcatcgt cctttccttg gctgctcgcc tgtgttgcca cctggattct
gcgcgggacg 7740tccttctgct acgtcccttc ggccctcaat ccagcggacc ttccttcccg
cggcctgctg 7800ccggctctgc ggcctcttcc gcgtcttcgc cttcgccctc agacgagtcg
gatctccctt 7860tgggccgcct ccccgcctgg tacctttaag accaatgact tacaaggcag
ctgtagatct 7920tagccacttt ttaaaagaaa aggggggact ggaagggcta attcactccc
aacgaaaata 7980agatctgctt tttgcttgta ctgggtctct ctggttagac cagatctgag
cctgggagct 8040ctctggctaa ctagggaacc cactgcttaa gcctcaataa agcttgcctt
gagtgcttca 8100agtagtgtgt gcccgtctgt tgtgtgactc tggtaactag agatccctca
gaccctttta 8160gtcagtgtgg aaaatctcta gcagtagtag ttcatgtcat cttattattc
agtatttata 8220acttgcaaag aaatgaatat cagagagtga gaggaacttg tttattgcag
cttataatgg 8280ttacaaataa agcaatagca tcacaaattt cacaaataaa gcattttttt
cactgcattc 8340tagttgtggt ttgtccaaac tcatcaatgt atcttatcat gtctggctct
agctatcccg 8400cccctaactc cgcccagttc cgcccattct ccgccccatg gctgactaat
tttttttatt 8460tatgcagagg ccgaggccgc ctcggcctct gagctattcc agaagtagtg
aggaggcttt 8520tttggaggcc tagacttttg cagagacggc ccaaattcgt aatcatggtc
atagctgttt 8580cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg
aagcataaag 8640tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt
gcgctcactg 8700cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg
ccaacgcgcg 8760gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga
ctcgctgcgc 8820tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat
acggttatcc 8880acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca
aaaggccagg 8940aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc
tgacgagcat 9000cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata
aagataccag 9060gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc
gcttaccgga 9120tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc
acgctgtagg 9180tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga
accccccgtt 9240cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc
ggtaagacac 9300gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag
gtatgtaggc 9360ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag
gacagtattt 9420ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag
ctcttgatcc 9480ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca
gattacgcgc 9540agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga
cgctcagtgg 9600aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat
cttcacctag 9660atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga
gtaaacttgg 9720tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg
tctatttcgt 9780tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga
gggcttacca 9840tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc
agatttatca 9900gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac
tttatccgcc 9960tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc
agttaatagt 10020ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc
gtttggtatg 10080gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc
catgttgtgc 10140aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt
ggccgcagtg 10200ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc
atccgtaaga 10260tgcttttctg tgactggtga gtactcaacc aagtcattct gagaatagtg
tatgcggcga 10320ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag
cagaacttta 10380aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat
cttaccgctg 10440ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc
atcttttact 10500ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa
aaagggaata 10560agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta
ttgaagcatt 10620tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa
aaataaacaa 10680ataggggttc cgcgcacatt tccccgaaaa gtgccacctg acgtctaaga
aaccattatt 10740atcatgacat taacctataa aaataggcgt atcacgaggc cctttcgtct
cgcgcgtttc 10800ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg agacggtcac
agcttgtctg 10860taagcggatg ccgggagcag acaagcccgt cagggcgcgt cagcgggtgt
tggcgggtgt 10920cggggctggc ttaactatgc ggcatcagag cagattgtac tgagagtgca
ccatatatgc 10980ggtgtgaaat accgcacaga tgcgtaagga gaaaataccg catcaggcgc
cattcgccat 11040tcaggctgcg caactgttgg gaagggcgat cggtgcgggc ctcttcgcta
ttacgccagc 11100tggcgaaagg gggatgtgct gcaaggcgat taagttgggt aacgccaggg
ttttcccagt 11160cacgacgttg taaaacgacg gccagtgcca agctg 11195
SEQ ID NO: 52; Length: 11295; Type: DNA; Organism: Artificial Sequence; Other information: Vector
acgcgtgtag
tcttatgcaa tactcttgta gtcttgcaac atggtaacga tgagttagca 60acatgcctta
caaggagaga aaaagcaccg tgcatgccga ttggtggaag taaggtggta 120cgatcgtgcc
ttattaggaa ggcaacagac gggtctgaca tggattggac gaaccactga 180attgccgcat
tgcagagata ttgtatttaa gtgcctagct cgatacaata aacgggtctc 240tctggttaga
ccagatctga gcctgggagc tctctggcta actagggaac ccactgctta 300agcctcaata
aagcttgcct tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact 360ctggtaacta
gagatccctc agaccctttt agtcagtgtg gaaaatctct agcagtggcg 420cccgaacagg
gacctgaaag cgaaagggaa accagagctc tctcgacgca ggactcggct 480tgctgaagcg
cgcacggcaa gaggcgaggg gcggcgactg gtgagtacgc caaaaatttt 540gactagcgga
ggctagaagg agagagatgg gtgcgagagc gtcagtatta agcgggggag 600aattagatcg
cgatgggaaa aaattcggtt aaggccaggg ggaaagaaaa aatataaatt 660aaaacatata
gtatgggcaa gcagggagct agaacgattc gcagttaatc ctggcctgtt 720agaaacatca
gaaggctgta gacaaatact gggacagcta caaccatccc ttcagacagg 780atcagaagaa
cttagatcat tatataatac agtagcaacc ctctattgtg tgcatcaaag 840gatagagata
aaagacacca aggaagcttt agacaagata gaggaagagc aaaacaaaag 900taagaccacc
gcacagcaag cggccactga tcttcagacc tggaggagga gatatgaggg 960acaattggag
aagtgaatta tataaatata aagtagtaaa aattgaacca ttaggagtag 1020cacccaccaa
ggcaaagaga agagtggtgc agagagaaaa aagagcagtg ggaataggag 1080ctttgttcct
tgggttcttg ggagcagcag gaagcactat gggcgcagcc tcaatgacgc 1140tgacggtaca
ggccagacaa ttattgtctg gtatagtgca gcagcagaac aatttgctga 1200gggctattga
ggcgcaacag catctgttgc aactcacagt ctggggcatc aagcagctcc 1260aggcaagaat
cctggctgtg gaaagatacc taaaggatca acagctcctg gggatttggg 1320gttgctctgg
aaaactcatt tgcaccactg ctgtgccttg gaatgctagt tggagtaata 1380aatctctgga
acagattgga atcacacgac ctggatggag tgggacagag aaattaacaa 1440ttacacaagc
ttaatacact ccttaattga agaatcgcaa aaccagcaag aaaagaatga 1500acaagaatta
ttggaattag ataaatgggc aagtttgtgg aattggttta acataacaaa 1560ttggctgtgg
tatataaaat tattcataat gatagtagga ggcttggtag gtttaagaat 1620agtttttgct
gtactttcta tagtgaatag agttaggcag ggatattcac cattatcgtt 1680tcagacccac
ctcccaaccc cgaggggacc cgacaggccc gaaggaatag aagaagaagg 1740tggagagaga
gacagagaca gatccattcg attagtgaac ggatctcgac ggttaacttt 1800taaaagaaaa
ggggggattg gggggtacag tgcaggggaa agaatagtag acataatagc 1860aacagacata
caaactaaag aattacaaaa acaaattaca aaaattcaaa attttatcga 1920tagcgcggcc
gcttcgaacg cgcgcgatgc atcatatgga gacttttttt gaaaaacgga 1980acatctgcct
atcgcaagga ctactattat tctgaaaatc accttcttca ttagaaagta 2040atatttatca
ttttattata gaactttgat cttacttctt gtgacttcat tctgcgtaga 2100gcacactccc
atccttgaat taaatgacaa agcattttat attaactgac aatgactgat 2160gccatgggca
aatcctattt ctgtaaataa ctgaattttc ttctggactg cgcatgaggg 2220gagaaagatg
tctgcagttt cggtttcctg gaaaatgaaa cctatctcat ttgttgcctg 2280tgtcaagggg
cagtgcttca gtcggggtgg agctgcttaa aaggcctggg atcacaccct 2340ttgggaacac
atccaagctt aagacggtga ggtcagcttc acattctcag gaactctcct 2400tctttgggta
agactgggag ggtgggcagg agctaccctt cccgtggccc cggaccttgg 2460gtgggctgtg
ggctcaggga gcggagggga ggccttaagc atccactctc tgcccggtgt 2520ttttgttccg
tacgcggtct agaattcctg cagggcccac tagtctccca ggcatgactc 2580caacaatgca
tcccatggga tttggggttc cccagatctg gggcttgtag gcctgactct 2640cccctgtgca
cacgtctcat acacgcatgc gtgcacccat tgcctgcccc gccccttgca 2700cagggagtca
gcagggagga ctgggttatg ccctgcttat cagcagcttc ccagcttcct 2760ctgcctggat
tcttagaggc ctggggtcct agaacgagct ggtgcacgtg gcttcccaaa 2820gatctctcag
ataatgagag gaaatgcagt catcagtttg cagaaggcta gggattctgg 2880gccatagctc
agacctgcgc ccaccatctc cctccaggca gcccttggct ggtccctgcg 2940agcccgtgga
gactgccagt cagcgctgct ggatctcggg ctcgaggcca ccatgcagat 3000cgagctgtcc
acctgctttt ttctgtgcct gctgcggttc tgcttcagcg ccacccggcg 3060gtactacctg
ggcgccgtgg agctgtcctg ggactacatg cagagcgacc tgggcgagct 3120gcccgtggac
gcccggttcc cccccagagt gcccaagagc ttccccttca acaccagcgt 3180ggtgtacaag
aaaaccctgt tcgtggagtt caccgaccac ctgttcaata tcgccaagcc 3240caggcccccc
tggatgggcc tgctgggccc caccatccag gccgaggtgt acgacaccgt 3300ggtgatcacc
ctgaagaaca tggccagcca ccccgtgagc ctgcacgccg tgggcgtgag 3360ctactggaag
gccagcgagg gcgccgagta cgacgaccag accagccagc gggagaaaga 3420agatgacaag
gtgttccctg gcggcagcca cacctacgtg tggcaggtgc tgaaagaaaa 3480cggccccatg
gcctccgacc ccctgtgcct gacctacagc tacctgagcc acgtggacct 3540ggtgaaggac
ctgaacagcg gcctgatcgg cgctctgctc gtctgccggg agggcagcct 3600ggccaaagag
aaaacccaga ccctgcacaa gttcatcctg ctgttcgccg tgttcgacga 3660gggcaagagc
tggcacagcg agacaaagaa cagcctgatg caggaccggg acgccgcctc 3720tgccagagcc
tggcccaaga tgcacaccgt gaacggctac gtgaacagaa gcctgcccgg 3780cctgattggc
tgccaccgga agagcgtgta ctggcacgtg atcggcatgg gcaccacacc 3840cgaggtgcac
agcatctttc tggaagggca cacctttctg gtccggaacc accggcaggc 3900cagcctggaa
atcagcccta tcaccttcct gaccgcccag acactgctga tggacctggg 3960ccagttcctg
ctgttttgcc acatcagctc tcaccagcac gacggcatgg aagcctacgt 4020gaaggtggac
tcttgccccg aggaacccca gctgcggatg aagaacaacg aggaagccga 4080ggactacgac
gacgacctga ccgacagcga gatggacgtg gtgcggttcg acgacgacaa 4140cagccccagc
ttcatccaga tcagaagcgt ggccaagaag caccccaaga cctgggtgca 4200ctatatcgcc
gccgaggaag aggactggga ctacgccccc ctggtgctgg cccccgacga 4260cagaagctac
aagagccagt acctgaacaa tggcccccag cggatcggcc ggaagtacaa 4320gaaagtgcgg
ttcatggcct acaccgacga gacattcaag acccgggagg ccatccagca 4380cgagagcggc
atcctgggcc ccctgctgta cggcgaagtg ggcgacacac tgctgatcat 4440cttcaagaac
caggctagcc ggccctacaa catctacccc cacggcatca ccgacgtgcg 4500gcccctgtac
agcaggcggc tgcccaaggg cgtgaagcac ctgaaggact tccccatcct 4560gcccggcgag
atcttcaagt acaagtggac cgtgaccgtg gaggacggcc ccaccaagag 4620cgaccccaga
tgcctgaccc ggtactacag cagcttcgtg aacatggaac gggacctggc 4680ctccgggctg
atcggacctc tgctgatctg ctacaaagaa agcgtggacc agcggggcaa 4740ccagatcatg
agcgacaagc ggaacgtgat cctgttcagc gtgttcgatg agaaccggtc 4800ctggtatctg
accgagaaca tccagcggtt tctgcccaac cctgccggcg tgcagctgga 4860agatcccgag
ttccaggcca gcaacatcat gcactccatc aatggctacg tgttcgactc 4920tctgcagctc
tccgtgtgtc tgcacgaggt ggcctactgg tacatcctga gcatcggcgc 4980ccagaccgac
ttcctgagcg tgttcttcag cggctacacc ttcaagcaca agatggtgta 5040cgaggacacc
ctgaccctgt tccctttcag cggcgagaca gtgttcatga gcatggaaaa 5100ccccggcctg
tggattctgg gctgccacaa cagcgacttc cggaaccggg gcatgaccgc 5160cctgctgaag
gtgtccagct gcgacaagaa caccggcgac tactacgagg acagctacga 5220ggatatcagc
gcctacctgc tgtccaagaa caacgccatc gaaccccgga gcttcagcca 5280gaaccccccc
gtgctgacgc gtcaccagcg ggagatcacc cggacaaccc tgcagtccga 5340ccaggaagag
atcgattacg acgacaccat cagcgtggag atgaagaaag aggatttcga 5400tatctacgac
gaggacgaga accagagccc cagaagcttc cagaagaaaa cccggcacta 5460cttcattgcc
gccgtggaga ggctgtggga ctacggcatg agttctagcc cccacgtgct 5520gcggaaccgg
gcccagagcg gcagcgtgcc ccagttcaag aaagtggtgt tccaggaatt 5580cacagacggc
agcttcaccc agcctctgta tagaggcgag ctgaacgagc acctggggct 5640gctggggccc
tacatcaggg ccgaagtgga ggacaacatc atggtgacct tccggaatca 5700ggccagcaga
ccctactcct tctacagcag cctgatcagc tacgaagagg accagcggca 5760gggcgccgaa
ccccggaaga acttcgtgaa gcccaacgaa accaagacct acttctggaa 5820agtgcagcac
cacatggccc ccaccaagga cgagttcgac tgcaaggcct gggcctactt 5880cagcgacgtg
gatctggaaa aggacgtgca ctctggactg attggcccac tcctggtctg 5940ccacactaac
accctcaacc ccgcccacgg ccgccaggtg accgtgcagg aattcgccct 6000gttcttcacc
atcttcgacg agacaaagtc ctggtacttc accgagaata tggaacggaa 6060ctgcagagcc
ccctgcaaca tccagatgga agatcctacc ttcaaagaga actaccggtt 6120ccacgccatc
aacggctaca tcatggacac cctgcctggc ctggtgatgg cccaggacca 6180gagaatccgg
tggtatctgc tgtccatggg cagcaacgag aatatccaca gcatccactt 6240cagcggccac
gtgttcaccg tgcggaagaa agaagagtac aagatggccc tgtacaacct 6300gtaccccggc
gtgttcgaga cagtggagat gctgcccagc aaggccggca tctggcgggt 6360ggagtgtctg
atcggcgagc acctgcacgc tggcatgagc accctgtttc tggtgtacag 6420caacaagtgc
cagaccccac tgggcatggc ctctggccac atccgggact tccagatcac 6480cgcctccggc
cagtacggcc agtgggcccc caagctggcc agactgcact acagcggcag 6540catcaacgcc
tggtccacca aagagccctt cagctggatc aaggtggacc tgctggcccc 6600tatgatcatc
cacggcatta agacccaggg cgccaggcag aagttcagca gcctgtacat 6660cagccagttc
atcatcatgt acagcctgga cggcaagaag tggcagacct accggggcaa 6720cagcaccggc
accctgatgg tgttcttcgg caatgtggac agcagcggca tcaagcacaa 6780catcttcaac
ccccccatca ttgcccggta catccggctg caccccaccc actacagcat 6840tagatccaca
ctgagaatgg aactgatggg ctgcgacctg aactcctgca gcatgcctct 6900gggcatggaa
agcaaggcca tcagcgacgc ccagatcaca gccagcagct acttcaccaa 6960catgttcgcc
acctggtccc cctccaaggc caggctgcac ctgcagggcc ggtccaacgc 7020ctggcggcct
caggtcaaca accccaaaga atggctgcag gtggactttc agaaaaccat 7080gaaggtgacc
ggcgtgacca cccagggcgt gaaaagcctg ctgaccagca tgtacgtgaa 7140agagtttctg
atcagcagct ctcaggatgg ccaccagtgg accctgttct ttcagaacgg 7200caaggtgaaa
gtgttccagg gcaaccagga ctccttcacc cccgtggtga actccctgga 7260cccccccctg
ctgacccgct acctgagaat ccacccccag tcttgggtgc accagatcgc 7320cctcaggatg
gaagtcctgg gatgtgaggc ccaggatctg tactgatgac gtctggaacg 7380cgtcgacaat
caacctctgg attacaaaat ttgtgaaaga ttgactggta ttcttaacta 7440tgttgctcct
tttacgctat gtggatacgc tgctttaatg cctttgtatc atgctattgc 7500ttcccgtatg
gctttcattt tctcctcctt gtataaatcc tggttgctgt ctctttatga 7560ggagttgtgg
cccgttgtca ggcaacgtgg cgtggtgtgc actgtgtttg ctgacgcaac 7620ccccactggt
tggggcattg ccaccacctg tcagctcctt tccgggactt tcgctttccc 7680cctccctatt
gccacggcgg aactcatcgc cgcctgcctt gcccgctgct ggacaggggc 7740tcggctgttg
ggcactgaca attccgtggt gttgtcgggg aaatcatcgt cctttccttg 7800gctgctcgcc
tgtgttgcca cctggattct gcgcgggacg tccttctgct acgtcccttc 7860ggccctcaat
ccagcggacc ttccttcccg cggcctgctg ccggctctgc ggcctcttcc 7920gcgtcttcgc
cttcgccctc agacgagtcg gatctccctt tgggccgcct ccccgcctgg 7980tacctttaag
accaatgact tacaaggcag ctgtagatct tagccacttt ttaaaagaaa 8040aggggggact
ggaagggcta attcactccc aacgaaaata agatctgctt tttgcttgta 8100ctgggtctct
ctggttagac cagatctgag cctgggagct ctctggctaa ctagggaacc 8160cactgcttaa
gcctcaataa agcttgcctt gagtgcttca agtagtgtgt gcccgtctgt 8220tgtgtgactc
tggtaactag agatccctca gaccctttta gtcagtgtgg aaaatctcta 8280gcagtagtag
ttcatgtcat cttattattc agtatttata acttgcaaag aaatgaatat 8340cagagagtga
gaggaacttg tttattgcag cttataatgg ttacaaataa agcaatagca 8400tcacaaattt
cacaaataaa gcattttttt cactgcattc tagttgtggt ttgtccaaac 8460tcatcaatgt
atcttatcat gtctggctct agctatcccg cccctaactc cgcccagttc 8520cgcccattct
ccgccccatg gctgactaat tttttttatt tatgcagagg ccgaggccgc 8580ctcggcctct
gagctattcc agaagtagtg aggaggcttt tttggaggcc tagacttttg 8640cagagacggc
ccaaattcgt aatcatggtc atagctgttt cctgtgtgaa attgttatcc 8700gctcacaatt
ccacacaaca tacgagccgg aagcataaag tgtaaagcct ggggtgccta 8760atgagtgagc
taactcacat taattgcgtt gcgctcactg cccgctttcc agtcgggaaa 8820cctgtcgtgc
cagctgcatt aatgaatcgg ccaacgcgcg gggagaggcg gtttgcgtat 8880tgggcgctct
tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg 8940agcggtatca
gctcactcaa aggcggtaat acggttatcc acagaatcag gggataacgc 9000aggaaagaac
atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt 9060gctggcgttt
ttccataggc tccgcccccc tgacgagcat cacaaaaatc gacgctcaag 9120tcagaggtgg
cgaaacccga caggactata aagataccag gcgtttcccc ctggaagctc 9180cctcgtgcgc
tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc 9240ttcgggaagc
gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt 9300cgttcgctcc
aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt 9360atccggtaac
tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc 9420agccactggt
aacaggatta gcagagcgag gtatgtaggc ggtgctacag agttcttgaa 9480gtggtggcct
aactacggct acactagaag gacagtattt ggtatctgcg ctctgctgaa 9540gccagttacc
ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg 9600tagcggtggt
ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga 9660agatcctttg
atcttttcta cggggtctga cgctcagtgg aacgaaaact cacgttaagg 9720gattttggtc
atgagattat caaaaaggat cttcacctag atccttttaa attaaaaatg 9780aagttttaaa
tcaatctaaa gtatatatga gtaaacttgg tctgacagtt accaatgctt 9840aatcagtgag
gcacctatct cagcgatctg tctatttcgt tcatccatag ttgcctgact 9900ccccgtcgtg
tagataacta cgatacggga gggcttacca tctggcccca gtgctgcaat 9960gataccgcga
gacccacgct caccggctcc agatttatca gcaataaacc agccagccgg 10020aagggccgag
cgcagaagtg gtcctgcaac tttatccgcc tccatccagt ctattaattg 10080ttgccgggaa
gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat 10140tgctacaggc
atcgtggtgt cacgctcgtc gtttggtatg gcttcattca gctccggttc 10200ccaacgatca
aggcgagtta catgatcccc catgttgtgc aaaaaagcgg ttagctcctt 10260cggtcctccg
atcgttgtca gaagtaagtt ggccgcagtg ttatcactca tggttatggc 10320agcactgcat
aattctctta ctgtcatgcc atccgtaaga tgcttttctg tgactggtga 10380gtactcaacc
aagtcattct gagaatagtg tatgcggcga ccgagttgct cttgcccggc 10440gtcaatacgg
gataataccg cgccacatag cagaacttta aaagtgctca tcattggaaa 10500acgttcttcg
gggcgaaaac tctcaaggat cttaccgctg ttgagatcca gttcgatgta 10560acccactcgt
gcacccaact gatcttcagc atcttttact ttcaccagcg tttctgggtg 10620agcaaaaaca
ggaaggcaaa atgccgcaaa aaagggaata agggcgacac ggaaatgttg 10680aatactcata
ctcttccttt ttcaatatta ttgaagcatt tatcagggtt attgtctcat 10740gagcggatac
atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt 10800tccccgaaaa
gtgccacctg acgtctaaga aaccattatt atcatgacat taacctataa 10860aaataggcgt
atcacgaggc cctttcgtct cgcgcgtttc ggtgatgacg gtgaaaacct 10920ctgacacatg
cagctcccgg agacggtcac agcttgtctg taagcggatg ccgggagcag 10980acaagcccgt
cagggcgcgt cagcgggtgt tggcgggtgt cggggctggc ttaactatgc 11040ggcatcagag
cagattgtac tgagagtgca ccatatatgc ggtgtgaaat accgcacaga 11100tgcgtaagga
gaaaataccg catcaggcgc cattcgccat tcaggctgcg caactgttgg 11160gaagggcgat
cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg gggatgtgct 11220gcaaggcgat
taagttgggt aacgccaggg ttttcccagt cacgacgttg taaaacgacg 11280gccagtgcca
agctg 11295

SEQ ID NO: 53 (length 20, DNA, Artificial Sequence, Primer)
aacggctacg tgaacagaag 20

SEQ ID NO: 54 (length 20, DNA, Artificial Sequence, Primer)
gatagggctg atttccaggc 20

SEQ ID NO: 55 (length 19, DNA, Artificial Sequence, Primer)
gaaggtgaag gtcggagtc 19

SEQ ID NO: 56 (length 20, DNA, Artificial Sequence, Primer)
gaagatggtg atgggatttc 20