Patent application title: NOVEL C. ELEGANS P21-ACTIVATED KINASE (PAK) GENE AND ASSOCIATED LOSS-OF-FUNCTION PHENOTYPES THAT FACILITATE SCREENING FOR SMALL MOLECULE MODULATORS OF PAK ACTIVITY IN THE NEMATODE, CAENORHABDITIS ELEGANS
Inventors:
Kaj Grandien (Kelkheim, DE)
Jonathan Rothblatt (Somerville, MA, US)
Paola Concari (Munich, DE)
Isabelle Quelo (Schwalbach, DE)
Bert Klebl (Gunzlhofen, DE)
Assignees:
SANOFI-AVENTIS DEUTSCHLAND GMBH
IPC8 Class: AG01N33567FI
USPC Class:
435 72
Class name: Measuring or testing process involving enzymes or micro-organisms; composition or test strip therefore; processes of forming such composition or test strip involving antigen-antibody binding, specific binding protein assay or specific ligand-receptor binding assay involving a micro-organism or cell membrane bound antigen or cell membrane bound receptor or cell membrane bound antibody or microbial lysate
Publication date: 2008-09-04
Patent application number: 20080213798
Agents:
ANDREA Q. RYAN;SANOFI-AVENTIS U.S. LLC
Origin: BRIDGEWATER, NJ US
Abstract:
The invention refers to a novel C. elegans p21-activated kinase gene, the
pak-3 gene, and associated loss-of-function phenotypes. These phenotypes
can be used to elucidate PAK signaling pathways in C. elegans and to
screen compounds that modulate PAK signaling.
Claims:
1. An isolated protein which is encoded by a polynucleotide sequence
comprising SEQ ID NO. 1.
2. The isolated protein as claimed in claim 1 which comprises an amino acid sequence of SEQ ID NO. 7.
3. The isolated protein as claimed in claim 1 which exhibits the activity of a pak-3a protein.
4. An assay for identifying a compound which binds to the pak-3a protein wherein a] a pak-3a protein is provided, b] a compound is provided, c] the pak-3a protein and the compound are brought in contact, and d] the binding of the chemical compound to the pak-3a protein is determined and/or the activity of the pak-3a protein is determined.
5. The assay as claimed in claim 4, wherein step a] consists of providing a host cell which is expressing a pak-3a protein and step c] consists of bringing in contact the host cell with the compound.
6. The assay as claimed in claim 4, wherein the compound is inactivating, or activating, or maintaining the activity of a pak-3a protein.
Description:
[0001]A novel C. elegans p21-activated kinase (PAK) gene and associated
loss-of-function phenotypes that facilitate screening for small molecule
modulators of PAK activity in the nematode, Caenorhabditis elegans.
[0002]The invention refers to a novel C. elegans p21-activated kinase gene, the pak-3 gene, and associated loss-of-function phenotypes. These phenotypes can be used to elucidate PAK signaling pathways in C. elegans and to screen compounds that modulate PAK signaling.
[0003]The p21-activated kinases comprise a group of serine/threonine protein kinases with distinct structural features that have emerged as important regulators of several different cellular and biological processes (reviewed in Bokoch, Annu Rev Biochem 2003. 72:743-81). The PAK family can be subdivided into the PAK 1-3 subclass and the PAK 4-6 subclass (based on the numbering of the human/mammalian PAKs), the former being the focus of interest here. Members of the PAK 1-3 subclass are highly related to the STE20 kinase in yeast, the founding member of this protein class, and homologues have also been identified in other model organisms such as Drosophila and C. elegans.
[0004]The two most important structural features of PAKs of the 1-3 class are the highly conserved C-terminal catalytic domain and the N-terminal regulatory domain. A distinct motif in the regulatory domain of PAK proteins is the CRIB domain (cdc42 and Rac interactive binding domain), which overlaps with an autoinhibitory domain, keeping the catalytic domain inactive in the absence of stimulatory signals. Other motifs found in PAK proteins are SH3 binding domains and an acidic residue-rich domain between the regulatory domain and the catalytic domain. Additionally, a binding site for the Gβγ subunit of heterotrimeric G proteins has been reported to be present at the very C-terminus of PAK.
[0005]The best-described activators of PAKs are the Rho class GTPases cdc42 and Rac, which upon binding to the CRIB domain block the autoinhibitory domain, leading to activation of the kinase domain. Activation of PAKs can also take place through GTPase-independent mechanisms after recruitment of PAKs to the plasma membrane, where tyrosine kinase receptor mediated activation occurs. PAKs are known to be activated by phosphorylation, in part through autophosphorylation at Thr423 and Ser144 (numbering according to human PAK1). One kinase that has been shown to phosphorylate Thr423 is PDK1, a 3-phosphoinositide-dependent kinase.
[0006]Many proteins have been reported to be phosphorylated by PAKs, several of which are involved in cell structure and cell motility. It has, for example, been shown that LIM kinases-1 and -2, serine kinases implicated in actin cytoskeletal dynamics, are phosphorylated by PAKs. Other targets involved in cell motility are myosin light chain kinase and regulatory myosin light chain. In addition, PAKs are involved in microtubule dynamics, possibly by phosphorylation of stathmin.
[0007]Through their regulatory actions on the actin cytoskeleton, myosin and microtubules, PAKs are highly involved in cellular processes such as cell motility and cell migration, which on the organism level is manifested as important role(s) for PAKs during e.g. neurogenesis and angiogenesis. It has also been suggested that PAKs are part of a signaling cascade leading to platelet activation through their regulatory action on actin cytoskeleton dynamics. PAKs are known to have both pro- and antiapoptotic effects, depending on the isoform in question. PAK2 is activated by caspase 3 and is thus part of the apoptotic signaling cascade, whereas it has been reported that PAK1 is activated by certain signaling pathways that promote cell survival, for example by IL-3 signaling.
[0008]The important role for PAKs in neurogenesis is exemplified by the hereditary disease nonsyndromic X-linked mental retardation, which is caused by point mutations in PAK3, the brain-specific PAK isoform in humans.
[0009]Several studies have suggested that PAKs may play important roles in cancer metastasis. For example, it has been reported that many breast cancer cell lines express elevated PAK1 and PAK2 activities. It has also been shown that heregulin, a stimulator of cancer cell growth, stimulates PAK1 activity. In addition, dominant negative forms of PAK1 can inhibit motility and invasiveness in cancer cell models.
[0010]It has been demonstrated that PAKs can associate with the HIV encoded Nef protein, a protein of central importance in HIV pathogenesis. Together with Nef, PAK appears to promote viral replication and pathogenesis of HIV, and PAK is required for survival of infected cells.
[0011]Previous studies have described the existence of one PAK encoding gene in C. elegans, denoted PAK1 (Chen et al 1996 JBC 271, 26362-68; Iino & Yamamoto 1998 BBRC 245, 177-84). It was shown by in vitro biochemical assays that PAK1 encodes a bona fide PAK protein demonstrating kinase activity and interaction with CeRac1 (today known as CED-10) and CDC42Ce (CDC-42). Immunofluorescence indicated PAK-1 localization to hypodermal cell boundaries during embryonic body elongation, suggesting a role for pak-1 in embryogenesis. Analysis of transgenic worms expressing pak-1 promoter-reporter gene fusions demonstrated pak-1 expression throughout development, primarily in embryonic tissues, pharyngeal muscles, CAN neurons, motor neurons in the ventral nerve cord, the spermatheca and the distal tip cell (DTC) of the developing gonad. However, no in vivo functional characterization of pak-1 has been reported, even though a knock-out pak-1 strain, RB689, is publicly available. This might suggest that loss-of-function phenotypes of pak-1 are very subtle and hard to detect or that pak-1 is functionally redundant with other protein(s).
C45b11.1 (pak-4)
[0012]In addition to the pak-1 gene, one other predicted gene in the C. elegans genome, c45b11.1, appears to encode a PAK protein, which, based on sequence homology, belongs to the PAK 4-6 subclass of PAK proteins. We propose to call this gene pak-4.
Indications of a Hitherto Unidentified Pak Gene
[0013]Sequence homology searches for genes encoding PAK-like kinase domains identified one open reading frame, y38f1a.10 (SEQ. ID NO. 33), predicted to encode a kinase domain-only protein, without the characteristic regulatory regions of a PAK protein. In the kinase database "kinase.com" located on the world wide web, this ORF is denoted with the name PAK3, with the associated comment that a putative CRIB domain is encoded in a genomic region further upstream. However, no references or experimental data are provided that support this notion.
[0014]The invention pertains to an isolated polynucleotide comprising a DNA sequence which is selected from one of the following groups [0015]a] a DNA sequence of SEQ ID NO. 1; or [0016]b] a DNA sequence which is complementary to SEQ ID NO. 1; or [0017]c] a DNA sequence which hybridizes to a DNA sequence of SEQ ID NO. 1 or to a DNA sequence which is complementary to SEQ ID NO. 1; or [0018]d] a DNA sequence which is degenerate as a result of the genetic code to the DNA sequence of SEQ ID NO. 1 or to a DNA sequence which is complementary to SEQ ID NO. 1; or [0019]e] a DNA sequence which is encoding a pak-3a polypeptide.
[0020]In one embodiment of the invention the isolated polynucleotide consists of a polynucleotide sequence of SEQ ID NO. 1.
[0021]In a further embodiment of the invention the pak-3a polypeptide that is encoded by the DNA-sequence is the pak-3a polypeptide of C. elegans consisting of an amino acid sequence of SEQ ID NO. 7.
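The notion of a DNA sequence that is complementary to a given sequence (group b] above) can be illustrated with a minimal Python sketch. The nine-nucleotide fragment used here is a made-up placeholder and is not taken from SEQ ID NO. 1 or any other sequence of this application; the function names are likewise illustrative.

```python
# Illustrative only: the sequence below is a hypothetical placeholder,
# not SEQ ID NO. 1 or any other sequence from this application.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]

def is_complementary(a: str, b: str) -> bool:
    """True if strand b is the exact reverse complement of strand a."""
    return reverse_complement(a).upper() == b.upper()

example = "ATGGCTAAC"                  # hypothetical 9-nt fragment
partner = reverse_complement(example)
print(partner)                         # GTTAGCCAT
print(is_complementary(example, partner))  # True
```

Hybridization in the sense of group c] tolerates mismatches to a degree that depends on the stringency conditions, so an exact-match test like this one covers only the strictly complementary case.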
[0022]The hybridization can occur under conditions of medium or high stringency. Conditions for hybridization at medium or high stringency can be found in textbooks such as "Molecular Cloning; edited by Sambrook J., Fritsch E. F., Maniatis T.; Cold Spring Harbor Laboratory Press (ISBN: 0-87969-309-6)" or
[0023]"Current Protocols in Molecular Biology; edited by Ausubel F. M., Brent R., Kingston R. E., Moore D. D., Seidman J. G., Smith A., Struhl K.; John Wiley & Sons, Inc. (ISBN: 0-471-50338-X, looseleaf)".
Example of Hybridization Under Medium Stringency Conditions:
[0024]The DNA or RNA is transferred on to a membrane filter (e.g. nylon, nitrocellulose) via Southern Blot or Northern Blot.
[0025]The membrane filter containing the target DNA or RNA (e.g. polynucleotide comprising a sequence of SEQ ID NO. 1) is thoroughly wetted in 6×SSC which is prepared from 20×SSC by dilution with water.
[0026](20×SSC: 3 M NaCl, 0.3 M Na3-citrate·2H2O).
[0027]The membrane filter is then prehybridized by adding 0.2 ml prehybridization solution and incubated at 68° C. for 1-2 hours. The prehybridization solution consists of 6×SSC, 5×Denhardt's reagent, 0.5% SDS and 100 μg/ml denatured, fragmented salmon sperm DNA. 5×Denhardt's reagent is prepared from 100×Denhardt's solution by dilution with water.
[0028](100×Denhardt's solution: 10 g Ficoll 400 and 10 g Polyvinylpyrrolidone and 10 g Bovine Serum Albumin in 500 ml water).
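The two dilutions mentioned above (6×SSC from 20×SSC, and 5×Denhardt's reagent from 100×Denhardt's solution) follow the usual relation C1·V1 = C2·V2. The following Python sketch works through that arithmetic. The batch volume of 100 ml and the function name are assumptions chosen for illustration only; the text does not prescribe a batch size.

```python
def stock_volume(stock_x: float, final_x: float, final_volume_ml: float) -> float:
    """Volume of concentrated stock (ml) needed, from C1*V1 = C2*V2."""
    if final_x > stock_x:
        raise ValueError("final concentration cannot exceed the stock")
    return final_x * final_volume_ml / stock_x

final_volume_ml = 100.0  # hypothetical batch size, not taken from the text

ssc_ml = stock_volume(20, 6, final_volume_ml)        # 6x SSC from 20x SSC
denhardt_ml = stock_volume(100, 5, final_volume_ml)  # 5x Denhardt's from 100x

print(ssc_ml)       # 30.0
print(denhardt_ml)  # 5.0
```

For a 100 ml batch this gives 30 ml of 20×SSC and 5 ml of 100×Denhardt's solution. The remaining volume is made up by the other components (SDS, salmon sperm DNA) and water; their stock concentrations are not given in the text and are therefore not computed here.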
[0029]To the prehybridization mix 10-20 μg/ml of radiolabeled probe (specific activity for example = 10⁹ cpm/μg) is added.
[0030]If the radiolabeled probe is double stranded, it has to be denatured by heating for 5 min. at 100° C. followed by rapid chilling to between 0° C. and 10° C.
[0031]The hybridization mix is incubated for 2 to 14 hours at 60° C. After hybridization the membrane filter is first washed in
2×SSC containing 0.5% SDS for 5 minutes at room temperature.
[0032]The filter is then washed in
2×SSC containing 0.1% SDS for 15 minutes at room temperature.
[0033]The filter is then washed in
0.1×SSC containing 0.5% SDS for 30 minutes at 37° C.
[0034]The filter is then washed in
0.1×SSC containing 0.5% SDS for 30 minutes at 42° C.
[0035]After these washing steps the filter is exposed e.g. to X-ray film or is analyzed by a phosphoimager (Applied Biosystems).
Example of Hybridization Under High Stringency Conditions:
[0036]The medium and high stringency conditions differ in particular with respect to the temperature and composition of the washing steps. Whereas the prehybridization and the incubation with the radiolabeled probe are performed under the same conditions as in the case of medium stringency hybridization, the washing steps under high stringency hybridization are as follows:
[0037]The membrane filter is first washed in 2×SSC and 0.5% SDS for 5 minutes at room temperature.
[0038]The filter is then washed in 2×SSC containing 0.1% SDS for 30 min at 50° C.
[0039]The filter is then washed in 0.1×SSC containing 0.1% SDS for 30 min at 60° C.
[0040]This last washing step is repeated one more time before the filter is exposed to an X-ray film or analyzed by a phosphoimager.
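The difference between the two washing schedules can be made explicit by writing them down as data. The Python sketch below encodes the wash steps described above and compares the highest wash temperature and the total washing time. The class and function names are illustrative, SSC strength is recorded as a multiple of 1×SSC, and the nominal value of 22° C. for "room temperature" is an assumption, since the text gives no figure.

```python
from dataclasses import dataclass
from typing import List, Optional

ROOM_TEMP_C = 22.0  # assumed nominal value; the text only says "room temperature"

@dataclass(frozen=True)
class Wash:
    ssc_x: float               # SSC strength as a multiple of 1x SSC
    sds_percent: float         # SDS concentration in percent
    minutes: int               # duration of the wash
    temp_c: Optional[float]    # None means room temperature

MEDIUM: List[Wash] = [
    Wash(2.0, 0.5, 5, None),
    Wash(2.0, 0.1, 15, None),
    Wash(0.1, 0.5, 30, 37.0),
    Wash(0.1, 0.5, 30, 42.0),
]

HIGH: List[Wash] = [
    Wash(2.0, 0.5, 5, None),
    Wash(2.0, 0.1, 30, 50.0),
    Wash(0.1, 0.1, 30, 60.0),
    Wash(0.1, 0.1, 30, 60.0),  # the last washing step is repeated once
]

def max_temp(washes: List[Wash]) -> float:
    """Highest wash temperature in the schedule, in degrees Celsius."""
    return max(ROOM_TEMP_C if w.temp_c is None else w.temp_c for w in washes)

def total_minutes(washes: List[Wash]) -> int:
    """Sum of all wash durations in the schedule."""
    return sum(w.minutes for w in washes)

print(max_temp(MEDIUM), total_minutes(MEDIUM))  # 42.0 80
print(max_temp(HIGH), total_minutes(HIGH))      # 60.0 95
```

Both schedules end in 0.1×SSC; in this encoding the high stringency schedule differs mainly by its higher wash temperatures (up to 60° C. instead of 42° C.).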
[0041]In another embodiment the invention concerns an isolated polynucleotide comprising a DNA sequence that is selected from one of the following groups [0042]a] a DNA sequence of SEQ ID NO. 2, 3, 4, 5 or 6; or [0043]b] a DNA sequence which is complementary to one of the DNA sequences of SEQ ID NO. 2, 3, 4, 5 or 6; or [0044]c] a DNA sequence which hybridizes to at least one DNA sequence of SEQ ID NO. 2, 3, 4, 5 or 6 or to at least one DNA sequence which is complementary to a DNA sequence of SEQ ID NO. 2, 3, 4, 5 or 6; or [0045]d] a DNA sequence which is degenerate as a result of the genetic code to at least one DNA sequence of SEQ ID NO. 2, 3, 4, 5 or 6; or [0046]e] a DNA sequence which is encoding a pak-3b polypeptide.
[0047]In one embodiment of the invention the isolated polynucleotide consists of a polynucleotide sequence of SEQ ID NO. 2, 3, 4, 5 or 6.
[0048]In a further embodiment of the invention the pak-3b polypeptide that is encoded by the DNA sequence is the pak-3b polypeptide of C. elegans consisting of an amino acid sequence of SEQ ID NO. 8, 9, 10, 11 or 12.
[0049]With respect to the hybridization of a polynucleotide to a pak-3b specifying sequence, reference is made to the conditions described above in the context of pak-3a. The conditions specified for pak-3a apply equally to pak-3b.
[0050]The invention refers in a further embodiment to a recombinant vector sequence comprising a DNA sequence selected from one of the following groups [0051]a] a DNA sequence of one of the SEQ ID NO. 13, 14, 15, 16, 17 or 18; or [0052]b] a DNA sequence which hybridizes to one of the SEQ ID NO. 13, 14, 15, 16, 17 or 18.
[0053]The conditions for hybridization as specified for pak-3a are applicable for a DNA sequence that hybridizes to one of the SEQ ID NO. 13, 14, 15, 16, 17 or 18 as well.
[0054]The invention refers in a further preferred embodiment to a vector sequence that consists of a DNA sequence of one of the SEQ ID NO. 13, 14, 15, 16, 17 or 18.
[0055]The invention refers also to a host cell containing a recombinant vector system as specified in SEQ ID NO. 13, 14, 15, 16, 17 or 18.
[0056]A host cell may be any cell that is transformable by a vector sequence. Examples of host cells are: Escherichia coli, Bacillus subtilis, Saccharomyces cerevisiae, insect cells, mammalian cell lines (NIH 3T3; COS; HeLa etc.) and others.
[0057]A further embodiment of the invention refers to an isolated protein that is encoded by a polynucleotide sequence of SEQ ID NO. 1. This isolated protein can consist of an amino acid sequence of SEQ ID NO. 7. This isolated protein can exhibit the activity of a pak-3a protein.
[0058]A further embodiment of the invention refers to an isolated protein that is encoded by a polynucleotide sequence of SEQ ID NO. 2, 3, 4, 5 or 6. Such an isolated protein can consist of an amino acid sequence of SEQ ID NO. 8, 9, 10, 11 or 12. This isolated protein can exhibit the activity of a pak-3b protein.
[0059]The invention refers also to the use of a host cell containing a recombinant vector system of SEQ ID NO. 13, 14, 15, 16, 17 or 18 for manufacturing a protein having an amino acid sequence of SEQ ID NO. 7 and/or exhibiting the activity of a pak-3a protein, or for manufacturing a protein having an amino acid sequence of SEQ ID NO. 8, 9, 10, 11 or 12 and/or exhibiting the activity of a pak-3b protein. In a further embodiment the invention refers to the use of a host cell containing a recombinant vector system of SEQ ID NO. 13, 14, 15, 16, 17 or 18 in a screening assay for identifying a compound which interacts with a pak-3a protein or a pak-3b protein. In the context of the present invention, a compound interacting with a protein shall mean that the compound binds to the protein, or that it stimulates the activity of the protein (activation), or that it diminishes the activity of the protein (inhibition), or that it maintains the activity of the protein, or that it stabilizes the action of the protein.
[0060]The invention concerns further the manufacturing of a protein having an amino acid sequence of SEQ ID NO. 7, 8, 9, 10, 11 or 12 and/or exhibiting the activity of a pak-3a or pak-3b protein by cultivating a host cell which harbors a recombinant vector sequence of SEQ ID NO. 13, 14, 15, 16, 17 or 18, separating the cells from the cultivation medium after cultivation, lysing the cells and purifying the protein by means of protein purification techniques. A person skilled in the art will find all protocols required for performing such a method, from cultivation of the cells up to purification of the protein, in a textbook such as "Current Protocols in Protein Science; edited by Coligan J. E., Dunn B. M., Ploegh H. L., Speicher D. W., Wingfield P. T.; Wiley, John & Sons, Inc. (ISBN: 0471140988)".
[0061]In a further embodiment the invention refers to the use of a protein having an amino acid sequence of SEQ ID NO. 7, 8, 9, 10, 11 or 12 and/or exhibiting the activity of a pak-3a or pak-3b protein for the preparation of an antibody which exhibits binding specificity for such a protein.
[0062]In a further embodiment the invention pertains to the use of a protein having an amino acid sequence of SEQ ID NO. 7, 8, 9, 10, 11 or 12 and/or exhibiting the activity of a pak-3a or pak-3b protein for the preparation of a medicament for therapy of a disease which is caused by a deficiency, hyperactivation, or malfunction of a mammalian analogous protein of a pak-3a and/or pak-3b protein. Such a mammalian analogous protein may be derived from the human species. It can consist of a kinase protein. The disease involved may be related to a malfunction of the central nervous system, of metabolism, of the cardiovascular system, of the cell division process, or of other cellular or systemic processes.
[0063]The invention refers further to the use of a protein having an amino acid sequence of SEQ ID NO. 7, 8, 9, 10, 11 or 12 in a screening process for identifying a compound that interacts with a pak-3a or a pak-3b protein. Such a screening process can be organized in the form of a High-Throughput-Screening (HTS). The HTS is based upon automated screening formats by means of laboratory robot systems.
[0064]An embodiment of the invention refers to an assay for identifying a compound that interacts with a pak-3a and/or a pak-3b protein wherein [0065]a] a pak-3a and/or a pak-3b protein is provided, [0066]b] a chemical compound is provided, [0067]c] the pak-3a and/or the pak-3b protein and the chemical compound are brought in contact, [0068]d] the binding of the chemical compound to the pak-3a and/or pak-3b protein is determined and/or the activity of the pak-3a and/or pak-3b protein is determined.
[0069]The pak-3a or pak-3b protein can consist of a protein having an amino acid sequence of SEQ ID NO. 7, 8, 9, 10, 11 or 12. A chemical compound can be provided by means of a chemical synthesis performed in a chemist's laboratory or by an industrial process. A chemical compound can be further provided by isolation from a biological organism (e.g. bacterium, fungus, plant, mammal etc.).
[0070]The pak-3a or pak-3b protein can be provided in form of a host cell which harbors a recombinant vector of SEQ ID NO. 13, 14, 15, 16, 17 or 18 and expresses a protein having the activity of a pak-3a or pak-3b protein. In one embodiment of the invention such a host cell is brought in contact with the chemical compound. In a further embodiment of the invention the assay is used for identifying a compound that is inactivating, or activating, or binding, or maintaining the activity of a pak-3a and/or pak-3b protein.
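The distinction between a compound that inactivates, activates or maintains the activity of the protein can be sketched as a comparison of the measured activity against an untreated control. The Python example below is a minimal illustration only: the function name, the 20% tolerance band and the input values are hypothetical, and the application does not specify any threshold, read-out or measurement unit.

```python
def classify_effect(activity_with_compound: float,
                    activity_control: float,
                    tolerance: float = 0.2) -> str:
    """Label a compound by its effect on protein activity relative to a control.

    tolerance is a hypothetical cut-off (fraction of control activity);
    the application does not specify any threshold.
    """
    if activity_control <= 0:
        raise ValueError("control activity must be positive")
    ratio = activity_with_compound / activity_control
    if ratio < 1 - tolerance:
        return "inactivating"
    if ratio > 1 + tolerance:
        return "activating"
    return "maintaining"

# Hypothetical read-outs in arbitrary units, for illustration only.
print(classify_effect(30.0, 100.0))   # inactivating
print(classify_effect(150.0, 100.0))  # activating
print(classify_effect(100.0, 100.0))  # maintaining
```

Binding of a compound without any change in activity would not be detected by such a comparison and has to be determined separately, as provided for in step d] of the assay.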
[0071]The invention concerns further a compound that can be identified by such an assay as well as the use of such a compound as pharmaceutically active ingredient or the use of such a compound for manufacturing of a medicament. Such a compound may have a molecular weight of 100 to 50 000 kDa.
[0072]The invention pertains in a further embodiment to a strain of C. elegans that exhibits a loss-of-function phenotype with respect to the pak-3a and/or pak-3b protein. Such a loss-of-function phenotype is detectable by means of Southern or Northern blots in case the gene and/or the mRNA is not expressed. The loss-of-function phenotype is also detectable by Western blots in case the protein is not expressed. The determination of the activity of the pak-3a or pak-3b protein proves the loss-of-function phenotype with respect to the pak-3a and/or pak-3b protein in case the organism is not able to produce functional versions of the proteins, or is degrading the proteins rapidly or contains inhibitors of the proteins. The loss-of-function phenotype of a C. elegans strain with respect to the pak-3a and/or pak-3b protein can be linked to gonad migration, embryonic lethality or sterility.
[0073]In one embodiment of the invention the loss-of-function phenotype of the strain of C. elegans is caused by a mutation or by a partial or complete deletion of the gene coding sequence of pak-3a and/or pak-3b.
[0074]In a further embodiment of the invention the loss-of-function phenotype of the strain of C. elegans is caused by an insertion of a polynucleotide sequence into the gene coding sequence of pak-3a and/or pak-3b.
[0075]In a further embodiment of the invention the loss-of-function phenotype of the strain of C. elegans is caused by a polynucleotide that is selected from the following group:
RNAi (interference RNA), ribozyme, antisense RNA, antisense DNA.
[0076]The inactivation of specific mRNAs upon exposure to double-stranded RNA (dsRNA) can in C. elegans be achieved by several different approaches. Below is a short summary of the main approaches.
By Feeding
[0077]In RNAi by feeding, a cDNA or genomic DNA fragment from the gene of interest is cloned in a plasmid between two opposing T7 RNA polymerase promoter sites. The plasmid is subsequently transformed into an E. coli host strain that contains an inducible T7 RNA polymerase gene, and the E. coli strain obtained is used as a C. elegans food source (Timmons and Fire, 1998, Nature, 395, 854; Timmons et al, 2001, Gene 263, 103-112; Kamath et al, 2000, Genome Biology 2, research0002.1-0002.10).
[0078]The advantage of this approach is that relatively large numbers of worms can be treated for RNAi, and over several generations. One disadvantage is that some RNAs might be toxic to the E. coli. Normally phenotypes are scored initially in the F1 generation, although some phenotypes can occasionally be observed already in the P0 animals.
By Microinjection
[0079]The first described approach for RNAi in C. elegans was the microinjection of dsRNA into the animal body cavity (Fire et al, 1998, Nature 391, 806-811). For this approach dsRNA is obtained by in vitro transcription of a cDNA or genomic DNA fragment cloned into a vector with T3, T7 or SP6 RNA polymerase promoter sites, or from a hybrid PCR product containing both suitable RNA polymerase promoter site(s) and sequence from the gene of interest. Normally the two RNA strands are transcribed separately and subsequently annealed together. Alternatively both strands can be transcribed in one reaction (if the insert has been cloned in both orientations downstream of the same promoter), meaning that a separate annealing step can be left out. The in vitro produced dsRNA is subsequently microinjected into the body of C. elegans animals.
[0080]With this approach the number of animals available for analysis is much lower than with the feeding method. However, it has occasionally been claimed that microinjection has a higher success rate.
By Soaking
[0081]Instead of microinjecting the in vitro transcribed dsRNA the worms can be incubated ("soaked") in high concentrations of dsRNA (Maeda et al 2001, Current Biology, 11, 171-176).
[0082]This approach is less labour intensive than microinjection but is not so commonly used.
By Transgenics
[0083]DsRNA can also be produced in situ in the worms by generating transgenic animals expressing either a hairpin RNA molecule that folds back on itself to form a dsRNA, or two transgenic constructs expressing the two different RNA strands. Although labour intensive, this approach opens the possibility for stably knocked-down RNAs as well as tissue-specific and inducible RNAi, depending on the promoter chosen for driving the RNA expression (Tavernarakis et al, 2000, Nature Genetics, 24, 180-183).
[0084]In a further embodiment of the invention the loss-of-function phenotype of the strain of C. elegans is caused by an inhibitor of the pak-3a and/or pak-3b protein.
[0085]The invention refers also to the use of a strain of C. elegans that exhibits a loss-of-function phenotype for identifying a protein of the PAK signaling pathway. Such a protein can be a kinase, a phosphatase, a transcription enhancer, a transcription repressor or any other protein which is able to interact with an intracellular signaling cascade.
[0086]The invention refers further to the use of a strain that is exhibiting a loss-of-function phenotype for identifying a compound that interacts with a protein of the PAK signaling pathway.
[0087]The invention further pertains to a method for generating a C. elegans having a phenotype that is characterized by sterility and/or embryonic lethality and/or a defective gonad migration pattern by [0088]a] inactivating the pak-1 gene and/or pak-1 protein in a C. elegans and/or inactivating the pak-3 gene and/or pak-3 protein of the same C. elegans and [0089]b] identifying a C. elegans exhibiting the phenotype of sterility and/or embryonic lethality and/or a defective gonad migration pattern.
[0090]In context of this application the term pak-3 shall include pak-3a and pak-3b. When referring to pak-3 the reference shall pertain to pak-3a and/or pak-3b.
[0091]The inactivating of the pak-1 gene and/or pak-1 protein and pak-3 gene and/or pak-3 protein has to be performed in the same C. elegans organism. The inactivating of both genes can occur simultaneously or consecutively. In all cases of inactivation of pak-1 and/or pak-3 to obtain the loss-of-function phenotype, it makes no difference whether the chemical inhibition and/or genetic inactivation of pak-3 is performed before or after the chemical inhibition and/or genetic inactivation of pak-1.
[0092]The identifying of a C. elegans exhibiting the phenotype of sterility and/or embryonic lethality and/or a defective gonad migration pattern can occur in offspring of the F1 and/or the F2 and/or a subsequent generation.
[0093]The inactivating of the pak-1 gene and/or the pak-1 protein can be achieved by means of RNA molecules that are suitable for RNA interference with a pak-1 coding polynucleotide. Such RNA molecules can be derived from at least one of the vectors of the following group: pKG61 (SEQ ID NO. 26), pKG71 (SEQ ID NO. 28). For that purpose the respective vector is introduced into a bacterial strain such as E. coli, the RNA is transcribed from the plasmid promoter and thereafter isolated from the bacteria and purified. The purified RNA is then brought in contact with a cell of C. elegans, or a part of an organism of C. elegans or a complete organism of C. elegans.
[0094]A further possibility of inactivating the pak-1 gene and/or the pak-1 protein consists in feeding C. elegans with bacteria that contain RNA molecules which are suitable for RNA interference with a pak-1 coding polynucleotide. Such bacteria for the feeding of the C. elegans can harbor at least one plasmid of the following group: pKG61 (SEQ ID NO. 26), pKG71 (SEQ ID NO. 28). The RNA is transcribed from these vectors within the bacteria.
[0095]The inactivating of the pak-1 gene and/or the pak-1 protein can be performed by use of a pak-1 knock out strain of C. elegans. Such a knock out strain is e.g. C. elegans RB689.
[0096]The pak-1 gene and/or pak-1 protein can be inactivated by means of a corresponding antisense RNA, antisense DNA, a ribozyme, an inhibitor of the pak-1 gene transcription or an inhibitor of the pak-1 protein.
[0097]The inactivating of the pak-3 gene and/or pak-3 protein can be achieved by means of RNA molecules that are suitable for RNA interference with a pak-3 coding polynucleotide. Such RNA molecules can be derived from at least one of the vectors of the following group: pKG65 (SEQ ID NO. 27), pKG71 (SEQ ID NO. 28), pKG63 (SEQ ID NO. 29), pKG64 (SEQ ID NO. 30). For that purpose the respective vector is introduced into a bacterial strain such as E. coli, the RNA is transcribed from the plasmid promoter and thereafter isolated from the bacteria and purified. The purified RNA is then brought in contact with a cell of C. elegans, or a part of an organism of C. elegans or a complete organism of C. elegans. A further possibility of inactivating the pak-3 gene and/or the pak-3 protein consists in feeding C. elegans with bacteria that contain RNA molecules which are suitable for RNA interference with a pak-3 coding polynucleotide. Such bacteria for the feeding of the C. elegans can harbor at least one plasmid of the following group: pKG65 (SEQ ID NO. 27), pKG71 (SEQ ID NO. 28), pKG63 (SEQ ID NO. 29), pKG64 (SEQ ID NO. 30). The RNA is transcribed from these vectors within the bacteria. The inactivating of the pak-3 gene and/or pak-3 protein can be performed by use of a pak-3 knock out strain of C. elegans. The pak-3 gene and/or pak-3 protein can be inactivated by means of a corresponding antisense RNA, antisense DNA, a ribozyme, an inhibitor of the pak-3 gene transcription or an inhibitor of the pak-3 protein.
[0098]The invention pertains further to a strain of C. elegans which is characterized by a phenotype of sterility and/or embryonic lethality and/or a defective gonad migration and which harbors an inquired or missing pak-1 function and an impaired or missing pak-3 function. In context of this application the term function shall refer to the gene and/or the protein. The strain of C. elegans of the invention which is characterized by a phenotype of sterility and/or embryonic lethality and/or a defective gonad migration pattern could be obtainable or could be obtained by one or several of the methods for generating a C. elegans having said phenotypes. Such a strain can be used amongst other things for characterizing the intracellular signaling cascade linked to pak-1 and/or pak-3. Such a strain can also be used for identifying of a compound that interferes with one or several proteins which are part of the signaling cascade linked to pak-1 and/or pak-3. Such a strain can further be used for identification of a compound that interferes with transcription of one or several proteins that are part of the signaling cascade linked to pak-1 and/or pak-3.
[0099]The invention relates also to the manufacturing of an RNA molecule wherein at least one of the polynucleotides of pKG61 (SEQ ID NO. 26), pKG65 (SEQ ID NO. 27), pKG71 (SEQ ID NO. 28), pKG63 (SEQ ID NO. 29), pKG64 (SEQ ID NO. 30), pKG167 (SEQ ID NO. 31) or pKG168 (SEQ ID NO. 32) is transformed into a bacterial strain, the RNA is transcribed from the vector and the transcribed RNA is isolated and/or purified. The invention pertains also to RNA molecules that are obtainable or obtained by such a method. These RNA molecules can be used as individual species one by one or in a combined manner for RNA interference with a pak-1 and/or pak-3 protein coding polynucleotide.
Description of SEQ IDs
[0100]SEQ ID NO. 1 is disclosing the polynucleotide sequence of the pak-3a cDNA. The pak-3a gene is consisting of the coding information of a kinase domain.
[0101]SEQ ID NO. 2 is disclosing the polynucleotide sequence of the pak-3b cDNA. The pak-3b gene is consisting of the coding information of a kinase domain of the same sequence composition as pak-3a and an additional CRIB domain (cdc42/Rac interactive binding domain) which is 5'-linked to the kinase domain.
[0102]SEQ ID NO. 3 is disclosing the polynucleotide sequence of the pak-3b cDNA harboring a silent polymorphism (change from gct to gcc) that would leave the concerned Ala of the corresponding protein unchanged.
[0103]SEQ ID NO. 4 is disclosing the polynucleotide sequence of the pak-3b cDNA harboring the silent polymorphism as described in SEQ ID NO. 3 and harboring further an in-frame 6 bp insertion within the kinase domain.
[0104]SEQ ID NO. 5 is disclosing the polynucleotide sequence of the pak-3b cDNA harboring an in frame 9 bp insert within the CRIB domain and having the corresponding sequence of Exon 7 deleted.
[0105]SEQ ID NO. 6 is disclosing the polynucleotide sequence of the pak-3b cDNA harboring a polymorphism (change from atc to gtc) within the CRIB domain which changes an Ile into a Val of the corresponding protein and harboring the in frame insertion of 6 bp of the kinase domain (as is the same as in SEQ ID NO. 4).
[0106]SEQ ID NO. 7 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 1 (kinase domain).
[0107]SEQ ID NO. 8 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 2 (kinase domain plus CRIB domain).
[0108]SEQ ID NO. 9 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 3 (kinase domain plus CRIB domain).
[0109]SEQ ID NO. 10 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 4 (kinase domain having a 6 bp insert plus CRIB domain).
[0110]SEQ ID NO. 11 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 5 (kinase domain plus CRIB domain having a 9 bp insert and Exon 7 deleted).
[0111]SEQ ID NO. 12 is disclosing the amino acid sequence of the corresponding protein of SEQ ID NO. 6 (kinase domain having a 6 bp insert plus CRIB domain in which an Ile is changed into a Val).
[0112]SEQ ID NO. 13 is disclosing the polynucleotide sequence of vector pKG40, which is encompassing the polynucleotide sequence of SEQ ID NO. 1. A description of the vector is given within the header of FIG. 13.
[0113]SEQ ID NO. 14 is disclosing the polynucleotide sequence of vector pKG 123, which is encompassing the polynucleotide sequence of SEQ ID NO. 2. A description of the vector is given within the header of FIG. 14.
[0114]SEQ ID NO. 15 is disclosing the polynucleotide sequence of vector pKG 43, which is encompassing the polynucleotide sequence of SEQ ID NO. 3. A description of the vector is given within the header of FIG. 15.
[0115]SEQ ID NO. 16 is disclosing the polynucleotide sequence of vector pKG 44, which is encompassing the polynucleotide sequence of SEQ ID NO. 4. A description of the vector is given within the header of FIG. 16.
[0116]SEQ ID NO. 17 is disclosing the polynucleotide sequence of vector pKG 58, which is encompassing the polynucleotide sequence of SEQ ID NO. 5. A description of the vector is given within the header of FIG. 17.
[0117]SEQ ID NO. 18 is disclosing the polynucleotide sequence of vector pKG 59, which is encompassing the polynucleotide sequence of SEQ ID NO. 6. A description of the vector is given within the header of FIG. 18.
[0118]SEQ ID NO. 19 is disclosing the primer sequence kg 1.
[0119]SEQ ID NO. 20 is disclosing the primer sequence kg 2.
[0120]SEQ ID NO. 21 is disclosing the primer sequence kg 25.
[0121]SEQ ID NO. 22 is disclosing the primer sequence kg 26.
[0122]SEQ ID NO. 23 is disclosing the primer sequence kg 37.
[0123]SEQ ID NO. 24 is disclosing the primer sequence kg 27.
[0124]SEQ ID NO. 25 is disclosing the primer sequence kg 50.
[0125]SEQ ID NO. 26 is disclosing the polynucleotide sequence of vector pKG61/dT7-pak-1. A description of the vector is given within the header of FIG. 26.
[0126]SEQ ID NO. 27 is disclosing the polynucleotide sequence of vector pKG65/dT7-pak-3. A description of the vector is given within the header of FIG. 27.
[0127]SEQ ID NO. 28 is disclosing the polynucleotide sequence of vector pKG71/dT7-pak-3/pak-1. A description of the vector is given within the header of FIG. 28.
[0128]SEQ ID NO. 29 is disclosing the polynucleotide sequence of vector pKG63/dT7-pak-3a. A description of the vector is given within the header of FIG. 29.
[0129]SEQ ID NO. 30 is disclosing the polynucleotide sequence of vector pKG64/dT7-pak-3b. A description of the vector is given within the header of FIG. 30.
[0130]SEQ ID NO. 31 is disclosing the polynucleotide sequence of vector pKG167/dT7-ced-10. A description of the vector is given within the header of FIG. 31.
[0131]SEQ ID NO. 32 is disclosing the polynucleotide sequence of vector pKG168/dT7-mig-2. A description of the vector is given within the header of FIG. 32.
[0132]SEQ ID NO. 33 is disclosing the polynucleotide sequence of expressed sequence tag (EST) y38f1a10 of C. elegans.
[0133]SEQ ID NO. 34 is disclosing the polynucleotide sequence of EST yk651h1 5' of C. elegans.
[0134]SEQ ID NO. 35 is disclosing the polynucleotide sequence of EST yk651h1 3' of C. elegans.
[0135]SEQ ID NO. 36 is disclosing the polynucleotide sequence of EST F18a 11.4 of C. elegans.
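The sequence variants described for SEQ ID NO. 3 to SEQ ID NO. 6 can be checked against the standard genetic code. The following Python sketch is illustrative only: the function names and the four-codon lookup table are assumptions introduced here and are not part of the application. It shows that the gct to gcc change leaves Ala unchanged, that the atc to gtc change replaces Ile with Val, and that inserts of 6 bp and 9 bp preserve the reading frame.

```python
# Excerpt of the standard genetic code, limited to the codons named in the
# descriptions of SEQ ID NO. 3 and SEQ ID NO. 6 (illustrative helper table).
CODON_TABLE = {"GCT": "Ala", "GCC": "Ala", "ATC": "Ile", "GTC": "Val"}


def classify_codon_change(before: str, after: str) -> str:
    """Return 'silent' if both codons encode the same amino acid, else 'missense'."""
    aa_before = CODON_TABLE[before.upper()]
    aa_after = CODON_TABLE[after.upper()]
    return "silent" if aa_before == aa_after else "missense"


def is_in_frame(insert_length_bp: int) -> bool:
    """An insertion keeps the reading frame when its length is a multiple of 3."""
    return insert_length_bp % 3 == 0


print(classify_codon_change("gct", "gcc"))  # Ala -> Ala
print(classify_codon_change("atc", "gtc"))  # Ile -> Val
print(is_in_frame(6), is_in_frame(9))       # 6 bp and 9 bp inserts
```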
DESCRIPTION OF THE FIGURES
[0136]FIG. 1 exhibits SEQ ID NO. 1.
[0137]FIG. 2 exhibits SEQ ID NO. 2.
[0138]FIG. 3 exhibits SEQ ID NO. 3.
[0139]FIG. 4 exhibits SEQ ID NO. 4.
[0140]FIG. 5 exhibits SEQ ID NO. 5.
[0141]FIG. 6 exhibits SEQ ID NO. 6.
[0142]FIG. 7 exhibits SEQ ID NO. 7.
[0143]FIG. 8 exhibits SEQ ID NO. 8.
[0144]FIG. 9 exhibits SEQ ID NO. 9.
[0145]FIG. 10 exhibits SEQ ID NO. 10.
[0146]FIG. 11 exhibits SEQ ID NO. 11.
[0147]FIG. 12 exhibits SEQ ID NO. 12.
[0148]FIG. 13 exhibits SEQ ID NO. 13.
[0149]FIG. 14 exhibits SEQ ID NO. 14.
[0150]FIG. 15 exhibits SEQ ID NO. 15.
[0151]FIG. 16 exhibits SEQ ID NO. 16.
[0152]FIG. 17 exhibits SEQ ID NO. 17.
[0153]FIG. 18 exhibits SEQ ID NO. 18.
[0154]FIG. 19 exhibits SEQ ID NO. 19.
[0155]FIG. 20 exhibits SEQ ID NO. 20.
[0156]FIG. 21 exhibits SEQ ID NO. 21.
[0157]FIG. 22 exhibits SEQ ID NO. 22.
[0158]FIG. 23 exhibits SEQ ID NO. 23.
[0159]FIG. 24 exhibits SEQ ID NO. 24.
[0160]FIG. 25 exhibits SEQ ID NO. 25.
[0161]FIG. 26 exhibits SEQ ID NO. 26.
[0162]FIG. 27 exhibits SEQ ID NO. 27.
[0163]FIG. 28 exhibits SEQ ID NO. 28.
[0164]FIG. 29 exhibits SEQ ID NO. 29.
[0165]FIG. 30 exhibits SEQ ID NO. 30.
[0166]FIG. 31 exhibits SEQ ID NO. 31.
[0167]FIG. 32 exhibits SEQ ID NO. 32.
[0168]FIG. 33 exhibits SEQ ID NO. 33.
[0169]FIG. 34 exhibits SEQ ID NO. 34.
[0170]FIG. 35 exhibits SEQ ID NO. 35.
[0171]FIG. 36 exhibits SEQ ID NO. 36.
[0172]FIG. 37 Gonad path finding phenotypes (A) and intensing of gonad defects.
ABBREVIATIONS
[0173]NGM--Nematode growing medium
[0174]E. coli OP50--Uracil requiring strain of E. coli; used as a food source for nematodes
[0175]DMSO--Dimethylsulfoxide.
Deposit of Plasmid DNA
[0176]The plasmids of the present invention have been deposited with the DSMZ--Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (German Collection of Microorganisms and Cell Cultures GmbH)
Mascheroder Weg 1b
D-38124 Braunschweig
[0177]according to the following numbers:
plasmid pKG40=DSM 16147 (see also Seq ID No. 13 as well as FIG. 13)
plasmid pKG43=DSM 16148 (see also Seq ID No. 15 as well as FIG. 15)
plasmid pKG44=DSM 16149 (see also Seq ID No. 16 as well as FIG. 16)
plasmid pKG58=DSM 16150 (see also Seq ID No. 17 as well as FIG. 17)
plasmid pKG59=DSM 16151 (see also Seq ID No. 18 as well as FIG. 18)
plasmid pKG123=DSM 16152 (see also Seq ID No. 14 as well as FIG. 14)
EXAMPLES
Strains, General Strain Culture, Molecular & Genetic Methods
[0178]Nematode culture was done according to Brenner 1974.
[0179]Strains were obtained from the Caenorhabditis Genetics Center (CGC) and the C. elegans knock-out consortium.
Cloning of pak-3
[0180]The different isoforms of pak-3 cDNA were cloned by RT-PCR from N2 (C. elegans wild-type) total RNA. All primers contain 5' adaptor sequences to allow Gateway cloning. RT-PCR products with 5' Gateway adaptor sequences were re-amplified using attB sequence primers
TABLE-US-00001
kg1 (SEQ ID NO. 19): GGGGACAAGTTTGTACAAAAAAGCAGGCT
kg2 (SEQ ID NO. 20): GGGGACCACTTTGTACAAGAAAGCTGGGT
and cloned via the BP reaction into the vector pDONR201 as described by the manufacturer (Invitrogen).
[0181]Sequence alignments were done using the Lasergene software package. Blast database searches were conducted using the NCBI Blast tool (internal installation). Sequence motifs were identified using the Workbench Pfam HMM database search tool (GeneData AG).
[0182]pKG40 (pak-3a wild-type; DSM 16147) was cloned by the use of gene-specific primers based on the gene prediction for y38f1a.10 (SEQ ID NO. 33) (5': kg25; aaaaagcaggctcaaaaATGTTTCAAAATAGTCCGATGAT, SEQ ID NO. 21; 3': kg26; agaaagctgggtCTACTTTTCTCGGACGGCTCT, SEQ ID NO. 22). (One pak-3a clone was isolated by the 5' SL1 primer kg37 [see below] in combination with kg26. This clone was found to have an ORF identical to pKG40 and was not kept).
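The two-step adaptor design described above, in which gene-specific primers carry partial 5' adaptors that are extended by the kg1/kg2 primers in the re-amplification, can be illustrated by comparing the primer sequences. The following Python sketch uses the primer sequences given in this section; the function name `adaptor_overlap` is a hypothetical helper introduced here. It reports how many bases at the 3' end of each re-amplification primer match the 5' adaptor of the corresponding gene-specific primer.

```python
def adaptor_overlap(outer: str, inner: str) -> int:
    """Length of the longest 3' suffix of `outer` equal to the 5' prefix of `inner`."""
    outer, inner = outer.upper(), inner.upper()
    for k in range(min(len(outer), len(inner)), 0, -1):
        if outer[-k:] == inner[:k]:
            return k
    return 0


# Primer sequences as listed in the text.
KG1 = "GGGGACAAGTTTGTACAAAAAAGCAGGCT"               # SEQ ID NO. 19
KG2 = "GGGGACCACTTTGTACAAGAAAGCTGGGT"               # SEQ ID NO. 20
KG25 = "aaaaagcaggctcaaaaATGTTTCAAAATAGTCCGATGAT"   # SEQ ID NO. 21
KG26 = "agaaagctgggtCTACTTTTCTCGGACGGCTCT"          # SEQ ID NO. 22

print(adaptor_overlap(KG1, KG25))  # overlap of kg1 with the 5' adaptor of kg25
print(adaptor_overlap(KG2, KG26))  # overlap of kg2 with the 5' adaptor of kg26
```

In both primer pairs the lower-case 12-base adaptor of the gene-specific primer is identical to the 3' end of the re-amplification primer.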
[0183]pKG43 (pak-3b SL 1 t1680c; Ala-Ala; DSM 16148) was cloned by a 5' primer corresponding to the SL1 trans-spliced leader sequence (kg37; aaaaagcaggctGGTTTAATTACCCAAGTTTGAG, SEQ ID NO. 23) in conjunction with the 3' primer kg26 (agaaagctgggtcTACTTTTCTCGGACGGCTCT, SEQ ID NO. 24).
[0184]pKG44 (pak-3b Ins1581 t1686c; DSM 16149), pKG58 (pak-3b Ins228 Delta Exon 7) and pKG59 (pak-3b a43g Ins1581) were all cloned by combining a gene-specific 5' primer, kg50 (aaaaagcaggctcaaaaATGTCAACTTCAAAAAGTTCCAAG, SEQ ID NO. 25), derived from the sequence information from pKG43, with the 3' primer kg26 (SEQ ID NO. 22).
[0185]pKG123 (pak-3b wild-type; DSM 16152) was constructed by replacing an EcoRV-SacI restriction fragment (the C-terminal part of the kinase domain) in pKG44, containing deviations from the wild-type sequence, with the corresponding wild-type fragment from pKG58, thus creating a full-length, wild-type cDNA clone.
RNAi
[0186]RNA interference was done using the feeding method as described previously in Fraser et al. 2000 and Kamath et al. 2000. Double RNAi was done either by mixing bacterial cultures before seeding plates or by generation of vector constructs containing two cDNA fragments.
[0187]Vectors for RNAi by feeding were generated by cloning full-length or partial cDNAs into derivatives of the double T7 vector pPD129.36 (Timmons & Fire, Nature 395, 854). Either a Gateway-adapted version (pKG14) was used for cloning according to standard Gateway protocols (Invitrogen) or a version with a SrfI site added to the polylinker (pKG90) was used for direct cloning of PCR products as described (Schlotterer, C. and Wolff, C. Trends Genet, 1996, 12, 286-287).
RNAi Vectors:
[0188]pKG61 (dT7-pak-1); vector: pKG14; insert: bp 1-1710 of pak-1; the RNA is encoded from 89 to 1753 of SEQ ID NO. 26 (entire ORF) (SEQ ID NO. 26)
[0189]pKG65 (dT7-pak-3); vector: pKG14; insert: bp 1141-1941 of pak-3b (kinase domain); the RNA is encoded from 84 to 883 of SEQ ID NO. 27; specific for both pak-3a and pak-3b (SEQ ID NO. 27)
[0190]pKG71 (dT7-pak-3/-1); vector: pKG14; inserts: bp 1141-1941 of pak-3b (kinase domain) and bp 921-1710 of pak-1 (kinase domain); the RNA is encoded from 84 to 884 (pak1) and 885 to 1674 (pak3) of SEQ ID NO. 28; for double RNAi against pak-1/pak-3 (SEQ ID NO. 28)
[0191]pKG63 (dT7-pak-3a); vector: pKG14; insert: bp 1-128 of pak-3a (N-terminus); the RNA is encoded from 89 to 216 of SEQ ID NO. 29; specific for pak-3a (SEQ ID NO. 29)
[0192]pKG64 (dT7-pak-3b); vector: pKG14; insert: bp 1-788 of pak-3b (N-terminus); the RNA is encoded from 89 to 876 of SEQ ID NO. 30; specific for pak-3b (SEQ ID NO. 30)
[0193]pKG167 (dT7-ced-10); vector: pKG90; insert: bp 40-562 of c09g12.8b (ced-10); the RNA is encoded from 136 to 658 of SEQ ID NO. 31
pKG168 (dT7-mig-2); vector: pKG90; insert: bp 13-566 of c35c5.4 (mig-2); the RNA is encoded from 137 to 689 of SEQ ID NO. 32.
Assay for Phenotypic Analysis
[0194]Egg laying was scored by placing 5 or 10 [for 1. pak-1(RNAi); 2. pak-1(ok448); pak-3(RNAi); 3. pak-1(ok448); pak-3b(RNAi)] adult F1 generation worms (1st generation progeny from the P0 parents initially exposed to RNAi treatment) on plates (5 plates per RNAi treatment) for 5 h at 20° C., followed by manual counting of the eggs after removal of the adult worms. Embryonic lethality was defined as the number of eggs remaining 24 h after removal of the adult worms relative to the total number of eggs laid.
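The two read-outs defined above are simple ratios. The following Python sketch shows one way to compute them; the function names and the egg counts are hypothetical and serve only as an illustration, they are not data from the experiments described here.

```python
def relative_egg_lay(eggs_treated: int, eggs_control: int) -> float:
    """Eggs laid under RNAi treatment as a percentage of the control count."""
    if eggs_control == 0:
        raise ValueError("control laid no eggs; relative egg lay is undefined")
    return 100.0 * eggs_treated / eggs_control


def embryonic_lethality(eggs_laid: int, eggs_unhatched_24h: int) -> float:
    """Percentage of laid eggs still present 24 h after removal of the adults."""
    if eggs_laid == 0:
        raise ValueError("no eggs laid; lethality is undefined")
    return 100.0 * eggs_unhatched_24h / eggs_laid


# Hypothetical counts, for illustration only.
control_eggs, treated_eggs, treated_unhatched = 400, 60, 12
print(round(relative_egg_lay(treated_eggs, control_eggs), 1))
print(round(embryonic_lethality(treated_eggs, treated_unhatched), 1))
```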
[0195]Gonad morphology and distal tip cell (DTC) migration were scored essentially as described previously (Nishiwaki 1999 Genetics 152, 985-997; Su et al 2000, Development 127, 585-594). Briefly, F1 generation late L4 larvae or young adults were observed under Nomarski (DIC) optics using an Axioplan 2 microscope (Sulston and Horvitz 1977, Dev Biol 56, 110-156) and the trajectories of the DTCs were deduced from the shapes of the gonad arms. As a negative control worms were exposed to bacteria expressing the empty T7 vector. The DTC migration phenotypes were grouped into five different classes (FIG. 1a): I) wild-type, showing the typical C-shaped gonad with normal 1st and 2nd turns; II) Rac-type, typically observed in ced-10 and mig-2 mutants, with normal 1st and 2nd turns but with an additional 3rd turn so that the gonadal tip points away from the midbody region (Reddien and Horvitz 2000, Nature Cell Biol 2, 131-136); III) Pak-type, with a normal 1st turn but a 2nd turn in the wrong direction away from the midbody region (a similar phenotype has been described previously for the mutant mig-14 [Nishiwaki 1999]); IV) Straight, where the gonad progresses without any turns along the ventral side away from the midbody region; V) Other, mainly a complete lack of gonad outgrowth or ruptured gonads with free-floating germ cells in the body.
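The five-class scoring scheme above lends itself to a simple tally. The following Python sketch is a minimal illustration under stated assumptions: the class labels, the function name `summarize_dtc_scores` and the example scores are hypothetical, and "affected" is taken as every gonad not scored as wild-type, which is consistent with the percentages reported in Table II.

```python
from collections import Counter

# Short labels for classes I to V (hypothetical naming).
CLASSES = ("wt", "rac", "pak", "straight", "other")


def summarize_dtc_scores(scores):
    """Return per-class percentages plus the percentage of affected gonads."""
    if not scores:
        raise ValueError("no gonads scored")
    counts = Counter(scores)
    unknown = set(counts) - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown phenotype class(es): {sorted(unknown)}")
    n = len(scores)
    summary = {cls: 100.0 * counts[cls] / n for cls in CLASSES}
    # Every gonad that is not wild-type counts as affected.
    summary["affected"] = 100.0 - summary["wt"]
    summary["n"] = n
    return summary


# Hypothetical scores for ten gonad arms, for illustration only.
example = ["wt"] * 5 + ["pak"] * 4 + ["straight"]
print(summarize_dtc_scores(example))
```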
Compound Testing
[0196]For screening of candidate pak-3 inhibitory compounds, synchronized RB689 (pak-1, ok448) L1 larvae were obtained by NaOH/Na-hypochlorite treatment of gravid adults and subsequent incubation of the resulting eggs in M9 buffer O/N at 20° C. with agitation essentially as described (C. elegans, a practical approach, Ed. I. Hope, 1999). About 30 L1 larvae in NGM medium were mixed with 2 OD600 E. coli OP-50, 100 μM test compound, 1% DMSO (from compound stock solution) in a final volume of 50 μl per well in flat-bottomed 96-well plates and incubated at 20° C. for 3 to 4 days. Preliminary in-well scoring of gonad phenotypes was done using an Axiovert 200 microscope. For final scoring, worms were pipetted out of the wells, mounted and analyzed under Nomarski (DIC) optics. As a negative control worms were incubated with 1% DMSO.
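The well composition above implies a particular stock concentration. The following Python sketch works through the arithmetic under the assumption, not stated in the text, that the compound stock is dissolved in pure DMSO and is the only source of DMSO in the well; the function name is hypothetical.

```python
def stock_for_well(final_conc_um: float, final_volume_ul: float, dmso_fraction: float):
    """Return (stock volume in µl, stock concentration in µM) for one well.

    Assumes the stock is in 100% DMSO and supplies all DMSO in the well.
    """
    stock_volume_ul = final_volume_ul * dmso_fraction
    stock_conc_um = final_conc_um * final_volume_ul / stock_volume_ul
    return stock_volume_ul, stock_conc_um


# Values from the text: 100 µM compound, 50 µl per well, 1% DMSO.
volume_ul, conc_um = stock_for_well(100.0, 50.0, 0.01)
print(volume_ul, "µl of stock per well")
print(conc_um / 1000.0, "mM stock concentration")
```

Under these assumptions each well receives 0.5 µl of a 10 mM stock.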
Cloning of Pak-3 cDNAs and Identification of the Pak-3 Gene
[0197]The initial indication of a hitherto unknown pak gene in C. elegans came from the identification of a predicted open reading frame, y38f1a.10 (SEQ ID NO. 33), encoding a kinase domain with high homology to a pak-type kinase domain. (The kinase domain is also classified as a pak-type kinase domain in the kinase database "kinase.com" located on the world wide web). However, the predicted ORF y38f1a.10 does not encode a CRIB domain, the regulatory domain conserved within the PAK gene family. We noticed by sequence comparison that the EST yk651h1 (SEQ ID NO. 34, 35), covering the y38f1a.10 ORF, also contains parts of the predicted ORF f18a11.4 (SEQ ID NO. 36), located upstream of y38f1a.10. This suggested to us that these two predicted ORFs might in fact be one single gene. To test this we performed RT-PCR using a 3' gene-specific primer corresponding to the 3' end of y38f1a.10 and a 5' primer corresponding to the SL1 leader sequence spliced in trans to many C. elegans mRNAs. As it has been reported that the C. elegans pak-1 mRNA is SL1 trans-spliced, we suspected that this might also be the case for mRNAs from other C. elegans pak genes.
[0198]Sequence analysis of the RT-PCR products obtained revealed two different classes of mRNAs. The first class (isoforms a) corresponds roughly to the predicted ORF y38f1a.10 but with an additional exon upstream of the kinase domain. The second class (isoforms b) spans both ORFs y38f1a.10 and f18a11.4, thus demonstrating that these two ORFs belong to one single gene. However, the mRNAs have a 5' region longer than predicted in f18a11.4 and also longer than the EST yk651h1. Sequence analysis revealed a splicing pattern different from the ORF f18a11.4 and most importantly a domain with homology to a CRIB domain. Blast sequence database searches with cDNA isoforms b yielded a highest similarity score against human and rodent PAK3 and secondly against human and rodent PAK1.
[0199]Taken together this demonstrates that the two predicted ORFs y38f1a.10 (SEQ ID NO. 33) and f18a11.4 (SEQ ID NO. 36) are in fact one gene that codes for two different mRNA splice variants, a short form encoding a protein mainly consisting of a pak-type kinase domain and a 5' longer form encoding a typical PAK protein. Based on sequence similarity and biological function (see below) we propose to call this novel pak gene pak-3 with the short splice variant denoted pak-3a and the long form pak-3b.
RNAi in C. elegans Strains N2 (Wild Type) and RB689 (pak-1)
[0200]To assess the biological function of pak-3, RNAi by feeding experiments were performed. However, no obvious phenotypes could be detected when N2 wild-type worms were used for RNAi. Similarly, no phenotypes were observed when pak-1 function was assayed by RNAi, which was corroborated by observation of the pak-1 knock-out strain RB689, appearing completely wild-type in morphology and behavior.
[0201]Based on the similarity between pak-1 and pak-3 it was concluded that the lack of phenotypes in the RNAi experiments could be explained by supplementary functions of pak-1 and pak-3. To confirm this double RNAi experiments were conducted in the N2 background as well as pak-3 RNAi in the pak-1 knock-out strain RB689 (ok448). Similar results were obtained in both approaches, showing several drastic phenotypes: sterility, embryonic lethality and defects in the gonad migration pattern. Sterility was not completely penetrant but reproducibly shown to be very strong and readily visible. The result of a representative quantitative experiment is shown in Table I. Compared to the control worms exposed to mock RNAi treatment, the relative number of eggs laid by animals exposed to pak-1; pak-3 double RNAi was only 17%, when double RNAi was performed by mixing pak-1 and pak-3 RNAi bacterial cultures. When double RNAi was done using bacteria expressing a hybrid pak-1/pak-3 double RNA molecule the effect was somewhat stronger, 11%, suggesting that double RNAi by mixing of cultures is only moderately less efficient than the use of a dedicated double RNAi vector.
[0202]In pak-1 lf (ok448); pak-3(RNAi) animals sterility was even more penetrant, with egg laying at only 3% of that of pak-1 lf (ok448); mock(RNAi) animals. When compared with the results obtained in the N2 background, this indicates that pak-1 RNAi is not as penetrant as the complete knock-out, which can be expected.
[0203]Embryonic lethality was initially observed from the presence of small, round eggs that did not hatch upon prolonged incubation. Closer examination of these eggs suggested a high degree of cellular differentiation, for example muscle and pharynx tissue was clearly present. However, the overall morphology of the embryos was distorted, ranging from moderate to very severe with no morphological features conserved. A quantitative analysis (Table I) demonstrated more than 20% embryonic lethality in N2 animals and almost 40% in the RB689 background.
[0204]Interestingly, the phenotypes could not be observed in the first generation (P0) of worms exposed to RNAi; sterility and gonad defects were first observed in the F1 generation. Embryonic lethality was first seen in F2 generation embryos, suggesting maternal rescue in the F1 generation.
[0205]The cloning of pak-3 cDNAs had revealed the existence of two different splice variants, pak-3a and pak-3b. The functional importance of the two forms was assessed by conducting isoform-specific RNAi in the RB689 background. As assayed from both the sterility phenotype and embryonic lethality, it appears that the longer isoform pak-3b plays the major role with respect to the phenotypes observed (Table I).
[0206]A third pak gene is encoded by the predicted gene c45b11.1, which is most similar to human PAK-4. It is known that mammalian PAK-4 differs significantly in regulation and function from PAK1 and PAK3. In agreement with this, no additional phenotypes were observed in double RNAi experiments between c45b11.1 and pak-1 or pak-3.
pak-3 and pak-1 are Required for DTC Pathfinding
[0207]In the C. elegans hermaphrodite the shape of the bi-lobed gonad is determined by the paths of cell migration of the gonadal distal tip cells (DTCs). In wild-type animals the two gonadal arms develop from the ventrally located gonadal primordium in the midbody. One DTC migrates anteriorly and the other posteriorly close to the ventral midline. The migration of the DTCs then undergoes two turns, the first turn towards the dorsal side and the second turn towards the midbody. The result of these migrations is the formation of the two symmetrical C-shaped adult gonad arms. As mentioned above, deviations from the wild-type gonad shape, indicative of defects in DTC migration, were noted in pak-3(RNAi); pak-1(RNAi) and pak-3(RNAi); RB689(ok448) animals, but not in single pak-3(RNAi) or pak-1(RNAi) animals or in the RB689 strain itself. Thus, for this phenotype too, pak-1 and pak-3 appear to act in a supplementary manner. In more than half of the gonads observed the first turn appeared normal whereas the second turn was in the wrong direction, i.e. instead of turning towards the midbody, the posterior gonad continued posteriorly and the anterior continued anteriorly (FIG. 1). Occasionally gonads without any turns were observed, the gonads continuing along the ventral midline towards the posterior and anterior end of the animal, respectively.
[0208]The pak-3 isoform-specific effects on DTC migration were also analyzed by pak-3b and pak-3a RNAi in the RB689 background. The results demonstrate that only the pak-3b isoform is important for DTC migration, as is the case for the sterility and embryonic lethality phenotypes (Table II).
Pak-Rac Interaction
[0209]It has previously been described that two of the three Rac GTPases in C. elegans, ced-10 and mig-2, are involved in DTC pathfinding (Reddien & Horvitz, 2000, Lundquist et al 2001). In ced-10 and mig-2 mutants the gonads undergo a third, extra, turn after the second turn, leading to gonad tips pointing away from the midbody. This phenotype is different from what was observed in pak-1; pak-3 mutant animals, in which the second turn was already defective. It is furthermore known from in vitro studies and mammalian cell systems (e.g. Bishop & Hall Biochem J, 2000) that Rho GTPases, to which the Rac proteins belong, are upstream regulators of PAKs. Given that the C. elegans paks are also important for DTC pathfinding, it was deduced that there is an interaction between pak-1 and pak-3 and the two Racs ced-10 and mig-2 in C. elegans gonad development. To demonstrate this a set of RNAi experiments was performed in different genetic backgrounds (summarized in Table II). The different experimental combinations consistently showed that mig-2 or ced-10 loss of function did not lead to a stronger phenotype in combination with pak-3 than the separate single loss of function mutants. However, in combination with pak-1 mutants the penetrance and severity of the gonad migration defects increased dramatically. As pak-1 and pak-3 act in a supplementary manner, these results suggest that ced-10 and mig-2 act as upstream regulators of pak-3 but not, or only to a minor extent, of pak-1. Interestingly, the ced-10; mig-2; pak-1 triple mutant animals were much more strongly affected than pak-1; pak-3 double mutants, suggesting that the two Racs also act through other pathways in parallel to pak-3. Furthermore, not only was the penetrance of DTC pathfinding defects higher in ced-10; mig-2; pak-1 animals, but the phenotypic spectrum also shifted towards more severe pathfinding and migration defects. High frequencies of gonads without turns and of gonad movement defects were observed. This demonstrates that both Paks and Racs are involved in several stages and aspects of DTC pathfinding but that these functions are not evident in the single or double mutants, probably as an effect of the redundant functions of these genes.
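The comparison underlying this argument can be made explicit with the "% affected gonads" values reported for the N2 background in Table II. The following Python sketch is only an illustration: the function name and the additive expectation (sum of the two single-RNAi penetrances) are simplifying assumptions introduced here and are not the analysis used in the application.

```python
# % affected gonads in the N2 background, taken from Table II.
AFFECTED = {
    ("pak-1",): 0.0,
    ("pak-3",): 0.7,
    ("mig-2",): 8.8,
    ("ced-10",): 9.5,
    ("mig-2", "pak-1"): 25.0,
    ("mig-2", "pak-3"): 11.2,
    ("ced-10", "pak-1"): 50.8,
    ("ced-10", "pak-3"): 3.4,
}


def excess_over_additive(rac: str, pak: str) -> float:
    """Double-RNAi penetrance minus the sum of the two single-RNAi penetrances."""
    return AFFECTED[(rac, pak)] - (AFFECTED[(rac,)] + AFFECTED[(pak,)])


for rac in ("mig-2", "ced-10"):
    for pak in ("pak-1", "pak-3"):
        print(rac, pak, round(excess_over_additive(rac, pak), 1))
```

With these values the excess is large for the combinations with pak-1 and close to zero or negative for the combinations with pak-3, which mirrors the qualitative conclusion drawn in the text.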
Compound Mimicking RNAi Phenotype
[0210]To investigate if the gonad migration defect phenotype can be used as a reporter for PAK-3 inhibitory small molecules, worms were exposed to a set of potential PAK inhibitors, derived from a chemical compound collection, in a 96-well assay format. Synchronized RB689 (pak-1, ok448) L1 larvae were incubated with test compounds in NGM (media) and E. coli OP-50 as food source. As the compounds were added as DMSO solutions, worms exposed to DMSO were used as a control. At late L4 or early young adult stage, gonad phenotypes were scored.
[0211]Several of the 14 substances tested showed a partial effect on gonad migration, causing phenotypes similar to those seen in pak-1; pak-3 (RNAi) animals. In particular, one compound tested, A000025706, was shown to reproducibly cause gonad migration defects (Table III). Out of 100 gonads analyzed, 74 were found to have gonad migration defects.
[0212]The observation that the types of defects observed differ somewhat from those observed with RNAi is possibly due to pharmacological properties of the compound, e.g. uptake and stability. It is also possible that other kinases involved in gonad development and other developmental processes are also inhibited by A000025706. In fact, we observed general growth retardation in worms treated with this compound, suggesting a certain degree of non-specific effects of A000025706. However, as A000025706 has been confirmed as a PAK inhibitor in other assays (data not shown) we believe that most or all of the gonad migration defects observed can be attributed to a specific inhibition of PAK-3.
[0213]The observation that PAK inhibitors can be identified in a C. elegans-based assay demonstrates the usefulness of this model organism as a tool for pharmacological research. The fact that growth retardation was observed also exemplifies that potential side effects can be identified in parallel to the specific assay readout, i.e. that C. elegans-based assays can be valuable as high-throughput screening systems.
TABLE-US-00002 TABLE I
Strain | RNAi treatment | Genotype | % Eggs laid | % Emb Lethality
N2 | ctrl | Wild-type | 100 | 0.2
N2 | pak-1 | pak-1(RNAi) | 113 | 2.2
N2 | pak-3 | pak-3(RNAi) | 69 | 5.0
N2 | pak-1/-3 (one vector) | pak-1(RNAi); pak-3(RNAi) | 11 | 20.8
N2 | pak-1/pak-3 (mixed vectors) | pak-1(RNAi); pak-3(RNAi) | 17 | 15.4
RB689 | ctrl | pak-1(ok448) | 100 | 4.4
RB689 | pak-3 | pak-1(ok448); pak-3(RNAi) | 3 | 38.7
RB689 | pak-3a | pak-1(ok448); pak-3a(RNAi) | 134 | 4.0
RB689 | pak-3b | pak-1(ok448); pak-3b(RNAi) | 2 | 22.7
TABLE-US-00003 TABLE II
Columns % rac, % pak and % straight: pathfinding defects; column % other: movement defects.
Strain | RNAi construct | Genotype | % wt | % rac | % pak | % straight | % other | % affected gonads | n
N2 | ctrl | Wild type | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 168
N2 | pak-1 | pak-1(RNAi) | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 154
N2 | pak-3 | pak-3(RNAi) | 99.3 | 0.0 | 0.0 | 0.7 | 0.0 | 0.7 | 150
N2 | pak-1/-3 (one vector) | pak-1(RNAi); pak-3(RNAi) | 41.7 | 0.0 | 54.3 | 3.5 | 0.4 | 58.3 | 230
N2 | pak-1/pak-3 (mixed vectors) | pak-1(RNAi); pak-3(RNAi) | 44.6 | 0.0 | 52.2 | 3.3 | 0.0 | 55.4 | 92
N2 | mig-2 | mig-2(RNAi) | 91.2 | 8.1 | 0.7 | 0.0 | 0.0 | 8.8 | 136
N2 | mig-2/pak-1 | mig-2(RNAi); pak-1(RNAi) | 75.0 | 1.6 | 22.6 | 0.8 | 0.0 | 25.0 | 124
N2 | mig-2/pak-3 | mig-2(RNAi); pak-3(RNAi) | 88.8 | 7.2 | 2.0 | 2.0 | 0.0 | 11.2 | 152
N2 | mig-2/ced-10 | mig-2(RNAi); ced-10(RNAi) | 92.7 | 5.6 | 1.6 | 0.0 | 0.0 | 7.3 | 124
N2 | ced-10 | ced-10(RNAi) | 90.5 | 8.8 | 0.7 | 0.0 | 0.0 | 9.5 | 148
N2 | ced-10/pak-1 | ced-10(RNAi); pak-1(RNAi) | 49.2 | 1.6 | 36.7 | 12.5 | 0.0 | 50.8 | 128
N2 | ced-10/pak-3 | ced-10(RNAi); pak-3(RNAi) | 96.6 | 3.4 | 0.0 | 0.0 | 0.0 | 3.4 | 146
RB689 pak-1(ok448) | ctrl | pak-1(ok448) | 99.5 | 0.0 | 0.5 | 0.0 | 0.0 | 0.5 | 198
RB689 pak-1(ok448) | pak-3 | pak-1(ok448); pak-3(RNAi) | 44.1 | 0.0 | 49.6 | 4.7 | 1.7 | 55.9 | 236
RB689 pak-1(ok448) | pak-3a | pak-1(ok448); pak-3a(RNAi) | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 80
RB689 pak-1(ok448) | pak-3b | pak-1(ok448); pak-3b(RNAi) | 31.7 | 0.0 | 54.9 | 2.4 | 11.0 | 68.3 | 82
RB689 pak-1(ok448) | mig-2 | pak-1(ok448); mig-2(RNAi) | 55.6 | 0.0 | 33.8 | 10.6 | 0.0 | 44.4 | 160
RB689 pak-1(ok448) | mig-2/pak-3 | pak-1(ok448); pak-3(RNAi); mig-2(RNAi) | 57.8 | 0.0 | 24.7 | 16.9 | 0.6 | 42.2 | 154
RB689 pak-1(ok448) | mig-2/ced-10 | pak-1(ok448); mig-2(RNAi); ced-10(RNAi) | 29.8 | 0.0 | 13.7 | 55.6 | 0.8 | 70.2 | 124
RB689 pak-1(ok448) | ced-10 | pak-1(ok448); ced-10(RNAi) | 37.8 | 1.2 | 32.3 | 28.0 | 0.6 | 62.2 | 164
RB689 pak-1(ok448) | ced-10/pak-3 | pak-1(ok448); pak-3(RNAi); ced-10(RNAi) | 43.8 | 3.1 | 34.6 | 18.5 | 0.0 | 56.2 | 162
CF162 mig-2(mu28) | ctrl | mig-2(mu28) | 71.4 | 28.6 | 0.0 | 0.0 | 0.0 | 28.6 | 126
CF162 mig-2(mu28) | pak-1 | pak-1(RNAi); mig-2(mu28) | 40.5 | 3.2 | 43.7 | 9.5 | 3.2 | 59.5 | 126
CF162 mig-2(mu28) | pak-3 | pak-3(RNAi); mig-2(mu28) | 74.6 | 23.0 | 0.0 | 2.4 | 0.0 | 25.4 | 126
CF162 mig-2(mu28) | pak-1/-3 | pak-1(RNAi); pak-3(RNAi); mig-2(mu28) | 5.5 | 0.8 | 43.0 | 48.4 | 2.3 | 94.5 | 128
CF162 mig-2(mu28) | mig-2 | mig-2(mu28); mig-2(RNAi) | 65.9 | 34.1 | 0.0 | 0.0 | 0.0 | 34.1 | 88
CF162 mig-2(mu28) | ced-10 | mig-2(mu28); ced-10(RNAi) | 59.8 | 14.8 | 13.9 | 7.4 | 4.1 | 40.2 | 122
CF162 mig-2(mu28) | ced-10/pak-1 | pak-1(RNAi); mig-2(mu28); ced-10(RNAi) | 14.1 | 4.7 | 20.3 | 50.0 | 10.9 | 85.9 | 128
CF162 mig-2(mu28) | ced-10/pak-3 | pak-3(RNAi); mig-2(mu28); ced-10(RNAi) | 74.6 | 14.8 | 4.1 | 2.5 | 4.1 | 25.4 | 122
MT5013 ced-10(n1993) | ctrl | ced-10(n1993) | 77.3 | 21.4 | 0.0 | 0.6 | 0.6 | 22.7 | 154
MT5013 ced-10(n1993) | pak-1 | pak-1(RNAi); ced-10(n1993) | 47.4 | 1.3 | 36.4 | 14.3 | 0.6 | 52.6 | 154
MT5013 ced-10(n1993) | pak-3 | pak-3(RNAi); ced-10(n1993) | 82.5 | 14.3 | 1.3 | 1.3 | 0.6 | 17.5 | 154
MT5013 ced-10(n1993) | pak-1/-3 | pak-1(RNAi); pak-3(RNAi); ced-10(n1993) | 49.7 | 0.7 | 43.0 | 2.6 | 4.0 | 50.3 | 151
MT5013 ced-10(n1993) | mig-2 | mig-2(RNAi); ced-10(n1993) | 81.7 | 6.3 | 4.8 | 1.6 | 5.6 | 18.3 | 126
MT5013 ced-10(n1993) | mig-2/pak-1 | pak-1(RNAi); mig-2(RNAi); ced-10(n1993) | 11.3 | 2.4 | 16.1 | 50.8 | 19.4 | 88.7 | 124
MT5013 ced-10(n1993) | mig-2/pak-3 | pak-3(RNAi); mig-2(RNAi); ced-10(n1993) | 87.1 | 8.9 | 0.0 | 0.0 | 4.0 | 12.9 | 124
MT5013 ced-10(n1993) | ced-10 | ced-10(n1993); ced-10(RNAi) | 75.0 | 9.2 | 5.3 | 0.0 | 10.5 | 25.0 | 76
TABLE-US-00004 TABLE III
Strain | Cpd | % wt | % pak-like | % Straight | % Movement Def | % affected | n
RB689 pak-1(ok448) | ctrl | 99 | 0 | 0 | 1 | 1 | 80
RB689 pak-1(ok448) | 25706 | 26 | 4 | 51 | 19 | 74 | 100
Sequence CWU
1
3611281DNACaenorhabditis elegans 1atgtttcaaa atagtccgat gatgtacgac
tggtggaatg acaccaccaa accgaaacac 60cagcagccga cacttaacgt gttgtcacca
tggggagcat atttcaatca cattggaaat 120gaactgctgc atctgaaaat cgcatcgtcg
acagtatcct cgggatgctc gtctccacaa 180cagtattcgt ctgctcgatc cgttggtaac
tcgctctcca acggcagtgt tgtctccaca 240acatcgtcag atggtgatgt gcaattgtcg
aataaggaaa attcgaatga caaatcagtt 300ggagacaaga atgggaacac caccacaaac
aaaacgaccg tcgaaccacc tccaccagaa 360gagccacctg ttcgtgttcg agcatctcat
cgtgaaaagc tttctgattc cgaagtgctc 420aatcaactcc gcgagattgt taatccaagt
aatccacttg gaaagtacga gatgaagaag 480caaatcggtg ttggagcatc cggaactgta
ttcgttgcta atgtggccgg cagcactgat 540gtggtggctg tgaagagaat ggctttcaag
actcagccga agaaggagat gttgctcacc 600gagattaagg ttatgaagca gtatcgacac
ccgaacctcg tcaactacat tgaatcgtat 660ctggttgatg ctgatgatct ttgggtagtg
atggattatc tggaaggtgg aaacttgaca 720gatgtcgttg tgaagactga gttggacgaa
ggacaaattg cagcagtttt gcaagaatgt 780cttaaagcgc ttcacttcct tcatagacac
tccatagtgc accgagatat caagagtgac 840aacgtgctgc tcggcatgaa cggagaggtt
aagctcaccg atatgggatt ctgtgctcag 900attcagccgg gatcgaaaag agatactgtc
gtcggaactc catattggat gtcgccggag 960atattgaaca agaagcagta caactataag
gttgacattt ggtcgctggg aattatggct 1020ctagagatga ttgatggaga gccaccatat
ttgagagaaa cacctttgaa ggctatctac 1080ttgattgctc aaaacgggaa gccagagatc
aagcaacgcg acagactgtc ttcagagttc 1140aacaatttcc ttgacaagtg tcttgttgtt
gatccggatc agagagccga tacaacggag 1200ctcttggcac atccattcct gaaaaaggcg
aagccactct caagcctgat tccatacatc 1260agagccgtcc gagaaaagta g
1281
SEQ ID NO. 2: length 1941, DNA, Caenorhabditis elegans
atgtcaactt caaaaagttc caaggtgcga atacggaatt tcatcgggcg aatcttctct
60cccagcgata aagacaagga tcgagacgat gagatgaagc catcctcgtc cgcaatggat
120attagtcagc catataacac agtgcatcga gtccacgttg gatacgacgg ccagaagttc
180agcggactgc cgcaaccatg gatggatatt cttctccgag acattagtct tgccgatcag
240aagaaggatc cgaacgcggt ggtgactgcg ttgaagttct acgcacaatc aatgaaggag
300aacgagaaga cgaaattcat gacgacgaat agtgttttca cgaatagcga tgacgatgat
360gtggacgttc agttgaccgg acaagtcacg gaacatttga ggaatttgca gtgtagtaat
420ggttccgcaa cttccccatc tacatcagtg tcagcttcat cttcttctgc tcgtccactg
480acaaatggaa ataatcatct ttccacggcg tcgtctaccg acacatctct ctcattatcg
540gaaaggaata acgttccgtc tccagctcca gttccatata gtgaaagtgc tccacaactg
600aaaacattca ccggagagac tccaaaactg catccacgat ctccgttccc gcctcaaccg
660ccagttcttc cgcaacgaag caaaaccgca tcggcagtgg cgacgacgac gacgaatccg
720acgacttcga atggagcacc accaccagtt cctggatcga aaggaccccc ggtgccaccg
780aaaccatcgc atctgaaaat cgcatcgtcg acagtatcct cgggatgctc gtctccacaa
840cagtattcgt ctgctcgatc cgttggtaac tcgctctcca acggcagtgt tgtctccaca
900acatcgtcag atggtgatgt gcaattgtcg aataaggaaa attcgaatga caaatcagtt
960ggagacaaga atgggaacac caccacaaac aaaacgaccg tcgaaccacc tccaccagaa
1020gagccacctg ttcgtgttcg agcatctcat cgtgaaaagc tttctgattc cgaagtgctc
1080aatcaactcc gcgagattgt taatccaagt aatccacttg gaaagtacga gatgaagaag
1140caaatcggtg ttggagcatc cggaactgta ttcgttgcta atgtggccgg cagcactgat
1200gtggtggctg tgaagagaat ggctttcaag actcagccga agaaggagat gttgctcacc
1260gagattaagg ttatgaagca gtatcgacac ccgaacctcg tcaactacat tgaatcgtat
1320ctggttgatg ctgatgatct ttgggtagtg atggattatc tggaaggtgg aaacttgaca
1380gatgtcgttg tgaagactga gttggacgaa ggacaaattg cagcagtttt gcaagaatgt
1440cttaaagcgc ttcacttcct tcatagacac tccatagtgc accgagatat caagagtgac
1500aacgtgctgc tcggcatgaa cggagaggtt aagctcaccg atatgggatt ctgtgctcag
1560attcagccgg gatcgaaaag agatactgtc gtcggaactc catattggat gtcgccggag
1620atattgaaca agaagcagta caactataag gttgacattt ggtcgctggg aattatggct
1680ctagagatga ttgatggaga gccaccatat ttgagagaaa cacctttgaa ggctatctac
1740ttgattgctc aaaacgggaa gccagagatc aagcaacgcg acagactgtc ttcagagttc
1800aacaatttcc ttgacaagtg tcttgttgtt gatccggatc agagagccga tacaacggag
1860ctcttggcac atccattcct gaaaaaggcg aagccactct caagcctgat tccatacatc
1920agagccgtcc gagaaaagta g
1941
SEQ ID NO. 3: length 1941, DNA, Caenorhabditis elegans
atgtcaactt caaaaagttc caaggtgcga
atacggaatt tcatcgggcg aatcttctct 60cccagcgata aagacaagga tcgagacgat
gagatgaagc catcctcgtc cgcaatggat 120attagtcagc catataacac agtgcatcga
gtccacgttg gatacgacgg ccagaagttc 180agcggactgc cgcaaccatg gatggatatt
cttctccgag acattagtct tgccgatcag 240aagaaggatc cgaacgcggt ggtgactgcg
ttgaagttct acgcacaatc aatgaaggag 300aacgagaaga cgaaattcat gacgacgaat
agtgttttca cgaatagcga tgacgatgat 360gtggacgttc agttgaccgg acaagtcacg
gaacatttga ggaatttgca gtgtagtaat 420ggttccgcaa cttccccatc tacatcagtg
tcagcttcat cttcttctgc tcgtccactg 480acaaatggaa ataatcatct ttccacggcg
tcgtctaccg acacatctct ctcattatcg 540gaaaggaata acgttccgtc tccagctcca
gttccatata gtgaaagtgc tccacaactg 600aaaacattca ccggagagac tccaaaactg
catccacgat ctccgttccc gcctcaaccg 660ccagttcttc cgcaacgaag caaaaccgca
tcggcagtgg cgacgacgac gacgaatccg 720acgacttcga atggagcacc accaccagtt
cctggatcga aaggaccccc ggtgccaccg 780aaaccatcgc atctgaaaat cgcatcgtcg
acagtatcct cgggatgctc gtctccacaa 840cagtattcgt ctgctcgatc cgttggtaac
tcgctctcca acggcagtgt tgtctccaca 900acatcgtcag atggtgatgt gcaattgtcg
aataaggaaa attcgaatga caaatcagtt 960ggagacaaga atgggaacac caccacaaac
aaaacgaccg tcgaaccacc tccaccagaa 1020gagccacctg ttcgtgttcg agcatctcat
cgtgaaaagc tttctgattc cgaagtgctc 1080aatcaactcc gcgagattgt taatccaagt
aatccacttg gaaagtacga gatgaagaag 1140caaatcggtg ttggagcatc cggaactgta
ttcgttgcta atgtggccgg cagcactgat 1200gtggtggctg tgaagagaat ggctttcaag
actcagccga agaaggagat gttgctcacc 1260gagattaagg ttatgaagca gtatcgacac
ccgaacctcg tcaactacat tgaatcgtat 1320ctggttgatg ctgatgatct ttgggtagtg
atggattatc tggaaggtgg aaacttgaca 1380gatgtcgttg tgaagactga gttggacgaa
ggacaaattg cagcagtttt gcaagaatgt 1440cttaaagcgc ttcacttcct tcatagacac
tccatagtgc accgagatat caagagtgac 1500aacgtgctgc tcggcatgaa cggagaggtt
aagctcaccg atatgggatt ctgtgctcag 1560attcagccgg gatcgaaaag agatactgtc
gtcggaactc catattggat gtcgccggag 1620atattgaaca agaagcagta caactataag
gttgacattt ggtcgctggg aattatggcc 1680ctagagatga ttgatggaga gccaccatat
ttgagagaaa cacctttgaa ggctatctac 1740ttgattgctc aaaacgggaa gccagagatc
aagcaacgcg acagactgtc ttcagagttc 1800aacaatttcc ttgacaagtg tcttgttgtt
gatccggatc agagagccga tacaacggag 1860ctcttggcac atccattcct gaaaaaggcg
aagccactct caagcctgat tccatacatc 1920agagccgtcc gagaaaagta g
1941
SEQ ID NO. 4: length 1947, DNA, Caenorhabditis elegans
atgtcaactt caaaaagttc caaggtgcga atacggaatt tcatcgggcg aatcttctct
60cccagcgata aagacaagga tcgagacgat gagatgaagc catcctcgtc cgcaatggat
120attagtcagc catataacac agtgcatcga gtccacgttg gatacgacgg ccagaagttc
180agcggactgc cgcaaccatg gatggatatt cttctccgag acattagtct tgccgatcag
240aagaaggatc cgaacgcggt ggtgactgcg ttgaagttct acgcacaatc aatgaaggag
300aacgagaaga cgaaattcat gacgacgaat agtgttttca cgaatagcga tgacgatgat
360gtggacgttc agttgaccgg acaagtcacg gaacatttga ggaatttgca gtgtagtaat
420ggttccgcaa cttccccatc tacatcagtg tcagcttcat cttcttctgc tcgtccactg
480acaaatggaa ataatcatct ttccacggcg tcgtctaccg acacatctct ctcattatcg
540gaaaggaata acgttccgtc tccagctcca gttccatata gtgaaagtgc tccacaactg
600aaaacattca ccggagagac tccaaaactg catccacgat ctccgttccc gcctcaaccg
660ccagttcttc cgcaacgaag caaaaccgca tcggcagtgg cgacgacgac gacgaatccg
720acgacttcga atggagcacc accaccagtt cctggatcga aaggaccccc ggtgccaccg
780aaaccatcgc atctgaaaat cgcatcgtcg acagtatcct cgggatgctc gtctccacaa
840cagtattcgt ctgctcgatc cgttggtaac tcgctctcca acggcagtgt tgtctccaca
900acatcgtcag atggtgatgt gcaattgtcg aataaggaaa attcgaatga caaatcagtt
960ggagacaaga atgggaacac caccacaaac aaaacgaccg tcgaaccacc tccaccagaa
1020gagccacctg ttcgtgttcg agcatctcat cgtgaaaagc tttctgattc cgaagtgctc
1080aatcaactcc gcgagattgt taatccaagt aatccacttg gaaagtacga gatgaagaag
1140caaatcggtg ttggagcatc cggaactgta ttcgttgcta atgtggccgg cagcactgat
1200gtggtggctg tgaagagaat ggctttcaag actcagccga agaaggagat gttgctcacc
1260gagattaagg ttatgaagca gtatcgacac ccgaacctcg tcaactacat tgaatcgtat
1320ctggttgatg ctgatgatct ttgggtagtg atggattatc tggaaggtgg aaacttgaca
1380gatgtcgttg tgaagactga gttggacgaa ggacaaattg cagcagtttt gcaagaatgt
1440cttaaagcgc ttcacttcct tcatagacac tccatagtgc accgagatat caagagtgac
1500aacgtgctgc tcggcatgaa cggagaggtt aagctcaccg atatgggatt ctgtgctcag
1560attcagccgg gatcgaaaag ttgtagagat actgtcgtcg gaactccata ttggatgtcg
1620ccggagatat tgaacaagaa gcagtacaac tataaggttg acatttggtc gctgggaatt
1680atggccctag agatgattga tggagagcca ccatatttga gagaaacacc tttgaaggct
1740atctacttga ttgctcaaaa cgggaagcca gagatcaagc aacgcgacag actgtcttca
1800gagttcaaca atttccttga caagtgtctt gttgttgatc cggatcagag agccgataca
1860acggagctct tggcacatcc attcctgaaa aaggcgaagc cactctcaag cctgattcca
1920tacatcagag ccgtccgaga aaagtag
1947
SEQ ID NO. 5: length 1806, DNA, Caenorhabditis elegans
atgtcaactt caaaaagttc caaggtgcga
atacggaatt tcatcgggcg aatcttctct 60cccagcgata aagacaagga tcgagacgat
gagatgaagc catcctcgtc cgcaatggat 120attagtcagc catataacac agtgcatcga
gtccacgttg gatacgacgg ccagaagttc 180agcggactgc cgcaaccatg gatggatatt
cttctccgag acattagcta tttcagtctt 240gccgatcaga agaaggatcc gaacgcggtg
gtgactgcgt tgaagttcta cgcacaatca 300atgaaggaga acgagaagac gaaattcatg
acgacgaata gtgttttcac gaatagcgat 360gacgatgatg tggacgttca gttgaccgga
caagtcacgg aacatttgag gaatttgcag 420tgtagtaatg gttccgcaac ttccccatct
acatcagtgt cagcttcatc ttcttctgct 480cgtccactga caaatggaaa taatcatctt
tccacggcgt cgtctaccga cacatctctc 540tcattatcgg aaaggaataa cgttccgtct
ccagctccag ttccatatag tgaaagtgct 600ccacaactga aaacattcac cggagagact
ccaaaactgc atccacgatc tccgttcccg 660cctcaaccgc cagttcttcc gcaacgaagc
aaaaccgcat cggcagtggc gacgacgacg 720acgaatccga cgacttcgaa tggagcacca
ccaccagttc ctggatcgaa aggacccccg 780gtgccaccga aaccatcgaa ggaaaattcg
aatgacaaat cagttggaga caagaatggg 840aacaccacca caaacaaaac gaccgtcgaa
ccacctccac cagaagagcc acctgttcgt 900gttcgagcat ctcatcgtga aaagctttct
gattccgaag tgctcaatca actccgcgag 960attgttaatc caagtaatcc acttggaaag
tacgagatga agaagcaaat cggtgttgga 1020gcatccggaa ctgtattcgt tgctaatgtg
gccggcagca ctgatgtggt ggctgtgaag 1080agaatggctt tcaagactca gccgaagaag
gagatgttgc tcaccgagat taaggttatg 1140aagcagtatc gacacccgaa cctcgtcaac
tacattgaat cgtatctggt tgatgctgat 1200gatctttggg tagtgatgga ttatctggaa
ggtggaaact tgacagatgt cgttgtgaag 1260actgagttgg acgaaggaca aattgcagca
gttttgcaag aatgtcttaa agcgcttcac 1320ttccttcata gacactccat agtgcaccga
gatatcaaga gtgacaacgt gctgctcggc 1380atgaacggag aggttaagct caccgatatg
ggattctgtg ctcagattca gccgggatcg 1440aaaagagata ctgtcgtcgg aactccatat
tggatgtcgc cggagatatt gaacaagaag 1500cagtacaact ataaggttga catttggtcg
ctgggaatta tggctctaga gatgattgat 1560ggagagccac catatttgag agaaacacct
ttgaaggcta tctacttgat tgctcaaaac 1620gggaagccag agatcaagca acgcgacaga
ctgtcttcag agttcaacaa tttccttgac 1680aagtgtcttg ttgttgatcc ggatcagaga
gccgatacaa cggagctctt ggcacatcca 1740ttcctgaaaa aggcgaagcc actctcaagc
ctgattccat acatcagagc cgtccgagaa 1800aagtag
1806
SEQ ID NO. 6: length 1947, DNA, Caenorhabditis elegans
atgtcaactt caaaaagttc caaggtgcga atacggaatt tcgtcgggcg aatcttctct
60cccagcgata aagacaagga tcgagacgat gagatgaagc catcctcgtc cgcaatggat
120attagtcagc catataacac agtgcatcga gtccacgttg gatacgacgg ccagaagttc
180agcggactgc cgcaaccatg gatggatatt cttctccgag acattagtct tgccgatcag
240aagaaggatc cgaacgcggt ggtgactgcg ttgaagttct acgcacaatc aatgaaggag
300aacgagaaga cgaaattcat gacgacgaat agtgttttca cgaatagcga tgacgatgat
360gtggacgttc agttgaccgg acaagtcacg gaacatttga ggaatttgca gtgtagtaat
420ggttccgcaa cttccccatc tacatcagtg tcagcttcat cttcttctgc tcgtccactg
480acaaatggaa ataatcatct ttccacggcg tcgtctaccg acacatctct ctcattatcg
540gaaaggaata acgttccgtc tccagctcca gttccatata gtgaaagtgc tccacaactg
600aaaacattca ccggagagac tccaaaactg catccacgat ctccgttccc gcctcaaccg
660ccagttcttc cgcaacgaag caaaaccgca tcggcagtgg cgacgacgac gacgaatccg
720acgacttcga atggagcacc accaccagtt cctggatcga aaggaccccc ggtgccaccg
780aaaccatcgc atctgaaaat cgcatcgtcg acagtatcct cgggatgctc gtctccacaa
840cagtattcgt ctgctcgatc cgttggtaac tcgctctcca acggcagtgt tgtctccaca
900acatcgtcag atggtgatgt gcaattgtcg aataaggaaa attcgaatga caaatcagtt
960ggagacaaga atgggaacac caccacaaac aaaacgaccg tcgaaccacc tccaccagaa
1020gagccacctg ttcgtgttcg agcatctcat cgtgaaaagc tttctgattc cgaagtgctc
1080aatcaactcc gcgagattgt taatccaagt aatccacttg gaaagtacga gatgaagaag
1140caaatcggtg ttggagcatc cggaactgta ttcgttgcta atgtggccgg cagcactgat
1200gtggtggctg tgaagagaat ggctttcaag actcagccga agaaggagat gttgctcacc
1260gagattaagg ttatgaagca gtatcgacac ccgaacctcg tcaactacat tgaatcgtat
1320ctggttgatg ctgatgatct ttgggtagtg atggattatc tggaaggtgg aaacttgaca
1380gatgtcgttg tgaagactga gttggacgaa ggacaaattg cagcagtttt gcaagaatgt
1440cttaaagcgc ttcacttcct tcatagacac tccatagtgc accgagatat caagagtgac
1500aacgtgctgc tcggcatgaa cggagaggtt aagctcaccg atatgggatt ctgtgctcag
1560attcagccgg gatcgaaaag ttgtagagat actgtcgtcg gaactccata ttggatgtcg
1620ccggagatat tgaacaagaa gcagtacaac tataaggttg acatttggtc gctgggaatt
1680atggctctag agatgattga tggagagcca ccatatttga gagaaacacc tttgaaggct
1740atctacttga ttgctcaaaa cgggaagcca gagatcaagc aacgcgacag actgtcttca
1800gagttcaaca atttccttga caagtgtctt gttgttgatc cggatcagag agccgataca
1860acggagctct tggcacatcc attcctgaaa aaggcgaagc cactctcaag cctgattcca
1920tacatcagag ccgtccgaga aaagtag
1947
SEQ ID NO. 7: length 426, PRT, Caenorhabditis elegans
Met Phe Gln Asn Ser Pro Met Met Tyr
Asp Trp Trp Asn Asp Thr Thr1 5 10
15Lys Pro Lys His Gln Gln Pro Thr Leu Asn Val Leu Ser Pro Trp
Gly20 25 30Ala Tyr Phe Asn His Ile Gly
Asn Glu Leu Leu His Leu Lys Ile Ala35 40
45Ser Ser Thr Val Ser Ser Gly Cys Ser Ser Pro Gln Gln Tyr Ser Ser50
55 60Ala Arg Ser Val Gly Asn Ser Leu Ser Asn
Gly Ser Val Val Ser Thr65 70 75
80Thr Ser Ser Asp Gly Asp Val Gln Leu Ser Asn Lys Glu Asn Ser
Asn85 90 95Asp Lys Ser Val Gly Asp Lys
Asn Gly Asn Thr Thr Thr Asn Lys Thr100 105
110Thr Val Glu Pro Pro Pro Pro Glu Glu Pro Pro Val Arg Val Arg Ala115
120 125Ser His Arg Glu Lys Leu Ser Asp Ser
Glu Val Leu Asn Gln Leu Arg130 135 140Glu
Ile Val Asn Pro Ser Asn Pro Leu Gly Lys Tyr Glu Met Lys Lys145
150 155 160Gln Ile Gly Val Gly Ala
Ser Gly Thr Val Phe Val Ala Asn Val Ala165 170
175Gly Ser Thr Asp Val Val Ala Val Lys Arg Met Ala Phe Lys Thr
Gln180 185 190Pro Lys Lys Glu Met Leu Leu
Thr Glu Ile Lys Val Met Lys Gln Tyr195 200
205Arg His Pro Asn Leu Val Asn Tyr Ile Glu Ser Tyr Leu Val Asp Ala210
215 220Asp Asp Leu Trp Val Val Met Asp Tyr
Leu Glu Gly Gly Asn Leu Thr225 230 235
240Asp Val Val Val Lys Thr Glu Leu Asp Glu Gly Gln Ile Ala
Ala Val245 250 255Leu Gln Glu Cys Leu Lys
Ala Leu His Phe Leu His Arg His Ser Ile260 265
270Val His Arg Asp Ile Lys Ser Asp Asn Val Leu Leu Gly Met Asn
Gly275 280 285Glu Val Lys Leu Thr Asp Met
Gly Phe Cys Ala Gln Ile Gln Pro Gly290 295
300Ser Lys Arg Asp Thr Val Val Gly Thr Pro Tyr Trp Met Ser Pro Glu305
310 315 320Ile Leu Asn Lys
Lys Gln Tyr Asn Tyr Lys Val Asp Ile Trp Ser Leu325 330
335Gly Ile Met Ala Leu Glu Met Ile Asp Gly Glu Pro Pro Tyr
Leu Arg340 345 350Glu Thr Pro Leu Lys Ala
Ile Tyr Leu Ile Ala Gln Asn Gly Lys Pro355 360
365Glu Ile Lys Gln Arg Asp Arg Leu Ser Ser Glu Phe Asn Asn Phe
Leu370 375 380Asp Lys Cys Leu Val Val Asp
Pro Asp Gln Arg Ala Asp Thr Thr Glu385 390
395 400Leu Leu Ala His Pro Phe Leu Lys Lys Ala Lys Pro
Leu Ser Ser Leu405 410 415Ile Pro Tyr Ile
Arg Ala Val Arg Glu Lys420 425
SEQ ID NO. 8: length 646, PRT, Caenorhabditis elegans
Met Ser Thr Ser Lys Ser Ser Lys Val Arg Ile Arg Asn Phe Ile Gly1
5 10 15Arg Ile Phe Ser
Pro Ser Asp Lys Asp Lys Asp Arg Asp Asp Glu Met20 25
30Lys Pro Ser Ser Ser Ala Met Asp Ile Ser Gln Pro Tyr Asn
Thr Val35 40 45His Arg Val His Val Gly
Tyr Asp Gly Gln Lys Phe Ser Gly Leu Pro50 55
60Gln Pro Trp Met Asp Ile Leu Leu Arg Asp Ile Ser Leu Ala Asp Gln65
70 75 80Lys Lys Asp Pro
Asn Ala Val Val Thr Ala Leu Lys Phe Tyr Ala Gln85 90
95Ser Met Lys Glu Asn Glu Lys Thr Lys Phe Met Thr Thr Asn
Ser Val100 105 110Phe Thr Asn Ser Asp Asp
Asp Asp Val Asp Val Gln Leu Thr Gly Gln115 120
125Val Thr Glu His Leu Arg Asn Leu Gln Cys Ser Asn Gly Ser Ala
Thr130 135 140Ser Pro Ser Thr Ser Val Ser
Ala Ser Ser Ser Ser Ala Arg Pro Leu145 150
155 160Thr Asn Gly Asn Asn His Leu Ser Thr Ala Ser Ser
Thr Asp Thr Ser165 170 175Leu Ser Leu Ser
Glu Arg Asn Asn Val Pro Ser Pro Ala Pro Val Pro180 185
190Tyr Ser Glu Ser Ala Pro Gln Leu Lys Thr Phe Thr Gly Glu
Thr Pro195 200 205Lys Leu His Pro Arg Ser
Pro Phe Pro Pro Gln Pro Pro Val Leu Pro210 215
220Gln Arg Ser Lys Thr Ala Ser Ala Val Ala Thr Thr Thr Thr Asn
Pro225 230 235 240Thr Thr
Ser Asn Gly Ala Pro Pro Pro Val Pro Gly Ser Lys Gly Pro245
250 255Pro Val Pro Pro Lys Pro Ser His Leu Lys Ile Ala
Ser Ser Thr Val260 265 270Ser Ser Gly Cys
Ser Ser Pro Gln Gln Tyr Ser Ser Ala Arg Ser Val275 280
285Gly Asn Ser Leu Ser Asn Gly Ser Val Val Ser Thr Thr Ser
Ser Asp290 295 300Gly Asp Val Gln Leu Ser
Asn Lys Glu Asn Ser Asn Asp Lys Ser Val305 310
315 320Gly Asp Lys Asn Gly Asn Thr Thr Thr Asn Lys
Thr Thr Val Glu Pro325 330 335Pro Pro Pro
Glu Glu Pro Pro Val Arg Val Arg Ala Ser His Arg Glu340
345 350Lys Leu Ser Asp Ser Glu Val Leu Asn Gln Leu Arg
Glu Ile Val Asn355 360 365Pro Ser Asn Pro
Leu Gly Lys Tyr Glu Met Lys Lys Gln Ile Gly Val370 375
380Gly Ala Ser Gly Thr Val Phe Val Ala Asn Val Ala Gly Ser
Thr Asp385 390 395 400Val
Val Ala Val Lys Arg Met Ala Phe Lys Thr Gln Pro Lys Lys Glu405
410 415Met Leu Leu Thr Glu Ile Lys Val Met Lys Gln
Tyr Arg His Pro Asn420 425 430Leu Val Asn
Tyr Ile Glu Ser Tyr Leu Val Asp Ala Asp Asp Leu Trp435
440 445Val Val Met Asp Tyr Leu Glu Gly Gly Asn Leu Thr
Asp Val Val Val450 455 460Lys Thr Glu Leu
Asp Glu Gly Gln Ile Ala Ala Val Leu Gln Glu Cys465 470
475 480Leu Lys Ala Leu His Phe Leu His Arg
His Ser Ile Val His Arg Asp485 490 495Ile
Lys Ser Asp Asn Val Leu Leu Gly Met Asn Gly Glu Val Lys Leu500
505 510Thr Asp Met Gly Phe Cys Ala Gln Ile Gln Pro
Gly Ser Lys Arg Asp515 520 525Thr Val Val
Gly Thr Pro Tyr Trp Met Ser Pro Glu Ile Leu Asn Lys530
535 540Lys Gln Tyr Asn Tyr Lys Val Asp Ile Trp Ser Leu
Gly Ile Met Ala545 550 555
560Leu Glu Met Ile Asp Gly Glu Pro Pro Tyr Leu Arg Glu Thr Pro Leu565
570 575Lys Ala Ile Tyr Leu Ile Ala Gln Asn
Gly Lys Pro Glu Ile Lys Gln580 585 590Arg
Asp Arg Leu Ser Ser Glu Phe Asn Asn Phe Leu Asp Lys Cys Leu595
600 605Val Val Asp Pro Asp Gln Arg Ala Asp Thr Thr
Glu Leu Leu Ala His610 615 620Pro Phe Leu
Lys Lys Ala Lys Pro Leu Ser Ser Leu Ile Pro Tyr Ile625
630 635 640Arg Ala Val Arg Glu
Lys645
SEQ ID NO. 9: length 646, PRT, Caenorhabditis elegans
Met Ser Thr Ser Lys Ser Ser Lys Val
Arg Ile Arg Asn Phe Ile Gly1 5 10
15Arg Ile Phe Ser Pro Ser Asp Lys Asp Lys Asp Arg Asp Asp Glu
Met20 25 30Lys Pro Ser Ser Ser Ala Met
Asp Ile Ser Gln Pro Tyr Asn Thr Val35 40
45His Arg Val His Val Gly Tyr Asp Gly Gln Lys Phe Ser Gly Leu Pro50
55 60Gln Pro Trp Met Asp Ile Leu Leu Arg Asp
Ile Ser Leu Ala Asp Gln65 70 75
80Lys Lys Asp Pro Asn Ala Val Val Thr Ala Leu Lys Phe Tyr Ala
Gln85 90 95Ser Met Lys Glu Asn Glu Lys
Thr Lys Phe Met Thr Thr Asn Ser Val100 105
110Phe Thr Asn Ser Asp Asp Asp Asp Val Asp Val Gln Leu Thr Gly Gln115
120 125Val Thr Glu His Leu Arg Asn Leu Gln
Cys Ser Asn Gly Ser Ala Thr130 135 140Ser
Pro Ser Thr Ser Val Ser Ala Ser Ser Ser Ser Ala Arg Pro Leu145
150 155 160Thr Asn Gly Asn Asn His
Leu Ser Thr Ala Ser Ser Thr Asp Thr Ser165 170
175Leu Ser Leu Ser Glu Arg Asn Asn Val Pro Ser Pro Ala Pro Val
Pro180 185 190Tyr Ser Glu Ser Ala Pro Gln
Leu Lys Thr Phe Thr Gly Glu Thr Pro195 200
205Lys Leu His Pro Arg Ser Pro Phe Pro Pro Gln Pro Pro Val Leu Pro210
215 220Gln Arg Ser Lys Thr Ala Ser Ala Val
Ala Thr Thr Thr Thr Asn Pro225 230 235
240Thr Thr Ser Asn Gly Ala Pro Pro Pro Val Pro Gly Ser Lys
Gly Pro245 250 255Pro Val Pro Pro Lys Pro
Ser His Leu Lys Ile Ala Ser Ser Thr Val260 265
270Ser Ser Gly Cys Ser Ser Pro Gln Gln Tyr Ser Ser Ala Arg Ser
Val275 280 285Gly Asn Ser Leu Ser Asn Gly
Ser Val Val Ser Thr Thr Ser Ser Asp290 295
300Gly Asp Val Gln Leu Ser Asn Lys Glu Asn Ser Asn Asp Lys Ser Val305
310 315 320Gly Asp Lys Asn
Gly Asn Thr Thr Thr Asn Lys Thr Thr Val Glu Pro325 330
335Pro Pro Pro Glu Glu Pro Pro Val Arg Val Arg Ala Ser His
Arg Glu340 345 350Lys Leu Ser Asp Ser Glu
Val Leu Asn Gln Leu Arg Glu Ile Val Asn355 360
365Pro Ser Asn Pro Leu Gly Lys Tyr Glu Met Lys Lys Gln Ile Gly
Val370 375 380Gly Ala Ser Gly Thr Val Phe
Val Ala Asn Val Ala Gly Ser Thr Asp385 390
395 400Val Val Ala Val Lys Arg Met Ala Phe Lys Thr Gln
Pro Lys Lys Glu405 410 415Met Leu Leu Thr
Glu Ile Lys Val Met Lys Gln Tyr Arg His Pro Asn420 425
430Leu Val Asn Tyr Ile Glu Ser Tyr Leu Val Asp Ala Asp Asp
Leu Trp435 440 445Val Val Met Asp Tyr Leu
Glu Gly Gly Asn Leu Thr Asp Val Val Val450 455
460Lys Thr Glu Leu Asp Glu Gly Gln Ile Ala Ala Val Leu Gln Glu
Cys465 470 475 480Leu Lys
Ala Leu His Phe Leu His Arg His Ser Ile Val His Arg Asp485
490 495Ile Lys Ser Asp Asn Val Leu Leu Gly Met Asn Gly
Glu Val Lys Leu500 505 510Thr Asp Met Gly
Phe Cys Ala Gln Ile Gln Pro Gly Ser Lys Arg Asp515 520
525Thr Val Val Gly Thr Pro Tyr Trp Met Ser Pro Glu Ile Leu
Asn Lys530 535 540Lys Gln Tyr Asn Tyr Lys
Val Asp Ile Trp Ser Leu Gly Ile Met Ala545 550
555 560Leu Glu Met Ile Asp Gly Glu Pro Pro Tyr Leu
Arg Glu Thr Pro Leu565 570 575Lys Ala Ile
Tyr Leu Ile Ala Gln Asn Gly Lys Pro Glu Ile Lys Gln580
585 590Arg Asp Arg Leu Ser Ser Glu Phe Asn Asn Phe Leu
Asp Lys Cys Leu595 600 605Val Val Asp Pro
Asp Gln Arg Ala Asp Thr Thr Glu Leu Leu Ala His610 615
620Pro Phe Leu Lys Lys Ala Lys Pro Leu Ser Ser Leu Ile Pro
Tyr Ile625 630 635 640Arg
Ala Val Arg Glu Lys645
SEQ ID NO. 10: length 648, PRT, Caenorhabditis elegans
Met Ser Thr Ser
Lys Ser Ser Lys Val Arg Ile Arg Asn Phe Ile Gly1 5
10 15Arg Ile Phe Ser Pro Ser Asp Lys Asp Lys
Asp Arg Asp Asp Glu Met20 25 30Lys Pro
Ser Ser Ser Ala Met Asp Ile Ser Gln Pro Tyr Asn Thr Val35
40 45His Arg Val His Val Gly Tyr Asp Gly Gln Lys Phe
Ser Gly Leu Pro50 55 60Gln Pro Trp Met
Asp Ile Leu Leu Arg Asp Ile Ser Leu Ala Asp Gln65 70
75 80Lys Lys Asp Pro Asn Ala Val Val Thr
Ala Leu Lys Phe Tyr Ala Gln85 90 95Ser
Met Lys Glu Asn Glu Lys Thr Lys Phe Met Thr Thr Asn Ser Val100
105 110Phe Thr Asn Ser Asp Asp Asp Asp Val Asp Val
Gln Leu Thr Gly Gln115 120 125Val Thr Glu
His Leu Arg Asn Leu Gln Cys Ser Asn Gly Ser Ala Thr130
135 140Ser Pro Ser Thr Ser Val Ser Ala Ser Ser Ser Ser
Ala Arg Pro Leu145 150 155
160Thr Asn Gly Asn Asn His Leu Ser Thr Ala Ser Ser Thr Asp Thr Ser165
170 175Leu Ser Leu Ser Glu Arg Asn Asn Val
Pro Ser Pro Ala Pro Val Pro180 185 190Tyr
Ser Glu Ser Ala Pro Gln Leu Lys Thr Phe Thr Gly Glu Thr Pro195
200 205Lys Leu His Pro Arg Ser Pro Phe Pro Pro Gln
Pro Pro Val Leu Pro210 215 220Gln Arg Ser
Lys Thr Ala Ser Ala Val Ala Thr Thr Thr Thr Asn Pro225
230 235 240Thr Thr Ser Asn Gly Ala Pro
Pro Pro Val Pro Gly Ser Lys Gly Pro245 250
255Pro Val Pro Pro Lys Pro Ser His Leu Lys Ile Ala Ser Ser Thr Val260
265 270Ser Ser Gly Cys Ser Ser Pro Gln Gln
Tyr Ser Ser Ala Arg Ser Val275 280 285Gly
Asn Ser Leu Ser Asn Gly Ser Val Val Ser Thr Thr Ser Ser Asp290
295 300Gly Asp Val Gln Leu Ser Asn Lys Glu Asn Ser
Asn Asp Lys Ser Val305 310 315
320Gly Asp Lys Asn Gly Asn Thr Thr Thr Asn Lys Thr Thr Val Glu
Pro325 330 335Pro Pro Pro Glu Glu Pro Pro
Val Arg Val Arg Ala Ser His Arg Glu340 345
350Lys Leu Ser Asp Ser Glu Val Leu Asn Gln Leu Arg Glu Ile Val Asn355
360 365Pro Ser Asn Pro Leu Gly Lys Tyr Glu
Met Lys Lys Gln Ile Gly Val370 375 380Gly
Ala Ser Gly Thr Val Phe Val Ala Asn Val Ala Gly Ser Thr Asp385
390 395 400Val Val Ala Val Lys Arg
Met Ala Phe Lys Thr Gln Pro Lys Lys Glu405 410
415Met Leu Leu Thr Glu Ile Lys Val Met Lys Gln Tyr Arg His Pro
Asn420 425 430Leu Val Asn Tyr Ile Glu Ser
Tyr Leu Val Asp Ala Asp Asp Leu Trp435 440
445Val Val Met Asp Tyr Leu Glu Gly Gly Asn Leu Thr Asp Val Val Val450
455 460Lys Thr Glu Leu Asp Glu Gly Gln Ile
Ala Ala Val Leu Gln Glu Cys465 470 475
480Leu Lys Ala Leu His Phe Leu His Arg His Ser Ile Val His
Arg Asp485 490 495Ile Lys Ser Asp Asn Val
Leu Leu Gly Met Asn Gly Glu Val Lys Leu500 505
510Thr Asp Met Gly Phe Cys Ala Gln Ile Gln Pro Gly Ser Lys Ser
Cys515 520 525Arg Asp Thr Val Val Gly Thr
Pro Tyr Trp Met Ser Pro Glu Ile Leu530 535
540Asn Lys Lys Gln Tyr Asn Tyr Lys Val Asp Ile Trp Ser Leu Gly Ile545
550 555 560Met Ala Leu Glu
Met Ile Asp Gly Glu Pro Pro Tyr Leu Arg Glu Thr565 570
575Pro Leu Lys Ala Ile Tyr Leu Ile Ala Gln Asn Gly Lys Pro
Glu Ile580 585 590Lys Gln Arg Asp Arg Leu
Ser Ser Glu Phe Asn Asn Phe Leu Asp Lys595 600
605Cys Leu Val Val Asp Pro Asp Gln Arg Ala Asp Thr Thr Glu Leu
Leu610 615 620Ala His Pro Phe Leu Lys Lys
Ala Lys Pro Leu Ser Ser Leu Ile Pro625 630
635 640Tyr Ile Arg Ala Val Arg Glu
Lys645
SEQ ID NO. 11: length 601, PRT, Caenorhabditis elegans
Met Ser Thr Ser Lys Ser Ser Lys
Val Arg Ile Arg Asn Phe Ile Gly1 5 10
15Arg Ile Phe Ser Pro Ser Asp Lys Asp Lys Asp Arg Asp Asp
Glu Met20 25 30Lys Pro Ser Ser Ser Ala
Met Asp Ile Ser Gln Pro Tyr Asn Thr Val35 40
45His Arg Val His Val Gly Tyr Asp Gly Gln Lys Phe Ser Gly Leu Pro50
55 60Gln Pro Trp Met Asp Ile Leu Leu Arg
Asp Ile Ser Tyr Phe Ser Leu65 70 75
80Ala Asp Gln Lys Lys Asp Pro Asn Ala Val Val Thr Ala Leu
Lys Phe85 90 95Tyr Ala Gln Ser Met Lys
Glu Asn Glu Lys Thr Lys Phe Met Thr Thr100 105
110Asn Ser Val Phe Thr Asn Ser Asp Asp Asp Asp Val Asp Val Gln
Leu115 120 125Thr Gly Gln Val Thr Glu His
Leu Arg Asn Leu Gln Cys Ser Asn Gly130 135
140Ser Ala Thr Ser Pro Ser Thr Ser Val Ser Ala Ser Ser Ser Ser Ala145
150 155 160Arg Pro Leu Thr
Asn Gly Asn Asn His Leu Ser Thr Ala Ser Ser Thr165 170
175Asp Thr Ser Leu Ser Leu Ser Glu Arg Asn Asn Val Pro Ser
Pro Ala180 185 190Pro Val Pro Tyr Ser Glu
Ser Ala Pro Gln Leu Lys Thr Phe Thr Gly195 200
205Glu Thr Pro Lys Leu His Pro Arg Ser Pro Phe Pro Pro Gln Pro
Pro210 215 220Val Leu Pro Gln Arg Ser Lys
Thr Ala Ser Ala Val Ala Thr Thr Thr225 230
235 240Thr Asn Pro Thr Thr Ser Asn Gly Ala Pro Pro Pro
Val Pro Gly Ser245 250 255Lys Gly Pro Pro
Val Pro Pro Lys Pro Ser Lys Glu Asn Ser Asn Asp260 265
270Lys Ser Val Gly Asp Lys Asn Gly Asn Thr Thr Thr Asn Lys
Thr Thr275 280 285Val Glu Pro Pro Pro Pro
Glu Glu Pro Pro Val Arg Val Arg Ala Ser290 295
300His Arg Glu Lys Leu Ser Asp Ser Glu Val Leu Asn Gln Leu Arg
Glu305 310 315 320Ile Val
Asn Pro Ser Asn Pro Leu Gly Lys Tyr Glu Met Lys Lys Gln325
330 335Ile Gly Val Gly Ala Ser Gly Thr Val Phe Val Ala
Asn Val Ala Gly340 345 350Ser Thr Asp Val
Val Ala Val Lys Arg Met Ala Phe Lys Thr Gln Pro355 360
365Lys Lys Glu Met Leu Leu Thr Glu Ile Lys Val Met Lys Gln
Tyr Arg370 375 380His Pro Asn Leu Val Asn
Tyr Ile Glu Ser Tyr Leu Val Asp Ala Asp385 390
395 400Asp Leu Trp Val Val Met Asp Tyr Leu Glu Gly
Gly Asn Leu Thr Asp405 410 415Val Val Val
Lys Thr Glu Leu Asp Glu Gly Gln Ile Ala Ala Val Leu420
425 430Gln Glu Cys Leu Lys Ala Leu His Phe Leu His Arg
His Ser Ile Val435 440 445His Arg Asp Ile
Lys Ser Asp Asn Val Leu Leu Gly Met Asn Gly Glu450 455
460Val Lys Leu Thr Asp Met Gly Phe Cys Ala Gln Ile Gln Pro
Gly Ser465 470 475 480Lys
Arg Asp Thr Val Val Gly Thr Pro Tyr Trp Met Ser Pro Glu Ile485
490 495Leu Asn Lys Lys Gln Tyr Asn Tyr Lys Val Asp
Ile Trp Ser Leu Gly500 505 510Ile Met Ala
Leu Glu Met Ile Asp Gly Glu Pro Pro Tyr Leu Arg Glu515
520 525Thr Pro Leu Lys Ala Ile Tyr Leu Ile Ala Gln Asn
Gly Lys Pro Glu530 535 540Ile Lys Gln Arg
Asp Arg Leu Ser Ser Glu Phe Asn Asn Phe Leu Asp545 550
555 560Lys Cys Leu Val Val Asp Pro Asp Gln
Arg Ala Asp Thr Thr Glu Leu565 570 575Leu
Ala His Pro Phe Leu Lys Lys Ala Lys Pro Leu Ser Ser Leu Ile580
585 590Pro Tyr Ile Arg Ala Val Arg Glu Lys595
600
SEQ ID NO. 12: length 648, PRT, Caenorhabditis elegans
Met Ser Thr Ser Lys Ser Ser
Lys Val Arg Ile Arg Asn Phe Val Gly1 5 10
15Arg Ile Phe Ser Pro Ser Asp Lys Asp Lys Asp Arg Asp
Asp Glu Met20 25 30Lys Pro Ser Ser Ser
Ala Met Asp Ile Ser Gln Pro Tyr Asn Thr Val35 40
45His Arg Val His Val Gly Tyr Asp Gly Gln Lys Phe Ser Gly Leu
Pro50 55 60Gln Pro Trp Met Asp Ile Leu
Leu Arg Asp Ile Ser Leu Ala Asp Gln65 70
75 80Lys Lys Asp Pro Asn Ala Val Val Thr Ala Leu Lys
Phe Tyr Ala Gln85 90 95Ser Met Lys Glu
Asn Glu Lys Thr Lys Phe Met Thr Thr Asn Ser Val100 105
110Phe Thr Asn Ser Asp Asp Asp Asp Val Asp Val Gln Leu Thr
Gly Gln115 120 125Val Thr Glu His Leu Arg
Asn Leu Gln Cys Ser Asn Gly Ser Ala Thr130 135
140Ser Pro Ser Thr Ser Val Ser Ala Ser Ser Ser Ser Ala Arg Pro
Leu145 150 155 160Thr Asn
Gly Asn Asn His Leu Ser Thr Ala Ser Ser Thr Asp Thr Ser165
170 175Leu Ser Leu Ser Glu Arg Asn Asn Val Pro Ser Pro
Ala Pro Val Pro180 185 190Tyr Ser Glu Ser
Ala Pro Gln Leu Lys Thr Phe Thr Gly Glu Thr Pro195 200
205Lys Leu His Pro Arg Ser Pro Phe Pro Pro Gln Pro Pro Val
Leu Pro210 215 220Gln Arg Ser Lys Thr Ala
Ser Ala Val Ala Thr Thr Thr Thr Asn Pro225 230
235 240Thr Thr Ser Asn Gly Ala Pro Pro Pro Val Pro
Gly Ser Lys Gly Pro245 250 255Pro Val Pro
Pro Lys Pro Ser His Leu Lys Ile Ala Ser Ser Thr Val260
265 270Ser Ser Gly Cys Ser Ser Pro Gln Gln Tyr Ser Ser
Ala Arg Ser Val275 280 285Gly Asn Ser Leu
Ser Asn Gly Ser Val Val Ser Thr Thr Ser Ser Asp290 295
300Gly Asp Val Gln Leu Ser Asn Lys Glu Asn Ser Asn Asp Lys
Ser Val305 310 315 320Gly
Asp Lys Asn Gly Asn Thr Thr Thr Asn Lys Thr Thr Val Glu Pro325
330 335Pro Pro Pro Glu Glu Pro Pro Val Arg Val Arg
Ala Ser His Arg Glu340 345 350Lys Leu Ser
Asp Ser Glu Val Leu Asn Gln Leu Arg Glu Ile Val Asn355
360 365Pro Ser Asn Pro Leu Gly Lys Tyr Glu Met Lys Lys
Gln Ile Gly Val370 375 380Gly Ala Ser Gly
Thr Val Phe Val Ala Asn Val Ala Gly Ser Thr Asp385 390
395 400Val Val Ala Val Lys Arg Met Ala Phe
Lys Thr Gln Pro Lys Lys Glu405 410 415Met
Leu Leu Thr Glu Ile Lys Val Met Lys Gln Tyr Arg His Pro Asn420
425 430Leu Val Asn Tyr Ile Glu Ser Tyr Leu Val Asp
Ala Asp Asp Leu Trp435 440 445Val Val Met
Asp Tyr Leu Glu Gly Gly Asn Leu Thr Asp Val Val Val450
455 460Lys Thr Glu Leu Asp Glu Gly Gln Ile Ala Ala Val
Leu Gln Glu Cys465 470 475
480Leu Lys Ala Leu His Phe Leu His Arg His Ser Ile Val His Arg Asp485
490 495Ile Lys Ser Asp Asn Val Leu Leu Gly
Met Asn Gly Glu Val Lys Leu500 505 510Thr
Asp Met Gly Phe Cys Ala Gln Ile Gln Pro Gly Ser Lys Ser Cys515
520 525Arg Asp Thr Val Val Gly Thr Pro Tyr Trp Met
Ser Pro Glu Ile Leu530 535 540Asn Lys Lys
Gln Tyr Asn Tyr Lys Val Asp Ile Trp Ser Leu Gly Ile545
550 555 560Met Ala Leu Glu Met Ile Asp
Gly Glu Pro Pro Tyr Leu Arg Glu Thr565 570
575Pro Leu Lys Ala Ile Tyr Leu Ile Ala Gln Asn Gly Lys Pro Glu Ile580
585 590Lys Gln Arg Asp Arg Leu Ser Ser Glu
Phe Asn Asn Phe Leu Asp Lys595 600 605Cys
Leu Val Val Asp Pro Asp Gln Arg Ala Asp Thr Thr Glu Leu Leu610
615 620Ala His Pro Phe Leu Lys Lys Ala Lys Pro Leu
Ser Ser Leu Ile Pro625 630 635
640Tyr Ile Arg Ala Val Arg Glu Lys645
SEQ ID NO. 13: length 3541, DNA, Artificial Sequence (Description of Artificial Sequence: Vector)
gttaacgcta gcatggatct
cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga
tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctcaa aaatgtttca
aaatagtccg atgatgtacg actggtggaa tgacaccacc 180aaaccgaaac accagcagcc
gacacttaac gtgttgtcac catggggagc atatttcaat 240cacattggaa atgaactgct
gcatctgaaa atcgcatcgt cgacagtatc ctcgggatgc 300tcgtctccac aacagtattc
gtctgctcga tccgttggta actcgctctc caacggcagt 360gttgtctcca caacatcgtc
agatggtgat gtgcaattgt cgaataagga aaattcgaat 420gacaaatcag ttggagacaa
gaatgggaac accaccacaa acaaaacgac cgtcgaacca 480cctccaccag aagagccacc
tgttcgtgtt cgagcatctc atcgtgaaaa gctttctgat 540tccgaagtgc tcaatcaact
ccgcgagatt gttaatccaa gtaatccact tggaaagtac 600gagatgaaga agcaaatcgg
tgttggagca tccggaactg tattcgttgc taatgtggcc 660ggcagcactg atgtggtggc
tgtgaagaga atggctttca agactcagcc gaagaaggag 720atgttgctca ccgagattaa
ggttatgaag cagtatcgac acccgaacct cgtcaactac 780attgaatcgt atctggttga
tgctgatgat ctttgggtag tgatggatta tctggaaggt 840ggaaacttga cagatgtcgt
tgtgaagact gagttggacg aaggacaaat tgcagcagtt 900ttgcaagaat gtcttaaagc
gcttcacttc cttcatagac actccatagt gcaccgagat 960atcaagagtg acaacgtgct
gctcggcatg aacggagagg ttaagctcac cgatatggga 1020ttctgtgctc agattcagcc
gggatcgaaa agagatactg tcgtcggaac tccatattgg 1080atgtcgccgg agatattgaa
caagaagcag tacaactata aggttgacat ttggtcgctg 1140ggaattatgg ctctagagat
gattgatgga gagccaccat atttgagaga aacacctttg 1200aaggctatct acttgattgc
tcaaaacggg aagccagaga tcaagcaacg cgacagactg 1260tcttcagagt tcaacaattt
ccttgacaag tgtcttgttg ttgatccgga tcagagagcc 1320gatacaacgg agctcttggc
acatccattc ctgaaaaagg cgaagccact ctcaagcctg 1380attccataca tcagagccgt
ccgagaaaag tagacccagc tttcttgtac aaagttggca 1440ttataagaaa gcattgctta
tcaatttgtt gcaacgaaca ggtcactatc agtcaaaata 1500aaatcattat ttgccatcca
gctgcagctc tggcccgtgt ctcaaaatct ctgatgttac 1560attgcacaag ataaaaatat
atcatcatga acaataaaac tgtctgctta cataaacagt 1620aatacaaggg gtgttatgag
ccatattcaa cgggaaacgt cgaggccgcg attaaattcc 1680aacatggatg ctgatttata
tgggtataaa tgggctcgcg ataatgtcgg gcaatcaggt 1740gcgacaatct atcgcttgta
tgggaagccc gatgcgccag agttgtttct gaaacatggc 1800aaaggtagcg ttgccaatga
tgttacagat gagatggtca gactaaactg gctgacggaa 1860tttatgcctc ttccgaccat
caagcatttt atccgtactc ctgatgatgc atggttactc 1920accactgcga tccccggaaa
aacagcattc caggtattag aagaatatcc tgattcaggt 1980gaaaatattg ttgatgcgct
ggcagtgttc ctgcgccggt tgcattcgat tcctgtttgt 2040aattgtcctt ttaacagcga
tcgcgtattt cgtctcgctc aggcgcaatc acgaatgaat 2100aacggtttgg ttgatgcgag
tgattttgat gacgagcgta atggctggcc tgttgaacaa 2160gtctggaaag aaatgcataa
acttttgcca ttctcaccgg attcagtcgt cactcatggt 2220gatttctcac ttgataacct
tatttttgac gaggggaaat taataggttg tattgatgtt 2280ggacgagtcg gaatcgcaga
ccgataccag gatcttgcca tcctatggaa ctgcctcggt 2340gagttttctc cttcattaca
gaaacggctt tttcaaaaat atggtattga taatcctgat 2400atgaataaat tgcagtttca
tttgatgctc gatgagtttt tctaatcaga attggttaat 2460tggttgtaac actggcagag
cattacgctg acttgacggg acggcgcaag ctcatgacca 2520aaatccctta acgtgagttt
tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 2580gatcttcttg agatcctttt
tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 2640cgctaccagc ggtggtttgt
ttgccggatc aagagctacc aactcttttt ccgaaggtaa 2700ctggcttcag cagagcgcag
ataccaaata ctgtccttct agtgtagccg tagttaggcc 2760accacttcaa gaactctgta
gcaccgccta catacctcgc tctgctaatc ctgttaccag 2820tggctgctgc cagtggcgat
aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 2880cggataaggc gcagcggtcg
ggctgaacgg ggggttcgtg cacacagccc agcttggagc 2940gaacgaccta caccgaactg
agatacctac agcgtgagct atgagaaagc gccacgcttc 3000ccgaagggag aaaggcggac
aggtatccgg taagcggcag ggtcggaaca ggagagcgca 3060cgagggagct tccaggggga
aacgcctggt atctttatag tcctgtcggg tttcgccacc 3120tctgacttga gcgtcgattt
ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 3180ccagcaacgc ggccttttta
cggttcctgg ccttttgctg gccttttgct cacatgttct 3240ttcctgcgtt atcccctgat
tctgtggata accgtattac cgctagccag gaagagtttg 3300tagaaacgca aaaaggccat
ccgtcaggat ggccttctgc ttagtttgat gcctggcagt 3360ttatggcggg cgtcctgccc
gccaccctcc gggccgttgc ttcacaacgt tcaaatccgc 3420tcccggcgga tttgtcctac
tcaggagagc gttcaccgac aaacaacaga taaaacgaaa 3480ggcccagtct tccgactgag
cctttcgttt tatttgatgc ctggcagttc cctactctcg 3540c
3541

SEQ ID NO. 14
Length: 4201
Type: DNA
Organism: Artificial Sequence
Other information: Description of Artificial Sequence: Vector
Sequence 14:
gttaacgcta gcatggatct
cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga
tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctcaa aaatgtcaac
ttcaaaaagt tccaaggtgc gaatacggaa tttcatcggg 180cgaatcttct ctcccagcga
taaagacaag gatcgagacg atgagatgaa gccatcctcg 240tccgcaatgg atattagtca
gccatataac acagtgcatc gagtccacgt tggatacgac 300ggccagaagt tcagcggact
gccgcaacca tggatggata ttcttctccg agacattagt 360cttgccgatc agaagaagga
tccgaacgcg gtggtgactg cgttgaagtt ctacgcacaa 420tcaatgaagg agaacgagaa
gacgaaattc atgacgacga atagtgtttt cacgaatagc 480gatgacgatg atgtggacgt
tcagttgacc ggacaagtca cggaacattt gaggaatttg 540cagtgtagta atggttccgc
aacttcccca tctacatcag tgtcagcttc atcttcttct 600gctcgtccac tgacaaatgg
aaataatcat ctttccacgg cgtcgtctac cgacacatct 660ctctcattat cggaaaggaa
taacgttccg tctccagctc cagttccata tagtgaaagt 720gctccacaac tgaaaacatt
caccggagag actccaaaac tgcatccacg atctccgttc 780ccgcctcaac cgccagttct
tccgcaacga agcaaaaccg catcggcagt ggcgacgacg 840acgacgaatc cgacgacttc
gaatggagca ccaccaccag ttcctggatc gaaaggaccc 900ccggtgccac cgaaaccatc
gcatctgaaa atcgcatcgt cgacagtatc ctcgggatgc 960tcgtctccac aacagtattc
gtctgctcga tccgttggta actcgctctc caacggcagt 1020gttgtctcca caacatcgtc
agatggtgat gtgcaattgt cgaataagga aaattcgaat 1080gacaaatcag ttggagacaa
gaatgggaac accaccacaa acaaaacgac cgtcgaacca 1140cctccaccag aagagccacc
tgttcgtgtt cgagcatctc atcgtgaaaa gctttctgat 1200tccgaagtgc tcaatcaact
ccgcgagatt gttaatccaa gtaatccact tggaaagtac 1260gagatgaaga agcaaatcgg
tgttggagca tccggaactg tattcgttgc taatgtggcc 1320ggcagcactg atgtggtggc
tgtgaagaga atggctttca agactcagcc gaagaaggag 1380atgttgctca ccgagattaa
ggttatgaag cagtatcgac acccgaacct cgtcaactac 1440attgaatcgt atctggttga
tgctgatgat ctttgggtag tgatggatta tctggaaggt 1500ggaaacttga cagatgtcgt
tgtgaagact gagttggacg aaggacaaat tgcagcagtt 1560ttgcaagaat gtcttaaagc
gcttcacttc cttcatagac actccatagt gcaccgagat 1620atcaagagtg acaacgtgct
gctcggcatg aacggagagg ttaagctcac cgatatggga 1680ttctgtgctc agattcagcc
gggatcgaaa agagatactg tcgtcggaac tccatattgg 1740atgtcgccgg agatattgaa
caagaagcag tacaactata aggttgacat ttggtcgctg 1800ggaattatgg ctctagagat
gattgatgga gagccaccat atttgagaga aacacctttg 1860aaggctatct acttgattgc
tcaaaacggg aagccagaga tcaagcaacg cgacagactg 1920tcttcagagt tcaacaattt
ccttgacaag tgtcttgttg ttgatccgga tcagagagcc 1980gatacaacgg agctcttggc
acatccattc ctgaaaaagg cgaagccact ctcaagcctg 2040attccataca tcagagccgt
ccgagaaaag tagacccagc tttcttgtac aaagttggca 2100ttataagaaa gcattgctta
tcaatttgtt gcaacgaaca ggtcactatc agtcaaaata 2160aaatcattat ttgccatcca
gctgcagctc tggcccgtgt ctcaaaatct ctgatgttac 2220attgcacaag ataaaaatat
atcatcatga acaataaaac tgtctgctta cataaacagt 2280aatacaaggg gtgttatgag
ccatattcaa cgggaaacgt cgaggccgcg attaaattcc 2340aacatggatg ctgatttata
tgggtataaa tgggctcgcg ataatgtcgg gcaatcaggt 2400gcgacaatct atcgcttgta
tgggaagccc gatgcgccag agttgtttct gaaacatggc 2460aaaggtagcg ttgccaatga
tgttacagat gagatggtca gactaaactg gctgacggaa 2520tttatgcctc ttccgaccat
caagcatttt atccgtactc ctgatgatgc atggttactc 2580accactgcga tccccggaaa
aacagcattc caggtattag aagaatatcc tgattcaggt 2640gaaaatattg ttgatgcgct
ggcagtgttc ctgcgccggt tgcattcgat tcctgtttgt 2700aattgtcctt ttaacagcga
tcgcgtattt cgtctcgctc aggcgcaatc acgaatgaat 2760aacggtttgg ttgatgcgag
tgattttgat gacgagcgta atggctggcc tgttgaacaa 2820gtctggaaag aaatgcataa
acttttgcca ttctcaccgg attcagtcgt cactcatggt 2880gatttctcac ttgataacct
tatttttgac gaggggaaat taataggttg tattgatgtt 2940ggacgagtcg gaatcgcaga
ccgataccag gatcttgcca tcctatggaa ctgcctcggt 3000gagttttctc cttcattaca
gaaacggctt tttcaaaaat atggtattga taatcctgat 3060atgaataaat tgcagtttca
tttgatgctc gatgagtttt tctaatcaga attggttaat 3120tggttgtaac actggcagag
cattacgctg acttgacggg acggcgcaag ctcatgacca 3180aaatccctta acgtgagttt
tcgttccact gagcgtcaga ccccgtagaa aagatcaaag 3240gatcttcttg agatcctttt
tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac 3300cgctaccagc ggtggtttgt
ttgccggatc aagagctacc aactcttttt ccgaaggtaa 3360ctggcttcag cagagcgcag
ataccaaata ctgtccttct agtgtagccg tagttaggcc 3420accacttcaa gaactctgta
gcaccgccta catacctcgc tctgctaatc ctgttaccag 3480tggctgctgc cagtggcgat
aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac 3540cggataaggc gcagcggtcg
ggctgaacgg ggggttcgtg cacacagccc agcttggagc 3600gaacgaccta caccgaactg
agatacctac agcgtgagct atgagaaagc gccacgcttc 3660ccgaagggag aaaggcggac
aggtatccgg taagcggcag ggtcggaaca ggagagcgca 3720cgagggagct tccaggggga
aacgcctggt atctttatag tcctgtcggg tttcgccacc 3780tctgacttga gcgtcgattt
ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg 3840ccagcaacgc ggccttttta
cggttcctgg ccttttgctg gccttttgct cacatgttct 3900ttcctgcgtt atcccctgat
tctgtggata accgtattac cgctagccag gaagagtttg 3960tagaaacgca aaaaggccat
ccgtcaggat ggccttctgc ttagtttgat gcctggcagt 4020ttatggcggg cgtcctgccc
gccaccctcc gggccgttgc ttcacaacgt tcaaatccgc 4080tcccggcgga tttgtcctac
tcaggagagc gttcaccgac aaacaacaga taaaacgaaa 4140ggcccagtct tccgactgag
cctttcgttt tatttgatgc ctggcagttc cctactctcg 4200c
4201

SEQ ID NO. 15
Length: 4278
Type: DNA
Organism: Artificial Sequence
Other information: Description of Artificial Sequence: Vector
Sequence 15:
gttaacgcta gcatggatct
cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga
tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctggt ttaattaccc
aagtttgaga tttaccttaa catcgggtct gacaaccgtg 180tcgcttacga cgcattctaa
tcattaacca tgtcaacttc aaaaagttcc aaggtgcgaa 240tacggaattt catcgggcga
atcttctctc ccagcgataa agacaaggat cgagacgatg 300agatgaagcc atcctcgtcc
gcaatggata ttagtcagcc atataacaca gtgcatcgag 360tccacgttgg atacgacggc
cagaagttca gcggactgcc gcaaccatgg atggatattc 420ttctccgaga cattagtctt
gccgatcaga agaaggatcc gaacgcggtg gtgactgcgt 480tgaagttcta cgcacaatca
atgaaggaga acgagaagac gaaattcatg acgacgaata 540gtgttttcac gaatagcgat
gacgatgatg tggacgttca gttgaccgga caagtcacgg 600aacatttgag gaatttgcag
tgtagtaatg gttccgcaac ttccccatct acatcagtgt 660cagcttcatc ttcttctgct
cgtccactga caaatggaaa taatcatctt tccacggcgt 720cgtctaccga cacatctctc
tcattatcgg aaaggaataa cgttccgtct ccagctccag 780ttccatatag tgaaagtgct
ccacaactga aaacattcac cggagagact ccaaaactgc 840atccacgatc tccgttcccg
cctcaaccgc cagttcttcc gcaacgaagc aaaaccgcat 900cggcagtggc gacgacgacg
acgaatccga cgacttcgaa tggagcacca ccaccagttc 960ctggatcgaa aggacccccg
gtgccaccga aaccatcgca tctgaaaatc gcatcgtcga 1020cagtatcctc gggatgctcg
tctccacaac agtattcgtc tgctcgatcc gttggtaact 1080cgctctccaa cggcagtgtt
gtctccacaa catcgtcaga tggtgatgtg caattgtcga 1140ataaggaaaa ttcgaatgac
aaatcagttg gagacaagaa tgggaacacc accacaaaca 1200aaacgaccgt cgaaccacct
ccaccagaag agccacctgt tcgtgttcga gcatctcatc 1260gtgaaaagct ttctgattcc
gaagtgctca atcaactccg cgagattgtt aatccaagta 1320atccacttgg aaagtacgag
atgaagaagc aaatcggtgt tggagcatcc ggaactgtat 1380tcgttgctaa tgtggccggc
agcactgatg tggtggctgt gaagagaatg gctttcaaga 1440ctcagccgaa gaaggagatg
ttgctcaccg agattaaggt tatgaagcag tatcgacacc 1500cgaacctcgt caactacatt
gaatcgtatc tggttgatgc tgatgatctt tgggtagtga 1560tggattatct ggaaggtgga
aacttgacag atgtcgttgt gaagactgag ttggacgaag 1620gacaaattgc agcagttttg
caagaatgtc ttaaagcgct tcacttcctt catagacact 1680ccatagtgca ccgagatatc
aagagtgaca acgtgctgct cggcatgaac ggagaggtta 1740agctcaccga tatgggattc
tgtgctcaga ttcagccggg atcgaaaaga gatactgtcg 1800tcggaactcc atattggatg
tcgccggaga tattgaacaa gaagcagtac aactataagg 1860ttgacatttg gtcgctggga
attatggccc tagagatgat tgatggagag ccaccatatt 1920tgagagaaac acctttgaag
gctatctact tgattgctca aaacgggaag ccagagatca 1980agcaacgcga cagactgtct
tcagagttca acaatttcct tgacaagtgt cttgttgttg 2040atccggatca gagagccgat
acaacggagc tcttggcaca tccattcctg aaaaaggcga 2100agccactctc aagcctgatt
ccatacatca gagccgtccg agaaaagtag acccagcttt 2160cttgtacaaa gttggcatta
taagaaagca ttgcttatca atttgttgca acgaacaggt 2220cactatcagt caaaataaaa
tcattatttg ccatccagct gcagctctgg cccgtgtctc 2280aaaatctctg atgttacatt
gcacaagata aaaatatatc atcatgaaca ataaaactgt 2340ctgcttacat aaacagtaat
acaaggggtg ttatgagcca tattcaacgg gaaacgtcga 2400ggccgcgatt aaattccaac
atggatgctg atttatatgg gtataaatgg gctcgcgata 2460atgtcgggca atcaggtgcg
acaatctatc gcttgtatgg gaagcccgat gcgccagagt 2520tgtttctgaa acatggcaaa
ggtagcgttg ccaatgatgt tacagatgag atggtcagac 2580taaactggct gacggaattt
atgcctcttc cgaccatcaa gcattttatc cgtactcctg 2640atgatgcatg gttactcacc
actgcgatcc ccggaaaaac agcattccag gtattagaag 2700aatatcctga ttcaggtgaa
aatattgttg atgcgctggc agtgttcctg cgccggttgc 2760attcgattcc tgtttgtaat
tgtcctttta acagcgatcg cgtatttcgt ctcgctcagg 2820cgcaatcacg aatgaataac
ggtttggttg atgcgagtga ttttgatgac gagcgtaatg 2880gctggcctgt tgaacaagtc
tggaaagaaa tgcataaact tttgccattc tcaccggatt 2940cagtcgtcac tcatggtgat
ttctcacttg ataaccttat ttttgacgag gggaaattaa 3000taggttgtat tgatgttgga
cgagtcggaa tcgcagaccg ataccaggat cttgccatcc 3060tatggaactg cctcggtgag
ttttctcctt cattacagaa acggcttttt caaaaatatg 3120gtattgataa tcctgatatg
aataaattgc agtttcattt gatgctcgat gagtttttct 3180aatcagaatt ggttaattgg
ttgtaacact ggcagagcat tacgctgact tgacgggacg 3240gcgcaagctc atgaccaaaa
tcccttaacg tgagttttcg ttccactgag cgtcagaccc 3300cgtagaaaag atcaaaggat
cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt 3360gcaaacaaaa aaaccaccgc
taccagcggt ggtttgtttg ccggatcaag agctaccaac 3420tctttttccg aaggtaactg
gcttcagcag agcgcagata ccaaatactg tccttctagt 3480gtagccgtag ttaggccacc
acttcaagaa ctctgtagca ccgcctacat acctcgctct 3540gctaatcctg ttaccagtgg
ctgctgccag tggcgataag tcgtgtctta ccgggttgga 3600ctcaagacga tagttaccgg
ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac 3660acagcccagc ttggagcgaa
cgacctacac cgaactgaga tacctacagc gtgagctatg 3720agaaagcgcc acgcttcccg
aagggagaaa ggcggacagg tatccggtaa gcggcagggt 3780cggaacagga gagcgcacga
gggagcttcc agggggaaac gcctggtatc tttatagtcc 3840tgtcgggttt cgccacctct
gacttgagcg tcgatttttg tgatgctcgt caggggggcg 3900gagcctatgg aaaaacgcca
gcaacgcggc ctttttacgg ttcctggcct tttgctggcc 3960ttttgctcac atgttctttc
ctgcgttatc ccctgattct gtggataacc gtattaccgc 4020tagccaggaa gagtttgtag
aaacgcaaaa aggccatccg tcaggatggc cttctgctta 4080gtttgatgcc tggcagttta
tggcgggcgt cctgcccgcc accctccggg ccgttgcttc 4140acaacgttca aatccgctcc
cggcggattt gtcctactca ggagagcgtt caccgacaaa 4200caacagataa aacgaaaggc
ccagtcttcc gactgagcct ttcgttttat ttgatgcctg 4260gcagttccct actctcgc
4278

SEQ ID NO. 16
Length: 4207
Type: DNA
Organism: Artificial Sequence
Other information: Description of Artificial Sequence: Vector
Sequence 16:
gttaacgcta gcatggatct
cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga
tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctcaa aaatgtcaac
ttcaaaaagt tccaaggtgc gaatacggaa tttcatcggg 180cgaatcttct ctcccagcga
taaagacaag gatcgagacg atgagatgaa gccatcctcg 240tccgcaatgg atattagtca
gccatataac acagtgcatc gagtccacgt tggatacgac 300ggccagaagt tcagcggact
gccgcaacca tggatggata ttcttctccg agacattagt 360cttgccgatc agaagaagga
tccgaacgcg gtggtgactg cgttgaagtt ctacgcacaa 420tcaatgaagg agaacgagaa
gacgaaattc atgacgacga atagtgtttt cacgaatagc 480gatgacgatg atgtggacgt
tcagttgacc ggacaagtca cggaacattt gaggaatttg 540cagtgtagta atggttccgc
aacttcccca tctacatcag tgtcagcttc atcttcttct 600gctcgtccac tgacaaatgg
aaataatcat ctttccacgg cgtcgtctac cgacacatct 660ctctcattat cggaaaggaa
taacgttccg tctccagctc cagttccata tagtgaaagt 720gctccacaac tgaaaacatt
caccggagag actccaaaac tgcatccacg atctccgttc 780ccgcctcaac cgccagttct
tccgcaacga agcaaaaccg catcggcagt ggcgacgacg 840acgacgaatc cgacgacttc
gaatggagca ccaccaccag ttcctggatc gaaaggaccc 900ccggtgccac cgaaaccatc
gcatctgaaa atcgcatcgt cgacagtatc ctcgggatgc 960tcgtctccac aacagtattc
gtctgctcga tccgttggta actcgctctc caacggcagt 1020gttgtctcca caacatcgtc
agatggtgat gtgcaattgt cgaataagga aaattcgaat 1080gacaaatcag ttggagacaa
gaatgggaac accaccacaa acaaaacgac cgtcgaacca 1140cctccaccag aagagccacc
tgttcgtgtt cgagcatctc atcgtgaaaa gctttctgat 1200tccgaagtgc tcaatcaact
ccgcgagatt gttaatccaa gtaatccact tggaaagtac 1260gagatgaaga agcaaatcgg
tgttggagca tccggaactg tattcgttgc taatgtggcc 1320ggcagcactg atgtggtggc
tgtgaagaga atggctttca agactcagcc gaagaaggag 1380atgttgctca ccgagattaa
ggttatgaag cagtatcgac acccgaacct cgtcaactac 1440attgaatcgt atctggttga
tgctgatgat ctttgggtag tgatggatta tctggaaggt 1500ggaaacttga cagatgtcgt
tgtgaagact gagttggacg aaggacaaat tgcagcagtt 1560ttgcaagaat gtcttaaagc
gcttcacttc cttcatagac actccatagt gcaccgagat 1620atcaagagtg acaacgtgct
gctcggcatg aacggagagg ttaagctcac cgatatggga 1680ttctgtgctc agattcagcc
gggatcgaaa agttgtagag atactgtcgt cggaactcca 1740tattggatgt cgccggagat
attgaacaag aagcagtaca actataaggt tgacatttgg 1800tcgctgggaa ttatggccct
agagatgatt gatggagagc caccatattt gagagaaaca 1860cctttgaagg ctatctactt
gattgctcaa aacgggaagc cagagatcaa gcaacgcgac 1920agactgtctt cagagttcaa
caatttcctt gacaagtgtc ttgttgttga tccggatcag 1980agagccgata caacggagct
cttggcacat ccattcctga aaaaggcgaa gccactctca 2040agcctgattc catacatcag
agccgtccga gaaaagtaga cccagctttc ttgtacaaag 2100ttggcattat aagaaagcat
tgcttatcaa tttgttgcaa cgaacaggtc actatcagtc 2160aaaataaaat cattatttgc
catccagctg cagctctggc ccgtgtctca aaatctctga 2220tgttacattg cacaagataa
aaatatatca tcatgaacaa taaaactgtc tgcttacata 2280aacagtaata caaggggtgt
tatgagccat attcaacggg aaacgtcgag gccgcgatta 2340aattccaaca tggatgctga
tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 2400tcaggtgcga caatctatcg
cttgtatggg aagcccgatg cgccagagtt gtttctgaaa 2460catggcaaag gtagcgttgc
caatgatgtt acagatgaga tggtcagact aaactggctg 2520acggaattta tgcctcttcc
gaccatcaag cattttatcc gtactcctga tgatgcatgg 2580ttactcacca ctgcgatccc
cggaaaaaca gcattccagg tattagaaga atatcctgat 2640tcaggtgaaa atattgttga
tgcgctggca gtgttcctgc gccggttgca ttcgattcct 2700gtttgtaatt gtccttttaa
cagcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 2760atgaataacg gtttggttga
tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 2820gaacaagtct ggaaagaaat
gcataaactt ttgccattct caccggattc agtcgtcact 2880catggtgatt tctcacttga
taaccttatt tttgacgagg ggaaattaat aggttgtatt 2940gatgttggac gagtcggaat
cgcagaccga taccaggatc ttgccatcct atggaactgc 3000ctcggtgagt tttctccttc
attacagaaa cggctttttc aaaaatatgg tattgataat 3060cctgatatga ataaattgca
gtttcatttg atgctcgatg agtttttcta atcagaattg 3120gttaattggt tgtaacactg
gcagagcatt acgctgactt gacgggacgg cgcaagctca 3180tgaccaaaat cccttaacgt
gagttttcgt tccactgagc gtcagacccc gtagaaaaga 3240tcaaaggatc ttcttgagat
cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa 3300aaccaccgct accagcggtg
gtttgtttgc cggatcaaga gctaccaact ctttttccga 3360aggtaactgg cttcagcaga
gcgcagatac caaatactgt ccttctagtg tagccgtagt 3420taggccacca cttcaagaac
tctgtagcac cgcctacata cctcgctctg ctaatcctgt 3480taccagtggc tgctgccagt
ggcgataagt cgtgtcttac cgggttggac tcaagacgat 3540agttaccgga taaggcgcag
cggtcgggct gaacgggggg ttcgtgcaca cagcccagct 3600tggagcgaac gacctacacc
gaactgagat acctacagcg tgagctatga gaaagcgcca 3660cgcttcccga agggagaaag
gcggacaggt atccggtaag cggcagggtc ggaacaggag 3720agcgcacgag ggagcttcca
gggggaaacg cctggtatct ttatagtcct gtcgggtttc 3780gccacctctg acttgagcgt
cgatttttgt gatgctcgtc aggggggcgg agcctatgga 3840aaaacgccag caacgcggcc
tttttacggt tcctggcctt ttgctggcct tttgctcaca 3900tgttctttcc tgcgttatcc
cctgattctg tggataaccg tattaccgct agccaggaag 3960agtttgtaga aacgcaaaaa
ggccatccgt caggatggcc ttctgcttag tttgatgcct 4020ggcagtttat ggcgggcgtc
ctgcccgcca ccctccgggc cgttgcttca caacgttcaa 4080atccgctccc ggcggatttg
tcctactcag gagagcgttc accgacaaac aacagataaa 4140acgaaaggcc cagtcttccg
actgagcctt tcgttttatt tgatgcctgg cagttcccta 4200ctctcgc
4207

SEQ ID NO. 17
Length: 4066
Type: DNA
Organism: Artificial Sequence
Other information: Description of Artificial Sequence: Vector
Sequence 17:
gttaacgcta gcatggatct
cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga
tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctcaa aaatgtcaac
ttcaaaaagt tccaaggtgc gaatacggaa tttcatcggg 180cgaatcttct ctcccagcga
taaagacaag gatcgagacg atgagatgaa gccatcctcg 240tccgcaatgg atattagtca
gccatataac acagtgcatc gagtccacgt tggatacgac 300ggccagaagt tcagcggact
gccgcaacca tggatggata ttcttctccg agacattagc 360tatttcagtc ttgccgatca
gaagaaggat ccgaacgcgg tggtgactgc gttgaagttc 420tacgcacaat caatgaagga
gaacgagaag acgaaattca tgacgacgaa tagtgttttc 480acgaatagcg atgacgatga
tgtggacgtt cagttgaccg gacaagtcac ggaacatttg 540aggaatttgc agtgtagtaa
tggttccgca acttccccat ctacatcagt gtcagcttca 600tcttcttctg ctcgtccact
gacaaatgga aataatcatc tttccacggc gtcgtctacc 660gacacatctc tctcattatc
ggaaaggaat aacgttccgt ctccagctcc agttccatat 720agtgaaagtg ctccacaact
gaaaacattc accggagaga ctccaaaact gcatccacga 780tctccgttcc cgcctcaacc
gccagttctt ccgcaacgaa gcaaaaccgc atcggcagtg 840gcgacgacga cgacgaatcc
gacgacttcg aatggagcac caccaccagt tcctggatcg 900aaaggacccc cggtgccacc
gaaaccatcg aaggaaaatt cgaatgacaa atcagttgga 960gacaagaatg ggaacaccac
cacaaacaaa acgaccgtcg aaccacctcc accagaagag 1020ccacctgttc gtgttcgagc
atctcatcgt gaaaagcttt ctgattccga agtgctcaat 1080caactccgcg agattgttaa
tccaagtaat ccacttggaa agtacgagat gaagaagcaa 1140atcggtgttg gagcatccgg
aactgtattc gttgctaatg tggccggcag cactgatgtg 1200gtggctgtga agagaatggc
tttcaagact cagccgaaga aggagatgtt gctcaccgag 1260attaaggtta tgaagcagta
tcgacacccg aacctcgtca actacattga atcgtatctg 1320gttgatgctg atgatctttg
ggtagtgatg gattatctgg aaggtggaaa cttgacagat 1380gtcgttgtga agactgagtt
ggacgaagga caaattgcag cagttttgca agaatgtctt 1440aaagcgcttc acttccttca
tagacactcc atagtgcacc gagatatcaa gagtgacaac 1500gtgctgctcg gcatgaacgg
agaggttaag ctcaccgata tgggattctg tgctcagatt 1560cagccgggat cgaaaagaga
tactgtcgtc ggaactccat attggatgtc gccggagata 1620ttgaacaaga agcagtacaa
ctataaggtt gacatttggt cgctgggaat tatggctcta 1680gagatgattg atggagagcc
accatatttg agagaaacac ctttgaaggc tatctacttg 1740attgctcaaa acgggaagcc
agagatcaag caacgcgaca gactgtcttc agagttcaac 1800aatttccttg acaagtgtct
tgttgttgat ccggatcaga gagccgatac aacggagctc 1860ttggcacatc cattcctgaa
aaaggcgaag ccactctcaa gcctgattcc atacatcaga 1920gccgtccgag aaaagtagac
ccagctttct tgtacaaagt tggcattata agaaagcatt 1980gcttatcaat ttgttgcaac
gaacaggtca ctatcagtca aaataaaatc attatttgcc 2040atccagctgc agctctggcc
cgtgtctcaa aatctctgat gttacattgc acaagataaa 2100aatatatcat catgaacaat
aaaactgtct gcttacataa acagtaatac aaggggtgtt 2160atgagccata ttcaacggga
aacgtcgagg ccgcgattaa attccaacat ggatgctgat 2220ttatatgggt ataaatgggc
tcgcgataat gtcgggcaat caggtgcgac aatctatcgc 2280ttgtatggga agcccgatgc
gccagagttg tttctgaaac atggcaaagg tagcgttgcc 2340aatgatgtta cagatgagat
ggtcagacta aactggctga cggaatttat gcctcttccg 2400accatcaagc attttatccg
tactcctgat gatgcatggt tactcaccac tgcgatcccc 2460ggaaaaacag cattccaggt
attagaagaa tatcctgatt caggtgaaaa tattgttgat 2520gcgctggcag tgttcctgcg
ccggttgcat tcgattcctg tttgtaattg tccttttaac 2580agcgatcgcg tatttcgtct
cgctcaggcg caatcacgaa tgaataacgg tttggttgat 2640gcgagtgatt ttgatgacga
gcgtaatggc tggcctgttg aacaagtctg gaaagaaatg 2700cataaacttt tgccattctc
accggattca gtcgtcactc atggtgattt ctcacttgat 2760aaccttattt ttgacgaggg
gaaattaata ggttgtattg atgttggacg agtcggaatc 2820gcagaccgat accaggatct
tgccatccta tggaactgcc tcggtgagtt ttctccttca 2880ttacagaaac ggctttttca
aaaatatggt attgataatc ctgatatgaa taaattgcag 2940tttcatttga tgctcgatga
gtttttctaa tcagaattgg ttaattggtt gtaacactgg 3000cagagcatta cgctgacttg
acgggacggc gcaagctcat gaccaaaatc ccttaacgtg 3060agttttcgtt ccactgagcg
tcagaccccg tagaaaagat caaaggatct tcttgagatc 3120ctttttttct gcgcgtaatc
tgctgcttgc aaacaaaaaa accaccgcta ccagcggtgg 3180tttgtttgcc ggatcaagag
ctaccaactc tttttccgaa ggtaactggc ttcagcagag 3240cgcagatacc aaatactgtc
cttctagtgt agccgtagtt aggccaccac ttcaagaact 3300ctgtagcacc gcctacatac
ctcgctctgc taatcctgtt accagtggct gctgccagtg 3360gcgataagtc gtgtcttacc
gggttggact caagacgata gttaccggat aaggcgcagc 3420ggtcgggctg aacggggggt
tcgtgcacac agcccagctt ggagcgaacg acctacaccg 3480aactgagata cctacagcgt
gagctatgag aaagcgccac gcttcccgaa gggagaaagg 3540cggacaggta tccggtaagc
ggcagggtcg gaacaggaga gcgcacgagg gagcttccag 3600ggggaaacgc ctggtatctt
tatagtcctg tcgggtttcg ccacctctga cttgagcgtc 3660gatttttgtg atgctcgtca
ggggggcgga gcctatggaa aaacgccagc aacgcggcct 3720ttttacggtt cctggccttt
tgctggcctt ttgctcacat gttctttcct gcgttatccc 3780ctgattctgt ggataaccgt
attaccgcta gccaggaaga gtttgtagaa acgcaaaaag 3840gccatccgtc aggatggcct
tctgcttagt ttgatgcctg gcagtttatg gcgggcgtcc 3900tgcccgccac cctccgggcc
gttgcttcac aacgttcaaa tccgctcccg gcggatttgt 3960cctactcagg agagcgttca
ccgacaaaca acagataaaa cgaaaggccc agtcttccga 4020ctgagccttt cgttttattt
gatgcctggc agttccctac tctcgc 4066

SEQ ID NO. 18
Length: 4207
Type: DNA
Organism: Artificial Sequence
Other information: Description of Artificial Sequence: Vector
Sequence 18:
gttaacgcta gcatggatct
cgggccccaa ataatgattt tattttgact gatagtgacc 60tgttcgttgc aacaaattga
tgagcaatgc ttttttataa tgccaacttt gtacaaaaaa 120gcaggctcaa aaatgtcaac
ttcaaaaagt tccaaggtgc gaatacggaa tttcgtcggg 180cgaatcttct ctcccagcga
taaagacaag gatcgagacg atgagatgaa gccatcctcg 240tccgcaatgg atattagtca
gccatataac acagtgcatc gagtccacgt tggatacgac 300ggccagaagt tcagcggact
gccgcaacca tggatggata ttcttctccg agacattagt 360cttgccgatc agaagaagga
tccgaacgcg gtggtgactg cgttgaagtt ctacgcacaa 420tcaatgaagg agaacgagaa
gacgaaattc atgacgacga atagtgtttt cacgaatagc 480gatgacgatg atgtggacgt
tcagttgacc ggacaagtca cggaacattt gaggaatttg 540cagtgtagta atggttccgc
aacttcccca tctacatcag tgtcagcttc atcttcttct 600gctcgtccac tgacaaatgg
aaataatcat ctttccacgg cgtcgtctac cgacacatct 660ctctcattat cggaaaggaa
taacgttccg tctccagctc cagttccata tagtgaaagt 720gctccacaac tgaaaacatt
caccggagag actccaaaac tgcatccacg atctccgttc 780ccgcctcaac cgccagttct
tccgcaacga agcaaaaccg catcggcagt ggcgacgacg 840acgacgaatc cgacgacttc
gaatggagca ccaccaccag ttcctggatc gaaaggaccc 900ccggtgccac cgaaaccatc
gcatctgaaa atcgcatcgt cgacagtatc ctcgggatgc 960tcgtctccac aacagtattc
gtctgctcga tccgttggta actcgctctc caacggcagt 1020gttgtctcca caacatcgtc
agatggtgat gtgcaattgt cgaataagga aaattcgaat 1080gacaaatcag ttggagacaa
gaatgggaac accaccacaa acaaaacgac cgtcgaacca 1140cctccaccag aagagccacc
tgttcgtgtt cgagcatctc atcgtgaaaa gctttctgat 1200tccgaagtgc tcaatcaact
ccgcgagatt gttaatccaa gtaatccact tggaaagtac 1260gagatgaaga agcaaatcgg
tgttggagca tccggaactg tattcgttgc taatgtggcc 1320ggcagcactg atgtggtggc
tgtgaagaga atggctttca agactcagcc gaagaaggag 1380atgttgctca ccgagattaa
ggttatgaag cagtatcgac acccgaacct cgtcaactac 1440attgaatcgt atctggttga
tgctgatgat ctttgggtag tgatggatta tctggaaggt 1500ggaaacttga cagatgtcgt
tgtgaagact gagttggacg aaggacaaat tgcagcagtt 1560ttgcaagaat gtcttaaagc
gcttcacttc cttcatagac actccatagt gcaccgagat 1620atcaagagtg acaacgtgct
gctcggcatg aacggagagg ttaagctcac cgatatggga 1680ttctgtgctc agattcagcc
gggatcgaaa agttgtagag atactgtcgt cggaactcca 1740tattggatgt cgccggagat
attgaacaag aagcagtaca actataaggt tgacatttgg 1800tcgctgggaa ttatggctct
agagatgatt gatggagagc caccatattt gagagaaaca 1860cctttgaagg ctatctactt
gattgctcaa aacgggaagc cagagatcaa gcaacgcgac 1920agactgtctt cagagttcaa
caatttcctt gacaagtgtc ttgttgttga tccggatcag 1980agagccgata caacggagct
cttggcacat ccattcctga aaaaggcgaa gccactctca 2040agcctgattc catacatcag
agccgtccga gaaaagtaga cccagctttc ttgtacaaag 2100ttggcattat aagaaagcat
tgcttatcaa tttgttgcaa cgaacaggtc actatcagtc 2160aaaataaaat cattatttgc
catccagctg cagctctggc ccgtgtctca aaatctctga 2220tgttacattg cacaagataa
aaatatatca tcatgaacaa taaaactgtc tgcttacata 2280aacagtaata caaggggtgt
tatgagccat attcaacggg aaacgtcgag gccgcgatta 2340aattccaaca tggatgctga
tttatatggg tataaatggg ctcgcgataa tgtcgggcaa 2400tcaggtgcga caatctatcg
cttgtatggg aagcccgatg cgccagagtt gtttctgaaa 2460catggcaaag gtagcgttgc
caatgatgtt acagatgaga tggtcagact aaactggctg 2520acggaattta tgcctcttcc
gaccatcaag cattttatcc gtactcctga tgatgcatgg 2580ttactcacca ctgcgatccc
cggaaaaaca gcattccagg tattagaaga atatcctgat 2640tcaggtgaaa atattgttga
tgcgctggca gtgttcctgc gccggttgca ttcgattcct 2700gtttgtaatt gtccttttaa
cagcgatcgc gtatttcgtc tcgctcaggc gcaatcacga 2760atgaataacg gtttggttga
tgcgagtgat tttgatgacg agcgtaatgg ctggcctgtt 2820gaacaagtct ggaaagaaat
gcataaactt ttgccattct caccggattc agtcgtcact 2880catggtgatt tctcacttga
taaccttatt tttgacgagg ggaaattaat aggttgtatt 2940gatgttggac gagtcggaat
cgcagaccga taccaggatc ttgccatcct atggaactgc 3000ctcggtgagt tttctccttc
attacagaaa cggctttttc aaaaatatgg tattgataat 3060cctgatatga ataaattgca
gtttcatttg atgctcgatg agtttttcta atcagaattg 3120gttaattggt tgtaacactg
gcagagcatt acgctgactt gacgggacgg cgcaagctca 3180tgaccaaaat cccttaacgt
gagttttcgt tccactgagc gtcagacccc gtagaaaaga 3240tcaaaggatc ttcttgagat
cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa 3300aaccaccgct accagcggtg
gtttgtttgc cggatcaaga gctaccaact ctttttccga 3360aggtaactgg cttcagcaga
gcgcagatac caaatactgt ccttctagtg tagccgtagt 3420taggccacca cttcaagaac
tctgtagcac cgcctacata cctcgctctg ctaatcctgt 3480taccagtggc tgctgccagt
ggcgataagt cgtgtcttac cgggttggac tcaagacgat 3540agttaccgga taaggcgcag
cggtcgggct gaacgggggg ttcgtgcaca cagcccagct 3600tggagcgaac gacctacacc
gaactgagat acctacagcg tgagctatga gaaagcgcca 3660cgcttcccga agggagaaag
gcggacaggt atccggtaag cggcagggtc ggaacaggag 3720agcgcacgag ggagcttcca
gggggaaacg cctggtatct ttatagtcct gtcgggtttc 3780gccacctctg acttgagcgt
cgatttttgt gatgctcgtc aggggggcgg agcctatgga 3840aaaacgccag caacgcggcc
tttttacggt tcctggcctt ttgctggcct tttgctcaca 3900tgttctttcc tgcgttatcc
cctgattctg tggataaccg tattaccgct agccaggaag 3960agtttgtaga aacgcaaaaa
ggccatccgt caggatggcc ttctgcttag tttgatgcct 4020ggcagtttat ggcgggcgtc
ctgcccgcca ccctccgggc cgttgcttca caacgttcaa 4080atccgctccc ggcggatttg
tcctactcag gagagcgttc accgacaaac aacagataaa 4140acgaaaggcc cagtcttccg
actgagcctt tcgttttatt tgatgcctgg cagttcccta 4200ctctcgc
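4207

SEQ ID NO. 19
Length: 29
Type: DNA
Organism: Artificial Sequence
Other information: primer
Sequence 19:
ggggacaagt ttgtacaaaa aagcaggct 29

SEQ ID NO. 20
Length: 29
Type: DNA
Organism: Artificial Sequence
Other information: primer
Sequence 20:
ggggaccact ttgtacaaga aagctgggt 29

SEQ ID NO. 21
Length: 40
Type: DNA
Organism: Artificial Sequence
Other information: primer
Sequence 21:
aaaaagcagg ctcaaaaatg tttcaaaata gtccgatgat 40

SEQ ID NO. 22
Length: 33
Type: DNA
Organism: Artificial Sequence
Other information: primer
Sequence 22:
agaaagctgg gtctactttt ctcggacggc tct 33

SEQ ID NO. 23
Length: 34
Type: DNA
Organism: Artificial Sequence
Other information: primer
Sequence 23:
aaaaagcagg ctggtttaat tacccaagtt tgag 34

SEQ ID NO. 24
Length: 33
Type: DNA
Organism: Artificial Sequence
Other information: primer
Sequence 24:
agaaagctgg gtctactttt ctcggacggc tct 33

SEQ ID NO. 25
Length: 41
Type: DNA
Organism: Artificial Sequence
Other information: primer
Sequence 25:
aaaaagcagg ctcaaaaatg tcaacttcaa aaagttccaa g 41

SEQ ID NO. 26
Length: 4447
Type: DNA
Organism: Artificial Sequence
Other information: Description of Artificial Sequence: Vector
Sequence 26:
aacctggctt atcgaaatta atacgactca ctatagggag accggcagat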
ctgatatcac 60aagtttgtac aaaaaagcag gctcaaaaat gaaagctttc tcatcgtatg
atgagaaacc 120accagcacca ccaattcgtt tcagcagctc ggcaacgagg gagaatcagg
tcgtcggatt 180gaagccattg cccaaagagc cagaagcaac caagaaaaag aagacgatgc
ctaacccgtt 240catgaaaaag aacaaagaca aaaaggaagc gtcagaaaaa ccagtgatct
ctcgaccgag 300caatttcgaa cacacaattc atgtcggata tgacccaaaa accggcgaat
ttacgggaat 360gcctgaagca tgggcacgtc ttctcacaga ctcacagatc tcaaaacaag
agcagcaaca 420gaatcctcag gcagtgttgg acgcgctcaa atactacaca caaggcgaaa
gcagcggcca 480gaagtggttg cagtacgata tgaatgacgc accttctcgg acgccatcat
acggactgaa 540accgcaacca tatagcacat catccctgcc gtatcatggc aataaaattc
aggatccaag 600aaagatgaat ccaatgacaa ccagtacaag tagtgcgggg tataacagca
agcaaggagt 660tcctccgacg acgtttagtg taaatgagaa tagatcgagt atgccaccga
gttatgcacc 720gccaccggtc ccccatggtg aaactcctgc tgatattgtt cctcccgcta
tccctgatag 780gccggcaagg acgttgagta tttacacaaa accgaaagag gaggaagaaa
aaattccaga 840cctttcaaaa ggacaatttg gtgtacaggc cagaggtcaa aaagctaaga
aaaagatgac 900tgacgctgaa gtgctgacta agctccgtac cattgtgtct atcggaaatc
cagatcgaaa 960atatagaaaa gttgataaaa tcggctcagg tgcatctggt tctgtgtaca
ccgctattga 1020aattagtacc gaagcggagg tggctatcaa gcagatgaac ctgaaggatc
aaccaaagaa 1080ggaattgatc attaatgaga ttttggtgat gcgtgagaat aagcatgcaa
atattgtaaa 1140ttatttggat tcgtatttgg tgtgcgatga attatgggta gtgatggagt
atcttgccgg 1200tggatcattg actgatgttg tcacggagtg ccagatggag gatggaatta
ttgcagctgt 1260ttgcagagaa gttcttcaag cgcttgaatt cctccacagc cgccacgtca
ttcacagaga 1320tattaaatct gacaatattc ttttgggaat ggatggttcg gtgaaattga
ccgactttgg 1380attctgtgct cagctctcgc cggagcaaag aaaacgcacg acaatggtcg
gaactccata 1440ctggatggcg ccggaagtgg tgacccgcaa acaatacgga cccaaggttg
atgtgtggtc 1500cttgggaatc atggcgattg agatggtcga aggagaaccg ccatatttga
atgaaaatcc 1560actcagggct atctatctca ttgctacaaa tggcaaaccc gacttccctg
gaagagattc 1620catgactttg ttgttcaagg actttgtcga ctctgcgttg gaagtacaag
ttgaaaatcg 1680atggtcggca agccaactcc ttacgcatcc attcctccga tgcgccaaac
cgcttgcttc 1740actgtactac ttaatcgttg cggcgaagaa gagcatcgcc gaagctagca
actcataaac 1800ccagctttct tgtacaaagt ggtgatatca agcttatcga taccgtcgac
ctcgaggggg 1860ggcccggtac ccaattcgcc ctatagtgag tcgtattacg cgcgctcact
ggccgtcgtt 1920ttacaacgtc gtgactggga aaaccctggc gttacccaac ttaatcgcct
tgcagcacat 1980ccccctttcg ccagctggcg taatagcgaa gaggcccgca ccgatcgccc
ttcccaacag 2040ttgcgcagcc tgaatggcga atgggacgcg ccctgtagcg gcgcattaag
cgcggcgggt 2100gtggtggtta cgcgcagcgt gaccgctaca cttgccagcg ccctagcgcc
cgctcctttc 2160gctttcttcc cttcctttct cgccacgttc gccggctttc cccgtcaagc
tctaaatcgg 2220gggctccctt tagggttccg atttagtgct ttacggcacc tcgaccccaa
aaaacttgat 2280tagggtgatg gttcacgtag tgggccatcg ccctgataga cggtttttcg
ccctttgacg 2340ttggagtcca cgttctttaa tagtggactc ttgttccaaa ctggaacaac
actcaaccct 2400atctcggtct attcttttga tttataaggg attttgccga tttcggccta
ttggttaaaa 2460aatgagctga tttaacaaaa atttaacgcg aattttaaca aaatattaac
gcttacaatt 2520taggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttattt
ttctaaatac 2580attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa
taatattgaa 2640aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt
tttgcggcat 2700tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat
gctgaagatc 2760agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag
atccttgaga 2820gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg
ctatgtggcg 2880cggtattatc ccgtattgac gccgggcaag agcaactcgg tcgccgcata
cactattctc 2940agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggat
ggcatgacag 3000taagagaatt atgcagtgct gccataacca tgagtgataa cactgcggcc
aacttacttc 3060tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatg
ggggatcatg 3120taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaac
gacgagcgtg 3180acaccacgat gcctgtagca atggcaacaa cgttgcgcaa actattaact
ggcgaactac 3240ttactctagc ttcccggcaa caattaatag actggatgga ggcggataaa
gttgcaggac 3300cacttctgcg ctcggccctt ccggctggct ggtttattgc tgataaatct
ggagccggtg 3360agcgtgggtc tcgcggtatc attgcagcac tggggccaga tggtaagccc
tcccgtatcg 3420tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaataga
cagatcgctg 3480agataggtgc ctcactgatt aagcattggt aactgtcaga ccaagtttac
tcatatatac 3540tttagattga tttaaaactt catttttaat ttaaaaggat ctaggtgaag
atcctttttg 3600ataatctcat gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg
tcagaccccg 3660tagaaaagat caaaggatct tcttgagatc ctttttttct gcgcgtaatc
tgctgcttgc 3720aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc ggatcaagag
ctaccaactc 3780tttttccgaa ggtaactggc ttcagcagag cgcagatacc aaatactgtc
cttctagtgt 3840agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac
ctcgctctgc 3900taatcctgtt accagtggct gctgccagtg gcgataagtc gtgtcttacc
gggttggact 3960caagacgata gttaccggat aaggcgcagc ggtcgggctg aacggggggt
tcgtgcacac 4020agcccagctt ggagcgaacg acctacaccg aactgagata cctacagcgt
gagctatgag 4080aaagcgccac gcttcccgaa gggagaaagg cggacaggta tccggtaagc
ggcagggtcg 4140gaacaggaga gcgcacgagg gagcttccag ggggaaacgc ctggtatctt
tatagtcctg 4200tcgggtttcg ccacctctga cttgagcgtc gatttttgtg atgctcgtca
ggggggcgga 4260gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt
tgctggcctt 4320ttgctcacat gttctttcct gcgttatccc ctgattctgt ggataaccgt
attaccgcct 4380ttgagtgagc tgataccgct cgccgcagcc gaacgaccga gcgcagcgag
tcagtgagcg 4440aggaagc
4447

SEQ ID NO. 27
Length: 3533
Type: DNA
Organism: Artificial Sequence
Other information: Description of Artificial Sequence: Vector
Sequence 27:
aacctggctt atcgaaatta atacgactca ctatagggag accggcagat
ctgatatcac 60aagtttgtac aaaaaagcag gctcaaatcg gtgttggagc atccggaact
gtattcgttg 120ctaatgtggc cggcagcact gatgtggtgg ctgtgaagag aatggctttc
aagactcagc 180cgaagaagga gatgttgctc accgagatta aggttatgaa gcagtatcga
cacccgaacc 240tcgtcaacta cattgaatcg tatctggttg atgctgatga tctttgggta
gtgatggatt 300atctggaagg tggaaacttg acagatgtcg ttgtgaagac tgagttggac
gaaggacaaa 360ttgcagcagt tttgcaagaa tgtcttaaag cgcttcactt ccttcataga
cactccatag 420tgcaccgaga tatcaagagt gacaacgtgc tgctcggcat gaacggagag
gttaagctca 480ccgatatggg attctgtgct cagattcagc cgggatcgaa aagagatact
gtcgtcggaa 540ctccatattg gatgtcgccg gagatattga acaagaagca gtacaactat
aaggttgaca 600tttggtcgct gggaattatg gctctagaga tgattgatgg agagccacca
tatttgagag 660aaacaccttt gaaggctatc tacttgattg ctcaaaacgg gaagccagag
atcaagcaac 720gcgacagact gtcttcagag ttcaacaatt tccttgacaa gtgtcttgtt
gttgatccgg 780atcagagagc cgatacaacg gagctcttgg cacatccatt cctgaaaaag
gcgaagccac 840tctcaagcct gattccatac atcagagccg tccgagaaaa gtagacccag
ctttcttgta 900caaagtggtg atatcaagct tatcgatacc gtcgacctcg agggggggcc
cggtacccaa 960ttcgccctat agtgagtcgt attacgcgcg ctcactggcc gtcgttttac
aacgtcgtga 1020ctgggaaaac cctggcgtta cccaacttaa tcgccttgca gcacatcccc
ctttcgccag 1080ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc caacagttgc
gcagcctgaa 1140tggcgaatgg gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg
tggttacgcg 1200cagcgtgacc gctacacttg ccagcgccct agcgcccgct cctttcgctt
tcttcccttc 1260ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc
tccctttagg 1320gttccgattt agtgctttac ggcacctcga ccccaaaaaa cttgattagg
gtgatggttc 1380acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg
agtccacgtt 1440ctttaatagt ggactcttgt tccaaactgg aacaacactc aaccctatct
cggtctattc 1500ttttgattta taagggattt tgccgatttc ggcctattgg ttaaaaaatg
agctgattta 1560acaaaaattt aacgcgaatt ttaacaaaat attaacgctt acaatttagg
tggcactttt 1620cggggaaatg tgcgcggaac ccctatttgt ttatttttct aaatacattc
aaatatgtat 1680ccgctcatga gacaataacc ctgataaatg cttcaataat attgaaaaag
gaagagtatg 1740agtattcaac atttccgtgt cgcccttatt cccttttttg cggcattttg
ccttcctgtt 1800tttgctcacc cagaaacgct ggtgaaagta aaagatgctg aagatcagtt
gggtgcacga 1860gtgggttaca tcgaactgga tctcaacagc ggtaagatcc ttgagagttt
tcgccccgaa 1920gaacgttttc caatgatgag cacttttaaa gttctgctat gtggcgcggt
attatcccgt 1980attgacgccg ggcaagagca actcggtcgc cgcatacact attctcagaa
tgacttggtt 2040gagtactcac cagtcacaga aaagcatctt acggatggca tgacagtaag
agaattatgc 2100agtgctgcca taaccatgag tgataacact gcggccaact tacttctgac
aacgatcgga 2160ggaccgaagg agctaaccgc ttttttgcac aacatggggg atcatgtaac
tcgccttgat 2220cgttgggaac cggagctgaa tgaagccata ccaaacgacg agcgtgacac
cacgatgcct 2280gtagcaatgg caacaacgtt gcgcaaacta ttaactggcg aactacttac
tctagcttcc 2340cggcaacaat taatagactg gatggaggcg gataaagttg caggaccact
tctgcgctcg 2400gcccttccgg ctggctggtt tattgctgat aaatctggag ccggtgagcg
tgggtctcgc 2460ggtatcattg cagcactggg gccagatggt aagccctccc gtatcgtagt
tatctacacg 2520acggggagtc aggcaactat ggatgaacga aatagacaga tcgctgagat
aggtgcctca 2580ctgattaagc attggtaact gtcagaccaa gtttactcat atatacttta
gattgattta 2640aaacttcatt tttaatttaa aaggatctag gtgaagatcc tttttgataa
tctcatgacc 2700aaaatccctt aacgtgagtt ttcgttccac tgagcgtcag accccgtaga
aaagatcaaa 2760ggatcttctt gagatccttt ttttctgcgc gtaatctgct gcttgcaaac
aaaaaaacca 2820ccgctaccag cggtggtttg tttgccggat caagagctac caactctttt
tccgaaggta 2880actggcttca gcagagcgca gataccaaat actgtccttc tagtgtagcc
gtagttaggc 2940caccacttca agaactctgt agcaccgcct acatacctcg ctctgctaat
cctgttacca 3000gtggctgctg ccagtggcga taagtcgtgt cttaccgggt tggactcaag
acgatagtta 3060ccggataagg cgcagcggtc gggctgaacg gggggttcgt gcacacagcc
cagcttggag 3120cgaacgacct acaccgaact gagataccta cagcgtgagc tatgagaaag
cgccacgctt 3180cccgaaggga gaaaggcgga caggtatccg gtaagcggca gggtcggaac
aggagagcgc 3240acgagggagc ttccaggggg aaacgcctgg tatctttata gtcctgtcgg
gtttcgccac 3300ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct
atggaaaaac 3360gccagcaacg cggccttttt acggttcctg gccttttgct ggccttttgc
tcacatgttc 3420tttcctgcgt tatcccctga ttctgtggat aaccgtatta ccgcctttga
gtgagctgat 3480accgctcgcc gcagccgaac gaccgagcgc agcgagtcag tgagcgagga
agc 3533
28 4323 DNA Artificial Sequence Description of Artificial Sequence Vector
28 aacctggctt atcgaaatta atacgactca ctatagggag accggcagat
ctgatatcac 60aagtttgtac aaaaaagcag gctcaaatcg gtgttggagc atccggaact
gtattcgttg 120ctaatgtggc cggcagcact gatgtggtgg ctgtgaagag aatggctttc
aagactcagc 180cgaagaagga gatgttgctc accgagatta aggttatgaa gcagtatcga
cacccgaacc 240tcgtcaacta cattgaatcg tatctggttg atgctgatga tctttgggta
gtgatggatt 300atctggaagg tggaaacttg acagatgtcg ttgtgaagac tgagttggac
gaaggacaaa 360ttgcagcagt tttgcaagaa tgtcttaaag cgcttcactt ccttcataga
cactccatag 420tgcaccgaga tatcaagagt gacaacgtgc tgctcggcat gaacggagag
gttaagctca 480ccgatatggg attctgtgct cagattcagc cgggatcgaa aagagatact
gtcgtcggaa 540ctccatattg gatgtcgccg gagatattga acaagaagca gtacaactat
aaggttgaca 600tttggtcgct gggaattatg gctctagaga tgattgatgg agagccacca
tatttgagag 660aaacaccttt gaaggctatc tacttgattg ctcaaaacgg gaagccagag
atcaagcaac 720gcgacagact gtcttcagag ttcaacaatt tccttgacaa gtgtcttgtt
gttgatccgg 780atcagagagc cgatacaacg gagctcttgg cacatccatt cctgaaaaag
gcgaagccac 840tctcaagcct gattccatac atcagagccg tccgagaaaa gtagcaccgc
tattgaaatt 900agtaccgaag cggaggtggc tatcaagcag atgaacctga aggatcaacc
aaagaaggaa 960ttgatcatta atgagatttt ggtgatgcgt gagaataagc atgcaaatat
tgtaaattat 1020ttggattcgt atttggtgtg cgatgaatta tgggtagtga tggagtatct
tgccggtgga 1080tcattgactg atgttgtcac ggagtgccag atggaggatg gaattattgc
agctgtttgc 1140agagaagttc ttcaagcgct tgaattcctc cacagccgcc acgtcattca
cagagatatt 1200aaatctgaca atattctttt gggaatggat ggttcggtga aattgaccga
ctttggattc 1260tgtgctcagc tctcgccgga gcaaagaaaa cgcacgacaa tggtcggaac
tccatactgg 1320atggcgccgg aagtggtgac ccgcaaacaa tacggaccca aggttgatgt
gtggtccttg 1380ggaatcatgg cgattgagat ggtcgaagga gaaccgccat atttgaatga
aaatccactc 1440agggctatct atctcattgc tacaaatggc aaacccgact tccctggaag
agattccatg 1500actttgttgt tcaaggactt tgtcgactct gcgttggaag tacaagttga
aaatcgatgg 1560tcggcaagcc aactccttac gcatccattc ctccgatgcg ccaaaccgct
tgcttcactg 1620tactacttaa tcgttgcggc gaagaagagc atcgccgaag ctagcaactc
ataaacccag 1680ctttcttgta caaagtggtg atatcaagct tatcgatacc gtcgacctcg
agggggggcc 1740cggtacccaa ttcgccctat agtgagtcgt attacgcgcg ctcactggcc
gtcgttttac 1800aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa tcgccttgca
gcacatcccc 1860ctttcgccag ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc
caacagttgc 1920gcagcctgaa tggcgaatgg gacgcgccct gtagcggcgc attaagcgcg
gcgggtgtgg 1980tggttacgcg cagcgtgacc gctacacttg ccagcgccct agcgcccgct
cctttcgctt 2040tcttcccttc ctttctcgcc acgttcgccg gctttccccg tcaagctcta
aatcgggggc 2100tccctttagg gttccgattt agtgctttac ggcacctcga ccccaaaaaa
cttgattagg 2160gtgatggttc acgtagtggg ccatcgccct gatagacggt ttttcgccct
ttgacgttgg 2220agtccacgtt ctttaatagt ggactcttgt tccaaactgg aacaacactc
aaccctatct 2280cggtctattc ttttgattta taagggattt tgccgatttc ggcctattgg
ttaaaaaatg 2340agctgattta acaaaaattt aacgcgaatt ttaacaaaat attaacgctt
acaatttagg 2400tggcactttt cggggaaatg tgcgcggaac ccctatttgt ttatttttct
aaatacattc 2460aaatatgtat ccgctcatga gacaataacc ctgataaatg cttcaataat
attgaaaaag 2520gaagagtatg agtattcaac atttccgtgt cgcccttatt cccttttttg
cggcattttg 2580ccttcctgtt tttgctcacc cagaaacgct ggtgaaagta aaagatgctg
aagatcagtt 2640gggtgcacga gtgggttaca tcgaactgga tctcaacagc ggtaagatcc
ttgagagttt 2700tcgccccgaa gaacgttttc caatgatgag cacttttaaa gttctgctat
gtggcgcggt 2760attatcccgt attgacgccg ggcaagagca actcggtcgc cgcatacact
attctcagaa 2820tgacttggtt gagtactcac cagtcacaga aaagcatctt acggatggca
tgacagtaag 2880agaattatgc agtgctgcca taaccatgag tgataacact gcggccaact
tacttctgac 2940aacgatcgga ggaccgaagg agctaaccgc ttttttgcac aacatggggg
atcatgtaac 3000tcgccttgat cgttgggaac cggagctgaa tgaagccata ccaaacgacg
agcgtgacac 3060cacgatgcct gtagcaatgg caacaacgtt gcgcaaacta ttaactggcg
aactacttac 3120tctagcttcc cggcaacaat taatagactg gatggaggcg gataaagttg
caggaccact 3180tctgcgctcg gcccttccgg ctggctggtt tattgctgat aaatctggag
ccggtgagcg 3240tgggtctcgc ggtatcattg cagcactggg gccagatggt aagccctccc
gtatcgtagt 3300tatctacacg acggggagtc aggcaactat ggatgaacga aatagacaga
tcgctgagat 3360aggtgcctca ctgattaagc attggtaact gtcagaccaa gtttactcat
atatacttta 3420gattgattta aaacttcatt tttaatttaa aaggatctag gtgaagatcc
tttttgataa 3480tctcatgacc aaaatccctt aacgtgagtt ttcgttccac tgagcgtcag
accccgtaga 3540aaagatcaaa ggatcttctt gagatccttt ttttctgcgc gtaatctgct
gcttgcaaac 3600aaaaaaacca ccgctaccag cggtggtttg tttgccggat caagagctac
caactctttt 3660tccgaaggta actggcttca gcagagcgca gataccaaat actgtccttc
tagtgtagcc 3720gtagttaggc caccacttca agaactctgt agcaccgcct acatacctcg
ctctgctaat 3780cctgttacca gtggctgctg ccagtggcga taagtcgtgt cttaccgggt
tggactcaag 3840acgatagtta ccggataagg cgcagcggtc gggctgaacg gggggttcgt
gcacacagcc 3900cagcttggag cgaacgacct acaccgaact gagataccta cagcgtgagc
tatgagaaag 3960cgccacgctt cccgaaggga gaaaggcgga caggtatccg gtaagcggca
gggtcggaac 4020aggagagcgc acgagggagc ttccaggggg aaacgcctgg tatctttata
gtcctgtcgg 4080gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg
ggcggagcct 4140atggaaaaac gccagcaacg cggccttttt acggttcctg gccttttgct
ggccttttgc 4200tcacatgttc tttcctgcgt tatcccctga ttctgtggat aaccgtatta
ccgcctttga 4260gtgagctgat accgctcgcc gcagccgaac gaccgagcgc agcgagtcag
tgagcgagga 4320agc
4323
29 2865 DNA Artificial Sequence Description of Artificial Sequence Vector
29 aacctggctt atcgaaatta atacgactca ctatagggag accggcagat
ctgatatcac 60aagtttgtac aaaaaagcag gctcaaaaat gtttcaaaat agtccgatga
tgtacgactg 120gtggaatgac accaccaaac cgaaacacca gcagccgaca cttaacgtgt
tgtcaccatg 180gggagcatat ttcaatcaca ttggaaatga actgctaccc agctttcttg
tacaaagtgg 240tgatatcaag cttatcgata ccgtcgacct cgaggggggg cccggtaccc
aattcgccct 300atagtgagtc gtattacgcg cgctcactgg ccgtcgtttt acaacgtcgt
gactgggaaa 360accctggcgt tacccaactt aatcgccttg cagcacatcc ccctttcgcc
agctggcgta 420atagcgaaga ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg
aatggcgaat 480gggacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg
cgcagcgtga 540ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct
tcctttctcg 600ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta
gggttccgat 660ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt
tcacgtagtg 720ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg
ttctttaata 780gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat
tcttttgatt 840tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt
taacaaaaat 900ttaacgcgaa ttttaacaaa atattaacgc ttacaattta ggtggcactt
ttcggggaaa 960tgtgcgcgga acccctattt gtttattttt ctaaatacat tcaaatatgt
atccgctcat 1020gagacaataa ccctgataaa tgcttcaata atattgaaaa aggaagagta
tgagtattca 1080acatttccgt gtcgccctta ttcccttttt tgcggcattt tgccttcctg
tttttgctca 1140cccagaaacg ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac
gagtgggtta 1200catcgaactg gatctcaaca gcggtaagat ccttgagagt tttcgccccg
aagaacgttt 1260tccaatgatg agcactttta aagttctgct atgtggcgcg gtattatccc
gtattgacgc 1320cgggcaagag caactcggtc gccgcataca ctattctcag aatgacttgg
ttgagtactc 1380accagtcaca gaaaagcatc ttacggatgg catgacagta agagaattat
gcagtgctgc 1440cataaccatg agtgataaca ctgcggccaa cttacttctg acaacgatcg
gaggaccgaa 1500ggagctaacc gcttttttgc acaacatggg ggatcatgta actcgccttg
atcgttggga 1560accggagctg aatgaagcca taccaaacga cgagcgtgac accacgatgc
ctgtagcaat 1620ggcaacaacg ttgcgcaaac tattaactgg cgaactactt actctagctt
cccggcaaca 1680attaatagac tggatggagg cggataaagt tgcaggacca cttctgcgct
cggcccttcc 1740ggctggctgg tttattgctg ataaatctgg agccggtgag cgtgggtctc
gcggtatcat 1800tgcagcactg gggccagatg gtaagccctc ccgtatcgta gttatctaca
cgacggggag 1860tcaggcaact atggatgaac gaaatagaca gatcgctgag ataggtgcct
cactgattaa 1920gcattggtaa ctgtcagacc aagtttactc atatatactt tagattgatt
taaaacttca 1980tttttaattt aaaaggatct aggtgaagat cctttttgat aatctcatga
ccaaaatccc 2040ttaacgtgag ttttcgttcc actgagcgtc agaccccgta gaaaagatca
aaggatcttc 2100ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac
caccgctacc 2160agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg
taactggctt 2220cagcagagcg cagataccaa atactgtcct tctagtgtag ccgtagttag
gccaccactt 2280caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac
cagtggctgc 2340tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt
taccggataa 2400ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg
agcgaacgac 2460ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc
ttcccgaagg 2520gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc
gcacgaggga 2580gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc
acctctgact 2640tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa
acgccagcaa 2700cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt
tctttcctgc 2760gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg
ataccgctcg 2820ccgcagccga acgaccgagc gcagcgagtc agtgagcgag gaagc
2865
30 3525 DNA Artificial Sequence Description of Artificial Sequence Vector
30 aacctggctt atcgaaatta atacgactca ctatagggag accggcagat
ctgatatcac 60aagtttgtac aaaaaagcag gctcaaaaat gtcaacttca aaaagttcca
aggtgcgaat 120acggaatttc atcgggcgaa tcttctctcc cagcgataaa gacaaggatc
gagacgatga 180gatgaagcca tcctcgtccg caatggatat tagtcagcca tataacacag
tgcatcgagt 240ccacgttgga tacgacggcc agaagttcag cggactgccg caaccatgga
tggatattct 300tctccgagac attagtcttg ccgatcagaa gaaggatccg aacgcggtgg
tgactgcgtt 360gaagttctac gcacaatcaa tgaaggagaa cgagaagacg aaattcatga
cgacgaatag 420tgttttcacg aatagcgatg acgatgatgt ggacgttcag ttgaccggac
aagtcacgga 480acatttgagg aatttgcagt gtagtaatgg ttccgcaact tccccatcta
catcagtgtc 540agcttcatct tcttctgctc gtccactgac aaatggaaat aatcatcttt
ccacggcgtc 600gtctaccgac acatctctct cattatcgga aaggaataac gttccgtctc
cagctccagt 660tccatatagt gaaagtgctc cacaactgaa aacattcacc ggagagactc
caaaactgca 720tccacgatct ccgttcccgc ctcaaccgcc agttcttccg caacgaagca
aaaccgcatc 780ggcagtggcg acgacgacga cgaatccgac gacttcgaat ggagcaccac
caccagttcc 840tggatcgaaa ggacccccgg tgccaccgaa accatcaccc agctttcttg
tacaaagtgg 900tgatatcaag cttatcgata ccgtcgacct cgaggggggg cccggtaccc
aattcgccct 960atagtgagtc gtattacgcg cgctcactgg ccgtcgtttt acaacgtcgt
gactgggaaa 1020accctggcgt tacccaactt aatcgccttg cagcacatcc ccctttcgcc
agctggcgta 1080atagcgaaga ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg
aatggcgaat 1140gggacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg
cgcagcgtga 1200ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct
tcctttctcg 1260ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta
gggttccgat 1320ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt
tcacgtagtg 1380ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg
ttctttaata 1440gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat
tcttttgatt 1500tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt
taacaaaaat 1560ttaacgcgaa ttttaacaaa atattaacgc ttacaattta ggtggcactt
ttcggggaaa 1620tgtgcgcgga acccctattt gtttattttt ctaaatacat tcaaatatgt
atccgctcat 1680gagacaataa ccctgataaa tgcttcaata atattgaaaa aggaagagta
tgagtattca 1740acatttccgt gtcgccctta ttcccttttt tgcggcattt tgccttcctg
tttttgctca 1800cccagaaacg ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac
gagtgggtta 1860catcgaactg gatctcaaca gcggtaagat ccttgagagt tttcgccccg
aagaacgttt 1920tccaatgatg agcactttta aagttctgct atgtggcgcg gtattatccc
gtattgacgc 1980cgggcaagag caactcggtc gccgcataca ctattctcag aatgacttgg
ttgagtactc 2040accagtcaca gaaaagcatc ttacggatgg catgacagta agagaattat
gcagtgctgc 2100cataaccatg agtgataaca ctgcggccaa cttacttctg acaacgatcg
gaggaccgaa 2160ggagctaacc gcttttttgc acaacatggg ggatcatgta actcgccttg
atcgttggga 2220accggagctg aatgaagcca taccaaacga cgagcgtgac accacgatgc
ctgtagcaat 2280ggcaacaacg ttgcgcaaac tattaactgg cgaactactt actctagctt
cccggcaaca 2340attaatagac tggatggagg cggataaagt tgcaggacca cttctgcgct
cggcccttcc 2400ggctggctgg tttattgctg ataaatctgg agccggtgag cgtgggtctc
gcggtatcat 2460tgcagcactg gggccagatg gtaagccctc ccgtatcgta gttatctaca
cgacggggag 2520tcaggcaact atggatgaac gaaatagaca gatcgctgag ataggtgcct
cactgattaa 2580gcattggtaa ctgtcagacc aagtttactc atatatactt tagattgatt
taaaacttca 2640tttttaattt aaaaggatct aggtgaagat cctttttgat aatctcatga
ccaaaatccc 2700ttaacgtgag ttttcgttcc actgagcgtc agaccccgta gaaaagatca
aaggatcttc 2760ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac
caccgctacc 2820agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg
taactggctt 2880cagcagagcg cagataccaa atactgtcct tctagtgtag ccgtagttag
gccaccactt 2940caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac
cagtggctgc 3000tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt
taccggataa 3060ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg
agcgaacgac 3120ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc
ttcccgaagg 3180gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc
gcacgaggga 3240gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc
acctctgact 3300tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa
acgccagcaa 3360cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt
tctttcctgc 3420gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg
ataccgctcg 3480ccgcagccga acgaccgagc gcagcgagtc agtgagcgag gaagc
3525
31 3292 DNA Artificial Sequence Description of Artificial Sequence Vector
31 aacctggctt atcgaaatta atacgactca ctatagggag accggcagat
ctgatatcat 60cgatgaattc gagctccacc gcggtggcgg ccgctctaga actagtggat
cccccgggct 120gcaggaattc cgcccgtcgg taaaacgtgt ctcctgatat cctacaccac
aaacgcattt 180cccggagaat atattccgac ggtattcgac aactactcag caaatgtgat
ggtcgacggt 240cggccgataa atctcgggct ctgggataca gctggacagg aagattacga
tcgactccga 300ccactgtcat atccacaaac agacgtgttt ctcgtatgct ttgccctgaa
caatccggcg 360agttttgaga atgttcgtgc gaaatggtat ccagaagtgt cacatcattg
cccgaatacg 420ccgattattt tggttggaac gaaagctgat ctgcgtgagg atcgagatac
tgttgaacgg 480ctccgcgaac gccggctcca accagtgagc caaacccagg gctacgtgat
ggcaaaggaa 540atcaaggctg tcaagtatct ggagtgctcg gcgctcacgc aacgtggtct
gaaacaagtt 600ttcgatgagg cgatccgagc cgtgctcacg ccgccacaaa gagccaaaaa
gagcaagtgg 660gcgaattcga tatcaagctt atcgataccg tcgacctcga gggggggccc
ggtacccaat 720tcgccctata gtgagtcgta ttacgcgcgc tcactggccg tcgttttaca
acgtcgtgac 780tgggaaaacc ctggcgttac ccaacttaat cgccttgcag cacatccccc
tttcgccagc 840tggcgtaata gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg
cagcctgaat 900ggcgaatggg acgcgccctg tagcggcgca ttaagcgcgg cgggtgtggt
ggttacgcgc 960agcgtgaccg ctacacttgc cagcgcccta gcgcccgctc ctttcgcttt
cttcccttcc 1020tttctcgcca cgttcgccgg ctttccccgt caagctctaa atcgggggct
ccctttaggg 1080ttccgattta gtgctttacg gcacctcgac cccaaaaaac ttgattaggg
tgatggttca 1140cgtagtgggc catcgccctg atagacggtt tttcgccctt tgacgttgga
gtccacgttc 1200tttaatagtg gactcttgtt ccaaactgga acaacactca accctatctc
ggtctattct 1260tttgatttat aagggatttt gccgatttcg gcctattggt taaaaaatga
gctgatttaa 1320caaaaattta acgcgaattt taacaaaata ttaacgctta caatttaggt
ggcacttttc 1380ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca
aatatgtatc 1440cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg
aagagtatga 1500gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc
cttcctgttt 1560ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg
ggtgcacgag 1620tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt
cgccccgaag 1680aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta
ttatcccgta 1740ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat
gacttggttg 1800agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga
gaattatgca 1860gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca
acgatcggag 1920gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact
cgccttgatc 1980gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc
acgatgcctg 2040tagcaatggc aacaacgttg cgcaaactat taactggcga actacttact
ctagcttccc 2100ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt
ctgcgctcgg 2160cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt
gggtctcgcg 2220gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt
atctacacga 2280cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata
ggtgcctcac 2340tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag
attgatttaa 2400aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat
ctcatgacca 2460aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa
aagatcaaag 2520gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca
aaaaaaccac 2580cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt
ccgaaggtaa 2640ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg
tagttaggcc 2700accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc
ctgttaccag 2760tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga
cgatagttac 2820cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc
agcttggagc 2880gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc
gccacgcttc 2940ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca
ggagagcgca 3000cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg
tttcgccacc 3060tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta
tggaaaaacg 3120ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct
cacatgttct 3180ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag
tgagctgata 3240ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa
gc 3292
32 3323 DNA Artificial Sequence Description of Artificial Sequence Vector
32 aacctggctt atcgaaatta atacgactca ctatagggag accggcagat
ctgatatcat 60cgatgaattc gagctccacc gcggtggcgg ccgctctaga actagtggat
cccccgggct 120gcaggaattc cgccctcgag gcagatcaaa tgtgtagttg ttggagacgg
aacagttgga 180aaaacatgca tgttaatatc ttacacaact gactcttttc cagttcagta
tgtgcctaca 240gtatttgata actattcggc acagatgagt cttgatggga acgttgtgaa
cttaggattg 300tgggatactg ctggacagga ggattatgat cgtttacgac cactttccta
cccacagacg 360gatgttttca ttctctgctt ctctgtcgtc tcgcccgtat cgtttgacaa
tgtggcaagc 420aagtggattc cggaaatacg acagcattgt ccagatgcgc ctgtcattct
agttggtacc 480aaactcgatt tgcgcgacga ggccgaaccg atgcgtgctc tgcaggccga
aggaaagtcc 540ccaatttcca aaacgcaagg catgaaaatg gctcaaaaaa ttaaagctgt
caagtatttg 600gaatgctctg cattgacgca acagggactc acacaggtgt tcgaagacgc
cgtacggtcc 660attcttcatc cgaaaccaca gaaaaagaag ggcgaattcg atatcaagct
tatcgatacc 720gtcgacctcg agggggggcc cggtacccaa ttcgccctat agtgagtcgt
attacgcgcg 780ctcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta
cccaacttaa 840tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg
cccgcaccga 900tcgcccttcc caacagttgc gcagcctgaa tggcgaatgg gacgcgccct
gtagcggcgc 960attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc gctacacttg
ccagcgccct 1020agcgcccgct cctttcgctt tcttcccttc ctttctcgcc acgttcgccg
gctttccccg 1080tcaagctcta aatcgggggc tccctttagg gttccgattt agtgctttac
ggcacctcga 1140ccccaaaaaa cttgattagg gtgatggttc acgtagtggg ccatcgccct
gatagacggt 1200ttttcgccct ttgacgttgg agtccacgtt ctttaatagt ggactcttgt
tccaaactgg 1260aacaacactc aaccctatct cggtctattc ttttgattta taagggattt
tgccgatttc 1320ggcctattgg ttaaaaaatg agctgattta acaaaaattt aacgcgaatt
ttaacaaaat 1380attaacgctt acaatttagg tggcactttt cggggaaatg tgcgcggaac
ccctatttgt 1440ttatttttct aaatacattc aaatatgtat ccgctcatga gacaataacc
ctgataaatg 1500cttcaataat attgaaaaag gaagagtatg agtattcaac atttccgtgt
cgcccttatt 1560cccttttttg cggcattttg ccttcctgtt tttgctcacc cagaaacgct
ggtgaaagta 1620aaagatgctg aagatcagtt gggtgcacga gtgggttaca tcgaactgga
tctcaacagc 1680ggtaagatcc ttgagagttt tcgccccgaa gaacgttttc caatgatgag
cacttttaaa 1740gttctgctat gtggcgcggt attatcccgt attgacgccg ggcaagagca
actcggtcgc 1800cgcatacact attctcagaa tgacttggtt gagtactcac cagtcacaga
aaagcatctt 1860acggatggca tgacagtaag agaattatgc agtgctgcca taaccatgag
tgataacact 1920gcggccaact tacttctgac aacgatcgga ggaccgaagg agctaaccgc
ttttttgcac 1980aacatggggg atcatgtaac tcgccttgat cgttgggaac cggagctgaa
tgaagccata 2040ccaaacgacg agcgtgacac cacgatgcct gtagcaatgg caacaacgtt
gcgcaaacta 2100ttaactggcg aactacttac tctagcttcc cggcaacaat taatagactg
gatggaggcg 2160gataaagttg caggaccact tctgcgctcg gcccttccgg ctggctggtt
tattgctgat 2220aaatctggag ccggtgagcg tgggtctcgc ggtatcattg cagcactggg
gccagatggt 2280aagccctccc gtatcgtagt tatctacacg acggggagtc aggcaactat
ggatgaacga 2340aatagacaga tcgctgagat aggtgcctca ctgattaagc attggtaact
gtcagaccaa 2400gtttactcat atatacttta gattgattta aaacttcatt tttaatttaa
aaggatctag 2460gtgaagatcc tttttgataa tctcatgacc aaaatccctt aacgtgagtt
ttcgttccac 2520tgagcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt
ttttctgcgc 2580gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg
tttgccggat 2640caagagctac caactctttt tccgaaggta actggcttca gcagagcgca
gataccaaat 2700actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt
agcaccgcct 2760acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga
taagtcgtgt 2820cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc
gggctgaacg 2880gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact
gagataccta 2940cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga
caggtatccg 3000gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg
aaacgcctgg 3060tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt
tttgtgatgc 3120tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt
acggttcctg 3180gccttttgct ggccttttgc tcacatgttc tttcctgcgt tatcccctga
ttctgtggat 3240aaccgtatta ccgcctttga gtgagctgat accgctcgcc gcagccgaac
gaccgagcgc 3300agcgagtcag tgagcgagga agc
3323
33 1281 DNA Caenorhabditis elegans
33 atgtttcaaa atagtccgat
gatgtacgac tggtggaatg acaccaccaa accgaaacac 60cagcagccga cacttaacgt
gttgtcacca tggggagcat atttcaatca cattggaaat 120gaactgctgc atctgaaaat
cgcatcgtcg acagtatcct cgggatgctc gtctccacaa 180cagtattcgt ctgctcgatc
cgttggtaac tcgctctcca acggcagtgt tgtctccaca 240acatcgtcag atggtgatgt
gcaattgtcg aataaggaaa attcgaatga caaatcagtt 300ggagacaaga atgggaacac
caccacaaac aaaacgaccg tcgaaccacc tccaccagaa 360gagccacctg ttcgtgttcg
agcatctcat cgtgaaaagc tttctgattc cgaagtgctc 420aatcaactcc gcgagattgt
taatccaagt aatccacttg gaaagtacga gatgaagaag 480caaatcggtg ttggagcatc
cggaactgta ttcgttgcta atgtggccgg cagcactgat 540gtggtggctg tgaagagaat
ggctttcaag actcagccga agaaggagat gttgctcacc 600gagattaagg ttatgaagca
gtatcgacac ccgaacctcg tcaactacat tgaatcgtat 660ctggttgatg ctgatgatct
ttgggtagtg atggattatc tggaaggtgg aaacttgaca 720gatgtcgttg tgaagactga
gttggacgaa ggacaaattg cagcagtttt gcaagaatgt 780cttaaagcgc ttcacttcct
tcatagacac tccatagtgc accgagatat caagagtgac 840aacgtgctgc tcggcatgaa
cggagaggtt aagctcaccg atatgggatt ctgtgctcag 900attcagccgg gatcgaaaag
agatactgtc gtcggaactc catattggat gtcgccggag 960atattgaaca agaagcagta
caactataag gttgacattt ggtcgctggg aattatggct 1020ctagagatga ttgatggaga
gccaccatat ttgagagaaa cacctttgaa ggctatctac 1080ttgattgctc aaaacgggaa
gccagagatc aagcaacgcg acagactgtc ttcagagttc 1140aacaatttcc ttgacaagtg
tcttgttgtt gatccggatc agagagccga tacaacggag 1200ctcttggcac atccattcct
gaaaaaggcg aagccactct caagcctgat tccatacatc 1260agagccgtcc gagaaaagta g
1281
34 360 DNA Caenorhabditis elegans misc_feature (303)..(303) n is a or c or g or t or unknown
34 cgacgaaata gtgttttcac gaatagcgat gacgatgatg tggacgttca gttgaccgga
60caagtcacgg aacatttgag gaatttgcag tgtagtaatg gttccgcaac ttccccatct
120acatcagtgt cagcttcatc ttcttctgct cgtccactga caaatggaaa taatcatctt
180tccacggcgt cgtctaccga cacatctctc tcattatcgg aaaggaataa cgttccgtct
240ccagctccag ttccatatag tgaaagtgct ccacaactga aaacattcac cggagagact
300ccnaaactgc atccacgatc tccgttcccg cctcaaccgc cagttcttcc gcaacgaagc
360
35 300 DNA Caenorhabditis elegans misc_feature (33)..(33) a or c or g or t or unknown
35 atgtttctgt atattttatg tgaaatgcaa cangaatctt ctagcaaaaa
agtacgatgc 60tggcaggtag ttgttggggg atggagagaa ggggagaaac aaaacaaaaa
tgacaatagg 120tgataaaaat nataataatg ttttcgccac agttttcgcg cttaattcac
aggaaggttt 180ttttttgcat acaataaaat agtgtgaatg ggagagattt ttagagagaa
aaaaactaca 240aaaaaaacga ggagcaagat ataagggctt gtgtatggta aaacatataa
aacgctgtgt 300
36 750 DNA Caenorhabditis elegans
36 atgaagccat cctcgtccgc
aatggatatt agtcagccat ataacacagt gcatcgtctt 60gccgatcaga agaaggatcc
gaacgcggtg gtgactgcgt tgaagttcta cgcacaatca 120atgaaggaga acgagaagac
gaaattcatg acgacgaata gtgttttcac gaatagcgat 180gacgatgatg tggacgttca
gttgaccgga caagtcacgg aacatttgag gaatttgcag 240tgtagtaatg gttccgcaac
ttccccatct acatcagtgt cagcttcatc ttcttctgct 300cgtccactga caaatggaaa
taatcatctt tccacggcgt cgtctaccga cacatctctc 360tcattatcgg aaaggaataa
cgttccgtct ccagctccag ttccatatag tgaaagtgct 420ccacaactga aaacattcac
cggagagact ccaaaactgc atccacgatc tccgttcccg 480cctcaaccgc cagttcttcc
gcaacgaagc aaaaccgcat cggcagtggc gacgacgacg 540acgaatccga cgacttcgaa
tggagcacca ccaccagttc ctggatcgaa aggacccccg 600gtgccaccga aaccatcgac
ttcagttatc tcttttcgtg agtgttcact gatttgtgtt 660ttgatttatg ttgttcgtca
aatttgtaga tttgatcttc tcacttccaa gctcggtgca 720cattgttcaa actctttgca
attctggtag 750