Patent application title: GENOME EDITING METHOD
Inventors:
IPC8 Class: AC12N1590FI
Publication date: 2020-07-16
Patent application number: 20200224221
Abstract:
The present invention relates to the field of genetic engineering. In
particular, the present invention relates to a genome editing method with
high efficiency and high specificity. More specifically, the present
invention relates to a method for increasing the efficiency of
site-directed modification of a target sequence in a genome of an
organism by a high-specificity Cas9 nuclease variant.
Claims:
1. A genome editing system for site-directed modification of a target
sequence in the genome of a cell, which comprises at least one selected
from the following i) to iii): i) a Cas9 nuclease variant, and an
expression construct comprising a nucleotide sequence encoding a
tRNA-guide RNA fusion; ii) an expression construct comprising a
nucleotide sequence encoding a Cas9 nuclease variant, and an expression
construct comprising a nucleotide sequence encoding a tRNA-guide RNA
fusion; and iii) an expression construct comprising a nucleotide sequence
encoding a Cas9 nuclease variant and a nucleotide sequence encoding a
tRNA-guide RNA fusion; wherein the Cas9 nuclease variant has higher
specificity as compared with the wild-type Cas9 nuclease, wherein the 5'
end of the guide RNA is linked to the 3' end of the tRNA, wherein the
fusion is cleaved at the 5' end of the guide RNA after being transcribed
in the cell, thereby forming a guide RNA that does not carry an extra
nucleotide at the 5' end.
2. A genome editing system for site-directed modification of a target sequence in the genome of a cell, which comprises at least one selected from the following i) to iii): i) a Cas9 nuclease variant, and an expression construct comprising a nucleotide sequence encoding a ribozyme-guide RNA fusion; ii) an expression construct comprising a nucleotide sequence encoding a Cas9 nuclease variant, and an expression construct comprising a nucleotide sequence encoding a ribozyme-guide RNA fusion; and iii) an expression construct comprising a nucleotide sequence encoding a Cas9 nuclease variant and a nucleotide sequence encoding a ribozyme-guide RNA fusion; wherein the Cas9 nuclease variant has higher specificity as compared with the wild-type Cas9 nuclease, wherein the 5' end of the guide RNA is linked to the 3' end of a first ribozyme, wherein the first ribozyme is designed to cleave the fusion at the 5' end of the guide RNA, thereby forming a guide RNA that does not carry extra nucleotide at the 5' end.
3. The system of claim 1, wherein the tRNA and the cell to be modified are derived from a same species.
4. The system of claim 1, wherein the tRNA is encoded by a sequence as shown in SEQ ID NO:1.
5. The system of claim 1, wherein the Cas9 nuclease variant is a variant of SEQ ID NO:2 and comprises an amino acid substitution at position 855 of SEQ ID NO:2, for example, the amino acid substitution is K855A.
6. The system of claim 1, wherein the Cas9 nuclease variant is a variant of the SEQ ID NO:2 and comprises amino acid substitutions at positions 810, 1003 and 1060 of SEQ ID NO:2, for example, the amino acid substitutions are K810A, K1003A and R1060A.
7. The system of claim 1, wherein the Cas9 nuclease variant is a variant of the SEQ ID NO:2 and comprises amino acid substitutions at positions 848, 1003 and 1060 of SEQ ID NO:2, for example, the amino acid substitutions are K848A, K1003A and R1060A.
8. The system of claim 1, wherein the Cas9 nuclease variant is a variant of the SEQ ID NO:2 and comprises amino acid substitutions at positions 611, 695 and 926 of SEQ ID NO:2, for example, the amino acid substitutions are R611A, Q695A and Q926A.
9. The system of claim 1, wherein the Cas9 nuclease variant is a variant of the SEQ ID NO:2 and comprises amino acid substitutions at positions 497, 611, 695 and 926 of SEQ ID NO:2, for example, the amino acid substitutions are N497A, R611A, Q695A and Q926A.
10. The system of claim 1, wherein the Cas9 nuclease variant comprises an amino acid sequence as shown in SEQ ID NO:4, SEQ ID NO:5 or SEQ ID NO:6.
11. The system of claim 1, wherein the nucleotide sequence encoding the Cas9 nuclease variant is codon-optimized for the organism from which the cell to be modified is derived.
12. The system of claim 1, wherein the guide RNA is a single guide RNA (sgRNA).
13. A method for genetically modifying a cell, comprising: introducing the system of claim 1 to the cell, and thereby the Cas9 nuclease variant is targeted to the target sequence in the genome of the cell by the guide RNA, and results in substitution, deletion and/or addition of one or more nucleotides in the target sequence.
14. The method of claim 13, wherein the cell is derived from mammals.
15. The method of claim 13, wherein the system is introduced into the cell by a method selected from: calcium phosphate transfection, protoplast fusion, electroporation, liposome transfection, microinjection, viral infection (such as a baculovirus, a vaccinia virus, an adenovirus and other viruses), particle bombardment, PEG-mediated protoplast transformation and agrobacterium-mediated transformation.
16. The method of claim 14, wherein the mammal is a human, a mouse, a rat, a monkey, a dog, a pig, a sheep, a cow or a cat, wherein the poultry is a chicken, a duck or a goose, and wherein the plant is rice, maize, wheat, sorghum, barley, soybean, peanut or Arabidopsis thaliana.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a U.S. National Phase of International Patent Application No. PCT/CN2018/076949, filed Feb. 22, 2018, which claims priority to Chinese Patent Application No. 201710089494.9, filed Feb. 20, 2017, both of which applications are herein incorporated by reference in their entireties.
SEQUENCE LISTING
[0002] This application contains a Sequence Listing which has been submitted in ASCII format via EFS-Web and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Aug. 19, 2019, is named 245761_000084seqlist.txt, and is 102,206 bytes in size.
FIELD OF THE INVENTION
[0003] The present invention relates to the field of genetic engineering. In particular, the present invention relates to a genome editing method with high efficiency and high specificity. More specifically, the present invention relates to a method for increasing the efficiency of site-directed modification of a target sequence in a genome of an organism by a high-specificity Cas9 nuclease variant.
BACKGROUND OF THE INVENTION
[0004] Clustered regularly interspaced short palindromic repeats and CRISPR associated system (CRISPR/Cas9) is the most popular tool for genome editing. In the system, Cas9 protein cleaves a specific DNA sequence under the guidance of a gRNA to create a double-strand break (DSB). DSB can activate intracellular repair mechanisms of non-homologous end joining (NHEJ) and homologous recombination (HR) to repair DNA damage in cells such that the specific DNA sequence is edited during the repair process. Currently, the most commonly used Cas9 protein is Cas9 derived from Streptococcus pyogenes (SpCas9). One disadvantage of the CRISPR/Cas9 genome editing system is its low specificity and off-target effect, which greatly limit the application thereof.
[0005] There remains a need in the art for a method and tool that allow for efficient, highly specific genome editing.
SUMMARY OF THE INVENTION
[0006] In one aspect, the present invention provides a genome editing system for site-directed modification of a target sequence in the genome of a cell, which comprises at least one selected from the following i) to iii):
[0007] i) a Cas9 nuclease variant, and an expression construct comprising a nucleotide sequence encoding a tRNA-guide RNA fusion;
[0008] ii) an expression construct comprising a nucleotide sequence encoding a Cas9 nuclease variant, and an expression construct comprising a nucleotide sequence encoding a tRNA-guide RNA fusion; and
[0009] iii) an expression construct comprising a nucleotide sequence encoding a Cas9 nuclease variant and a nucleotide sequence encoding a tRNA-guide RNA fusion;
[0010] wherein the Cas9 nuclease variant has higher specificity as compared with the wild-type Cas9 nuclease,
[0011] wherein the 5' end of the guide RNA is linked to the 3' end of the tRNA,
[0012] wherein the fusion is cleaved at the 5' end of the guide RNA after being transcribed in the cell, thereby forming a guide RNA that does not carry an extra nucleotide at the 5' end.
[0013] In a second aspect, the present invention provides a genome editing system for site-directed modification of a target sequence in the genome of a cell, which comprises at least one selected from the following i) to iii):
[0014] i) a Cas9 nuclease variant, and an expression construct comprising a nucleotide sequence encoding a ribozyme-guide RNA fusion;
[0015] ii) an expression construct comprising a nucleotide sequence encoding a Cas9 nuclease variant, and an expression construct comprising a nucleotide sequence encoding a ribozyme-guide RNA fusion; and
[0016] iii) an expression construct comprising a nucleotide sequence encoding a Cas9 nuclease variant and a nucleotide sequence encoding a ribozyme-guide RNA fusion;
[0017] wherein the Cas9 nuclease variant has higher specificity as compared with the wild-type Cas9 nuclease,
[0018] wherein the 5' end of the guide RNA is linked to the 3' end of a first ribozyme,
[0019] wherein the first ribozyme is designed to cleave the fusion at the 5' end of the guide RNA, thereby forming a guide RNA that does not carry an extra nucleotide at the 5' end.
[0020] In a third aspect, the present invention provides a method for genetically modifying a cell, comprising introducing the genome editing system of the present invention into the cell, whereby the Cas9 nuclease variant is targeted to a target sequence in the genome of the cell by the guide RNA, and results in substitution, deletion and/or addition of one or more nucleotides in the target sequence.
[0021] In a fourth aspect, the present invention provides a genetically modified organism, which comprises a genetically modified cell produced by the method of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0022] FIGS. 1A and 1B show the strategies for designing sgRNA for target sequences with different 5' end nucleotides when using U3 or U6. A: by fusion with tRNA, sgRNA can be designed without considering the 5' end nucleotide of the target sequence; B: precise cleavage of tRNA-sgRNA fusion.
[0023] FIG. 2 shows the editing efficiency of WT SpCas9 (wild type SpCas9), eSpCas9(1.0), eSpCas9(1.1), SpCas9-HF1 on targets of class (1).
[0024] FIG. 3 shows the editing efficiency of WT SpCas9 (wild type SpCas9), eSpCas9(1.0), eSpCas9(1.1), SpCas9-HF1 on targets of class (2).
[0025] FIG. 4 shows that the additional nucleotide at 5' end of sgRNA affects the editing efficiency when U6 promoter is used.
[0026] FIGS. 5A and 5B show that for the OsMKK4 locus, tRNA-sgRNA can improve the editing efficiency and maintain high specificity as compared to sgRNA.
[0027] FIGS. 6A and 6B show that for the OsCDKB2 locus, the use of tRNA-sgRNA can increase the editing efficiency to the level of wild-type SpCas9, while maintaining high specificity.
[0028] FIGS. 7A and 7B show the editing specificity of Cas9 variant for mismatch between gRNA and target sequence. In FIG. 7A, the sequences listed are SEQ ID NOS: 23 and 65-83 in the order shown. In FIG. 7B, the sequences listed are also SEQ ID NOS: 23 and 65-83 in the order shown.
[0029] FIG. 8 shows tRNA-sgRNA improved the editing efficiency of eSpCas9(1.1) and SpCas9-HF1 to that of wild-type SpCas9 in human cells.
[0030] FIG. 9 shows the sequence structure of pUC57-U3-tRNA-sgRNA vector for tRNA-sgRNA fusion expression.
DETAILED DESCRIPTION OF THE INVENTION
1. Definition
[0031] In the present invention, unless indicated otherwise, the scientific and technological terminologies used herein refer to meanings commonly understood by a person skilled in the art. Also, the terminologies and experimental procedures used herein relating to protein and nucleotide chemistry, molecular biology, cell and tissue cultivation, microbiology, and immunology all belong to terminologies and conventional methods generally used in the art. For example, the standard DNA recombination and molecular cloning technologies used herein are well known to a person skilled in the art, and are described in detail in the following reference: Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989. In the meantime, in order to better understand the present invention, definitions and explanations for the relevant terminologies are provided below.
[0032] "Cas9 nuclease" and "Cas9" are used interchangeably herein and refer to an RNA-directed nuclease, including the Cas9 protein or fragments thereof (such as a protein comprising an active DNA cleavage domain of Cas9 and/or a gRNA binding domain of Cas9). Cas9 is a component of the CRISPR/Cas (clustered regularly interspaced short palindromic repeats and its associated system) genome editing system, which targets and cleaves a DNA target sequence to form a DNA double-strand break (DSB) under the guidance of a guide RNA.
[0033] "guide RNA" and "gRNA" are used interchangeably herein and typically are composed of crRNA and tracrRNA molecules that form a complex through partial complementarity, wherein the crRNA comprises a sequence that is sufficiently complementary to a target sequence for hybridization and directs the CRISPR complex (Cas9+crRNA+tracrRNA) to specifically bind to the target sequence. However, it is known in the art that a single guide RNA (sgRNA) can be designed, which comprises the characteristics of both crRNA and tracrRNA.
[0034] As used herein, the terms "tRNA" and "transfer RNA" are used interchangeably to refer to small molecule RNAs that have the function of carrying and transporting amino acids. The tRNA molecule usually consists of a short chain of about 70-90 nucleotides folded into a clover shape. In eukaryotes, tRNA genes in the genome are transcribed into tRNA precursors, which are then processed into mature tRNA after excision of the 5' and 3' additional sequences by RNase P and RNase Z.
[0035] As used herein, the term "ribozyme" refers to an RNA molecule that has a catalytic function which participates in the cleavage and processing of RNA by catalyzing the transphosphate and phosphodiester bond hydrolysis reactions.
[0036] "Genome" as used herein encompasses not only chromosomal DNA present in the nucleus, but also organelle DNA present in the subcellular components (e.g., mitochondria, plastids) of the cell.
[0037] As used herein, "organism" includes any organism that is suitable for genomic editing. Exemplary organisms include, but are not limited to, mammals such as human, mouse, rat, monkey, dog, pig, sheep, cattle, cat; poultry such as chicken, duck, goose; plants including monocots and dicots such as rice, corn, wheat, sorghum, barley, soybean, peanut, Arabidopsis and the like.
[0038] "Genetically modified organism" or "genetically modified cell" means an organism or cell that contains an exogenous polynucleotide or modified gene or expression control sequence within its genome. For example, the exogenous polynucleotide is stably integrated into the genome of an organism or cell and inherited for successive generations. The exogenous polynucleotide can be integrated into the genome alone or as part of a recombinant DNA construct. The modified gene or expression control sequence is the sequence in the genome of the organism or cell that comprises single or multiple deoxynucleotide substitutions, deletions and additions.
[0039] The term "exogenous" with respect to sequence means a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention.
[0040] "Polynucleotide", "nucleic acid sequence", "nucleotide sequence", or "nucleic acid fragment" are used interchangeably to refer to a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. Nucleotides (usually found in their 5'-monophosphate form) are referred to by their single letter designation as follows: "A" for adenylate or deoxyadenylate (for RNA or DNA, respectively), "C" for cytidylate or deoxycytidylate, "G" for guanylate or deoxyguanylate, "U" for uridylate, "T" for deoxythymidylate, "R" for purines (A or G), "Y" for pyrimidines (C or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and "N" for any nucleotide.
[0041] "Polypeptide", "peptide", "amino acid sequence" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The terms "polypeptide", "peptide", "amino acid sequence", and "protein" are also inclusive of modifications including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
[0042] As used herein, an "expression construct" refers to a vector suitable for expression of a nucleotide sequence of interest in an organism, such as a recombinant vector. "Expression" refers to the production of a functional product. For example, the expression of a nucleotide sequence may refer to transcription of the nucleotide sequence (such as transcribe to produce an mRNA or a functional RNA) and/or translation of RNA into a protein precursor or a mature protein.
[0043] "Expression construct" of the invention may be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, an RNA that can be translated (such as an mRNA).
[0044] "Expression construct" of the invention may comprise regulatory sequences and nucleotide sequences of interest that are derived from different sources, or regulatory sequences and nucleotide sequences of interest derived from the same source, but arranged in a manner different than that normally found in nature.
[0045] "Regulatory sequence" or "regulatory element" are used interchangeably and refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
[0046] "Promoter" refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment. In some embodiments of the present invention, the promoter is a promoter capable of controlling the transcription of a gene in a cell, whether or not it is derived from the cell. The promoter may be a constitutive promoter or a tissue-specific promoter or a developmentally-regulated promoter or an inducible promoter.
[0047] "Constitutive promoter" refers to a promoter that may cause expression of a gene in most circumstances in most cell types. "Tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably, and refer to a promoter that is expressed predominantly but not necessarily exclusively in one tissue or organ, but that may also be expressed in one specific cell or cell type. "Developmentally regulated promoter" refers to a promoter whose activity is determined by developmental events. "Inducible promoter" selectively expresses a DNA sequence operably linked to it in response to an endogenous or exogenous stimulus (environment, hormones, or chemical signals, and so on).
[0048] As used herein, the term "operably linked" means that a regulatory element (for example but not limited to, a promoter sequence, a transcription termination sequence, and so on) is associated to a nucleic acid sequence (such as a coding sequence or an open reading frame), such that the transcription of the nucleotide sequence is controlled and regulated by the transcriptional regulatory element. Techniques for operably linking a regulatory element region to a nucleic acid molecule are known in the art.
[0049] "Introduction" of a nucleic acid molecule (e.g., plasmid, linear nucleic acid fragment, RNA, etc.) or protein into an organism means that the nucleic acid or protein is used to transform a cell of the organism such that the nucleic acid or protein functions in the cell. As used in the present invention, "transformation" includes both stable and transient transformations. "Stable transformation" refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in the stable inheritance of foreign genes. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any of its successive generations. "Transient transformation" refers to the introduction of a nucleic acid molecule or protein into a cell, performing its function without the stable inheritance of an exogenous gene. In transient transformation, the exogenous nucleic acid sequence is not integrated into the genome.
2. Genome Editing System with High Efficiency and High Specificity
[0050] It has been reported that the Cas9 nuclease variants eSpCas9(1.0) (K810A/K1003A/R1060A) and eSpCas9(1.1) (K848A/K1003A/R1060A) of Feng Zhang et al., and the Cas9 nuclease variant SpCas9-HF1 (N497A/R661A/Q695A/Q926A) developed by J. Keith Joung et al., are capable of significantly reducing the off-target rate in genome editing, and thus have high specificity. However, surprisingly, the present inventors found that these three Cas9 nuclease variants, while having high specificity, have a much lower gene editing efficiency than wild-type Cas9.
[0051] The present inventors have surprisingly found that by fusing the 5' end of the guide RNA to a tRNA, the editing efficiency of the high-specificity Cas9 nuclease variant can be increased, even to the wild-type level, while maintaining the high specificity.
[0052] Without intending to be limited by any theory, it is believed that the reduced editing efficiency of high-specificity Cas9 nuclease variants is related to whether transcription of the guide RNA is precisely initiated. In the art, commonly used promoters for producing guide RNA in vivo include, for example, the U6 and U3 snRNA promoters, from which transcription is driven by RNA polymerase III. The U6 promoter needs to initiate transcription at G, and thus for target sequences whose first nucleotide is A, C or T, an additional G will be present at the 5' end of the sgRNA as transcribed. The U3 promoter initiates transcription at A, and thus for target sequences whose first nucleotide is G, C or T, an additional A will be present at the 5' end of the sgRNA as transcribed. The inventors found that the editing efficiency of high-specificity Cas9 nuclease variants is reduced when an additional nucleotide is present at the 5' end of the sgRNA. By fusion transcription with a tRNA, owing to the precise tRNA processing mechanism (which precisely removes the additional 5' and 3' sequences of the tRNA precursor to form the mature tRNA), an sgRNA without an additional nucleotide at the 5' end can be readily obtained even when using U6 or U3 promoters, without the need to consider the first nucleotide of the target sequence. Thereby, the editing efficiency of high-specificity Cas9 nuclease variants can be improved, and the selectable range of target sequences can be extended. In addition, without intending to be limited by any theory, fusion with tRNA can increase the expression level of sgRNA, which may also contribute to the improved editing efficiency of high-specificity Cas9 nuclease variants.
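The promoter constraint described above can be illustrated with a minimal sketch. It assumes only the rule stated in the text (U6 initiates at G, U3 initiates at A); the function names and the example target sequences are hypothetical and are not taken from the application.

```python
# Illustrative sketch only; names and example targets are hypothetical.
# First transcribed nucleotide required by each promoter, as stated in the text.
PROMOTER_START = {"U6": "G", "U3": "A"}


def transcribed_guide(promoter: str, target: str) -> str:
    """Return the guide (DNA alphabet) as transcribed directly from the promoter.

    If the target does not begin with the nucleotide the promoter requires,
    that nucleotide is prepended, giving a guide with one extra 5' nucleotide.
    """
    start = PROMOTER_START[promoter]
    target = target.upper()
    return target if target.startswith(start) else start + target


def has_extra_5prime_nt(promoter: str, target: str) -> bool:
    """True when direct transcription adds a nucleotide to the 5' end."""
    return len(transcribed_guide(promoter, target)) > len(target)


# Hypothetical 20-nt targets starting with G, A and C respectively.
for target in ("GACCTGAAGCTGATCGGCTA", "ACCTGAAGCTGATCGGCTAG", "CCTGAAGCTGATCGGCTAGA"):
    for promoter in ("U6", "U3"):
        print(promoter, target, has_extra_5prime_nt(promoter, target))
```

Under this rule, a target beginning with C or T acquires an extra 5' nucleotide with either promoter, which is the situation the tRNA fusion is intended to avoid.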
[0053] Therefore, the present invention provides a genome editing system for site-directed modification of a target sequence in the genome of a cell, which comprises at least one selected from the following i) to iii):
[0054] i) a Cas9 nuclease variant, and an expression construct comprising a nucleotide sequence encoding a tRNA-guide RNA fusion;
[0055] ii) an expression construct comprising a nucleotide sequence encoding a Cas9 nuclease variant, and an expression construct comprising a nucleotide sequence encoding a tRNA-guide RNA fusion; and
[0056] iii) an expression construct comprising a nucleotide sequence encoding a Cas9 nuclease variant and a nucleotide sequence encoding a tRNA-guide RNA fusion;
[0057] wherein the Cas9 nuclease variant has higher specificity as compared with the wild-type Cas9 nuclease,
[0058] wherein the 5' end of the guide RNA is linked to the 3' end of the tRNA,
[0059] wherein the fusion is cleaved at the 5' end of the guide RNA after being transcribed in the cell, thereby forming a guide RNA that does not carry an extra nucleotide at the 5' end.
[0060] In some embodiments, the tRNA and the cell to be modified are from the same species.
[0061] In some specific embodiments, the tRNA is encoded by the following sequence: aacaaagcaccagtggtctagtggtagaatagtaccctgccacggtacagacccgggttcgattcccggctggtgca (SEQ ID NO:1).
[0062] The design of the tRNA-guide RNA fusion is within the skill of the person in the art. For example, reference can be made to Xie et al., PNAS, Mar. 17, 2015; vol. 112, no. 11, 3570-3575.
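As a rough illustration of the fusion layout, the sketch below concatenates the tRNA-encoding sequence of SEQ ID NO:1 with a guide and models only the outcome of processing, namely release of the guide at the 3' end of the tRNA. The scaffold string is a placeholder and not a real sgRNA scaffold, the example spacer is hypothetical, and the enzymatic mechanism itself is not modeled.

```python
# DNA encoding the tRNA, copied from SEQ ID NO:1 in the text (uppercased).
TRNA = "AACAAAGCACCAGTGGTCTAGTGGTAGAATAGTACCCTGCCACGGTACAGACCCGGGTTCGATTCCCGGCTGGTGCA"

# Placeholder standing in for the sgRNA scaffold; NOT a real scaffold sequence.
SCAFFOLD_PLACEHOLDER = "N" * 10


def build_fusion(spacer: str) -> str:
    """Link the 5' end of the guide to the 3' end of the tRNA."""
    return TRNA + spacer.upper() + SCAFFOLD_PLACEHOLDER


def released_guide(fusion: str) -> str:
    """Model the result of cleavage at the tRNA/guide junction."""
    if not fusion.startswith(TRNA):
        raise ValueError("fusion does not begin with the tRNA sequence")
    return fusion[len(TRNA):]


spacer = "CCTGAAGCTGATCGGCTAGA"  # hypothetical target starting with C
guide = released_guide(build_fusion(spacer))
print(guide.startswith(spacer))  # the guide begins exactly at the first target nucleotide
```

Because the guide is released at the junction rather than defined by the transcription start site, its first nucleotide does not depend on the promoter in this model.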
[0063] The present invention also contemplates the fusion of a guide RNA and a ribozyme. Based on the finding of the invention that the editing efficiency of high-specificity Cas9 nuclease variants is related to precise transcription initiation of the sgRNA, the ability of a ribozyme to cut RNA at a specific site can be used to produce an sgRNA without an additional nucleotide at the 5' end by rational design of a fusion of guide RNA and ribozyme, so as to improve editing efficiency while maintaining high specificity.
[0064] Therefore, the invention also provides a genome editing system for site-directed modification of a target sequence in the genome of a cell, which comprises at least one selected from the following i) to iii):
[0065] i) a Cas9 nuclease variant, and an expression construct comprising a nucleotide sequence encoding a ribozyme-guide RNA fusion;
[0066] ii) an expression construct comprising a nucleotide sequence encoding a Cas9 nuclease variant, and an expression construct comprising a nucleotide sequence encoding a ribozyme-guide RNA fusion; and
[0067] iii) an expression construct comprising a nucleotide sequence encoding a Cas9 nuclease variant and a nucleotide sequence encoding a ribozyme-guide RNA fusion;
[0068] wherein the Cas9 nuclease variant has higher specificity as compared with the wild-type Cas9 nuclease,
[0069] wherein the 5' end of the guide RNA is linked to the 3' end of a first ribozyme,
[0070] wherein the first ribozyme is designed to cleave the fusion at the 5' end of the guide RNA, thereby forming a guide RNA that does not carry an extra nucleotide at the 5' end.
[0071] In one embodiment, the 3' end of the guide RNA is linked to the 5' end of a second ribozyme, and the second ribozyme is designed to cleave the fusion at the 3' end of the guide RNA, thereby forming a guide RNA that does not carry an extra nucleotide at the 3' end.
[0072] The design of the first ribozyme or the second ribozyme is within the skill of the person in the art. For example, reference can be made to Gao et al., JIPB, April, 2014; Vol 56, Issue 4, 343-349.
[0073] In one specific embodiment, the first ribozyme is encoded by the following sequence: 5'-(N)6CTGATGAGTCCGTGAGGACGAAACGAGTAAGCTCGTC-3' (SEQ ID NO:12), wherein each N is independently selected from A, G, C, and T, and (N)6 refers to a sequence reversely complementary to the first 6 nucleotides at the 5' end of the guide RNA. In one specific embodiment, the second ribozyme is encoded by the following sequence: 5'-GGCCGGCATGGTCCCAGCCTCCTCGCTGGCGCCGGCTGGGCAACATGCTTCGGCATGGCGAATGGGAC-3' (SEQ ID NO:13).
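The (N)6 rule can be sketched as follows: the six variable nucleotides of the first ribozyme are computed as the reverse complement of the first six nucleotides of the guide and placed in front of the fixed portion of SEQ ID NO:12. The function names and the example guide are hypothetical; the two constant sequences are copied from the text.

```python
# Fixed portion of the first ribozyme (SEQ ID NO:12 without the leading (N)6).
FIRST_RIBOZYME_CORE = "CTGATGAGTCCGTGAGGACGAAACGAGTAAGCTCGTC"
# Second ribozyme (SEQ ID NO:13).
SECOND_RIBOZYME = "GGCCGGCATGGTCCCAGCCTCCTCGCTGGCGCCGGCTGGGCAACATGCTTCGGCATGGCGAATGGGAC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def first_ribozyme(guide: str) -> str:
    """(N)6 + fixed core, where (N)6 is reverse-complementary to the guide's first 6 nt."""
    if len(guide) < 6:
        raise ValueError("guide must be at least 6 nucleotides long")
    return reverse_complement(guide[:6]) + FIRST_RIBOZYME_CORE


def ribozyme_guide_cassette(guide: str) -> str:
    """First ribozyme, then the guide, then the second ribozyme (5' to 3')."""
    guide = guide.upper()
    return first_ribozyme(guide) + guide + SECOND_RIBOZYME


example_guide = "GACCTGAAGCTGATCGGCTA"  # hypothetical guide sequence
print(first_ribozyme(example_guide)[:6])  # CAGGTC
```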
[0074] The Cas9 nuclease variant in the invention that has higher specificity as compared with wild type Cas9 nuclease can be derived from Cas9 of various species, for example, derived from Cas9 of Streptococcus pyogenes (SpCas9, nucleotide sequence shown in SEQ ID NO:2, amino acid sequence shown in SEQ ID NO:3).
[0075] In some embodiments of the invention, the Cas9 nuclease variant is a variant of SEQ ID NO:2, which comprises an amino acid substitution at position 855 of SEQ ID NO:2. In some specific embodiments, the amino acid substitution at position 855 is K855A.
[0076] In some embodiments of the invention, the Cas9 nuclease variant is a variant of SEQ ID NO:2, which comprises amino acid substitutions at positions 810, 1003 and 1060 of SEQ ID NO:2. In some specific embodiments, the amino acid substitutions respectively are K810A, K1003A and R1060A.
[0077] In some embodiments of the invention, the Cas9 nuclease variant is a variant of SEQ ID NO:2, which comprises amino acid substitutions at positions 848, 1003 and 1060 of SEQ ID NO:2. In some specific embodiments, the amino acid substitutions respectively are K848A, K1003A and R1060A.
[0078] In some embodiments of the invention, the Cas9 nuclease variant is a variant of SEQ ID NO:2, which comprises amino acid substitutions at positions 611, 695 and 926 of SEQ ID NO:2. In some specific embodiments, the amino acid substitutions respectively are R611A, Q695A and Q926A.
[0079] In some embodiments of the invention, the Cas9 nuclease variant is a variant of SEQ ID NO:2, which comprises amino acid substitutions at positions 497, 611, 695 and 926 of SEQ ID NO:2. In some specific embodiments, the amino acid substitutions respectively are N497A, R611A, Q695A and Q926A.
[0080] In some specific embodiments of the invention, the Cas9 nuclease variant comprises an amino acid sequence as shown in SEQ ID NO:4 (eSpCas9(1.0)), SEQ ID NO:5 (eSpCas9(1.1)) or SEQ ID NO:6 (SpCas9-HF1).
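The substitution notation used above (for example K848A: lysine at position 848 replaced by alanine) can be applied programmatically. The sketch below is a hypothetical helper that is demonstrated on a short toy sequence rather than on the actual SpCas9 sequence; only the eSpCas9 substitution sets are listed, exactly as given in the text.

```python
import re

# Substitution sets as given in the text for the two eSpCas9 variants.
VARIANT_SUBSTITUTIONS = {
    "eSpCas9(1.0)": ["K810A", "K1003A", "R1060A"],
    "eSpCas9(1.1)": ["K848A", "K1003A", "R1060A"],
}


def apply_substitutions(protein: str, substitutions: list[str]) -> str:
    """Apply substitutions written as <wild-type residue><1-based position><new residue>."""
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = new
    return "".join(residues)


# Toy 5-residue sequence (hypothetical), not SpCas9.
print(apply_substitutions("MKRQN", ["K2A", "R3A"]))  # MAAQN
```

Checking the wild-type residue before substituting guards against numbering a variant relative to the wrong reference sequence.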
[0081] In some embodiments of the invention, the Cas9 nuclease variant of the invention further comprises a nuclear localization sequence (NLS). In general, one or more NLSs in the Cas9 nuclease variant should have sufficient strength to drive the accumulation of the Cas9 nuclease variant in the nucleus of the cell in an amount sufficient for the genome editing function. In general, the strength of the nuclear localization activity is determined by the number and position of NLSs, and one or more specific NLSs used in the Cas9 nuclease variant, or a combination thereof.
[0082] In some embodiments of the present invention, the NLSs of the Cas9 nuclease variant of the invention may be located at the N-terminus and/or the C-terminus. In some embodiments, the Cas9 nuclease variant comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs. In some embodiments, the Cas9 nuclease variant comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the N-terminus. In some embodiments, the Cas9 nuclease variant comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the C-terminus. In some embodiments, the Cas9 nuclease variant comprises a combination of these, such as one or more NLSs at the N-terminus and one or more NLSs at the C-terminus. Where there are more than one NLS, each NLS may be selected as independent from other NLSs. In some preferred embodiments of the invention, the Cas9 nuclease variant comprises two NLSs, for example, the two NLSs are located at the N-terminus and the C-terminus, respectively.
[0083] In general, an NLS consists of one or more short sequences of positively charged lysine or arginine exposed on the surface of a protein, but other types of NLS are also known in the art. Non-limiting examples of NLSs include KKRKV (nucleotide sequence 5'-AAGAAGAGAAAGGTC-3' (SEQ ID NO: 14)), PKKKRKV (nucleotide sequence 5'-CCCAAGAAGAAGAGGAAGGTG-3' (SEQ ID NO: 15) or 5'-CCAAAGAAGAAGAGGAAGGTT-3' (SEQ ID NO: 16)), and SGGSPKKKRKV (SEQ ID NO: 17) (nucleotide sequence 5'-TCGGGGGGGAGCCCAAAGAAGAAGCGGAAGGTG-3' (SEQ ID NO: 18)).
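The correspondence between the NLS peptides and their coding sequences can be checked with a short translation sketch. The codon table below is a subset of the standard genetic code limited to the codons that occur in these four sequences; the function and variable names are hypothetical.

```python
# Subset of the standard genetic code covering the codons used in the NLS sequences.
CODON_TABLE = {
    "AAG": "K", "AGA": "R", "AGG": "R", "CGG": "R",
    "GTC": "V", "GTG": "V", "GTT": "V",
    "CCC": "P", "CCA": "P",
    "TCG": "S", "AGC": "S", "GGG": "G",
}

# Coding sequence -> peptide, as listed in the text.
NLS_SEQUENCES = {
    "AAGAAGAGAAAGGTC": "KKRKV",
    "CCCAAGAAGAAGAGGAAGGTG": "PKKKRKV",
    "CCAAAGAAGAAGAGGAAGGTT": "PKKKRKV",
    "TCGGGGGGGAGCCCAAAGAAGAAGCGGAAGGTG": "SGGSPKKKRKV",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon using CODON_TABLE."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


for dna, peptide in NLS_SEQUENCES.items():
    print(peptide, translate(dna) == peptide)
```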
[0084] In some embodiments of the invention, the N-terminus of the Cas9 nuclease variant comprises an NLS with an amino acid sequence shown by PKKKRKV (SEQ ID NO: 19). In some embodiments of the invention, the C-terminus of the Cas9 nuclease variant comprises an NLS with an amino acid sequence shown by SGGSPKKKRKV (SEQ ID NO: 17).
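As an illustrative check that is not part of the original disclosure, the NLS coding sequences quoted above can be translated in frame to confirm that each encodes the stated peptide. The Python sketch below uses only the subset of the standard genetic code that these sequences require; CODON_TABLE, NLS_CODING and translate are hypothetical names introduced here for illustration.

```python
# Subset of the standard genetic code covering the codons that occur in the
# NLS coding sequences (SEQ ID NOs: 14, 15, 16 and 18).
CODON_TABLE = {
    "AAG": "K", "AGA": "R", "AGG": "R", "CGG": "R",
    "GTC": "V", "GTG": "V", "GTT": "V",
    "CCC": "P", "CCA": "P",
    "TCG": "S", "AGC": "S", "GGG": "G",
}


def translate(dna):
    """Translate a coding sequence read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Coding sequences copied from the paragraph on NLS examples.
NLS_CODING = {
    "SEQ ID NO: 14": "AAGAAGAGAAAGGTC",
    "SEQ ID NO: 15": "CCCAAGAAGAAGAGGAAGGTG",
    "SEQ ID NO: 16": "CCAAAGAAGAAGAGGAAGGTT",
    "SEQ ID NO: 18": "TCGGGGGGGAGCCCAAAGAAGAAGCGGAAGGTG",
}

for name, dna in NLS_CODING.items():
    print(name, translate(dna))
```

Two different coding sequences (SEQ ID NOs: 15 and 16) translate to the same PKKKRKV peptide, which is the kind of synonymous-codon freedom that codon optimization relies on.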
[0085] In addition, the Cas9 nuclease variant of the present invention may also include other localization sequences, such as cytoplasmic localization sequences, chloroplast localization sequences, mitochondrial localization sequences, and the like, depending on the location of the DNA to be edited.
[0086] For obtaining effective expression in the target cell, in some embodiments of the invention, the nucleotide sequence encoding the Cas9 nuclease variant is codon-optimized for the organism where the cell to be genome-edited is from.
[0087] Codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g. about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence. Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the "Codon Usage Database" available at www.kazusa.or.jp/codon/, and these tables can be adapted in a number of ways. See Nakamura, Y., et al. "Codon usage tabulated from the international DNA sequence databases: status for the year 2000" Nucl. Acids Res. 28:292 (2000).
[0088] The organism from which the cell that can be genome-edited by the system of the invention is derived includes, but is not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cows and cats; poultry such as chickens, ducks and geese; and plants including monocotyledons and dicotyledons, e.g. rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis thaliana.
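The replacement of codons by more frequently used synonymous codons can be sketched as follows. This is a minimal Python illustration, not the procedure used to obtain SEQ ID NOs: 7 to 9: the function names are hypothetical, and the usage values in EXAMPLE_USAGE are invented for the example rather than taken from any codon usage table.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame, upper-case coding sequence."""
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna, usage):
    """Swap each codon for the synonymous codon with the highest usage value.

    `usage` maps codons to relative frequencies in the host. A codon is left
    unchanged when none of its synonyms has an entry in `usage`.
    """
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    optimized = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        synonyms = [c for c, aa in GENETIC_CODE.items()
                    if aa == GENETIC_CODE[codon]]
        best = max(synonyms, key=lambda c: usage.get(c, 0.0))
        optimized.append(best if usage.get(best, 0.0) > 0.0 else codon)
    return "".join(optimized)


# Hypothetical usage values for illustration only (not measured frequencies).
EXAMPLE_USAGE = {"AAG": 0.7, "AAA": 0.3, "CGC": 0.4, "AGA": 0.1,
                 "GTG": 0.5, "GTT": 0.2}

native = "AAAAAAAGAAAAGTT"  # encodes KKRKV
optimized = codon_optimize(native, EXAMPLE_USAGE)
print(optimized, translate(optimized))
```

The amino acid sequence is unchanged by construction, because every replacement codon is drawn from the synonyms of the codon it replaces. Practical codon optimization usually also weighs factors such as GC content and unwanted restriction sites, which this sketch ignores.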
[0089] In some specific embodiments of the invention, the codon-optimized nucleotide sequence encoding the Cas9 nuclease variant is as shown in SEQ ID NO:7 (eSpCas9(1.0)), SEQ ID NO:8 (eSpCas9(1.1)) or SEQ ID NO:9 (SpCas9-HF1).
[0090] In some embodiments of the invention, the guide RNA is a single guide RNA (sgRNA). Methods of constructing suitable sgRNAs according to a given target sequence are known in the art. See e.g., Wang, Y. et al. Simultaneous editing of three homoeoalleles in hexaploid bread wheat confers heritable resistance to powdery mildew. Nat. Biotechnol. 32, 947-951 (2014); Shan, Q. et al. Targeted genome modification of crop plants using a CRISPR-Cas system. Nat. Biotechnol. 31, 686-688 (2013); Liang, Z. et al. Targeted mutagenesis in Zea mays using TALENs and the CRISPR/Cas system. J Genet Genomics. 41, 63-68 (2014).
[0091] In some embodiments of the invention, the nucleotide sequence encoding the Cas9 nuclease variant and/or the nucleotide sequence encoding the guide RNA fusion are operatively linked to an expression regulatory element such as a promoter.
[0092] Examples of promoters that can be used in the present invention include but are not limited to polymerase (pol) I, pol II or pol III promoters. Examples of pol I promoters include the chicken RNA pol I promoter. Examples of pol II promoters include but are not limited to the cytomegalovirus immediate early (CMV) promoter, the Rous sarcoma virus long terminal repeat (RSV-LTR) promoter and the simian virus 40 (SV40) immediate early promoter. Examples of pol III promoters include the U6 and H1 promoters. Inducible promoters such as the metallothionein promoter can also be used. Other examples of promoters include the T7 bacteriophage promoter, T3 bacteriophage promoter, β-galactosidase promoter and Sp6 bacteriophage promoter. When used for plants, promoters that can be used include but are not limited to the cauliflower mosaic virus 35S promoter, maize Ubi-1 promoter, wheat U6 promoter, rice U3 promoter, maize U3 promoter and rice actin promoter.
3. Method for Genetically Modifying a Cell
[0093] In another aspect, the invention provides a method for genetically modifying a cell, comprising introducing the genome editing system of the invention into the cell, whereby the Cas9 nuclease variant is targeted to the target sequence in the genome of the cell by the guide RNA, resulting in substitution, deletion and/or addition of one or more nucleotides in the target sequence.
[0094] The design of the target sequence that can be recognized and targeted by a Cas9 and guide RNA complex is within the technical skills of one of ordinary skill in the art. In general, the target sequence is a sequence that is complementary to a leader sequence of about 20 nucleotides comprised in guide RNA, and the 3'-end of which is immediately adjacent to the protospacer adjacent motif (PAM) NGG.
[0095] For example, in some embodiments of the invention, the target sequence has the structure: 5'-NX-NGG-3', wherein N is selected independently from A, G, C, and T; X is an integer satisfying 14 ≤ X ≤ 30; NX represents X contiguous nucleotides, and NGG is a PAM sequence. In some specific embodiments of the invention, X is 20.
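The 5'-NX-NGG-3' structure lends itself to a simple pattern search. The following Python sketch scans one strand of a sequence for candidate target sites; find_target_sites is a hypothetical helper, and scanning the reverse strand or scoring off-targets is outside its scope.

```python
import re


def find_target_sites(seq, x=20):
    """Return (start, protospacer, PAM) for each 5'-N_x-NGG-3' site in `seq`.

    Only the given strand is scanned. A lookahead is used so that
    overlapping candidate sites are all reported.
    """
    if not 14 <= x <= 30:
        raise ValueError("x must satisfy 14 <= x <= 30")
    pattern = re.compile(rf"(?=([ACGT]{{{x}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]


# OsCDKB2 target sequence (SEQ ID NO: 20): 20-nt protospacer followed by PAM.
sites = find_target_sites("AGGTCGGGGAGGGGACGTACGGG")
print(sites)
```

With X left at its default of 20, the 23-nt OsCDKB2 sequence yields a single site whose PAM is GGG.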
[0096] In the present invention, the target sequence to be modified may be located anywhere in the genome, for example, within a functional gene such as a protein-coding gene or, for example, may be located in a gene expression regulatory region such as a promoter region or an enhancer region, and thereby accomplish the functional modification of said gene or accomplish the modification of a gene expression.
[0097] The substitution, deletion and/or addition in the target sequence of the cell can be detected by T7EI, PCR/RE or sequencing methods, see e.g., Shan, Q., Wang, Y., Li, J. & Gao, C. Genome editing in rice and wheat using the CRISPR/Cas system. Nat. Protoc. 9, 2395-2410 (2014).
[0098] In the method of the present invention, the genome editing system can be introduced into the cell by various methods well known to those skilled in the art.
[0099] Methods for introducing the genome editing system of the present invention into the cell include, but are not limited to, calcium phosphate transfection, protoplast fusion, electroporation, liposome transfection, microinjection, viral infection (such as a baculovirus, a vaccinia virus, an adenovirus and other viruses), particle bombardment, PEG-mediated protoplast transformation or Agrobacterium-mediated transformation.
[0100] The cell which can be subjected to genome editing with the method of the present invention can be from, for example, mammals such as human, mouse, rat, monkey, dog, pig, sheep, cow and cat; poultry such as chicken, duck and goose; and plants including monocotyledons and dicotyledons such as rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis thaliana etc.
[0101] In some embodiments, the method of the present invention is performed in vitro. For example, the cell is an isolated cell. In some other embodiments, the method of the present invention can be performed in vivo. For example, the cell is a cell within an organism, and the system of the present invention can be introduced in vivo into said cell by using, for example, a virus-mediated method. In some embodiments, the cell is a germ cell. In some embodiments, the cell is a somatic cell.
[0102] In another aspect, the present invention further provides a genetically modified organism comprising a genetically modified cell produced by the method of the present invention.
[0103] The organism includes, but is not limited to mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cows and cats; poultry such as chicken, ducks and geese; and plants including monocotyledons and dicotyledons such as rice, maize, wheat, sorghum, barley, soybean, peanuts and Arabidopsis thaliana.
EXAMPLES
Materials and Methods
Construction of Binary Expression Vectors pJIT163-SpCas9, pJIT163-eSpCas9(1.0), pJIT163-eSpCas9(1.1) and pJIT163-SpCas9-HF1
[0104] SpCas9, eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 sequences were codon-optimized for rice. eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 were obtained by site-directed mutagenesis using the Fast MultiSite Mutagenesis System (TransGen) with the pJIT163-SpCas9 plasmid (SEQ ID NO:10) as the template.
Construction of sgRNA Expression Vector
[0105] The sgRNA target sequences used in the experiments are shown in Table 1 as follows:
TABLE 1 Target Gene and sgRNA Target Sequence
sgRNA | Target sequence | Oligo-F | Oligo-R
OsCDKB2 | AGGTCGGGGAGGGGACGTACGGG (SEQ ID NO: 20) | GGCAAGGTCGGGGAGGGGACGTAC (SEQ ID NO: 21) | AAACGTACGTCCCCTCCCCGACCT (SEQ ID NO: 22)
OsMKK4 | GACGTCGGCGAGGAAGGCCTCGG (SEQ ID NO: 23) | GGCAGACGTCGGCGAGGAAGGCCT (SEQ ID NO: 24) | AAACAGGCCTTCCTCGCCGACGTC (SEQ ID NO: 25)
A1 | CATGGTGGGGAAAGCTTGGAGGG (SEQ ID NO: 26) | GGCACATGGTGGGGAAAGCTTGGA (SEQ ID NO: 27) | AAACTCCAAGCTTTCCCCACCATG (SEQ ID NO: 28)
A2 | CCGGACGACGACGTCGACGACGG (SEQ ID NO: 29) | GGCACCGGACGACGACGTCGACGA (SEQ ID NO: 30) | AAACTCGTCGACGTCGTCGTCCGG (SEQ ID NO: 31)
A3 | TTGAAGTCCCTTCTAGATGGAGG (SEQ ID NO: 32) | GGCATTGAAGTCCCTTCTAGATGG (SEQ ID NO: 33) | AAACCCATCTAGAAGGGACTTCAA (SEQ ID NO: 34)
A4 | ACTGCGACACCCAGATATCGTGG (SEQ ID NO: 35) | GGCAACTGCGACACCCAGATATCG (SEQ ID NO: 36) | AAACCGATATCTGGGTGTCGCAGT (SEQ ID NO: 37)
PDS | GTTGGTCTTTGCTCCTGCAGAGG (SEQ ID NO: 38) | GGCAGTTGGTCTTTGCTCCTGCAG (SEQ ID NO: 39) | AAACCTGCAGGAGCAAAGACCAAC (SEQ ID NO: 40)
[0106] The sgRNA expression vectors pOsU3-CDKB2-sgRNA, pOsU3-MKK4-sgRNA, pOsU3-A1-sgRNA, pOsU3-A2-sgRNA, pOsU3-A3-sgRNA, pOsU3-A4-sgRNA and pOsU3-PDS-sgRNA were constructed on the basis of pOsU3-sgRNA (Addgene ID 53063) as described previously (Shan, Q. et al. Targeted genome modification of crop plants using a CRISPR-Cas system. Nat. Biotechnol. 31, 686-688, 2013).
Construction of tRNA-sgRNA Expression Vectors
[0107] The tRNA-sgRNA expression vectors were constructed on the basis of the pUC57-U3-tRNA-sgRNA vector (SEQ ID NO:11, FIG. 6). A linear vector was obtained by digesting pUC57-U3-tRNA-sgRNA with BsaI, the corresponding Oligo-F and Oligo-R were annealed and ligated into the linear vector, and the subsequent steps were similar to those used for the construction of the sgRNA expression vectors.
TABLE 2 Target Genes and Oligonucleotide Sequences for Constructing tRNA-sgRNA Expression Vectors
sgRNA | Target sequence | Oligo-F | Oligo-R
OsCDKB2 | AGGTCGGGGAGGGGACGTACGGG (SEQ ID NO: 20) | TGCAAGGTCGGGGAGGGGACGTAC (SEQ ID NO: 41) | AAACGTACGTCCCCTCCCCGACCT (SEQ ID NO: 22)
OsMKK4 | GACGTCGGCGAGGAAGGCCTCGG (SEQ ID NO: 23) | TGCAGACGTCGGCGAGGAAGGCCT (SEQ ID NO: 42) | AAACAGGCCTTCCTCGCCGACGTC (SEQ ID NO: 25)
A1 | CATGGTGGGGAAAGCTTGGAGGG (SEQ ID NO: 26) | TGCACATGGTGGGGAAAGCTTGGA (SEQ ID NO: 43) | AAACTCCAAGCTTTCCCCACCATG (SEQ ID NO: 28)
A2 | CCGGACGACGACGTCGACGACGG (SEQ ID NO: 29) | TGCACCGGACGACGACGTCGACGA (SEQ ID NO: 44) | AAACTCGTCGACGTCGTCGTCCGG (SEQ ID NO: 31)
A3 | TTGAAGTCCCTTCTAGATGGAGG (SEQ ID NO: 32) | TGCATTGAAGTCCCTTCTAGATGG (SEQ ID NO: 45) | AAACCCATCTAGAAGGGACTTCAA (SEQ ID NO: 34)
A4 | ACTGCGACACCCAGATATCGTGG (SEQ ID NO: 35) | TGCACTGCGACACCCAGATATCG (SEQ ID NO: 46) | AAACCGATATCTGGGTGTCGCAGT (SEQ ID NO: 37)
PDS | GTTGGTCTTTGCTCCTGCAGAGG (SEQ ID NO: 38) | TGCAGTTGGTCTTTGCTCCTGCAG (SEQ ID NO: 47) | AAACCTGCAGGAGCAAAGACCAAC (SEQ ID NO: 40)
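The oligonucleotide pairs in Tables 1 and 2 appear to follow a regular pattern: Oligo-F is a 4-nt 5' extension (GGCA in Table 1, TGCA in Table 2) followed by the 20-nt protospacer, and Oligo-R is AAAC followed by the reverse complement of the protospacer. The Python sketch below reproduces the OsCDKB2 and OsMKK4 rows under that reading. Describing the 4-nt extensions as cloning overhangs is an interpretation, the function names are hypothetical, and not every row need follow the pattern exactly (the A4 Oligo-F in Table 2 is one nucleotide shorter).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_oligos(target_with_pam, forward_extension, reverse_extension="AAAC"):
    """Build (Oligo-F, Oligo-R) for a target given as protospacer plus NGG PAM."""
    protospacer = target_with_pam.upper()[:-3]  # drop the 3-nt PAM
    oligo_f = forward_extension + protospacer
    oligo_r = reverse_extension + reverse_complement(protospacer)
    return oligo_f, oligo_r


cdkb2 = "AGGTCGGGGAGGGGACGTACGGG"  # SEQ ID NO: 20
print(design_oligos(cdkb2, "GGCA"))  # sgRNA vector, compare Table 1
print(design_oligos(cdkb2, "TGCA"))  # tRNA-sgRNA vector, compare Table 2
```

Only the forward extension differs between the two vector types in this reading, which is why Tables 1 and 2 share the same Oligo-R sequences.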
Protoplast Assays
[0108] The rice cultivar Nipponbare was used in this research. Protoplast transformation was performed as described below. Transformation was carried out with 10 µg of each plasmid by PEG-mediated transfection. Protoplasts were collected after 48 h and DNA was extracted for PCR-RE assay.
Preparation and Transformation of Rice Protoplast
[0109] 1) Leaf sheaths of the seedlings were used for protoplast isolation and were cut into strips about 0.5 mm wide with a sharp blade.
[0110] 2) Immediately after cutting, the strips were transferred into 0.6 M mannitol solution and placed in the dark for 10 min.
[0111] 3) The mannitol solution was removed by filtration, and the material was transferred into enzymolysis solution and placed under vacuum for 30 min.
[0112] 4) Enzymolysis was performed for 5-6 h in darkness with gentle shaking (decolorization shaker, speed 10).
[0113] 5) After enzymolysis was complete, an equal volume of W5 was added, and the mixture was shaken horizontally for 10 s to release the protoplasts.
[0114] 6) Protoplasts were filtered through a 40 µm nylon membrane into a 50 ml round-bottom centrifuge tube and washed with W5 solution.
[0115] 7) The protoplasts were pelleted by horizontal centrifugation at 250 g for 3 min, and the supernatant was discarded.
[0116] 8) Protoplasts were resuspended by adding 10 ml W5 and then centrifuged at 250 g for 3 min, and the supernatant was discarded.
[0117] 9) An appropriate amount of MMG solution was added to resuspend the protoplasts to a concentration of 2×10^6/ml.
[0118] Note: All the above steps were carried out at room temperature.
[0119] 10) 10-20 µg plasmid, 200 µl protoplasts (about 4×10^5 cells), and 220 µl fresh PEG solution were added into a 2 ml centrifuge tube, mixed, and placed at room temperature in darkness for 10-20 minutes to induce transformation.
[0120] 11) After the transformation was complete, 880 µl W5 solution was slowly added, the tubes were gently inverted to mix and centrifuged horizontally at 250 g for 3 min, and the supernatant was discarded.
[0121] 12) The protoplasts were resuspended in 2 ml WI solution, transferred to a six-well plate, and cultured at room temperature (or 25°C) in darkness. For protoplast genomic DNA extraction, the protoplasts need to be cultured for 48 h.
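The cell number quoted in step 10 follows directly from the suspension density in step 9. A short Python sketch of that arithmetic is given here for convenience; the function names are hypothetical.

```python
def cells_per_transformation(density_per_ml, volume_ul):
    """Number of protoplasts contained in `volume_ul` of suspension."""
    return density_per_ml * volume_ul / 1000.0


def transformations_from(suspension_ml, volume_ul):
    """How many transformations a given suspension volume supports."""
    return int(suspension_ml * 1000 // volume_ul)


# 200 µl of a 2×10^6/ml suspension, as in steps 9 and 10.
print(cells_per_transformation(2e6, 200))
print(transformations_from(1.0, 200))
```

At the stated density, each 200 µl aliquot carries about 4×10^5 protoplasts, so 1 ml of suspension is enough for five transformations.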
Mutation Identification by Deep Sequencing
[0122] Deep sequencing analysis is performed by reference to Liang, Z., Chen, K., Li, T., Zhang, Y., Wang, Y., Zhao, Q., Liu, J., Zhang, H., Liu, C., Ran, Y., et al. (2017). Efficient DNA-free genome editing of bread wheat using CRISPR/Cas9 ribonucleoprotein complexes. Nature Communications 8, 14261.
Example 1: Comparing Editing Capacities of WT SpCas9 and Variants Thereof to Target Sites
[0123] WT SpCas9, eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 were each constructed in the transient expression vector pJIT163, with expression driven by a maize ubiquitin gene promoter. sgRNAs were constructed in the pOsU3-sgRNA vector, with expression driven by the OsU3 promoter. Rice protoplasts were transformed, and protoplast DNA was extracted for PCR-RE analysis to evaluate the mutation efficiency. Five target sites (A1, A2, A3, A4 and PDS, see FIG. 2 and FIG. 3) were selected to compare the editing capacities of wild-type SpCas9 with those of eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1.
[0124] The OsU3 promoter has to initiate transcription with the nucleotide A, and therefore the design of the sgRNA expression vectors for the target sites falls into two cases as follows:
[0125] (1) If the first nucleotide at the 5' end of the desired sgRNA target sequence (20 bp) is G, T or C, then, because the U3 promoter initiates transcription with an A, an additional A is added to the 5' end of the transcribed sgRNA, so that the transcribed sgRNA cannot completely match the target sequence. The sgRNA expression vector can be constructed as U3+AN20 in FIG. 1, where N20 is the target sequence and A is the additional nucleotide at the 5' end.
[0126] (2) If the first nucleotide at the 5' end of the desired sgRNA target sequence (20 bp) is A, it can be used by the U3 promoter for initiating transcription, and therefore no additional nucleotide is present at the 5' end of the transcribed sgRNA. The sgRNA expression vector can be constructed as U3+AN19 in FIG. 1, where AN19 is the target sequence.
[0127] The selected target sites A1, A2, A3 and PDS belong to target sites of class (1), and target site A4 belongs to target sites of class (2).
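The two cases can be expressed as a small rule on the first nucleotide of the protospacer. The Python sketch below applies it to the five rice target sites; transcribed_guide is a hypothetical helper, and the start nucleotide is passed as a parameter so that the same rule can be reused for a promoter with a different initiation requirement.

```python
def transcribed_guide(protospacer, start_nt="A"):
    """Return (guide 5' sequence as transcribed, design class).

    Class (2): the protospacer already begins with the nucleotide the
    promoter needs, so the transcript matches the target exactly.
    Class (1): the promoter's start nucleotide is prepended, leaving one
    extra 5' nucleotide that does not pair with the target.
    """
    protospacer = protospacer.upper()
    if protospacer[0] == start_nt:
        return protospacer, 2
    return start_nt + protospacer, 1


# 20-nt protospacers taken from Table 1 (PAM removed).
TARGETS = {
    "A1": "CATGGTGGGGAAAGCTTGGA",
    "A2": "CCGGACGACGACGTCGACGA",
    "A3": "TTGAAGTCCCTTCTAGATGG",
    "A4": "ACTGCGACACCCAGATATCG",
    "PDS": "GTTGGTCTTTGCTCCTGCAG",
}

classes = {name: transcribed_guide(seq)[1] for name, seq in TARGETS.items()}
print(classes)
```

For the OsU3 promoter this reproduces the assignment stated above: A1, A2, A3 and PDS fall in class (1), and A4 falls in class (2).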
[0128] The experimental results show (FIG. 2) that the editing efficiencies of eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 for the target sites of class (1) are extremely low. For target sites of class (2), the difference between the editing efficiencies of eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 and the editing efficiency of WT SpCas9 is not significant. This shows that the additional nucleotide at the 5' end of the sgRNA resulting from transcription can reduce the editing efficiencies of eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1.
[0129] Similar to the OsU3 promoter, the wheat U6 promoter (TaU6) has to initiate transcription with the nucleotide G, and therefore the design of the sgRNA expression vectors for the target sites falls into two cases as follows:
[0130] (1) If the first nucleotide at the 5' end of the desired sgRNA target sequence (20 bp) is A, T or C, then, because the U6 promoter initiates transcription with a G, an additional G is added to the 5' end of the transcribed sgRNA, so that the transcribed sgRNA cannot completely match the target sequence.
[0131] (2) If the first nucleotide at the 5' end of the desired sgRNA target sequence (20 bp) is G, it can be used by the U6 promoter for initiating transcription, and therefore no additional nucleotide is present at the 5' end of the transcribed sgRNA.
[0132] The OsPDS target site belongs to target sites of class (2). The TaU6 promoter was used to drive the transcription of GN19 and GN20 sgRNAs against the OsPDS target site, where GN20 mimics the target sites of class (1), namely with an additional G at the 5' end of the sgRNA.
TABLE 3 Target gene and oligonucleotide sequences for construction of TaU6-sgRNA expression vectors
sgRNA | Target sequence | Oligo-F | Oligo-R
OsPDS-GN19 | GTTGGTCTTTGCTCCTGCAGAGG (SEQ ID NO: 38) | GGCGTTGGTCTTTGCTCCTGCAG (SEQ ID NO: 48) | AAACCTGCAGGAGCAAAGACCAA (SEQ ID NO: 40)
OsPDS-GN20 | GTTGGTCTTTGCTCCTGCAGAGG (SEQ ID NO: 38) | GGCGGTTGGTCTTTGCTCCTGCAG (SEQ ID NO: 49) | AAACCTGCAGGAGCAAAGACCAAC (SEQ ID NO: 40)
[0133] The results show (FIG. 2) that one additional G at the 5' end of the sgRNA significantly reduces the editing efficiency of eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1.
Example 2: Increasing Editing Efficiency of Cas9 Variants by tRNA-sgRNA Fusion
[0134] According to the results of Example 1, an important factor influencing the editing efficiencies of eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 is whether or not the sgRNA is precisely initiated. According to a previous report, fusion of a tRNA to the 5' end of an sgRNA may up-regulate the expression of the sgRNA and result in precise cleavage at the 5' end of the sgRNA, thereby avoiding an additional nucleotide at the 5' end of the sgRNA. (See Xie K, Minkenberg B, Yang Y. Boosting CRISPR/Cas9 multiplex editing capability with the endogenous tRNA-processing system. Proc Natl Acad Sci USA. 2015 Mar. 17; 112(11):3570-5. doi: 10.1073/pnas.1420294112. Epub 2015 Mar. 2.)
[0135] sgRNA for each target site in Example 1 was fused to tRNA and expressed under the control of the OsU3 promoter. Experiments were performed by the method in Example 1 with tRNA-sgRNAs instead of sgRNAs. As shown in FIG. 2, for the target sites A1, A2, A3 and PDS, the editing efficiencies of eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 are significantly increased using tRNA-sgRNAs instead of sgRNAs.
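The effect of the tRNA fusion on the 5' end of the guide can be pictured with a purely string-level model. The Python sketch below is an illustration under stated assumptions, not a simulation of tRNA processing: it treats cleavage as removal of everything up to the 3' end of the tRNA coding sequence (SEQ ID NO: 1), omits the sgRNA scaffold, and uses hypothetical function names.

```python
# tRNA coding sequence as given in SEQ ID NO: 1.
TRNA = ("aacaaagcaccagtggtctagtggtagaatagtaccctgccac"
        "ggtacagacccgggttcgattcccggctggtgca").upper()


def fusion_coding_sequence(protospacer):
    """tRNA coding sequence joined directly to the guide's 5' end."""
    return TRNA + protospacer.upper()


def processed_guide_5prime(fusion):
    """Model cleavage at the tRNA 3' end: keep only what follows the tRNA."""
    if not fusion.startswith(TRNA):
        raise ValueError("fusion does not begin with the tRNA sequence")
    return fusion[len(TRNA):]


# A1 protospacer from Table 1; it begins with C, a class (1) site for OsU3.
a1 = "CATGGTGGGGAAAGCTTGGA"
print(processed_guide_5prime(fusion_coding_sequence(a1)))
```

In this model the released guide begins with the first nucleotide of the protospacer whatever that nucleotide is, because the promoter's initiation requirement is met within the tRNA portion rather than at the guide itself.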
Example 3: Influences of tRNA-sgRNA Fusion to Editing Specificity of Cas9 Variants
3.1 Rice OsMKK4 Target Site
[0136] A target site GACGTCGGCGAGGAAGGCCTCGG (SEQ ID NO: 23) in the rice gene MKK4 was selected to design sgRNA and tRNA-sgRNA. This target site has two possible off-target sites as shown in FIG. 5. A vector for expressing sgRNA or tRNA-sgRNA and vectors for expressing WT SpCas9, eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 were respectively co-transformed into rice protoplasts. Two days after transformation, protoplast DNA was extracted, and genomic fragments of the target site and the off-target sites were amplified by using specific primers. Mutation rates of the three sites were analyzed by using second-generation sequencing technology.
[0137] The experiment result is shown in FIG. 5:
[0138] When sgRNAs were used, eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 had extremely low off-target effects compared with WT SpCas9, but also had significantly lower editing efficiencies.
[0139] When tRNA-sgRNAs were used, the editing efficiency of each group was increased, while eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 maintained relatively high specificity. For SpCas9-HF1 in particular, only extremely low-level mutation could be detected at both off-target sites. Therefore, the combination of tRNA-sgRNA and SpCas9-HF1 is particularly suitable for genome editing with high efficiency and high specificity.
3.2 Rice OsCDKB2 Target Site
[0140] A target site AGGTCGGGGAGGGGACGTACGGG (SEQ ID NO: 20) in the rice gene OsCDKB2 was selected to design sgRNA. This target site has three possible off-target sites as shown in FIG. 6. A vector for expressing sgRNA or tRNA-sgRNA and vectors for expressing WT SpCas9, eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 were respectively co-transformed into rice protoplasts. Two days after transformation, protoplast DNA was extracted, and genomic fragments of the target site and the off-target sites were amplified by using specific primers. Mutation rates of the four sites were analyzed by deep sequencing.
[0141] The experiment results are shown in FIG. 6. The editing efficiencies of eSpCas9(1.0), eSpCas9(1.1) and SpCas9-HF1 to the target sites are effectively increased by using tRNA-sgRNA instead of sgRNA. In particular, the editing efficiency of SpCas9-HF1 can be restored to a wild-type level, and high specificity can be maintained. As this target sequence starts with an A, by which the U3 promoter can precisely initiate transcription, the increased editing efficiency may result from the increased expression level of sgRNA due to the fusion with tRNA.
Example 4: Editing Specificity of Cas9 Variants to Mismatch Between gRNA and Target Sequence
[0142] When designing sgRNA for a target site GACGTCGGCGAGGAAGGCCTCGG (SEQ ID NO: 23) in the rice gene MKK4, mismatches of two adjacent bases were artificially introduced (purine for purine, and pyrimidine for pyrimidine). Editing was then assessed under the condition that the sgRNA cannot completely match the target site. Any editing detected under this condition is considered off-target. The experiments were performed in a way similar to that in Example 3.1.
[0143] The experimental results are shown in FIG. 7. When tRNA-sgRNA was used, the SpCas9 variants showed higher sensitivity to mismatches between the gRNA and the target sequence (particularly mismatches closer to either end).
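Guides carrying two adjacent purine-for-purine or pyrimidine-for-pyrimidine mismatches can be enumerated mechanically. The Python sketch below generates one variant per adjacent pair of positions in the MKK4 protospacer; adjacent_mismatch_guides is a hypothetical helper, and which positions were actually tested is shown in FIG. 7 and is not assumed here.

```python
# Transition substitutions: purine <-> purine, pyrimidine <-> pyrimidine.
TRANSITION = str.maketrans("AGCT", "GATC")


def adjacent_mismatch_guides(protospacer):
    """Return (1-based position of first mismatch, variant) pairs.

    Each variant differs from the protospacer at exactly two adjacent
    positions, both changed by a transition.
    """
    protospacer = protospacer.upper()
    variants = []
    for i in range(len(protospacer) - 1):
        swapped = protospacer[i:i + 2].translate(TRANSITION)
        variants.append((i + 1, protospacer[:i] + swapped + protospacer[i + 2:]))
    return variants


mkk4 = "GACGTCGGCGAGGAAGGCCT"  # protospacer of SEQ ID NO: 23
variants = adjacent_mismatch_guides(mkk4)
print(len(variants), variants[0])
```

A 20-nt protospacer gives 19 such variants, running from mismatches at positions 1 and 2 through to positions 19 and 20.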
Example 5: Editing Efficiency and Specificity of Cas9 Variants in Human Embryonic Kidney 293 Cells
[0144] sgRNAs were designed against a target sequence GGTGAGTGAGTGTGTGCGTGTGG (SEQ ID NO: 50) within the human VEGFA gene. U6:sgRNA-GN19 and U6:tRNA-sgRNA-N20 indicate that the sgRNAs transcribed from the U6 promoter are 20 nt in length and completely match the target sequence; U6:sgRNA-GN20 indicates that the sgRNA transcribed from the U6 promoter is 21 nt in length and contains an additional G at the 5' end.
[0145] The T7E1 assay results show (FIG. 8) that WT Cas9 exhibited similar editing efficiency when sgRNAs transcribed by the different strategies were used. However, the editing efficiencies of eSpCas9(1.1) and SpCas9-HF1 were significantly reduced when the sgRNA contained an additional nucleotide at the 5' end. By using tRNA-sgRNA fusions, the editing efficiencies of eSpCas9(1.1) and SpCas9-HF1 were increased to that of WT Cas9 or even higher.
[0146] With respect to editing specificity, WT Cas9 resulted in off-target editing in both sites off target1 and off target2. eSpCas9(1.1) and SpCas9-HF1 did not result in off-target editing when tRNA-sgRNA fusions were used.
TABLE 4 Target gene and oligonucleotide sequences for construction of sgRNA expression vectors
sgRNA | Target sequence | Oligo-F | Oligo-R
VEGFA-GN19 | GGTGAGTGAGTGTGTGCGTGTGG (SEQ ID NO: 50) | CACCGGTGAGTGAGTGTGTGCGTG (SEQ ID NO: 51) | AAACCACGCACACACTCACTCACC (SEQ ID NO: 52)
VEGFA-GN20 | GGTGAGTGAGTGTGTGCGTGTGG (SEQ ID NO: 50) | CACCGGGTGAGTGAGTGTGTGCGTG (SEQ ID NO: 53) | AAACCACGCACACACTCACTCACCC (SEQ ID NO: 54)
VEGFA-tRNA-N20 | GGTGAGTGAGTGTGTGCGTGTGG (SEQ ID NO: 50) | CACCGaacaaagcaccagtggtctagtggtagaatagtaccctgccacggtacagacccgggttcgattcccggctggtgcaGGTGAGTGAGTGTGTGCGTG (SEQ ID NO: 55) | AAACCACGCACACACTCACTCACCtgcaccagccgggaatcgaacccgggtctgtaccgtggcagggtactattctaccactagaccactggtgctttgttC (SEQ ID NO: 56)
TABLE-US-00005 Sequence listing tRNA encoding sequence SEQ ID NO: 1 aacaaagcaccagtggtctagtggtagaatagtaccctgccacggtacagacccgggttcgattcccggctggt- gca SpCas9 nucleotide sequence SEQ ID NO: 2 atggcccctaagaagaagagaaaggtcggtattcacggcgttcctgcggcgatggacaagaagtatagtattgg- tct ggacattgggacgaattccgttggctgggccgtgatcaccgatgagtacaaggtcccttccaagaagtttaagg- ttc tggggaacaccgatcggcacagcatcaagaagaatctcattggagccctcctgttcgactcaggcgagaccgcc- gaa gcaacaaggctcaagagaaccgcaaggagacggtatacaagaaggaagaataggatctgctacctgcaggagat- ttt cagcaacgaaatggcgaaggtggacgattcgttctttcatagattggaggagagtttcctcgtcgaggaagata- aga agcacgagaggcatcctatctttggcaacattgtcgacgaggttgcctatcacgaaaagtaccccacaatctat- cat ctgcggaagaagcttgtggactcgactgataaggcggaccttagattgatctacctcgctctggcacacatgat- taa gttcaggggccattttctgatcgagggggatcttaacccggacaatagcgatgtggacaagttgttcatccagc- tcg tccaaacctacaatcagctctttgaggaaaacccaattaatgcttcaggcgtcgacgccaaggcgatcctgtct- gca cgcctttcaaagtctcgccggcttgagaacttgatcgctcaactcccgggcgaaaagaagaacggcttgttcgg- gaa tctcattgcactttcgttggggctcacaccaaacttcaagagtaattttgatctcgctgaggacgcaaagctgc- agc tttccaaggacacttatgacgatgacctggataaccttttggcccaaatcggcgatcagtacgcggacttgttc- ctc gccgcgaagaatttgtcggacgcgatcctcctgagtgatattctccgcgtgaacaccgagattacaaaggcccc- gct ctcggcgagtatgatcaagcgctatgacgagcaccatcaggatctgacccttttgaaggctttggtccggcagc- aac tcccagagaagtacaaggaaatcttctttgatcaatccaagaacggctacgctggttatattgacggcggggca- tcg caggaggaattctacaagtttatcaagccaattctggagaagatggatggcacagaggaactcctggtgaagct- caa tagggaggaccttttgcggaagcaaagaactttcgataacggcagcatccctcaccagattcatctcggggagc- tgc acgccatcctgagaaggcaggaagacttctacccctttcttaaggataaccgggagaagatcgaaaagattctg- acg ttcagaattccgtactatgtcggaccactcgcccggggtaattccagatttgcgtggatgaccagaaagagcga- gga aaccatcacaccttggaacttcgaggaagtggtcgataagggcgcttccgcacagagcttcattgagcgcatga- caa attttgacaagaacctgcctaatgagaaggtccttcccaagcattccctcctgtacgagtatttcactgtttat- aac gaactcacgaaggtgaagtatgtgaccgagggaatgcgcaagcccgccttcctgagcggcgagcaaaagaaggc- gat 
cgtggaccttttgtttaagaccaatcggaaggtcacagttaagcagctcaaggaggactacttcaagaagattg- aat gcttcgattccgttgagatcagcggcgtggaagacaggtttaacgcgtcactggggacttaccacgatctcctg- aag atcattaaggataaggacttcttggacaacgaggaaaatgaggatatcctcgaagacattgtcctgactcttac- gtt gtttgaggatagggaaatgatcgaggaacgcttgaagacgtatgcccatctcttcgatgacaaggttatgaagc- agc tcaagagaagaagatacaccggatggggaaggctgtcccgcaagcttatcaatggcattagagacaagcaatca- ggg aagacaatccttgactttttgaagtctgatggcttcgcgaacaggaattttatgcagctgattcacgatgactc- act tactttcaaggaggatatccagaaggctcaagtgtcgggacaaggtgacagtctgcacgagcatatcgccaacc- ttg cgggatctcctgcaatcaagaagggtattctgcagacagtcaaggttgtggatgagcttgtgaaggtcatggga- cgg cataagcccgagaacatcgttattgagatggccagagaaaatcagaccacacaaaagggtcagaagaactcgag- gga gcgcatgaagcgcatcgaggaaggcattaaggagctggggagtcagatccttaaggagcacccggtggaaaaca- cgc agttgcaaaatgagaagctctatctgtactatctgcaaaatggcagggatatgtatgtggaccaggagttggat- att aaccgcctctcggattacgacgtcgatcatatcgttcctcagtccttccttaaggatgacagcattgacaataa- ggt tctcaccaggtccgacaagaaccgcgggaagtccgataatgtgcccagcgaggaagtcgttaagaagatgaaga- act actggaggcaacttttgaatgccaagttgatcacacagaggaagtttgataacctcactaaggccgagcgcgga- ggt ctcagcgaactggacaaggcgggcttcattaagcggcaactggttgagactagacagatcacgaagcacgtggc- gca gattctcgattcacgcatgaacacgaagtacgatgagaatgacaagctgatccgggaagtgaaggtcatcacct- tga agtcaaagctcgtttctgacttcaggaaggatttccaattttataaggtgcgcgagatcaacaattatcaccat- gct catgacgcatacctcaacgctgtggtcggaacagcattgattaagaagtacccgaagctcgagtccgaattcgt- gta cggtgactataaggtttacgatgtgcgcaagatgatcgccaagtcagagcaggaaattggcaaggccactgcga- agt atttcttttactctaacattatgaatttctttaagactgagatcacgctggctaatggcgaaatccggaagaga- cca cttattgagaccaacggcgagacaggggaaatcgtgtgggacaaggggagggatttcgccacagtccgcaaggt- tct ctctatgcctcaagtgaatattgtcaagaagactgaagtccagacgggcgggttctcaaaggaatctattctgc- cca agcggaactcggataagcttatcgccagaaagaaggactgggacccgaagaagtatggaggtttcgactcacca- acg gtggcttactctgtcctggttgtggcaaaggtggagaagggaaagtcaaagaagctcaagtctgtcaaggagct- cct gggtatcaccattatggagaggtccagcttcgaaaagaatccgatcgattttctcgaggcgaagggatataagg- aag 
tgaagaaggacctgatcattaagcttccaaagtacagtcttttcgagttggaaaacggcaggaagcgcatgttg- gct tccgcaggagagctccagaagggtaacgagcttgctttgccgtccaagtatgtgaacttcctctatctggcatc- cca ctacgagaagctcaagggcagcccagaggataacgaacagaagcaactgtttgtggagcaacacaagcattatc- ttg acgagatcattgaacagatttcggagttcagtaagcgcgtcatcctcgccgacgcgaatttggataaggttctc- tca gcctacaacaagcaccgggacaagcctatcagagagcaggcggaaaatatcattcatctcttcaccctgacaaa- cct tggggctcccgctgcattcaagtattttgacactacgattgatcggaagagatacacttctacgaaggaggtgc- tgg atgcaacccttatccaccaatcgattactggcctctacgagacgcggatcgacttgagtcagctcgggggggat- aag agaccagcggcaaccaagaaggcaggacaagcgaagaagaagaagtag SpCas9 amino acid sequence SEQ ID NO: 3 MAPKKKRKVGIHGVPAAMDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGE- TAE ATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPT- IYH LRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAI- LSA RLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYAD- LFL AAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDG- GAS QEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEK- ILT FRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFT- VYN ELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHD- LLK IIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDK- QSG KTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKV- MGR HKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQE- LDI NRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAE- RGG LSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNY- HHA NDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIR- KRP LIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFD- SPT VAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKR- MLA 
SAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDK- VLS AYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLG- GDK RPAATKKAGQAKKKK eSpCas9(1.0) amino acid sequence SEQ ID NO: 4 MAPKKKRKVGIHGVPAAMDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGE- TAE ATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPT- IYH LRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAI- LSA RLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYAD- LFL AAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDG- GAS QEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEK- ILT
FRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFT- VYN ELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHD- LLK IIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDK- QSG KTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKV- MGR HKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEALYLYYLQNGRDMYVDQE- LDI NRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAE- RGG LSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNY- HHA NDAYLNAVVGTALIKKYPALESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIR- KRP LIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFD- SPT VAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKR- MLA SAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDK- VLS AYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLG- GDK RPAATKKAGQAKKKK eSpCas9(1.1) amino acid sequence SEQ ID NO: 5 MAPKKKRKVGIHGVPAAMDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGE- TAE ATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPT- IYH LRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAI- LSA RLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYAD- LFL AAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDG- GAS QEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEK- ILT FRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFT- VYN ELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHD- LLK IIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDK- QSG KTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKV- MGR HKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQE- LDI NRLSDYDVDHIVPQSFLADDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAE- RGG 
LSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNY- HHA NDAYLNAVVGTALIKKYPALESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIR- KAP LIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFD- SPT VAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKR- MLA SAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDK- VLS AYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLG- GDK RPAATKKAGQAKKKK SpCas9-HF1 amino acid sequence SEQ ID NO: 6 MAPKKKRKVGIHGVPAAMDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGE- TAE ATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPT- IYH LRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAI- LSA RLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYAD- LFL AAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDG- GAS QEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEK- ILT FRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTAFDKNLPNEKVLPKHSLLYEYFT- VYN ELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHD- LLK IIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGALSRKLINGIRDK- QSG KTILDFLKSDGFANRNFMALIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKV- MGR HKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQE- LDI NRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAE- RGG LSELDKAGFIKRQLVETRAITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNY- HHA NDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIR- KAP LIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFD- SPT VYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRM- LAS AGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKV- LSA YNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGG- DKR PAATKKAGQAKKKK
eSpCas9(1.0) codon-optimized nucleotide sequence SEQ ID NO: 7 atggcccctaagaagaagagaaaggtcggtattcacggcgttcctgcggcgatggacaagaagtatagtattgg- tct ggacattgggacgaattccgttggctgggccgtgatcaccgatgagtacaaggtcccttccaagaagtttaagg- ttc tggggaacaccgatcggcacagcatcaagaagaatctcattggagccctcctgttcgactcaggcgagaccgcc- gaa gcaacaaggctcaagagaaccgcaaggagacggtatacaagaaggaagaataggatctgctacctgcaggagat- ttt cagcaacgaaatggcgaaggtggacgattcgttctttcatagattggaggagagtttcctcgtcgaggaagata- aga agcacgagaggcatcctatctttggcaacattgtcgacgaggttgcctatcacgaaaagtaccccacaatctat- cat ctgcggaagaagcttgtggactcgactgataaggcggaccttagattgatctacctcgctctggcacacatgat- taa gttcaggggccattttctgatcgagggggatcttaacccggacaatagcgatgtggacaagttgttcatccagc- tcg tccaaacctacaatcagctctttgaggaaaacccaattaatgcttcaggcgtcgacgccaaggcgatcctgtct- gca cgcctttcaaagtctcgccggcttgagaacttgatcgctcaactcccgggcgaaaagaagaacggcttgttcgg- gaa tctcattgcactttcgttggggctcacaccaaacttcaagagtaattttgatctcgctgaggacgcaaagctgc- agc tttccaaggacacttatgacgatgacctggataaccttttggcccaaatcggcgatcagtacgcggacttgttc- ctc gccgcgaagaatttgtcggacgcgatcctcctgagtgatattctccgcgtgaacaccgagattacaaaggcccc- gct ctcggcgagtatgatcaagcgctatgacgagcaccatcaggatctgacccttttgaaggctttggtccggcagc- aac tcccagagaagtacaaggaaatcttctttgatcaatccaagaacggctacgctggttatattgacggcggggca- tcg caggaggaattctacaagtttatcaagccaattctggagaagatggatggcacagaggaactcctggtgaagct- caa tagggaggaccttttgcggaagcaaagaactttcgataacggcagcatccctcaccagattcatctcggggagc- tgc acgccatcctgagaaggcaggaagacttctacccctttcttaaggataaccgggagaagatcgaaaagattctg- acg ttcagaattccgtactatgtcggaccactcgcccggggtaattccagatttgcgtggatgaccagaaagagcga- gga aaccatcacaccttggaacttcgaggaagtggtcgataagggcgcttccgcacagagcttcattgagcgcatga- caa attttgacaagaacctgcctaatgagaaggtccttcccaagcattccctcctgtacgagtatttcactgtttat- aac gaactcacgaaggtgaagtatgtgaccgagggaatgcgcaagcccgccttcctgagcggcgagcaaaagaaggc- gat cgtggaccttttgtttaagaccaatcggaaggtcacagttaagcagctcaaggaggactacttcaagaagattg- aat gcttcgattccgttgagatcagcggcgtggaagacaggtttaacgcgtcactggggacttaccacgatctcctg- aag 
atcattaaggataaggacttcttggacaacgaggaaaatgaggatatcctcgaagacattgtcctgactcttac- gtt gtttgaggatagggaaatgatcgaggaacgcttgaagacgtatgcccatctcttcgatgacaaggttatgaagc- agc tcaagagaagaagatacaccggatggggaaggctgtcccgcaagcttatcaatggcattagagacaagcaatca- ggg aagacaatccttgactttttgaagtctgatggcttcgcgaacaggaattttatgcagctgattcacgatgactc- act tactttcaaggaggatatccagaaggctcaagtgtcgggacaaggtgacagtctgcacgagcatatcgccaacc- ttg cgggatctcctgcaatcaagaagggtattctgcagacagtcaaggttgtggatgagcttgtgaaggtcatggga- cgg cataagcccgagaacatcgttattgagatggccagagaaaatcagaccacacaaaagggtcagaagaactcgag- gga gcgcatgaagcgcatcgaggaaggcattaaggagctggggagtcagatccttaaggagcacccggtggaaaaca- cgc
agttgcaaaatgaggccctctatctgtactatctgcaaaatggcagggatatgtatgtggaccaggagttggat- att aaccgcctctcggattacgacgtcgatcatatcgttcctcagtccttccttaaggatgacagcattgacaataa- ggt tctcaccaggtccgacaagaaccgcgggaagtccgataatgtgcccagcgaggaagtcgttaagaagatgaaga- act actggaggcaacttttgaatgccaagttgatcacacagaggaagtttgataacctcactaaggccgagcgcgga- ggt ctcagcgaactggacaaggcgggcttcattaagcggcaactggttgagactagacagatcacgaagcacgtggc- gca gattctcgattcacgcatgaacacgaagtacgatgagaatgacaagctgatccgggaagtgaaggtcatcacct- tga agtcaaagctcgtttctgacttcaggaaggatttccaattttataaggtgcgcgagatcaacaattatcaccat- gct catgacgcatacctcaacgctgtggtcggaacagcattgattaagaagtacccggcgctcgagtccgaattcgt- gta cggtgactataaggtttacgatgtgcgcaagatgatcgccaagtcagagcaggaaattggcaaggccactgcga- agt atttcttttactctaacattatgaatttctttaagactgagatcacgctggctaatggcgaaatccggaaggcg- cca cttattgagaccaacggcgagacaggggaaatcgtgtgggacaaggggagggatttcgccacagtccgcaaggt- tct ctctatgcctcaagtgaatattgtcaagaagactgaagtccagacgggcgggttctcaaaggaatctattctgc- cca agcggaactcggataagcttatcgccagaaagaaggactgggacccgaagaagtatggaggtttcgactcacca- acg gtggcttactctgtcctggttgtggcaaaggtggagaagggaaagtcaaagaagctcaagtctgtcaaggagct- cct gggtatcaccattatggagaggtccagcttcgaaaagaatccgatcgattttctcgaggcgaagggatataagg- aag tgaagaaggacctgatcattaagcttccaaagtacagtcttttcgagttggaaaacggcaggaagcgcatgttg- gct tccgcaggagagctccagaagggtaacgagcttgctttgccgtccaagtatgtgaacttcctctatctggcatc- cca ctacgagaagctcaagggcagcccagaggataacgaacagaagcaactgtttgtggagcaacacaagcattatc- ttg acgagatcattgaacagatttcggagttcagtaagcgcgtcatcctcgccgacgcgaatttggataaggttctc- tca gcctacaacaagcaccgggacaagcctatcagagagcaggcggaaaatatcattcatctcttcaccctgacaaa- cct tggggctcccgctgcattcaagtattttgacactacgattgatcggaagagatacacttctacgaaggaggtgc- tgg atgcaacccttatccaccaatcgattactggcctctacgagacgcggatcgacttgagtcagctcgggggggat- aag agaccagcggcaaccaagaaggcaggacaagcgaagaagaagaagtag eSpCas9(1.1) codon-optimized nucleotide sequence SEQ ID NO: 8 atggcccctaagaagaagagaaaggtcggtattcacggcgttcctgcggcgatggacaagaagtatagtattgg- tct 
ggacattgggacgaattccgttggctgggccgtgatcaccgatgagtacaaggtcccttccaagaagtttaagg- ttc tggggaacaccgatcggcacagcatcaagaagaatctcattggagccctcctgttcgactcaggcgagaccgcc- gaa gcaacaaggctcaagagaaccgcaaggagacggtatacaagaaggaagaataggatctgctacctgcaggagat- ttt cagcaacgaaatggcgaaggtggacgattcgttctttcatagattggaggagagtttcctcgtcgaggaagata- aga agcacgagaggcatcctatctttggcaacattgtcgacgaggttgcctatcacgaaaagtaccccacaatctat- cat ctgcggaagaagcttgtggactcgactgataaggcggaccttagattgatctacctcgctctggcacacatgat- taa gttcaggggccattttctgatcgagggggatcttaacccggacaatagcgatgtggacaagttgttcatccagc- tcg tccaaacctacaatcagctctttgaggaaaacccaattaatgcttcaggcgtcgacgccaaggcgatcctgtct- gca cgcctttcaaagtctcgccggcttgagaacttgatcgctcaactcccgggcgaaaagaagaacggcttgttcgg- gaa tctcattgcactttcgttggggctcacaccaaacttcaagagtaattttgatctcgctgaggacgcaaagctgc- agc tttccaaggacacttatgacgatgacctggataaccttttggcccaaatcggcgatcagtacgcggacttgttc- ctc gccgcgaagaatttgtcggacgcgatcctcctgagtgatattctccgcgtgaacaccgagattacaaaggcccc- gct ctcggcgagtatgatcaagcgctatgacgagcaccatcaggatctgacccttttgaaggctttggtccggcagc- aac tcccagagaagtacaaggaaatcttctttgatcaatccaagaacggctacgctggttatattgacggcggggca- tcg caggaggaattctacaagtttatcaagccaattctggagaagatggatggcacagaggaactcctggtgaagct- caa tagggaggaccttttgcggaagcaaagaactttcgataacggcagcatccctcaccagattcatctcggggagc- tgc acgccatcctgagaaggcaggaagacttctacccctttcttaaggataaccgggagaagatcgaaaagattctg- acg ttcagaattccgtactatgtcggaccactcgcccggggtaattccagatttgcgtggatgaccagaaagagcga- gga aaccatcacaccttggaacttcgaggaagtggtcgataagggcgcttccgcacagagcttcattgagcgcatga- caa attttgacaagaacctgcctaatgagaaggtccttcccaagcattccctcctgtacgagtatttcactgtttat- aac gaactcacgaaggtgaagtatgtgaccgagggaatgcgcaagcccgccttcctgagcggcgagcaaaagaaggc- gat cgtggaccttttgtttaagaccaatcggaaggtcacagttaagcagctcaaggaggactacttcaagaagattg- aat gcttcgattccgttgagatcagcggcgtggaagacaggtttaacgcgtcactggggacttaccacgatctcctg- aag atcattaaggataaggacttcttggacaacgaggaaaatgaggatatcctcgaagacattgtcctgactcttac- gtt gtttgaggatagggaaatgatcgaggaacgcttgaagacgtatgcccatctcttcgatgacaaggttatgaagc- agc 
tcaagagaagaagatacaccggatggggaaggctgtcccgcaagcttatcaatggcattagagacaagcaatca- ggg aagacaatccttgactttttgaagtctgatggcttcgcgaacaggaattttatgcagctgattcacgatgactc- act tactttcaaggaggatatccagaaggctcaagtgtcgggacaaggtgacagtctgcacgagcatatcgccaacc- ttg cgggatctcctgcaatcaagaagggtattctgcagacagtcaaggttgtggatgagcttgtgaaggtcatggga- cgg cataagcccgagaacatcgttattgagatggccagagaaaatcagaccacacaaaagggtcagaagaactcgag- gga gcgcatgaagcgcatcgaggaaggcattaaggagctggggagtcagatccttaaggagcacccggtggaaaaca- cgc agttgcaaaatgagaagctctatctgtactatctgcaaaatggcagggatatgtatgtggaccaggagttggat- att aaccgcctctcggattacgacgtcgatcatatcgttcctcagtccttccttgcggatgacagcattgacaataa- ggt tctcaccaggtccgacaagaaccgcgggaagtccgataatgtgcccagcgaggaagtcgttaagaagatgaaga- act actggaggcaacttttgaatgccaagttgatcacacagaggaagtttgataacctcactaaggccgagcgcgga- ggt ctcagcgaactggacaaggcgggcttcattaagcggcaactggttgagactagacagatcacgaagcacgtggc- gca gattctcgattcacgcatgaacacgaagtacgatgagaatgacaagctgatccgggaagtgaaggtcatcacct- tga agtcaaagctcgtttctgacttcaggaaggatttccaattttataaggtgcgcgagatcaacaattatcaccat- gct catgacgcatacctcaacgctgtggtcggaacagcattgattaagaagtacccggcgctcgagtccgaattcgt- gta cggtgactataaggtttacgatgtgcgcaagatgatcgccaagtcagagcaggaaattggcaaggccactgcga- agt atttcttttactctaacattatgaatttctttaagactgagatcacgctggctaatggcgaaatccggaaggcg- cca cttattgagaccaacggcgagacaggggaaatcgtgtgggacaaggggagggatttcgccacagtccgcaaggt- tct ctctatgcctcaagtgaatattgtcaagaagactgaagtccagacgggcgggttctcaaaggaatctattctgc- cca agcggaactcggataagcttatcgccagaaagaaggactgggacccgaagaagtatggaggtttcgactcacca- acg gtggcttactctgtcctggttgtggcaaaggtggagaagggaaagtcaaagaagctcaagtctgtcaaggagct- cct gggtatcaccattatggagaggtccagcttcgaaaagaatccgatcgattttctcgaggcgaagggatataagg- aag tgaagaaggacctgatcattaagcttccaaagtacagtcttttcgagttggaaaacggcaggaagcgcatgttg- gct tccgcaggagagctccagaagggtaacgagcttgctttgccgtccaagtatgtgaacttcctctatctggcatc- cca ctacgagaagctcaagggcagcccagaggataacgaacagaagcaactgtttgtggagcaacacaagcattatc- ttg acgagatcattgaacagatttcggagttcagtaagcgcgtcatcctcgccgacgcgaatttggataaggttctc- tca 
gcctacaacaagcaccgggacaagcctatcagagagcaggcggaaaatatcattcatctcttcaccctgacaaa- cct tggggctcccgctgcattcaagtattttgacactacgattgatcggaagagatacacttctacgaaggaggtgc- tgg atgcaacccttatccaccaatcgattactggcctctacgagacgcggatcgacttgagtcagctcgggggggat- aag agaccagcggcaaccaagaaggcaggacaagcgaagaagaagaagtag SpCas9-HF1 codon-optimized nucleotide sequence SEQ ID NO: 9 atggcccctaagaagaagagaaaggtcggtattcacggcgttcctgcggcgatggacaagaagtatagtattgg- tct ggacattgggacgaattccgttggctgggccgtgatcaccgatgagtacaaggtcccttccaagaagtttaagg- ttc tggggaacaccgatcggcacagcatcaagaagaatctcattggagccctcctgttcgactcaggcgagaccgcc- gaa gcaacaaggctcaagagaaccgcaaggagacggtatacaagaaggaagaataggatctgctacctgcaggagat- ttt cagcaacgaaatggcgaaggtggacgattcgttctttcatagattggaggagagtttcctcgtcgaggaagata- aga
agcacgagaggcatcctatctttggcaacattgtcgacgaggttgcctatcacgaaaagtaccccacaatctat- cat ctgcggaagaagcttgtggactcgactgataaggcggaccttagattgatctacctcgctctggcacacatgat- taa gttcaggggccattttctgatcgagggggatcttaacccggacaatagcgatgtggacaagttgttcatccagc- tcg tccaaacctacaatcagctctttgaggaaaacccaattaatgcttcaggcgtcgacgccaaggcgatcctgtct- gca cgcctttcaaagtctcgccggcttgagaacttgatcgctcaactcccgggcgaaaagaagaacggcttgttcgg- gaa tctcattgcactttcgttggggctcacaccaaacttcaagagtaattttgatctcgctgaggacgcaaagctgc- agc tttccaaggacacttatgacgatgacctggataaccttttggcccaaatcggcgatcagtacgcggacttgttc- ctc gccgcgaagaatttgtcggacgcgatcctcctgagtgatattctccgcgtgaacaccgagattacaaaggcccc- gct ctcggcgagtatgatcaagcgctatgacgagcaccatcaggatctgacccttttgaaggctttggtccggcagc- aac tcccagagaagtacaaggaaatcttctttgatcaatccaagaacggctacgctggttatattgacggcggggca- tcg caggaggaattctacaagtttatcaagccaattctggagaagatggatggcacagaggaactcctggtgaagct- caa tagggaggaccttttgcggaagcaaagaactttcgataacggcagcatccctcaccagattcatctcggggagc- tgc acgccatcctgagaaggcaggaagacttctacccctttcttaaggataaccgggagaagatcgaaaagattctg- acg ttcagaattccgtactatgtcggaccactcgcccggggtaattccagatttgcgtggatgaccagaaagagcga- gga aaccatcacaccttggaacttcgaggaagtggtcgataagggcgcttccgcacagagcttcattgagcgcatga- caG CCtttgacaagaacctgcctaatgagaaggtccttcccaagcattccctcctgtacgagtatttcactgtttat- aac gaactcacgaaggtgaagtatgtgaccgagggaatgcgcaagcccgccttcctgagcggcgagcaaaagaaggc- gat cgtggaccttttgtttaagaccaatcggaaggtcacagttaagcagctcaaggaggactacttcaagaagattg- aat gcttcgattccgttgagatcagcggcgtggaagacaggtttaacgcgtcactggggacttaccacgatctcctg- aag atcattaaggataaggacttcttggacaacgaggaaaatgaggatatcctcgaagacattgtcctgactcttac- gtt gtttgaggatagggaaatgatcgaggaacgcttgaagacgtatgcccatctcttcgatgacaaggttatgaagc- agc tcaagagaagaagatacaccggatggggaGCCctgtcccgcaagcttatcaatggcattagagacaagcaatca- ggg aagacaatccttgactttttgaagtctgatggcttcgcgaacaggaattttatgGCCctgattcacgatgactc- act tactttcaaggaggatatccagaaggctcaagtgtcgggacaaggtgacagtctgcacgagcatatcgccaacc- ttg cgggatctcctgcaatcaagaagggtattctgcagacagtcaaggttgtggatgagcttgtgaaggtcatggga- cgg 
cataagcccgagaacatcgttattgagatggccagagaaaatcagaccacacaaaagggtcagaagaactcgag- gga gcgcatgaagcgcatcgaggaaggcattaaggagctggggagtcagatccttaaggagcacccggtggaaaaca- cgc agttgcaaaatgagaagctctatctgtactatctgcaaaatggcagggatatgtatgtggaccaggagttggat- att aaccgcctctcggattacgacgtcgatcatatcgttcctcagtccttccttaaggatgacagcattgacaataa- ggt tctcaccaggtccgacaagaaccgcgggaagtccgataatgtgcccagcgaggaagtcgttaagaagatgaaga- act actggaggcaacttttgaatgccaagttgatcacacagaggaagtttgataacctcactaaggccgagcgcgga- ggt ctcagcgaactggacaaggcgggcttcattaagcggcaactggttgagactagaGCCatcacgaagcacgtggc- gca gattctcgattcacgcatgaacacgaagtacgatgagaatgacaagctgatccgggaagtgaaggtcatcacct- tga agtcaaagctcgtttctgacttcaggaaggatttccaattttataaggtgcgcgagatcaacaattatcaccat- gct catgacgcatacctcaacgctgtggtcggaacagcattgattaagaagtacccgaagctcgagtccgaattcgt- gta cggtgactataaggtttacgatgtgcgcaagatgatcgccaagtcagagcaggaaattggcaaggccactgcga- agt atttcttttactctaacattatgaatttctttaagactgagatcacgctggctaatggcgaaatccggaagaga- cca cttattgagaccaacggcgagacaggggaaatcgtgtgggacaaggggagggatttcgccacagtccgcaaggt- tct ctctatgcctcaagtgaatattgtcaagaagactgaagtccagacgggcgggttctcaaaggaatctattctgc- cca agcggaactcggataagcttatcgccagaaagaaggactgggacccgaagaagtatggaggtttcgactcacca- acg gtggcttactctgtcctggttgtggcaaaggtggagaagggaaagtcaaagaagctcaagtctgtcaaggagct- cct gggtatcaccattatggagaggtccagcttcgaaaagaatccgatcgattttctcgaggcgaagggatataagg- aag tgaagaaggacctgatcattaagcttccaaagtacagtcttttcgagttggaaaacggcaggaagcgcatgttg- gct tccgcaggagagctccagaagggtaacgagcttgctttgccgtccaagtatgtgaacttcctctatctggcatc- cca ctacgagaagctcaagggcagcccagaggataacgaacagaagcaactgtttgtggagcaacacaagcattatc- ttg acgagatcattgaacagatttcggagttcagtaagcgcgtcatcctcgccgacgcgaatttggataaggttctc- tca gcctacaacaagcaccgggacaagcctatcagagagcaggcggaaaatatcattcatctcttcaccctgacaaa- cct tggggctcccgctgcattcaagtattttgacactacgattgatcggaagagatacacttctacgaaggaggtgc- tgg atgcaacccttatccaccaatcgattactggcctctacgagacgcggatcgacttgagtcagctcgggggggat- aag agaccagcggcaaccaagaaggcaggacaagcgaagaagaagaagtag pJIT163-SpCas9 vector sequence 
SEQ ID NO: 10 gagctcggtacctgacccggtcgtgcccctctctagagataatgagcattgcatgtctaagttataaaaaatta- cca catattttttttgtcacacttgtttgaagtgcagtttatctatctttatacatatatttaaactttactctacg- aat aatataatctatagtactacaataatatcagtgttttagagaatcatataaatgaacagttagacatggtctaa- agg acaattgagtattttgacaacaggactctacagttttatctttttagtgtgcatgtgttctcctttttttttgc- aaa tagcttcacctatataatacttcatccattttattagtacatccatttagggtttagggttaatggtttttata- gac taatttttttagtacatctattttattctattttagcctctaaattaagaaaactaaaactctattttagtttt- ttt atttaataatttagatataaaatagaataaaataaagtgactaaaaattaaacaaataccctttaagaaattaa- aaa aactaaggaaacatttttcttgtttcgagtagataatgccagcctgttaaacgccgtcgacgagtctaacggac- acc aaccagcgaaccagcagcgtcgcgtcgggccaagcgaagcagacggcacggcatctctgtcgctgcctctggac- ccc tctcgatcgagagttccgctccaccgttggacttgctccgctgtcggcatccagaaattgcgtggcggagcggc- aga cgtgagccggcacggcaggcggcctcctcctcctctcacggcaccggcagctacgggggattcctttcccaccg- ctc cttcgctttcccttcctcgcccgccgtaataaatagacaccccctccacaccctctttccccaacctcgtgttg- ttc ggagcgcacacacacacaaccagatctcccccaaatccacccgtcggcacctccgcttcaaggtacgccgctcg- tcc tccccccccccccctctctaccttctctagatcggcgttccggtccatggttagggcccggtagttctacttct- gtt catgtttgtgttagatccgtgtttgtgttagatccgtgctgctagcgttcgtacacggatgcgacctgtacgtc- aga cacgttctgattgctaacttgccagtgtttctctttggggaatcctgggatggctctagccgttccgcagacgg- gat cgatttcatgattttttttgtttcgttgcatagggtttggtttgcccttttcctttatttcaatatatgccgtg- cac ttgtttgtcgggtcatcttttcatgcttttttttgtcttggttgtgatgatgtggtctggttgggcggtcgttc- tag atcggagtagaattaattctgtttcaaactacctggtggatttattaattttggatctgtatgtgtgtgccata- cat attcatagttacgaattgaagatgatggatggaaatatcgatctaggataggtatacatgttgatgcgggtttt- act gatgcatatacagagatgctttttgttcgcttggttgtgatgatgtggtgtggttgggcggtcgttcattcgtt- cta gatcggagtagaatactgtttcaaactacctggtgtatttattaattttggaactgtatgtgtgtgtcatacat- ctt catagttacgagtttaagatggatggaaatatcgatctaggataggtatacatgttgatgtgggttttactgat- gca tatacatgatggcatatgcagcatctattcatatgctctaaccttgagtacctatctattataataaacaagta- tgt 
tttataattattttgatcttgatatacttggatgatggcatatgcagcagctatatgtggatttttttagccct- gcc ttcatacgctatttatttgcttggtactgtttcttttgtcgatgctcaccctgttgtttggtgttacttctgca- aag cttccaccatggcgtgcaggtcgactctagaggatccccatggcccctaagaagaagagaaaggtcggtattca- cgg cgttcctgcggcgatggacaagaagtatagtattggtctggacattgggacgaattccgttggctgggccgtga- tca ccgatgagtacaaggtcccttccaagaagtttaaggttctggggaacaccgatcggcacagcatcaagaagaat- ctc attggagccctcctgttcgactcaggcgagaccgccgaagcaacaaggctcaagagaaccgcaaggagacggta- tac aagaaggaagaataggatctgctacctgcaggagattttcagcaacgaaatggcgaaggtggacgattcgttct- ttc atagattggaggagagtttcctcgtcgaggaagataagaagcacgagaggcatcctatctttggcaacattgtc- gac gaggttgcctatcacgaaaagtaccccacaatctatcatctgcggaagaagcttgtggactcgactgataaggc- gga
ccttagattgatctacctcgctctggcacacatgattaagttcaggggccattttctgatcgagggggatctta- acc cggacaatagcgatgtggacaagttgttcatccagctcgtccaaacctacaatcagctctttgaggaaaaccca- att aatgcttcaggcgtcgacgccaaggcgatcctgtctgcacgcctttcaaagtctcgccggcttgagaacttgat- cgc tcaactcccgggcgaaaagaagaacggcttgttcgggaatctcattgcactttcgttggggctcacaccaaact- tca agagtaattttgatctcgctgaggacgcaaagctgcagctttccaaggacacttatgacgatgacctggataac- ctt ttggcccaaatcggcgatcagtacgcggacttgttcctcgccgcgaagaatttgtcggacgcgatcctcctgag- tga tattctccgcgtgaacaccgagattacaaaggccccgctctcggcgagtatgatcaagcgctatgacgagcacc- atc aggatctgacccttttgaaggctttggtccggcagcaactcccagagaagtacaaggaaatcttctttgatcaa- tcc aagaacggctacgctggttatattgacggcggggcatcgcaggaggaattctacaagtttatcaagccaattct- gga gaagatggatggcacagaggaactcctggtgaagctcaatagggaggaccttttgcggaagcaaagaactttcg- ata acggcagcatccctcaccagattcatctcggggagctgcacgccatcctgagaaggcaggaagacttctacccc- ttt cttaaggataaccgggagaagatcgaaaagattctgacgttcagaattccgtactatgtcggaccactcgcccg- ggg taattccagatttgcgtggatgaccagaaagagcgaggaaaccatcacaccttggaacttcgaggaagtggtcg- ata agggcgcttccgcacagagcttcattgagcgcatgacaaattttgacaagaacctgcctaatgagaaggtcctt- ccc aagcattccctcctgtacgagtatttcactgtttataacgaactcacgaaggtgaagtatgtgaccgagggaat- gcg caagcccgccttcctgagcggcgagcaaaagaaggcgatcgtggaccttttgtttaagaccaatcggaaggtca- cag ttaagcagctcaaggaggactacttcaagaagattgaatgcttcgattccgttgagatcagcggcgtggaagac- agg tttaacgcgtcactggggacttaccacgatctcctgaagatcattaaggataaggacttcttggacaacgagga- aaa tgaggatatcctcgaagacattgtcctgactcttacgttgtttgaggatagggaaatgatcgaggaacgcttga- aga cgtatgcccatctcttcgatgacaaggttatgaagcagctcaagagaagaagatacaccggatggggaaggctg- tcc cgcaagcttatcaatggcattagagacaagcaatcagggaagacaatccttgactttttgaagtctgatggctt- cgc gaacaggaattttatgcagctgattcacgatgactcacttactttcaaggaggatatccagaaggctcaagtgt- cgg gacaaggtgacagtctgcacgagcatatcgccaaccttgcgggatctcctgcaatcaagaagggtattctgcag- aca gtcaaggttgtggatgagcttgtgaaggtcatgggacggcataagcccgagaacatcgttattgagatggccag- aga aaatcagaccacacaaaagggtcagaagaactcgagggagcgcatgaagcgcatcgaggaaggcattaaggagc- tgg 
ggagtcagatccttaaggagcacccggtggaaaacacgcagttgcaaaatgagaagctctatctgtactatctg- caa aatggcagggatatgtatgtggaccaggagttggatattaaccgcctctcggattacgacgtcgatcatatcgt- tcc tcagtccttccttaaggatgacagcattgacaataaggttctcaccaggtccgacaagaaccgcgggaagtccg- ata atgtgcccagcgaggaagtcgttaagaagatgaagaactactggaggcaacttttgaatgccaagttgatcaca- cag aggaagtttgataacctcactaaggccgagcgcggaggtctcagcgaactggacaaggcgggcttcattaagcg- gca actggttgagactagacagatcacgaagcacgtggcgcagattctcgattcacgcatgaacacgaagtacgatg- aga atgacaagctgatccgggaagtgaaggtcatcaccttgaagtcaaagctcgtttctgacttcaggaaggatttc- caa ttttataaggtgcgcgagatcaacaattatcaccatgctcatgacgcatacctcaacgctgtggtcggaacagc- att gattaagaagtacccgaagctcgagtccgaattcgtgtacggtgactataaggtttacgatgtgcgcaagatga- tcg ccaagtcagagcaggaaattggcaaggccactgcgaagtatttcttttactctaacattatgaatttctttaag- act gagatcacgctggctaatggcgaaatccggaagagaccacttattgagaccaacggcgagacaggggaaatcgt- gtg ggacaaggggagggatttcgccacagtccgcaaggttctctctatgcctcaagtgaatattgtcaagaagactg- aag tccagacgggcgggttctcaaaggaatctattctgcccaagcggaactcggataagcttatcgccagaaagaag- gac tgggacccgaagaagtatggaggtttcgactcaccaacggtggcttactctgtcctggttgtggcaaaggtgga- gaa gggaaagtcaaagaagctcaagtctgtcaaggagctcctgggtatcaccattatggagaggtccagcttcgaaa- aga atccgatcgattttctcgaggcgaagggatataaggaagtgaagaaggacctgatcattaagcttccaaagtac- agt cttttcgagttggaaaacggcaggaagcgcatgttggcttccgcaggagagctccagaagggtaacgagcttgc- ttt gccgtccaagtatgtgaacttcctctatctggcatcccactacgagaagctcaagggcagcccagaggataacg- aac agaagcaactgtttgtggagcaacacaagcattatcttgacgagatcattgaacagatttcggagttcagtaag- cgc gtcatcctcgccgacgcgaatttggataaggttctctcagcctacaacaagcaccgggacaagcctatcagaga- gca ggcggaaaatatcattcatctcttcaccctgacaaaccttggggctcccgctgcattcaagtattttgacacta- cga ttgatcggaagagatacacttctacgaaggaggtgctggatgcaacccttatccaccaatcgattactggcctc- tac gagacgcggatcgacttgagtcagctcgggggggataagagaccagcggcaaccaagaaggcaggacaagcgaa- gaa gaagaagtaggggcgagctcgaattcgctgaaatcaccagtctctctctacaaatctatctctctctattttct- cca taaataatgtgtgagtagtttcccgataagggaaattagggttcttatagggtttcgctcatgtgttgagcata- taa 
gaaacccttagtatgtatttgtatttgtaaaatacttctatcaataaaatttctaattcctaaaaccaaaatcc- agt actaaaatccagatctcctaaagtccctatagatctttgtcgtgaatataaaccagacacgagacgactaaacc- tgg agcccagacgccgttcgaagctagaagtaccgcttaggcaggaggccgttagggaaaagatgctaaggcagggt- tgg ttacgttgactcccccgtaggtttggtttaaatatgatgaagtggacggaaggaaggaggaagacaaggaagga- taa ggttgcaggccctgtgcaaggtaagaagatggaaatttgatagaggtacgctactatacttatactatacgcta- agg gaatgcttgtatttataccctataccccctaataaccccttatcaatttaagaaataatccgcataagcccccg- ctt aaaaattggtatcagagccatgaataggtctatgaccaaaactcaagaggataaaacctcaccaaaatacgaaa- gag ttcttaactctaaagataaaagatctttcaagatcaaaactagttccctcacaccggagcatgcgatatcctcg- aga gatctaggcgtaatcatggtcatagctgtttcctgtgtgaaattgttatccgctcacaattccacacaacatac- gag ccggaagcataaagtgtaaagcctggggtgcctaatgagtgagctaactcacattaattgcgttgcgctcactg- ccc gctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggagaggcggtttgcg- tat tgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctc- act caaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaa- aag gccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaa- tcg acgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcg- tgc gctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttct- caa tgctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgt- tca gcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactgg- cag cagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaac- tac ggctacactagaaggacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtag- ctc ttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaa- aag gatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggatt- ttg gtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaag- tat atatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttc- gtt catccatagttgcctgactccccgtcgtgtagataactacgatacgggagggcttaccatctggccccagtgct- gca 
atgataccgcgagacccacgctcaccggctccagatttatcagcaataaaccagccagccggaagggccgagcg- cag aagtggtcctgcaactttatccgcctccatccagtctattaattgttgccgggaagctagagtaagtagttcgc- cag ttaatagtttgcgcaacgttgttgccattgctacaggcatcgtggtgtcacgctcgtcgtttggtatggcttca- ttc agctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggttagctccttcgg- tcc tccgatcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattctctta- ctg tcatgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcgg- cga ccgagttgctcttgcccggcgtcaatacgggataataccgcgccacatagcagaactttaaaagtgctcatcat- tgg aaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaacccactcgtg- cac ccaactgatcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgca- aaa
aagggaataagggcgacacggaaatgttgaatactcatactcttcctttttcaatattattgaagcatttatca- ggg ttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttccgcgcacatttc- ccc gaaaagtgccacctgacgt pUC57-U3-tRNA-sgRNA vector sequence SEQ ID NO: 11 tcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaa- gcg gatgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggctggcttaactatgc- ggc atcagagcagattgtactgagagtgcaccagatgcggtgtgaaataccgcacagatgcgtaaggagaaaatacc- gca tcaggcgccattcgccattcaggctgcgcaactgttgggaagggcgatcggtgcgggcctcttcgctattacgc- cag ctggcgaaagggggatgtgctgcaaggcgattaagttgggtaacgccagggttttcccagtcacgacgttgtaa- aac gacggccagtgcctgcaggtcgacgattaaggaatctttaaacatacgaacagatcacttaaagttcttctgaa- gca acttaaagttatcaggcatgcatggatcttggaggaatcagatgtgcagtcagggaccatagcacaagacaggc- gtc ttctactggtgctaccagcaaatgctggaagccgggaacactgggtacgtcggaaaccacgtgatgtgaagaag- taa gataaactgtaggagaaaagcatttcgtagtgggccatgaagcctttcaggacatgtattgcagtatgggccgg- ccc attacgcaattggacgacaacaaagactagtattagtaccacctcggctatccacatagatcaaagctgattta- aaa gagttgtgcagatgatccgtggcaacaaagcaccagtggtctagtggtagaatagtaccctgccacggtacaga- ccc gggttcgattcccggctggtgcaagagaccgatatcccatggctcgagggtctcggttttagagctagaaatag- caa gttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtgctttttttccacataatctct- aga ggatccccggcgtaatcatggtcatagctgtttcctgtgtgaaattgttatccgctcacaattccacacaacat- acg agccggaagcataaagtgtaaagcctggggtgcctaatgagtgagctaactcacattaattgcgttgcgctcac- tgc ccgctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggagaggcggtttg- cgt attgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagc- tca ctcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagc- aaa aggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaa- aat cgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccct- cgt gcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgcttt- ctc atagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaacccccc- gtt 
cagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccact- ggc agcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggccta- act acggctacactagaagaacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggt- agc tcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaa- aaa aggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaaggga- ttt tggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaa- agt atatatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatt- tcg ttcatccatagttgcctgactccccgtcgtgtagataactacgatacgggagggcttaccatctggccccagtg- ctg caatgataccgcgactcccacgctcaccggctccagatttatcagcaataaaccagccagccggaagggccgag- cgc agaagtggtcctgcaactttatccgcctccatccagtctattaattgttgccgggaagctagagtaagtagttc- gcc agttaatagtttgcgcaacgttgttgccattgctacaggcatcgtggtgtcacgctcgtcgtttggtatggctt- cat tcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggttagctccttc- ggt cctccgatcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattctct- tac tgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgc- ggc gaccgagttgctcttgcccggcgtcaatacgggataataccgcgccacatagcagaactttaaaagtgctcatc- att ggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaacccactcg- tgc acccaactgatcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccg- caa aaaagggaataagggcgacacggaaatgttgaatactcatactcttcctttttcaatattattgaagcatttat- cag ggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttccgcgcacatt- tcc ccgaaaagtgccacctgacgtctaagaaaccattattatcatgacattaacctataaaaataggcgtatcacga- ggc cctttcgtc sequence of 5' end ribozyme SEQ ID NO: 12 NNNNNNCTGATGAGTCCGTGAGGACGAAACGAGTAAGCTCGTC sequence of 3' end ribozyme SEQ ID NO: 13 GGCCGGCATGGTCCCAGCCTCCTCGCTGGCGCCGGCTGGGCAACATGCTTCGGCATGGCGAATGGGAC SEQ ID NO: 14 AAGAAGAGAAAGGTC SEQ ID NO: 15 CCCAAGAAGAAGAGGAAGGTG SEQ ID NO: 16 CCAAAGAAGAAGAGGAAGGTT SEQ ID NO: 17 SGGSPKKKRKV SEQ ID NO: 18 TCGGGGGGGAGCCCAAAGAAGAAGCGGAAGGTG SEQ ID NO: 19 
PKKKRKV
SEQ ID NO: 20 AGGTCGGGGAGGGGACGTACGGG
SEQ ID NO: 21 GGCAAGGTCGGGGAGGGGACGTAC
SEQ ID NO: 22 AAACGTACGTCCCCTCCCCGACCT
SEQ ID NO: 23 GACGTCGGCGAGGAAGGCCTCGG
SEQ ID NO: 24 GGCAGACGTCGGCGAGGAAGGCCT
SEQ ID NO: 25 AAACAGGCCTTCCTCGCCGACGTC
SEQ ID NO: 26 CATGGTGGGGAAAGCTTGGAGGG
SEQ ID NO: 27 GGCACATGGTGGGGAAAGCTTGGA
SEQ ID NO: 28 AAACTCCAAGCTTTCCCCACCATG
SEQ ID NO: 29 CCGGACGACGACGTCGACGACGG
SEQ ID NO: 30 GGCACCGGACGACGACGTCGACGA
SEQ ID NO: 31 AAACTCGTCGACGTCGTCGTCCGG
SEQ ID NO: 32 TTGAAGTCCCTTCTAGATGGAGG
SEQ ID NO: 33 GGCATTGAAGTCCCTTCTAGATGG
SEQ ID NO: 34 AAACCCATCTAGAAGGGACTTCAA
SEQ ID NO: 35 ACTGCGACACCCAGATATCGTGG
SEQ ID NO: 36 GGCAACTGCGACACCCAGATATCG
SEQ ID NO: 37 AAACCGATATCTGGGTGTCGCAGT
SEQ ID NO: 38 GTTGGTCTTTGCTCCTGCAGAGG
SEQ ID NO: 39 GGCAGTTGGTCTTTGCTCCTGCAG
SEQ ID NO: 40 AAACCTGCAGGAGCAAAGACCAAC
SEQ ID NO: 41 TGCAAGGTCGGGGAGGGGACGTAC
SEQ ID NO: 42 TGCAGACGTCGGCGAGGAAGGCCT
SEQ ID NO: 43 TGCACATGGTGGGGAAAGCTTGGA
SEQ ID NO: 44 TGCACCGGACGACGACGTCGACGA
SEQ ID NO: 45 TGCATTGAAGTCCCTTCTAGATGG
SEQ ID NO: 46 TGCACTGCGACACCCAGATATCG
SEQ ID NO: 47 TGCAGTTGGTCTTTGCTCCTGCAG
SEQ ID NO: 48 GGCGTTGGTCTTTGCTCCTGCAG
SEQ ID NO: 49 GGCGGTTGGTCTTTGCTCCTGCAG
SEQ ID NO: 50 GGTGAGTGAGTGTGTGCGTGTGG
SEQ ID NO: 51 CACCGGTGAGTGAGTGTGTGCGTG
SEQ ID NO: 52 AAACCACGCACACACTCACTCACC
SEQ ID NO: 53 CACCGGGTGAGTGAGTGTGTGCGTG
SEQ ID NO: 54 AAACCACGCACACACTCACTCACCC
SEQ ID NO: 55 CACCGaacaaagcaccagtggtctagtggtagaatagtaccctgccacggtacagacccgggttcgattcccggctggtgcaGGTGAGTGAGTGTGTGCGTG
SEQ ID NO: 56 AAACCACGCACACACTCACTCACCtgcaccagccgggaatcgaacccgggtctgtaccgtggcagggtactattctaccactagaccactggtgctttgttC
SEQ ID NO: 57 NNNNNNNNNNNNNNNNNNNNTTTTTTT
SEQ ID NO: 58 NNNNNNNNNNNNNNNNNNNTTTTTTT
SEQ ID NO: 59 TGGAGTTGGTCTTTGCTCCTGCAGAGG
SEQ ID NO: 60 GACGCCGGCGAGGAAGGCCTCGG
SEQ ID NO: 61 GCAGTCGGAGAGGAAGGCCTGGG
SEQ ID NO: 62 AGATCGGGGAGGGGACGTACGGG
SEQ ID NO: 63 AGGTGGGGGAAGGGACGTACGGG
SEQ ID NO: 64 AGATTGGGGAGGGCACGTACGGG
SEQ ID NO: 65 AGCGTCGGCGAGGAAGGCCTCGG
SEQ ID NO: 66 GGTGTCGGCGAGGAAGGCCTCGG
SEQ ID NO: 67 GATATCGGCGAGGAAGGCCTCGG
SEQ ID NO: 68 GACACCGGCGAGGAAGGCCTCGG
SEQ ID NO: 69 GACGCTGGCGAGGAAGGCCTCGG
SEQ ID NO: 70 GACGTTAGCGAGGAAGGCCTCGG
SEQ ID NO: 71 GACGTCAACGAGGAAGGCCTCGG
SEQ ID NO: 72 GACGTCGATGAGGAAGGCCTCGG
SEQ ID NO: 73 GACGTCGGTAAGGAAGGCCTCGG
SEQ ID NO: 74 GACGTCGGCAGGGAAGGCCTCGG
SEQ ID NO: 75 GACGTCGGCGGAGAAGGCCTCGG
SEQ ID NO: 76 GACGTCGGCGAAAAAGGCCTCGG
SEQ ID NO: 77 GACGTCGGCGAGAGAGGCCTCGG
SEQ ID NO: 78 GACGTCGGCGAGGGGGGCCTCGG
SEQ ID NO: 79 GACGTCGGCGAGGAGAGCCTCGG
SEQ ID NO: 80 GACGTCGGCGAGGAAAACCTCGG
SEQ ID NO: 81 GACGTCGGCGAGGAAGATCTCGG
SEQ ID NO: 82 GACGTCGGCGAGGAAGGTTTCGG
SEQ ID NO: 83 GACGTCGGCGAGGAAGGCTCCGG
SEQ ID NO: 84 GCTGAGTGAGTGTATGCGTGTGG
SEQ ID NO: 85 TGTGGGTGAGTGTGTGCGTGAGG
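The listing does not state how consecutive short entries relate to one another. As a hedged illustration only, the sketch below checks one reading of SEQ ID NOs: 20 to 22 taken from the sequences themselves: that SEQ ID NO: 20 is a 20-nt target followed by a 3-nt PAM, that SEQ ID NO: 21 is the same 20 nt with a GGCA prefix, and that SEQ ID NO: 22 is the reverse complement of those 20 nt with an AAAC prefix. The function names, the PAM length, and the interpretation of the prefixes as cloning overhangs are assumptions made for this example and are not taken from the application.

```python
# Hypothetical helper names; only the three sequences are copied from the listing.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def oligo_pair(target_with_pam: str, pam_len: int = 3,
               fwd_prefix: str = "GGCA", rev_prefix: str = "AAAC") -> tuple:
    """Build a forward/reverse oligo pair from a target that ends in a PAM.

    Assumption: the last `pam_len` bases are the PAM and are not part of
    the oligos; the prefixes are assumed to be cloning overhangs.
    """
    protospacer = target_with_pam[:-pam_len]
    forward = fwd_prefix + protospacer
    reverse = rev_prefix + reverse_complement(protospacer)
    return forward, reverse


TARGET_WITH_PAM = "AGGTCGGGGAGGGGACGTACGGG"   # SEQ ID NO: 20
FORWARD_OLIGO = "GGCAAGGTCGGGGAGGGGACGTAC"    # SEQ ID NO: 21
REVERSE_OLIGO = "AAACGTACGTCCCCTCCCCGACCT"    # SEQ ID NO: 22

if __name__ == "__main__":
    print(oligo_pair(TARGET_WITH_PAM) == (FORWARD_OLIGO, REVERSE_OLIGO))
```

The same check can be pointed at other triples in the listing (for example SEQ ID NOs: 23 to 25) by substituting the three strings; whether every triple follows this pattern is not asserted here.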
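Sequence CWU 1 1
Number of SEQ ID NOs: 85
SEQ ID NO: 1; Length: 77; Type: DNA; Organism: Artificial Sequence; Other information: tRNA encoding sequence
aacaaagcac cagtggtcta gtggtagaat agtaccctgc cacggtacag acccgggttc 60
gattcccggc tggtgca 77
SEQ ID NO: 2; Length: 4206; Type: DNA; Organism: Artificial Sequence; Other information: SpCas9 nucleotide sequence
atggccccta agaagaagag aaaggtcggt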
attcacggcg ttcctgcggc gatggacaag 60
aagtatagta ttggtctgga cattgggacg
aattccgttg gctgggccgt gatcaccgat 120
gagtacaagg tcccttccaa gaagtttaag
gttctgggga acaccgatcg gcacagcatc 180
aagaagaatc tcattggagc cctcctgttc
gactcaggcg agaccgccga agcaacaagg 240
ctcaagagaa ccgcaaggag acggtataca
agaaggaaga ataggatctg ctacctgcag 300
gagattttca gcaacgaaat ggcgaaggtg
gacgattcgt tctttcatag attggaggag 360
agtttcctcg tcgaggaaga taagaagcac
gagaggcatc ctatctttgg caacattgtc 420
gacgaggttg cctatcacga aaagtacccc
acaatctatc atctgcggaa gaagcttgtg 480
gactcgactg ataaggcgga ccttagattg
atctacctcg ctctggcaca catgattaag 540
ttcaggggcc attttctgat cgagggggat
cttaacccgg acaatagcga tgtggacaag 600
ttgttcatcc agctcgtcca aacctacaat
cagctctttg aggaaaaccc aattaatgct 660
tcaggcgtcg acgccaaggc gatcctgtct
gcacgccttt caaagtctcg ccggcttgag 720
aacttgatcg ctcaactccc gggcgaaaag
aagaacggct tgttcgggaa tctcattgca 780
ctttcgttgg ggctcacacc aaacttcaag
agtaattttg atctcgctga ggacgcaaag 840
ctgcagcttt ccaaggacac ttatgacgat
gacctggata accttttggc ccaaatcggc 900
gatcagtacg cggacttgtt cctcgccgcg
aagaatttgt cggacgcgat cctcctgagt 960
gatattctcc gcgtgaacac cgagattaca
aaggccccgc tctcggcgag tatgatcaag 1020
cgctatgacg agcaccatca ggatctgacc
cttttgaagg ctttggtccg gcagcaactc 1080
ccagagaagt acaaggaaat cttctttgat
caatccaaga acggctacgc tggttatatt 1140
gacggcgggg catcgcagga ggaattctac
aagtttatca agccaattct ggagaagatg 1200
gatggcacag aggaactcct ggtgaagctc
aatagggagg accttttgcg gaagcaaaga 1260
actttcgata acggcagcat ccctcaccag
attcatctcg gggagctgca cgccatcctg 1320
agaaggcagg aagacttcta cccctttctt
aaggataacc gggagaagat cgaaaagatt 1380
ctgacgttca gaattccgta ctatgtcgga
ccactcgccc ggggtaattc cagatttgcg 1440
tggatgacca gaaagagcga ggaaaccatc
acaccttgga acttcgagga agtggtcgat 1500
aagggcgctt ccgcacagag cttcattgag
cgcatgacaa attttgacaa gaacctgcct 1560
aatgagaagg tccttcccaa gcattccctc
ctgtacgagt atttcactgt ttataacgaa 1620
ctcacgaagg tgaagtatgt gaccgaggga
atgcgcaagc ccgccttcct gagcggcgag 1680
caaaagaagg cgatcgtgga ccttttgttt
aagaccaatc ggaaggtcac agttaagcag 1740
ctcaaggagg actacttcaa gaagattgaa
tgcttcgatt ccgttgagat cagcggcgtg 1800
gaagacaggt ttaacgcgtc actggggact
taccacgatc tcctgaagat cattaaggat 1860
aaggacttct tggacaacga ggaaaatgag
gatatcctcg aagacattgt cctgactctt 1920
acgttgtttg aggataggga aatgatcgag
gaacgcttga agacgtatgc ccatctcttc 1980
gatgacaagg ttatgaagca gctcaagaga
agaagataca ccggatgggg aaggctgtcc 2040
cgcaagctta tcaatggcat tagagacaag
caatcaggga agacaatcct tgactttttg 2100
aagtctgatg gcttcgcgaa caggaatttt
atgcagctga ttcacgatga ctcacttact 2160ttcaaggagg atatccagaa ggctcaagtg
tcgggacaag gtgacagtct gcacgagcat 2220atcgccaacc ttgcgggatc tcctgcaatc
aagaagggta ttctgcagac agtcaaggtt 2280gtggatgagc ttgtgaaggt catgggacgg
cataagcccg agaacatcgt tattgagatg 2340gccagagaaa atcagaccac acaaaagggt
cagaagaact cgagggagcg catgaagcgc 2400atcgaggaag gcattaagga gctggggagt
cagatcctta aggagcaccc ggtggaaaac 2460acgcagttgc aaaatgagaa gctctatctg
tactatctgc aaaatggcag ggatatgtat 2520gtggaccagg agttggatat taaccgcctc
tcggattacg acgtcgatca tatcgttcct 2580cagtccttcc ttaaggatga cagcattgac
aataaggttc tcaccaggtc cgacaagaac 2640cgcgggaagt ccgataatgt gcccagcgag
gaagtcgtta agaagatgaa gaactactgg 2700aggcaacttt tgaatgccaa gttgatcaca
cagaggaagt ttgataacct cactaaggcc 2760gagcgcggag gtctcagcga actggacaag
gcgggcttca ttaagcggca actggttgag 2820actagacaga tcacgaagca cgtggcgcag
attctcgatt cacgcatgaa cacgaagtac 2880gatgagaatg acaagctgat ccgggaagtg
aaggtcatca ccttgaagtc aaagctcgtt 2940tctgacttca ggaaggattt ccaattttat
aaggtgcgcg agatcaacaa ttatcaccat 3000gctcatgacg catacctcaa cgctgtggtc
ggaacagcat tgattaagaa gtacccgaag 3060ctcgagtccg aattcgtgta cggtgactat
aaggtttacg atgtgcgcaa gatgatcgcc 3120aagtcagagc aggaaattgg caaggccact
gcgaagtatt tcttttactc taacattatg 3180aatttcttta agactgagat cacgctggct
aatggcgaaa tccggaagag accacttatt 3240gagaccaacg gcgagacagg ggaaatcgtg
tgggacaagg ggagggattt cgccacagtc 3300cgcaaggttc tctctatgcc tcaagtgaat
attgtcaaga agactgaagt ccagacgggc 3360gggttctcaa aggaatctat tctgcccaag
cggaactcgg ataagcttat cgccagaaag 3420aaggactggg acccgaagaa gtatggaggt
ttcgactcac caacggtggc ttactctgtc 3480ctggttgtgg caaaggtgga gaagggaaag
tcaaagaagc tcaagtctgt caaggagctc 3540ctgggtatca ccattatgga gaggtccagc
ttcgaaaaga atccgatcga ttttctcgag 3600gcgaagggat ataaggaagt gaagaaggac
ctgatcatta agcttccaaa gtacagtctt 3660ttcgagttgg aaaacggcag gaagcgcatg
ttggcttccg caggagagct ccagaagggt 3720aacgagcttg ctttgccgtc caagtatgtg
aacttcctct atctggcatc ccactacgag 3780aagctcaagg gcagcccaga ggataacgaa
cagaagcaac tgtttgtgga gcaacacaag 3840cattatcttg acgagatcat tgaacagatt
tcggagttca gtaagcgcgt catcctcgcc 3900gacgcgaatt tggataaggt tctctcagcc
tacaacaagc accgggacaa gcctatcaga 3960gagcaggcgg aaaatatcat tcatctcttc
accctgacaa accttggggc tcccgctgca 4020ttcaagtatt ttgacactac gattgatcgg
aagagataca cttctacgaa ggaggtgctg 4080gatgcaaccc ttatccacca atcgattact
ggcctctacg agacgcggat cgacttgagt 4140cagctcgggg gggataagag accagcggca
accaagaagg caggacaagc gaagaagaag 4200
aagtag 4206
SEQ ID NO: 3 (length 1401, PRT, Streptococcus pyogenes)
Met
Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala1
5 10 15Ala Met Asp Lys Lys Tyr Ser
Ile Gly Leu Asp Ile Gly Thr Asn Ser 20 25
30Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser
Lys Lys 35 40 45Phe Lys Val Leu
Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu 50 55
60Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu
Ala Thr Arg65 70 75
80Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile
85 90 95Cys Tyr Leu Gln Glu Ile
Phe Ser Asn Glu Met Ala Lys Val Asp Asp 100
105 110Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val
Glu Glu Asp Lys 115 120 125Lys His
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala 130
135 140Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu
Arg Lys Lys Leu Val145 150 155
160Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala
165 170 175His Met Ile Lys
Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn 180
185 190Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile
Gln Leu Val Gln Thr 195 200 205Tyr
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp 210
215 220Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser
Lys Ser Arg Arg Leu Glu225 230 235
240Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe
Gly 245 250 255Asn Leu Ile
Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn 260
265 270Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln
Leu Ser Lys Asp Thr Tyr 275 280
285Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala 290
295 300Asp Leu Phe Leu Ala Ala Lys Asn
Leu Ser Asp Ala Ile Leu Leu Ser305 310
315 320Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala
Pro Leu Ser Ala 325 330
335Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu
340 345 350Lys Ala Leu Val Arg Gln
Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe 355 360
365Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly
Gly Ala 370 375 380Ser Gln Glu Glu Phe
Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met385 390
395 400Asp Gly Thr Glu Glu Leu Leu Val Lys Leu
Asn Arg Glu Asp Leu Leu 405 410
415Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His
420 425 430Leu Gly Glu Leu His
Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro 435
440 445Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile
Leu Thr Phe Arg 450 455 460Ile Pro Tyr
Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala465
470 475 480Trp Met Thr Arg Lys Ser Glu
Glu Thr Ile Thr Pro Trp Asn Phe Glu 485
490 495Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe
Ile Glu Arg Met 500 505 510Thr
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His 515
520 525Ser Leu Leu Tyr Glu Tyr Phe Thr Val
Tyr Asn Glu Leu Thr Lys Val 530 535
540Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu545
550 555 560Gln Lys Lys Ala
Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val 565
570 575Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe
Lys Lys Ile Glu Cys Phe 580 585
590Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu
595 600 605Gly Thr Tyr His Asp Leu Leu
Lys Ile Ile Lys Asp Lys Asp Phe Leu 610 615
620Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr
Leu625 630 635 640Thr Leu
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr
645 650 655Ala His Leu Phe Asp Asp Lys
Val Met Lys Gln Leu Lys Arg Arg Arg 660 665
670Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly
Ile Arg 675 680 685Asp Lys Gln Ser
Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly 690
695 700Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp
Asp Ser Leu Thr705 710 715
720Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser
725 730 735Leu His Glu His Ile
Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys 740
745 750Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu
Val Lys Val Met 755 760 765Gly Arg
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn 770
775 780Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
Glu Arg Met Lys Arg785 790 795
800Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His
805 810 815Pro Val Glu Asn
Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr 820
825 830Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln
Glu Leu Asp Ile Asn 835 840 845Arg
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu 850
855 860Lys Asp Asp Ser Ile Asp Asn Lys Val Leu
Thr Arg Ser Asp Lys Asn865 870 875
880Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys
Met 885 890 895Lys Asn Tyr
Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg 900
905 910Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg
Gly Gly Leu Ser Glu Leu 915 920
925Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile 930
935 940Thr Lys His Val Ala Gln Ile Leu
Asp Ser Arg Met Asn Thr Lys Tyr945 950
955 960Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val
Ile Thr Leu Lys 965 970
975Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val
980 985 990Arg Glu Ile Asn Asn Tyr
His His Ala His Asp Ala Tyr Leu Asn Ala 995 1000
1005Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys
Leu Glu Ser 1010 1015 1020Glu Phe Val
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met 1025
1030 1035Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala
Thr Ala Lys Tyr 1040 1045 1050Phe Phe
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr 1055
1060 1065Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro
Leu Ile Glu Thr Asn 1070 1075 1080Gly
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala 1085
1090 1095Thr Val Arg Lys Val Leu Ser Met Pro
Gln Val Asn Ile Val Lys 1100 1105
1110Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu
1115 1120 1125Pro Lys Arg Asn Ser Asp
Lys Leu Ile Ala Arg Lys Lys Asp Trp 1130 1135
1140Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala
Tyr 1145 1150 1155Ser Val Leu Val Val
Ala Lys Val Glu Lys Gly Lys Ser Lys Lys 1160 1165
1170Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met
Glu Arg 1175 1180 1185Ser Ser Phe Glu
Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly 1190
1195 1200Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys
Leu Pro Lys Tyr 1205 1210 1215Ser Leu
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser 1220
1225 1230Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu
Ala Leu Pro Ser Lys 1235 1240 1245Tyr
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys 1250
1255 1260Gly Ser Pro Glu Asp Asn Glu Gln Lys
Gln Leu Phe Val Glu Gln 1265 1270
1275His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe
1280 1285 1290Ser Lys Arg Val Ile Leu
Ala Asp Ala Asn Leu Asp Lys Val Leu 1295 1300
1305Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln
Ala 1310 1315 1320Glu Asn Ile Ile His
Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro 1325 1330
1335Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys
Arg Tyr 1340 1345 1350Thr Ser Thr Lys
Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser 1355
1360 1365Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu
Ser Gln Leu Gly 1370 1375 1380Gly Asp
Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys 1385
1390 1395Lys Lys Lys 1400
SEQ ID NO: 4 (length 1401, PRT, Artificial Sequence; eSpCas9(1.0) amino acid sequence)
Met Ala Pro Lys Lys Lys Arg Lys
Val Gly Ile His Gly Val Pro Ala1 5 10
15Ala Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr
Asn Ser 20 25 30Val Gly Trp
Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys 35
40 45Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser
Ile Lys Lys Asn Leu 50 55 60Ile Gly
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg65
70 75 80Leu Lys Arg Thr Ala Arg Arg
Arg Tyr Thr Arg Arg Lys Asn Arg Ile 85 90
95Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys
Val Asp Asp 100 105 110Ser Phe
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys 115
120 125Lys His Glu Arg His Pro Ile Phe Gly Asn
Ile Val Asp Glu Val Ala 130 135 140Tyr
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val145
150 155 160Asp Ser Thr Asp Lys Ala
Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala 165
170 175His Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu
Gly Asp Leu Asn 180 185 190Pro
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr 195
200 205Tyr Asn Gln Leu Phe Glu Glu Asn Pro
Ile Asn Ala Ser Gly Val Asp 210 215
220Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu225
230 235 240Asn Leu Ile Ala
Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly 245
250 255Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr
Pro Asn Phe Lys Ser Asn 260 265
270Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr
275 280 285Asp Asp Asp Leu Asp Asn Leu
Leu Ala Gln Ile Gly Asp Gln Tyr Ala 290 295
300Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu
Ser305 310 315 320Asp Ile
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala
325 330 335Ser Met Ile Lys Arg Tyr Asp
Glu His His Gln Asp Leu Thr Leu Leu 340 345
350Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu
Ile Phe 355 360 365Phe Asp Gln Ser
Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala 370
375 380Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile
Leu Glu Lys Met385 390 395
400Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu
405 410 415Arg Lys Gln Arg Thr
Phe Asp Asn Gly Ser Ile Pro His Gln Ile His 420
425 430Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu
Asp Phe Tyr Pro 435 440 445Phe Leu
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg 450
455 460Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly
Asn Ser Arg Phe Ala465 470 475
480Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu
485 490 495Glu Val Val Asp
Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met 500
505 510Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys
Val Leu Pro Lys His 515 520 525Ser
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val 530
535 540Lys Tyr Val Thr Glu Gly Met Arg Lys Pro
Ala Phe Leu Ser Gly Glu545 550 555
560Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys
Val 565 570 575Thr Val Lys
Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe 580
585 590Asp Ser Val Glu Ile Ser Gly Val Glu Asp
Arg Phe Asn Ala Ser Leu 595 600
605Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu 610
615 620Asp Asn Glu Glu Asn Glu Asp Ile
Leu Glu Asp Ile Val Leu Thr Leu625 630
635 640Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg
Leu Lys Thr Tyr 645 650
655Ala His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg
660 665 670Tyr Thr Gly Trp Gly Arg
Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg 675 680
685Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser
Asp Gly 690 695 700Phe Ala Asn Arg Asn
Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr705 710
715 720Phe Lys Glu Asp Ile Gln Lys Ala Gln Val
Ser Gly Gln Gly Asp Ser 725 730
735Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys
740 745 750Gly Ile Leu Gln Thr
Val Lys Val Val Asp Glu Leu Val Lys Val Met 755
760 765Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met
Ala Arg Glu Asn 770 775 780Gln Thr Thr
Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg785
790 795 800Ile Glu Glu Gly Ile Lys Glu
Leu Gly Ser Gln Ile Leu Lys Glu His 805
810 815Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Ala Leu
Tyr Leu Tyr Tyr 820 825 830Leu
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn 835
840 845Arg Leu Ser Asp Tyr Asp Val Asp His
Ile Val Pro Gln Ser Phe Leu 850 855
860Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn865
870 875 880Arg Gly Lys Ser
Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met 885
890 895Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala
Lys Leu Ile Thr Gln Arg 900 905
910Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu
915 920 925Asp Lys Ala Gly Phe Ile Lys
Arg Gln Leu Val Glu Thr Arg Gln Ile 930 935
940Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys
Tyr945 950 955 960Asp Glu
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys
965 970 975Ser Lys Leu Val Ser Asp Phe
Arg Lys Asp Phe Gln Phe Tyr Lys Val 980 985
990Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu
Asn Ala 995 1000 1005Val Val Gly
Thr Ala Leu Ile Lys Lys Tyr Pro Ala Leu Glu Ser 1010
1015 1020Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp
Val Arg Lys Met 1025 1030 1035Ile Ala
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr 1040
1045 1050Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe
Lys Thr Glu Ile Thr 1055 1060 1065Leu
Ala Asn Gly Glu Ile Arg Lys Ala Pro Leu Ile Glu Thr Asn 1070
1075 1080Gly Glu Thr Gly Glu Ile Val Trp Asp
Lys Gly Arg Asp Phe Ala 1085 1090
1095Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys
1100 1105 1110Lys Thr Glu Val Gln Thr
Gly Gly Phe Ser Lys Glu Ser Ile Leu 1115 1120
1125Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp
Trp 1130 1135 1140Asp Pro Lys Lys Tyr
Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr 1145 1150
1155Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser
Lys Lys 1160 1165 1170Leu Lys Ser Val
Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg 1175
1180 1185Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu
Glu Ala Lys Gly 1190 1195 1200Tyr Lys
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr 1205
1210 1215Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys
Arg Met Leu Ala Ser 1220 1225 1230Ala
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys 1235
1240 1245Tyr Val Asn Phe Leu Tyr Leu Ala Ser
His Tyr Glu Lys Leu Lys 1250 1255
1260Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln
1265 1270 1275His Lys His Tyr Leu Asp
Glu Ile Ile Glu Gln Ile Ser Glu Phe 1280 1285
1290Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val
Leu 1295 1300 1305Ser Ala Tyr Asn Lys
His Arg Asp Lys Pro Ile Arg Glu Gln Ala 1310 1315
1320Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly
Ala Pro 1325 1330 1335Ala Ala Phe Lys
Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr 1340
1345 1350Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu
Ile His Gln Ser 1355 1360 1365Ile Thr
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly 1370
1375 1380Gly Asp Lys Arg Pro Ala Ala Thr Lys Lys
Ala Gly Gln Ala Lys 1385 1390 1395Lys
Lys Lys 1400
SEQ ID NO: 5 (length 1401, PRT, Artificial Sequence; eSpCas9(1.1) amino acid sequence)
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro
Ala1 5 10 15Ala Met Asp
Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser 20
25 30Val Gly Trp Ala Val Ile Thr Asp Glu Tyr
Lys Val Pro Ser Lys Lys 35 40
45Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu 50
55 60Ile Gly Ala Leu Leu Phe Asp Ser Gly
Glu Thr Ala Glu Ala Thr Arg65 70 75
80Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn
Arg Ile 85 90 95Cys Tyr
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp 100
105 110Ser Phe Phe His Arg Leu Glu Glu Ser
Phe Leu Val Glu Glu Asp Lys 115 120
125Lys His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala
130 135 140Tyr His Glu Lys Tyr Pro Thr
Ile Tyr His Leu Arg Lys Lys Leu Val145 150
155 160Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr
Leu Ala Leu Ala 165 170
175His Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn
180 185 190Pro Asp Asn Ser Asp Val
Asp Lys Leu Phe Ile Gln Leu Val Gln Thr 195 200
205Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly
Val Asp 210 215 220Ala Lys Ala Ile Leu
Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu225 230
235 240Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys
Lys Asn Gly Leu Phe Gly 245 250
255Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn
260 265 270Phe Asp Leu Ala Glu
Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr 275
280 285Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly
Asp Gln Tyr Ala 290 295 300Asp Leu Phe
Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser305
310 315 320Asp Ile Leu Arg Val Asn Thr
Glu Ile Thr Lys Ala Pro Leu Ser Ala 325
330 335Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp
Leu Thr Leu Leu 340 345 350Lys
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe 355
360 365Phe Asp Gln Ser Lys Asn Gly Tyr Ala
Gly Tyr Ile Asp Gly Gly Ala 370 375
380Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met385
390 395 400Asp Gly Thr Glu
Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu 405
410 415Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser
Ile Pro His Gln Ile His 420 425
430Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro
435 440 445Phe Leu Lys Asp Asn Arg Glu
Lys Ile Glu Lys Ile Leu Thr Phe Arg 450 455
460Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe
Ala465 470 475 480Trp Met
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu
485 490 495Glu Val Val Asp Lys Gly Ala
Ser Ala Gln Ser Phe Ile Glu Arg Met 500 505
510Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro
Lys His 515 520 525Ser Leu Leu Tyr
Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val 530
535 540Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe
Leu Ser Gly Glu545 550 555
560Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val
565 570 575Thr Val Lys Gln Leu
Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe 580
585 590Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe
Asn Ala Ser Leu 595 600 605Gly Thr
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu 610
615 620Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp
Ile Val Leu Thr Leu625 630 635
640Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr
645 650 655Ala His Leu Phe
Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg 660
665 670Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu
Ile Asn Gly Ile Arg 675 680 685Asp
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly 690
695 700Phe Ala Asn Arg Asn Phe Met Gln Leu Ile
His Asp Asp Ser Leu Thr705 710 715
720Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp
Ser 725 730 735Leu His Glu
His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys 740
745 750Gly Ile Leu Gln Thr Val Lys Val Val Asp
Glu Leu Val Lys Val Met 755 760
765Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn 770
775 780Gln Thr Thr Gln Lys Gly Gln Lys
Asn Ser Arg Glu Arg Met Lys Arg785 790
795 800Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile
Leu Lys Glu His 805 810
815Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr
820 825 830Leu Gln Asn Gly Arg Asp
Met Tyr Val Asp Gln Glu Leu Asp Ile Asn 835 840
845Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser
Phe Leu 850 855 860Ala Asp Asp Ser Ile
Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn865 870
875 880Arg Gly Lys Ser Asp Asn Val Pro Ser Glu
Glu Val Val Lys Lys Met 885 890
895Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg
900 905 910Lys Phe Asp Asn Leu
Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu 915
920 925Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu
Thr Arg Gln Ile 930 935 940Thr Lys His
Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr945
950 955 960Asp Glu Asn Asp Lys Leu Ile
Arg Glu Val Lys Val Ile Thr Leu Lys 965
970 975Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln
Phe Tyr Lys Val 980 985 990Arg
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala 995
1000 1005Val Val Gly Thr Ala Leu Ile Lys
Lys Tyr Pro Ala Leu Glu Ser 1010 1015
1020Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met
1025 1030 1035Ile Ala Lys Ser Glu Gln
Glu Ile Gly Lys Ala Thr Ala Lys Tyr 1040 1045
1050Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile
Thr 1055 1060 1065Leu Ala Asn Gly Glu
Ile Arg Lys Ala Pro Leu Ile Glu Thr Asn 1070 1075
1080Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp
Phe Ala 1085 1090 1095Thr Val Arg Lys
Val Leu Ser Met Pro Gln Val Asn Ile Val Lys 1100
1105 1110Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys
Glu Ser Ile Leu 1115 1120 1125Pro Lys
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp 1130
1135 1140Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser
Pro Thr Val Ala Tyr 1145 1150 1155Ser
Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys 1160
1165 1170Leu Lys Ser Val Lys Glu Leu Leu Gly
Ile Thr Ile Met Glu Arg 1175 1180
1185Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly
1190 1195 1200Tyr Lys Glu Val Lys Lys
Asp Leu Ile Ile Lys Leu Pro Lys Tyr 1205 1210
1215Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala
Ser 1220 1225 1230Ala Gly Glu Leu Gln
Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys 1235 1240
1245Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys
Leu Lys 1250 1255 1260Gly Ser Pro Glu
Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln 1265
1270 1275His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln
Ile Ser Glu Phe 1280 1285 1290Ser Lys
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu 1295
1300 1305Ser Ala Tyr Asn Lys His Arg Asp Lys Pro
Ile Arg Glu Gln Ala 1310 1315 1320Glu
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro 1325
1330 1335Ala Ala Phe Lys Tyr Phe Asp Thr Thr
Ile Asp Arg Lys Arg Tyr 1340 1345
1350Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser
1355 1360 1365Ile Thr Gly Leu Tyr Glu
Thr Arg Ile Asp Leu Ser Gln Leu Gly 1370 1375
1380Gly Asp Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala
Lys 1385 1390 1395Lys Lys Lys
1400
SEQ ID NO: 6 (length 1401, PRT, Artificial Sequence; SpCas9-HF1 amino acid sequence)
Met Ala
Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala1 5
10 15Ala Met Asp Lys Lys Tyr Ser Ile
Gly Leu Asp Ile Gly Thr Asn Ser 20 25
30Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys
Lys 35 40 45Phe Lys Val Leu Gly
Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu 50 55
60Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala
Thr Arg65 70 75 80Leu
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile
85 90 95Cys Tyr Leu Gln Glu Ile Phe
Ser Asn Glu Met Ala Lys Val Asp Asp 100 105
110Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu
Asp Lys 115 120 125Lys His Glu Arg
His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala 130
135 140Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg
Lys Lys Leu Val145 150 155
160Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala
165 170 175His Met Ile Lys Phe
Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn 180
185 190Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln
Leu Val Gln Thr 195 200 205Tyr Asn
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp 210
215 220Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys
Ser Arg Arg Leu Glu225 230 235
240Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly
245 250 255Asn Leu Ile Ala
Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn 260
265 270Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu
Ser Lys Asp Thr Tyr 275 280 285Asp
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala 290
295 300Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser
Asp Ala Ile Leu Leu Ser305 310 315
320Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser
Ala 325 330 335Ser Met Ile
Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu 340
345 350Lys Ala Leu Val Arg Gln Gln Leu Pro Glu
Lys Tyr Lys Glu Ile Phe 355 360
365Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala 370
375 380Ser Gln Glu Glu Phe Tyr Lys Phe
Ile Lys Pro Ile Leu Glu Lys Met385 390
395 400Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg
Glu Asp Leu Leu 405 410
415Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His
420 425 430Leu Gly Glu Leu His Ala
Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro 435 440
445Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr
Phe Arg 450 455 460Ile Pro Tyr Tyr Val
Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala465 470
475 480Trp Met Thr Arg Lys Ser Glu Glu Thr Ile
Thr Pro Trp Asn Phe Glu 485 490
495Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met
500 505 510Thr Ala Phe Asp Lys
Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His 515
520 525Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu
Leu Thr Lys Val 530 535 540Lys Tyr Val
Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu545
550 555 560Gln Lys Lys Ala Ile Val Asp
Leu Leu Phe Lys Thr Asn Arg Lys Val 565
570 575Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys
Ile Glu Cys Phe 580 585 590Asp
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu 595
600 605Gly Thr Tyr His Asp Leu Leu Lys Ile
Ile Lys Asp Lys Asp Phe Leu 610 615
620Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu625
630 635 640Thr Leu Phe Glu
Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr 645
650 655Ala His Leu Phe Asp Asp Lys Val Met Lys
Gln Leu Lys Arg Arg Arg 660 665
670Tyr Thr Gly Trp Gly Ala Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg
675 680 685Asp Lys Gln Ser Gly Lys Thr
Ile Leu Asp Phe Leu Lys Ser Asp Gly 690 695
700Phe Ala Asn Arg Asn Phe Met Ala Leu Ile His Asp Asp Ser Leu
Thr705 710 715 720Phe Lys
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser
725 730 735Leu His Glu His Ile Ala Asn
Leu Ala Gly Ser Pro Ala Ile Lys Lys 740 745
750Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys
Val Met 755 760 765Gly Arg His Lys
Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn 770
775 780Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu
Arg Met Lys Arg785 790 795
800Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His
805 810 815Pro Val Glu Asn Thr
Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr 820
825 830Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu
Leu Asp Ile Asn 835 840 845Arg Leu
Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu 850
855 860Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr
Arg Ser Asp Lys Asn865 870 875
880Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met
885 890 895Lys Asn Tyr Trp
Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg 900
905 910Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly
Gly Leu Ser Glu Leu 915 920 925Asp
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Ala Ile 930
935 940Thr Lys His Val Ala Gln Ile Leu Asp Ser
Arg Met Asn Thr Lys Tyr945 950 955
960Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu
Lys 965 970 975Ser Lys Leu
Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val 980
985 990Arg Glu Ile Asn Asn Tyr His His Ala His
Asp Ala Tyr Leu Asn Ala 995 1000
1005Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser
1010 1015 1020Glu Phe Val Tyr Gly Asp
Tyr Lys Val Tyr Asp Val Arg Lys Met 1025 1030
1035Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys
Tyr 1040 1045 1050Phe Phe Tyr Ser Asn
Ile Met Asn Phe Phe Lys Thr Glu Ile Thr 1055 1060
1065Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu
Thr Asn 1070 1075 1080Gly Glu Thr Gly
Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala 1085
1090 1095Thr Val Arg Lys Val Leu Ser Met Pro Gln Val
Asn Ile Val Lys 1100 1105 1110Lys Thr
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu 1115
1120 1125Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala
Arg Lys Lys Asp Trp 1130 1135 1140Asp
Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr 1145
1150 1155Ser Val Leu Val Val Ala Lys Val Glu
Lys Gly Lys Ser Lys Lys 1160 1165
1170Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg
1175 1180 1185Ser Ser Phe Glu Lys Asn
Pro Ile Asp Phe Leu Glu Ala Lys Gly 1190 1195
1200Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys
Tyr 1205 1210 1215Ser Leu Phe Glu Leu
Glu Asn Gly Arg Lys Arg Met Leu Ala Ser 1220 1225
1230Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro
Ser Lys 1235 1240 1245Tyr Val Asn Phe
Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys 1250
1255 1260Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu
Phe Val Glu Gln 1265 1270 1275His Lys
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe 1280
1285 1290Ser Lys Arg Val Ile Leu Ala Asp Ala Asn
Leu Asp Lys Val Leu 1295 1300 1305Ser
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala 1310
1315 1320Glu Asn Ile Ile His Leu Phe Thr Leu
Thr Asn Leu Gly Ala Pro 1325 1330
1335Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr
1340 1345 1350Thr Ser Thr Lys Glu Val
Leu Asp Ala Thr Leu Ile His Gln Ser 1355 1360
1365Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu
Gly 1370 1375 1380Gly Asp Lys Arg Pro
Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys 1385 1390
1395Lys Lys Lys 1400
SEQ ID NO: 7 (length 4206, DNA, Artificial Sequence; eSpCas9(1.0) codon-optimized nucleotide sequence)
atggccccta
agaagaagag aaaggtcggt attcacggcg ttcctgcggc gatggacaag 60aagtatagta
ttggtctgga cattgggacg aattccgttg gctgggccgt gatcaccgat 120gagtacaagg
tcccttccaa gaagtttaag gttctgggga acaccgatcg gcacagcatc 180aagaagaatc
tcattggagc cctcctgttc gactcaggcg agaccgccga agcaacaagg 240ctcaagagaa
ccgcaaggag acggtataca agaaggaaga ataggatctg ctacctgcag 300gagattttca
gcaacgaaat ggcgaaggtg gacgattcgt tctttcatag attggaggag 360agtttcctcg
tcgaggaaga taagaagcac gagaggcatc ctatctttgg caacattgtc 420gacgaggttg
cctatcacga aaagtacccc acaatctatc atctgcggaa gaagcttgtg 480gactcgactg
ataaggcgga ccttagattg atctacctcg ctctggcaca catgattaag 540ttcaggggcc
attttctgat cgagggggat cttaacccgg acaatagcga tgtggacaag 600ttgttcatcc
agctcgtcca aacctacaat cagctctttg aggaaaaccc aattaatgct 660tcaggcgtcg
acgccaaggc gatcctgtct gcacgccttt caaagtctcg ccggcttgag 720aacttgatcg
ctcaactccc gggcgaaaag aagaacggct tgttcgggaa tctcattgca 780ctttcgttgg
ggctcacacc aaacttcaag agtaattttg atctcgctga ggacgcaaag 840ctgcagcttt
ccaaggacac ttatgacgat gacctggata accttttggc ccaaatcggc 900gatcagtacg
cggacttgtt cctcgccgcg aagaatttgt cggacgcgat cctcctgagt 960gatattctcc
gcgtgaacac cgagattaca aaggccccgc tctcggcgag tatgatcaag 1020cgctatgacg
agcaccatca ggatctgacc cttttgaagg ctttggtccg gcagcaactc 1080ccagagaagt
acaaggaaat cttctttgat caatccaaga acggctacgc tggttatatt 1140gacggcgggg
catcgcagga ggaattctac aagtttatca agccaattct ggagaagatg 1200gatggcacag
aggaactcct ggtgaagctc aatagggagg accttttgcg gaagcaaaga 1260actttcgata
acggcagcat ccctcaccag attcatctcg gggagctgca cgccatcctg 1320agaaggcagg
aagacttcta cccctttctt aaggataacc gggagaagat cgaaaagatt 1380ctgacgttca
gaattccgta ctatgtcgga ccactcgccc ggggtaattc cagatttgcg 1440tggatgacca
gaaagagcga ggaaaccatc acaccttgga acttcgagga agtggtcgat 1500aagggcgctt
ccgcacagag cttcattgag cgcatgacaa attttgacaa gaacctgcct 1560aatgagaagg
tccttcccaa gcattccctc ctgtacgagt atttcactgt ttataacgaa 1620ctcacgaagg
tgaagtatgt gaccgaggga atgcgcaagc ccgccttcct gagcggcgag 1680caaaagaagg
cgatcgtgga ccttttgttt aagaccaatc ggaaggtcac agttaagcag 1740ctcaaggagg
actacttcaa gaagattgaa tgcttcgatt ccgttgagat cagcggcgtg 1800gaagacaggt
ttaacgcgtc actggggact taccacgatc tcctgaagat cattaaggat 1860aaggacttct
tggacaacga ggaaaatgag gatatcctcg aagacattgt cctgactctt 1920acgttgtttg
aggataggga aatgatcgag gaacgcttga agacgtatgc ccatctcttc 1980gatgacaagg
ttatgaagca gctcaagaga agaagataca ccggatgggg aaggctgtcc 2040cgcaagctta
tcaatggcat tagagacaag caatcaggga agacaatcct tgactttttg 2100aagtctgatg
gcttcgcgaa caggaatttt atgcagctga ttcacgatga ctcacttact 2160ttcaaggagg
atatccagaa ggctcaagtg tcgggacaag gtgacagtct gcacgagcat 2220atcgccaacc
ttgcgggatc tcctgcaatc aagaagggta ttctgcagac agtcaaggtt 2280gtggatgagc
ttgtgaaggt catgggacgg cataagcccg agaacatcgt tattgagatg 2340gccagagaaa
atcagaccac acaaaagggt cagaagaact cgagggagcg catgaagcgc 2400atcgaggaag
gcattaagga gctggggagt cagatcctta aggagcaccc ggtggaaaac 2460acgcagttgc
aaaatgaggc cctctatctg tactatctgc aaaatggcag ggatatgtat 2520gtggaccagg
agttggatat taaccgcctc tcggattacg acgtcgatca tatcgttcct 2580cagtccttcc
ttaaggatga cagcattgac aataaggttc tcaccaggtc cgacaagaac 2640cgcgggaagt
ccgataatgt gcccagcgag gaagtcgtta agaagatgaa gaactactgg 2700aggcaacttt
tgaatgccaa gttgatcaca cagaggaagt ttgataacct cactaaggcc 2760gagcgcggag
gtctcagcga actggacaag gcgggcttca ttaagcggca actggttgag 2820actagacaga
tcacgaagca cgtggcgcag attctcgatt cacgcatgaa cacgaagtac 2880gatgagaatg
acaagctgat ccgggaagtg aaggtcatca ccttgaagtc aaagctcgtt 2940tctgacttca
ggaaggattt ccaattttat aaggtgcgcg agatcaacaa ttatcaccat 3000gctcatgacg
catacctcaa cgctgtggtc ggaacagcat tgattaagaa gtacccggcg 3060ctcgagtccg
aattcgtgta cggtgactat aaggtttacg atgtgcgcaa gatgatcgcc 3120aagtcagagc
aggaaattgg caaggccact gcgaagtatt tcttttactc taacattatg 3180aatttcttta
agactgagat cacgctggct aatggcgaaa tccggaaggc gccacttatt 3240gagaccaacg
gcgagacagg ggaaatcgtg tgggacaagg ggagggattt cgccacagtc 3300cgcaaggttc
tctctatgcc tcaagtgaat attgtcaaga agactgaagt ccagacgggc 3360gggttctcaa
aggaatctat tctgcccaag cggaactcgg ataagcttat cgccagaaag 3420aaggactggg
acccgaagaa gtatggaggt ttcgactcac caacggtggc ttactctgtc 3480ctggttgtgg
caaaggtgga gaagggaaag tcaaagaagc tcaagtctgt caaggagctc 3540ctgggtatca
ccattatgga gaggtccagc ttcgaaaaga atccgatcga ttttctcgag 3600gcgaagggat
ataaggaagt gaagaaggac ctgatcatta agcttccaaa gtacagtctt 3660ttcgagttgg
aaaacggcag gaagcgcatg ttggcttccg caggagagct ccagaagggt 3720aacgagcttg
ctttgccgtc caagtatgtg aacttcctct atctggcatc ccactacgag 3780aagctcaagg
gcagcccaga ggataacgaa cagaagcaac tgtttgtgga gcaacacaag 3840cattatcttg
acgagatcat tgaacagatt tcggagttca gtaagcgcgt catcctcgcc 3900gacgcgaatt
tggataaggt tctctcagcc tacaacaagc accgggacaa gcctatcaga 3960gagcaggcgg
aaaatatcat tcatctcttc accctgacaa accttggggc tcccgctgca 4020ttcaagtatt
ttgacactac gattgatcgg aagagataca cttctacgaa ggaggtgctg 4080gatgcaaccc
ttatccacca atcgattact ggcctctacg agacgcggat cgacttgagt 4140cagctcgggg
gggataagag accagcggca accaagaagg caggacaagc gaagaagaag 4200aagtag
4206
8 4206 DNA Artificial Sequence eSpCas9(1.1) codon-optimized nucleotide sequence
8 atggccccta agaagaagag aaaggtcggt attcacggcg ttcctgcggc
gatggacaag 60aagtatagta ttggtctgga cattgggacg aattccgttg gctgggccgt
gatcaccgat 120gagtacaagg tcccttccaa gaagtttaag gttctgggga acaccgatcg
gcacagcatc 180aagaagaatc tcattggagc cctcctgttc gactcaggcg agaccgccga
agcaacaagg 240ctcaagagaa ccgcaaggag acggtataca agaaggaaga ataggatctg
ctacctgcag 300gagattttca gcaacgaaat ggcgaaggtg gacgattcgt tctttcatag
attggaggag 360agtttcctcg tcgaggaaga taagaagcac gagaggcatc ctatctttgg
caacattgtc 420gacgaggttg cctatcacga aaagtacccc acaatctatc atctgcggaa
gaagcttgtg 480gactcgactg ataaggcgga ccttagattg atctacctcg ctctggcaca
catgattaag 540ttcaggggcc attttctgat cgagggggat cttaacccgg acaatagcga
tgtggacaag 600ttgttcatcc agctcgtcca aacctacaat cagctctttg aggaaaaccc
aattaatgct 660tcaggcgtcg acgccaaggc gatcctgtct gcacgccttt caaagtctcg
ccggcttgag 720aacttgatcg ctcaactccc gggcgaaaag aagaacggct tgttcgggaa
tctcattgca 780ctttcgttgg ggctcacacc aaacttcaag agtaattttg atctcgctga
ggacgcaaag 840ctgcagcttt ccaaggacac ttatgacgat gacctggata accttttggc
ccaaatcggc 900gatcagtacg cggacttgtt cctcgccgcg aagaatttgt cggacgcgat
cctcctgagt 960gatattctcc gcgtgaacac cgagattaca aaggccccgc tctcggcgag
tatgatcaag 1020cgctatgacg agcaccatca ggatctgacc cttttgaagg ctttggtccg
gcagcaactc 1080ccagagaagt acaaggaaat cttctttgat caatccaaga acggctacgc
tggttatatt 1140gacggcgggg catcgcagga ggaattctac aagtttatca agccaattct
ggagaagatg 1200gatggcacag aggaactcct ggtgaagctc aatagggagg accttttgcg
gaagcaaaga 1260actttcgata acggcagcat ccctcaccag attcatctcg gggagctgca
cgccatcctg 1320agaaggcagg aagacttcta cccctttctt aaggataacc gggagaagat
cgaaaagatt 1380ctgacgttca gaattccgta ctatgtcgga ccactcgccc ggggtaattc
cagatttgcg 1440tggatgacca gaaagagcga ggaaaccatc acaccttgga acttcgagga
agtggtcgat 1500aagggcgctt ccgcacagag cttcattgag cgcatgacaa attttgacaa
gaacctgcct 1560aatgagaagg tccttcccaa gcattccctc ctgtacgagt atttcactgt
ttataacgaa 1620ctcacgaagg tgaagtatgt gaccgaggga atgcgcaagc ccgccttcct
gagcggcgag 1680caaaagaagg cgatcgtgga ccttttgttt aagaccaatc ggaaggtcac
agttaagcag 1740ctcaaggagg actacttcaa gaagattgaa tgcttcgatt ccgttgagat
cagcggcgtg 1800gaagacaggt ttaacgcgtc actggggact taccacgatc tcctgaagat
cattaaggat 1860aaggacttct tggacaacga ggaaaatgag gatatcctcg aagacattgt
cctgactctt 1920acgttgtttg aggataggga aatgatcgag gaacgcttga agacgtatgc
ccatctcttc 1980gatgacaagg ttatgaagca gctcaagaga agaagataca ccggatgggg
aaggctgtcc 2040cgcaagctta tcaatggcat tagagacaag caatcaggga agacaatcct
tgactttttg 2100aagtctgatg gcttcgcgaa caggaatttt atgcagctga ttcacgatga
ctcacttact 2160ttcaaggagg atatccagaa ggctcaagtg tcgggacaag gtgacagtct
gcacgagcat 2220atcgccaacc ttgcgggatc tcctgcaatc aagaagggta ttctgcagac
agtcaaggtt 2280gtggatgagc ttgtgaaggt catgggacgg cataagcccg agaacatcgt
tattgagatg 2340gccagagaaa atcagaccac acaaaagggt cagaagaact cgagggagcg
catgaagcgc 2400atcgaggaag gcattaagga gctggggagt cagatcctta aggagcaccc
ggtggaaaac 2460acgcagttgc aaaatgagaa gctctatctg tactatctgc aaaatggcag
ggatatgtat 2520gtggaccagg agttggatat taaccgcctc tcggattacg acgtcgatca
tatcgttcct 2580cagtccttcc ttgcggatga cagcattgac aataaggttc tcaccaggtc
cgacaagaac 2640cgcgggaagt ccgataatgt gcccagcgag gaagtcgtta agaagatgaa
gaactactgg 2700aggcaacttt tgaatgccaa gttgatcaca cagaggaagt ttgataacct
cactaaggcc 2760gagcgcggag gtctcagcga actggacaag gcgggcttca ttaagcggca
actggttgag 2820actagacaga tcacgaagca cgtggcgcag attctcgatt cacgcatgaa
cacgaagtac 2880gatgagaatg acaagctgat ccgggaagtg aaggtcatca ccttgaagtc
aaagctcgtt 2940tctgacttca ggaaggattt ccaattttat aaggtgcgcg agatcaacaa
ttatcaccat 3000gctcatgacg catacctcaa cgctgtggtc ggaacagcat tgattaagaa
gtacccggcg 3060ctcgagtccg aattcgtgta cggtgactat aaggtttacg atgtgcgcaa
gatgatcgcc 3120aagtcagagc aggaaattgg caaggccact gcgaagtatt tcttttactc
taacattatg 3180aatttcttta agactgagat cacgctggct aatggcgaaa tccggaaggc
gccacttatt 3240gagaccaacg gcgagacagg ggaaatcgtg tgggacaagg ggagggattt
cgccacagtc 3300cgcaaggttc tctctatgcc tcaagtgaat attgtcaaga agactgaagt
ccagacgggc 3360gggttctcaa aggaatctat tctgcccaag cggaactcgg ataagcttat
cgccagaaag 3420aaggactggg acccgaagaa gtatggaggt ttcgactcac caacggtggc
ttactctgtc 3480ctggttgtgg caaaggtgga gaagggaaag tcaaagaagc tcaagtctgt
caaggagctc 3540ctgggtatca ccattatgga gaggtccagc ttcgaaaaga atccgatcga
ttttctcgag 3600gcgaagggat ataaggaagt gaagaaggac ctgatcatta agcttccaaa
gtacagtctt 3660ttcgagttgg aaaacggcag gaagcgcatg ttggcttccg caggagagct
ccagaagggt 3720aacgagcttg ctttgccgtc caagtatgtg aacttcctct atctggcatc
ccactacgag 3780aagctcaagg gcagcccaga ggataacgaa cagaagcaac tgtttgtgga
gcaacacaag 3840cattatcttg acgagatcat tgaacagatt tcggagttca gtaagcgcgt
catcctcgcc 3900gacgcgaatt tggataaggt tctctcagcc tacaacaagc accgggacaa
gcctatcaga 3960gagcaggcgg aaaatatcat tcatctcttc accctgacaa accttggggc
tcccgctgca 4020ttcaagtatt ttgacactac gattgatcgg aagagataca cttctacgaa
ggaggtgctg 4080gatgcaaccc ttatccacca atcgattact ggcctctacg agacgcggat
cgacttgagt 4140cagctcgggg gggataagag accagcggca accaagaagg caggacaagc
gaagaagaag 4200aagtag
4206
9 4206 DNA Artificial Sequence SpCas9-HF1 codon-optimized nucleotide sequence
9 atggccccta agaagaagag aaaggtcggt attcacggcg
ttcctgcggc gatggacaag 60aagtatagta ttggtctgga cattgggacg aattccgttg
gctgggccgt gatcaccgat 120gagtacaagg tcccttccaa gaagtttaag gttctgggga
acaccgatcg gcacagcatc 180aagaagaatc tcattggagc cctcctgttc gactcaggcg
agaccgccga agcaacaagg 240ctcaagagaa ccgcaaggag acggtataca agaaggaaga
ataggatctg ctacctgcag 300gagattttca gcaacgaaat ggcgaaggtg gacgattcgt
tctttcatag attggaggag 360agtttcctcg tcgaggaaga taagaagcac gagaggcatc
ctatctttgg caacattgtc 420gacgaggttg cctatcacga aaagtacccc acaatctatc
atctgcggaa gaagcttgtg 480gactcgactg ataaggcgga ccttagattg atctacctcg
ctctggcaca catgattaag 540ttcaggggcc attttctgat cgagggggat cttaacccgg
acaatagcga tgtggacaag 600ttgttcatcc agctcgtcca aacctacaat cagctctttg
aggaaaaccc aattaatgct 660tcaggcgtcg acgccaaggc gatcctgtct gcacgccttt
caaagtctcg ccggcttgag 720aacttgatcg ctcaactccc gggcgaaaag aagaacggct
tgttcgggaa tctcattgca 780ctttcgttgg ggctcacacc aaacttcaag agtaattttg
atctcgctga ggacgcaaag 840ctgcagcttt ccaaggacac ttatgacgat gacctggata
accttttggc ccaaatcggc 900gatcagtacg cggacttgtt cctcgccgcg aagaatttgt
cggacgcgat cctcctgagt 960gatattctcc gcgtgaacac cgagattaca aaggccccgc
tctcggcgag tatgatcaag 1020cgctatgacg agcaccatca ggatctgacc cttttgaagg
ctttggtccg gcagcaactc 1080ccagagaagt acaaggaaat cttctttgat caatccaaga
acggctacgc tggttatatt 1140gacggcgggg catcgcagga ggaattctac aagtttatca
agccaattct ggagaagatg 1200gatggcacag aggaactcct ggtgaagctc aatagggagg
accttttgcg gaagcaaaga 1260actttcgata acggcagcat ccctcaccag attcatctcg
gggagctgca cgccatcctg 1320agaaggcagg aagacttcta cccctttctt aaggataacc
gggagaagat cgaaaagatt 1380ctgacgttca gaattccgta ctatgtcgga ccactcgccc
ggggtaattc cagatttgcg 1440tggatgacca gaaagagcga ggaaaccatc acaccttgga
acttcgagga agtggtcgat 1500aagggcgctt ccgcacagag cttcattgag cgcatgacag
cctttgacaa gaacctgcct 1560aatgagaagg tccttcccaa gcattccctc ctgtacgagt
atttcactgt ttataacgaa 1620ctcacgaagg tgaagtatgt gaccgaggga atgcgcaagc
ccgccttcct gagcggcgag 1680caaaagaagg cgatcgtgga ccttttgttt aagaccaatc
ggaaggtcac agttaagcag 1740ctcaaggagg actacttcaa gaagattgaa tgcttcgatt
ccgttgagat cagcggcgtg 1800gaagacaggt ttaacgcgtc actggggact taccacgatc
tcctgaagat cattaaggat 1860aaggacttct tggacaacga ggaaaatgag gatatcctcg
aagacattgt cctgactctt 1920acgttgtttg aggataggga aatgatcgag gaacgcttga
agacgtatgc ccatctcttc 1980gatgacaagg ttatgaagca gctcaagaga agaagataca
ccggatgggg agccctgtcc 2040cgcaagctta tcaatggcat tagagacaag caatcaggga
agacaatcct tgactttttg 2100aagtctgatg gcttcgcgaa caggaatttt atggccctga
ttcacgatga ctcacttact 2160ttcaaggagg atatccagaa ggctcaagtg tcgggacaag
gtgacagtct gcacgagcat 2220atcgccaacc ttgcgggatc tcctgcaatc aagaagggta
ttctgcagac agtcaaggtt 2280gtggatgagc ttgtgaaggt catgggacgg cataagcccg
agaacatcgt tattgagatg 2340gccagagaaa atcagaccac acaaaagggt cagaagaact
cgagggagcg catgaagcgc 2400atcgaggaag gcattaagga gctggggagt cagatcctta
aggagcaccc ggtggaaaac 2460acgcagttgc aaaatgagaa gctctatctg tactatctgc
aaaatggcag ggatatgtat 2520gtggaccagg agttggatat taaccgcctc tcggattacg
acgtcgatca tatcgttcct 2580cagtccttcc ttaaggatga cagcattgac aataaggttc
tcaccaggtc cgacaagaac 2640cgcgggaagt ccgataatgt gcccagcgag gaagtcgtta
agaagatgaa gaactactgg 2700aggcaacttt tgaatgccaa gttgatcaca cagaggaagt
ttgataacct cactaaggcc 2760gagcgcggag gtctcagcga actggacaag gcgggcttca
ttaagcggca actggttgag 2820actagagcca tcacgaagca cgtggcgcag attctcgatt
cacgcatgaa cacgaagtac 2880gatgagaatg acaagctgat ccgggaagtg aaggtcatca
ccttgaagtc aaagctcgtt 2940tctgacttca ggaaggattt ccaattttat aaggtgcgcg
agatcaacaa ttatcaccat 3000gctcatgacg catacctcaa cgctgtggtc ggaacagcat
tgattaagaa gtacccgaag 3060ctcgagtccg aattcgtgta cggtgactat aaggtttacg
atgtgcgcaa gatgatcgcc 3120aagtcagagc aggaaattgg caaggccact gcgaagtatt
tcttttactc taacattatg 3180aatttcttta agactgagat cacgctggct aatggcgaaa
tccggaagag accacttatt 3240gagaccaacg gcgagacagg ggaaatcgtg tgggacaagg
ggagggattt cgccacagtc 3300cgcaaggttc tctctatgcc tcaagtgaat attgtcaaga
agactgaagt ccagacgggc 3360gggttctcaa aggaatctat tctgcccaag cggaactcgg
ataagcttat cgccagaaag 3420aaggactggg acccgaagaa gtatggaggt ttcgactcac
caacggtggc ttactctgtc 3480ctggttgtgg caaaggtgga gaagggaaag tcaaagaagc
tcaagtctgt caaggagctc 3540ctgggtatca ccattatgga gaggtccagc ttcgaaaaga
atccgatcga ttttctcgag 3600gcgaagggat ataaggaagt gaagaaggac ctgatcatta
agcttccaaa gtacagtctt 3660ttcgagttgg aaaacggcag gaagcgcatg ttggcttccg
caggagagct ccagaagggt 3720aacgagcttg ctttgccgtc caagtatgtg aacttcctct
atctggcatc ccactacgag 3780aagctcaagg gcagcccaga ggataacgaa cagaagcaac
tgtttgtgga gcaacacaag 3840cattatcttg acgagatcat tgaacagatt tcggagttca
gtaagcgcgt catcctcgcc 3900gacgcgaatt tggataaggt tctctcagcc tacaacaagc
accgggacaa gcctatcaga 3960gagcaggcgg aaaatatcat tcatctcttc accctgacaa
accttggggc tcccgctgca 4020ttcaagtatt ttgacactac gattgatcgg aagagataca
cttctacgaa ggaggtgctg 4080gatgcaaccc ttatccacca atcgattact ggcctctacg
agacgcggat cgacttgagt 4140cagctcgggg gggataagag accagcggca accaagaagg
caggacaagc gaagaagaag 4200aagtag
4206
10 9182 DNA Artificial Sequence pJIT163-SpCas9 vector sequence
10 gagctcggta cctgacccgg tcgtgcccct ctctagagat aatgagcatt
gcatgtctaa 60gttataaaaa attaccacat attttttttg tcacacttgt ttgaagtgca
gtttatctat 120ctttatacat atatttaaac tttactctac gaataatata atctatagta
ctacaataat 180atcagtgttt tagagaatca tataaatgaa cagttagaca tggtctaaag
gacaattgag 240tattttgaca acaggactct acagttttat ctttttagtg tgcatgtgtt
ctcctttttt 300tttgcaaata gcttcaccta tataatactt catccatttt attagtacat
ccatttaggg 360tttagggtta atggttttta tagactaatt tttttagtac atctatttta
ttctatttta 420gcctctaaat taagaaaact aaaactctat tttagttttt ttatttaata
atttagatat 480aaaatagaat aaaataaagt gactaaaaat taaacaaata ccctttaaga
aattaaaaaa 540actaaggaaa catttttctt gtttcgagta gataatgcca gcctgttaaa
cgccgtcgac 600gagtctaacg gacaccaacc agcgaaccag cagcgtcgcg tcgggccaag
cgaagcagac 660ggcacggcat ctctgtcgct gcctctggac ccctctcgat cgagagttcc
gctccaccgt 720tggacttgct ccgctgtcgg catccagaaa ttgcgtggcg gagcggcaga
cgtgagccgg 780cacggcaggc ggcctcctcc tcctctcacg gcaccggcag ctacggggga
ttcctttccc 840accgctcctt cgctttccct tcctcgcccg ccgtaataaa tagacacccc
ctccacaccc 900tctttcccca acctcgtgtt gttcggagcg cacacacaca caaccagatc
tcccccaaat 960ccacccgtcg gcacctccgc ttcaaggtac gccgctcgtc ctcccccccc
ccccctctct 1020accttctcta gatcggcgtt ccggtccatg gttagggccc ggtagttcta
cttctgttca 1080tgtttgtgtt agatccgtgt ttgtgttaga tccgtgctgc tagcgttcgt
acacggatgc 1140gacctgtacg tcagacacgt tctgattgct aacttgccag tgtttctctt
tggggaatcc 1200tgggatggct ctagccgttc cgcagacggg atcgatttca tgattttttt
tgtttcgttg 1260catagggttt ggtttgccct tttcctttat ttcaatatat gccgtgcact
tgtttgtcgg 1320gtcatctttt catgcttttt tttgtcttgg ttgtgatgat gtggtctggt
tgggcggtcg 1380ttctagatcg gagtagaatt aattctgttt caaactacct ggtggattta
ttaattttgg 1440atctgtatgt gtgtgccata catattcata gttacgaatt gaagatgatg
gatggaaata 1500tcgatctagg ataggtatac atgttgatgc gggttttact gatgcatata
cagagatgct 1560ttttgttcgc ttggttgtga tgatgtggtg tggttgggcg gtcgttcatt
cgttctagat 1620cggagtagaa tactgtttca aactacctgg tgtatttatt aattttggaa
ctgtatgtgt 1680gtgtcataca tcttcatagt tacgagttta agatggatgg aaatatcgat
ctaggatagg 1740tatacatgtt gatgtgggtt ttactgatgc atatacatga tggcatatgc
agcatctatt 1800catatgctct aaccttgagt acctatctat tataataaac aagtatgttt
tataattatt 1860ttgatcttga tatacttgga tgatggcata tgcagcagct atatgtggat
ttttttagcc 1920ctgccttcat acgctattta tttgcttggt actgtttctt ttgtcgatgc
tcaccctgtt 1980gtttggtgtt acttctgcaa agcttccacc atggcgtgca ggtcgactct
agaggatccc 2040catggcccct aagaagaaga gaaaggtcgg tattcacggc gttcctgcgg
cgatggacaa 2100gaagtatagt attggtctgg acattgggac gaattccgtt ggctgggccg
tgatcaccga 2160tgagtacaag gtcccttcca agaagtttaa ggttctgggg aacaccgatc
ggcacagcat 2220caagaagaat ctcattggag ccctcctgtt cgactcaggc gagaccgccg
aagcaacaag 2280gctcaagaga accgcaagga gacggtatac aagaaggaag aataggatct
gctacctgca 2340ggagattttc agcaacgaaa tggcgaaggt ggacgattcg ttctttcata
gattggagga 2400gagtttcctc gtcgaggaag ataagaagca cgagaggcat cctatctttg
gcaacattgt 2460cgacgaggtt gcctatcacg aaaagtaccc cacaatctat catctgcgga
agaagcttgt 2520ggactcgact gataaggcgg accttagatt gatctacctc gctctggcac
acatgattaa 2580gttcaggggc cattttctga tcgaggggga tcttaacccg gacaatagcg
atgtggacaa 2640gttgttcatc cagctcgtcc aaacctacaa tcagctcttt gaggaaaacc
caattaatgc 2700ttcaggcgtc gacgccaagg cgatcctgtc tgcacgcctt tcaaagtctc
gccggcttga 2760gaacttgatc gctcaactcc cgggcgaaaa gaagaacggc ttgttcggga
atctcattgc 2820actttcgttg gggctcacac caaacttcaa gagtaatttt gatctcgctg
aggacgcaaa 2880gctgcagctt tccaaggaca cttatgacga tgacctggat aaccttttgg
cccaaatcgg 2940cgatcagtac gcggacttgt tcctcgccgc gaagaatttg tcggacgcga
tcctcctgag 3000tgatattctc cgcgtgaaca ccgagattac aaaggccccg ctctcggcga
gtatgatcaa 3060gcgctatgac gagcaccatc aggatctgac ccttttgaag gctttggtcc
ggcagcaact 3120cccagagaag tacaaggaaa tcttctttga tcaatccaag aacggctacg
ctggttatat 3180tgacggcggg gcatcgcagg aggaattcta caagtttatc aagccaattc
tggagaagat 3240ggatggcaca gaggaactcc tggtgaagct caatagggag gaccttttgc
ggaagcaaag 3300aactttcgat aacggcagca tccctcacca gattcatctc ggggagctgc
acgccatcct 3360gagaaggcag gaagacttct acccctttct taaggataac cgggagaaga
tcgaaaagat 3420tctgacgttc agaattccgt actatgtcgg accactcgcc cggggtaatt
ccagatttgc 3480gtggatgacc agaaagagcg aggaaaccat cacaccttgg aacttcgagg
aagtggtcga 3540taagggcgct tccgcacaga gcttcattga gcgcatgaca aattttgaca
agaacctgcc 3600taatgagaag gtccttccca agcattccct cctgtacgag tatttcactg
tttataacga 3660actcacgaag gtgaagtatg tgaccgaggg aatgcgcaag cccgccttcc
tgagcggcga 3720gcaaaagaag gcgatcgtgg accttttgtt taagaccaat cggaaggtca
cagttaagca 3780gctcaaggag gactacttca agaagattga atgcttcgat tccgttgaga
tcagcggcgt 3840ggaagacagg tttaacgcgt cactggggac ttaccacgat ctcctgaaga
tcattaagga 3900taaggacttc ttggacaacg aggaaaatga ggatatcctc gaagacattg
tcctgactct 3960tacgttgttt gaggataggg aaatgatcga ggaacgcttg aagacgtatg
cccatctctt 4020cgatgacaag gttatgaagc agctcaagag aagaagatac accggatggg
gaaggctgtc 4080ccgcaagctt atcaatggca ttagagacaa gcaatcaggg aagacaatcc
ttgacttttt 4140gaagtctgat ggcttcgcga acaggaattt tatgcagctg attcacgatg
actcacttac 4200tttcaaggag gatatccaga aggctcaagt gtcgggacaa ggtgacagtc
tgcacgagca 4260tatcgccaac cttgcgggat ctcctgcaat caagaagggt attctgcaga
cagtcaaggt 4320tgtggatgag cttgtgaagg tcatgggacg gcataagccc gagaacatcg
ttattgagat 4380ggccagagaa aatcagacca cacaaaaggg tcagaagaac tcgagggagc
gcatgaagcg 4440catcgaggaa ggcattaagg agctggggag tcagatcctt aaggagcacc
cggtggaaaa 4500cacgcagttg caaaatgaga agctctatct gtactatctg caaaatggca
gggatatgta 4560tgtggaccag gagttggata ttaaccgcct ctcggattac gacgtcgatc
atatcgttcc 4620tcagtccttc cttaaggatg acagcattga caataaggtt ctcaccaggt
ccgacaagaa 4680ccgcgggaag tccgataatg tgcccagcga ggaagtcgtt aagaagatga
agaactactg 4740gaggcaactt ttgaatgcca agttgatcac acagaggaag tttgataacc
tcactaaggc 4800cgagcgcgga ggtctcagcg aactggacaa ggcgggcttc attaagcggc
aactggttga 4860gactagacag atcacgaagc acgtggcgca gattctcgat tcacgcatga
acacgaagta 4920cgatgagaat gacaagctga tccgggaagt gaaggtcatc accttgaagt
caaagctcgt 4980ttctgacttc aggaaggatt tccaatttta taaggtgcgc gagatcaaca
attatcacca 5040tgctcatgac gcatacctca acgctgtggt cggaacagca ttgattaaga
agtacccgaa 5100gctcgagtcc gaattcgtgt acggtgacta taaggtttac gatgtgcgca
agatgatcgc 5160caagtcagag caggaaattg gcaaggccac tgcgaagtat ttcttttact
ctaacattat 5220gaatttcttt aagactgaga tcacgctggc taatggcgaa atccggaaga
gaccacttat 5280tgagaccaac ggcgagacag gggaaatcgt gtgggacaag gggagggatt
tcgccacagt 5340ccgcaaggtt ctctctatgc ctcaagtgaa tattgtcaag aagactgaag
tccagacggg 5400cgggttctca aaggaatcta ttctgcccaa gcggaactcg gataagctta
tcgccagaaa 5460gaaggactgg gacccgaaga agtatggagg tttcgactca ccaacggtgg
cttactctgt 5520cctggttgtg gcaaaggtgg agaagggaaa gtcaaagaag ctcaagtctg
tcaaggagct 5580cctgggtatc accattatgg agaggtccag cttcgaaaag aatccgatcg
attttctcga 5640ggcgaaggga tataaggaag tgaagaagga cctgatcatt aagcttccaa
agtacagtct 5700tttcgagttg gaaaacggca ggaagcgcat gttggcttcc gcaggagagc
tccagaaggg 5760taacgagctt gctttgccgt ccaagtatgt gaacttcctc tatctggcat
cccactacga 5820gaagctcaag ggcagcccag aggataacga acagaagcaa ctgtttgtgg
agcaacacaa 5880gcattatctt gacgagatca ttgaacagat ttcggagttc agtaagcgcg
tcatcctcgc 5940cgacgcgaat ttggataagg ttctctcagc ctacaacaag caccgggaca
agcctatcag 6000agagcaggcg gaaaatatca ttcatctctt caccctgaca aaccttgggg
ctcccgctgc 6060attcaagtat tttgacacta cgattgatcg gaagagatac acttctacga
aggaggtgct 6120ggatgcaacc cttatccacc aatcgattac tggcctctac gagacgcgga
tcgacttgag 6180tcagctcggg ggggataaga gaccagcggc aaccaagaag gcaggacaag
cgaagaagaa 6240gaagtagggg cgagctcgaa ttcgctgaaa tcaccagtct ctctctacaa
atctatctct 6300ctctattttc tccataaata atgtgtgagt agtttcccga taagggaaat
tagggttctt 6360atagggtttc gctcatgtgt tgagcatata agaaaccctt agtatgtatt
tgtatttgta 6420aaatacttct atcaataaaa tttctaattc ctaaaaccaa aatccagtac
taaaatccag 6480atctcctaaa gtccctatag atctttgtcg tgaatataaa ccagacacga
gacgactaaa 6540cctggagccc agacgccgtt cgaagctaga agtaccgctt aggcaggagg
ccgttaggga 6600aaagatgcta aggcagggtt ggttacgttg actcccccgt aggtttggtt
taaatatgat 6660gaagtggacg gaaggaagga ggaagacaag gaaggataag gttgcaggcc
ctgtgcaagg 6720taagaagatg gaaatttgat agaggtacgc tactatactt atactatacg
ctaagggaat 6780gcttgtattt ataccctata ccccctaata accccttatc aatttaagaa
ataatccgca 6840taagcccccg cttaaaaatt ggtatcagag ccatgaatag gtctatgacc
aaaactcaag 6900aggataaaac ctcaccaaaa tacgaaagag ttcttaactc taaagataaa
agatctttca 6960agatcaaaac tagttccctc acaccggagc atgcgatatc ctcgagagat
ctaggcgtaa 7020tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc tcacaattcc
acacaacata 7080cgagccggaa gcataaagtg taaagcctgg ggtgcctaat gagtgagcta
actcacatta 7140attgcgttgc gctcactgcc cgctttccag tcgggaaacc tgtcgtgcca
gctgcattaa 7200tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc
cgcttcctcg 7260ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc
tcactcaaag 7320gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat
gtgagcaaaa 7380ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt
ccataggctc 7440cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg
aaacccgaca 7500ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc
tcctgttccg 7560accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt
ggcgctttct 7620caatgctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa
gctgggctgt 7680gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta
tcgtcttgag 7740tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa
caggattagc 7800agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa
ctacggctac 7860actagaagga cagtatttgg tatctgcgct ctgctgaagc cagttacctt
cggaaaaaga 7920gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt
ttttgtttgc 7980aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat
cttttctacg 8040gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat
gagattatca 8100aaaaggatct tcacctagat ccttttaaat taaaaatgaa gttttaaatc
aatctaaagt 8160atatatgagt aaacttggtc tgacagttac caatgcttaa tcagtgaggc
acctatctca 8220gcgatctgtc tatttcgttc atccatagtt gcctgactcc ccgtcgtgta
gataactacg 8280atacgggagg gcttaccatc tggccccagt gctgcaatga taccgcgaga
cccacgctca 8340ccggctccag atttatcagc aataaaccag ccagccggaa gggccgagcg
cagaagtggt 8400cctgcaactt tatccgcctc catccagtct attaattgtt gccgggaagc
tagagtaagt 8460agttcgccag ttaatagttt gcgcaacgtt gttgccattg ctacaggcat
cgtggtgtca 8520cgctcgtcgt ttggtatggc ttcattcagc tccggttccc aacgatcaag
gcgagttaca 8580tgatccccca tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat
cgttgtcaga 8640agtaagttgg ccgcagtgtt atcactcatg gttatggcag cactgcataa
ttctcttact 8700gtcatgccat ccgtaagatg cttttctgtg actggtgagt actcaaccaa
gtcattctga 8760gaatagtgta tgcggcgacc gagttgctct tgcccggcgt caatacggga
taataccgcg 8820ccacatagca gaactttaaa agtgctcatc attggaaaac gttcttcggg
gcgaaaactc 8880tcaaggatct taccgctgtt gagatccagt tcgatgtaac ccactcgtgc
acccaactga 8940tcttcagcat cttttacttt caccagcgtt tctgggtgag caaaaacagg
aaggcaaaat 9000gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa tactcatact
cttccttttt 9060caatattatt gaagcattta tcagggttat tgtctcatga gcggatacat
atttgaatgt 9120atttagaaaa ataaacaaat aggggttccg cgcacatttc cccgaaaagt
gccacctgac 9180gt
9182
11 3243 DNA Artificial Sequence pUC57-U3-tRNA-sgRNA vector sequence
11 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg
gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg
tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta
ctgagagtgc 180accagatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc
atcaggcgcc 240attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc
tcttcgctat 300tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta
acgccagggt 360tttcccagtc acgacgttgt aaaacgacgg ccagtgcctg caggtcgacg
attaaggaat 420ctttaaacat acgaacagat cacttaaagt tcttctgaag caacttaaag
ttatcaggca 480tgcatggatc ttggaggaat cagatgtgca gtcagggacc atagcacaag
acaggcgtct 540tctactggtg ctaccagcaa atgctggaag ccgggaacac tgggtacgtc
ggaaaccacg 600tgatgtgaag aagtaagata aactgtagga gaaaagcatt tcgtagtggg
ccatgaagcc 660tttcaggaca tgtattgcag tatgggccgg cccattacgc aattggacga
caacaaagac 720tagtattagt accacctcgg ctatccacat agatcaaagc tgatttaaaa
gagttgtgca 780gatgatccgt ggcaacaaag caccagtggt ctagtggtag aatagtaccc
tgccacggta 840cagacccggg ttcgattccc ggctggtgca agagaccgat atcccatggc
tcgagggtct 900cggttttaga gctagaaata gcaagttaaa ataaggctag tccgttatca
acttgaaaaa 960gtggcaccga gtcggtgctt tttttccaca taatctctag aggatccccg
gcgtaatcat 1020ggtcatagct gtttcctgtg tgaaattgtt atccgctcac aattccacac
aacatacgag 1080ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc
acattaattg 1140cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc gtgccagctg
cattaatgaa 1200tcggccaacg cgcggggaga ggcggtttgc gtattgggcg ctcttccgct
tcctcgctca 1260ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac
tcaaaggcgg 1320taatacggtt atccacagaa tcaggggata acgcaggaaa gaacatgtga
gcaaaaggcc 1380agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat
aggctccgcc 1440cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac
ccgacaggac 1500tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct
gttccgaccc 1560tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg
ctttctcata 1620gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg
ggctgtgtgc 1680acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt
cttgagtcca 1740acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg
attagcagag 1800cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac
ggctacacta 1860gaagaacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga
aaaagagttg 1920gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt
gtttgcaagc 1980agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt
tctacggggt 2040ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga
ttatcaaaaa 2100ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc
taaagtatat 2160atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct
atctcagcga 2220tctgtctatt tcgttcatcc atagttgcct gactccccgt cgtgtagata
actacgatac 2280gggagggctt accatctggc cccagtgctg caatgatacc gcgactccca
cgctcaccgg 2340ctccagattt atcagcaata aaccagccag ccggaagggc cgagcgcaga
agtggtcctg 2400caactttatc cgcctccatc cagtctatta attgttgccg ggaagctaga
gtaagtagtt 2460cgccagttaa tagtttgcgc aacgttgttg ccattgctac aggcatcgtg
gtgtcacgct 2520cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga
gttacatgat 2580cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt
gtcagaagta 2640agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct
cttactgtca 2700tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca
ttctgagaat 2760agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat acgggataat
accgcgccac 2820atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga
aaactctcaa 2880ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgtgcaccc
aactgatctt 2940cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg
caaaatgccg 3000caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc
ctttttcaat 3060attattgaag catttatcag ggttattgtc tcatgagcgg atacatattt
gaatgtattt 3120agaaaaataa acaaataggg gttccgcgca catttccccg aaaagtgcca
cctgacgtct 3180aagaaaccat tattatcatg acattaacct ataaaaatag gcgtatcacg
aggccctttc 3240gtc
3243
12 43 DNA Artificial Sequence sequence of 5' end ribozyme misc_feature (1)..(6) n is a, c, g, or t
12 nnnnnnctga tgagtccgtg aggacgaaac gagtaagctc gtc 43
13 68 DNA Artificial Sequence sequence of 3' end ribozyme
13 ggccggcatg gtcccagcct cctcgctggc gccggctggg caacatgctt cggcatggcg 60
aatgggac 68
14 15 DNA Artificial sequence NLS sequence
14 aagaagagaa aggtc 15
15 21 DNA Artificial sequence NLS sequence
15 cccaagaaga agaggaaggt g 21
16 21 DNA Artificial sequence NLS sequence
16 ccaaagaaga agaggaaggt t 21
17 11 PRT Artificial sequence NLS sequence
17 Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1 5 10
18 33 DNA Artificial sequence NLS sequence
18 tcggggggga gcccaaagaa gaagcggaag gtg 33
19 7 PRT Artificial sequence NLS sequence
19 Pro Lys Lys Lys Arg Lys Val
1 5
20 23 DNA Oryza sativa
20 aggtcgggga ggggacgtac ggg 23
21 24 DNA Artificial sequence sgRNA target sequence
21 ggcaaggtcg gggaggggac gtac 24
22 24 DNA Artificial sequence sgRNA target sequence
22 aaacgtacgt cccctccccg acct 24
23 23 DNA Oryza sativa
23 gacgtcggcg aggaaggcct cgg 23
24 24 DNA Artificial sequence sgRNA target sequence
24 ggcagacgtc ggcgaggaag gcct 24
25 24 DNA Artificial sequence sgRNA target sequence
25 aaacaggcct tcctcgccga cgtc 24
26 23 DNA Artificial sequence sgRNA target sequence
26 catggtgggg aaagcttgga ggg 23
27 24 DNA Artificial sequence sgRNA target sequence
27 ggcacatggt ggggaaagct tgga 24
28 24 DNA Artificial sequence sgRNA target sequence
28 aaactccaag ctttccccac catg 24
29 23 DNA Artificial sequence sgRNA target sequence
29 ccggacgacg acgtcgacga cgg 23
30 24 DNA Artificial sequence sgRNA target sequence
30 ggcaccggac gacgacgtcg acga 24
31 24 DNA Artificial sequence sgRNA target sequence
31 aaactcgtcg acgtcgtcgt ccgg 24
32 23 DNA Artificial sequence sgRNA target sequence
32 ttgaagtccc ttctagatgg agg 23
33 24 DNA Artificial sequence sgRNA target sequence
33 ggcattgaag tcccttctag atgg 24
34 24 DNA Artificial sequence sgRNA target sequence
34 aaacccatct agaagggact tcaa 24
35 23 DNA Artificial sequence sgRNA target sequence
35 actgcgacac ccagatatcg tgg 23
36 24 DNA Artificial sequence sgRNA target sequence
36 ggcaactgcg acacccagat atcg 24
37 24 DNA Artificial sequence sgRNA target sequence
37 aaaccgatat ctgggtgtcg cagt 24
38 23 DNA Oryza sativa
38 gttggtcttt gctcctgcag agg 23
39 24 DNA Artificial sequence sgRNA target sequence
39 ggcagttggt ctttgctcct gcag 24
40 24 DNA Artificial sequence sgRNA target sequence
40 aaacctgcag gagcaaagac caac 24
41 24 DNA Artificial sequence Oligo-F sequence
41 tgcaaggtcg gggaggggac gtac 24
42 24 DNA Artificial sequence Oligo-F sequence
42 tgcagacgtc ggcgaggaag gcct 24
43 24 DNA Artificial sequence Oligo-F sequence
43 tgcacatggt ggggaaagct tgga 24
44 24 DNA Artificial sequence Oligo-F sequence
44tgcaccggac gacgacgtcg acga
244524DNAArtificial sequenceOligo-F sequence 45tgcattgaag tcccttctag atgg
244623DNAArtificial
sequenceOligo-F sequence 46tgcactgcga cacccagata tcg
234724DNAArtificial sequenceOligo-F sequence
47tgcagttggt ctttgctcct gcag
244823DNAArtificial sequenceOligo-F sequence 48ggcgttggtc tttgctcctg cag
234923DNAArtificial
sequenceOligo-F sequence 49ggcgttggtc tttgctcctg cag
235023DNAHomo sapiens 50ggtgagtgag tgtgtgcgtg tgg
235124DNAArtificial
sequenceOligo-F sequence 51caccggtgag tgagtgtgtg cgtg
245224DNAArtificial sequenceOligo-R sequence
52aaaccacgca cacactcact cacc
245325DNAArtificial sequenceOligo-F sequence 53caccgggtga gtgagtgtgt
gcgtg 255425DNAArtificial
sequenceOligo-R sequence 54aaaccacgca cacactcact caccc
SEQ ID NO: 55; length 102; DNA; Artificial sequence; Oligo-F sequence; caccgaacaa agcaccagtg gtctagtggt agaatagtac cctgccacgg tacagacccg ggttcgattc ccggctggtg caggtgagtg agtgtgtgcg tg
SEQ ID NO: 56; length 102; DNA; Artificial sequence; Oligo-R sequence; aaaccacgca cacactcact cacctgcacc agccgggaat cgaacccggg tctgtaccgt ggcagggtac tattctacca ctagaccact ggtgctttgt tc
SEQ ID NO: 57; length 27; DNA; Artificial sequence; sgRNA sequence; misc_feature (1)..(20), n is a, c, g, or t; nnnnnnnnnn nnnnnnnnnn ttttttt
SEQ ID NO: 58; length 26; DNA; Artificial sequence; sgRNA sequence; misc_feature (1)..(19), n is a, c, g, or t; nnnnnnnnnn nnnnnnnnnt tttttt
SEQ ID NO: 59; length 27; DNA; Artificial sequence; sgRNA sequence; tggagttggt ctttgctcct gcagagg
SEQ ID NO: 60; length 23; DNA; Oryza sativa; gacgccggcg aggaaggcct cgg
SEQ ID NO: 61; length 23; DNA; Oryza sativa; gcagtcggag aggaaggcct ggg
SEQ ID NO: 62; length 23; DNA; Oryza sativa; agatcgggga ggggacgtac ggg
SEQ ID NO: 63; length 23; DNA; Oryza sativa; aggtggggga agggacgtac ggg
SEQ ID NO: 64; length 23; DNA; Oryza sativa; agattgggga gggcacgtac ggg
SEQ ID NO: 65; length 23; DNA; Streptococcus pyogenes; agcgtcggcg aggaaggcct cgg
SEQ ID NO: 66; length 23; DNA; Streptococcus pyogenes; ggtgtcggcg aggaaggcct cgg
SEQ ID NO: 67; length 23; DNA; Streptococcus pyogenes; gatatcggcg aggaaggcct cgg
SEQ ID NO: 68; length 23; DNA; Streptococcus pyogenes; gacaccggcg aggaaggcct cgg
SEQ ID NO: 69; length 23; DNA; Streptococcus pyogenes; gacgctggcg aggaaggcct cgg
SEQ ID NO: 70; length 23; DNA; Streptococcus pyogenes; gacgttagcg aggaaggcct cgg
SEQ ID NO: 71; length 23; DNA; Streptococcus pyogenes; gacgtcaacg aggaaggcct cgg
SEQ ID NO: 72; length 23; DNA; Streptococcus pyogenes; gacgtcgatg aggaaggcct cgg
SEQ ID NO: 73; length 23; DNA; Streptococcus pyogenes; gacgtcggta aggaaggcct cgg
SEQ ID NO: 74; length 23; DNA; Streptococcus pyogenes; gacgtcggca gggaaggcct cgg
SEQ ID NO: 75; length 23; DNA; Streptococcus pyogenes; gacgtcggcg gagaaggcct cgg
SEQ ID NO: 76; length 23; DNA; Streptococcus pyogenes; gacgtcggcg aaaaaggcct cgg
SEQ ID NO: 77; length 23; DNA; Streptococcus pyogenes; gacgtcggcg agagaggcct cgg
SEQ ID NO: 78; length 23; DNA; Streptococcus pyogenes; gacgtcggcg aggggggcct cgg
SEQ ID NO: 79; length 23; DNA; Streptococcus pyogenes; gacgtcggcg aggagagcct cgg
SEQ ID NO: 80; length 23; DNA; Streptococcus pyogenes; gacgtcggcg aggaaaacct cgg
SEQ ID NO: 81; length 23; DNA; Streptococcus pyogenes; gacgtcggcg aggaagatct cgg
SEQ ID NO: 82; length 23; DNA; Streptococcus pyogenes; gacgtcggcg aggaaggttt cgg
SEQ ID NO: 83; length 23; DNA; Streptococcus pyogenes; gacgtcggcg aggaaggctc cgg
SEQ ID NO: 84; length 23; DNA; Streptococcus pyogenes; gctgagtgag tgtatgcgtg tgg
SEQ ID NO: 85; length 23; DNA; Streptococcus pyogenes; tgtgggtgag tgtgtgcgtg agg