Genome editing system and method based on C2c1 nucleases

Document No.: 25483104. Publication date: 2021-06-15. Source: China National Intellectual Property Administration.
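This application is a divisional application of the invention patent application filed on November 2, 2018, with application number 201811300251.6 and entitled "Genome editing system and method based on C2c1 nucleases".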

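The present invention relates to the field of genetic engineering. Specifically, the invention relates to genome editing systems and methods based on C2c1 nucleases. The invention also relates to artificial guide RNAs that can be combined with different C2c1 nucleases for genome editing.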

Background of the Invention

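With the advent of the CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR-associated protein) system, precise genome editing has become one of the most closely watched fields because of its bright prospects in gene therapy. To date, three types of CRISPR-Cas systems have been successfully harnessed for mammalian genome engineering: type II Cas9 (Cong, L. et al. Science 339, 819-823 (2013); Mali, P. et al. Science 339, 823-826 (2013)), type V-A Cpf1 (Zetsche, B. et al. Cell 163, 759-771 (2015)) and type V-B C2c1. For type II and type V CRISPR-Cas systems, the guide RNA and the Cas effector protein are the two core components for target DNA recognition and cleavage (Wright, A. V., Nunez, J. K. & Doudna, J. A. Cell 164, 29-44 (2016); Shmakov, S. et al. Nat Rev Microbiol 15, 169-182 (2017)). Previous studies have shown that in closely related Cas9 systems (Fonfara, I. et al. Nucleic Acids Res 42, 2577-2590 (2014)) and Cpf1 systems (Zetsche, B. et al. Cell 163, 759-771 (2015)), the dual RNA (crRNA and tracrRNA) and the protein components are interchangeable and can be preliminarily optimized (Nishimasu, H. et al. Cell 156, 935-949 (2014); Zalatan, J. G. et al. Cell 160, 339-350 (2015)). Although many emerging CRISPR-Cas systems and studies have promoted the broad application of CRISPR-Cas systems (Wright, A. V., Nunez, J. K. & Doudna, J. A. Cell 164, 29-44 (2016); Shmakov, S. et al. Nat Rev Microbiol 15, 169-182 (2017)), little is known about how to redesign, or even synthesize de novo, such genome editing systems.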

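The type V-B CRISPR-C2c1 system is an emerging and promising genome engineering technology. However, very few C2c1 nucleases are available for mammalian genome editing, which greatly limits its application. There remains a need in the art for new C2c1 nuclease-based genome editing systems that can be used for mammalian genome editing.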

Summary of the Invention

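In one aspect, the invention provides a genome editing system for site-directed modification of a target sequence in the genome of a cell, comprising at least one of the following i) to v):

i) a C2c1 protein or a variant thereof, and a guide RNA;

ii) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof, and a guide RNA;

iii) a C2c1 protein or a variant thereof, and an expression construct comprising a nucleotide sequence encoding a guide RNA;

iv) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof, and an expression construct comprising a nucleotide sequence encoding a guide RNA;

v) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof and a nucleotide sequence encoding a guide RNA;

wherein the guide RNA is capable of forming a complex with the C2c1 protein or variant thereof and of targeting the C2c1 protein ortholog or variant thereof to the target sequence in the genome of the cell.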

In some embodiments, the C2c1 protein is the AaC2c1 protein from Alicyclobacillus acidiphilus, the AkC2c1 protein from Alicyclobacillus kakegawensis, the AmC2c1 protein from Alicyclobacillus macrosporangiidus, the BhC2c1 protein from Bacillus hisashii, the BsC2c1 protein from Bacillus sp., the Bs3C2c1 protein from Bacillus sp., the DiC2c1 protein from Desulfovibrio inopinatus, the LsC2c1 protein from Laceyella sediminis, the SbC2c1 protein from Spirochaetes bacterium, or the TcC2c1 protein from Tuberibacillus calidus. For example, the C2c1 protein is the AaC2c1 protein from Alicyclobacillus acidiphilus NBRC 100859, the AkC2c1 protein from Alicyclobacillus kakegawensis NBRC 103104, the AmC2c1 protein from Alicyclobacillus macrosporangiidus strain DSM 17980, the BhC2c1 protein from Bacillus hisashii strain C4, the BsC2c1 protein from Bacillus sp. NSP2.1, the Bs3C2c1 protein from Bacillus sp. V3-13 contig_40, the DiC2c1 protein from Desulfovibrio inopinatus DSM 10711, the LsC2c1 protein from Laceyella sediminis strain RHA1, the SbC2c1 protein from Spirochaetes bacterium GWB1_27_13, or the TcC2c1 protein from Tuberibacillus calidus DSM 17572.

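In a second aspect, the invention provides a method for site-directed modification of a target sequence in the genome of a cell, comprising introducing the genome editing system of the invention into the cell.

In a third aspect, the invention provides a method of treating a disease in a subject in need thereof, comprising delivering to the subject an effective amount of the genome editing system of the invention to modify a gene associated with the disease in the subject.

In a fourth aspect, the invention provides the use of the genome editing system of the invention in the preparation of a pharmaceutical composition for treating a disease in a subject in need thereof, wherein the genome editing system is used to modify a gene associated with the disease in the subject.

In a fifth aspect, the invention provides a kit for use in the method of the invention, the kit comprising the genome editing system of the invention and instructions for use.

In a sixth aspect, the invention provides a pharmaceutical composition for treating a disease in a subject in need thereof, comprising the genome editing system of the invention and a pharmaceutically acceptable carrier, wherein the genome editing system is used to modify a gene associated with the disease in the subject.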

Brief Description of the Drawings

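Figure 1. Phylogenetic tree and loci of the non-redundant C2c1 orthologs selected for genome editing tests.

(a) Neighbor-joining phylogenetic tree showing the evolutionary relationships of the tested C2c1 orthologs. (b) Maps of the bacterial loci corresponding to the 8 C2c1 proteins highlighted in (a). Simulated co-folding of the crRNA DR and the putative tracrRNA shows a stable secondary structure. DR, direct repeat. The number of spacers in each bacterial genome is indicated above or below its CRISPR array.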

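Figure 2. Protein alignment of C2c1 orthologs: multiple sequence alignment of the amino acid sequences of the 10 tested C2c1 orthologs. Conserved residues are highlighted with a red background, and conservative mutations are highlighted with an outline and red font.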

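Figure 3. C2c1 ortholog-mediated genome targeting in human 293T cells.

(a) T7EI assay results showing the genome targeting activity, in the human genome, of eight C2c1 proteins combined with their cognate sgRNAs. Triangles indicate cleaved bands. (b) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by Bs3C2c1 combined with its cognate sgRNA (Bs3sgRNA). (c) Sanger sequencing showing representative insertions and deletions (indels) induced by Bs3C2c1 combined with Bs3sgRNA. PAM and protospacer sequences are colored red and blue, respectively. Deletions and insertions are indicated by purple dashes and green lowercase letters, respectively.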

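Figure 4. C2c1 proteins for RNA-guided genome editing.

(a) Graphical overview of the 10 C2c1 orthologs tested in the invention. Their sizes (number of amino acids) are shown. (b) T7EI assay results showing the genome targeting activity in human 293T cells of eight C2c1 orthologs guided by their cognate sgRNAs. Triangles indicate cleaved bands. (c-d) T7EI assay results showing the genome targeting activity in human 293T cells of eight C2c1 orthologs guided by AasgRNA (c) and AksgRNA (d). Triangles indicate cleaved bands.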

Figure 5. DNA alignment of C2c1 sgRNAs: multiple sequence alignment of the DNA sequences of the 8 tested sgRNAs derived from the 10 C2c1 loci.

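Figure 6. Interchangeability between different C2c1 orthologs and sgRNAs.

T7EI assay results showing the genome targeting activity in human 293T cells of eight C2c1 orthologs guided by AasgRNA (a), AksgRNA (b), AmsgRNA (c), Bs3sgRNA (d) and LssgRNA (e). Red triangles indicate cleaved bands.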

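Figure 7. Artificial sgRNA-mediated multiplex genome targeting.

(a) Maps of the bacterial loci corresponding to DiC2c1 and TcC2c1. Neither C2c1 locus has a CRISPR array. (b-c) T7EI assay results showing the genome targeting activity in human 293T cells of AaC2c1, DiC2c1 and TcC2c1 guided by AasgRNA (b) and AksgRNA (c). Triangles indicate cleaved bands. (d) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by TcC2c1 combined with AksgRNA. (e) Schematic illustrating the secondary structure of artificial sgRNA scaffold 13 (artsgRNA13). (f) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by TcC2c1 combined with artsgRNA13.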

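Figure 8. Different sgRNAs guide C2c1 for genome editing.

T7EI assay results showing the genome targeting activity in human 293T cells of AaC2c1, DiC2c1 and TcC2c1 guided by AasgRNA (a), AksgRNA (b), AmsgRNA (c), Bs3sgRNA (d) and LssgRNA (e). Triangles indicate cleaved bands.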

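Figure 9. TcC2c1-mediated multiplex genome editing.

(a) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by TcC2c1 combined with AmsgRNA. (b-c) Sanger sequencing showing representative indels induced by TcC2c1 combined with AksgRNA (b) and AmsgRNA (c). PAM and protospacer sequences are colored red and blue, respectively. Deletions and insertions are indicated by purple dashes and green lowercase letters, respectively.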

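Figure 10. Artificial sgRNAs guide TcC2c1 for genome editing.

(a) Schematic illustrating the secondary structures of 36 artificial sgRNA (artsgRNA) scaffolds (scaffolds 1-12 and 14-37). (b) T7EI assay results showing the genome targeting activity in human 293T cells of TcC2c1 guided by artsgRNAs. Triangles indicate cleaved bands. (c) T7EI assay results showing simultaneous multiplex genome targeting in human 293T cells mediated by AaC2c1 combined with artsgRNA13.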

Detailed Description of the Invention

I. Definitions

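In the present invention, unless otherwise stated, the scientific and technical terms used herein have the meanings commonly understood by those skilled in the art. The terms and laboratory procedures relating to protein and nucleic acid chemistry, molecular biology, cell and tissue culture, microbiology and immunology used herein are terms and routine procedures widely used in the corresponding fields. For example, the standard recombinant DNA and molecular cloning techniques used in the invention are well known to those skilled in the art and are described more fully in: Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter "Sambrook"). For a better understanding of the invention, definitions and explanations of the relevant terms are provided below.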

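In one aspect, the invention provides a genome editing system for site-directed modification of a target sequence in the genome of a cell, comprising at least one of the following i) to v):

i) a C2c1 protein or a variant thereof, and a guide RNA;

ii) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof, and a guide RNA;

iii) a C2c1 protein or a variant thereof, and an expression construct comprising a nucleotide sequence encoding a guide RNA;

iv) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof, and an expression construct comprising a nucleotide sequence encoding a guide RNA;

v) an expression construct comprising a nucleotide sequence encoding a C2c1 protein or a variant thereof and a nucleotide sequence encoding a guide RNA;

wherein the guide RNA is capable of forming a complex with the C2c1 protein or variant thereof and of targeting the C2c1 protein or variant thereof to the target sequence in the genome of the cell. In some embodiments, the targeting results in the substitution, deletion and/or addition of one or more nucleotides in the target sequence.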

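"Genome" as used herein encompasses not only the chromosomal DNA present in the cell nucleus but also the organelle DNA present in subcellular components of the cell (e.g., mitochondria, plastids).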

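"C2c1 nuclease", "C2c1 protein" and "C2c1" are used interchangeably herein and refer to an RNA-guided nuclease comprising a C2c1 protein or a fragment thereof. C2c1 has guide RNA-mediated DNA binding activity as well as DNA cleavage activity, and can, under the guidance of a guide RNA, target and cleave a DNA target sequence to form a DNA double-strand break (DSB). A DSB can activate the cell's intrinsic repair mechanisms, non-homologous end joining (NHEJ) and homologous recombination (HR), to repair the DNA damage in the cell; during the repair process, the specific DNA sequence is edited in a site-directed manner.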

In some embodiments, the C2c1 protein is the AaC2c1 protein from Alicyclobacillus acidiphilus, the AkC2c1 protein from Alicyclobacillus kakegawensis, the AmC2c1 protein from Alicyclobacillus macrosporangiidus, the BhC2c1 protein from Bacillus hisashii, the BsC2c1 protein from Bacillus sp., the Bs3C2c1 protein from Bacillus sp., the DiC2c1 protein from Desulfovibrio inopinatus, the LsC2c1 protein from Laceyella sediminis, the SbC2c1 protein from Spirochaetes bacterium, or the TcC2c1 protein from Tuberibacillus calidus.

For example, the C2c1 protein is the AaC2c1 protein from Alicyclobacillus acidiphilus NBRC 100859, the AkC2c1 protein from Alicyclobacillus kakegawensis NBRC 103104, the AmC2c1 protein from Alicyclobacillus macrosporangiidus strain DSM 17980, the BhC2c1 protein from Bacillus hisashii strain C4, the BsC2c1 protein from Bacillus sp. NSP2.1, the Bs3C2c1 protein from Bacillus sp. V3-13 contig_40, the DiC2c1 protein from Desulfovibrio inopinatus DSM 10711, the LsC2c1 protein from Laceyella sediminis strain RHA1, the SbC2c1 protein from Spirochaetes bacterium GWB1_27_13, or the TcC2c1 protein from Tuberibacillus calidus DSM 17572.

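In some embodiments of the invention, the C2c1 protein is a C2c1 protein whose native locus does not have a CRISPR array. In some embodiments, the C2c1 protein whose native locus does not have a CRISPR array is the DiC2c1 or TcC2c1 protein.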

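In some embodiments, the C2c1 protein comprises the amino acid sequence shown in any one of SEQ ID NO: 1-10. For example, the AaC2c1, AkC2c1, AmC2c1, BhC2c1, BsC2c1, Bs3C2c1, DiC2c1, LsC2c1, SbC2c1 and TcC2c1 proteins comprise the amino acid sequences shown in SEQ ID NO: 1-10, respectively.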

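In some embodiments, the variant of the C2c1 protein comprises an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence identity to a wild-type C2c1 protein (e.g., the wild-type AaC2c1, AkC2c1, AmC2c1, BhC2c1, BsC2c1, Bs3C2c1, DiC2c1, LsC2c1, SbC2c1 or TcC2c1 protein), and has the genome editing and/or targeting activity of the corresponding wild-type C2c1 protein.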

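In some embodiments, the variant of the C2c1 protein comprises an amino acid sequence having one or more amino acid residue substitutions, deletions or additions relative to a wild-type C2c1 protein (e.g., the wild-type AaC2c1, AkC2c1, AmC2c1, BhC2c1, BsC2c1, Bs3C2c1, DiC2c1, LsC2c1, SbC2c1 or TcC2c1 protein), and has the genome editing and/or targeting activity of the corresponding wild-type C2c1 protein. For example, the variant comprises an amino acid sequence having 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acid residue substitutions, deletions or additions relative to the corresponding wild-type C2c1 protein. In some embodiments, the amino acid substitutions are conservative substitutions.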

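"Polypeptide", "peptide" and "protein" are used interchangeably in the invention and refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogues of the corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers. The terms "polypeptide", "peptide", "amino acid sequence" and "protein" may also include modified forms, including but not limited to glycosylation, lipid attachment, sulfation, γ-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.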

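Sequence "identity" has its art-recognized meaning, and the percentage of sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the full length of a polynucleotide or polypeptide, or along a region of the molecule. (See, for example: Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heinje, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991.) Although there are many methods for measuring the identity between two polynucleotides or polypeptides, the term "identity" is well known to the skilled person (Carrillo, H. & Lipman, D., SIAM J Applied Math 48:1073 (1988)).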
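As an illustration only, percent identity between two sequences that have already been aligned could be computed along the following lines. This is a minimal sketch, not a method prescribed by the specification: the function name `percent_identity`, the gap symbol `-`, and the choice of denominator (all alignment columns that are not gaps in both rows) are assumptions, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs are rows of the same alignment (equal length),
    and identity is reported relative to all columns that are not gap-only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # skip columns that are gaps in both rows
        columns += 1
        if x == y and x != gap:
            matches += 1  # identical residues in this column
    if columns == 0:
        raise ValueError("alignment has no comparable columns")
    return 100.0 * matches / columns
```

Under this convention, a variant would meet an "at least 80% identity" threshold when the returned value is 80 or higher for the chosen alignment.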

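In peptides or proteins, suitable conservative amino acid substitutions are known to those skilled in the art and can generally be made without altering the biological activity of the resulting molecule. In general, those skilled in the art recognize that single amino acid substitutions in non-essential regions of a polypeptide do not substantially alter biological activity (see, for example, Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).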

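In some embodiments, the variant of the C2c1 protein comprises a nuclease-dead C2c1 protein (dC2c1). A nuclease-dead C2c1 protein is a C2c1 protein that retains guide RNA-mediated DNA binding activity but lacks double-stranded DNA cleavage activity. In some embodiments, the nuclease-dead C2c1 protein encompasses a C2c1 nickase, which cleaves only one strand of a double-stranded target DNA.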

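In some embodiments, the variant of the C2c1 protein is a fusion protein of a nuclease-dead C2c1 protein and a deaminase. For example, the nuclease-dead C2c1 protein and the deaminase in the fusion protein may be connected by a linker such as a peptide linker.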

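As used in the invention, "deaminase" refers to an enzyme that catalyzes a deamination reaction. In some embodiments of the invention, the deaminase is a cytosine deaminase, which can accept single-stranded DNA as a substrate and catalyze the deamination of cytidine or deoxycytidine to uracil or deoxyuracil, respectively. In some embodiments of the invention, the deaminase is an adenine deaminase, which can accept single-stranded DNA as a substrate and catalyze the conversion of adenosine or deoxyadenosine (A) to inosine (I). By using a fusion protein of a nuclease-dead C2c1 protein and a deaminase, base editing in the target DNA sequence can be achieved, for example a C-to-T conversion or an A-to-G conversion.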

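In some embodiments of the invention, the C2c1 protein or variant thereof in the genome editing system of the invention may further comprise a nuclear localization sequence (NLS). In general, the one or more NLSs in the C2c1 protein or variant thereof should be strong enough to drive accumulation of the C2c1 protein or variant thereof in the cell nucleus in an amount that allows it to perform its gene editing function. In general, the strength of the nuclear localization activity is determined by the number and position of the NLSs in the C2c1 protein or variant thereof, by the particular NLS or NLSs used, or by a combination of these factors.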

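In some embodiments of the invention, the NLS of the C2c1 protein or variant thereof in the genome editing system of the invention may be located at the N-terminus and/or the C-terminus. In some embodiments, the C2c1 protein or variant thereof comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs. In some embodiments, the C2c1 protein or variant thereof comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the N-terminus. In some embodiments, the C2c1 protein or variant thereof comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the C-terminus. In some embodiments, the C2c1 protein or variant thereof comprises a combination of these, such as one or more NLSs at the N-terminus and one or more NLSs at the C-terminus. When more than one NLS is present, each may be selected independently of the others. In some embodiments of the invention, the C2c1 protein or variant thereof comprises 2 NLSs, for example one located at the N-terminus and one at the C-terminus.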

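In general, an NLS consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface, but other types of NLS are also known. Non-limiting examples of NLSs include KKRKV, PKKKRKV and SGGSPKKKRKV.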

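In addition, depending on the location of the DNA to be edited, the C2c1 protein or variant thereof of the invention may also include other localization sequences, such as a cytoplasmic localization sequence, a chloroplast localization sequence or a mitochondrial localization sequence.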

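In some embodiments of the invention, the target sequence is 18-35 nucleotides in length, preferably 20 nucleotides. In some embodiments of the invention, the target sequence is flanked at its 5' end by a PAM (protospacer adjacent motif) sequence selected from: 5'-TTTN-3', 5'-ATTN-3', 5'-GTTN-3', 5'-CTTN-3', 5'-TTC-3', 5'-TTG-3', 5'-TTA-3', 5'-TTT-3', 5'-TAN-3', 5'-TGN-3', 5'-TCN-3' and 5'-ATC-3', where N is selected from A, G, C and T.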
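The PAM and length constraints above can be illustrated with a short Python sketch that scans one strand of a DNA sequence for a listed PAM immediately followed by a candidate target of the chosen length. The function name `find_targets` and its return format are hypothetical, and the sketch makes simplifying assumptions: it searches only the strand it is given (not the reverse complement) and performs no specificity or off-target checks.

```python
import re

# PAM motifs listed in the text; N stands for any of A, G, C, T.
PAMS = ["TTTN", "ATTN", "GTTN", "CTTN", "TTC", "TTG", "TTA", "TTT",
        "TAN", "TGN", "TCN", "ATC"]

def find_targets(dna: str, pam: str, length: int = 20):
    """Return (position, PAM, target) for every PAM on the given strand
    that is directly followed by `length` nucleotides.

    Assumptions: only the supplied strand is scanned, and `position`
    is the 0-based index of the first PAM nucleotide.
    """
    if pam not in PAMS:
        raise ValueError("PAM is not one of the motifs listed in the text")
    if not 18 <= length <= 35:
        raise ValueError("target length must be 18-35 nucleotides")
    pam_regex = pam.replace("N", "[ACGT]")
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(rf"(?=({pam_regex})([ACGT]{{{length}}}))")
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(dna.upper())]
```

For example, `find_targets("GGTTTAACGTACGTACGTACGTACGTCC", "TTTN")` reports the single 20-nucleotide candidate that follows the PAM `TTTA`.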

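In the invention, the target sequence to be modified may be located anywhere in the genome, for example within a functional gene such as a protein-coding gene, or in a gene expression regulatory region such as a promoter region or an enhancer region, so that the function of the gene or the expression of the gene is modified. Substitutions, deletions and/or additions in the genomic target sequence can be detected by T7EI, PCR/RE or sequencing methods.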

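"Guide RNA" and "gRNA" are used interchangeably herein. A guide RNA typically consists of a crRNA molecule and a tracrRNA molecule that form a complex through partial complementarity, where the crRNA comprises a sequence that has sufficient identity to the target sequence to hybridize to the complement of the target sequence and to direct sequence-specific binding of the CRISPR complex (C2c1 + crRNA + tracrRNA) to the target sequence. However, a single guide RNA (sgRNA), which combines the features of the crRNA and the tracrRNA, can also be designed and used.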

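In some embodiments of the invention, the guide RNA is an sgRNA. In some specific embodiments, the sgRNA is encoded by a nucleic acid sequence selected from one of the following: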

5’-gtctaaaggacagaatttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgaacttctcaaaaagaacgctcgctcagtgttctgacgtcggatcactgagcgagcgatctgagaagtggcac-nx-3’(aasgrna);

5’-tcgtctataggacggcgaggacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttggcttcaagatgaccgctcgctcagcgatctgacaacggatcgctgagcgagcggtctgagaagtggcac-nx-3’(aksgrna1);

5’-ggaattgccgatctataggacggcagattcaacgggatgtgccaatgcactctttccaggagtgaacaccccgttggcttcaacatgatcgcccgctcaacggtccgatgtcggatcgttgagcgggcgatctgagaagtggcac-nx-3’(amsgrna1);

5’-gaggttctgtcttttggtcaggacaaccgtctagctataagtgctgcagggtgtgagaaactcctattgctggacgatgtctcttttatttcttttttcttggatgtccaagaaaaaagaaatgatacgaggcattagcac-nx-3’(bhsgrna);

5’-ccataagtcgacttacatatccgtgcgtgtgcattatgggcccatccacaggtctattcccacggataatcacgactttccactaagctttcgaatgttcgaaagcttagtggaaagcttcgtggttagcac-nx-3’(bssgrna);

5’-ggtgacctatagggtcaatgaatctgtgcgtgtgccataagtaattaaaaattacccaccacaggattatcttatttctgctaagtgtttagttgcctgaatacttagcagaaataatgatgattggcac-nx-3’(bs3sgrna);

5’-ggcaaagaatactgtgcgtgtgctaaggatggaaaaaatccattcaaccacaggattacattatttatctaatcacttaaatctttaagtgattagatgaattaaatgtgattagcac-nx-3’(lssgrna); or

5’-gtcttagggtatatcccaaatttgtcttagtatgtgcattgcttacagcgacaactaaggtttgtttatcttttttttacattgtaagatgttttacattataaaaagaagataatcttattgcac-nx-3’(sbsgrna);

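where Nx (written nx in the sequences above) denotes a nucleotide sequence (the spacer sequence) consisting of x consecutive nucleotides, each N being independently selected from A, G, C and T, and x is an integer with 18 ≤ x ≤ 35. Preferably, x = 20. In some embodiments, the sequence Nx (spacer sequence) is capable of specifically hybridizing to the complement of the target sequence. The sequence of the sgRNA other than Nx is the scaffold sequence of the sgRNA. In some embodiments, the sgRNA comprises a scaffold sequence encoded by the nucleotide sequence of any one of SEQ ID NO: 31-38.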
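The 5'-scaffold-Nx-3' layout described above can be sketched in Python as follows. The function name `build_sgrna_template` is hypothetical, and the scaffold used in the usage note is a short placeholder rather than one of the scaffolds listed in the text; the sketch only checks the alphabet and the 18 ≤ x ≤ 35 length rule and then appends the spacer to the 3' end of the scaffold.

```python
VALID_BASES = set("ACGT")

def build_sgrna_template(scaffold: str, spacer: str) -> str:
    """Return the DNA sequence encoding an sgRNA as 5'-scaffold-spacer-3'.

    Assumptions: `scaffold` is the scaffold-encoding sequence without the
    Nx placeholder, and `spacer` is the Nx spacer sequence (18-35 nt).
    """
    scaffold = scaffold.upper()
    spacer = spacer.upper()
    if not scaffold or set(scaffold) - VALID_BASES:
        raise ValueError("scaffold must be a non-empty A/C/G/T sequence")
    if set(spacer) - VALID_BASES:
        raise ValueError("spacer must contain only A, C, G and T")
    if not 18 <= len(spacer) <= 35:
        raise ValueError("spacer length x must satisfy 18 <= x <= 35")
    # The spacer sits at the 3' end, directly after the scaffold.
    return scaffold + spacer
```

With a placeholder scaffold such as `"GGCAC"` and a 20-nucleotide spacer, the function returns the two sequences joined in that order; a real design would pass one of the scaffold sequences listed in the text instead.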

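Surprisingly, the inventors found that the C2c1 proteins and guide RNAs of different C2c1 systems can be used interchangeably, which makes it possible to design universal artificial guide RNAs.

Accordingly, in another aspect, the invention provides an artificial sgRNA encoded by a nucleotide sequence selected from the following: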

5’-ggtctaaaggacagaatttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgaacttcaagcgaagtggcac-nx-3’(artsgrna1);

5’-ggtctaaaggacagaagacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttgacttcaagcgaagtggcac-nx-3’(artsgrna2);

5’-ggtctaaaggacagaaaatctgtgcgtgtgccataagtaattaaaaattacccaccacagacttcaagcgaagtggcac-nx-3’(artsgrna3);

5’-ggtcgtctataggacggcgagtttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgaacttcaagcgaagtggcac-nx-3’(artsgrna4);

5’-ggtcgtctataggacggcgaggacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttgacttcaagcgaagtggcac-nx-3’(artsgrna5);

5’-ggtcgtctataggacggcgagaatctgtgcgtgtgccataagtaattaaaaattacccaccacagacttcaagcgaagtggcac-nx-3’(artsgrna6);

5’-ggtgacctatagggtcaatgtttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgaacttcaagcgaagtggcac-nx-3’(artsgrna7);

5’-ggtgacctatagggtcaatggacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttgacttcaagcgaagtggcac-nx-3’(artsgrna8);

5’-ggtgacctatagggtcaatgaatctgtgcgtgtgccataagtaattaaaaattacccaccacagacttcaagcgaagtggcac-nx-3’(artsgrna9);

5’-ggtctaaaggacagaatttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgagcttcaaagaagtggcac-nx-3’(artsgrna10);

5’-ggtctaaaggacagaagacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttggcttcaaagaagtggcac-nx-3’(artsgrna11);

5’-ggtctaaaggacagaaaatctgtgcgtgtgccataagtaattaaaaattacccaccacaggcttcaaagaagtggcac-nx-3’(artsgrna12);

5’-ggtcgtctataggacggcgagtttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgagcttcaaagaagtggcac-nx-3’(artsgrna13);

5’-ggtcgtctataggacggcgaggacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttggcttcaaagaagtggcac-nx-3’(artsgrna14);

5’-ggtcgtctataggacggcgagaatctgtgcgtgtgccataagtaattaaaaattacccaccacaggcttcaaagaagtggcac-nx-3’(artsgrna15);

5’-ggtgacctatagggtcaatgtttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgagcttcaaagaagtggcac-nx-3’(artsgrna16);

5’-ggtgacctatagggtcaatggacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttggcttcaaagaagtggcac-nx-3’(artsgrna17);

5’-ggtgacctatagggtcaatgaatctgtgcgtgtgccataagtaattaaaaattacccaccacaggcttcaaagaagtggcac-nx-3’(artsgrna18);

5’-ggtctaaaggacagaatttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgagattatctatgatgattggcac-nx-3’(artsgrna19);

5’-ggtctaaaggacagaagacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttggattatctatgatgattggcac-nx-3’(artsgrna20);

5’-ggtctaaaggacagaaaatctgtgcgtgtgccataagtaattaaaaattacccaccacaggattatctatgatgattggcac-nx-3’(artsgrna21);

5’-ggtcgtctataggacggcgagtttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgagattatctatgatgattggcac-nx-3’(artsgrna22);

5’-ggtcgtctataggacggcgaggacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttggattatctatgatgattggcac-nx-3’(artsgrna23);

5’-ggtcgtctataggacggcgagaatctgtgcgtgtgccataagtaattaaaaattacccaccacaggattatctatgatgattggcac-nx-3’(artsgrna24);

5’-ggtgacctatagggtcaatgtttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgagattatctatgatgattggcac-nx-3’(artsgrna25);

5’-ggtgacctatagggtcaatggacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttggattatctatgatgattggcac-nx-3’(artsgrna26);

5’-ggtgacctatagggtcaatgaatctgtgcgtgtgccataagtaattaaaaattacccaccacaggattatctatgatgattggcac-nx-3’(artsgrna27);

5’-ggtctaaaggacagaacaacgggatgtgccaatgcactctttccaggagtgaacaccccgttgacttcaagcgaagtggcac-nx-3’(artsgrna28);

5’-ggtcgtctataggacggcgagcaacgggatgtgccaatgcactctttccaggagtgaacaccccgttgacttcaagcgaagtggcac-nx-3’(artsgrna29);

5’-ggaattgccgatctataggacggcagatttttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgaacttcaagcgaagtggcac-nx-3’(artsgrna30);

5’-ggaattgccgatctataggacggcagattgacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttgacttcaagcgaagtggcac-nx-3’(artsgrna31);

5’-ggaattgccgatctataggacggcagattcaacgggatgtgccaatgcactctttccaggagtgaacaccccgttgacttcaagcgaagtggcac-nx-3’(artsgrna32);

5’-ggtctaaaggacagaacaacgggatgtgccaatgcactctttccaggagtgaacaccccgttggcttcaaagaagtggcac-nx-3’(artsgrna33);

5’-ggtcgtctataggacggcgagcaacgggatgtgccaatgcactctttccaggagtgaacaccccgttggcttcaaagaagtggcac-nx-3’(artsgrna34);

5’-ggaattgccgatctataggacggcagatttttttcaacgggtgtgccaatggccactttccaggtggcaaagcccgttgagcttcaaagaagtggcac-nx-3’(artsgrna35);

5’-ggaattgccgatctataggacggcagattgacaacgggaagtgccaatgtgctctttccaagagcaaacaccccgttggcttcaaagaagtggcac-nx-3’(artsgrna36); or

5’-ggaattgccgatctataggacggcagattcaacgggatgtgccaatgcactctttccaggagtgaacaccccgttggcttcaaagaagtggcac-nx-3’(artsgrna37),

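where Nx (written nx in the sequences above) denotes a nucleotide sequence (the spacer sequence) consisting of x consecutive nucleotides, each N being independently selected from A, G, C and T, and x is an integer with 18 ≤ x ≤ 35. Preferably, x = 20. In some embodiments, the sequence Nx (spacer sequence) is capable of specifically hybridizing to the complement of the target sequence. The sequence of the sgRNA other than Nx is the scaffold sequence of the sgRNA.

In some embodiments, the artificial sgRNA comprises a scaffold sequence encoded by the nucleotide sequence of any one of SEQ ID NO: 39-75.

In some embodiments, the guide RNA in the genome editing system of the invention is an artificial sgRNA of the invention.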

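To obtain efficient expression in the target cell, in some embodiments of the invention the nucleotide sequence encoding the C2c1 protein or variant thereof is codon-optimized for the organism from which the cell to be genome-edited is derived.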

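Codon optimization refers to a method of modifying a nucleic acid sequence to enhance expression in a host cell of interest by replacing at least one codon of the native sequence (for example about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with codons that are used more frequently or most frequently in the genes of the host cell, while maintaining the native amino acid sequence. Different species exhibit particular preferences for certain codons of a given amino acid. Codon bias (differences in codon usage between organisms) often correlates with the translation efficiency of messenger RNA (mRNA), which in turn is believed to depend on the properties of the codons being translated and on the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell generally reflects the codons used most frequently in peptide synthesis. Genes can therefore be tailored for optimal expression in a given organism based on codon optimization. Codon usage tables are readily available, for example in the Codon Usage Database available at www.kazusa.or.jp/codon/, and these tables can be adapted in different ways. See Nakamura, Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res., 28:292 (2000).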
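The core idea of codon optimization, replacing codons with preferred synonymous codons while keeping the encoded amino acid sequence unchanged, can be sketched as below. This is a deliberately simplified illustration and not the optimization procedure used in the examples: the function names are hypothetical, the `preferred` mapping in the usage note is illustrative rather than taken from a codon usage table, and practical tools also weigh factors such as GC content and repeats.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds: str) -> str:
    """Translate a coding sequence (length divisible by 3) to one-letter codes."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def codon_optimize(cds: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is given.

    `preferred` maps a one-letter amino acid to the codon to use for it
    (an assumed input, e.g. derived from a codon usage table).
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE[codon]
        new_codon = preferred.get(aa, codon)
        if CODON_TABLE[new_codon] != aa:
            raise ValueError(f"{new_codon} does not encode {aa}")
        out.append(new_codon)
    return "".join(out)
```

For instance, with the illustrative mapping `{"L": "CTG", "K": "AAG"}`, the sequence `ATGTTAAAATAA` becomes `ATGCTGAAGTAA`, and `translate` gives the same protein for both.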

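The organism from which a cell that can be genome-edited by the system of the invention is derived is preferably a eukaryote, including but not limited to mammals such as human, mouse, rat, monkey, dog, pig, sheep, cattle and cat; poultry such as chicken, duck and goose; and plants, including monocotyledons and dicotyledons, for example rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis.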

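In some specific embodiments of the invention, the nucleotide sequence encoding the C2c1 protein or variant thereof is codon-optimized for human. In some specific embodiments, the codon-optimized nucleotide sequences encoding the AaC2c1, AkC2c1, AmC2c1, BhC2c1, BsC2c1, Bs3C2c1, DiC2c1, LsC2c1, SbC2c1 and TcC2c1 proteins are selected from SEQ ID NO: 11-20, respectively.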

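According to some embodiments of the invention, in the expression construct of the system of the invention, the nucleotide sequence encoding the C2c1 protein or variant thereof and/or the nucleotide sequence encoding the guide RNA is operably linked to an expression regulatory element such as a promoter.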

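As used in the invention, "expression construct" refers to a vector, such as a recombinant vector, suitable for expressing a nucleotide sequence of interest in an organism. "Expression" refers to the production of a functional product. For example, expression of a nucleotide sequence may refer to transcription of the nucleotide sequence (e.g., to produce mRNA or functional RNA) and/or translation of RNA into a precursor or mature protein. The "expression construct" of the invention may be a linear nucleic acid fragment, a circular plasmid or a viral vector or, in some embodiments, an RNA capable of being translated (such as mRNA).

The "expression construct" of the invention may comprise regulatory sequences and a nucleotide sequence of interest from different sources, or regulatory sequences and a nucleotide sequence of interest from the same source but arranged in a manner different from that normally found in nature.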

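"Regulatory sequence" and "regulatory element" are used interchangeably and refer to a nucleotide sequence that is located upstream (5' non-coding sequence), within or downstream (3' non-coding sequence) of a coding sequence and that affects the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns and polyadenylation recognition sequences.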

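"Promoter" refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment. In some embodiments of the invention, the promoter is a promoter capable of controlling gene transcription in a cell, whether or not it is derived from that cell. The promoter may be a constitutive promoter, a tissue-specific promoter, a developmentally regulated promoter or an inducible promoter.

"Constitutive promoter" refers to a promoter that generally causes a gene to be expressed in most cell types under most circumstances. "Tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably and refer to a promoter that is expressed mainly, but not necessarily exclusively, in one tissue or organ, and that may also be expressed in one particular cell or cell type. "Developmentally regulated promoter" refers to a promoter whose activity is determined by developmental events. An "inducible promoter" selectively expresses an operably linked DNA sequence in response to an endogenous or exogenous stimulus (environmental, hormonal, chemical signal, etc.).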

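As used herein, the term "operably linked" means that a regulatory element (for example, but not limited to, a promoter sequence or a transcription termination sequence) is linked to a nucleic acid sequence (for example, a coding sequence or an open reading frame) such that transcription of the nucleotide sequence is controlled and regulated by the transcriptional regulatory element. Techniques for operably linking a regulatory element region to a nucleic acid molecule are known in the art.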

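Examples of promoters that can be used in the invention include, but are not limited to, polymerase (pol) I, pol II and pol III promoters. Examples of pol I promoters include the chicken RNA pol I promoter. Examples of pol II promoters include, but are not limited to, the cytomegalovirus immediate early (CMV) promoter, the Rous sarcoma virus long terminal repeat (RSV-LTR) promoter and the simian virus 40 (SV40) immediate early promoter. Examples of pol III promoters include the U6 and H1 promoters. Inducible promoters such as the metallothionein promoter can be used. Other examples of promoters include the T7 phage promoter, the T3 phage promoter, the β-galactosidase promoter and the SP6 phage promoter. For use in plants, the promoter may be the cauliflower mosaic virus 35S promoter, the maize Ubi-1 promoter, the wheat U6 promoter, the rice U3 promoter, the maize U3 promoter or the rice actin promoter.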

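The cell that can be genome-edited by the system of the invention is preferably a eukaryotic cell, including but not limited to mammalian cells such as cells of human, mouse, rat, monkey, dog, pig, sheep, cattle and cat; cells of poultry such as chicken, duck and goose; and plant cells, including monocotyledonous and dicotyledonous plant cells, for example cells of rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis. In some embodiments of the invention, the cell is a eukaryotic cell, preferably a mammalian cell, more preferably a human cell.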

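In another aspect, the invention provides a method of modifying a target sequence in the genome of a cell, comprising introducing the genome editing system of the invention into the cell, whereby the guide RNA targets the C2c1 protein or variant thereof to the target sequence in the genome of the cell. In some embodiments, the targeting results in the substitution, deletion and/or addition of one or more nucleotides in the target sequence.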

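"Introducing" a nucleic acid molecule (e.g., a plasmid, linear nucleic acid fragment or RNA) or a protein of the genome editing system of the invention into a cell means transforming the cell with the nucleic acid or protein so that the nucleic acid or protein can function in the cell. "Transformation" as used in the invention includes stable transformation and transient transformation. "Stable transformation" refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in stable inheritance of the exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and of any successive generations. "Transient transformation" refers to the introduction of a nucleic acid molecule or protein into a cell, where it performs its function without stable inheritance of an exogenous gene. In transient transformation, the exogenous nucleic acid sequence is not integrated into the genome.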

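Methods that can be used to introduce the genome editing system of the invention into a cell include, but are not limited to: calcium phosphate transfection, protoplast fusion, electroporation, lipofection, microinjection, viral infection (e.g., baculovirus, vaccinia virus, adenovirus, adeno-associated virus, lentivirus and other viruses), the gene gun method, PEG-mediated protoplast transformation and Agrobacterium-mediated transformation.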

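In some embodiments, the method is performed in vitro. For example, the cell is an isolated cell. In some embodiments, the cell is a CAR-T cell. In some embodiments, the cell is an induced embryonic stem cell.

In other embodiments, the method may also be performed in vivo. For example, the cell is a cell within an organism, and the system of the invention can be introduced into the cell in vivo by, for example, a virus-mediated method. For example, the cell may be a tumor cell in a patient.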

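In another aspect, the invention also provides a method of producing a genetically modified cell, comprising introducing the genome editing system of the invention into a cell, whereby the guide RNA targets the C2c1 protein or variant thereof to a target sequence in the genome of the cell. In some embodiments, the targeting results in the substitution, deletion and/or addition of one or more nucleotides in the target sequence.

In another aspect, the invention also provides a genetically modified organism comprising a genetically modified cell produced by the method of the invention, or progeny thereof.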

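As used herein, "organism" includes any organism suitable for genome editing, preferably a eukaryote. Examples of organisms include, but are not limited to, mammals such as human, mouse, rat, monkey, dog, pig, sheep, cattle and cat; poultry such as chicken, duck and goose; and plants, including monocotyledons and dicotyledons, for example rice, maize, wheat, sorghum, barley, soybean, peanut and Arabidopsis. In some embodiments of the invention, the organism is a eukaryote, preferably a mammal, more preferably a human.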

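As used herein, "genetically modified organism" or "genetically modified cell" means an organism or cell that contains within its genome an exogenous polynucleotide or a modified gene or expression regulatory sequence. For example, the exogenous polynucleotide can be stably integrated into the genome of the organism or cell and inherited over successive generations. The exogenous polynucleotide may be integrated into the genome alone or as part of a recombinant DNA construct. A modified gene or expression regulatory sequence is one in which the sequence in the genome of the organism or cell contains single or multiple deoxynucleotide substitutions, deletions and/or additions. "Exogenous" with respect to a sequence means a sequence from a foreign species or, if from the same species, a sequence whose composition and/or locus has been significantly altered from its native form by deliberate human intervention.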

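In another aspect, the invention provides a gene expression regulation system based on the nuclease-dead C2c1 protein of the invention. Although this system does not change the sequence of the target gene, it is also defined as a genome editing system within the scope of this document.

In some embodiments, the gene expression regulation system of the invention is a gene repression or silencing system, which may comprise one of the following:

i) a nuclease-dead C2c1 protein or a fusion protein thereof with a transcriptional repressor protein, and a guide RNA;

ii) an expression construct comprising a nucleotide sequence encoding a nuclease-dead C2c1 protein or a fusion protein thereof with a transcriptional repressor protein, and a guide RNA;

iii) a nuclease-dead C2c1 protein or a fusion protein thereof with a transcriptional repressor protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA;

iv) an expression construct comprising a nucleotide sequence encoding a nuclease-dead C2c1 protein or a fusion protein thereof with a transcriptional repressor protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA; or

v) an expression construct comprising a nucleotide sequence encoding a nuclease-dead C2c1 protein or a fusion protein thereof with a transcriptional repressor protein and a nucleotide sequence encoding a guide RNA.

The nuclease-dead C2c1 protein and the guide RNA are as defined above. The choice of transcriptional repressor protein is within the skill of those skilled in the art.

As used herein, gene repression or silencing refers to the down-regulation or elimination of the expression level of a gene, preferably at the transcriptional level.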

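The gene expression regulation system of the invention may, however, also use a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein. In this case, the gene expression regulation system is a gene expression activation system. For example, the gene expression activation system of the invention may comprise one of the following:

i) a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein, and a guide RNA;

ii) an expression construct comprising a nucleotide sequence encoding a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein, and a guide RNA;

iii) a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA;

iv) an expression construct comprising a nucleotide sequence encoding a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein, and an expression construct comprising a nucleotide sequence encoding a guide RNA; or

v) an expression construct comprising a nucleotide sequence encoding a fusion protein of a nuclease-dead C2c1 protein and a transcriptional activator protein and a nucleotide sequence encoding a guide RNA.

The nuclease-dead C2c1 protein and the guide RNA are as defined above. The choice of transcriptional activator protein is within the skill of those skilled in the art.

As used herein, gene activation refers to the up-regulation of the expression level of a gene, preferably at the transcriptional level.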

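In another aspect, the invention also encompasses the use of the genome editing system of the invention in the treatment of disease.

By modifying a disease-associated gene with the genome editing system of the invention, up-regulation, down-regulation, inactivation, activation or mutation correction of the disease-associated gene can be achieved, thereby preventing and/or treating the disease. For example, the target sequence in the invention may be located within the protein-coding region of a disease-associated gene, or in a gene expression regulatory region such as a promoter region or an enhancer region, so that the function of the disease-associated gene or the expression of the disease-associated gene can be modified.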

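A "disease-associated" gene refers to any gene that produces a transcription or translation product at an abnormal level or in an abnormal form in cells derived from disease-affected tissue, compared with tissues or cells of a non-disease control. Where altered expression correlates with the occurrence and/or progression of a disease, it may be a gene expressed at an abnormally high level or a gene expressed at an abnormally low level. A disease-associated gene also refers to a gene having one or more mutations, or a genetic variation, that is directly responsible for the etiology of a disease or is in linkage disequilibrium with one or more genes responsible for it. The transcribed or translated product may be known or unknown, and may be at a normal or abnormal level.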

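Accordingly, the invention also provides a method of treating a disease in a subject in need thereof, comprising delivering to the subject an effective amount of the genome editing system of the invention to modify a gene associated with the disease.

The invention also provides the use of the genome editing system of the invention in the preparation of a pharmaceutical composition for treating a disease in a subject in need thereof, wherein the genome editing system is used to modify a gene associated with the disease.

The invention also provides a pharmaceutical composition for treating a disease in a subject in need thereof, comprising the genome editing system of the invention and a pharmaceutically acceptable carrier, wherein the genome editing system is used to modify a gene associated with the disease.

In some embodiments, the subject is a mammal, for example a human.

Examples of the disease include, but are not limited to, tumors, inflammation, Parkinson's disease, cardiovascular disease, Alzheimer's disease, autism, drug addiction, age-related macular degeneration, schizophrenia and hereditary diseases.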

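In yet another aspect, the scope of the invention also includes a kit for use in the method of the invention, the kit comprising the genome editing system of the invention and instructions for use. A kit generally includes a label indicating the intended use and/or method of use of the kit contents. The term label includes any written or recorded material provided on or with the kit or otherwise supplied with the kit.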

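Examples

To facilitate understanding of the invention, the invention is described more fully below with reference to specific examples and the accompanying drawings. Preferred embodiments of the invention are shown in the drawings. The invention may, however, be implemented in many different forms and is not limited to the examples described herein. Rather, these examples are provided so that the disclosure of the invention is understood more thoroughly and completely.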

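Materials and Methods

1. DNA manipulation

DNA manipulations, including DNA preparation, digestion, ligation, amplification, synthesis, purification and agarose gel electrophoresis, were performed according to Molecular Cloning: A Laboratory Manual with some modifications. Briefly, targeting sgRNAs for the cell transfection assays were constructed by ligating annealed oligonucleotides (Table 1) into BsaI-digested pUC19-U6-sgRNA (SEQ ID NO: 23-30) vectors.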

Table 1. Target sequences in the human genome.

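2. De novo gene synthesis and plasmid construction

New CRISPR-C2c1 orthologs were identified using the PSI-BLAST program (Altschul, S. F. et al. Nucleic Acids Res 25, 3389-3402 (1997)). Their coding sequences were humanized (Grote, A. et al., Nucleic Acids Res 33, W526-531 (2005)), and the oligonucleotides for synthesis of the C2c1 genes and sgRNAs were designed using the GeneDesign program (Richardson, S. M. et al., Genome Res 16, 550-556 (2006)). Each C2c1 gene was synthesized as described (Li, G. et al., Methods Mol Biol 1073, 9-17 (2013)). The purified products were assembled in vitro into expression vectors by homologous recombination using HiFi DNA Assembly Master Mix (NEB). The pCAG-2AeGFP (SEQ ID NO: 21) and pUC19-U6 (SEQ ID NO: 22) vectors were used for mammalian expression of the C2c1 proteins and the sgRNAs, respectively.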

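3. Cell culture and transfection

Human embryonic kidney 293T cells were cultured in Dulbecco's Modified Eagle Medium (DMEM, Gibco) supplemented with 10% fetal bovine serum (FBS, Gibco) and 1% antibiotic-antimycotic (Gibco) at 37°C with 5% CO2. The 293T cells were transfected with Lipofectamine LTX (Invitrogen) following the manufacturer's recommended protocol. For each well of a 48-well plate, a total of 400 ng of plasmid was used (C2c1:sgRNA = 2:1). Cells were then harvested directly 48 hours after transfection for genomic DNA extraction.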
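The per-well plasmid amounts implied by the stated total and ratio can be worked out with a few lines of Python. This is only an arithmetic sketch: the function name `split_plasmid_mass` is hypothetical, and it assumes that the 2:1 ratio is a mass ratio, which the text does not specify.

```python
def split_plasmid_mass(total_ng: float, ratio: tuple) -> tuple:
    """Split a total plasmid mass between two plasmids by a mass ratio.

    Assumption: `ratio` (e.g. (2, 1) for C2c1:sgRNA) refers to mass,
    not to molar amounts.
    """
    first, second = ratio
    if total_ng <= 0 or first <= 0 or second <= 0:
        raise ValueError("total mass and ratio parts must be positive")
    unit = total_ng / (first + second)  # mass corresponding to one ratio part
    return first * unit, second * unit
```

Under this assumption, `split_plasmid_mass(400, (2, 1))` gives roughly 267 ng of C2c1 plasmid and 133 ng of sgRNA plasmid per well.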

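4. T7 endonuclease I (T7EI) assay and Sanger sequencing

Harvested cells were lysed directly with Buffer L (Bimake) supplemented with proteinase K, incubated at 55°C for 3 hours and inactivated at 95°C for 10 minutes. The genomic region around the C2c1 target site of each gene was amplified by PCR. 200-400 ng of PCR product was mixed with ddH2O to a final volume of 10 μl and subjected to a re-annealing process, as described previously, to form heteroduplexes. After re-annealing, the products were treated with 1/10 volume of NEBuffer 2.1 and 0.2 μl of T7EI (NEB) at 37°C for 30 minutes and analyzed on a 3% agarose gel. Indels were quantified based on relative band intensities (Cong, L. et al., Science 339, 819-823 (2013)). The mutated products identified by the T7EI assay were cloned into a TA cloning vector and transformed into a competent E. coli strain (TransGen Biotech). After overnight culture, colonies were picked at random and sequenced.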
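Quantification of indels from relative band intensities can be sketched as follows. The text does not spell out the formula; the sketch assumes the estimate commonly used with the cited method, indel (%) = 100 × (1 − sqrt(1 − (b + c)/(a + b + c))), where a is the intensity of the undigested PCR product and b and c are the intensities of the two cleavage products. The function name `indel_percent` is hypothetical.

```python
import math

def indel_percent(uncut: float, cut1: float, cut2: float) -> float:
    """Estimate indel frequency (%) from T7EI gel band intensities.

    Assumed formula: 100 * (1 - sqrt(1 - (b + c) / (a + b + c))),
    with a = uncut band, b and c = the two cleaved bands.
    """
    if min(uncut, cut1, cut2) < 0:
        raise ValueError("band intensities must be non-negative")
    total = uncut + cut1 + cut2
    if total <= 0:
        raise ValueError("at least one band must have positive intensity")
    fraction_cleaved = (cut1 + cut2) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))
```

The square root reflects the assumption that a heteroduplex forms whenever at least one of the two re-annealed strands carries a mutation, so the cleaved fraction overstates the fraction of mutated alleles.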

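Example 1. Identification of new C2c1 proteins

Six representative C2c1 proteins from different bacteria, together with four previously reported C2c1 orthologs, were selected and synthesized de novo for genome editing in human embryonic kidney 293T cells (Figures 1 and 2 and SEQ ID NO: 1-10). Among these 10 C2c1 orthologs, the C2c1 proteins from D. inopinatus (DiC2c1) and T. calidus (TcC2c1) have neither a predictable precursor CRISPR RNA (pre-crRNA) nor a trans-activating crRNA (tracrRNA) (Figure 1b), suggesting that these two C2c1 proteins might not be suitable for genome editing applications.

For mammalian genome editing, 293T cells were co-transfected with the individual C2c1 enzymes and their cognate single guide RNAs (sgRNAs) targeting human endogenous loci containing appropriate PAMs (Figure 1). The results of the T7 endonuclease I (T7EI) assay showed that, in addition to AaC2c1 and AkC2c1, which the inventors had identified previously, four new C2c1 orthologs (AmC2c1, BhC2c1, Bs3C2c1 and LsC2c1) robustly edited the human genome, although their targeting efficiencies varied between orthologs and between target sites (Figure 1b and Figure 3a). Multiplex genome editing was also achieved with Bs3C2c1 simply by using multiple sgRNAs, editing four sites in the human genome simultaneously (Figure 3b, c). These newly identified C2c1 orthologs expand the options for C2c1-based genome editing.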

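Example 2. Interchangeability of different C2c1 proteins and dual RNAs

To investigate the interchangeability between the dual RNA (crRNA and tracrRNA) and the protein components in C2c1 systems, the conservation of both the C2c1 proteins and the dual RNAs was first analyzed. In addition to the conserved amino acid sequences of the C2c1 orthologs (Figure 4a and Figure 2), the DNA sequences of the pre-crRNA:tracrRNA duplexes and their secondary structures were also highly conserved (Figure 1b and Figure 5). Next, genome editing was performed in 293T cells with each of the 8 C2c1 orthologs complexed with each of the sgRNAs from the 8 C2c1 systems. As shown by the results of the T7EI assay, sgRNAs derived from the AaC2c1, AkC2c1, AmC2c1, Bs3C2c1 and LsC2c1 loci could replace the original sgRNAs for mammalian genome editing, although activity varied between different C2c1 orthologs and sgRNAs (Figure 4c, d and Figure 6). These results demonstrate the interchangeability between different C2c1 proteins and the dual RNAs from different C2c1 loci.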

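Example 3. Genome editing with C2c1 proteins whose native loci lack a CRISPR array

Two C2c1 orthologs whose loci carry no CRISPR array, DiC2c1 and TcC2c1, were selected for further experiments (Figure 7a). Because their loci carry no CRISPR array, the sequences of their crRNA:tracrRNA duplexes cannot be predicted. 293T cells were co-transfected with DiC2c1, TcC2c1 or AaC2c1 combined with sgRNAs derived from the loci of the other 8 C2c1 orthologs and targeting different genomic sites. The T7EI assay results showed that sgRNAs derived from AaC2c1, AkC2c1, AmC2c1, Bs3C2c1 and LsC2c1 enabled TcC2c1 to edit the human genome robustly (Figure 7b, c and Figure 8). In addition, AasgRNA or AksgRNA enabled TcC2c1 to achieve multiplex genome editing (Figure 7d and Figure 9). These results show that the interchangeability between C2c1 proteins and dual RNAs from different systems makes it possible to edit mammalian genomes with C2c1 orthologs whose native loci do not have a CRISPR array.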

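Example 4. Design of artificial sgRNAs for C2c1-mediated genome editing

The interchangeability between C2c1 proteins and dual RNAs in different C2c1 systems facilitates the design of new artificial sgRNA (artsgRNA) scaffolds to promote C2c1-mediated genome editing. Considering the conservation of DNA sequence and secondary structure among the C2c1 orthologs (Figures 1b and 3), 37 sgRNA scaffolds (SEQ ID NO: 39-75) were designed and synthesized de novo to target the human CCR5 locus (Figure 7e, Figure 10a). The results of the T7EI assay showed that 22 artsgRNA scaffolds worked efficiently (Figure 10b). To verify the general applicability of the artsgRNAs, artsgRNA13 was used to guide TcC2c1 or AaC2c1 for multiplex genome editing (Figure 10a). The T7EI assay results showed that artsgRNA13 promoted multiplex genome editing by both TcC2c1 and AaC2c1 (Figure 7f and Figure 10c). These results indicate that designing and synthesizing artsgRNAs can promote C2c1-mediated genome editing, in particular multiplex genome editing.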

Table 2. Sequences involved in the invention and related information

Sequence Listing

<110> Institute of Zoology, Chinese Academy of Sciences

<120> Genome editing system and method based on C2c1 nucleases

<130>tc5170

<160>80

<170>siposequencelisting1.0

<210>1

<211>1129

<212>prt

<213> Alicyclobacillus acidiphilus

<400>1

metalavallyssermetlysvallysleuargleuaspasnmetpro

151015

gluileargalaglyleutrplysleuhisthrgluvalasnalagly

202530

valargtyrtyrthrglutrpleuserleuleuargglngluasnleu

354045

tyrargargserproasnglyaspglygluglnglucystyrlysthr

505560

alagluglucyslysalagluleuleugluargleuargalaarggln

65707580

valgluasnglyhiscysglyproalaglyseraspaspgluleuleu

859095

glnleualaargglnleutyrgluleuleuvalproglnalailegly

100105110

alalysglyaspalaglnglnilealaarglyspheleuserproleu

115120125

alaasplysaspalavalglyglyleuglyilealalysalaglyasn

130135140

lysproargtrpvalargmetargglualaglygluproglytrpglu

145150155160

gluglulysalalysalaglualaarglysserthraspargthrala

165170175

aspvalleuargalaleualaasppheglyleulysproleumetarg

180185190

valtyrthraspseraspmetserservalglntrplysproleuarg

195200205

lysglyglnalavalargthrtrpaspargaspmetpheglnglnala

210215220

ilegluargmetmetsertrpglusertrpasnglnargvalglyglu

225230235240

alatyralalysleuvalgluglnlysserargphegluglnlysasn

245250255

phevalglyglngluhisleuvalglnleuvalasnglnleuglngln

260265270

aspmetlysglualaserhisglyleugluserlysgluglnthrala

275280285

histyrleuthrglyargalaleuargglyserasplysvalpheglu

290295300

lystrpglulysleuaspproaspalapropheaspleutyraspthr

305310315320

gluilelysasnvalglnargargasnthrargargpheglyserhis

325330335

aspleuphealalysleualagluprolystyrglnalaleutrparg

340345350

gluaspalaserpheleuthrargtyralavaltyrasnserileval

355360365

arglysleuasnhisalalysmetphealathrphethrleuproasp

370375380

alathralahisproiletrpthrargpheasplysleuglyglyasn

385390395400

leuhisglntyrthrpheleupheasnglupheglygluglyarghis

405410415

alaileargpheglnlysleuleuthrvalgluaspglyvalalalys

420425430

gluvalaspaspvalthrvalproilesermetseralaglnleuasp

435440445

aspleuleuproargaspprohisgluleuvalalaleutyrphegln

450455460

asptyrglyalagluglnhisleualaglyglupheglyglyalalys

465470475480

ileglntyrargargaspglnleuasnhisleuhisalaargarggly

485490495

alaargaspvaltyrleuasnleuservalargvalglnserglnser

500505510

glualaargglygluargargproprotyralaalavalpheargleu

515520525

valglyaspasnhisargalaphevalhispheasplysleuserasp

530535540

tyrleualagluhisproaspaspglylysleuglysergluglyleu

545550555560

leuserglyleuargvalmetservalaspleuglyleuargthrser

565570575

alaserileservalpheargvalalaarglysaspgluleulyspro

580585590

asnsergluglyargvalprophecyspheproilegluglyasnglu

595600605

asnleuvalalavalhisgluargserglnleuleulysleuprogly

610615620

gluthrgluserlysaspleuargalaileargglugluargglnarg

625630635640

thrleuargglnleuargthrglnleualatyrleuargleuleuval

645650655

argcysglysergluaspvalglyargarggluargsertrpalalys

660665670

leuilegluglnprometaspalaasnglnmetthrproasptrparg

675680685

glualaphegluaspgluleuglnlysleulysserleutyrglyile

690695700

cysglyaspargglutrpthrglualavaltyrgluservalargarg

705710715720

valtrparghismetglylysglnvalargasptrparglysaspval

725730735

argserglygluargprolysileargglytyrglnlysaspvalval

740745750

glyglyasnserilegluglnileglutyrleugluargglntyrlys

755760765

pheleulyssertrpserphepheglylysvalserglyglnvalile

770775780

argalaglulysglyserargphealailethrleuarggluhisile

785790795800

asphisalalysgluaspargleulyslysleualaaspargileile

805810815

metglualaleuglytyrvaltyralaleuaspaspgluargglylys

820825830

glylystrpvalalalystyrproprocysglnleuileleuleuglu

835840845

gluleuserglutyrglnpheasnasnaspargproprosergluasn

850855860

asnglnleumetglntrpserhisargglyvalpheglngluleuleu

865870875880

asnglnalaglnvalhisaspleuleuvalglythrmettyralaala

885890895

pheserserargpheaspalaargthrglyalaproglyileargcys

900905910

argargvalproalaargcysalaarggluglnasnprogluprophe

915920925

protrptrpleuasnlysphevalalagluhislysleuaspglycys

930935940

proleuargalaaspaspleuileprothrglygluglygluphephe

945950955960

valserpropheseralaglugluglyaspphehisglnilehisala

965970975

aspleuasnalaalaglnasnleuglnargargleutrpseraspphe

980985990

aspileserglnileargleuargcysasptrpglygluvalaspgly

99510001005

gluprovalleuileproargthrthrglylysargthralaaspser

101010151020

tyrglyasnlysvalphetyrthrlysthrglyvalthrtyrtyrglu

1025103010351040

arggluargglylyslysargarglysvalphealaglnglugluleu

104510501055

serglugluglualagluleuleuvalglualaaspglualaargglu

106010651070

lysservalvalleumetargaspproserglyileileasnarggly

107510801085

asptrpthrargglnlysgluphetrpsermetvalasnglnargile

109010951100

gluglytyrleuvallysglnileargserargvalargleuglnglu

1105111011151120

seralacysgluasnthrglyaspile

1125

<210>2

<211>1147

<212>prt

<213> Alicyclobacillus kakegawensis

<400>2

metalavallysserilelysvallysleuargleuserglucyspro

151015

aspileleualaglymettrpglnleuhisargalathrasnalagly

202530

valargtyrtyrthrglutrpvalserleumetargglngluileleu

354045

tyrserargglyproaspglyglyglnglncystyrmetthralaglu

505560

aspcysglnarggluleuleuargargleuargasnargglnleuhis

65707580

asnglyargglnaspglnproglythraspalaaspleuleualaile

859095

serargargleutyrgluileleuvalleuglnserileglylysarg

100105110

glyaspalaglnglnilealaserserpheleuserproleuvalasp

115120125

proasnserlysglyglyargglyglualalysserglyarglyspro

130135140

alatrpglnlysmetargaspglnglyaspproargtrpvalalaala

145150155160

argglulystyrgluglnarglysalavalaspproserlysgluile

165170175

leuasnserleuaspalaleuglyleuargproleuphealavalphe

180185190

thrgluthrtyrargserglyvalasptrplysproleuglylysser

195200205

glnglyvalargthrtrpaspargaspmetpheglnglnalaleuglu

210215220

argleumetsertrpglusertrpasnargargvalglygluglutyr

225230235240

alaargleupheglnglnlysmetlysphegluglngluhispheala

245250255

gluglnserhisleuvallysleualaargalaleuglualaaspmet

260265270

argalaalaserglnglypheglualalysargglythralahisgln

275280285

ilethrargargalaleuargglyalaaspargvalphegluiletrp

290295300

lysserileprogluglualaleupheserglntyraspgluvalile

305310315320

argglnvalglnalaglulysargargasppheglyserhisaspleu

325330335

phealalysleualagluprolystyrglnproleutrpargalaasp

340345350

gluthrpheleuthrargtyralaleutyrasnglyvalleuargasp

355360365

leuglulysalaargglnphealathrphethrleuproaspalacys

370375380

valasnproiletrpthrargphegluserserglnglyserasnleu

385390395400

hislystyrglupheleupheasphisleuglyproglyarghisala

405410415

valargpheglnargleuleuvalvalglusergluglyalalysglu

420425430

argaspservalvalvalprovalalaproserglyglnleuasplys

435440445

leuvalleuargglugluglulysserservalalaleuhisleuhis

450455460

aspthralaargproaspglyphemetalaglutrpalaglyalalys

465470475480

leuglntyrgluargserthrleualaarglysalaargargasplys

485490495

glnglymetargsertrpargargglnprosermetleumetserala

500505510

alaglnmetleugluaspalalysglnalaglyaspvaltyrleuasn

515520525

ileservalargvallysserprosergluvalargglyglnargarg

530535540

proprotyralaalaleupheargileaspasplysglnargargval

545550555560

thrvalasntyrasnlysleuseralatyrleuglugluhisproasp

565570575

lysglnileproglyalaproglyleuleuserglyleuargvalmet

580585590

servalaspleuglyleuargthrseralaserileservalphearg

595600605

valalalyslysglugluvalglualaleuglyaspglyargpropro

610615620

histyrtyrproilehisglythraspaspleuvalalavalhisglu

625630635640

argserhisleuileglnmetproglygluthrgluthrlysglnleu

645650655

arglysleuargglugluargglnalavalleuargproleupheala

660665670

glnleualaleuleuargleuleuvalargcysglyalaalaaspglu

675680685

argileargthrargsertrpglnargleuthrlysglnglyargglu

690695700

phethrlysargleuthrprosertrpargglualaleugluleuglu

705710715720

leuthrargleuglualatyrcysglyargvalproaspaspglutrp

725730735

serargilevalaspargthrvalilealaleutrpargargmetgly

740745750

lysglnvalargasptrparglysglnvallysserglyalalysval

755760765

lysvallysglytyrglnleuaspvalvalglyglyasnserleuala

770775780

glnileasptyrleugluglnglntyrlyspheleuargargtrpser

785790795800

phephealaargalaserglyleuvalvalargalaasparggluser

805810815

hisphealavalalaleuargglnhisilegluasnalalysargasp

820825830

argleulyslysleualaaspargileleumetglualaleuglytyr

835840845

valtyrglualaserglyproarggluglyglntrpthralaglnhis

850855860

proprocysglnleuileileleuglugluleuseralatyrargphe

865870875880

seraspaspargproprosergluasnserlysleumetalatrpgly

885890895

hisargglyileleuglugluleuvalasnglnalaglnvalhisasp

900905910

valleuvalglythrvaltyralaalapheserserargpheaspala

915920925

argthrglyalaproglyvalargcysargargvalproalaargphe

930935940

valglyalathrvalaspaspserleuproleutrpleuthrgluphe

945950955960

leuasplyshisargleuasplysasnleuleuargproaspaspval

965970975

ileprothrglygluglyglupheleuvalserprocysglygluglu

980985990

alaalaargvalargglnvalhisalaaspileasnalaalaglnasn

99510001005

leuglnargargleutrpglnasnpheaspilethrgluleuargleu

101010151020

argcysaspvallysmetglyglygluglythrvalleuvalproarg

1025103010351040

valasnasnalaargalalysglnleupheglylyslysvalleuval

104510501055

serglnaspglyvalthrphephegluargserglnthrglyglylys

106010651070

prohisserglulysglnthraspleuthrasplysgluleugluleu

107510801085

ilealaglualaaspglualaargalalysservalvalleuphearg

109010951100

aspproserglyhisileglylysglyhistrpileargglnargglu

1105111011151120

phetrpserleuvallysglnargilegluserhisthralagluarg

112511301135

ileargvalargglyvalglyserserleuasp

11401145

<210> 3

<211> 1146

<212> PRT

<213> Alicyclobacillus macrosporangiidus

<400> 3

metasnvalalavallysserilelysvallysleumetleuglyhis

151015

leuprogluilearggluglyleutrphisleuhisglualavalasn

202530

leuglyvalargtyrtyrthrglutrpleualaleuleuargglngly

354045

asnleutyrargargglylysaspglyalaglnglucystyrmetthr

505560

alagluglncysargglngluleuleuvalargleuargasparggln

65707580

lysargasnglyhisthrglyaspproglythraspglugluleuleu

859095

glyvalalaargargleutyrgluleuleuvalproglnservalgly

100105110

lyslysglyglnalaglnmetleualaserglypheleuserproleu

115120125

alaaspprolyssergluglyglylysglythrserlysserglyarg

130135140

lysproalatrpmetglymetlysglualaglyaspserargtrpval

145150155160

glualalysalaargtyrglualaasnlysalalysaspprothrlys

165170175

glnvalilealaserleuglumettyrglyleuargproleupheasp

180185190

valphethrgluthrtyrlysthrileargtrpmetproleuglylys

195200205

hisglnglyvalargalatrpaspargaspmetpheglnglnserleu

210215220

gluargleumetsertrpglusertrpasngluargvalglyalaglu

225230235240

phealaargleuvalaspargargaspargpheargglulyshisphe

245250255

thrglyglngluhisleuvalalaleualaglnargleugluglnglu

260265270

metlysglualaserproglyphegluserlysserserglnalahis

275280285

argilethrlysargalaleuargglyalaaspglyileileaspasp

290295300

trpleulysleusergluglygluprovalaspargpheaspgluile

305310315320

leuarglysargglnalaglnasnproargargpheglyserhisasp

325330335

leupheleulysleualagluprovalpheglnproleutrpargglu

340345350

aspproserpheleuserargtrpalasertyrasngluvalleuasn

355360365

lysleugluaspalalysglnphealathrphethrleuproserpro

370375380

cysserasnprovaltrpalaargphegluasnalagluglythrasn

385390395400

ilephelystyrasppheleupheasphispheglylysglyarghis

405410415

glyvalargpheglnargmetilevalmetargaspglyvalprothr

420425430

gluvalgluglyilevalvalproilealaproserargglnleuasp

435440445

alaleualaproasnaspalaalaserproileaspvalphevalgly

450455460

aspproalaalaproglyalapheargglyglnpheglyglyalalys

465470475480

ileglntyrargargseralaleuvalarglysglyargarggluglu

485490495

lysalatyrleucysglypheargleuproserglnargargthrgly

500505510

thrproalaaspaspalaglygluvalpheleuasnleuserleuarg

515520525

valgluserglnsergluglnalaglyargargasnproprotyrala

530535540

alavalphehisileseraspglnthrargargvalilevalargtyr

545550555560

glygluilegluargtyrleualagluhisproaspthrglyilepro

565570575

glyserargglyleuthrserglyleuargvalmetservalaspleu

580585590

glyleuargthrseralaalaileservalpheargvalalahisarg

595600605

aspgluleuthrproaspalahisglyargglnprophephephepro

610615620

ilehisglymetasphisleuvalalaleuhisgluargserhisleu

625630635640

ileargleuproglygluthrgluserlyslysvalargserilearg

645650655

gluglnargleuaspargleuasnargleuargserglnmetalaser

660665670

leuargleuleuvalargthrglyvalleuaspgluglnlysargasp

675680685

argasntrpgluargleuglnsersermetgluargglyglygluarg

690695700

metproserasptrptrpaspleupheglnalaglnvalargtyrleu

705710715720

alaglnhisargaspalaserglyglualatrpglyargmetvalgln

725730735

alaalavalargthrleutrpargglnleualalysglnvalargasp

740745750

trparglysgluvalargargasnalaasplysvallysilearggly

755760765

ilealaargaspvalproglyglyhisserleualaglnleuasptyr

770775780

leugluargglntyrargpheleuargsertrpseralapheserval

785790795800

glnalaglyglnvalvalargalagluargaspserargphealaval

805810815

alaleuarggluhisileaspasnglylyslysaspargleulyslys

820825830

leualaaspargileleumetglualaleuglytyrvaltyrvalthr

835840845

aspglyargargalaglyglntrpglnalavaltyrproprocysgln

850855860

leuvalleuleuglugluleuserglutyrargpheserasnasparg

865870875880

proprosergluasnserglnleumetvaltrpserhisargglyval

885890895

leuglugluleuilehisglnalaglnvalhisaspvalleuvalgly

900905910

thrileproalaalapheserserargpheaspalaargthrglyala

915920925

proglyileargcysargargvalproserileproleulysaspala

930935940

proserileproiletrpleuserhistyrleulysglnthrgluarg

945950955960

aspalaalaalaleuargproglygluleuileprothrglyaspgly

965970975

glupheleuvalthrproalaglyargglyalaserglyvalargval

980985990

valhisalaaspileasnalaalahisasnleuglnargargleutrp

99510001005

gluasnpheaspleuseraspileargvalargcysaspargargglu

101010151020

glylysaspglythrvalvalleuileproargleuthrasnglnarg

1025103010351040

vallysgluargtyrserglyvalilephethrsergluaspglyval

104510501055

serphethrvalglyaspalalysthrargargargserseralaser

106010651070

glnglygluglyaspaspleuseraspglugluglngluleuleuala

107510801085

glualaaspaspalaarggluargservalvalleupheargasppro

109010951100

serglyphevalasnglyglyargtrpthralaglnargalaphetrp

1105111011151120

glymetvalhisasnargilegluthrleuleualagluargpheser

112511301135

valserglyalaalaglulysvalarggly

11401145

<210> 4

<211> 1108

<212> PRT

<213> Bacillus hisashii

<400> 4

metalathrargserpheileleulysilegluproasnglugluval

151015

lyslysglyleutrplysthrhisgluvalleuasnhisglyileala

202530

tyrtyrmetasnileleulysleuileargglnglualailetyrglu

354045

hishisgluglnaspprolysasnprolyslysvalserlysalaglu

505560

ileglnalagluleutrpaspphevalleulysmetglnlyscysasn

65707580

serphethrhisgluvalasplysaspgluvalpheasnileleuarg

859095

gluleutyrglugluleuvalproserservalglulyslysglyglu

100105110

alaasnglnleuserasnlyspheleutyrproleuvalaspproasn

115120125

serglnserglylysglythralaserserglyarglysproargtrp

130135140

tyrasnleulysilealaglyaspprosertrpglugluglulyslys

145150155160

lystrpglugluasplyslyslysaspproleualalysileleugly

165170175

lysleualaglutyrglyleuileproleupheileprotyrthrasp

180185190

serasngluproilevallysgluilelystrpmetglulysserarg

195200205

asnglnservalargargleuasplysaspmetpheileglnalaleu

210215220

gluargpheleusertrpglusertrpasnleulysvallysgluglu

225230235240

tyrglulysvalglulysglutyrlysthrleuglugluargilelys

245250255

gluaspileglnalaleulysalaleugluglntyrglulysgluarg

260265270

glngluglnleuleuargaspthrleuasnthrasnglutyrargleu

275280285

serlysargglyleuargglytrparggluileileglnlystrpleu

290295300

lysmetaspgluasngluproserglulystyrleugluvalphelys

305310315320

asptyrglnarglyshisproargglualaglyasptyrservaltyr

325330335

glupheleuserlyslysgluasnhispheiletrpargasnhispro

340345350

glutyrprotyrleutyralathrphecysgluileasplyslyslys

355360365

lysaspalalysglnglnalathrphethrleualaaspproileasn

370375380

hisproleutrpvalargpheglugluargserglyserasnleuasn

385390395400

lystyrargileleuthrgluglnleuhisthrglulysleulyslys

405410415

lysleuthrvalglnleuaspargleuiletyrprothrglusergly

420425430

glytrpgluglulysglylysvalaspilevalleuleuproserarg

435440445

glnphetyrasnglnilepheleuaspilegluglulysglylyshis

450455460

alaphethrtyrlysaspgluserilelyspheproleulysglythr

465470475480

leuglyglyalaargvalglnpheaspargasphisleuargargtyr

485490495

prohislysvalgluserglyasnvalglyargiletyrpheasnmet

500505510

thrvalasnilegluprothrgluserprovalserlysserleulys

515520525

ilehisargaspasppheprolysvalvalasnphelysprolysglu

530535540

leuthrglutrpilelysaspserlysglylyslysleulyssergly

545550555560

ilegluserleugluileglyleuargvalmetserileaspleugly

565570575

glnargglnalaalaalaalaserilephegluvalvalaspglnlys

580585590

proaspilegluglylysleuphepheproilelysglythrgluleu

595600605

tyralavalhisargalaserpheasnilelysleuproglygluthr

610615620

leuvallysserarggluvalleuarglysalaarggluaspasnleu

625630635640

lysleumetasnglnlysleuasnpheleuargasnvalleuhisphe

645650655

glnglnphegluaspilethrgluargglulysargvalthrlystrp

660665670

ileserargglngluasnseraspvalproleuvaltyrglnaspglu

675680685

leuileglnilearggluleumettyrlysprotyrlysasptrpval

690695700

alapheleulysglnleuhislysargleugluvalgluileglylys

705710715720

gluvallyshistrparglysserleuseraspglyarglysglyleu

725730735

tyrglyileserleulysasnileaspgluileaspargthrarglys

740745750

pheleuleuargtrpserleuargprothrgluproglygluvalarg

755760765

argleugluproglyglnargphealaileaspglnleuasnhisleu

770775780

asnalaleulysgluaspargleulyslysmetalaasnthrileile

785790795800

methisalaleuglytyrcystyraspvalarglyslyslystrpgln

805810815

alalysasnproalacysglnileileleuphegluaspleuserasn

820825830

tyrasnprotyrglugluargserargphegluasnserlysleumet

835840845

lystrpserargarggluileproargglnvalalaleuglnglyglu

850855860

iletyrglyleuglnvalglygluvalglyalaglnpheserserarg

865870875880

phehisalalysthrglyserproglyileargcysservalvalthr

885890895

lysglulysleuglnaspasnargphephelysasnleuglnargglu

900905910

glyargleuthrleuasplysilealavalleulysgluglyaspleu

915920925

tyrproasplysglyglyglulyspheileserleuserlysasparg

930935940

lyscysvalthrthrhisalaaspileasnalaalaglnasnleugln

945950955960

lysargphetrpthrargthrhisglyphetyrlysvaltyrcyslys

965970975

alatyrglnvalaspglyglnthrvaltyrileprogluserlysasp

980985990

glnlysglnlysileilegluglupheglygluglytyrpheileleu

99510001005

lysaspglyvaltyrglutrpvalasnalaglylysleulysilelys

101010151020

lysglyserserlysglnsersersergluleuvalaspseraspile

1025103010351040

leulysaspserpheaspleualasergluleulysglyglulysleu

104510501055

metleutyrargaspproserglyasnvalpheproserasplystrp

106010651070

metalaalaglyvalphepheglylysleugluargileleuileser

107510801085

lysleuthrasnglntyrserileserthrilegluaspaspserser

109010951100

lysglnsermet

1105

<210> 5

<211> 1108

<212> PRT

<213> Bacillus

<400> 5

metalaileargserilelysleulysleulysthrhisthrglypro

151015

glualaglnasnleuarglysglyiletrpargthrhisargleuleu

202530

asngluglyvalalatyrtyrmetlysmetleuleuleuphearggln

354045

gluserthrglygluargprolysglugluleuglnglugluleuile

505560

cyshisilearggluglnglnglnargasnglnalaasplysasnthr

65707580

glnalaleuproleuasplysalaleuglualaleuargglnleutyr

859095

gluleuleuvalproserservalglyglnserglyaspalaglnile

100105110

ileserarglyspheleuserproleuvalaspproasnserglugly

115120125

glylysglythrserlysalaglyalalysprothrtrpglnlyslys

130135140

lysglualaasnaspprothrtrpgluglnasptyrglulystrplys

145150155160

lysargargglugluaspprothralaservalilethrthrleuglu

165170175

glutyrglyileargproilepheproleutyrthrasnthrvalthr

180185190

aspilealatrpleuproleuglnserasnglnphevalargthrtrp

195200205

aspargaspmetleuglnglnalailegluargleuleusertrpglu

210215220

sertrpasnlysargvalglngluglutyralalysleulysglulys

225230235240

metalaglnleuasngluglnleugluglyglyglnglutrpileser

245250255

leuleugluglntyrglugluasnarggluarggluleuarggluasn

260265270

metthralaalaasnasplystyrargilethrlysargglnmetlys

275280285

glytrpasngluleutyrgluleutrpserthrpheproalaserala

290295300

serhisgluglntyrlysglualaleulysargvalglnglnargleu

305310315320

argglyargpheglyaspalahisphepheglntyrleumetgluglu

325330335

lysasnargleuiletrplysglyasnproglnargilehistyrphe

340345350

valalaargasngluleuthrlysargleugluglualalysglnser

355360365

alathrmetthrleuproasnalaarglyshisproleutrpvalarg

370375380

pheaspalaargglyglyasnleuglnasptyrtyrleuthralaglu

385390395400

alaasplysproargserargargphevalthrpheserglnleuile

405410415

trpprosergluserglytrpmetglulyslysaspvalgluvalglu

420425430

leualaleuserargglnphetyrglnglnvallysleuleulysasn

435440445

asplysglylysglnlysilegluphelysasplysglyserglyser

450455460

thrpheasnglyhisleuglyglyalalysleuglnleugluarggly

465470475480

aspleuglulysgluglulysasnphegluaspglygluileglyser

485490495

valtyrleuasnvalvalileaspphegluproleuglngluvallys

500505510

asnglyargvalglnalaprotyrglyglnvalleuglnleuilearg

515520525

argproasnglupheprolysvalthrthrtyrlyssergluglnleu

530535540

valglutrpilelysalaserproglnhisseralaglyvalgluser

545550555560

leualaserglypheargvalmetserileaspleuglyleuargala

565570575

alaalaalathrserilepheservalglugluserserasplysasn

580585590

alaalaaspphesertyrtrpilegluglythrproleuvalalaval

595600605

hisglnargsertyrmetleuargleuproglygluglnvalglulys

610615620

glnvalmetglulysargaspgluargpheglnleuhisglnargval

625630635640

lyspheglnileargvalleualaglnilemetargmetalaasnlys

645650655

glntyrglyaspargtrpaspgluleuaspserleulysglnalaval

660665670

gluglnlyslysserproleuaspglnthraspargthrphetrpglu

675680685

glyilevalcysaspleuthrlysvalleuproargasnglualaasp

690695700

trpgluglnalavalvalglnilehisarglysalagluglutyrval

705710715720

glylysalavalglnalatrparglysargphealaalaaspgluarg

725730735

lysglyilealaglyleusermettrpasnileglugluleuglugly

740745750

leuarglysleuleuilesertrpserargargthrargasnprogln

755760765

gluvalasnargphegluargglyhisthrserhisglnargleuleu

770775780

thrhisileglnasnvallysgluaspargleulysglnleuserhis

785790795800

alailevalmetthralaleuglytyrvaltyraspgluarglysgln

805810815

glutrpcysalaglutyrproalacysglnvalileleuphegluasn

820825830

leuserglntyrargserasnleuaspargserthrlysgluasnser

835840845

thrleumetlystrpalahisargserileprolystyrvalhismet

850855860

glnalagluprotyrglyileglnileglyaspvalargalaglutyr

865870875880

serserargphetyralalysthrglythrproglyileargcyslys

885890895

lysvalargglyglnaspleuglnglyargargphegluasnleugln

900905910

lysargleuvalasngluglnpheleuthrglugluglnvallysgln

915920925

leuargproglyaspilevalproaspaspserglygluleuphemet

930935940

thrleuthraspglyserglyserlysgluvalvalpheleuglnala

945950955960

aspileasnalaalahisasnleuglnlysargphetrpglnargtyr

965970975

asngluleuphelysvalsercysargvalilevalargaspgluglu

980985990

glutyrleuvalprolysthrlysservalglnalalysleuglylys

99510001005

glyleuphevallyslysseraspthralatrplysaspvaltyrval

101010151020

trpaspserglnalalysleulysglylysthrthrphethrgluglu

1025103010351040

sergluserprogluglnleugluasppheglngluileilegluglu

104510501055

alagluglualalysglythrtyrargthrleupheargaspproser

106010651070

glyvalphepheprogluservaltrptyrproglnlysaspphetrp

107510801085

glygluvallysarglysleutyrglylysleuarggluargpheleu

109010951100

thrlysalaarg

1105

<210> 6

<211> 1112

<212> PRT

<213> Bacillus

<400> 6

metalaileargserilelysleulysmetlysthrasnserglythr

151015

aspseriletyrleuarglysalaleutrpargthrhisglnleuile

202530

asngluglyilealatyrtyrmetasnleuleuthrleutyrarggln

354045

glualaileglyasplysthrlysglualatyrglnalagluleuile

505560

asnileileargasnglnglnargasnasnglyserserglugluhis

65707580

glyseraspglngluileleualaleuleuargglnleutyrgluleu

859095

ileileproserserileglygluserglyaspalaasnglnleugly

100105110

asnlyspheleutyrproleuvalaspproasnserglnserglylys

115120125

glythrserasnalaglyarglysproargtrplysargleulysglu

130135140

gluglyasnproasptrpgluleuglulyslyslysaspglugluarg

145150155160

lysalalysaspprothrvallysilepheaspasnleuasnlystyr

165170175

glyleuleuproleupheproleuphethrasnileglnlysaspile

180185190

glutrpleuproleuglylysargglnservalarglystrpasplys

195200205

aspmetpheileglnalailegluargleuleusertrpglusertrp

210215220

asnargargvalalaaspglutyrlysglnleulysglulysthrglu

225230235240

sertyrtyrlysgluhisleuthrglyglygluglutrpileglulys

245250255

ilearglyspheglulysgluargasnmetgluleuglulysasnala

260265270

phealaproasnaspglytyrpheilethrserargglnilearggly

275280285

trpaspargvaltyrglulystrpserlysleuprogluseralaser

290295300

proglugluleutrplysvalvalalagluglnglnasnlysmetser

305310315320

gluglypheglyaspprolysvalpheserpheleualaasnargglu

325330335

asnargaspiletrpargglyhissergluargiletyrhisileala

340345350

alatyrasnglyleuglnlyslysleuserargthrlysgluglnala

355360365

thrphethrleuproaspalailegluhisproleutrpileargtyr

370375380

gluserproglyglythrasnleuasnleuphelysleugluglulys

385390395400

glnlyslysasntyrtyrvalthrleuserlysileiletrpproser

405410415

gluglulystrpileglulysgluasnilegluileproleualapro

420425430

serileglnpheasnargglnilelysleulysglnhisvallysgly

435440445

lysglngluileserpheserasptyrserserargileserleuasp

450455460

glyvalleuglyglyserargileglnpheasnarglystyrilelys

465470475480

asnhislysgluleuleuglygluglyaspileglyprovalphephe

485490495

asnleuvalvalaspvalalaproleuglngluthrargasnglyarg

500505510

leuglnserproileglylysalaleulysvalileserseraspphe

515520525

serlysvalileasptyrlysprolysgluleumetasptrpmetasn

530535540

thrglyseralaserasnserpheglyvalalaserleuleuglugly

545550555560

metargvalmetserileaspmetglyglnargthrseralaserval

565570575

serilephegluvalvallysgluleuprolysaspglngluglnlys

580585590

leuphetyrserileasnaspthrgluleuphealailehislysarg

595600605

serpheleuleuasnleuproglygluvalvalthrlysasnasnlys

610615620

glnglnargglngluargarglyslysargglnphevalargsergln

625630635640

ileargmetleualaasnvalleuargleugluthrlyslysthrpro

645650655

aspgluarglyslysalailehislysleumetgluilevalglnser

660665670

tyraspsertrpthralaserglnlysgluvaltrpglulysgluleu

675680685

asnleuleuthrasnmetalaalapheasnaspgluiletrplysglu

690695700

serleuvalgluleuhishisargilegluprotyrvalglyglnile

705710715720

valserlystrparglysglyleusergluglyarglysasnleuala

725730735

glyilesermettrpasnileaspgluleugluaspthrargargleu

740745750

leuilesertrpserlysargserargthrproglyglualaasnarg

755760765

ilegluthraspglupropheglyserserleuleuglnhisilegln

770775780

asnvallysaspaspargleulysglnmetalaasnleuileilemet

785790795800

thralaleuglyphelystyrasplysgluglulysaspargtyrlys

805810815

argtrplysgluthrtyrproalacysglnileileleuphegluasn

820825830

leuasnargtyrleupheasnleuaspargserargarggluasnser

835840845

argleumetlystrpalahisargserileproargthrvalsermet

850855860

glnglyglumetpheglyleuglnvalglyaspvalargserglutyr

865870875880

serserargphehisalalysthrglyalaproglyileargcyshis

885890895

alaleuthrglugluaspleulysalaglyserasnthrleulysarg

900905910

leuilegluaspglypheileasnglusergluleualatyrleulys

915920925

lysglyaspileileproserglnglyglygluleuphevalthrleu

930935940

serlysargtyrlyslysaspseraspasnasngluleuthrvalile

945950955960

hisalaaspileasnalaalaglnasnleuglnlysargphetrpgln

965970975

glnasnsergluvaltyrargvalprocysglnleualaargmetgly

980985990

gluasplysleutyrileprolysserglnthrgluthrilelyslys

99510001005

tyrpheglylysglyserphevallysasnasnthrgluglngluval

101010151020

tyrlystrpglulysserglulysmetlysilelysthraspthrthr

1025103010351040

pheaspleuglnaspleuaspglyphegluaspileserlysthrile

104510501055

gluleualaglngluglnglnlyslystyrleuthrmetpheargasp

106010651070

proserglytyrphepheasnasngluthrtrpargproglnlysglu

107510801085

tyrtrpserilevalasnasnileilelyssercysleulyslyslys

109010951100

ileleuserasnlysvalgluleu

11051110

<210> 7

<211> 1149

<212> PRT

<213> Desulfovibrio inopinatus

<400> 7

metprothrargthrileasnleulysleuvalleuglylysasnpro

151015

gluasnalathrleuargargalaleupheserthrhisargleuval

202530

asnglnalathrlysargilegluglupheleuleuleucysarggly

354045

glualatyrargthrvalaspasngluglylysglualagluilepro

505560

arghisalavalglngluglualaleualaphealalysalaalagln

65707580

arghisasnglycysileserthrtyrgluaspglngluileleuasp

859095

valleuargglnleutyrgluargleuvalproservalasngluasn

100105110

asnglualaglyaspalaglnalaalaasnalatrpvalserproleu

115120125

metseralaglusergluglyglyleuservaltyrasplysvalleu

130135140

aspproproprovaltrpmetlysleulysgluglulysalaprogly

145150155160

trpglualaalaserglniletrpileglnseraspgluglyglnser

165170175

leuleuasnlysproglyserproproargtrpilearglysleuarg

180185190

serglyglnprotrpglnaspaspphevalseraspglnlyslyslys

195200205

glnaspgluleuthrlysglyasnalaproleuilelysglnleulys

210215220

glumetglyleuleuproleuvalasnprophephearghisleuleu

225230235240

aspprogluglylysglyvalserprotrpaspargleualavalarg

245250255

alaalavalalahispheilesertrpglusertrpasnhisargthr

260265270

argalaglutyrasnserleulysleuargargaspgluphegluala

275280285

alaseraspgluphelysaspaspphethrleuleuargglntyrglu

290295300

alalysarghisserthrleulysserilealaleualaaspaspser

305310315320

asnprotyrargileglyvalargserleuargalatrpasnargval

325330335

arggluglutrpileasplysglyalathrglugluglnargvalthr

340345350

ileleuserlysleuglnthrglnleuargglylyspheglyasppro

355360365

aspleupheasntrpleualaglnasparghisvalhisleutrpser

370375380

proargaspservalthrproleuvalargileasnalavalasplys

385390395400

valleuargargarglysprotyralaleumetthrphealahispro

405410415

argphehisproargtrpileleutyrglualaproglyglyserasn

420425430

leuargglntyralaleuaspcysthrgluasnalaleuhisilethr

435440445

leuproleuleuvalaspaspalahisglythrtrpileglulyslys

450455460

ileargvalproleualaproserglyglnileglnaspleuthrleu

465470475480

glulysleuglulyslyslysasnargleutyrtyrargserglyphe

485490495

glnglnphealaglyleualaglyglyalagluvalleuphehisarg

500505510

protyrmetgluhisaspgluargserglugluserleuleugluarg

515520525

proglyalavaltrpphelysleuthrleuaspvalalathrglnala

530535540

proproasntrpleuaspglylysglyargvalargthrproproglu

545550555560

valhishisphelysthralaleuserasnlysserlyshisthrarg

565570575

thrleuglnproglyleuargvalleuservalaspleuglymetarg

580585590

thrphealasercysservalphegluleuilegluglylysproglu

595600605

thrglyargalapheprovalalaaspgluargsermetaspserpro

610615620

asnlysleutrpalalyshisgluargserphelysleuthrleupro

625630635640

glygluthrproserarglysglugluglugluargserilealaarg

645650655

alagluiletyralaleulysargaspileglnargleulysserleu

660665670

leuargleuglyglugluaspasnaspasnargargaspalaleuleu

675680685

gluglnphephelysglytrpglyglugluaspvalvalproglygln

690695700

alapheproargserleupheglnglyleuglyalaalaprophearg

705710715720

serthrprogluleutrpargglnhiscysglnthrtyrtyrasplys

725730735

alaglualacysleualalyshisileserasptrparglysargthr

740745750

argproargprothrserargglumettrptyrlysthrargsertyr

755760765

hisglyglylysseriletrpmetleuglutyrleuaspalavalarg

770775780

lysleuleuleusertrpserleuargglyargthrtyrglyalaile

785790795800

asnargglnaspthralaargpheglyserleualaserargleuleu

805810815

hishisileasnserleulysgluaspargilelysthrglyalaasp

820825830

serilevalglnalaalaargglytyrileproleuprohisglylys

835840845

glytrpgluglnargtyrgluprocysglnleuileleuphegluasp

850855860

leualaargtyrargpheargvalaspargproargarggluasnser

865870875880

glnleumetglntrpasnhisargalailevalalagluthrthrmet

885890895

glnalagluleutyrglyglnilevalgluasnthralaalaglyphe

900905910

serserargphehisalaalathrglyalaproglyvalargcysarg

915920925

pheleuleugluargasppheaspasnaspleuprolysprotyrleu

930935940

leuarggluleusertrpmetleuglyasnthrlysvalgluserglu

945950955960

gluglulysleuargleuleuserglulysileargproglyserleu

965970975

valprotrpaspglyglygluglnphealathrleuhisprolysarg

980985990

glnthrleucysvalilehisalaaspmetasnalaalaglnasnleu

99510001005

glnargargphepheglyargcysglyglualapheargleuvalcys

101010151020

glnprohisglyaspaspvalleuargleualaserthrproglyala

1025103010351040

argleuleuglyalaleuglnglnleugluasnglyglnglyalaphe

104510501055

gluleuvalargaspmetglyserthrserglnmetasnargpheval

106010651070

metlysserleuglylyslyslysilelysproleuglnaspasnasn

107510801085

glyaspaspgluleugluaspvalleuservalleuproglugluasp

109010951100

aspthrglyargilethrvalpheargaspserserglyilephephe

1105111011151120

procysasnvaltrpileproalalysglnphetrpproalavalarg

112511301135

alametiletrplysvalmetalaserhisserleugly

11401145

<210> 8

<211> 1090

<212> PRT

<213> Laceyella sediminis

<400> 8

metserileargserphelysleulysilelysthrlysserglyval

151015

asnalaglugluleuargargglyleutrpargthrhisglnleuile

202530

asnaspglyilealatyrtyrmetasntrpleuvalleuleuarggln

354045

gluaspleupheileargasnglugluthrasngluileglulysarg

505560

serlysglugluileglnglygluleuleugluargvalhislysgln

65707580

glnglnargasnglntrpserglygluvalaspaspglnthrleuleu

859095

glnthrleuarghisleutyrglugluilevalproservalilegly

100105110

lysserglyasnalaserleulysalaargphepheleuglyproleu

115120125

valaspproasnasnlysthrthrlysaspvalserlysserglypro

130135140

thrprolystrplyslysmetlysaspalaglyaspproasntrpval

145150155160

glnglutyrglulystyrmetalagluargglnthrleuvalargleu

165170175

gluglumetglyleuileproleuphepromettyrthraspgluval

180185190

glyaspilehistrpleuproglnalaserglytyrthrargthrtrp

195200205

aspargaspmetpheglnglnalailegluargleuleusertrpglu

210215220

sertrpasnargargvalarggluargargalaglnpheglulyslys

225230235240

thrhisaspphealaserargphesergluseraspvalglntrpmet

245250255

asnlysleuargglutyrglualaglnglnglulysserleugluglu

260265270

asnalaphealaproasngluprotyralaleuthrlyslysalaleu

275280285

argglytrpgluargvaltyrhissertrpmetargleuaspserala

290295300

alasergluglualatyrtrpglngluvalalathrcysglnthrala

305310315320

metargglyglupheglyaspproalailetyrglnpheleualagln

325330335

lysgluasnhisaspiletrpargglytyrprogluargvalileasp

340345350

phealagluleuasnhisleuglnarggluleuargargalalysglu

355360365

aspalathrphethrleuproaspservalasphisproleutrpval

370375380

argtyrglualaproglyglythrasnilehisglytyraspleuval

385390395400

glnaspthrlysargasnleuthrleuileleuasplyspheileleu

405410415

proaspgluasnglysertrphisgluvallyslysvalpropheser

420425430

leualalysserlysglnphehisargglnvaltrpleuglngluglu

435440445

glnlysglnlyslysarggluvalvalphetyrasptyrserthrasn

450455460

leuprohisleuglythrleualaglyalalysleuglntrpasparg

465470475480

asnpheleuasnlysargthrglnglnglnileglugluthrglyglu

485490495

ileglylysvalphepheasnileservalaspvalargproalaval

500505510

gluvallysasnglyargleuglnasnglyleuglylysalaleuthr

515520525

valleuthrhisproaspglythrlysilevalthrglytrplysala

530535540

gluglnleuglulystrpvalglygluserglyargvalserserleu

545550555560

glyleuaspserleusergluglyleuargvalmetserileaspleu

565570575

glyglnargthrseralathrvalservalphegluilethrlysglu

580585590

alaproaspasnprotyrlysphephetyrglnleugluglythrglu

595600605

leuphealavalhisglnargserpheleuleualaleuproglyglu

610615620

asnproproglnlysilelysglnmetarggluileargtrplysglu

625630635640

argasnargilelysglnglnvalaspglnleuseralaileleuarg

645650655

leuhislyslysvalasngluaspgluargileglnalaileasplys

660665670

leuleuglnlysvalalasertrpglnleuasnglugluilealathr

675680685

alatrpasnglnalaleuserglnleutyrserlysalalysgluasn

690695700

aspleuglntrpasnglnalailelysasnalahishisglnleuglu

705710715720

provalvalglylysglnileserleutrparglysaspleuserthr

725730735

glyargglnglyilealaglyleuserleutrpserileglugluleu

740745750

glualathrlyslysleuleuthrargtrpserlysargserargglu

755760765

proglyvalvallysargilegluargphegluthrphealalysgln

770775780

ileglnhishisileasnglnvallysgluasnargleulysglnleu

785790795800

alaasnleuilevalmetthralaleuglytyrlystyraspglnglu

805810815

glnlyslystrpilegluvaltyrproalacysglnvalvalleuphe

820825830

gluasnleuargsertyrargphesertyrgluargserargargglu

835840845

asnlyslysleumetglutrpserhisargserileprolysleuval

850855860

glnmetglnglygluleupheglyleuglnvalalaaspvaltyrala

865870875880

alatyrserserargtyrhisglyargthrglyalaproglyilearg

885890895

cyshisalaleuthrglualaaspleuargasngluthrasnileile

900905910

hisgluleuileglualaglypheilelysglugluhisargprotyr

915920925

leuglnglnglyaspleuvalprotrpserglyglygluleupheala

930935940

thrleuglnlysprotyraspasnproargileleuthrleuhisala

945950955960

aspileasnalaalaglnasnileglnlysargphetrphisproser

965970975

mettrppheargvalasncysgluservalmetgluglygluileval

980985990

thrtyrvalprolysasnlysthrvalhislyslysglnglylysthr

99510001005

pheargphevallysvalgluglyseraspvaltyrglutrpalalys

101010151020

trpserlysasnargasnlysasnthrpheserserilethrgluarg

1025103010351040

lysproprosersermetileleupheargaspproserglythrphe

104510501055

phelysgluglnglutrpvalgluglnlysthrphetrpglylysval

106010651070

glnsermetileglnalatyrmetlyslysthrilevalglnargmet

107510801085

gluglu

1090

<210> 9

<211> 1119

<212> PRT

<213> Spirochaetes

<400> 9

metserphethrilesertyrprophelysleuileilelysasnlys

151015

aspglualalysalaleuleuaspthrhisglntyrmetasnglugly

202530

vallystyrtyrleuglulysleuleumetpheargglnglulysile

354045

pheileglygluaspgluthrglylysargiletyrileglugluthr

505560

glutyrlyslysglnileglugluphetyrleuilelyslysthrglu

65707580

leuglyargasnleuthrleuthrleuaspgluphelysthrleumet

859095

arggluleutyrilecysleuvalsersersermetgluasnlyslys

100105110

glypheproasnalaglnglnalaserleuasnilepheserproleu

115120125

pheaspalagluserlysglytyrileleulysglugluasnasnasn

130135140

ileserleuilehislysasptyrglylysileleuleulysargleu

145150155160

argaspasnasnleuileproilephethrlysphethraspilelys

165170175

lysilethralalysleuserprothralaleuaspargmetilephe

180185190

alaglnalaileglulysleuleusertyrglusertrpcyslysleu

195200205

metilelysgluargpheasplysgluvallysilelysgluleuglu

210215220

asnlyscysgluasnlysglngluargasplysilephegluileleu

225230235240

glulystyrgluglugluargglnlysthrphegluglnaspsergly

245250255

phealalyslysglylysphetyrilethrglyargmetleulysgly

260265270

pheaspgluilelysglulystrpleulysglulysaspargserglu

275280285

glnasnleuileasnileleuasnlystyrglnthraspasnserlys

290295300

leuvalglyaspargasnleupheglupheileilelysleugluasn

305310315320

glncysleutrpasnglyaspileasptyrleulysilelysargasp

325330335

ileasnlysasnglniletrpleuaspargproglumetproargphe

340345350

thrmetproaspphelyslyshisproleutrptyrargtyrgluasp

355360365

proserasnserasnpheargasntyrlysilegluvalvallysasp

370375380

gluasntyrilethrileproleuilethrgluargasnasnglutyr

385390395400

pheglugluasntyrthrpheasnleualalysleulyslysleuser

405410415

gluasnilethrpheileprolysserlysasnlysgluphegluphe

420425430

ileaspserasnaspgluglugluasplyslysaspglnlyslysser

435440445

lysglntyrilelystyrcysaspthralalysasnthrsertyrgly

450455460

lysserglyglyileargleutyrpheasnargasngluleugluasn

465470475480

tyrlysaspglylyslysmetaspsertyrthrvalphethrleuser

485490495

ileargasptyrlysserleuphealalysglulysleuglnprogln

500505510

ilepheasnthrvalaspasnlysilethrserleulysileglnlys

515520525

lyspheglyasnglugluglnthrasnpheleusertyrphethrgln

530535540

asnglnilethrlyslysasptrpmetaspglulysthrpheglnasn

545550555560

vallysgluleuasngluglyileargvalleuservalaspleugly

565570575

glnargphephealaalavalsercysphegluilemetsergluile

580585590

aspasnasnlysleuphepheasnleuasnaspglnasnhislysile

595600605

ileargileasnasplysasntyrtyralalyshisiletyrserlys

610615620

thrilelysleuserglygluaspaspaspleutyrlysgluarglys

625630635640

ileasnlysasntyrlysleusertyrglngluarglysasnlysile

645650655

glyilephethrargglnileasnlysleuasnglnleuleulysile

660665670

ileargasnaspgluileasplysglulysphelysgluleuileglu

675680685

thrthrlysargtyrvallysasnthrtyrasnaspglyileileasp

690695700

trpasnasnvalaspasnlysileleusertyrgluasnlysgluasp

705710715720

valileasnleuhislysgluleuasplyslysleugluileaspphe

725730735

lysglupheileargglucysarglysproilepheargserglygly

740745750

leusermetglnargileasppheleuglulysleuasnlysleulys

755760765

arglystrpvalalaargthrglnlysseralagluserilevalleu

770775780

thrprolyspheglytyrlysleulysgluhisileasngluleulys

785790795800

aspasnargvallysglnglyvalasntyrileleumetthralaleu

805810815

glytyrilelysaspasngluilelysasnaspserlyslyslysgln

820825830

lysgluasptrpvallyslysasnargalacysglnileileleumet

835840845

glulysleuthrglutyrthrphealagluaspargproarggluglu

850855860

asnserlysleuargmettrpserhisargglnilepheasnpheleu

865870875880

glnglnlysalaserleutrpglyileleuvalglyaspvalpheala

885890895

protyrthrserlyscysleuseraspasnasnalaproglyilearg

900905910

cyshisglnvalthrlyslysaspleuileaspasnsertrppheleu

915920925

lysilevalvallysaspaspalaphecysaspleuilegluileasn

930935940

lysgluasnvallysasnlysserilelysileasnaspileleupro

945950955960

leuargglyglygluleuphealaserilelysaspglylysleuhis

965970975

ilevalglnalaaspileasnalaserargasnilealalysargphe

980985990

leuserglnileasnpropheargvalvalleulyslysasplysasp

99510001005

gluthrphehisleulysasngluproasntyrleulysasntyrtyr

101010151020

serileleuasnphevalprothrasnglugluleuthrphephelys

1025103010351040

valglugluasnlysaspilelysprothrlysargilelysmetasp

104510501055

lyshisglulysgluserthraspgluglyaspasptyrserlysasn

106010651070

glnilealaleupheargaspaspserglyilephepheasplysser

107510801085

leutrpvalaspglylysilephetrpservalvallysasnlysmet

109010951100

thrlysleuleuarggluargasnasnlyslysasnglyserlys

110511101115

<210> 10

<211> 1142

<212> PRT

<213> Tuberibacillus calidus

<400> 10

metasnilehisleulysgluleuileargmetalathrlysserphe

151015

ileleulysmetlysthrlysasnasnproglnleuargleuserleu

202530

trplysthrhisgluleupheasnpheglyvalalatyrtyrmetasp

354045

leuleuserleupheargglnlysaspleutyrmethisasnaspglu

505560

aspproasphisprovalvalleulyslysglugluileglngluarg

65707580

leutrpmetlysvalarggluthrglnglnlysasnglyphehisgly

859095

gluvalserlysaspgluvalleugluthrleuargalaleutyrglu

100105110

gluleuvalproseralavalglylysserglyglualaasnglnile

115120125

serasnlystyrleutyrproleuthraspproalaserglnsergly

130135140

lysglythralaasnserglyarglysproargtrplyslysleulys

145150155160

glualaglyaspprosertrplysaspalatyrglulystrpglulys

165170175

gluargglngluaspprolysleulysileleualaalaleuglnser

180185190

pheglyleuileproleupheargprophethrgluasnasphislys

195200205

alavalileservallystrpmetprolysserlysasnglnserval

210215220

arglyspheasplysaspmetpheasnglnalailegluargpheleu

225230235240

sertrpglusertrpasnglulysvalalagluasptyrglulysthr

245250255

valseriletyrgluserleuglnlysgluleulysglyileserthr

260265270

lysalaphegluilemetgluargvalglulysalatyrglualahis

275280285

leuarggluilethrpheserasnserthrtyrargileglyasnarg

290295300

alaileargglytrpthrgluilevallyslystrpmetlysleuasp

305310315320

proseralaproglnglyasntyrleuaspvalvallysasptyrgln

325330335

argarghisproarggluserglyaspphelysleuphegluleuleu

340345350

serargprogluasnglnalaalatrpargglutyrproglupheleu

355360365

proleutyrvallystyrarghisalagluglnargmetlysthrala

370375380

lyslysglnalathrphethrleucysaspproilearghisproleu

385390395400

trpvalargtyrglugluargserglythrasnleuasnlystyrarg

405410415

leuilemetasnglulysglulysvalvalglnpheaspargleuile

420425430

cysleuasnalaaspglyhistyrglugluglngluaspvalthrval

435440445

proleualaproserglnglnpheaspaspglnilelyspheserser

450455460

gluaspthrglylysglylyshisasnphesertyrtyrhislysgly

465470475480

ileasntyrgluleulysglythrleuglyglyalaargileglnphe

485490495

asparggluhisleuleuargargglnglyvallysalaglyasnval

500505510

glyargilepheleuasnvalthrleuasnilegluprometglnpro

515520525

pheserargserglyasnleuglnthrservalglylysalaleulys

530535540

valtyrvalaspglytyrprolysvalvalasnphelysprolysglu

545550555560

leuthrgluhisilelysgluserglulysasnthrleuthrleugly

565570575

valgluserleuprothrglyleuargvalmetservalaspleugly

580585590

glnargglnalaalaalaileserilephegluvalvalserglulys

595600605

proaspaspasnlysleuphetyrprovallysaspthraspleuphe

610615620

alavalhisargthrserpheasnilelysleuproglyglulysarg

625630635640

thrgluargargmetleugluglnglnlysargaspglnalailearg

645650655

aspleuserarglysleulyspheleulysasnvalleuasnmetgln

660665670

lysleuglulysthraspgluargglulysargvalasnargtrpile

675680685

lysasparggluarggluglugluasnprovaltyrvalglngluphe

690695700

glumetileserlysvalleutyrserprohisservaltrpvalasp

705710715720

glnleulysserilehisarglysleuglugluglnleuglylysglu

725730735

ileserlystrpargglnserileserglnglyargglnglyvaltyr

740745750

glyileserleulysasnilegluaspileglulysthrargargleu

755760765

leupheargtrpsermetargprogluasnproglygluvallysgln

770775780

leuglnproglygluargphealaileaspglnglnasnhisleuasn

785790795800

hisleulysaspaspargilelyslysleualaasnglnilevalmet

805810815

thralaleuglytyrargtyraspglylysarglyslystrpileala

820825830

lyshisproalacysglnleuvalleuphegluaspleuserargtyr

835840845

alaphetyraspgluargserargleugluasnargasnleumetarg

850855860

trpserargarggluileprolysglnvalalaglnileglyglyleu

865870875880

tyrglyleuleuvalglygluvalglyalaglntyrserserargphe

885890895

hisalalysserglyalaproglyileargcysargvalvallysglu

900905910

hisgluleutyrilethrgluglyglyglnlysvalargasnglnlys

915920925

pheleuaspserleuvalgluasnasnileilegluproaspaspala

930935940

argargleugluproglyaspleuileargaspglnglyglyasplys

945950955960

phealathrleuaspgluargglygluleuvalilethrhisalaasp

965970975

ileasnalaalaglnasnleuglnlysargphetrpthrargthrhis

980985990

glyleutyrargileargcysgluserarggluilelysaspalaval

99510001005

valleuvalproserasplysaspglnlysglulysmetgluasnleu

101010151020

pheglyileglytyrleuglnprophelysglngluasnaspvaltyr

1025103010351040

lystrpvallysglyglulysilelysglylyslysthrsersergln

104510501055

seraspasplysgluleuvalsergluileleuglnglualaserval

106010651070

metalaaspgluleulysglyasnarglysthrleupheargasppro

107510801085

serglytyrvalpheprolysaspargtrptyrthrglyglyargtyr

109010951100

pheglythrleugluhisleuleulysarglysleualagluargarg

1105111011151120

leupheaspglyglyserserargargglyleupheasnglythrasp

112511301135

serasnthrasnvalglu

1140

<210> 11

<211> 3387

<212> DNA

<213> Alicyclobacillus acidiphilus

<400> 11

atggccgtgaagagcatgaaggtgaagctgcgcctggacaacatgcccgagatccgcgcc60

ggcctgtggaagctgcacaccgaggtgaacgccggcgtgcgctactacaccgagtggctg120

agcctgctgcgccaggagaacctgtaccgccgcagccccaacggcgacggcgagcaggag180

tgctacaagaccgccgaggagtgcaaggccgagctgctggagcgcctgcgcgcccgccag240

gtggagaacggccactgcggccccgccggcagcgacgacgagctgctgcagctggcccgc300

cagctgtacgagctgctggtgccccaggccatcggcgccaagggcgacgcccagcagatc360

gcccgcaagttcctgagccccctggccgacaaggacgccgtgggcggcctgggcatcgcc420

aaggccggcaacaagccccgctgggtgcgcatgcgcgaggccggcgagcccggctgggag480

gaggagaaggccaaggccgaggcccgcaagagcaccgaccgcaccgccgacgtgctgcgc540

gccctggccgacttcggcctgaagcccctgatgcgcgtgtacaccgacagcgacatgagc600

agcgtgcagtggaagcccctgcgcaagggccaggccgtgcgcacctgggaccgcgacatg660

ttccagcaggccatcgagcgcatgatgagctgggagagctggaaccagcgcgtgggcgag720

gcctacgccaagctggtggagcagaagagccgcttcgagcagaagaacttcgtgggccag780

gagcacctggtgcagctggtgaaccagctgcagcaggacatgaaggaggccagccacggc840

ctggagagcaaggagcagaccgcccactacctgaccggccgcgccctgcgcggcagcgac900

aaggtgttcgagaagtgggagaagctggaccccgacgcccccttcgacctgtacgacacc960

gagatcaagaacgtgcagcgccgcaacacccgccgcttcggcagccacgacctgttcgcc1020

aagctggccgagcccaagtaccaggccctgtggcgcgaggacgccagcttcctgacccgc1080

tacgccgtgtacaacagcatcgtgcgcaagctgaaccacgccaagatgttcgccaccttc1140

accctgcccgacgccaccgcccaccccatctggacccgcttcgacaagctgggcggcaac1200

ctgcaccagtacaccttcctgttcaacgagttcggcgagggccgccacgccatccgcttc1260

cagaagctgctgaccgtggaggacggcgtggccaaggaggtggacgacgtgaccgtgccc1320

atcagcatgagcgcccagctggacgacctgctgccccgcgacccccacgagctggtggcc1380

ctgtacttccaggactacggcgccgagcagcacctggccggcgagttcggcggcgccaag1440

atccagtaccgccgcgaccagctgaaccacctgcacgcccgccgcggcgcccgcgacgtg1500

tacctgaacctgagcgtgcgcgtgcagagccagagcgaggcccgcggcgagcgccgcccc1560

ccctacgccgccgtgttccgcctggtgggcgacaaccaccgcgccttcgtgcacttcgac1620

aagctgagcgactacctggccgagcaccccgacgacggcaagctgggcagcgagggcctg1680

ctgagcggcctgcgcgtgatgagcgtggacctgggcctgcgcaccagcgccagcatcagc1740

gtgttccgcgtggcccgcaaggacgagctgaagcccaacagcgagggccgcgtgcccttc1800

tgcttccccatcgagggcaacgagaacctggtggccgtgcacgagcgcagccagctgctg1860

aagctgcccggcgagaccgagagcaaggacctgcgcgccatccgcgaggagcgccagcgc1920

accctgcgccagctgcgcacccagctggcctacctgcgcctgctggtgcgctgcggcagc1980

gaggacgtgggccgccgcgagcgcagctgggccaagctgatcgagcagcccatggacgcc2040

aaccagatgacccccgactggcgcgaggccttcgaggacgagctgcagaagctgaagagc2100

ctgtacggcatctgcggcgaccgcgagtggaccgaggccgtgtacgagagcgtgcgccgc2160

gtgtggcgccacatgggcaagcaggtgcgcgactggcgcaaggacgtgcgcagcggcgag2220

cgccccaagatccgcggctaccagaaggacgtggtgggcggcaacagcatcgagcagatc2280

gagtacctggagcgccagtacaagttcctgaagagctggagcttcttcggcaaggtgagc2340

ggccaggtgatccgcgccgagaagggcagccgcttcgccatcaccctgcgcgagcacatc2400

gaccacgccaaggaggaccgcctgaagaagctggccgaccgcatcatcatggaggccctg2460

ggctacgtgtacgccctggacgacgagcgcggcaagggcaagtgggtggccaagtacccc2520

ccctgccagctgatcctgctggaggagctgagcgagtaccagttcaacaacgaccgcccc2580

cccagcgagaacaaccagctgatgcagtggagccaccgcggcgtgttccaggagctgctg2640

aaccaggcccaggtgcacgacctgctggtgggcaccatgtacgccgccttcagcagccgc2700

ttcgacgcccgcaccggcgcccccggcatccgctgccgccgcgtgcccgcccgctgcgcc2760

cgcgagcagaaccccgagcccttcccctggtggctgaacaagttcgtggccgagcacaag2820

ctggacggctgccccctgcgcgccgacgacctgatccccaccggcgagggcgagttcttc2880

gtgagccccttcagcgccgaggagggcgacttccaccagatccacgccgacctgaacgcc2940

gcccagaacctgcagcgccgcctgtggagcgacttcgacatcagccagatccgcctgcgc3000

tgcgactggggcgaggtggacggcgagcccgtgctgatcccccgcaccaccggcaagcgc3060

accgccgacagctacggcaacaaggtgttctacaccaagaccggcgtgacctactacgag3120

cgcgagcgcggcaagaagcgccgcaaggtgttcgcccaggaggagctgagcgaggaggag3180

gccgagctgctggtggaggccgacgaggcccgcgagaagagcgtggtgctgatgcgcgac3240

cccagcggcatcatcaaccgcggcgactggacccgccagaaggagttctggagcatggtg3300

aaccagcgcatcgagggctacctggtgaagcagatccgcagccgcgtgcgcctgcaggag3360

agcgcctgcgagaacaccggcgacatc3387

<210> 12

<211> 3441

<212> DNA

<213> Alicyclobacillus kakegawensis

<400> 12

atggccgtgaagagcatcaaggtgaagctgcgcctgagcgagtgccccgacatcctggcc60

ggcatgtggcagctgcaccgcgccaccaacgccggcgtgcgctactacaccgagtgggtg120

agcctgatgcgccaggagatcctgtacagccgcggccccgacggcggccagcagtgctac180

atgaccgccgaggactgccagcgcgagctgctgcgccgcctgcgcaaccgccagctgcac240

aacggccgccaggaccagcccggcaccgacgccgacctgctggccatcagccgccgcctg300

tacgagatcctggtgctgcagagcatcggcaagcgcggcgacgcccagcagatcgccagc360

agcttcctgagccccctggtggaccccaacagcaagggcggccgcggcgaggccaagagc420

ggccgcaagcccgcctggcagaagatgcgcgaccagggcgacccccgctgggtggccgcc480

cgcgagaagtacgagcagcgcaaggccgtggaccccagcaaggagatcctgaacagcctg540

gacgccctgggcctgcgccccctgttcgccgtgttcaccgagacctaccgcagcggcgtg600

gactggaagcccctgggcaagagccagggcgtgcgcacctgggaccgcgacatgttccag660

caggccctggagcgcctgatgagctgggagagctggaaccgccgcgtgggcgaggagtac720

gcccgcctgttccagcagaagatgaagttcgagcaggagcacttcgccgagcagagccac780

ctggtgaagctggcccgcgccctggaggccgacatgcgcgccgccagccagggcttcgag840

gccaagcgcggcaccgcccaccagatcacccgccgcgccctgcgcggcgccgaccgcgtg900

ttcgagatctggaagagcatccccgaggaggccctgttcagccagtacgacgaggtgatc960

cgccaggtgcaggccgagaagcgccgcgacttcggcagccacgacctgttcgccaagctg1020

gccgagcccaagtaccagcccctgtggcgcgccgacgagaccttcctgacccgctacgcc1080

ctgtacaacggcgtgctgcgcgacctggagaaggcccgccagttcgccaccttcaccctg1140

cccgacgcctgcgtgaaccccatctggacccgcttcgagagcagccagggcagcaacctg1200

cacaagtacgagttcctgttcgaccacctgggccccggccgccacgccgtgcgcttccag1260

cgcctgctggtggtggagagcgagggcgccaaggagcgcgacagcgtggtggtgcccgtg1320

gcccccagcggccagctggacaagctggtgctgcgcgaggaggagaagagcagcgtggcc1380

ctgcacctgcacgacaccgcccgccccgacggcttcatggccgagtgggccggcgccaag1440

ctgcagtacgagcgcagcaccctggcccgcaaggcccgccgcgacaagcagggcatgcgc1500

agctggcgccgccagcccagcatgctgatgagcgccgcccagatgctggaggacgccaag1560

caggccggcgacgtgtacctgaacatcagcgtgcgcgtgaagagccccagcgaggtgcgc1620

ggccagcgccgccccccctacgccgccctgttccgcatcgacgacaagcagcgccgcgtg1680

accgtgaactacaacaagctgagcgcctacctggaggagcaccccgacaagcagatcccc1740

ggcgcccccggcctgctgagcggcctgcgcgtgatgagcgtggacctgggcctgcgcacc1800

agcgccagcatcagcgtgttccgcgtggccaagaaggaggaggtggaggccctgggcgac1860

ggccgccccccccactactaccccatccacggcaccgacgacctggtggccgtgcacgag1920

cgcagccacctgatccagatgcccggcgagaccgagaccaagcagctgcgcaagctgcgc1980

gaggagcgccaggccgtgctgcgccccctgttcgcccagctggccctgctgcgcctgctg2040

gtgcgctgcggcgccgccgacgagcgcatccgcacccgcagctggcagcgcctgaccaag2100

cagggccgcgagttcaccaagcgcctgacccccagctggcgcgaggccctggagctggag2160

ctgacccgcctggaggcctactgcggccgcgtgcccgacgacgagtggagccgcatcgtg2220

gaccgcaccgtgatcgccctgtggcgccgcatgggcaagcaggtgcgcgactggcgcaag2280

caggtgaagagcggcgccaaggtgaaggtgaagggctaccagctggacgtggtgggcggc2340

aacagcctggcccagatcgactacctggagcagcagtacaagttcctgcgccgctggagc2400

ttcttcgcccgcgccagcggcctggtggtgcgcgccgaccgcgagagccacttcgccgtg2460

gccctgcgccagcacatcgagaacgccaagcgcgaccgcctgaagaagctggccgaccgc2520

atcctgatggaggccctgggctacgtgtacgaggccagcggcccccgcgagggccagtgg2580

accgcccagcaccccccctgccagctgatcatcctggaggagctgagcgcctaccgcttc2640

agcgacgaccgcccccccagcgagaacagcaagctgatggcctggggccaccgcggcatc2700

ctggaggagctggtgaaccaggcccaggtgcacgacgtgctggtgggcaccgtgtacgcc2760

gccttcagcagccgcttcgacgcccgcaccggcgcccccggcgtgcgctgccgccgcgtg2820

cccgcccgcttcgtgggcgccaccgtggacgacagcctgcccctgtggctgaccgagttc2880

ctggacaagcaccgcctggacaagaacctgctgcgccccgacgacgtgatccccaccggc2940

gagggcgagttcctggtgagcccctgcggcgaggaggccgcccgcgtgcgccaggtgcac3000

gccgacatcaacgccgcccagaacctgcagcgccgcctgtggcagaacttcgacatcacc3060

gagctgcgcctgcgctgcgacgtgaagatgggcggcgagggcaccgtgctggtgccccgc3120

gtgaacaacgcccgcgccaagcagctgttcggcaagaaggtgctggtgagccaggacggc3180

gtgaccttcttcgagcgcagccagaccggcggcaagccccacagcgagaagcagaccgac3240

ctgaccgacaaggagctggagctgatcgccgaggccgacgaggcccgcgccaagagcgtg3300

gtgctgttccgcgaccccagcggccacatcggcaagggccactggatccgccagcgcgag3360

ttctggagcctggtgaagcagcgcatcgagagccacaccgccgagcgcatccgcgtgcgc3420

ggcgtgggcagcagcctggac3441

<210> 13

<211> 3438

<212> DNA

<213> Alicyclobacillus macrosporangiidus

<400> 13

atgaacgtggccgtgaagagcatcaaggtgaagctgatgctgggccacctgcccgagatc60

cgcgagggcctgtggcacctgcacgaggccgtgaacctgggcgtgcgctactacaccgag120

tggctggccctgctgcgccagggcaacctgtaccgccgcggcaaggacggcgcccaggag180

tgctacatgaccgccgagcagtgccgccaggagctgctggtgcgcctgcgcgaccgccag240

aagcgcaacggccacaccggcgaccccggcaccgacgaggagctgctgggcgtggcccgc300

cgcctgtacgagctgctggtgccccagagcgtgggcaagaagggccaggcccagatgctg360

gccagcggcttcctgagccccctggccgaccccaagagcgagggcggcaagggcaccagc420

aagagcggccgcaagcccgcctggatgggcatgaaggaggccggcgacagccgctgggtg480

gaggccaaggcccgctacgaggccaacaaggccaaggaccccaccaagcaggtgatcgcc540

agcctggagatgtacggcctgcgccccctgttcgacgtgttcaccgagacctacaagacc600

atccgctggatgcccctgggcaagcaccagggcgtgcgcgcctgggaccgcgacatgttc660

cagcagagcctggagcgcctgatgagctgggagagctggaacgagcgcgtgggcgccgag720

ttcgcccgcctggtggaccgccgcgaccgcttccgcgagaagcacttcaccggccaggag780

cacctggtggccctggcccagcgcctggagcaggagatgaaggaggccagccccggcttc840

gagagcaagagcagccaggcccaccgcatcaccaagcgcgccctgcgcggcgccgacggc900

atcatcgacgactggctgaagctgagcgagggcgagcccgtggaccgcttcgacgagatc960

ctgcgcaagcgccaggcccagaacccccgccgcttcggcagccacgacctgttcctgaag1020

ctggccgagcccgtgttccagcccctgtggcgcgaggaccccagcttcctgagccgctgg1080

gccagctacaacgaggtgctgaacaagctggaggacgccaagcagttcgccaccttcacc1140

ctgcccagcccctgcagcaaccccgtgtgggcccgcttcgagaacgccgagggcaccaac1200

atcttcaagtacgacttcctgttcgaccacttcggcaagggccgccacggcgtgcgcttc1260

cagcgcatgatcgtgatgcgcgacggcgtgcccaccgaggtggagggcatcgtggtgccc1320

atcgcccccagccgccagctggacgccctggcccccaacgacgccgccagccccatcgac1380

gtgttcgtgggcgaccccgccgcccccggcgccttccgcggccagttcggcggcgccaag1440

atccagtaccgccgcagcgccctggtgcgcaagggccgccgcgaggagaaggcctacctg1500

tgcggcttccgcctgcccagccagcgccgcaccggcacccccgccgacgacgccggcgag1560

gtgttcctgaacctgagcctgcgcgtggagagccagagcgagcaggccggccgccgcaac1620

cccccctacgccgccgtgttccacatcagcgaccagacccgccgcgtgatcgtgcgctac1680

ggcgagatcgagcgctacctggccgagcaccccgacaccggcatccccggcagccgcggc1740

ctgaccagcggcctgcgcgtgatgagcgtggacctgggcctgcgcaccagcgccgccatc1800

agcgtgttccgcgtggcccaccgcgacgagctgacccccgacgcccacggccgccagccc1860

ttcttcttccccatccacggcatggaccacctggtggccctgcacgagcgcagccacctg1920

atccgcctgcccggcgagaccgagagcaagaaggtgcgcagcatccgcgagcagcgcctg1980

gaccgcctgaaccgcctgcgcagccagatggccagcctgcgcctgctggtgcgcaccggc2040

gtgctggacgagcagaagcgcgaccgcaactgggagcgcctgcagagcagcatggagcgc2100

ggcggcgagcgcatgcccagcgactggtgggacctgttccaggcccaggtgcgctacctg2160

gcccagcaccgcgacgccagcggcgaggcctggggccgcatggtgcaggccgccgtgcgc2220

accctgtggcgccagctggccaagcaggtgcgcgactggcgcaaggaggtgcgccgcaac2280

gccgacaaggtgaagatccgcggcatcgcccgcgacgtgcccggcggccacagcctggcc2340

cagctggactacctggagcgccagtaccgcttcctgcgcagctggagcgccttcagcgtg2400

caggccggccaggtggtgcgcgccgagcgcgacagccgcttcgccgtggccctgcgcgag2460

cacatcgacaacggcaagaaggaccgcctgaagaagctggccgaccgcatcctgatggag2520

gccctgggctacgtgtacgtgaccgacggccgccgcgccggccagtggcaggccgtgtac2580

cccccctgccagctggtgctgctggaggagctgagcgagtaccgcttcagcaacgaccgc2640

ccccccagcgagaacagccagctgatggtgtggagccaccgcggcgtgctggaggagctg2700

atccaccaggcccaggtgcacgacgtgctggtgggcaccatccccgccgccttcagcagc2760

cgcttcgacgcccgcaccggcgcccccggcatccgctgccgccgcgtgcccagcatcccc2820

ctgaaggacgcccccagcatccccatctggctgagccactacctgaagcagaccgagcgc2880

gacgccgccgccctgcgccccggcgagctgatccccaccggcgacggcgagttcctggtg2940

acccccgccggccgcggcgccagcggcgtgcgcgtggtgcacgccgacatcaacgccgcc3000

cacaacctgcagcgccgcctgtgggagaacttcgacctgagcgacatccgcgtgcgctgc3060

gaccgccgcgagggcaaggacggcaccgtggtgctgatcccccgcctgaccaaccagcgc3120

gtgaaggagcgctacagcggcgtgatcttcaccagcgaggacggcgtgagcttcaccgtg3180

ggcgacgccaagacccgccgccgcagcagcgccagccagggcgagggcgacgacctgagc3240

gacgaggagcaggagctgctggccgaggccgacgacgcccgcgagcgcagcgtggtgctg3300

ttccgcgaccccagcggcttcgtgaacggcggccgctggaccgcccagcgcgccttctgg3360

ggcatggtgcacaaccgcatcgagaccctgctggccgagcgcttcagcgtgagcggcgcc3420

gccgagaaggtgcgcggc3438

<210> 14

<211> 3324

<212> DNA

<213> Bacillus hisashii

<400> 14

atggccacccgcagcttcatcctgaagatcgagcccaacgaggaggtgaagaagggcctg60

tggaagacccacgaggtgctgaaccacggcatcgcctactacatgaacatcctgaagctg120

atccgccaggaggccatctacgagcaccacgagcaggaccccaagaaccccaagaaggtg180

agcaaggccgagatccaggccgagctgtgggacttcgtgctgaagatgcagaagtgcaac240

agcttcacccacgaggtggacaaggacgaggtgttcaacatcctgcgcgagctgtacgag300

gagctggtgcccagcagcgtggagaagaagggcgaggccaaccagctgagcaacaagttc360

ctgtaccccctggtggaccccaacagccagagcggcaagggcaccgccagcagcggccgc420

aagccccgctggtacaacctgaagatcgccggcgaccccagctgggaggaggagaagaag480

aagtgggaggaggacaagaagaaggaccccctggccaagatcctgggcaagctggccgag540

tacggcctgatccccctgttcatcccctacaccgacagcaacgagcccatcgtgaaggag600

atcaagtggatggagaagagccgcaaccagagcgtgcgccgcctggacaaggacatgttc660

atccaggccctggagcgcttcctgagctgggagagctggaacctgaaggtgaaggaggag720

tacgagaaggtggagaaggagtacaagaccctggaggagcgcatcaaggaggacatccag780

gccctgaaggccctggagcagtacgagaaggagcgccaggagcagctgctgcgcgacacc840

ctgaacaccaacgagtaccgcctgagcaagcgcggcctgcgcggctggcgcgagatcatc900

cagaagtggctgaagatggacgagaacgagcccagcgagaagtacctggaggtgttcaag960

gactaccagcgcaagcacccccgcgaggccggcgactacagcgtgtacgagttcctgagc1020

aagaaggagaaccacttcatctggcgcaaccaccccgagtacccctacctgtacgccacc1080

ttctgcgagatcgacaagaagaagaaggacgccaagcagcaggccaccttcaccctggcc1140

gaccccatcaaccaccccctgtgggtgcgcttcgaggagcgcagcggcagcaacctgaac1200

aagtaccgcatcctgaccgagcagctgcacaccgagaagctgaagaagaagctgaccgtg1260

cagctggaccgcctgatctaccccaccgagagcggcggctgggaggagaagggcaaggtg1320

gacatcgtgctgctgcccagccgccagttctacaaccagatcttcctggacatcgaggag1380

aagggcaagcacgccttcacctacaaggacgagagcatcaagttccccctgaagggcacc1440

ctgggcggcgcccgcgtgcagttcgaccgcgaccacctgcgccgctacccccacaaggtg1500

gagagcggcaacgtgggccgcatctacttcaacatgaccgtgaacatcgagcccaccgag1560

agccccgtgagcaagagcctgaagatccaccgcgacgacttccccaaggtggtgaacttc1620

aagcccaaggagctgaccgagtggatcaaggacagcaagggcaagaagctgaagagcggc1680

atcgagagcctggagatcggcctgcgcgtgatgagcatcgacctgggccagcgccaggcc1740

gccgccgccagcatcttcgaggtggtggaccagaagcccgacatcgagggcaagctgttc1800

ttccccatcaagggcaccgagctgtacgccgtgcaccgcgccagcttcaacatcaagctg1860

cccggcgagaccctggtgaagagccgcgaggtgctgcgcaaggcccgcgaggacaacctg1920

aagctgatgaaccagaagctgaacttcctgcgcaacgtgctgcacttccagcagttcgag1980

gacatcaccgagcgcgagaagcgcgtgaccaagtggatcagccgccaggagaacagcgac2040

gtgcccctggtgtaccaggacgagctgatccagatccgcgagctgatgtacaagccctac2100

aaggactgggtggccttcctgaagcagctgcacaagcgcctggaggtggagatcggcaag2160

gaggtgaagcactggcgcaagagcctgagcgacggccgcaagggcctgtacggcatcagc2220

ctgaagaacatcgacgagatcgaccgcacccgcaagttcctgctgcgctggagcctgcgc2280

cccaccgagcccggcgaggtgcgccgcctggagcccggccagcgcttcgccatcgaccag2340

ctgaaccacctgaacgccctgaaggaggaccgcctgaagaagatggccaacaccatcatc2400

atgcacgccctgggctactgctacgacgtgcgcaagaagaagtggcaggccaagaacccc2460

gcctgccagatcatcctgttcgaggacctgagcaactacaacccctacgaggagcgcagc2520

cgcttcgagaacagcaagctgatgaagtggagccgccgcgagatcccccgccaggtggcc2580

ctgcagggcgagatctacggcctgcaggtgggcgaggtgggcgcccagttcagcagccgc2640

ttccacgccaagaccggcagccccggcatccgctgcagcgtggtgaccaaggagaagctg2700

caggacaaccgcttcttcaagaacctgcagcgcgagggccgcctgaccctggacaagatc2760

gccgtgctgaaggagggcgacctgtaccccgacaagggcggcgagaagttcatcagcctg2820

agcaaggaccgcaagtgcgtgaccacccacgccgacatcaacgccgcccagaacctgcag2880

aagcgcttctggacccgcacccacggcttctacaaggtgtactgcaaggcctaccaggtg2940

gacggccagaccgtgtacatccccgagagcaaggaccagaagcagaagatcatcgaggag3000

ttcggcgagggctacttcatcctgaaggacggcgtgtacgagtgggtgaacgccggcaag3060

ctgaagatcaagaagggcagcagcaagcagagcagcagcgagctggtggacagcgacatc3120

ctgaaggacagcttcgacctggccagcgagctgaagggcgagaagctgatgctgtaccgc3180

gaccccagcggcaacgtgttccccagcgacaagtggatggccgccggcgtgttcttcggc3240

aagctggagcgcatcctgatcagcaagctgaccaaccagtacagcatcagcaccatcgag3300

gacgacagcagcaagcagagcatg3324

<210> 15

<211> 3324

<212> DNA

<213> Bacillus

<400> 15

atggccatccgcagcatcaagctgaagctgaagacccacaccggccccgaggcccagaac60

ctgcgcaagggcatctggcgcacccaccgcctgctgaacgagggcgtggcctactacatg120

aagatgctgctgctgttccgccaggagagcaccggcgagcgccccaaggaggagctgcag180

gaggagctgatctgccacatccgcgagcagcagcagcgcaaccaggccgacaagaacacc240

caggccctgcccctggacaaggccctggaggccctgcgccagctgtacgagctgctggtg300

cccagcagcgtgggccagagcggcgacgcccagatcatcagccgcaagttcctgagcccc360

ctggtggaccccaacagcgagggcggcaagggcaccagcaaggccggcgccaagcccacc420

tggcagaagaagaaggaggccaacgaccccacctgggagcaggactacgagaagtggaag480

aagcgccgcgaggaggaccccaccgccagcgtgatcaccaccctggaggagtacggcatc540

cgccccatcttccccctgtacaccaacaccgtgaccgacatcgcctggctgcccctgcag600

agcaaccagttcgtgcgcacctgggaccgcgacatgctgcagcaggccatcgagcgcctg660

ctgagctgggagagctggaacaagcgcgtgcaggaggagtacgccaagctgaaggagaag720

atggcccagctgaacgagcagctggagggcggccaggagtggatcagcctgctggagcag780

tacgaggagaaccgcgagcgcgagctgcgcgagaacatgaccgccgccaacgacaagtac840

cgcatcaccaagcgccagatgaagggctggaacgagctgtacgagctgtggagcaccttc900

cccgccagcgccagccacgagcagtacaaggaggccctgaagcgcgtgcagcagcgcctg960

cgcggccgcttcggcgacgcccacttcttccagtacctgatggaggagaagaaccgcctg1020

atctggaagggcaacccccagcgcatccactacttcgtggcccgcaacgagctgaccaag1080

cgcctggaggaggccaagcagagcgccaccatgaccctgcccaacgcccgcaagcacccc1140

ctgtgggtgcgcttcgacgcccgcggcggcaacctgcaggactactacctgaccgccgag1200

gccgacaagccccgcagccgccgcttcgtgaccttcagccagctgatctggcccagcgag1260

agcggctggatggagaagaaggacgtggaggtggagctggccctgagccgccagttctac1320

cagcaggtgaagctgctgaagaacgacaagggcaagcagaagatcgagttcaaggacaag1380

ggcagcggcagcaccttcaacggccacctgggcggcgccaagctgcagctggagcgcggc1440

gacctggagaaggaggagaagaacttcgaggacggcgagatcggcagcgtgtacctgaac1500

gtggtgatcgacttcgagcccctgcaggaggtgaagaacggccgcgtgcaggccccctac1560

ggccaggtgctgcagctgatccgccgccccaacgagttccccaaggtgaccacctacaag1620

agcgagcagctggtggagtggatcaaggccagcccccagcacagcgccggcgtggagagc1680

ctggccagcggcttccgcgtgatgagcatcgacctgggcctgcgcgccgccgccgccacc1740

agcatcttcagcgtggaggagagcagcgacaagaacgccgccgacttcagctactggatc1800

gagggcacccccctggtggccgtgcaccagcgcagctacatgctgcgcctgcccggcgag1860

caggtggagaagcaggtgatggagaagcgcgacgagcgcttccagctgcaccagcgcgtg1920

aagttccagatccgcgtgctggcccagatcatgcgcatggccaacaagcagtacggcgac1980

cgctgggacgagctggacagcctgaagcaggccgtggagcagaagaagagccccctggac2040

cagaccgaccgcaccttctgggagggcatcgtgtgcgacctgaccaaggtgctgccccgc2100

aacgaggccgactgggagcaggccgtggtgcagatccaccgcaaggccgaggagtacgtg2160

ggcaaggccgtgcaggcctggcgcaagcgcttcgccgccgacgagcgcaagggcatcgcc2220

ggcctgagcatgtggaacatcgaggagctggagggcctgcgcaagctgctgatcagctgg2280

agccgccgcacccgcaacccccaggaggtgaaccgcttcgagcgcggccacaccagccac2340

cagcgcctgctgacccacatccagaacgtgaaggaggaccgcctgaagcagctgagccac2400

gccatcgtgatgaccgccctgggctacgtgtacgacgagcgcaagcaggagtggtgcgcc2460

gagtaccccgcctgccaggtgatcctgttcgagaacctgagccagtaccgcagcaacctg2520

gaccgcagcaccaaggagaacagcaccctgatgaagtgggcccaccgcagcatccccaag2580

tacgtgcacatgcaggccgagccctacggcatccagatcggcgacgtgcgcgccgagtac2640

agcagccgcttctacgccaagaccggcacccccggcatccgctgcaagaaggtgcgcggc2700

caggacctgcagggccgccgcttcgagaacctgcagaagcgcctggtgaacgagcagttc2760

ctgaccgaggagcaggtgaagcagctgcgccccggcgacatcgtgcccgacgacagcggc2820

gagctgttcatgaccctgaccgacggcagcggcagcaaggaggtggtgttcctgcaggcc2880

gacatcaacgccgcccacaacctgcagaagcgcttctggcagcgctacaacgagctgttc2940

aaggtgagctgccgcgtgatcgtgcgcgacgaggaggagtacctggtgcccaagaccaag3000

agcgtgcaggccaagctgggcaagggcctgttcgtgaagaagagcgacaccgcctggaag3060

gacgtgtacgtgtgggacagccaggccaagctgaagggcaagaccaccttcaccgaggag3120

agcgagagccccgagcagctggaggacttccaggagatcatcgaggaggccgaggaggcc3180

aagggcacctaccgcaccctgttccgcgaccccagcggcgtgttcttccccgagagcgtg3240

tggtacccccagaaggacttctggggcgaggtgaagcgcaagctgtacggcaagctgcgc3300

gagcgcttcctgaccaaggcccgc3324

<210> 16

<211> 3336

<212> DNA

<213> Bacillus

<400> 16

atggccatccgcagcatcaagctgaagatgaagaccaacagcggcaccgacagcatctac60

ctgcgcaaggccctgtggcgcacccaccagctgatcaacgagggcatcgcctactacatg120

aacctgctgaccctgtaccgccaggaggccatcggcgacaagaccaaggaggcctaccag180

gccgagctgatcaacatcatccgcaaccagcagcgcaacaacggcagcagcgaggagcac240

ggcagcgaccaggagatcctggccctgctgcgccagctgtacgagctgatcatccccagc300

agcatcggcgagagcggcgacgccaaccagctgggcaacaagttcctgtaccccctggtg360

gaccccaacagccagagcggcaagggcaccagcaacgccggccgcaagccccgctggaag420

cgcctgaaggaggagggcaaccccgactgggagctggagaagaagaaggacgaggagcgc480

aaggccaaggaccccaccgtgaagatcttcgacaacctgaacaagtacggcctgctgccc540

ctgttccccctgttcaccaacatccagaaggacatcgagtggctgcccctgggcaagcgc600

cagagcgtgcgcaagtgggacaaggacatgttcatccaggccatcgagcgcctgctgagc660

tgggagagctggaaccgccgcgtggccgacgagtacaagcagctgaaggagaagaccgag720

agctactacaaggagcacctgaccggcggcgaggagtggatcgagaagatccgcaagttc780

gagaaggagcgcaacatggagctggagaagaacgccttcgcccccaacgacggctacttc840

atcaccagccgccagatccgcggctgggaccgcgtgtacgagaagtggagcaagctgccc900

gagagcgccagccccgaggagctgtggaaggtggtggccgagcagcagaacaagatgagc960

gagggcttcggcgaccccaaggtgttcagcttcctggccaaccgcgagaaccgcgacatc1020

tggcgcggccacagcgagcgcatctaccacatcgccgcctacaacggcctgcagaagaag1080

ctgagccgcaccaaggagcaggccaccttcaccctgcccgacgccatcgagcaccccctg1140

tggatccgctacgagagccccggcggcaccaacctgaacctgttcaagctggaggagaag1200

cagaagaagaactactacgtgaccctgagcaagatcatctggcccagcgaggagaagtgg1260

atcgagaaggagaacatcgagatccccctggcccccagcatccagttcaaccgccagatc1320

aagctgaagcagcacgtgaagggcaagcaggagatcagcttcagcgactacagcagccgc1380

atcagcctggacggcgtgctgggcggcagccgcatccagttcaaccgcaagtacatcaag1440

aaccacaaggagctgctgggcgagggcgacatcggccccgtgttcttcaacctggtggtg1500

gacgtggcccccctgcaggagacccgcaacggccgcctgcagagccccatcggcaaggcc1560

ctgaaggtgatcagcagcgacttcagcaaggtgatcgactacaagcccaaggagctgatg1620

gactggatgaacaccggcagcgccagcaacagcttcggcgtggccagcctgctggagggc1680

atgcgcgtgatgagcatcgacatgggccagcgcaccagcgccagcgtgagcatcttcgag1740

gtggtgaaggagctgcccaaggaccaggagcagaagctgttctacagcatcaacgacacc1800

gagctgttcgccatccacaagcgcagcttcctgctgaacctgcccggcgaggtggtgacc1860

aagaacaacaagcagcagcgccaggagcgccgcaagaagcgccagttcgtgcgcagccag1920

atccgcatgctggccaacgtgctgcgcctggagaccaagaagacccccgacgagcgcaag1980

aaggccatccacaagctgatggagatcgtgcagagctacgacagctggaccgccagccag2040

aaggaggtgtgggagaaggagctgaacctgctgaccaacatggccgccttcaacgacgag2100

atctggaaggagagcctggtggagctgcaccaccgcatcgagccctacgtgggccagatc2160

gtgagcaagtggcgcaagggcctgagcgagggccgcaagaacctggccggcatcagcatg2220

tggaacatcgacgagctggaggacacccgccgcctgctgatcagctggagcaagcgcagc2280

cgcacccccggcgaggccaaccgcatcgagaccgacgagcccttcggcagcagcctgctg2340

cagcacatccagaacgtgaaggacgaccgcctgaagcagatggccaacctgatcatcatg2400

accgccctgggcttcaagtacgacaaggaggagaaggaccgctacaagcgctggaaggag2460

acctaccccgcctgccagatcatcctgttcgagaacctgaaccgctacctgttcaacctg2520

gaccgcagccgccgcgagaacagccgcctgatgaagtgggcccaccgcagcatcccccgc2580

accgtgagcatgcagggcgagatgttcggcctgcaggtgggcgacgtgcgcagcgagtac2640

agcagccgcttccacgccaagaccggcgcccccggcatccgctgccacgccctgaccgag2700

gaggacctgaaggccggcagcaacaccctgaagcgcctgatcgaggacggcttcatcaac2760

gagagcgagctggcctacctgaagaagggcgacatcatccccagccagggcggcgagctg2820

ttcgtgaccctgagcaagcgctacaagaaggacagcgacaacaacgagctgaccgtgatc2880

cacgccgacatcaacgccgcccagaacctgcagaagcgcttctggcagcagaacagcgag2940

gtgtaccgcgtgccctgccagctggcccgcatgggcgaggacaagctgtacatccccaag3000

agccagaccgagaccatcaagaagtacttcggcaagggcagcttcgtgaagaacaacacc3060

gagcaggaggtgtacaagtgggagaagagcgagaagatgaagatcaagaccgacaccacc3120

ttcgacctgcaggacctggacggcttcgaggacatcagcaagaccatcgagctggcccag3180

gagcagcagaagaagtacctgaccatgttccgcgaccccagcggctacttcttcaacaac3240

gagacctggcgcccccagaaggagtactggagcatcgtgaacaacatcatcaagagctgc3300

ctgaagaagaagatcctgagcaacaaggtggagctg3336

<210> 17

<211> 3447

<212> DNA

<213> Desulfovibrio inopinatus

<400> 17

atgcccacccgcaccatcaacctgaagctggtgctgggcaagaaccccgagaacgccacc60

ctgcgccgcgccctgttcagcacccaccgcctggtgaaccaggccaccaagcgcatcgag120

gagttcctgctgctgtgccgcggcgaggcctaccgcaccgtggacaacgagggcaaggag180

gccgagatcccccgccacgccgtgcaggaggaggccctggccttcgccaaggccgcccag240

cgccacaacggctgcatcagcacctacgaggaccaggagatcctggacgtgctgcgccag300

ctgtacgagcgcctggtgcccagcgtgaacgagaacaacgaggccggcgacgcccaggcc360

gccaacgcctgggtgagccccctgatgagcgccgagagcgagggcggcctgagcgtgtac420

gacaaggtgctggaccccccccccgtgtggatgaagctgaaggaggagaaggcccccggc480

tgggaggccgccagccagatctggatccagagcgacgagggccagagcctgctgaacaag540

cccggcagccccccccgctggatccgcaagctgcgcagcggccagccctggcaggacgac600

ttcgtgagcgaccagaagaagaagcaggacgagctgaccaagggcaacgcccccctgatc660

aagcagctgaaggagatgggcctgctgcccctggtgaaccccttcttccgccacctgctg720

gaccccgagggcaagggcgtgagcccctgggaccgcctggccgtgcgcgccgccgtggcc780

cacttcatcagctgggagagctggaaccaccgcacccgcgccgagtacaacagcctgaag840

ctgcgccgcgacgagttcgaggccgccagcgacgagttcaaggacgacttcaccctgctg900

cgccagtacgaggccaagcgccacagcaccctgaagagcatcgccctggccgacgacagc960

aacccctaccgcatcggcgtgcgcagcctgcgcgcctggaaccgcgtgcgcgaggagtgg1020

atcgacaagggcgccaccgaggagcagcgcgtgaccatcctgagcaagctgcagacccag1080

ctgcgcggcaagttcggcgaccccgacctgttcaactggctggcccaggaccgccacgtg1140

cacctgtggagcccccgcgacagcgtgacccccctggtgcgcatcaacgccgtggacaag1200

gtgctgcgccgccgcaagccctacgccctgatgaccttcgcccacccccgcttccacccc1260

cgctggatcctgtacgaggcccccggcggcagcaacctgcgccagtacgccctggactgc1320

accgagaacgccctgcacatcaccctgcccctgctggtggacgacgcccacggcacctgg1380

atcgagaagaagatccgcgtgcccctggcccccagcggccagatccaggacctgaccctg1440

gagaagctggagaagaagaagaaccgcctgtactaccgcagcggcttccagcagttcgcc1500

ggcctggccggcggcgccgaggtgctgttccaccgcccctacatggagcacgacgagcgc1560

agcgaggagagcctgctggagcgccccggcgccgtgtggttcaagctgaccctggacgtg1620

gccacccaggccccccccaactggctggacggcaagggccgcgtgcgcaccccccccgag1680

gtgcaccacttcaagaccgccctgagcaacaagagcaagcacacccgcaccctgcagccc1740

ggcctgcgcgtgctgagcgtggacctgggcatgcgcaccttcgccagctgcagcgtgttc1800

gagctgatcgagggcaagcccgagaccggccgcgccttccccgtggccgacgagcgcagc1860

atggacagccccaacaagctgtgggccaagcacgagcgcagcttcaagctgaccctgccc1920

ggcgagacccccagccgcaaggaggaggaggagcgcagcatcgcccgcgccgagatctac1980

gccctgaagcgcgacatccagcgcctgaagagcctgctgcgcctgggcgaggaggacaac2040

gacaaccgccgcgacgccctgctggagcagttcttcaagggctggggcgaggaggacgtg2100

gtgcccggccaggccttcccccgcagcctgttccagggcctgggcgccgcccccttccgc2160

agcacccccgagctgtggcgccagcactgccagacctactacgacaaggccgaggcctgc2220

ctggccaagcacatcagcgactggcgcaagcgcacccgcccccgccccaccagccgcgag2280

atgtggtacaagacccgcagctaccacggcggcaagagcatctggatgctggagtacctg2340

gacgccgtgcgcaagctgctgctgagctggagcctgcgcggccgcacctacggcgccatc2400

aaccgccaggacaccgcccgcttcggcagcctggccagccgcctgctgcaccacatcaac2460

agcctgaaggaggaccgcatcaagaccggcgccgacagcatcgtgcaggccgcccgcggc2520

tacatccccctgccccacggcaagggctgggagcagcgctacgagccctgccagctgatc2580

ctgttcgaggacctggcccgctaccgcttccgcgtggaccgcccccgccgcgagaacagc2640

cagctgatgcagtggaaccaccgcgccatcgtggccgagaccaccatgcaggccgagctg2700

tacggccagatcgtggagaacaccgccgccggcttcagcagccgcttccacgccgccacc2760

ggcgcccccggcgtgcgctgccgcttcctgctggagcgcgacttcgacaacgacctgccc2820

aagccctacctgctgcgcgagctgagctggatgctgggcaacaccaaggtggagagcgag2880

gaggagaagctgcgcctgctgagcgagaagatccgccccggcagcctggtgccctgggac2940

ggcggcgagcagttcgccaccctgcaccccaagcgccagaccctgtgcgtgatccacgcc3000

gacatgaacgccgcccagaacctgcagcgccgcttcttcggccgctgcggcgaggccttc3060

cgcctggtgtgccagccccacggcgacgacgtgctgcgcctggccagcacccccggcgcc3120

cgcctgctgggcgccctgcagcagctggagaacggccagggcgccttcgagctggtgcgc3180

gacatgggcagcaccagccagatgaaccgcttcgtgatgaagagcctgggcaagaagaag3240

atcaagcccctgcaggacaacaacggcgacgacgagctggaggacgtgctgagcgtgctg3300

cccgaggaggacgacaccggccgcatcaccgtgttccgcgacagcagcggcatcttcttc3360

ccctgcaacgtgtggatccccgccaagcagttctggcccgccgtgcgcgccatgatctgg3420

aaggtgatggccagccacagcctgggc3447

<210> 18

<211> 3270

<212> DNA

<213> Laceyella sediminis

<400> 18

atgagcatccgcagcttcaagctgaagatcaagaccaagagcggcgtgaacgccgaggag60

ctgcgccgcggcctgtggcgcacccaccagctgatcaacgacggcatcgcctactacatg120

aactggctggtgctgctgcgccaggaggacctgttcatccgcaacgaggagaccaacgag180

atcgagaagcgcagcaaggaggagatccagggcgagctgctggagcgcgtgcacaagcag240

cagcagcgcaaccagtggagcggcgaggtggacgaccagaccctgctgcagaccctgcgc300

cacctgtacgaggagatcgtgcccagcgtgatcggcaagagcggcaacgccagcctgaag360

gcccgcttcttcctgggccccctggtggaccccaacaacaagaccaccaaggacgtgagc420

aagagcggccccacccccaagtggaagaagatgaaggacgccggcgaccccaactgggtg480

caggagtacgagaagtacatggccgagcgccagaccctggtgcgcctggaggagatgggc540

ctgatccccctgttccccatgtacaccgacgaggtgggcgacatccactggctgccccag600

gccagcggctacacccgcacctgggaccgcgacatgttccagcaggccatcgagcgcctg660

ctgagctgggagagctggaaccgccgcgtgcgcgagcgccgcgcccagttcgagaagaag720

acccacgacttcgccagccgcttcagcgagagcgacgtgcagtggatgaacaagctgcgc780

gagtacgaggcccagcaggagaagagcctggaggagaacgccttcgcccccaacgagccc840

tacgccctgaccaagaaggccctgcgcggctgggagcgcgtgtaccacagctggatgcgc900

ctggacagcgccgccagcgaggaggcctactggcaggaggtggccacctgccagaccgcc960

atgcgcggcgagttcggcgaccccgccatctaccagttcctggcccagaaggagaaccac1020

gacatctggcgcggctaccccgagcgcgtgatcgacttcgccgagctgaaccacctgcag1080

cgcgagctgcgccgcgccaaggaggacgccaccttcaccctgcccgacagcgtggaccac1140

cccctgtgggtgcgctacgaggcccccggcggcaccaacatccacggctacgacctggtg1200

caggacaccaagcgcaacctgaccctgatcctggacaagttcatcctgcccgacgagaac1260

ggcagctggcacgaggtgaagaaggtgcccttcagcctggccaagagcaagcagttccac1320

cgccaggtgtggctgcaggaggagcagaagcagaagaagcgcgaggtggtgttctacgac1380

tacagcaccaacctgccccacctgggcaccctggccggcgccaagctgcagtgggaccgc1440

aacttcctgaacaagcgcacccagcagcagatcgaggagaccggcgagatcggcaaggtg1500

ttcttcaacatcagcgtggacgtgcgccccgccgtggaggtgaagaacggccgcctgcag1560

aacggcctgggcaaggccctgaccgtgctgacccaccccgacggcaccaagatcgtgacc1620

ggctggaaggccgagcagctggagaagtgggtgggcgagagcggccgcgtgagcagcctg1680

ggcctggacagcctgagcgagggcctgcgcgtgatgagcatcgacctgggccagcgcacc1740

agcgccaccgtgagcgtgttcgagatcaccaaggaggcccccgacaacccctacaagttc1800

ttctaccagctggagggcaccgagctgttcgccgtgcaccagcgcagcttcctgctggcc1860

ctgcccggcgagaaccccccccagaagatcaagcagatgcgcgagatccgctggaaggag1920

cgcaaccgcatcaagcagcaggtggaccagctgagcgccatcctgcgcctgcacaagaag1980

gtgaacgaggacgagcgcatccaggccatcgacaagctgctgcagaaggtggccagctgg2040

cagctgaacgaggagatcgccaccgcctggaaccaggccctgagccagctgtacagcaag2100

gccaaggagaacgacctgcagtggaaccaggccatcaagaacgcccaccaccagctggag2160

cccgtggtgggcaagcagatcagcctgtggcgcaaggacctgagcaccggccgccagggc2220

atcgccggcctgagcctgtggagcatcgaggagctggaggccaccaagaagctgctgacc2280

cgctggagcaagcgcagccgcgagcccggcgtggtgaagcgcatcgagcgcttcgagacc2340

ttcgccaagcagatccagcaccacatcaaccaggtgaaggagaaccgcctgaagcagctg2400

gccaacctgatcgtgatgaccgccctgggctacaagtacgaccaggagcagaagaagtgg2460

atcgaggtgtaccccgcctgccaggtggtgctgttcgagaacctgcgcagctaccgcttc2520

agctacgagcgcagccgccgcgagaacaagaagctgatggagtggagccaccgcagcatc2580

cccaagctggtgcagatgcagggcgagctgttcggcctgcaggtggccgacgtgtacgcc2640

gcctacagcagccgctaccacggccgcaccggcgcccccggcatccgctgccacgccctg2700

accgaggccgacctgcgcaacgagaccaacatcatccacgagctgatcgaggccggcttc2760

atcaaggaggagcaccgcccctacctgcagcagggcgacctggtgccctggagcggcggc2820

gagctgttcgccaccctgcagaagccctacgacaacccccgcatcctgaccctgcacgcc2880

gacatcaacgccgcccagaacatccagaagcgcttctggcaccccagcatgtggttccgc2940

gtgaactgcgagagcgtgatggagggcgagatcgtgacctacgtgcccaagaacaagacc3000

gtgcacaagaagcagggcaagaccttccgcttcgtgaaggtggagggcagcgacgtgtac3060

gagtgggccaagtggagcaagaaccgcaacaagaacaccttcagcagcatcaccgagcgc3120

aagccccccagcagcatgatcctgttccgcgaccccagcggcaccttcttcaaggagcag3180

gagtgggtggagcagaagaccttctggggcaaggtgcagagcatgatccaggcctacatg3240

aagaagaccatcgtgcagcgcatggaggag3270

<210>19

<211>3357

<212>dna

<213>spirochaetes

<400>19

atgagcttcaccatcagctaccccttcaagctgatcatcaagaacaaggacgaggccaag60

gccctgctggacacccaccagtacatgaacgagggcgtgaagtactacctggagaagctg120

ctgatgttccgccaggagaagatcttcatcggcgaggacgagaccggcaagcgcatctac180

atcgaggagaccgagtacaagaagcagatcgaggagttctacctgatcaagaagaccgag240

ctgggccgcaacctgaccctgaccctggacgagttcaagaccctgatgcgcgagctgtac300

atctgcctggtgagcagcagcatggagaacaagaagggcttccccaacgcccagcaggcc360

agcctgaacatcttcagccccctgttcgacgccgagagcaagggctacatcctgaaggag420

gagaacaacaacatcagcctgatccacaaggactacggcaagatcctgctgaagcgcctg480

cgcgacaacaacctgatccccatcttcaccaagttcaccgacatcaagaagatcaccgcc540

aagctgagccccaccgccctggaccgcatgatcttcgcccaggccatcgagaagctgctg600

agctacgagagctggtgcaagctgatgatcaaggagcgcttcgacaaggaggtgaagatc660

aaggagctggagaacaagtgcgagaacaagcaggagcgcgacaagatcttcgagatcctg720

gagaagtacgaggaggagcgccagaagaccttcgagcaggacagcggcttcgccaagaag780

ggcaagttctacatcaccggccgcatgctgaagggcttcgacgagatcaaggagaagtgg840

ctgaaggagaaggaccgcagcgagcagaacctgatcaacatcctgaacaagtaccagacc900

gacaacagcaagctggtgggcgaccgcaacctgttcgagttcatcatcaagctggagaac960

cagtgcctgtggaacggcgacatcgactacctgaagatcaagcgcgacatcaacaagaac1020

cagatctggctggaccgccccgagatgccccgcttcaccatgcccgacttcaagaagcac1080

cccctgtggtaccgctacgaggaccccagcaacagcaacttccgcaactacaagatcgag1140

gtggtgaaggacgagaactacatcaccatccccctgatcaccgagcgcaacaacgagtac1200

ttcgaggagaactacaccttcaacctggccaagctgaagaagctgagcgagaacatcacc1260

ttcatccccaagagcaagaacaaggagttcgagttcatcgacagcaacgacgaggaggag1320

gacaagaaggaccagaagaagagcaagcagtacatcaagtactgcgacaccgccaagaac1380

accagctacggcaagagcggcggcatccgcctgtacttcaaccgcaacgagctggagaac1440

tacaaggacggcaagaagatggacagctacaccgtgttcaccctgagcatccgcgactac1500

aagagcctgttcgccaaggagaagctgcagccccagatcttcaacaccgtggacaacaag1560

atcaccagcctgaagatccagaagaagttcggcaacgaggagcagaccaacttcctgagc1620

tacttcacccagaaccagatcaccaagaaggactggatggacgagaagaccttccagaac1680

gtgaaggagctgaacgagggcatccgcgtgctgagcgtggacctgggccagcgcttcttc1740

gccgccgtgagctgcttcgagatcatgagcgagatcgacaacaacaagctgttcttcaac1800

ctgaacgaccagaaccacaagatcatccgcatcaacgacaagaactactacgccaagcac1860

atctacagcaagaccatcaagctgagcggcgaggacgacgacctgtacaaggagcgcaag1920

atcaacaagaactacaagctgagctaccaggagcgcaagaacaagatcggcatcttcacc1980

cgccagatcaacaagctgaaccagctgctgaagatcatccgcaacgacgagatcgacaag2040

gagaagttcaaggagctgatcgagaccaccaagcgctacgtgaagaacacctacaacgac2100

ggcatcatcgactggaacaacgtggacaacaagatcctgagctacgagaacaaggaggac2160

gtgatcaacctgcacaaggagctggacaagaagctggagatcgacttcaaggagttcatc2220

cgcgagtgccgcaagcccatcttccgcagcggcggcctgagcatgcagcgcatcgacttc2280

ctggagaagctgaacaagctgaagcgcaagtgggtggcccgcacccagaagagcgccgag2340

agcatcgtgctgacccccaagttcggctacaagctgaaggagcacatcaacgagctgaag2400

gacaaccgcgtgaagcagggcgtgaactacatcctgatgaccgccctgggctacatcaag2460

gacaacgagatcaagaacgacagcaagaagaagcagaaggaggactgggtgaagaagaac2520

cgcgcctgccagatcatcctgatggagaagctgaccgagtacaccttcgccgaggaccgc2580

ccccgcgaggagaacagcaagctgcgcatgtggagccaccgccagatcttcaacttcctg2640

cagcagaaggccagcctgtggggcatcctggtgggcgacgtgttcgccccctacaccagc2700

aagtgcctgagcgacaacaacgcccccggcatccgctgccaccaggtgaccaagaaggac2760

ctgatcgacaacagctggttcctgaagatcgtggtgaaggacgacgccttctgcgacctg2820

atcgagatcaacaaggagaacgtgaagaacaagagcatcaagatcaacgacatcctgccc2880

ctgcgcggcggcgagctgttcgccagcatcaaggacggcaagctgcacatcgtgcaggcc2940

gacatcaacgccagccgcaacatcgccaagcgcttcctgagccagatcaaccccttccgc3000

gtggtgctgaagaaggacaaggacgagaccttccacctgaagaacgagcccaactacctg3060

aagaactactacagcatcctgaacttcgtgcccaccaacgaggagctgaccttcttcaag3120

gtggaggagaacaaggacatcaagcccaccaagcgcatcaagatggacaagcacgagaag3180

gagagcaccgacgagggcgacgactacagcaagaaccagatcgccctgttccgcgacgac3240

agcggcatcttcttcgacaagagcctgtgggtggacggcaagatcttctggagcgtggtg3300

aagaacaagatgaccaagctgctgcgcgagcgcaacaacaagaagaacggcagcaag3357

<210>20

<211>3426

<212>dna

<213>tuberibacilluscalidus

<400>20

atgaacatccacctgaaggagctgatccgcatggccaccaagagcttcatcctgaagatg60

aagaccaagaacaacccccagctgcgcctgagcctgtggaagacccacgagctgttcaac120

ttcggcgtggcctactacatggacctgctgagcctgttccgccagaaggacctgtacatg180

cacaacgacgaggaccccgaccaccccgtggtgctgaagaaggaggagatccaggagcgc240

ctgtggatgaaggtgcgcgagacccagcagaagaacggcttccacggcgaggtgagcaag300

gacgaggtgctggagaccctgcgcgccctgtacgaggagctggtgcccagcgccgtgggc360

aagagcggcgaggccaaccagatcagcaacaagtacctgtaccccctgaccgaccccgcc420

agccagagcggcaagggcaccgccaacagcggccgcaagccccgctggaagaagctgaag480

gaggccggcgaccccagctggaaggacgcctacgagaagtgggagaaggagcgccaggag540

gaccccaagctgaagatcctggccgccctgcagagcttcggcctgatccccctgttccgc600

cccttcaccgagaacgaccacaaggccgtgatcagcgtgaagtggatgcccaagagcaag660

aaccagagcgtgcgcaagttcgacaaggacatgttcaaccaggccatcgagcgcttcctg720

agctgggagagctggaacgagaaggtggccgaggactacgagaagaccgtgagcatctac780

gagagcctgcagaaggagctgaagggcatcagcaccaaggccttcgagatcatggagcgc840

gtggagaaggcctacgaggcccacctgcgcgagatcaccttcagcaacagcacctaccgc900

atcggcaaccgcgccatccgcggctggaccgagatcgtgaagaagtggatgaagctggac960

cccagcgccccccagggcaactacctggacgtggtgaaggactaccagcgccgccacccc1020

cgcgagagcggcgacttcaagctgttcgagctgctgagccgccccgagaaccaggccgcc1080

tggcgcgagtaccccgagttcctgcccctgtacgtgaagtaccgccacgccgagcagcgc1140

atgaagaccgccaagaagcaggccaccttcaccctgtgcgaccccatccgccaccccctg1200

tgggtgcgctacgaggagcgcagcggcaccaacctgaacaagtaccgcctgatcatgaac1260

gagaaggagaaggtggtgcagttcgaccgcctgatctgcctgaacgccgacggccactac1320

gaggagcaggaggacgtgaccgtgcccctggcccccagccagcagttcgacgaccagatc1380

aagttcagcagcgaggacaccggcaagggcaagcacaacttcagctactaccacaagggc1440

atcaactacgagctgaagggcaccctgggcggcgcccgcatccagttcgaccgcgagcac1500

ctgctgcgccgccagggcgtgaaggccggcaacgtgggccgcatcttcctgaacgtgacc1560

ctgaacatcgagcccatgcagcccttcagccgcagcggcaacctgcagaccagcgtgggc1620

aaggccctgaaggtgtacgtggacggctaccccaaggtggtgaacttcaagcccaaggag1680

ctgaccgagcacatcaaggagagcgagaagaacaccctgaccctgggcgtggagagcctg1740

cccaccggcctgcgcgtgatgagcgtggacctgggccagcgccaggccgccgccatcagc1800

atcttcgaggtggtgagcgagaagcccgacgacaacaagctgttctaccccgtgaaggac1860

accgacctgttcgccgtgcaccgcaccagcttcaacatcaagctgcccggcgagaagcgc1920

accgagcgccgcatgctggagcagcagaagcgcgaccaggccatccgcgacctgagccgc1980

aagctgaagttcctgaagaacgtgctgaacatgcagaagctggagaagaccgacgagcgc2040

gagaagcgcgtgaaccgctggatcaaggaccgcgagcgcgaggaggagaaccccgtgtac2100

gtgcaggagttcgagatgatcagcaaggtgctgtacagcccccacagcgtgtgggtggac2160

cagctgaagagcatccaccgcaagctggaggagcagctgggcaaggagatcagcaagtgg2220

cgccagagcatcagccagggccgccagggcgtgtacggcatcagcctgaagaacatcgag2280

gacatcgagaagacccgccgcctgctgttccgctggagcatgcgccccgagaaccccggc2340

gaggtgaagcagctgcagcccggcgagcgcttcgccatcgaccagcagaaccacctgaac2400

cacctgaaggacgaccgcatcaagaagctggccaaccagatcgtgatgaccgccctgggc2460

taccgctacgacggcaagcgcaagaagtggatcgccaagcaccccgcctgccagctggtg2520

ctgttcgaggacctgagccgctacgccttctacgacgagcgcagccgcctggagaaccgc2580

aacctgatgcgctggagccgccgcgagatccccaagcaggtggcccagatcggcggcctg2640

tacggcctgctggtgggcgaggtgggcgcccagtacagcagccgcttccacgccaagagc2700

ggcgcccccggcatccgctgccgcgtggtgaaggagcacgagctgtacatcaccgagggc2760

ggccagaaggtgcgcaaccagaagttcctggacagcctggtggagaacaacatcatcgag2820

cccgacgacgcccgccgcctggagcccggcgacctgatccgcgaccagggcggcgacaag2880

ttcgccaccctggacgagcgcggcgagctggtgatcacccacgccgacatcaacgccgcc2940

cagaacctgcagaagcgcttctggacccgcacccacggcctgtaccgcatccgctgcgag3000

agccgcgagatcaaggacgccgtggtgctggtgcccagcgacaaggaccagaaggagaag3060

atggagaacctgttcggcatcggctacctgcagcccttcaagcaggagaacgacgtgtac3120

aagtgggtgaagggcgagaagatcaagggcaagaagaccagcagccagagcgacgacaag3180

gagctggtgagcgagatcctgcaggaggcgagcgtgatggccgacgagctgaagggcaac3240

cgcaagaccctgttccgcgaccccagcggctacgtgttccccaaggaccgctggtacacc3300

ggcggccgctacttcggcaccctggagcacctgctgaagcgcaagctggccgagcgccgc3360

ctgttcgacggcggcagcagccgccgcggcctgttcaacggcaccgacagcaacaccaac3420

gtggag3426

<210>21

<211>2870

<212>dna

<213>artificialsequence

<220>
