Process for the preparation of pravastatin

Field of the invention
The present invention relates to a method for producing pravastatin.

Background of the invention
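Statins are known inhibitors of 3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) reductase, the rate-limiting enzyme in cholesterol biosynthesis. As such, statins are able to lower plasma cholesterol levels in various mammalian species, including humans, and these compounds are therefore effective in the treatment of hypercholesterolemia. Several statins are on the market, including atorvastatin, pravastatin, compactin, lovastatin and simvastatin. While atorvastatin is made by chemical synthesis, the latter four are produced either by direct fermentation or by fermentation of a precursor. These (precursor) fermentations are carried out by fungi of the genera Penicillium, Aspergillus and Monascus.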
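Pravastatin is produced in two sequential fermentations. First, Penicillium citrinum produces compactin, the lactone ring of which is chemically hydrolyzed; the resulting product is then fed to a culture of Streptomyces carbophilus, which hydroxylates it to pravastatin. In the context of the present invention, the term "hydrolyzed compactin" refers to the non-lactone form of compactin, i.e. the form in which the lactone ring has been opened by reaction with water (Figure 1); likewise, the term "hydrolysis of compactin" refers to the opening of the lactone ring. Various methods have been used to optimize the industrial species and processes that produce these metabolites. In this way the compactin yield of Penicillium citrinum was raised from the original 40 mg/L to 5 g/L. For the biocatalytic conversion, Metkinen obtained a Streptomyces mutant strain that is resistant to 3 g/L mevastatin (another name for compactin) and gives a conversion of 80% (Metkinen News March 2000, Metkinen Oy, Finland; reviewed by Manzoni and Rollini, 2002, Appl Microbiol Biotechnol 58:555-564). Although commercially viable, the process is far from optimal, because compactin titers are relatively low compared with, for example, industrial amino acid or penicillin G production; in addition, the compactin must be diluted to prevent toxic effects on the Streptomyces strain used in the bioconversion (Hosobuchi et al., 1983, J Antibiotics 36:887-891), and 20% of the compactin feed is not converted by the Streptomyces strain.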
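The conversion of compactin to pravastatin is catalyzed by a p450 enzyme of Streptomyces carbophilus (see Matsuoka et al., 1989, Eur. J. Biochem. 184:707-713). A common problem with Streptomyces bacteria is that they grow as mycelium, which leads to highly viscous cultures, low oxygen transfer rates and hence lower fermentation output. Ideally, an industrially well-established host that is widely used in large-scale biocatalysis, such as Escherichia coli, would be used, but this species has neither p450 enzymes nor a p450 redox regeneration system. To date, no use of a species suited to fermentation and enzyme production, such as Escherichia coli, has been reported for the conversion of compactin to pravastatin.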
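Another problem is the need of p450 enzymes for cofactor regeneration, which is usually provided by a specific protein pair present in the host cell. If this system is not optimal, overall conversion will in practice remain below 100%, as in the compactin case. Several attempts have been made to isolate alternative species, but none of them reaches 100% conversion (see US 6,905,851, US 6,365,382, US 2005/0153422, US 2004/0253692 and US 2004/0209335). Moreover, none shows a real improvement over Streptomyces carbophilus. Species with very high resistance to compactin have also been reported, but they give only very inefficient conversion (US 6,306,629, US 6,750,366). Others have proposed family shuffling as a way to improve the conversion rate of known p450 enzymes, but gave no data (US 6,605,430); in practice this would be very difficult, because p450 enzymes can be highly substrate-specific, share little sequence identity and require specific enzymes for cofactor regeneration. Attempts have been made to solve the latter problem by isolating species that use different enzymes for the conversion. A specific example is an Actinomadura species able to convert compactin to pravastatin with a maximum conversion of 78% (Peng and Demain, 1998, J. Ind. Microbiol. Biotechnol. 20:373-375; US 6,274,360). Thus, despite all efforts, Streptomyces carbophilus, with a conversion of only 80%, is still the industrial species of choice for pravastatin conversion, and improvement is highly desirable.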
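Description of the invention
In the context of the present invention, the term "conservative substitution" is intended to mean a substitution in which an amino acid residue is replaced by an amino acid residue with a similar side chain. These families are known in the art and include amino acids with basic side chains (e.g. lysine, arginine and histidine), acidic side chains (e.g. aspartic acid, glutamic acid), uncharged polar side chains (e.g. glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), non-polar side chains (e.g. alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), β-branched side chains (e.g. threonine, valine, isoleucine) and aromatic side chains (e.g. tyrosine, phenylalanine, tryptophan, histidine).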
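The side-chain families listed above can be written down as a small lookup, which makes it easy to check whether a given replacement counts as conservative in the sense of this definition. The following is only an illustrative sketch: the names `FAMILIES`, `shared_families` and `is_conservative` are hypothetical, and the one-letter codes are simply the standard abbreviations of the residues named in the text.

```python
# Side-chain families exactly as enumerated in the text (one-letter codes).
# The dictionary and function names are illustrative, not part of the patent.
FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),  # Gly, Asn, Gln, Ser, Thr, Tyr, Cys
    "non-polar": set("AVLIPFMW"),       # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "beta-branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # Tyr, Phe, Trp, His
}


def shared_families(a, b):
    """Return the names of all families that contain both residues."""
    a, b = a.upper(), b.upper()
    return [name for name, members in FAMILIES.items()
            if a in members and b in members]


def is_conservative(a, b):
    """A replacement is conservative if the two residues differ
    and share at least one side-chain family."""
    return a.upper() != b.upper() and bool(shared_families(a, b))


print(is_conservative("K", "R"))  # both basic
print(is_conservative("D", "K"))  # acidic versus basic
print(shared_families("T", "V"))  # which family they have in common
```

Because some residues belong to more than one family (threonine is both uncharged polar and β-branched), the check asks for at least one shared family rather than assigning each residue to a single class.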
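The term "isolated polynucleotide or nucleic acid sequence" as used herein refers to a polynucleotide or nucleic acid sequence that is essentially free of other nucleic acid sequences, e.g. at least 20% pure, preferably at least 40% pure, more preferably at least 60% pure, even more preferably at least 80% pure, most preferably at least 90% pure, as determined by agarose electrophoresis. For example, an isolated nucleic acid sequence can be obtained by the standard cloning procedures used in genetic engineering to relocate the nucleic acid sequence from its natural location to a different site where it will be reproduced.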
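The term "pravastatin" is defined as the 6'-hydroxy variant of compactin in the α- or β-configuration, or a mixture of both configurations. It is important to note that in the scientific literature the term pravastatin is used only for the β-configuration of the 6'-hydroxy variant of compactin, while the α-variant is called epi-pravastatin. The present invention, however, describes a generally applicable method for producing 6-hydroxy variants of compactin. The term pravastatin therefore applies here to both the α- and the β-form.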
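One object of the present invention is to provide an efficient and industrially applicable method for converting compactin into pravastatin. Another object is the use of a novel p450 enzyme from Amycolatopsis orientalis to convert compactin into pravastatin. The invention solves the problems of the prior-art processes by providing a process in which the hydroxylation of compactin is carried out efficiently in Escherichia coli. The invention also provides a process in which the hydroxylation of compactin proceeds with 100% conversion. More particularly, the invention provides a process in which compactin is brought into contact with Amycolatopsis orientalis compactin hydroxylase (encoded by the cmpH gene) by contacting whole cells or a cell-free extract of Amycolatopsis orientalis with compactin. Preferably, a process is provided in which the compactin hydroxylase (cmpH) is obtained from Amycolatopsis orientalis and transferred to another host species. Preferably, this host tolerates high levels of compactin and is able to produce compactin.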
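In a first aspect, the invention provides a polypeptide selected from the group consisting of a polypeptide having the amino acid sequence of SEQ ID NO 3 and a polypeptide having an amino acid sequence substantially homologous to SEQ ID NO 3, said polypeptide displaying compactin hydroxylase activity.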
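In a first embodiment, the polypeptide hydroxylates compactin with an efficiency of at least 50%, preferably at least 70%, more preferably at least 80%, even more preferably at least 90%, most preferably at least 99%. Preferably, the product of the hydroxylation is pravastatin.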
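As part of the present invention it is shown that industrial use of the currently available compactin hydroxylases is limited to species of the class Actinomycetes; that is, they cannot be transferred to species better suited to industrial-scale fermentation, such as Escherichia coli or filamentous fungi such as Aspergillus or Penicillium species. The compactin hydroxylase genes of the present invention do not have this limitation. The activity of the novel polypeptides encoded by these genes can therefore be characterized as follows: they can be applied in species other than actinomycetes, for example Escherichia coli, and/or they can efficiently hydroxylate compactin with a conversion efficiency of at least 80%. In the context of the present invention, an efficiency of at least 80% means that at least 80% of the compactin is converted into pravastatin.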
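A polypeptide having an amino acid sequence substantially homologous to SEQ ID NO 3 is defined as a polypeptide whose amino acid sequence has a degree of identity to the specified amino acid sequence of at least 50%, preferably at least 60%, more preferably at least 75%, even more preferably at least 90%, most preferably at least 95%, still most preferably at least 97%, ultimately at least 98% and more ultimately at least 99%, the substantially homologous peptide displaying compactin hydroxylase activity. Substantially homologous polypeptides include polymorphisms that may exist in cells from different populations, or within a population, owing to natural allelic or intra-strain variation. A substantially homologous polypeptide may also be derived from a species other than the one from which the specified amino acid and/or DNA sequence originates, or may be encoded by an artificially designed and synthesized DNA sequence. DNA sequences related to the specified DNA sequences and obtained through degeneracy of the genetic code are also part of the invention. Homologues also include biologically active fragments of the full-length sequence that still display compactin hydroxylase activity.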
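The degree of identity between two amino acid sequences is the percentage of amino acids that are identical between the two sequences. The degree of identity is determined using the BLAST algorithm described in Altschul et al. (1990, J. Mol. Biol. 215:403-410). Software for BLAST analysis is available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment. The BLAST program uses as defaults a word length (W) of 11, the BLOSUM62 scoring matrix (see Henikoff and Henikoff, 1989, Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of 50, an expectation (E) of 10, M=5 and N=-4.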
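To make the notion of "degree of identity" concrete, the sketch below counts identical positions in two sequences that have already been aligned. It is not BLAST and does not reproduce the parameters listed above; it assumes that an alignment is supplied as two equal-length strings with "-" marking gaps, and that the percentage is taken over all aligned columns. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns in which both sequences carry
    the same residue. Assumes a precomputed pairwise alignment with
    '-' as the gap symbol; columns that are gaps in both are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def is_substantially_homologous(identity, threshold=50.0):
    """Compare an identity percentage with one of the thresholds named
    in the text (50% is the broadest of them)."""
    return identity >= threshold


# First ten residues of SEQ ID NO 3 against a copy with one substitution.
reference = "MRVDSENMNE"
variant = "MKVDSENMNE"
identity = percent_identity(reference, variant)
print(identity, is_substantially_homologous(identity, threshold=90.0))
```

Different tools choose different denominators (alignment length, shorter sequence, and so on), so the figure obtained this way can differ slightly from the value a BLAST report would give for the same pair.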
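A substantially homologous polypeptide may contain only conservative substitutions of one or more amino acids of the specified amino acid sequence, or substitutions, insertions or deletions of non-essential amino acids. A non-essential amino acid is thus a residue that can be altered in one of these sequences without substantially altering the biological function. Guidance on how to make phenotypically silent amino acid substitutions is provided, for example, in Bowie et al. (1990, Science 247:1306-1310), in which the authors indicate that there are two main approaches to studying the tolerance of an amino acid sequence to change. The first relies on the process of evolution, in which mutations are either accepted or rejected by natural selection. The second uses genetic engineering to introduce amino acid changes at specific positions of a cloned gene, followed by selection or screening to identify sequences that maintain functionality. These studies have revealed that proteins are surprisingly tolerant of amino acid substitutions, and indicate which changes are likely to be permissive at a given position of the protein. For example, most buried amino acid residues require non-polar side chains, whereas few features of surface side chains are generally conserved. Other such phenotypically silent substitutions are described in Bowie et al. and the references cited therein.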
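In a second embodiment, variants with improved catalytic function (i.e. conversion of compactin into pravastatin) can be obtained by modifying the polynucleotide sequence encoding the compactin hydroxylase. Such modifications include:
- improving codon usage in such a way that the codons are adapted to the host species used to express the compactin hydroxylase;
- improving codon pair usage in such a way that the codons are adapted to the host species used to express the compactin hydroxylase;
- adding stabilizing sequences to the genomic information encoding the compactin hydroxylase, giving mRNA molecules with an increased half-life;
- introducing random mutations by error-prone PCR, followed by screening of the resulting variants (essentially as described in Example 4) and isolation of variants with improved kinetic properties;
- family shuffling of related compactin hydroxylase variants, followed by screening of the resulting variants (essentially as described in Example 4) and isolation of variants with improved kinetic properties.
Preferred methods for isolating variants with improved kinetic properties are described in WO03010183 and WO0301311.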
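Improved catalytic function is obtained when an improved polynucleotide encoding a compactin hydroxylase with improved functionality is obtained. As part of the invention it has surprisingly been found that the ratio between the β-configuration of the 6-hydroxy variant of compactin (i.e. the pharmaceutically active pravastatin isomer) and its α-configuration can be improved significantly by using the improved polypeptide sequences of SEQ ID NO 19, 20, 21, 22, 23, 24, 25 or 26, or sequences substantially homologous thereto.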
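In addition, certain stretches of the polypeptide sequence of the first aspect were found to be directly involved in the catalytic mechanism of compactin hydroxylation. These are SEQ ID NO 43, 44, 45, 46 and 47. Improved catalytic function can be obtained by introducing modifications in any or all of SEQ ID NO 43-47. Preferably, any or all of SEQ ID NO 43-47 are modified by replacing a single amino acid, two amino acids, three amino acids or at most four amino acids. The following modifications were found to give improved catalytic function: for SEQ ID NO 43, the preferred modifications are SEQ ID NO 48, 49 and 50; for SEQ ID NO 44, SEQ ID NO 51, 52 and 53; for SEQ ID NO 45, SEQ ID NO 54; for SEQ ID NO 46, SEQ ID NO 55; and for SEQ ID NO 47, SEQ ID NO 56, 57, 58 and 59. Stretches suitable for contributing to compactin hydroxylation are also SEQ ID NO 43-59 in which one, two or three amino acids are replaced by alternative amino acids.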
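In a third embodiment, polynucleotides or nucleic acid sequences are provided that comprise a DNA sequence encoding the above polypeptides. These may be isolated polynucleotides of genomic, cDNA, RNA, semi-synthetic or synthetic origin, or any combination thereof. Specifically, the DNA sequences encoding the polypeptide of SEQ ID NO 3 are provided, i.e. SEQ ID NO 1 or 2. More preferably, the DNA sequences encoding the polypeptides of SEQ ID NO 19-26 are provided, i.e. SEQ ID NO 11-18. Unless otherwise indicated, all nucleotide sequences determined herein by sequencing a DNA molecule were determined with an automated DNA sequencer, and all amino acid sequences of polypeptides encoded by the DNA molecules determined herein were predicted by translating the DNA sequences so determined. Therefore, as with any DNA sequence determined by this automated approach, any nucleotide sequence determined herein may contain some errors. Nucleotide sequences determined by automation are typically at least about 90% identical, more typically at least about 95% to at least about 99.9% identical, to the actual nucleotide sequence of the sequenced DNA molecule. The actual sequence can be determined more precisely by other approaches, including manual DNA sequencing methods. As is also known in the art, a single insertion or deletion in a determined nucleotide sequence relative to the actual sequence causes a frame shift in translation, so that the predicted amino acid sequence encoded by the determined nucleotide sequence differs completely from the amino acid sequence actually encoded by the sequenced DNA molecule from the point of the insertion or deletion onwards. The skilled person is able to identify such misidentified bases and knows how to correct such errors.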
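The polypeptides and encoding nucleic acid sequences of the first aspect of the invention may be obtained from any prokaryotic cell, preferably from actinomycetes. Preferred actinomycete species include, but are not limited to, strains of Streptomyces, Amycolatopsis, Pseudonocardia, Micromonospora, Nocardia and Actinokineospora. In a preferred embodiment, the nucleic acid sequence encoding the polypeptide of the invention is obtained from a strain of Amycolatopsis orientalis.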
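The DNA sequences of the invention can be identified by hybridization. Nucleic acid molecules corresponding to variants (e.g. natural allelic variants) and homologues of the DNA of the invention can be isolated on the basis of their homology to the nucleic acids disclosed herein, using these nucleic acids or suitable fragments thereof as hybridization probes according to standard hybridization techniques, preferably under highly stringent hybridization conditions. Alternatively, in silico screening of the available genome databases can be applied. The "stringency" of a hybridization reaction is readily determined by a person of ordinary skill in the art. For additional details and an explanation of the stringency of hybridization reactions, see Ausubel et al. (1995, Current Protocols in Molecular Biology, Wiley Interscience Publishers).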
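The nucleic acid sequence can be isolated, for example, by screening a genomic or cDNA library of the microorganism in question. Once a nucleic acid sequence encoding a polypeptide with the activity of the invention has been detected, for example with a probe derived from SEQ ID NO 2, the sequence can be isolated or cloned using techniques known to persons of ordinary skill in the art (see Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, 2d edition, Cold Spring Harbor, New York). The nucleic acid sequence of the invention can also be cloned from such (genomic) DNA by, for example, methods based on the polymerase chain reaction (PCR) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features (see e.g. Innis et al., 1990, PCR: A Guide to Methods and Application, Academic Press, New York).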
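The sequence information provided herein should not be construed so narrowly as to require the inclusion of erroneously identified bases. The specific sequences disclosed herein can readily be used to isolate the complete gene from actinomycetes (in particular Amycolatopsis orientalis), which in turn can readily be subjected to further sequence analysis to identify sequencing errors.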
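In a fourth embodiment, the invention provides an improved compactin hydroxylase by fusing the polypeptide of SEQ ID NO 3 to a so-called reductase domain, giving the polypeptide of SEQ ID NO 6, which displays compactin hydroxylase activity. The scope of the invention is not limited to this specific amino acid sequence but includes polypeptides having an amino acid sequence "substantially homologous" to SEQ ID NO 6, defined as polypeptides whose amino acid sequence has a degree of identity to the specified amino acid sequence of at least 60%, preferably at least 70%, more preferably at least 80%, even more preferably at least 85%, at least 90%, at least 95% or at least 98%, and most preferably at least 99%, the substantially homologous peptide displaying compactin hydroxylase activity. Substantially homologous polypeptides may include polymorphisms that exist in cells from different populations, or within a population, owing to natural allelic or intra-strain variation. A substantially homologous polypeptide may also be derived from a species other than the one from which the specified amino acid and/or DNA sequence originates, or may be encoded by an artificially designed and synthesized DNA sequence. DNA sequences related to the specified DNA sequences and obtained through degeneracy of the genetic code are also part of the invention. Homologues also include biologically active fragments of the full-length sequence that still display compactin hydroxylase activity. The skilled person will understand that the hydroxylase part of the fusion protein can be exchanged for a non-homologous but functionally equivalent sequence, such as another p450 enzyme able to hydroxylate compactin, for example the product of the Streptomyces carbophilus p450sca-2 gene, provided that the fusion protein hydroxylates compactin to pravastatin. The reductase domain can likewise be exchanged for a non-homologous but functionally equivalent sequence, for example ferredoxin and ferredoxin reductase, provided that the fusion protein hydroxylates compactin to pravastatin. An alternative reductase domain that can be used is, for example, that of the self-sufficient P450 enzyme from Bacillus megaterium, P450 BM3, NCBI Genbank accession number gi142797. Preferred fusion polypeptides are the counterparts (congeners) of the improved polypeptides of SEQ ID NO 19-26, i.e. SEQ ID NO 35, 36, 37, 38, 39, 40, 41 or 42, or sequences substantially homologous thereto. The specific DNA sequences encoding the polypeptides of SEQ ID NO 35-42 (i.e. SEQ ID NO 27-34) are also part of the invention. Alternatively, the improvements of catalytic function described in the second embodiment are also applied to the reductase region.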
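In a second aspect, the invention discloses the use of the polynucleotide of the first aspect in a recombinant host strain. More specifically, a method for producing pravastatin is disclosed, comprising the steps of:
(i) transforming a host cell of interest with a polynucleotide comprising the gene of interest encoding the compactin hydroxylase,
(ii) selecting clones of transformed cells,
(iii) culturing the selected cells,
(iv) optionally processing the cultured cells (i.e. immobilization),
(v) feeding compactin to the cultured cells,
(vi) isolating pravastatin from the culture.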
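In the method of the invention, the choice of host cell will depend to a large extent on the source of the nucleic acid sequence (gene) of interest encoding the polypeptide. Preferably, the host cell is a prokaryotic cell. In a preferred embodiment, the prokaryotic host cell is a cell of one of the species cited as species from which the polynucleotide of the first or second aspect can be obtained, examples of which are, but are not limited to, Streptomyces species (i.e. Streptomyces carbophilus, Streptomyces flavidovirens, Streptomyces coelicolor, Streptomyces lividans, Streptomyces exfoliatus) or Amycolatopsis species (i.e. Amycolatopsis orientalis). Most preferably, the host cell is a host cell suited to large-scale fermentation, examples of which are, but are not limited to, Streptomyces species (i.e. Streptomyces avermitilis, Streptomyces lividans, Streptomyces clavuligerus), Bacillus species (i.e. Bacillus subtilis, Bacillus amyloliquefaciens, Bacillus licheniformis), Corynebacterium species (i.e. Corynebacterium glutamicum) or Escherichia species (i.e. Escherichia coli). Even more preferably, the host cell is a eukaryotic cell such as a Saccharomyces, Aspergillus or Penicillium species, suitable examples of which are the yeast Saccharomyces cerevisiae or the filamentous fungi Aspergillus niger, Penicillium chrysogenum or Penicillium citrinum.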
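A nucleic acid construct, for example an expression construct, may contain a selectable marker gene and the polynucleotide of the invention (compactin hydroxylase), each operably linked to one or more control sequences that direct expression of the encoded polypeptide in a suitable expression host. The nucleic acid constructs may be on separate fragments or, preferably, on one DNA fragment. Expression is understood to include any step involved in the production of the polypeptide and may include transcription, post-transcriptional modification, translation, post-translational modification and secretion. The term "nucleic acid construct" is synonymous with the terms "expression vector" or "cassette" when the construct contains all the control sequences required for expression of a coding sequence in a particular host organism. The term "control sequences" is defined herein to include all components that are necessary or advantageous for expression of the polypeptide. Each control sequence may be native or foreign to the nucleic acid encoding the polypeptide. Such control sequences may include, but are not limited to, a promoter, a leader, an optimal translation initiation sequence (as described in Kozak, 1991, J. Biol. Chem. 266:19867-19870), a secretion signal sequence, a propeptide sequence, a polyadenylation sequence and a transcription terminator. At a minimum, the control sequences include a promoter and transcriptional and translational stop signals. The term "operably linked" is defined herein as a configuration in which a control sequence is placed at a position relative to the coding sequence of the DNA sequence such that the control sequence directs production of the polypeptide.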
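The control sequence may include an appropriate promoter sequence containing transcriptional control sequences. The promoter may be any nucleic acid sequence that shows transcriptional regulatory activity in the cell, including mutant, truncated and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides. The promoter may be homologous or heterologous to the cell or to the polypeptide. Preferred promoters for prokaryotic cells are known in the art and can be, for example, strong promoters ensuring high levels of messenger RNA. The promoter used in an expression cassette according to the invention may be selected from the well-known set of inducible promoters for high expression of operons/genes such as the lactose operon (lac, lacUV5), the arabinose operon (ara), the tryptophan operon (trp) and the operon encoding the enzymes common to the biosynthesis of all aromatic amino acids (aro), or functional hybrids of these promoters, for example the tac promoter, which is a fusion of the trp and lac promoters (Amann et al., 1983, Gene 25:161-178). Alternatively, constitutive promoters that provide a constant supply of messenger RNA throughout the life of the cell can be used. Any other useful promoter can be found at, among others, the NCBI website (http://www.ncbi.nlm.nih.gov/entrez/).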
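In a preferred embodiment, the promoter may be derived from a gene that is highly expressed (defined herein as an mRNA concentration of at least 0.5% (w/w) of total cellular mRNA). In another preferred embodiment, the promoter may be derived from a gene that is moderately expressed (defined herein as an mRNA concentration of at least 0.01% and up to 0.5% (w/w) of total cellular mRNA). In another preferred embodiment, the promoter may be derived from a gene that is expressed at a low level (defined herein as an mRNA concentration below 0.01% (w/w) of total cellular mRNA).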
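The three expression classes defined above reduce to two cut-offs on the share of a gene's mRNA in total cellular mRNA. A minimal sketch follows; the function name is hypothetical, and assigning a value of exactly 0.5% to the "high" class is an assumption based on the "at least 0.5%" wording.

```python
def classify_expression(mrna_percent_w_w):
    """Classify a gene by its mRNA share (percent w/w of total cellular
    mRNA) using the cut-offs given in the text: at least 0.5% is high,
    0.01% up to 0.5% is medium, below 0.01% is low."""
    if mrna_percent_w_w < 0:
        raise ValueError("a percentage cannot be negative")
    if mrna_percent_w_w >= 0.5:
        return "high"
    if mrna_percent_w_w >= 0.01:
        return "medium"
    return "low"


# Hypothetical measurements, for illustration only.
for value in (1.2, 0.05, 0.002):
    print(value, classify_expression(value))
```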
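In an even more preferred embodiment, microarray data are used to select genes, and hence the promoters of these genes, that have a defined transcription level and regulation. In this way the gene expression cassette can be optimally adapted to the conditions under which it is to function.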
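Alternatively, random DNA fragments can be cloned in front of the polynucleotide of the invention. These can be isolated by a so-called direct selection approach. Using a promoterless selectable marker gene (i.e. kanamycin resistance), random DNA fragments can be cloned in front of this gene and active promoters are readily screened for, because they permit growth on kanamycin-containing medium. These DNA fragments can be derived from many sources, i.e. different species, PCR amplification, synthesis and so on. The sequences can subsequently be isolated and cloned in front of the polynucleotide of the invention. A similar strategy can be used to promote translation of the messenger RNA pool by introducing, through recDNA methods, a 5'-untranslated leader derived from the leader of an efficiently translated messenger RNA, such as can be obtained from the tuf gene encoding the highly expressed elongation factor Tu protein, or from a modified or synthetic variant of the tryptophan operon.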
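The control sequence may also include a suitable transcription terminator sequence, a sequence recognized by a prokaryotic cell to terminate transcription. The terminator sequence is operably linked to the 3' end of the nucleic acid sequence encoding the polypeptide. Any terminator that is functional in the cell can be used in the present invention. Preferred terminators for prokaryotic cells are obtained from the native gene to be expressed, or from sources such as rRNA genes or viral operons, for example the ribosomal RNA terminators or the fd terminator (Sambrook et al., 1989, Molecular Cloning, 2nd edition, CSH Press).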
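For secretion of the polypeptide, the control sequence may include a signal peptide coding region, coding for an amino acid sequence linked to the amino terminus of the polypeptide, which can direct the encoded polypeptide into the secretory pathway of the cell. The 5' end of the coding sequence may inherently contain a signal peptide coding region naturally linked in translation reading frame with the segment of the coding region that encodes the secreted protein. Alternatively, the 5' end of the coding sequence may contain a signal peptide coding region that is foreign to the coding sequence. A foreign signal peptide coding region may be required where the coding sequence does not normally contain one. Alternatively, a foreign signal peptide coding region may simply replace the natural signal peptide coding region to obtain enhanced secretion of the polypeptide.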
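The nucleic acid construct may be an expression vector. The expression vector may be any vector (e.g. a plasmid or virus) that can conveniently be subjected to recombinant DNA procedures and can bring about expression of the nucleic acid sequence encoding the polypeptide. The choice of vector will typically depend on the compatibility of the vector with the cell into which it is to be introduced. The vector may be a linear or a closed circular plasmid.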
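In another embodiment, the expression cassette mentioned above is additionally modified by replacing the original promoter with the trp promoter or the aro promoter. To take full advantage of the resulting increase in expression efficiency, additional modifications relating to enhanced gene expression, messenger RNA translation and plasmid stability can be applied to the recDNA constructs used to create the actual production strain, such as addition of the transcription terminator of phage fd, or introduction of the partitioning function par from plasmid pSC101 (Churchward et al., 1983, Nucl. Acid. Res. 11:5645-5659).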
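To increase production of the desired protein, the expression cassette can be inserted on an extrachromosomal element, such as the plasmids ColE1, ColD, R1162, RK2 or derivatives thereof, which are present either at a defined low copy number or, more commonly, at a dynamic high copy number, and which are able to propagate or replicate autonomously in, for example, the Escherichia coli strains HB101, B7, RV308, DH1, HMS174, W3110 and BL21.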
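The vector may be an autonomously replicating vector, i.e. a vector that exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, for example a plasmid, an extrachromosomal element, a minichromosome or an artificial chromosome. Alternatively, the vector may be one that, when introduced into the cell, is integrated into the genome and replicated together with the chromosome into which it has been integrated. An integrative cloning vector may integrate at random or at a predetermined target locus in the chromosome of the host cell. In a preferred embodiment of the invention, the integrative cloning vector comprises a DNA fragment that is homologous to a DNA sequence in a predetermined target locus in the genome of the host cell, for targeting the integration of the cloning vector to this predetermined locus. To promote targeted integration, the cloning vector is preferably linearized before transformation of the host cell. Linearization is preferably performed such that at least one end, but preferably each end, of the cloning vector is flanked by sequences homologous to the target locus. The length of the homologous sequences flanking the target locus is preferably at least 0.1 kb, more preferably at least 0.2 kb, still more preferably at least 0.5 kb, even more preferably at least 1 kb, most preferably at least 2 kb. The vector system may be a single vector or plasmid, or two or more vectors or plasmids that together contain the total DNA to be introduced into the genome of the host cell.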
The DNA construct may be used on an episomal vector. Preferably, the construct is integrated into the genome of the host strain.
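In another embodiment, the application of the polypeptide of the invention can be improved by deleting from the genome of the host strain one or more endogenous genes for enzymes that limit the pravastatin yield. Examples of such enzymes are (but are not limited to) enzymes that hydrolyze the side chain of compactin or pravastatin.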
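In a preferred embodiment, the cmpH gene (SEQ ID NO 1), all homologous sequences, the fusion of cmpH with a reductase domain (SEQ ID NO 4) and all functional equivalents encoding a compactin hydroxylase can be expressed in a compactin-producing host cell in order to produce pravastatin. In the case of a prokaryotic host, all aspects of functional expression can be applied in such a host as described above. In the case of a eukaryotic host cell, the expression construct is preferably adapted for efficient expression in such a host. Preferably, the host cell is a fungus, more preferably a filamentous fungus; most preferably, the fungal host cell is a cell that produces a statin, preferably compactin. Examples are, but are not limited to, Aspergillus species (i.e. Aspergillus terreus), Penicillium species (i.e. Penicillium citrinum or chrysogenum) or Monascus species (i.e. Monascus ruber or paxii).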
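Preferred promoters for filamentous fungal cells are known in the art and can be, for example, the glucose-6-phosphate dehydrogenase gpdA promoter; protease promoters such as pepA, pepB, pepC; the glucoamylase glaA promoter; the amylase amyA and amyB promoters; the catalase catR or catA promoters; the glucose oxidase goxC promoter; the β-galactosidase lacA promoter; the α-glucosidase aglA promoter; the translation elongation factor tefA promoter; xylanase promoters such as xlnA, xlnB, xlnC, xlnD; cellulase promoters such as eglA, eglB, cbhA; promoters of transcriptional regulators such as areA, creA, xlnR, pacC, prtT and so on, or any other, and can be found at, among others, the NCBI website (http://www.ncbi.nlm.nih.gov/entrez/).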
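In a preferred embodiment, the promoter may be derived from a gene that is highly expressed (defined herein as an mRNA concentration of at least 0.5% (w/w) of total cellular mRNA). In another preferred embodiment, the promoter may be derived from a gene that is moderately expressed (defined herein as an mRNA concentration of at least 0.01% and up to 0.5% (w/w) of total cellular mRNA). In another preferred embodiment, the promoter may be derived from a gene that is expressed at a low level (defined herein as an mRNA concentration below 0.01% (w/w) of total cellular mRNA).
In an even more preferred embodiment, microarray data are used to select genes, and hence the promoters of these genes, that have a defined transcription level and regulation. In this way the gene expression cassette can be optimally adapted to the conditions under which it is to function.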
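The control sequence may also include a suitable transcription terminator sequence, a sequence recognized by a filamentous fungal cell to terminate transcription. The terminator sequence is operably linked to the 3' end of the nucleic acid sequence encoding the polypeptide. Any terminator that is functional in the cell can be used in the present invention. Preferred terminators for filamentous fungal cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Aspergillus niger α-glucosidase, the trpC gene and Fusarium oxysporum trypsin-like protease.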
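The control sequence may also include a suitable leader sequence, a non-translated region of the mRNA that is important for translation by the filamentous fungal cell. The leader sequence is operably linked to the 5' end of the nucleic acid sequence encoding the polypeptide. Any leader sequence that is functional in the cell can be used in the present invention. Preferred leaders for filamentous fungal cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus nidulans triose phosphate isomerase and Aspergillus niger glaA.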
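The control sequence may also include a polyadenylation sequence, a sequence operably linked to the 3' end of the nucleic acid sequence which, when transcribed, is recognized by the filamentous fungal cell as a signal to add polyadenosine residues to the transcribed mRNA. Any polyadenylation sequence that is functional in the cell can be used in the present invention. Preferred polyadenylation sequences for filamentous fungal cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Fusarium oxysporum trypsin-like protease and Aspergillus niger α-glucosidase.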
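The vector may be an expression vector. The expression vector may be any vector (e.g. a plasmid or virus) that can conveniently be subjected to recombinant DNA procedures and can bring about expression of the nucleic acid sequence encoding the polypeptide. The choice of vector will typically depend on the compatibility of the vector with the cell into which it is to be introduced. The vector may be a linear or a closed circular plasmid. The vector may be an autonomously replicating vector, i.e. a vector that exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, for example a plasmid, an extrachromosomal element, a minichromosome or an artificial chromosome. An autonomously maintained cloning vector for filamentous fungi may comprise the AMA1 sequence (see e.g. Aleksenko and Clutterbuck, 1997, Fungal Genet. Biol. 21:373-397). Alternatively, the vector may be one that, when introduced into the cell, is integrated into the genome and replicated together with the chromosome into which it has been integrated. An integrative cloning vector may integrate at random or at a predetermined target locus in the chromosome of the host cell. Preferably, the integrative cloning vector comprises a DNA fragment that is homologous to a DNA sequence in a predetermined target locus in the genome of the host cell, for targeting the integration of the cloning vector to this predetermined locus. To promote targeted integration, the cloning vector is preferably linearized before transformation of the host cell. Linearization is preferably performed such that at least one end, but preferably each end, of the cloning vector is flanked by sequences homologous to the target locus. The length of the homologous sequences flanking the target locus is preferably at least 0.1 kb, more preferably at least 0.2 kb, still more preferably at least 0.5 kb, even more preferably at least 1 kb, most preferably at least 2 kb. The vector system may be a single vector or plasmid, or two or more vectors or plasmids that together contain the total DNA to be introduced into the genome of the host cell.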
The DNA construct may be used on an episomal vector. Preferably, the construct is integrated into the genome of the host strain.
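Fungal cells are transformed by co-transformation, i.e. a selectable marker gene is transformed along with the gene of interest. The marker gene can be physically linked to the gene of interest (i.e. on a plasmid) or located on a separate fragment. After transformation, transformants are screened for the presence of the selectable marker gene and subsequently analyzed for the presence of the gene of interest. A selectable marker is a product that provides resistance to a biocide or virus, resistance to heavy metals, prototrophy to auxotrophs, and the like. Useful selectable markers include amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hygB (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5'-phosphate decarboxylase), sC or sutB (sulfate adenyltransferase), trpC (anthranilate synthase), ble (phleomycin resistance protein) or equivalents thereof.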
The host cells obtained can be used to produce pravastatin.
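A third aspect of the invention provides a method for isolating a polynucleotide encoding a polypeptide that is able to improve the conversion of compactin to pravastatin of the second aspect, the method comprising the steps of:
(i) transforming a host cell with the polynucleotide of the first aspect of the invention;
(ii) selecting clones of transformed cells for their ability to hydroxylate compactin;
(iii) re-transforming these isolated clones with a variety of polynucleotides;
(iv) selecting clones of transformed cells for their ability to hydroxylate compactin;
(v) isolating the plasmid;
(vi) sequencing the insert of said plasmid.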
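The polynucleotides of step (iii) can be obtained from several sources. They can be genomic DNA, copy DNA, RNA, semi-synthetic or of synthetic origin. They can originate from a eukaryotic or a prokaryotic host. They can be supplied as circular or linear polynucleotides. They can be specific polynucleotides (i.e. a gene, a gene family or an error-prone library derived from a gene) or random polynucleotides (i.e. a metagenomic library or randomly digested genomic DNA). They can be expressed from their own promoter, or they can be cloned behind a promoter that is functional in the host of step (i).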
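Such nucleic acid sequences encoding polypeptides that promote compactin hydroxylase activity can also be isolated, for example, by screening a genomic or cDNA library of the microorganism that is the donor of the polynucleotide of the first aspect. Once a nucleic acid sequence homologous to a probe derived from SEQ ID NO 2 has been detected, this sequence or the DNA surrounding it can be isolated or cloned using techniques known to persons of ordinary skill in the art (see Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, 2d edition, Cold Spring Harbor, New York).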
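In this way it is possible to clone variant polynucleotides encoding polypeptides with enhanced function, polynucleotides encoding polypeptides that accelerate or promote the function of the compactin hydroxylase, or polynucleotides that activate the promoter in front of the compactin hydroxylase gene.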
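In one embodiment, a method is disclosed for improving the efficiency of the conversion of compactin to pravastatin by isolating a redox regeneration system and introducing it into the host cell expressing the compactin hydroxylase, which is in fact a p450 enzyme (Pylypenko and Schlichting, 2004, Annu. Rev. Biochem. 73:991-1018). The general methods for introducing such a system into a host cell are the same as those described above for introducing the compactin hydroxylase. Such redox regeneration systems can be obtained from the species cited as species from which the polynucleotide of the second aspect can be obtained or in which it can be heterologously expressed; examples are, but are not limited to, Streptomyces species (i.e. Streptomyces carbophilus, Streptomyces flavidovirens, Streptomyces coelicolor, Streptomyces lividans, Streptomyces exfoliatus, Streptomyces avermitilis, Streptomyces clavuligerus), Amycolatopsis species (i.e. Amycolatopsis orientalis), Bacillus species (i.e. Bacillus subtilis, Bacillus amyloliquefaciens, Bacillus licheniformis), Corynebacterium species (i.e. Corynebacterium glutamicum) or Escherichia species (i.e. Escherichia coli). Alternative systems can also be applied. Examples of alternative systems are, but are not limited to, integration of the compactin hydroxylase of the invention into a class IV p450 system, so that it is fused to its redox partner (Roberts et al., 2002, J. Bacteriol. 184:3898-3908 and Kubota et al., 2005, Biosci. Biotechnol. Biochem. 69:2421-2430), regeneration by NAD(P)H-generating enzymes that are not linked to the p450, such as phosphite dehydrogenase (Johannes et al., 2005, Appl Environ Microbiol. 71:5728-5734), or regeneration by non-enzymatic means (Hollmann et al., 2006, Trends Biotechnol. 24:163-171).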
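In a fourth aspect of the invention, pravastatin produced according to the method of the third aspect is included in a pharmaceutical composition.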
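Legend to the figures
Figure 1 shows the conversion catalyzed by the cmpH gene product, compactin hydroxylase. Legend: [C] = compactin; [P] = pravastatin.
Figure 2 shows plasmid pZERO-Ao-11H9. Legend: ORF-1 = first open reading frame, ORF-2 = second open reading frame, zeo = gene encoding zeocin resistance, kan = gene encoding kanamycin resistance.
Figure 3 shows plasmid pZERO-Ao-11H9d. Legend: ORF-1 = first open reading frame, zeo = gene encoding zeocin resistance, kan = gene encoding kanamycin resistance.
Figure 4 shows plasmid pACYC-taqScp450. Legend: Sc-p450 = Streptomyces carbophilus gene encoding the compactin hydroxylase p450, cat = gene encoding chloramphenicol resistance.
Figure 5 shows plasmid pACYC-taqAop450. Legend: Ao-cmpH = Amycolatopsis orientalis gene encoding the compactin hydroxylase p450, cat = gene encoding chloramphenicol resistance.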
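Examples

General methods
Standard DNA procedures and cultivation of prokaryotes were carried out as described elsewhere (Sambrook, J. et al., 1989, Molecular cloning: a laboratory manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York). DNA was amplified using the proofreading enzyme Phusion polymerase (Finnzymes). Restriction enzymes were from Invitrogen or New England Biolabs. Compactin was hydrolyzed by dissolving it in ethanol to a final concentration of 20 mg/ml and adding NaOH from a 4 M stock to a final concentration of 0.1 M. The solution was heated at 50°C for 1 to 2 hours and then cooled to room temperature. This solution can be stored at room temperature for 3 months. Stock solutions of pravastatin and of non-hydrolyzed compactin were prepared by dissolving each in ethanol at 20 mg/ml.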
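Example 1
Screening for efficient whole-cell bioconversion of compactin to pravastatin
Different prokaryotic and fungal species (Table 1) were tested in order to isolate species with improved conversion of hydrolyzed compactin. All species were pre-cultured for 1-3 days (depending on the growth rate of the species) in 25 ml 2xYT medium, washed and suspended in 25 ml fresh 2xYT medium. After an adaptation period of several hours with shaking at 280 rpm and 30°C, hydrolyzed compactin was added to final concentrations of 0.1, 0.2, 0.5 and 1 mg/ml. After 24 hours of incubation, the broth was collected by transferring the contents of the shake flask into a 50 ml Greiner tube. The samples were frozen at -20°C and then lyophilized. Statins were extracted as follows: 1-2 ml methanol was added to the lyophilized sample, followed by repeated vortexing. Solids were separated from the liquid phase by centrifugation. 200 μl of the methanol extract was transferred into an HPLC vial and analyzed by HPLC as follows:

Eluent A: 33% acetonitrile, 0.025% trifluoroacetic acid in milliQ water
Eluent B: 80% acetonitrile in milliQ water

Gradient:
Time (min)   Eluent A (%)   Eluent B (%)
0-8          100            0
8-8.1        100→0          0→100
8.1-12       0              100
12-13        0→100          100→0
13-14        100            0

Column: Waters XTerra RP18 (column temperature = room temperature)
Flow rate: 1 ml/min
Injection volume: 10 μl (tray temperature = room temperature)
Equipment: Waters Alliance 2695
Detector: Waters 996 photodiode array
Wavelength: 238 nm
Retention times: pravastatin 4 min, hydrolyzed compactin 10.4 min, compactin 10.9 min

Table 1. Prokaryotic species tested for hydroxylation of hydrolyzed compactin

As can be seen in Table 1, pravastatin was synthesized by four species of the test set: Actinokineospora riparia, Pseudonocardia alni, Streptomyces carbophilus and Amycolatopsis orientalis.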
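The HPLC gradient of Example 1 can be read as a piecewise-linear program for the share of eluent A. The sketch below encodes that reading so the composition at any time point can be looked up. It rests on two assumptions that the source does not state: ramps are linear within each time window, and the third row of the gradient (garbled in the source) holds 0% A and 100% B from 8.1 to 12 minutes. The names are illustrative only.

```python
# (start_min, end_min, percent_A_at_start, percent_A_at_end)
# Row three is a reconstruction of a garbled entry; treat it as an assumption.
GRADIENT = [
    (0.0, 8.0, 100.0, 100.0),
    (8.0, 8.1, 100.0, 0.0),
    (8.1, 12.0, 0.0, 0.0),
    (12.0, 13.0, 0.0, 100.0),
    (13.0, 14.0, 100.0, 100.0),
]


def eluent_a_percent(t_min):
    """Percent eluent A at time t_min, interpolating linearly within
    the gradient window that contains t_min."""
    for start, end, a_start, a_end in GRADIENT:
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return a_start + fraction * (a_end - a_start)
    raise ValueError("time lies outside the 0-14 min program")


def eluent_b_percent(t_min):
    """Eluent B makes up the remainder of the mobile phase."""
    return 100.0 - eluent_a_percent(t_min)


for t in (4.0, 10.4, 10.9, 12.5):
    print(t, eluent_a_percent(t), eluent_b_percent(t))
```

Under these assumptions, pravastatin (retention time 4 min) elutes during the isocratic 100% A phase, while hydrolyzed compactin and compactin (10.4 and 10.9 min) elute during the 100% B phase.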
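Example 2
Biological hydrolysis of compactin
To determine whether the species described in Example 1 are also able to hydrolyze and/or hydroxylate the lactone form of compactin, the four selected species were pre-cultured for 1-3 days (depending on the growth rate of the species) in 25 ml 2xYT medium, washed and resuspended in 25 ml fresh 2xYT medium. After an adaptation period of several hours with shaking at 280 rpm and 30°C, non-hydrolyzed compactin was added at 0.2 mg/ml. After 24 hours of incubation, the broth was collected by transferring the contents of the shake flask into a 50 ml Greiner tube. The samples were frozen at -20°C and then lyophilized. Statins were extracted as follows: 1-2 ml methanol was added to the lyophilized sample, followed by repeated vortexing. Solids were separated from the liquid phase by centrifugation. 200 μl of the methanol extract was transferred into an HPLC vial and analyzed by HPLC as described in Example 1. All four species (Actinokineospora riparia, Pseudonocardia alni, Streptomyces carbophilus and Amycolatopsis orientalis) hydrolyzed compactin, but this is not a prerequisite for pravastatin formation. Amycolatopsis orientalis was the most efficient species in synthesizing pravastatin.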
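Table 2. Prokaryotic species tested for hydrolysis and/or hydroxylation of compactin

Example 3
Amycolatopsis orientalis hydroxylates compactin very efficiently
It follows from Example 1 that Amycolatopsis orientalis is superior to Streptomyces carbophilus in the hydroxylation of compactin. For further study, both species were pre-cultured for 24 hours in 25 ml 2xYT medium, washed and resuspended in 25 ml fresh 2xYT medium. After several hours of shaking at 280 rpm and 30°C, hydrolyzed compactin was added at 0.1 and 0.2 mg/ml. After 24 hours of incubation, the broth was collected by transferring the contents of the shake flask into a 50 ml Greiner tube. The samples were frozen at -20°C and then lyophilized. Statins were extracted as follows: 1-2 ml methanol was added to the lyophilized sample, followed by repeated vortexing. Solids were separated from the liquid phase by centrifugation. 200 μl of the methanol extract was transferred into an HPLC vial and analyzed by HPLC as described in Example 1. As can be seen from Table 3, Amycolatopsis orientalis is able to convert compactin into pravastatin with an efficiency of 100%, whereas Streptomyces carbophilus is not.
Table 3. Comparison of compactin hydroxylation by Amycolatopsis orientalis and Streptomyces carbophilus.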

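Example 4
Isolation of a gene fragment encoding the biocatalyst that converts compactin into pravastatin

Amycolatopsis orientalis gene library
A colony of Amycolatopsis orientalis was grown at 28°C in liquid medium (10 g/l glucose, 5 g/l yeast extract, 20 g/l starch, 1 g/l CaCO3 and 0.5 g/l casamino acids; baffled flask) until OD=2.0. Part of the culture was used to prepare glycerol stocks and part to inoculate (ratio 1/50) a flask containing 50 ml liquid medium, to prepare cells for genomic DNA isolation. After 16 hours at 28°C, the culture was used for isolation of genomic DNA; ampicillin was added to a final concentration of 200 μg/ml during the last hour of incubation. Cells were harvested by centrifugation (15 min at 8000 rpm) and the pellet was resuspended in 5 ml 50 mM Tris-HCl, adjusted to pH 8.0, with 50 mM EDTA. After addition of 100 μl lysozyme (100 mg/ml) and 40 μl proteinase K (20 mg/ml), the suspension was incubated at 37°C for 30 min. Nuclei Lysis Solution (Promega, 6 ml) was added. Incubation at 80°C for 15 min and at 65°C for 30 min resulted in almost complete cell lysis. After RNase treatment (10 μl of a 100 mg/ml RNase solution), 2 ml Protein Precipitation Solution (Promega) was added, and the mixture was vortexed (20 s) and incubated on ice (15 min). After centrifugation (15 min at 5000 rpm), the supernatant was mixed with 0.1 volume NaAc (3 M, pH 5) and 2 volumes EtOH (96%). The visible clump of precipitated genomic DNA was transferred with a Pasteur pipette and dissolved in 500 μl 10 mM Tris (pH 8.0). A second proteinase K treatment (10 μl of the 20 mg/ml stock solution per 200 μl sample, followed by incubation at 37°C for 30 min) was performed to remove residual protein. After the proteinase K step, 500 μl phenol/chloroform/isoamyl alcohol (PCI, 25:24:1) was added and the mixture was centrifuged at 14,000 rpm for 5 min. The upper phase was transferred to a new tube and 500 μl chloroform/isoamyl alcohol (24:1) was added to remove traces of phenol. The phases were separated by centrifugation and the upper phase was mixed with 0.1 volume NaAc (3 M, pH 5) and 2 volumes EtOH (96%) to precipitate the DNA. The genomic DNA was removed with a pipette, rinsed with cold 70% EtOH and dissolved in 500 μl Tris-EDTA buffer. This yielded 134 μg of purified genomic DNA with an A260nm/A280nm ratio of 1.85. The isolated Amycolatopsis orientalis DNA was partially digested with Sau3AI (0.067 units/μg DNA) to obtain smaller fragments in the range of 4 to 10 kb. Genomic DNA (50 μg) was digested, and fragments between 4 and 10 kb were isolated from a preparative 0.6% agarose gel using the Qiagen QIAquick extraction kit and finally dissolved in 20 μl 10 mM Tris, pH 8.0. These fragments were ligated to BamHI-digested pZErO-2 (Invitrogen) and transformed into Escherichia coli DH10B, giving approximately 39,000 colonies. Twenty individual colonies were used to inoculate 10 ml 2xYT medium, to check the diversity of the library and determine the average insert size. Nineteen plasmids contained inserts of different sizes, while one colony had no insert (self-ligated vector, 5%). The average insert size of the gDNA fragments in pZErO-2 was approximately 3.8 kb. All transformants obtained were collected from the plates and resuspended in liquid 2xYT medium containing kanamycin; glycerol was added to a final concentration of 8% (v/v) and the library was stored at -80°C.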
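The library figures reported above (about 39,000 colonies, 19 of 20 sampled plasmids carrying an insert, average insert about 3.8 kb) can be combined into a rough estimate of how much genomic DNA the library holds. The sketch below does only that arithmetic; the function name is hypothetical, and it assumes that the 20 sampled colonies are representative and that the 3.8 kb average applies to insert-containing clones. No genome coverage is computed, because the genome size of Amycolatopsis orientalis is not given in the text.

```python
def library_summary(colonies, sampled, with_insert, mean_insert_kb):
    """Estimate the number of insert-containing clones and the total
    amount of cloned genomic DNA from a small sample of colonies."""
    insert_fraction = with_insert / sampled
    recombinant_clones = colonies * with_insert / sampled
    total_cloned_mb = recombinant_clones * mean_insert_kb / 1000.0
    return {
        "insert_fraction": insert_fraction,
        "empty_vector_percent": 100.0 * (sampled - with_insert) / sampled,
        "recombinant_clones": recombinant_clones,
        "total_cloned_mb": total_cloned_mb,
    }


summary = library_summary(colonies=39_000, sampled=20,
                          with_insert=19, mean_insert_kb=3.8)
for key, value in summary.items():
    print(key, round(value, 2))
```

With the reported numbers this gives roughly 37,000 recombinant clones and about 140 Mb of cloned DNA in total, consistent with the 5% self-ligated vector quoted in the example.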
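Screening of the Amycolatopsis orientalis gene library
The Amycolatopsis orientalis gene library was plated on 2xYT agar + kanamycin (50 mg/L) and incubated at room temperature for 72 hours. Almost 12,000 colonies were used to inoculate 120 96-well microtiter plates (MTPs) containing 0.2 ml 2xYT medium + 35 mg/L kanamycin per well. The MTPs were incubated at 25°C and 500 rpm for 48 hours. 140 μl of the cell suspension from each well was centrifuged at 3,000 rpm for 10 min and the supernatant was discarded by tapping the plate on tissue paper (the remainder of the culture was added to 50 μl 20% glycerol and stored at -80°C). 250 μl substrate solution (2xYT medium containing hydrolyzed compactin, 200 mg/L; glucose, 2 g/l; phosphate buffer, 50 mM; pH 6.8) was added per well. The cell pellets in the wells were resuspended and incubated at 30°C and 280 rpm for 48 hours. Statins were extracted by adding 0.35 ml methanol per well and mixing at 280 rpm for 1 hour. Cell debris was removed by centrifugation at 2750 rpm for 15 min. 100 μl samples were analyzed by LC-MS.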
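LC-MS analysis of the Amycolatopsis orientalis gene library
A hydrolyzed compactin standard was prepared by hydrolyzing mevastatin (A.G. Scientific, catalogue number A7413, purity 99.36%) with 1.5 M NaOH in MeOH (1:2), with stirring overnight under an N2 atmosphere. The pH was lowered by adding HCl (4 M) and the standard was further diluted with water. Samples were analyzed on a Waters LC/MS system with a short (20 mm) CN column from ACT (Advanced Chromatography Technologies), using water and acetonitrile (both containing 0.1% formic acid) as mobile phase. Details of the LC part:

Apparatus: Waters Alliance 2795 LC
Mobile phase: solvent A, water with 0.1% formic acid; solvent B, acetonitrile with 0.1% formic acid
Needle wash: 50% Milli-Q water + 50% acetonitrile
Gradient timetable:
Time (min)   A%     B%     Flow (ml/min)   Curve
0.00         80.0   20.0   1.00            1
0.35         80.0   20.0   1.00            6
1.00         20.0   80.0   1.00            6
1.40         20.0   80.0   1.00            6
1.50         80.0   20.0   1.00            6
2.00         end
Column: ACT, ACE 3 CN, 20x2.1 mm, particle size 3 μm
Column temperature: 25°C
Injection volume: 5 μl

In the MS, electrospray ionization in positive mode (ES+) was used and the compounds were analyzed as their sodium adducts ([M+Na]+) by selected ion recording (SIR) for the three compounds. Details of the MS part:

Apparatus: Waters ZQ 2000
Source: ES+
Capillary: 3.50 kV
Cone: 30 V
Desolvation temperature: 360°C
Source temperature: 140°C
Extractor: 2 V
RF lens: 0.3 V
Cone gas flow: 130 l/hour
Desolvation gas flow: 610 l/hour
LM1 resolution: 15.0
HM2 resolution: 15.0
Ion energy 1: 0.1
Multiplier: 650 V
Scan: mass range (m/z) 200-600 amu, scan duration 0.20 s, inter-scan delay 0.05 s
SIR of 3 channels: 413.30, 431.3, 447.3; dwell 0.07 s; inter-scan delay 0.05 s.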

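Compound                                 MW       [M+Na]+
Compactin (lactone)                      390.24   413.3
Compactin (acid, hydrolyzed form)        408.25   431.3
Pravastatin (β-variant)                  424.25   447.3

With this set-up it is possible to distinguish the four most important molecules: compactin, hydrolyzed compactin and the two stereoisomers of 6-hydroxy-compactin, i.e. the β-variant pravastatin (structure shown above) and the α-variant of pravastatin.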
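The three SIR channels listed for the sodium adducts can be used as a simple lookup when assigning an observed signal to a compound. The following sketch illustrates this; the tolerance of 0.5 m/z units and the names `SIR_CHANNELS` and `assign_channel` are assumptions made for illustration and are not taken from the method description.

```python
# [M+Na]+ channels as given in the text, mapped to the compound monitored.
SIR_CHANNELS = {
    413.3: "compactin (lactone)",
    431.3: "hydrolyzed compactin (acid)",
    447.3: "pravastatin (6-hydroxy-compactin)",
}


def assign_channel(observed_mz, tolerance=0.5):
    """Return the compound whose SIR channel lies closest to the
    observed m/z, or None if nothing is within the tolerance.
    The tolerance value is an illustrative assumption."""
    nearest = min(SIR_CHANNELS, key=lambda channel: abs(channel - observed_mz))
    if abs(nearest - observed_mz) <= tolerance:
        return SIR_CHANNELS[nearest]
    return None


for mz in (413.3, 431.1, 447.4, 420.0):
    print(mz, assign_channel(mz))
```

The α- and β-variants of 6-hydroxy-compactin are stereoisomers and therefore share the 447.3 channel; telling them apart presumably relies on the chromatographic separation rather than on mass.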
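Table 4. Example of the results of the compactin hydroxylase MTP screening

Several clones were identified as candidates because they showed a small but significant peak at the position of pravastatin. A subset of these results is shown in Table 4 as an example, in which one clone gave a signal above background (the clone in well position B1).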
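Example 5
Identification of the gene encoding the compactin-to-pravastatin biocatalyst

Retesting of putative compactin hydroxylases from Amycolatopsis orientalis
The analysis of Example 4 was repeated with the four clones identified as putative clones in the first round. This time, however, the clones were grown in shake flasks instead of MTPs and some culture conditions were changed. The clones were pre-cultured in 10 ml 2xYT, inoculated with Escherichia coli cells containing the putative compactin hydroxylase, and grown at 30°C and 280 rpm for 24 hours. Subsequently, 0.1-0.5 mM IPTG and 0.5 mM δ-aminolevulinate were added and the cultures were incubated at 22°C and 280 rpm for 12 hours. Cells were harvested, washed and resuspended by vortexing in fresh 2xYT medium (supplemented with hydrolyzed compactin, 200 mg/L; glucose, 2 g/l; phosphate buffer, 50 mM; pH 6.8). The cell suspensions were incubated at 30 or 37°C and 280 rpm for 24 or 48 hours. Statins were extracted and analyzed as described in Example 4.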
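Table 5. Results of the retest of putative compactin hydroxylases from A. orientalis.
As can be seen from Table 5, only clone 11H9 showed a genuine, significant conversion of compactin to pravastatin. This clone was therefore selected for further analysis.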
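Sequencing and sequence analysis
Escherichia coli clone 11H9 was grown in 2xYT containing kanamycin, plasmid DNA was isolated using the Qiagen QIAprep kit, and the sequence of the Amycolatopsis orientalis genomic insert in the pZERO-2 plasmid was determined. The insert is 2545 nucleotides long (see SEQ ID NO 1). DNA sequence analysis identified two open reading frames (ORFs) (Figure 2). The first ORF encodes a putative protein of 401 amino acids (SEQ ID NO 2 and 3) that has some homology to known p450 enzymes (the best match being the cytochrome p450 monooxygenase CYP105S2 from Streptomyces tubercidicus). The second ORF has some transmembrane regions and may encode an ATP-binding cassette (ABC) protein.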
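The relationship between the genomic insert (SEQ ID NO 1) and the first ORF (SEQ ID NO 2 and 3) can be checked with a few lines of code: reverse-complementing a stretch of the insert reproduces the start of the coding sequence, which indicates that the ORF lies on the strand opposite to the one listed. The sketch below uses the 27 nucleotides at positions 2341-2367 of SEQ ID NO 1 as printed in the sequence listing and the standard genetic code; the helper names are hypothetical.

```python
# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (upper case)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def translate(dna):
    """Translate in frame from the first base, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Positions 2341-2367 of SEQ ID NO 1, copied from the sequence listing.
insert_fragment = "gttcatattttcggagtctactctcat"
orf_start = reverse_complement(insert_fragment)

print(orf_start)             # ATGAGAGTAGACTCCGAAAATATGAAC
print(translate(orf_start))  # MRVDSENMN

# 1206 nucleotides = 402 codons, the last of which is the stop codon.
print(1206 // 3 - 1)         # 401
```

The translated peptide matches the first nine residues of SEQ ID NO 3 (Met Arg Val Asp Ser Glu Asn Met Asn), and the codon count agrees with the 401 amino acids stated for the putative protein.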
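Identification of the structural gene cmpH
To identify the structural gene able to hydroxylate compactin, ORF-2 was deleted from pZERO-Ao-11H9 by double digestion with SalI and XhoI. The 4.9 kb fragment was subsequently isolated and self-ligated. The resulting plasmid, pZERO-Ao-11H9d, contains only ORF-1 as a complete ORF (Figure 3). This clone had the same conversion rate as clone pZERO-Ao-11H9, showing that ORF-1 encodes a functional compactin hydroxylase, designated cmpH.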
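Comparative Example 6
Activity of Streptomyces carbophilus compactin hydroxylase in Escherichia coli

Construction of the p450-SCA E. coli expression clone
The gene encoding the Streptomyces carbophilus p450 was PCR-amplified from genomic DNA isolated from strain FERM-BP1145, using the primers of SEQ ID NO 7 and SEQ ID NO 8. The PCR fragment was cloned in the pCR2.1 TOPO/TA vector according to the supplier's instructions (Invitrogen). The expression clone was constructed as follows: pACYC-taq ([…] M., 2000. Untersuchungen zum Einfluss […] Bereitstellung von Erythrose-4-Phosphat und Phosphoenolpyruvat auf den Kohlenstofffluss in den Aromatenbiosyntheseweg von Escherichia coli. Berichte des Forschungszentrums Jülich 3824, ISSN 0944-2952, PhD Thesis, University of Düsseldorf) was digested with Acc65I and ligated with the Acc65I fragment isolated from the pCR2.1 TOPO/TA vector, giving pACYC-taqScp450 (Figure 4).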
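Activity assay in Escherichia coli extracts
Escherichia coli cells containing pACYC-taqScp450 were grown in 10 ml 2xYT containing chloramphenicol. The cell suspension was processed and incubated with compactin essentially as described in Example 5. In this case the cultivation temperature was 37°C, and chloramphenicol and IPTG (0.1 mM) were added to the reaction mixture. The reaction was incubated at 30°C and 220 rpm. Samples were taken at different time points and analyzed with the HPLC protocol described in Example 1. No pravastatin was detected after 24 hours.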
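Example 7
Activity of Amycolatopsis orientalis compactin hydroxylase in Escherichia coli

Construction of the Ao-cmpH Escherichia coli expression clone
The gene encoding the Amycolatopsis orientalis p450 was PCR-amplified from isolated genomic DNA, using the primers of SEQ ID NO 9 and SEQ ID NO 10. The PCR fragment was cloned in the pCR2.1 TOPO/TA vector according to the supplier's instructions (Invitrogen). The expression clone was constructed as follows: pACYC-taq ([…] 2000) was digested with Acc65I and ligated with the Acc65I fragment isolated from the pCR2.1 TOPO/TA vector, giving pACYC-taqAop450 (Figure 5).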
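Activity assay in Escherichia coli extracts
Escherichia coli cells containing pACYC-taqAop450 (Figure 5) were grown in 2xYT (10 ml) containing chloramphenicol. The cell suspension was incubated with compactin as described in Example 5. The cultivation temperature was 37°C, and chloramphenicol and IPTG (0.1 mM) were added. The reaction was incubated at 30°C and 220 rpm. Samples were taken at different time points and analyzed with the HPLC protocol described in Example 1. After 24 hours a large pravastatin peak was detected. This result clearly demonstrates that the Amycolatopsis orientalis p450 enzyme is far better suited than the other p450 to hydroxylating compactin in Escherichia coli.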
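Table 6. Compactin conversion by Escherichia coli strains carrying P450 genes. The data summarize the average conversion ratio of 50 clones tested, each clone giving at least 90% conversion of compactin to pravastatin. As shown below, all DNA fragments of SEQ ID NO 11-18 and 27-34 encode enzymes with very similar compactin conversion profiles. Percentages refer to the compactin converted.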
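Example 8
Activity of derivatives of Amycolatopsis orientalis compactin hydroxylase in Escherichia coli
The genes of SEQ ID NO 11-18 and 27-34 were produced synthetically and used as template in PCR reactions using the oligonucleotides SEQ ID NO 9 and 10 to which the recombination sites attB1 (for SEQ ID NO 9) and attB2 (for SEQ ID NO 10) had been added. The PCR fragments were cloned into the pDONR221 vector (Invitrogen Corporation, the Netherlands) by a Gateway BP reaction (Invitrogen Corporation); the sequences were verified by DNA sequencing to exclude PCR-related errors. Using the Gateway LR reaction, the genes were transferred from the pDONR221 vector to the pET-DEST42 vector, giving the final expression vectors pET-DEST42-P450. Escherichia coli BL21 DE3 carrying pET-DEST42-P450 (with SEQ ID NO 11-18 or 27-34 as insert) was grown at 30°C in 10 ml 2xYT containing kanamycin until OD600=0.5-1.0. The cultures were then supplemented with 0.1-0.3 mM IPTG and 0.5 mM δ-aminolevulinate and incubated at 22°C and 280 rpm for 12 hours. Cells were harvested, washed and resuspended in fresh 2xYT medium (supplemented with hydrolyzed compactin, 200 mg/L; glucose, 2 g/l; phosphate buffer, 50 mM; pH 6.8). The cell suspensions were incubated at 30°C or 37°C and 280 rpm for 24 or 48 hours. Statins were extracted and analyzed as described in Example 4. After 24 hours a very large pravastatin peak could be identified. In contrast to the experiment described in Example 7, most of the pravastatin produced was the β-variant, which provides evidence that the stereospecificity of the enzymes of SEQ ID NO 19-26 and SEQ ID NO 35-42 is significantly changed compared with the compactin hydroxylases of SEQ ID NO 3 and SEQ ID NO 6, respectively.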
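Table 7. Compactin conversion by Escherichia coli strains carrying P450 genes. The data summarize the average conversion ratio of 50 clones tested, each clone giving at least 90% conversion of compactin to pravastatin. As shown below, all DNA fragments of SEQ ID NO 11-18 and 27-34 encode enzymes with very similar compactin conversion profiles. Percentages refer to the compactin converted.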

Sequence listing
<110>DSM IP Assets B.V.
<120>Process for the preparation of pravastatin
<130>25590WO
<150>EP06126046.9
<151>2006-12-13
<160>59
<170>PatentIn version 3.2
<210>1
<211>2545
<212>DNA
<213>Amycolatopsis orientalis
<400>1
gatctctacc tcgcgctggc gaacgacacg gactgactag ccccgcggcg ggttgaagat 60
catcgattcc gggttgatct gcttggcctt ctccatcagg ccgtactgcg tcatcaggtc 120
gatcacccgc tgcatgcgca ccgggctcat cgcggtcggc cacgtgccga ggcgcatcag 180
gctgacggtg tccttgtcca ctttggcgta gctggtcacg gtctgctcga ccaggctgcg 240
gttggccgcg tcccgctgtc ccttgacgat cgcccgctgg aacgccgccg tggtcttcgg 300
gttctcctgt gcgtacttgg cgctggtcgc ccacaccgcg atcggcacgt ccaaagtggc 360
cccggtcgcc gcgtccagca ccggcagcat cccggccttg cgctgcgcct gggtgatgta 420
cggctccacc atgaacgcgg cgtcgacgtt cttgcgctcg atggccgcct gcatgtccgg 480
gaacgggatc tcggtgaagg tcaccgtctt gatgtccaca ccgttggcct cgagcgcgga 540
ccgtgcggtc agttcgacga tgttcgcctt ggtgttgatc gcgatcttct tgccggccag 600
gtcggctggc ttggtgatcg cgttgtcctt gccggtcagg atcaggaaca tgccctgagc 660
ggcctggtag gcgtccgcga ccagcttgat gtccagcacg ttcttgtact gcgcggtgaa 720
gaacgagacg tagttgccga atgcgaactg cagttcgccg ttggcaaggc cgggcaccgc 780
cgccgcgccg cccggcagcg acttcagttc gacatcaagg ccttcctggg tgaagtagcc 840
tttctgctgt gcgatggcca gcggtacggt gtccacaatg ggcaacgtgc cgacgaccac 900
tttggtctgc tccaaaccgc ctgtctggtt gggcttttcg gtatccccac ccagtgccga 960
acagctggcg gcggcgaggg cgaggacgca ggacagggct atgcgccacg gacgggcgag 1020
tgacatgcgg ggttctcctg gcaggcaaga cacgatgatc tgggccggat cagaccacat 1080
cgtccctgtt caacgccagt cgacggaccc tactttcgac tgaaatatgc cagagctcac 1140
tcgttaagtg gcacgaatgt gctatgcatc ccatgcaacg ggcagcgccc aaaccccgta 1200
cgccgcggac ttctcgcgca gcttgatctc ctcggccggc accgcgagcc gcagcgacgg 1260
gaacctcgcg aacagccggg tgaagccgat gcgcatctcc actctggcca gttgctggcc 1320
gaggcattgg tgtataccac cgccgaacgc ggcgtgcttg cgtgcgtcca ctctgtccag 1380
ttgaaggata tcgggttcgt cgaacacctt cgggtcccta ttgaccgcgg gcagtccgat 1440
cgcgacagtg tcgcccttcc tgatcatctg gccttcgagc tccacgtcct ccagcgccgc 1500
ccggttgggc gttccaaggt ggacgatcga gaggtagcgc agcagttcct ccaccgcgtc 1560
cgggctgtcc agggcagcga tctgctccgg atgctgaagg agcgcgaaag tgcctaatcc 1620
caacatgttc gcggtggtct cgtgcccggc gacgagcaaa agcaacgcga tgttcgtcag 1680
ctcttcatcg gtcagatcgg tgtccgtgat caagctgcca agcaggtcgt ccttggggct 1740
caaccgcttc gtggcgacca gttcagcgat gtagcgggtg agtttgccaa gcgccgtcgt 1800
cacctcatcc tgtgtcttgt ccacactggc catgatcgtg gtctgctctt ggaagaacgc 1860
gtgatcggca tacgagacgc ccagcagctc gcagatcacc agcgaaggca ctggcaacgc 1920
gaacgcctgc accagatcga ccggcggtcc tgctttggcc atcgcgtcga ggtggtcctc 1980
ggtgatctgg acgatccgcg gttcgagttc cttgatccgt cgcacggtga actggctgat 2040
cagcatccgg cggtaacgcg tgtgctcggg tgcgtccatg ttgatgaacc agccgggcgc 2100
cggggctttc gtggctccgc ctggtcgtgg gatgacgctg aacaccgggt gcttgtgctc 2160
ggggcggttg ctgaagcgcg gatcgatcat gacagtccgc gctgcggcat ggctggtcac 2220
cagccagccg atgtggccgt cagggaatcg catcgggctc actggcggaa gtttcaccag 2280
gtcaggcggt gggtcgaagg ggtagcccac ggctcggccg gtcggtagtg tcactggctc 2340
gttcatattt tcggagtcta ctctcatttg atggtggact gtcaaagaag agagttctcc 2400
ggtgtgcagt tatcctgctg cggtggatca accagcgact gggttgcggg aacgcaagaa 2460
ggccagaacg aagaccgcca tccagcagca cgcgctgcgg ctgttcaagg agcacggcta 2520
ccaggccacc acggtcgagc agatc 2545
<210>2
<211>1206
<212>DNA
<213>Amycolatopsis orientalis
<220>
<221>CDS
<222>(1)..(1206)
<400>2
atg aga gta gac tcc gaa aat atg aac gag cca gtg aca cta ccg acc 48
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
ggc cga gcc gtg ggc tac ccc ttc gac cca ccg cct gac ctg gtg aaa 96
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
ctt ccg cca gtg agc ccg atg cga ttc cct gac ggc cac atc ggc tgg 144
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
ctg gtg acc agc cat gcc gca gcg cgg act gtc atg atc gat ccg cgc 192
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
ttc agc aac cgc ccc gag cac aag cac ccg gtg ttc agc gtc atc cca 240
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
cga cca ggc gga gcc acg aaa gcc ccg gcg ccc ggc tgg ttc atc aac 288
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
atg gac gca ccc gag cac acg cgt tac cgc cgg atg ctg atc agc cag 336
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
ttc acc gtg cga cgg atc aag gaa ctc gaa ccg cgg atc gtc cag atc 384
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
acc gag gac cac ctc gac gcg atg gcc aaa gca gga ccg ccg gtc gat 432
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
ctg gtg cag gcg ttc gcg ttg cca gtg cct tcg ctg gtg atc tgc gag 480
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
ctg ctg ggc gtc tcg tat gcc gat cac gcg ttc ttc caa gag cag acc 528
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
acg atc atg gcc agt gtg gac aag aca cag gat gag gtg acg acg gcg 576
Thr Ile Met Ala Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
ctt ggc aaa ctc acc cgc tac atc gct gaa ctg gtc gcc acg aag cgg 624
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
ttg agc ccc aag gac gac ctg ctt ggc agc ttg atc acg gac acc gat 672
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
ctg acc gat gaa gag ctg acg aac atc gcg ttg ctt ttg ctc gtc gcc 720
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
ggg cac gag acc acc gcg aac atg ttg gga tta ggc act ttc gcg ctc 768
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
ctt cag cat ccg gag cag atc gct gcc ctg gac agc ccg gac gcg gtg 816
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
gag gaa ctg ctg cgc tac ctc tcg atc gtc cac ctt gga acg ccc aac 864
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
cgg gcg gcg ctg gag gac gtg gag ctc gaa ggc cag atg atc agg aag 912
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
ggc gac act gtc gcg atc gga ctg ccc gcg gtc aat agg gac ccg aag 960
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
gtg ttc gac gaa ccc gat atc ctt caa ctg gac aga gtg gac gca cgc 1008
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
aag cac gcc gcg ttc ggc ggt ggt ata cac caa tgc ctc ggc cag caa 1056
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
ctg gcc aga gtg gag atg cgc atc ggc ttc acc cgg ctg ttc gcg agg 1104
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
ttc ccg tcg ctg cgg ctc gcg gtg ccg gcc gag gag atc aag ctg cgc 1152
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
gag aag tcc gcg gcg tac ggg gtt tgg gcg ctg ccc gtt gca tgg gat 1200
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
gca tag 1206
Ala
<210>3
<211>401
<212>PRT
<213>Amycolatopsis orientalis
<400>3
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Ala Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>4
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>4
atgcgtgtcg actccgaaaa catgaacgag cctgtgaccc tccccaccgg ccgtgccgtg 60
ggctacccct tcgaccctcc tcctgacctg gtgaagcttc ctcccgtgag ccccatgcgc 120
ttccctgacg gccacatcgg ctggctggtg accagccacg ccgctgcgcg tactgtcatg 180
atcgatcccc gcttcagcaa ccgccccgag cacaagcacc ctgtgttcag cgtcatcccc 240
cgccccggcg gagccactaa ggcccccgcg cccggctggt tcatcaacat ggacgccccc 300
gagcacaccc gttaccgccg catgctgatc agccagttca ccgtgcgccg tatcaaggaa 360
ctcgaacctc gtatcgtcca gatcaccgag gaccacctcg acgcgatggc caaggctgga 420
cctcctgtcg atctggtgca ggcgttcgcg ttgcctgtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtacgc cgatcacgcg ttcttccagg agcagaccac catcatggcc 540
tccgtggaca agactcagga tgaggtgacc accgcgcttg gcaagctcac ccgctacatc 600
gctgaactgg tcgccactaa gcgtttgagc cccaaggacg acctgcttgg cagcttgatc 660
actgacaccg atctgaccga tgaagagctg accaacatcg cgttgctttt gctcgtcgcc 720
ggtcacgaga ccaccgcgaa catgttggga ctcggcactt tcgcgctcct tcagcacccc 780
gagcagatcg ctgccctgga cagccccgac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacccc caaccgtgcg gcgctggagg acgtggagct cgaaggccag 900
atgatccgca agggcgacac tgtcgcgatc ggactgcccg cggtcaaccg tgaccccaag 960
gtgttcgacg aacccgatat ccttcagctg gaccgtgtgg acgctcgcaa gcacgccgcg 1020
ttcggcggtg gtattcacca gtgcctcggc cagcagctgg cccgtgtgga gatgcgcatc 1080
ggcttcaccc gtctgttcgc gcgcttcccc tcgctgcgtc tcgcggtgcc cgccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggtgtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>5
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<220>
<221>CDS
<222>(1)..(2199)
<400>5
atg cgt gtc gac tcc gaa aac atg aac gag cct gtg acc ctc ccc acc 48
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
ggc cgt gcc gtg ggc tac ccc ttc gac cct cct cct gac ctg gtg aag 96
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
ctt cct ccc gtg agc ccc atg cgc ttc cct gac ggc cac atc ggc tgg 144
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
ctg gtg acc agc cac gcc gct gcg cgt act gtc atg atc gat ccc cgc 192
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
ttc agc aac cgc ccc gag cac aag cac cct gtg ttc agc gtc atc ccc 240
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
cgc ccc ggc gga gcc act aag gcc ccc gcg ccc ggc tgg ttc atc aac 288
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
atg gac gcc ccc gag cac acc cgt tac cgc cgc atg ctg atc agc cag 336
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
ttc acc gtg cgc cgt atc aag gaa ctc gaa cct cgt atc gtc cag atc 384
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
acc gag gac cac ctc gac gcg atg gcc aag gct gga cct cct gtc gat 432
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
ctg gtg cag gcg ttc gcg ttg cct gtg cct tcg ctg gtg atc tgc gag 480
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
ctg ctg ggc gtc tcg tac gcc gat cac gcg ttc ttc cag gag cag acc 528
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
acc atc atg gcc tcc gtg gac aag act cag gat gag gtg acc acc gcg 576
Thr Ile Met Ala Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
ctt ggc aag ctc acc cgc tac atc gct gaa ctg gtc gcc act aag cgt 624
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
ttg agc ccc aag gac gac ctg ctt ggc agc ttg atc act gac acc gat 672
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
ctg acc gat gaa gag ctg acc aac atc gcg ttg ctt ttg ctc gtc gcc 720
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
ggt cac gag acc acc gcg aac atg ttg gga ctc ggc act ttc gcg ctc 768
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
ctt cag cac ccc gag cag atc gct gcc ctg gac agc ccc gac gcg gtg 816
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
gag gaa ctg ctg cgc tac ctc tcg atc gtc cac ctt gga acc ccc aac 864
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
cgt gcg gcg ctg gag gac gtg gag ctc gaa ggc cag atg atc cgc aag 912
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
ggc gac act gtc gcg atc gga ctg ccc gcg gtc aac cgt gac ccc aag 960
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
gtg ttc gac gaa ccc gat atc ctt cag ctg gac cgt gtg gac gct cgc 1008
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
aag cac gcc gcg ttc ggc ggt ggt att cac cag tgc ctc ggc cag cag 1056
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
ctg gcc cgt gtg gag atg cgc atc ggc ttc acc cgt ctg ttc gcg cgc 1104
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
ttc ccc tcg ctg cgt ctc gcg gtg ccc gcc gag gag atc aag ctg cgc 1152
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
gag aag tcc gcg gcg tac ggt gtt tgg gcg ctg ccc gtt gct tgg gat 1200
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
gcc tct agt gtg ctg cac cgt cac cag cct gtc acc atc gga gaa ccc 1248
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
gcc gcc cgt gcg gtg tcc cgc acc gtc acc gtc gag cgc ctg gac cgt 1296
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
atc gcc gac gac gtg ctg cgc ctc gtc ctg cgc gac gcc ggc gga aag 1344
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
act ctc ccc act tgg act ccc ggc gcc cac atc gac ctc gac ctc ggc 1392
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
gcg ctg tcg cgc cag tac tcc ctg tgc ggc gcg ccc gat gcg cct agc 1440
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
tac gag att gcc gtg cac ctg gat ccc gag agc cgc ggc ggt tcg cgc 1488
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
tac atc cac gaa cag ctc gag gtg gga agc cct ctc cgt atg cgc ggc 1536
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
cct cgt aac cac ttc gcg ctc gac ccc ggc gcc gag cac tac gtg ttc 1584
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
gtc gcc ggc ggc atc ggc atc acc cct gtc ctg gcc atg gcc gac cac 1632
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
gcc cgc gcc cgt gga tgg agc tac gaa ctg cac tac tgc ggc cgt aac 1680
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
cgt tcc ggc atg gcc tac ctc gag cgt gtc gcc ggt cac ggt gac cgt 1728
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
gcc gcc ctg cac gtg tcc gag gaa ggc acc cgt atc gac ctc gcc gcc 1776
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
ctc ctc gcc gag ccc gcc ccc ggc gtc cag atc tac gcg tgc ggt gcc 1824
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
ggt cgt ctg ctc gcc gga ctc gag gac gcg agc cgt aac tgg ccc gac 1872
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
ggt gcg ctg cac gtc gag cac ttc acc tcg tcc ctc gcg gcg ctc gat 1920
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
cct gac gtc gag cac gcc ttc gac ctc gaa ctg cgt gac tcg ggt ctg 1968
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
acc gtg cgt gtc gaa ccc acc cag acc gtc ctc gac gcg ttg cgc gcc 2016
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
aac aac atc gac gtg ccc agc gac tgc gag gaa ggc ctc tgc ggc tcg 2064
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
tgc gag gtc gcc gtc ctc gac ggc gag gtc gac cac cgc gac act gtg 2112
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
ctg acc aag gcc gag cgt gcg gcg aac cgt cag atg atg acc tgc tgc 2160
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
tcg cgt gcc tgc ggc gac cgt ctg gcc ctg cgt ctc taa 2199
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>6
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic DNA
<400>6
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Ala Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>7
<211>33
<212>DNA
<213>Artificial
<220>
<223>Synthetic primer
<400>7
gggggtacca tggccgagat gacagagaaa gcc 33
<210>8
<211>30
<212>DNA
<213>Artificial
<220>
<223>Synthetic primer
<400>8
gggggtacct caccaggtga ccgggagttc 30
<210>9
<211>40
<212>DNA
<213>Artificial
<220>
<223>Synthetic primer
<400>9
gggggtacca tgagagtaga ctccgaaaat atgaacgagc 40
<210>10
<211>30
<212>DNA
<213>Artificial
<220>
<223>Synthetic primer
<400>10
gggggtaccc tatgcatccc atgcaacggg 30
<210>11
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>11
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgctccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgtta 540
agtgtggaca agacacagga tgaggtgacg acagcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgctggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>12
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>12
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gaccgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tctttaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgctttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgccctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>13
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>13
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcaccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgtgctcct tcagcacccg 780
gagcagatcg ctcttctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>14
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>14
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
tttcctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcgccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc tgatcacgcg ttcttccaag aacagaccac gatcatgttt 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cctaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacaccg cgttgctttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgccctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggagcgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>15
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>15
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcaccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtccg gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctaatctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>16
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>16
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc tgatcacgcg ttcttccaag agcagaccac gatcatgctg 540
agtgtggaca agacacagga taaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>17
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>17
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccgcctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgctt 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acatggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcctcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>18
<211>1206
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>18
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctaccccc tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgttg 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg cttgcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcatgggat 1200
gcatag 1206
<210>19
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>19
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Ser Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>20
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>20
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Phe Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>21
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>21
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Thr Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Val Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Leu Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>22
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>22
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ala Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Phe Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Thr Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Ala Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>23
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>23
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Thr Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Arg Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Asn Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>24
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>24
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Lys Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>25
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>25
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp Arg Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Met Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Leu Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>26
<211>401
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>26
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Leu Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Cys Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala
<210>27
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>27
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgctccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgtta 540
agtgtggaca agacacagga tgaggtgacg acagcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgctggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>28
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>28
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gaccgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tctttaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgctttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgccctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>29
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>29
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcaccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgtgctcct tcagcacccg 780
gagcagatcg ctcttctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>30
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>30
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
tttcctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcgccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc tgatcacgcg ttcttccaag aacagaccac gatcatgttt 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cctaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacaccg cgttgctttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgccctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggagcgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>31
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>31
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcaccaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtccg gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatggtc 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctaatctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>32
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>32
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc tgatcacgcg ttcttccaag agcagaccac gatcatgctg 540
agtgtggaca agacacagga taaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>33
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>33
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctacccct tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccgcctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgctt 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg ctgtcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acatggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcctcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
<210>34
<211>2199
<212>DNA
<213>Artificial
<220>
<223>Synthetic DNA
<400>34
atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg 60
ggctaccccc tcgacccacc gcctgacctg gtgaaacttc cgccagtgag cccgatgcga 120
ttccctgacg gccacatcgg ctggctggtg accagccatg ccgcagcgcg gactgtcatg 180
atcgatccgc gcttcagcaa ccgccccgag cacaagcacc cggtgttcag cgtcatccca 240
cgaccaggcg gagccacgaa agccccggcg cccggctggt tcatcaacat ggacgcaccc 300
gagcacacgc gttaccgccg gatgctgatc agccagttca ccgtgcgacg gatcaaggaa 360
ctcgaaccgc ggatcgtcca gatcaccgag gaccacctcg acgcgatggc caaagcagga 420
ccgccggtcg atctggtgca ggcgttcgcg ttgccagtgc cttcgctggt gatctgcgag 480
ctgctgggcg tctcgtatgc cgatcacgcg ttcttccaag agcagaccac gatcatgttg 540
agtgtggaca agacacagga tgaggtgacg acggcgcttg gcaaactcac ccgctacatc 600
gctgaactgg tcgccacgaa gcggttgagc cccaaggacg acctgcttgg cagcttgatc 660
acggacaccg atctgaccga tgaagagctg acgaacatcg cgttgatttt gctcgtcgcc 720
gggcacgaga ccaccgcgaa catgttggga ttaggcactt tcgcgctcct tcagcatccg 780
gagcagatcg cttgcctgga cagcccggac gcggtggagg aactgctgcg ctacctctcg 840
atcgtccacc ttggaacgcc caaccgggcg gcgctggagg acgtggagct cgaaggccag 900
atgatcagga agggcgacac tgtcgcgatc ggactgcccg cggtcaatag ggacccgaag 960
gtgttcgacg aacccgatat ccttcaactg gacagagtgg acgcacgcaa gcacgccgcg 1020
ttcggcggtg gtatacacca atgcctcggc cagcaactgg ccagagtgga gatgcgcatc 1080
ggcttcaccc ggctgttcgc gaggttcccg tcgctgcggc tcgcggtgcc ggccgaggag 1140
atcaagctgc gcgagaagtc cgcggcgtac ggggtttggg cgctgcccgt tgcttgggat 1200
gcctctagtg tgctgcaccg tcaccagcct gtcaccatcg gagaacccgc cgcccgtgcg 1260
gtgtcccgca ccgtcaccgt cgagcgcctg gaccgtatcg ccgacgacgt gctgcgcctc 1320
gtcctgcgcg acgccggcgg aaagactctc cccacttgga ctcccggcgc ccacatcgac 1380
ctcgacctcg gcgcgctgtc gcgccagtac tccctgtgcg gcgcgcccga tgcgcctagc 1440
tacgagattg ccgtgcacct ggatcccgag agccgcggcg gttcgcgcta catccacgaa 1500
cagctcgagg tgggaagccc tctccgtatg cgcggccctc gtaaccactt cgcgctcgac 1560
cccggcgccg agcactacgt gttcgtcgcc ggcggcatcg gcatcacccc tgtcctggcc 1620
atggccgacc acgcccgcgc ccgtggatgg agctacgaac tgcactactg cggccgtaac 1680
cgttccggca tggcctacct cgagcgtgtc gccggtcacg gtgaccgtgc cgccctgcac 1740
gtgtccgagg aaggcacccg tatcgacctc gccgccctcc tcgccgagcc cgcccccggc 1800
gtccagatct acgcgtgcgg tgccggtcgt ctgctcgccg gactcgagga cgcgagccgt 1860
aactggcccg acggtgcgct gcacgtcgag cacttcacct cgtccctcgc ggcgctcgat 1920
cctgacgtcg agcacgcctt cgacctcgaa ctgcgtgact cgggtctgac cgtgcgtgtc 1980
gaacccaccc agaccgtcct cgacgcgttg cgcgccaaca acatcgacgt gcccagcgac 2040
tgcgaggaag gcctctgcgg ctcgtgcgag gtcgccgtcc tcgacggcga ggtcgaccac 2100
cgcgacactg tgctgaccaa ggccgagcgt gcggcgaacc gtcagatgat gacctgctgc 2160
tcgcgtgcct gcggcgaccg tctggccctg cgtctctaa 2199
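The nucleotide sequence above can be checked against the protein sequences that follow by translating it codon by codon. The block below is a minimal sketch, assuming the standard genetic code and reading frame 1 (the listing itself does not name a code table). The function name `translate` is illustrative and not part of the sequence listing. Only the first 60 nucleotides of the sequence above are used.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
# Assumption: the listing does not state which code table applies.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-nt groups
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First row of the nucleotide sequence above (positions 1-60).
first_60 = "atgagagtag actccgaaaa tatgaacgag ccagtgacac taccgaccgg ccgagccgtg"
print(translate(first_60))  # MRVDSENMNEPVTLPTGRAV
```

In three-letter code, these 20 residues read Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr Gly Arg Ala Val, which is the N-terminus shown for SEQ ID NO 35.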
<210>35
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>35
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Ser Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>36
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>36
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Phe Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>37
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>37
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Thr Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Val Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Leu Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>38
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>38
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ala Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Phe Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Thr Ala Leu Leu Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Ala Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Ala Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>39
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>39
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Thr Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Arg Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Val Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Asn Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>40
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>40
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Lys Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>41
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>41
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Phe Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp Arg Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Val Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Met Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Leu Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>42
<211>732
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>42
Met Arg Val Asp Ser Glu Asn Met Asn Glu Pro Val Thr Leu Pro Thr
1 5 10 15
Gly Arg Ala Val Gly Tyr Pro Leu Asp Pro Pro Pro Asp Leu Val Lys
20 25 30
Leu Pro Pro Val Ser Pro Met Arg Phe Pro Asp Gly His Ile Gly Trp
35 40 45
Leu Val Thr Ser His Ala Ala Ala Arg Thr Val Met Ile Asp Pro Arg
50 55 60
Phe Ser Asn Arg Pro Glu His Lys His Pro Val Phe Ser Val Ile Pro
65 70 75 80
Arg Pro Gly Gly Ala Thr Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn
85 90 95
Met Asp Ala Pro Glu His Thr Arg Tyr Arg Arg Met Leu Ile Ser Gln
100 105 110
Phe Thr Val Arg Arg Ile Lys Glu Leu Glu Pro Arg Ile Val Gln Ile
115 120 125
Thr Glu Asp His Leu Asp Ala Met Ala Lys Ala Gly Pro Pro Val Asp
130 135 140
Leu Val Gln Ala Phe Ala Leu Pro Val Pro Ser Leu Val Ile Cys Glu
145 150 155 160
Leu Leu Gly Val Ser Tyr Ala Asp His Ala Phe Phe Gln Glu Gln Thr
165 170 175
Thr Ile Met Leu Ser Val Asp Lys Thr Gln Asp Glu Val Thr Thr Ala
180 185 190
Leu Gly Lys Leu Thr Arg Tyr Ile Ala Glu Leu Val Ala Thr Lys Arg
195 200 205
Leu Ser Pro Lys Asp Asp Leu Leu Gly Ser Leu Ile Thr Asp Thr Asp
210 215 220
Leu Thr Asp Glu Glu Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala
225 230 235 240
Gly His Glu Thr Thr Ala Asn Met Leu Gly Leu Gly Thr Phe Ala Leu
245 250 255
Leu Gln His Pro Glu Gln Ile Ala Cys Leu Asp Ser Pro Asp Ala Val
260 265 270
Glu Glu Leu Leu Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn
275 280 285
Arg Ala Ala Leu Glu Asp Val Glu Leu Glu Gly Gln Met Ile Arg Lys
290 295 300
Gly Asp Thr Val Ala Ile Gly Leu Pro Ala Val Asn Arg Asp Pro Lys
305 310 315 320
Val Phe Asp Glu Pro Asp Ile Leu Gln Leu Asp Arg Val Asp Ala Arg
325 330 335
Lys His Ala Ala Phe Gly Gly Gly Ile His Gln Cys Leu Gly Gln Gln
340 345 350
Leu Ala Arg Val Glu Met Arg Ile Gly Phe Thr Arg Leu Phe Ala Arg
355 360 365
Phe Pro Ser Leu Arg Leu Ala Val Pro Ala Glu Glu Ile Lys Leu Arg
370 375 380
Glu Lys Ser Ala Ala Tyr Gly Val Trp Ala Leu Pro Val Ala Trp Asp
385 390 395 400
Ala Ser Ser Val Leu His Arg His Gln Pro Val Thr Ile Gly Glu Pro
405 410 415
Ala Ala Arg Ala Val Ser Arg Thr Val Thr Val Glu Arg Leu Asp Arg
420 425 430
Ile Ala Asp Asp Val Leu Arg Leu Val Leu Arg Asp Ala Gly Gly Lys
435 440 445
Thr Leu Pro Thr Trp Thr Pro Gly Ala His Ile Asp Leu Asp Leu Gly
450 455 460
Ala Leu Ser Arg Gln Tyr Ser Leu Cys Gly Ala Pro Asp Ala Pro Ser
465 470 475 480
Tyr Glu Ile Ala Val His Leu Asp Pro Glu Ser Arg Gly Gly Ser Arg
485 490 495
Tyr Ile His Glu Gln Leu Glu Val Gly Ser Pro Leu Arg Met Arg Gly
500 505 510
Pro Arg Asn His Phe Ala Leu Asp Pro Gly Ala Glu His Tyr Val Phe
515 520 525
Val Ala Gly Gly Ile Gly Ile Thr Pro Val Leu Ala Met Ala Asp His
530 535 540
Ala Arg Ala Arg Gly Trp Ser Tyr Glu Leu His Tyr Cys Gly Arg Asn
545 550 555 560
Arg Ser Gly Met Ala Tyr Leu Glu Arg Val Ala Gly His Gly Asp Arg
565 570 575
Ala Ala Leu His Val Ser Glu Glu Gly Thr Arg Ile Asp Leu Ala Ala
580 585 590
Leu Leu Ala Glu Pro Ala Pro Gly Val Gln Ile Tyr Ala Cys Gly Ala
595 600 605
Gly Arg Leu Leu Ala Gly Leu Glu Asp Ala Ser Arg Asn Trp Pro Asp
610 615 620
Gly Ala Leu His Val Glu His Phe Thr Ser Ser Leu Ala Ala Leu Asp
625 630 635 640
Pro Asp Val Glu His Ala Phe Asp Leu Glu Leu Arg Asp Ser Gly Leu
645 650 655
Thr Val Arg Val Glu Pro Thr Gln Thr Val Leu Asp Ala Leu Arg Ala
660 665 670
Asn Asn Ile Asp Val Pro Ser Asp Cys Glu Glu Gly Leu Cys Gly Ser
675 680 685
Cys Glu Val Ala Val Leu Asp Gly Glu Val Asp His Arg Asp Thr Val
690 695 700
Leu Thr Lys Ala Glu Arg Ala Ala Asn Arg Gln Met Met Thr Cys Cys
705 710 715 720
Ser Arg Ala Cys Gly Asp Arg Leu Ala Leu Arg Leu
725 730
<210>43
<211>16
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>43
Lys Ala Pro Ala Pro Gly Trp Phe Ile Asn Met Asp Ala Pro Glu His
1 5 10 15
<210>44
<211>18
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>44
Leu Thr Asn Ile Ala Leu Leu Leu Leu Val Ala Gly His Glu Thr Thr
1 5 10 15
Ala Asn
<210>45
<211>13
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>45
Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn Arg
1 5 10
<210>46
<211>13
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>46
Glu Glu Ile Lys Leu Arg Glu Lys Ser Ala Ala Tyr Gly
1 5 10
<210>47
<211>12
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>47
Phe Gln Glu Gln Thr Thr Ile Met Ala Ser Val Asp
1 5 10
<210>48
<211>16
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>48
Lys Ala Pro Ala Pro Gly Trp Phe Ala Asn Met Asp Ala Pro Glu His
1 5 10 15
<210>49
<211>16
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>49
Lys Ala Pro Ala Pro Gly Trp Phe Thr Asn Met Asp Ala Pro Glu His
1 5 10 15
<210>50
<211>16
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>50
Lys Ala Pro Ala Pro Gly Trp Phe Phe Asn Met Asp Ala Pro Glu His
1 5 10 15
<210>51
<211>18
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>51
Leu Thr Asn Thr Ala Leu Leu Leu Leu Val Ala Gly His Glu Thr Thr
1 5 10 15
Ala Asn
<210>52
<211>18
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>52
Leu Thr Asn Ile Ala Leu Ile Leu Leu Val Ala Gly His Glu Thr Thr
1 5 10 15
Ala Asn
<210>53
<211>18
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>53
Leu Thr Asn Ile Ala Leu Pro Leu Leu Val Ala Gly His Glu Thr Thr
1 5 10 15
Ala Asn
<210>54
<211>13
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>54
Arg Tyr Leu Ser Ile Val His Leu Gly Ala Pro Asn Arg
1 5 10
<210>55
<211>13
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>55
Glu Glu Ile Lys Leu Arg Glu Lys Ser Thr Ala Tyr Gly
1 5 10
<210>56
<211>12
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>56
Phe Gln Glu Gln Thr Thr Ile Met Thr Ser Val Asp
1 5 10
<210>57
<211>12
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>57
Phe Gln Glu Gln Thr Thr Ile Met Val Ser Val Asp
1 5 10
<210>58
<211>12
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>58
Phe Gln Glu Gln Thr Thr Ile Met Leu Ser Val Asp
1 5 10
<210>59
<211>12
<212>PRT
<213>Artificial
<220>
<223>Synthetic protein
<400>59
Phe Gln Glu Gln Thr Thr Ile Met Phe Ser Val Asp
1 5 10
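The three-letter rows in the listing above are easier to compare or search once they are converted to one-letter code. The following is a minimal sketch of that conversion; the helper name `to_one_letter` is hypothetical, and the mapping covers only the 20 standard residues, which is an assumption about what the listing contains.

```python
# Three-letter to one-letter amino acid codes (the 20 standard residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing: str) -> str:
    """Convert whitespace-separated three-letter residues to a one-letter string.

    Numeric tokens (the position rulers printed under each row) are skipped.
    """
    residues = [tok for tok in listing.split() if not tok.isdigit()]
    return "".join(THREE_TO_ONE[tok] for tok in residues)


# SEQ ID NO 45 as printed in the listing, including its position ruler.
seq45 = """Arg Tyr Leu Ser Ile Val His Leu Gly Thr Pro Asn Arg
1 5 10"""
print(to_one_letter(seq45))  # RYLSIVHLGTPNR
```

A token that is not a standard residue code raises `KeyError`, which makes the helper a cheap way to catch transcription errors in a copied listing.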
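Claims
1. A polypeptide selected from the group consisting of: a polypeptide having an amino acid sequence according to SEQ ID NO 3, SEQ ID NO 6, or SEQ ID NO 43-59; a polypeptide having an amino acid sequence with a degree of identity of at least 50% to SEQ ID NO 3; a polypeptide having an amino acid sequence with a degree of identity of at least 60% to SEQ ID NO 6; and a polypeptide having an amino acid sequence that differs from SEQ ID NO 43-59 by no more than 3 amino acids.
2. The polypeptide according to claim 1, having an amino acid sequence according to SEQ ID NO 3, SEQ ID NO 6, SEQ ID NO 19-26, or SEQ ID NO 35-59, or having an amino acid sequence with a degree of identity of at least 90% to SEQ ID NO 3, SEQ ID NO 6, SEQ ID NO 19-26, or SEQ ID NO 35-59.
3. A polypeptide capable of converting compactin into pravastatin with an efficiency of at least 50%.
4. A polynucleotide comprising a DNA sequence encoding the polypeptide of any one of claims 1 to 3.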
5. The polynucleotide of claim 4, which is SEQ ID NO 1, 2, 4, or 5.
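6. A method for producing pravastatin, comprising the steps of:
(i) expressing the polynucleotide of any one of claims 4 to 5 in a production host;
(ii) culturing the production host obtained in step (i);
(iii) isolating pravastatin from the mixture obtained in step (ii).
7. A method for isolating a polynucleotide encoding a polypeptide capable of promoting the conversion of compactin into pravastatin, the method comprising the steps of:
(i) transforming host cells with the polynucleotide of any one of claims 4 to 5;
(ii) selecting clones of the transformed cells for their ability to hydroxylate compactin;
(iii) re-transforming these isolated clones with a plurality of polynucleotides;
(iv) selecting clones of the transformed cells for their ability to hydroxylate compactin;
(v) isolating the plasmids;
(vi) sequencing the inserts of the plasmids.
8. The method according to claim 6, further comprising co-expressing the isolated polynucleotide according to claim 7 in the production host obtained in step (i).
9. The method according to claim 8, wherein compactin is added during growth of the production host.
10. The method according to any one of claims 8 to 9, wherein the production host is a fungal cell or a bacterial cell.
11. The method according to claim 10, wherein the fungal cell is a yeast or a filamentous fungal cell, and the bacterial cell is selected from the group consisting of Actinomycetes and Proteobacteria.
12. The method according to claim 11, wherein the yeast is Saccharomyces cerevisiae, Hansenula polymorpha, Kluyveromyces lactis, or Pichia pastoris; the filamentous fungal cell is Aspergillus terreus, Aspergillus nidulans, Aspergillus niger, Penicillium citrinum, Penicillium chrysogenum, Monascus ruber, or Monascus paxii; the Actinomycete is Streptomyces, Amycolatopsis, or Actinomadura; and the Proteobacterium is Escherichia or Bacillus.
13. The method according to claim 12, wherein the Streptomyces is Streptomyces carbophilus, Streptomyces lividans, Streptomyces coelicolor, or Streptomyces clavuligerus; the Amycolatopsis is Amycolatopsis orientalis; the Escherichia is Escherichia coli; and the Bacillus is Bacillus amyloliquefaciens, Bacillus licheniformis, or Bacillus subtilis.
14. A pharmaceutical composition comprising pravastatin obtained according to any one of claims 6 and 8 to 13.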
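Claims 1 and 2 are framed in terms of percent identity and a maximum number of differing amino acids. As an illustration only, the sketch below compares two of the short sequences from the listing position by position. It assumes an ungapped comparison of equal-length sequences. The claims do not state in this section how identity is to be computed, and a real assessment would normally rely on a sequence alignment. The function names `count_differences` and `percent_identity` are hypothetical.

```python
def count_differences(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for an ungapped comparison")
    return sum(x != y for x, y in zip(a, b))


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions (ungapped, equal-length assumption)."""
    return 100.0 * (len(a) - count_differences(a, b)) / len(a)


seq43 = "KAPAPGWFINMDAPEH"  # SEQ ID NO 43 in one-letter code
seq48 = "KAPAPGWFANMDAPEH"  # SEQ ID NO 48 in one-letter code

print(count_differences(seq43, seq48))  # 1
print(percent_identity(seq43, seq48))   # 93.75
```

Here the two 16-residue sequences differ only at position 9 (Ile in SEQ ID NO 43, Ala in SEQ ID NO 48), so SEQ ID NO 48 falls within "no more than 3 amino acids" of SEQ ID NO 43 under this simplified comparison.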
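Abstract
The present invention provides polypeptides having an amino acid sequence according to SEQ ID NO 3, SEQ ID NO 6, or SEQ ID NO 43-59. The invention also provides polynucleotides comprising a DNA sequence encoding these polypeptides, and a method for isolating polynucleotides encoding polypeptides capable of promoting the conversion of compactin into pravastatin. In addition, the invention provides methods for producing pravastatin and pharmaceutical compositions comprising pravastatin.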
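Document No.: C12P7/62 GK101558152 SQ200780046270
Publication date: October 14, 2009; Filing date: December 11, 2007; Priority date: December 13, 2006
Inventors: 保罗·克莱斯森, 阿德里安努斯·维尔赫穆斯·赫曼努斯·沃勒布里吉特, 马尔科·亚田山大·范德勃戈, 马库斯·汉斯, 简·米特斯卡·范德拉恩; Applicant: 帝斯曼知识产权资产管理有限公司 (DSM IP Assets B.V.)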