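Patent title: Method for producing sugarcane GH32-family soluble acid invertase (SoINV) genes and their protein sequences
Technical field:
The present invention relates to sugarcane GH32-family soluble acid invertase (SoINV) genes and their protein sequences.
Background art: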
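Soluble acid invertase (hereinafter INV) is a key enzyme controlling sucrose metabolism in higher plants; it irreversibly catalyzes the cleavage of sucrose into glucose and fructose. It is located mainly in the vacuole, plays an important role in regulating hexose and sucrose levels, and is closely associated with fruit development, ripening, and sugar accumulation. Studies of seedlings, young leaves, young roots, and young fruits of various plants have found that high INV activity often accompanies fruit development or the rapid expansion of storage organs. Lingle showed that in sweet sorghum stalks the activity of soluble acid invertase (SAI) is strongly and positively correlated with internode length, whereas Li Xingjun, studying flower bud development in Chinese bayberry, found a significant negative correlation between sucrose content and INV activity. INV is mainly responsible for degrading sucrose in the vacuole, and the absence of its activity is a prerequisite for the onset of sucrose accumulation.

Plants with high sugar content, such as sweet sorghum, sugar beet, citrus, and sugarcane, are all characterized by a lack of acid INV activity. A comparison of sucrose accumulation in sweet and non-sweet fruits of Cucurbitaceae plants found that, in non-sweet fruits, imported sucrose may be compartmentalized in the vacuole, where INV breaks it down into hexoses. Further study showed that the two genotypes differ not in their capacity to import and compartmentalize sucrose but in how the imported sucrose is subsequently metabolized: a lack of INV blocks this step, so sucrose accumulates and the fruit becomes sweet.

INV activity is expressed in a stage-specific and tissue-specific manner during plant growth and development. Potato INV is highly expressed in young source organs such as leaves, roots, and germinating seeds, but only weakly in sink cells. Treatment with GA (gibberellin) lowers sucrose levels by inducing higher acid INV activity, which promotes vigorous shoot growth and consumes reducing sugars. Expression of the INV gene is also regulated by sugars, ethylene, and ABA, although the exact regulatory mechanisms remain unclear.

The rise in INV activity during tomato fruit ripening is controlled at the mRNA level. In fruits of tomato plants transformed with an antisense INV gene, INV activity was very low compared with non-transgenic plants and GUS-transgenic plants. Sucrose content in non-transgenic tomato fruits was extremely low, whereas in all transgenic plants carrying the antisense gene the sucrose content rose markedly and the hexose content fell, indicating that high INV activity in ripe non-transgenic fruits prevents sucrose accumulation. Although INV activity was effectively suppressed in the antisense plants, their fruits still accumulated a certain amount of hexose at full ripeness.

Multiple invertase isoenzymes exist in higher plant tissues. By pH optimum they fall into two major classes: acid invertases and neutral/alkaline invertases. By subcellular location they fall into three classes: cell wall invertases, vacuolar invertases, and cytoplasmic invertases. Because cell wall and vacuolar invertases both have acidic pH optima, they are grouped together as acid invertases. All acid invertases are glycosylated, and they catalyze the hydrolysis not only of sucrose but also of some oligosaccharides such as raffinose and stachyose. The glycosyl hydrolases (glycosyl hydrolase families) are divided into four gene families: GH32, GH43, GH62, and GH68. The sugarcane soluble acid invertase (SoINV) genes of the present invention belong to the GH32 family. To date, only some INV cDNA sequence fragments have been cloned from sugarcane, and no further studies have been reported.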
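Summary of the invention

The object of the present invention is to provide sugarcane GH32-family soluble acid invertase (SoINV) genes and their protein sequences, in order to support further study of the function of these genes and their use in improving crop varieties.

The technical solution adopted by the invention is as follows. The sugarcane GH32-family soluble acid invertase (SoINV) genes and their protein sequences are obtained by first performing a phylogenetic analysis of all full-length plant soluble acid invertase gene sequences, aligning the cDNA sequences of soluble acid invertase genes of the same family, and designing primers to amplify the core fragment of the SoINV gene. Total RNA is extracted from young sugarcane leaves and reverse transcribed, and the SoINV core fragment is amplified by conventional PCR. Several pairs of specific primers are then designed at the 5' end of the core fragment, and one specific primer and two anchor primers at its 3' end. RACE PCR yields three different sequence fragments 5' of the SoINV core region and one sequence fragment 3' of it. Each of the three 5' sequences is assembled with the core region, the 3' sequence, and the AY302083 fragment, giving three full-length cDNA sequences of sugarcane GH32-family soluble acid invertase (SoINV) genes, designated SoINV1, SoINV2, and SoINV3.

SoINV1 has a total length of 2387 bp. Analysis with Vector NTI Advance 11 shows an ORF of 2055 bp encoding 685 amino acids; the start codon (ATG) lies at 215 bp from the transcription start site and the stop codon (TAG) at 2272 bp, followed by a 115 bp non-coding sequence bearing the poly(A) tail typical of eukaryotes. The amino acid sequence of the protein encoded by the sugarcane GH32-family SoINV1 gene is:
METRDTTAPLPYSYTPLPAADAASAEVTGTGGRSRRRSLCAAALVLSAALLLAVAALAAAGRRPTTAVG
ETAGVGVVPGVGTPQATSTRSISRGPDAGVSEKTSGAWSGVVDDGGRLRADGGGNAFPWSNAMLQWQRTGFHFQPQR
NWMNDPNGPVYYKGWYHLFYQYNPDGAIWGNKIAWGHAVSRDLIHWRHLPLAMLPDQWYDTNGVWTGSATTLPDGRL
AMLYTGSTNTSVQVQCLAVPADDDDPLLTNWTKYEGNPALYPPPGIGPRDFRDPTTAWFDPSDSTWRIVIGSKDDAE
GDHAGIAVVYRTRDFVHFELLPDLLHRVAGTGMWECIDFYPVATRGKASGNGVDMSDALAKNGAVVGDVVHVMKASM
DDDRHDYYALGRYDAAANAWTPLDAEKDVGTGLRYDWGKFYASKTFYDPAKRRRVLWGWVGETDSERADVSKGWASL
QGIPRTVLLDTKTGSNLLQWPVEEVETLRTNSTDLSGITIDYGSTFPLNLRRATQLDIEAEFELDRRAVMSLNEAD
VGYNCSTSGGAAARGALGPFGLLVLTDKHLHEQTAVYFYVAKGLDGSLTTHFCQDESRSSSANDIVKRVVGSAVPVL
EDETTLSLRVLVDHSIVESFAQGGRSTATSRVYPTKAIYANAGVFLFNNATAARVTAKKLVVHEMDSSYNHDYMVTDI

SoINV2 has a total length of 2429 bp. Analysis with Vector NTI Advance 11 shows an ORF of 2085 bp encoding 695 amino acids; the start codon (ATG) lies at 227 bp from the transcription start site and the stop codon (TAA) at 2314 bp, followed by a 115 bp non-coding sequence bearing the poly(A) tail typical of eukaryotes. The amino acid sequence of the protein encoded by the sugarcane GH32-family SoINV2 gene is:
METRDTTAPLPYSYTPLPAADAASAEVTGTGHRGGGRSRRSSLCAAALVLSAALLLAVAALAGVGGRVA
VVPRPTTAVGETAGVGVGPGAGTPQATSTRSISRGPDAGVSEKTSGAWSGVVDDGGRLRADGGGNAFPWSNAMLQWQ
RTGFHFQPQRNWMNDPNGPVYYKGWYHLFYQYNPDGAIWGNKIAWGHAVSRDLIHWRHLPLAMLPDQWYDTNGVWTG
SATTLPDGRLAMLYTGSTNTSVQVQCLAVPAGDDDPLLTNWTKYEGNPALYPPPGIGPRDFRDPTTAWFDPSDSTWR
IVIGSKDDAEGDHAGIAVVYRTRDFVHFELLPDLLHRVAGTGMWECIDFYPVATRGKASGNGVDMSDALAKNGAVVG
DVVHVMKASMDDDRHDYYALGRYDAAANAWTPLDAEKDVGTGLRYDWGKFYASKTFYDPAKRRRVLWGWVGETDSER
ADVSKGWASLQGIPRTVLLDTKTGSNLLQWPVEEVETLRTNSTDLSGITIDYGSTFPLNLRRATQLDIEAEFELDRR
AVMSLNEADVGYNCSTSGGAAARGALGPFGLLVLTDKHLHEQTAVYFYVAKGLDGSLTTHFCQDESRSSSANDIVKR
VVGSAVPVLEDETTLSLRVLVDHSIVESFAQGGRSTATSRVYPTKAIYANAGVFLFNNATAARVTAKKLVVHEMDSS
YNHDYMVTDI

SoINV3 has a total length of 2373 bp. Analysis with Vector NTI Advance 11 shows an ORF of 2088 bp encoding 696 amino acids; the start codon (ATG) lies at 168 bp from the transcription start site and the stop codon (TAA) at 2258 bp, followed by a 115 bp non-coding sequence bearing the poly(A) tail typical of eukaryotes. The amino acid sequence of the protein encoded by the sugarcane GH32-family SoINV3 gene is:
METRDTTAPLPYSYTPLPAADAASAEVTGTGGSRSRRRRPLCAAALVLSAALLLAVAALAGVGSRVAAV
VPRPTTAVGETAGVGVGVVPGAGTPQATSTRSRSRGPDAGVSEKTSGVWTGVIDDGARLRTDAGGNAFPWSNAMLQW
QRTGFHFQPQRNWMNDPNGPVYYKGWYHLFYQYNPDGAIWGNKIAWGHAVSRDLIHWRHLPLAMLPDQWYDTNGVWT
GSATTLPDGRLAMLYTGSTNTSVQVQCLAVPADDDDPLLTNWTKYEGNPALYPPPGIGPRDFRDPTTAWFDPSDSTW
RIVIGSKDDAEGDHAGIAVVYRTRDFVHFELLPDLLHRVAGTGMWECIDFYPVATRGKASGNGVDMSDALAKNGAVV
GDVVHVMKASMDDDRHDYYALGRYDAAANAWTPLDAEKDVGTGLRYDWGKFYASKTFYDPAKRRRVLWGWVGETDSE
RADVSKGWASLQGIPRTVLLDTKTGSNLLQWPVEEVETLRTNSTDLSGITIDYGSTFPLNLRRATQLDIEAEFELDR
RAVMSLNEADVGYNCSTSGGAAARGALGPFGLLVLTDKHLHEQTAVYFYVAKGLDGSLTTHFCQDESRSSSANDIVK
RVVGSAVPVLEDETTLSLRVLVDHSIVESFAQGGRSTATSRVYPTKAIYANAGVFLFNNATAARVTAKKLVVHEMDS
SYNHDYMVTDI

For the three full-length SoINV cDNA sequences described above, the core region sequence (M-SoINV) is amplified with the following primers:
Nf1: 5'-tctggggcaacaagatcgcgt-3';
Nr1: 5'-aaattgggtgcagcggtgggt-3'.

The 3'-end sequence (3'-SoINV) is obtained by nested amplification using the following primer in combination with the primers supplied with the 3'-Full RACE Kit (TAKARA):
GZ f2: 5'-cctcttcaacaacgccaccgccg-3'.

The 5'-end sequences (5'-SoINV) are each obtained by nested amplification using the following primers in combination with the primers supplied with the 5'-Full RACE Kit (TAKARA).
Primers for amplifying the 5' end of the sugarcane GH32-family soluble acid invertase SoINV1 gene:
INVSYR1 (5'-gttggtggagccggtgtagagcat-3'); INVR1 (5'-gcccctgctgatgctcctggtcg-3');
Primers for amplifying the 5' end of the SoINV2 gene:
INVSYR1 (5'-gttggtggagccggtgtagagcat-3'); INVR2 (5'-acgtcgcctgtggtgtcccc-3');
Primers for amplifying the 5' end of the SoINV3 gene:
INVSYR1 (5'-gttggtggagccggtgtagagcat-3'); INVR3 (5'-ggggtcgttcatccagttcctct-3').

Advantages and effects of the invention: the sugarcane GH32-family soluble acid invertase (SoINV) gene sequences of the invention lay a foundation for studying the transcription and expression mechanisms of sugarcane soluble acid invertase and for further exploring the mechanism of sucrose accumulation. In addition, the amino acid sequences make it possible to obtain biologically active purified protein, providing a basis for studying the biological function of soluble acid invertase and for using GH32-family SoINV genes to improve sugarcane varieties.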
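The lengths and codon positions reported for SoINV1, SoINV2, and SoINV3 can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the patented method. It assumes that the reported ORF length excludes the stop codon, that the ATG position is the 1-based position of its first base, and that the stop-codon position is the 1-based position of its last base. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CdnaReport:
    """Figures quoted in the text for one full-length cDNA (1-based positions)."""
    name: str
    total_bp: int   # full cDNA length
    orf_bp: int     # ORF length, assumed to exclude the stop codon
    residues: int   # encoded amino acids
    atg_pos: int    # position of the first base of ATG
    stop_end: int   # position of the last base of the stop codon
    utr3_bp: int    # non-coding sequence after the stop codon


REPORTS = [
    CdnaReport("SoINV1", 2387, 2055, 685, 215, 2272, 115),
    CdnaReport("SoINV2", 2429, 2085, 695, 227, 2314, 115),
    CdnaReport("SoINV3", 2373, 2088, 696, 168, 2258, 115),
]


def check(report: CdnaReport) -> dict:
    """Return one boolean per consistency rule for a reported cDNA."""
    return {
        # Three bases per encoded residue.
        "orf_matches_residues": report.orf_bp == 3 * report.residues,
        # ATG through the end of the stop codon spans the ORF plus 3 bases.
        "cds_span_matches_orf": report.stop_end - report.atg_pos + 1 == report.orf_bp + 3,
        # Everything after the stop codon is the 3' non-coding sequence.
        "utr3_matches_total": report.total_bp - report.stop_end == report.utr3_bp,
    }


for report in REPORTS:
    print(report.name, check(report))
```

Under these assumptions all three rules hold for each of the three sequences, so the reported totals, ORF lengths, and codon positions are mutually consistent.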
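Figure 1: full-length cDNA sequence of the sugarcane GH32-family soluble acid invertase (SoINV1) gene;
Figure 2: amino acid sequence of the protein encoded by the sugarcane GH32-family soluble acid invertase (SoINV1) gene;
Figure 3: full-length cDNA sequence of the sugarcane GH32-family soluble acid invertase (SoINV2) gene;
Figure 4: amino acid sequence of the protein encoded by the sugarcane GH32-family soluble acid invertase (SoINV2) gene;
Figure 5: full-length cDNA sequence of the sugarcane GH32-family soluble acid invertase (SoINV3) gene;
Figure 6: amino acid sequence of the protein encoded by the sugarcane GH32-family soluble acid invertase (SoINV3) gene.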
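Detailed description of embodiments

Example 1 (sugarcane GH32-family SoINV1)

1. Phylogenetic analysis of plant INV genes
The non-redundant protein database at the NCBI website (http://www.ncbi.nlm.nih.gov) was searched with "soluble acid invertase" to retrieve all plant INV sequences. All retrieved sequences were analyzed and duplicates were removed, giving 20 plant INV sequences, 7 of them full length. The amino acid sequences of all full-length INVs were aligned with ClustalX and the alignment was saved in PHYLIP format; the Seqboot, Protdist, Neighbor, and Consense programs of the PHYLIP package were then used for phylogenetic analysis, assigning all full-length plant INV genes to their respective families.

2. Extraction of sugarcane RNA
(1) Take fresh sugarcane spindle leaves (variety Guitang 28) and grind them thoroughly to a powder in liquid nitrogen. Weigh 100 mg of the ground material into a 1.5 ml centrifuge tube precooled with liquid nitrogen and add 1 ml of TRNzol-A+ extraction reagent (Tiangen Biotech Co., Ltd.);
(2) Mix the sample thoroughly and leave it at room temperature for 5 min so that the nucleoprotein complexes dissociate completely;
(3) Centrifuge at 4°C, 14,000 g for 10 min;
(4) Transfer the supernatant to another 1.5 ml centrifuge tube, add chloroform at 1/5 the volume of the supernatant, shake thoroughly for 15 s, and leave at room temperature for 2-3 min;
(5) Centrifuge at 4°C, 14,000 g for 15 min;
(6) Repeat steps (4) and (5) once;
(7) Transfer the upper aqueous phase to another 1.5 ml centrifuge tube, add an equal volume of precooled isopropanol, mix for 10 min, and leave on ice for 25 min;
(8) Centrifuge at 4°C, 14,000 g for 10 min; a white gelatinous pellet forms on the side and bottom of the tube;
(9) Discard the supernatant and keep the pellet. Wash the pellet twice with 1 ml of 75% ethanol (each time centrifuging at 4°C, 2,300 g for 3 min);
(10) Pour off the liquid and leave at room temperature for 3-5 min. Add 30-50 μl of DEPC-treated water, dissolve completely, and store at -70°C;
(11) Check 1 μl of the RNA sample by electrophoresis on a 1% agarose gel, and use another 1 μl to measure the OD value.

3. RT-PCR (reverse transcription PCR)
(1) Prepare a 0.2 ml centrifuge tube and add the following components: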
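Sugarcane total RNA 0.1-5 μg; oligo(dT)18 primer (0.5 μg/μl) 1 μl; DEPC-treated water to a total volume of 12 μl.
(2) Mix gently, centrifuge briefly, incubate at 65°C for 5 min, and chill on ice.
(3) Add the following components as specified: 5× Reaction Buffer 4 μl; RiboLock RNase Inhibitor (20 u/μl) 1 μl; 10 mM dNTP Mix 2 μl; RevertAid M-MuLV Reverse Transcriptase 1 μl; total volume 20 μl.
(4) Mix gently and centrifuge briefly.
(5) Incubate at 42°C for 60 min.
(6) Incubate at 70°C for 5 min to stop the reaction; store at -20°C until use.

4. Cloning of the middle fragment of the SoINV gene
The cDNA sequences of monocot INV genes were aligned in Vector NTI Advance 11, and a pair of primers for amplifying the SoINV middle fragment was designed in their conserved region. Conventional PCR was performed with primers Nf1 and Nr1; the PCR product was gel-purified, cloned into a T vector, and sequenced. Sequencing showed a PCR product of 1601 bp (M-SoINV).
Nf1: 5'-tctggggcaacaagatcgcgt-3';
Nr1: 5'-aaattgggtgcagcggtgggt-3'.

5. 3'-RACE cloning of the SoINV gene
Based on the middle-fragment sequence obtained in the previous step, one gene-specific primer for 3'-RACE was designed:
GZ f2: 5'-cctcttcaacaacgccaccgccg-3'.
Anchor primers supplied with the 3'-Full RACE Kit (TAKARA):
GZ r1: 5'-ggccacgcgtcgactagtac-3';
GZ r2: 5'-ggccacgcgtcgactagtacttttttttttttttt-3'.
Total RNA was extracted from young sugarcane leaves and, following the RT-PCR method above, reverse transcribed into cDNA with the specific primer GZ r2 as the reverse transcription primer. Conventional PCR was then performed with primers GZ f2 and GZ r1. The product was gel-purified, cloned into a T vector, and sequenced. Sequencing showed a product of 218 bp (3'-SoINV).

6. 5'-RACE cloning of the SoINV gene
With reference to the middle fragment and the upstream sequence of GenBank entry AY302083, two gene-specific primers for 5'-RACE were designed. The two gene-specific 5'-RACE primers for the SoINV1 gene are:
INVSYR1: 5'-gttggtggagccggtgtagagcat-3';
INVR1: 5'-gcccctgctgatgctcctggtcg-3'.
Primers supplied with the 5'-Full RACE Kit (TAKARA):
Outer Primer: 5'-catggctacatgctgacagccta-3';
Inner Primer: 5'-cgcggatccacagcctactgatgatcagtcgatg-3'.
Total RNA was extracted from young sugarcane leaves and 5'-RACE was performed according to the 5'-Full RACE Kit (TAKARA) manual. Nested PCR: the product of the PCR with the outer primers INVSYR1 and Outer Primer was used as the template for a second round of PCR with the inner primers INVR1 and Inner Primer. The PCR product was gel-purified, cloned into a T vector, and sequenced. Sequencing showed a 5'-RACE product of 496 bp (5'-SoINV1).

7. Analysis of the full-length nucleotide sequence of the sugarcane GH32-family soluble acid invertase SoINV1 gene
The three sequences obtained (5'-SoINV1, M-SoINV, and 3'-SoINV) were assembled with the AY302083 fragment to give the full-length cDNA sequence of the soluble acid invertase (SoINV1) gene, 2387 bp in total. Analysis with Vector NTI Advance 11 showed an ORF of 2055 bp encoding 685 amino acids. The start codon (ATG) lies at 215 bp from the transcription start site and the stop codon (TAA) at 2272 bp, followed by a 115 bp non-coding sequence bearing the poly(A) tail typical of eukaryotes. The gene and protein sequences of the sugarcane GH32-family soluble acid invertase SoINV1 are given in Sequence Listing 1.

Example 2 (sugarcane GH32-family SoINV2)

Steps 1 to 5 are the same as in Example 1.

6. 5'-RACE cloning of the SoINV gene
With reference to the middle fragment and the upstream sequence of GenBank entry AY302083, two gene-specific primers for 5'-RACE were designed. The two gene-specific 5'-RACE primers for the SoINV2 gene are:
INVSYR1 (5'-gttggtggagccggtgtagagcat-3');
INVR2 (5'-acgtcgcctgtggtgtcccc-3').
The primers supplied with the 5'-Full RACE Kit (TAKARA) are the same as in Example 1.
Total RNA was extracted from young sugarcane leaves and 5'-RACE was performed according to the 5'-Full RACE Kit (TAKARA) manual. Nested PCR: the product of the PCR with the outer primers INVSYR1 and Outer Primer was used as the template for a second round of PCR with the inner primers INVR2 and Inner Primer. The PCR product was gel-purified, cloned into a T vector, and sequenced. Sequencing showed a 5'-RACE product of 515 bp (5'-SoINV2).

7. Analysis of the full-length nucleotide sequence of the sugarcane GH32-family soluble acid invertase SoINV2 gene
The three sequences obtained (5'-SoINV2, M-SoINV, and 3'-SoINV) were assembled with the AY302083 fragment to give the full-length cDNA sequence of the SoINV2 gene, 2429 bp in total. Analysis with Vector NTI Advance 11 showed an ORF of 2085 bp encoding 695 amino acids. The start codon (ATG) lies at 227 bp from the transcription start site and the stop codon (TAA) at 2314 bp, followed by a 115 bp non-coding sequence bearing the poly(A) tail typical of eukaryotes. The gene and protein sequences of the sugarcane GH32-family soluble acid invertase SoINV2 are given in Sequence Listing 2.

Example 3 (sugarcane GH32-family SoINV3)

Steps 1 to 5 are the same as in Example 1.

6. 5'-RACE cloning of the SoINV gene
With reference to the middle fragment and the upstream sequence of GenBank entry AY302083, two gene-specific primers for 5'-RACE were designed. The two gene-specific 5'-RACE primers for the SoINV3 gene are:
INVSYR1 (5'-gttggtggagccggtgtagagcat-3');
INVR3 (5'-ggggtcgttcatccagttcctct-3').
The primers supplied with the 5'-Full RACE Kit (TAKARA) are the same as in Example 1.
Total RNA was extracted from young sugarcane leaves and 5'-RACE was performed according to the 5'-Full RACE Kit (TAKARA) manual. Nested PCR: the product of the PCR with the outer primers INVSYR1 and Outer Primer was used as the template for a second round of PCR with each of the inner primers INVR1, INVR2, and INVR3 paired with Inner Primer. The PCR products were gel-purified, cloned into a T vector, and sequenced. Sequencing showed 5'-RACE products of 496 bp (5'-SoINV1), 515 bp (5'-SoINV2), and 656 bp (5'-SoINV3), respectively.

7. Analysis of the full-length nucleotide sequence of the sugarcane GH32-family soluble acid invertase SoINV3 gene
The three sequences obtained (5'-SoINV3, M-SoINV, and 3'-SoINV) were assembled with the AY302083 fragment to give the full-length cDNA sequence of the SoINV3 gene, 2373 bp in total. Analysis with Vector NTI Advance 11 showed an ORF of 2088 bp encoding 696 amino acids. The start codon (ATG) lies at 168 bp from the transcription start site and the stop codon (TAA) at 2258 bp, followed by a 115 bp non-coding sequence bearing the poly(A) tail typical of eukaryotes. The gene and protein sequences of the sugarcane GH32-family soluble acid invertase SoINV3 are given in Sequence Listing 3.

Sequence listing
Sequence Listing 1
<110> Guangxi University
<120> Sugarcane soluble acid invertase (SoINV1) gene and protein sequences
<160> 2
<210> 1
<211> 2387
<212> DNA
<213> Sugarcane (Saccharum officinarum)
<220>
<221> CDS
<222> (215)...(2272)
<220>
<221> 5'UTR
<222> (1)...(214)
<220>
<221> 3'UTR
<222> (2273)...(2387)
<400> 1
GAAAAATGTG TTGTTGTATG TACTACTATA TTCTAGTTAC AATCGAGTCG AATCGCCGCA 60
TTGCGCGGCT CCAGATCGAA TCGGCCGGCG AGGAGTCGGT CATCGTCGCT CCGGCCGCCG 120
CCGTAGCCGC CCCTAGAGAG GGTCGCCCGC CGTCGTAACC GCAACACAAG TCGCCGGCGG 180
CCTCCTTCCG ATCGATCCTC TTCCGCCGTC GGCA 214
ATG GAG ACC CGG GAC ACG ACG GCG CCG CTC CCC TAC TCG TAC ACG 259
Met Glu Thr Arg Asp Thr Thr Ala Pro Leu Pro Tyr Ser Tyr Thr
5 10 15
CCG CTG CCG GCC GCC GAC GCC GCG TCC GCC GAG GTC ACC GGC ACC 304
Pro Leu Pro Ala Ala Asp Ala Ala Ser Ala Glu Val Thr Gly Thr
20 25 30
GGC GGC AGG AGC AGG CGG AGG TCC CTC TGC GCC GCG GCG CTC GTC 349
Gly Gly Arg Ser Arg Arg Arg Ser Leu Cys Ala Ala Ala Leu Val
35 40 45
CTC TCC GCC GCG CTG CTC CTC GCC GTG GCC GCG CTC GCC GCC GCC 394
Leu Ser Ala Ala Leu Leu Leu Ala Val Ala Ala Leu Ala Ala Ala
50 55 60
GGC CGC CGC CCA ACG ACC GCG GTG GGA GAA ACG GCC GGC GTC GGC 439
Gly Arg Arg Pro Thr Thr Ala Val Gly Glu Thr Ala Gly Val Gly
65 70 75
GTC GTC CCT GGC GTG GGG ACA CCA CAG GCG ACG TCG ACC AGG AGC 484
Val Val Pro Gly Val Gly Thr Pro Gln Ala Thr Ser Thr Arg Ser
80 85 90
ATC AGC AGG GGC CCC GAC GCC GGC GTG TCG GAG AAG ACG TCC GGC 529
Ile Ser Arg Gly Pro Asp Ala Gly Val Ser Glu Lys Thr Ser Gly
95 100 105
GCG TGG AGC GGC GTC GTC GAC GAT GGC GGG AGG CTC CGT GCT GAC 574
Ala Trp Ser Gly Val Val Asp Asp Gly Gly Arg Leu Arg Ala Asp
110 115 120
GGC GGC GGG AAC GCG TTC CCG TGG AGC AAT GCG ATG CTG CAG TGG 619
Gly Gly Gly Asn Ala Phe Pro Trp Ser Asn Ala Met Leu Gln Trp
125 130 135
CAG CGC ACG GGA TTC CAC TTC CAG CCG CAG AGG AAC TGG ATG AAC 664
Gln Arg Thr Gly Phe His Phe Gln Pro Gln Arg Asn Trp Met Asn
140 145 150
GAC CCC AAT GGC CCG GTG TAC TAC AAG GGC TGG TAC CAC CTG TTC 709
Asp Pro Asn Gly Pro Val Tyr Tyr Lys Gly Trp Tyr His Leu Phe
155 160 165
TAC CAA TAC AAC CCG GAC GGC GCC ATC TGG GGC AAC AAG ATC GCG 754
Tyr Gln Tyr Asn Pro Asp Gly Ala Ile Trp Gly Asn Lys Ile Ala
170 175 180
TGG GGC CAC GCC GTC TCC CGC GAC CTC ATC CAC TGG CGC CAC CTC 799
Trp Gly His Ala Val Ser Arg Asp Leu Ile His Trp Arg His Leu
185 190 195
CCG CTG GCC ATG CTG CCC GAC CAG TGG TAC GAC ACC AAC GGC GTC 844
Pro Leu Ala Met Leu Pro Asp Gln Trp Tyr Asp Thr Asn Gly Val
200 205 210
TGG ACG GGC TCC GCC ACC ACG CTC CCC GAC GGC CGC CTC GCC ATG 889
Trp Thr Gly Ser Ala Thr Thr Leu Pro Asp Gly Arg Leu Ala Met
215 220 225
CTC TAC ACC GGC TCC ACC AAC ACC TCC GTG CAG GTG CAG TGC CTC 934
Leu Tyr Thr Gly Ser Thr Asn Thr Ser Val Gln Val Gln Cys Leu
GCC GTC Ala Val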
CCC GCC Pro Ala
AAG Lys
CCC Pro
GAC Asp
GGC Gly
GTG Val
TAC Tyr
AGG Arg
TCC
Ser
GAC Asp
CAC His
ACG GGG Thr Gly
GGC Gly
AAG Lys
AGC
Ser
GAC Asp
GTC Val
AAG Lys
AAG Lys
AAC Asn
ATG Met
GCG Ala
GGC Gly
ACG Thr
GAG Glu
GAC Asp
ACC Thr
CAC His
TTC Phe
ATG Met
GCG Ala
GGC Gly
GAC Asp
GCT Ala
ACC Thr
TTC Phe
GGC Gly
TTC Phe
TGG Trp
GCC Ala
GAG Glu
TGG Trp
TCC
Ser
GCC Ala
GAC Asp
GCC Ala
GGC Gly
TAC Tyr
230 GAC Asp 245 AAC Asn 260 CGC Arg 275 CGC Arg 290 GGC Gly 305 CTC Leu 320 GAG Glu 335 GGG Gly 350 GTC Val 365 GAC Asp 380 AAC Asn 395 CTG Leu 410 GAC Asp
GAC Asp
CCG Pro
GAC Asp
ATC lie
ATC lie
CTC Leu
TGC Cys
AAC Asn
GTC Val
CGA Arg
GCG Ala
CGG Arg
CCG Pro
GAC Asp
GCG Ala
CCC Pro
GTC Val
GAC Asp
CTG Leu
ACC Thr
ATC lie
GCC GTG Ala Val
CCG Pro
ATC lie
GGC Gly
GGG Gly
CAT His
TGG Trp
TAC Tyr
GCC Ala
GAC Asp
GAC Asp
GTC Val
GAC Asp
GAC Asp
ACG Thr
GAC Asp
AAG Lys
CCG Pro
TAC Tyr
ACG Thr
GGC Gly
GTG Val
CTG Leu
TTC Phe
GAC Asp
GTG Val
TAC Tyr
CCG Pro
TGG Trp
CGC Arg
235 CTG Leu 250 CCG Pro 265 GCG Ala 280 TCC Ser 295 TAC Tyr 310 CTC Leu 325 TAC Tyr 340 ATG Met 355 GTG Val 370 TAC Tyr 385 CTC Leu 400 GGC Gly 415 CGC Arg
CTC Leu
CCG Pro
ACC Thr
CCG Pro
TGG TTC Trp Phe
AAG Lys
CGC Arg
CAC His
CCC Pro
TCC
Ser
CAC His
GAC Asp
ACC Thr
CGC Arg
GTC Val
GAC Asp
GTC Val
GCG CTC Ala Leu
GAC Asp
AAG Lys
CGC Arg
GCC Ala
TTC Phe
GTG Val
AAC Asn
GGG Gly
GAC Asp
GAC Asp
AGG Arg
GTC Val
GCC Ala
GCC Ala
ATG Met
GGG Gly
GAG Glu
TAC Tyr
CTC Leu
TGG Trp
ATC lie
CCG Pro
GCC Ala
GAC Asp
GCG Ala
ACC Thr
CTC Leu
AAG Lys
AGG Arg
AAG Lys
GCG Ala
TGG Trp
240 ACC Thr 255 GGG Gly 270 TCG Ser 285 GAG Glu 300 TTC Phe 315 GGG Gly 330 CGC Arg 345 GCC Ala 360 GCC Ala 375 TAT Tyr 390 GAC Asp 405 TCC Ser 420 GGA Gly
979
1024
1069
1114
1159
1204
1249
1294
1339
1384
1429
1474
1519
425430435TGG GTC GGC GAG ACC GAC TCG GAG CGC GCT GAC GTC TCCAAG GGA 1564Trp Val Gly GluThr Asp Ser Glu Arg Ala Asp Val SerLys Gly440445450TGG GCA TCG CTG CAG GGT ATC CCC CGG ACG GTG CTG CTGGAC ACC 1609Trp Ala Ser LeuGln Gly lie Pro Arg Thr Val Leu LeuAsp Thr455460465AAG ACG GGC AGC AAC CTG CTG CAG TGG CCC GTG GAG GAAGTG GAG 1654Lys Thr Gly SerAsn Leu Leu Gln Trp Pro Val Glu GluVal Glu470475480ACG CTG CGC ACC AAC TCC ACG GAC CTC AGC GGC ATC ACCATC GAC 1699Thr Leu Arg ThrAsn Ser Thr Asp Leu Ser Gly lie Thrlie Asp485490495TAC GGC TCC ACG TTC CCG CTC AAC CTC CGC CGC GCC ACGCAG CTG 1744Tyr Gly Ser ThrPhe Pro Leu Asn Leu Arg Arg Ala ThrGln Leu500505510GAC ATC GAG GCG GAG TTC GAG CTG GAC CGC CGC GCC GTCATG TCC 1789Asp lie Glu Ala Glu Phe Glu Leu Asp Arg Arg Ala ValMet Ser515520525CTC AAC GAG GCC GAC GTG GGC TAC AAC TGC AGC ACC AGCGGC GGC 1834Leu Asn Glu Ala Asp Val Gly Tyr Asn Cys Ser Thr SerGly Gly530535540GCC GCC GCC CGC GGC GCG CTG GGC CCC TTC GGC CTG CTCGTC CTC 1879Ala Ala Ala Arg Gly Ala Leu Gly Pro Phe Gly Leu LeuVal Leu545550555ACC GAC AAG CAC CTG CAC GAG CAG ACG GCC GTC TAC TTCTAC GTG 1924Thr Asp Lys HisLeu His Glu Gln Thr Ala Val Tyr PheTyr Val560565570GCC AAA GGC CTG GAC GGC TCC CTC ACC ACG CAC TTC TGCCAG GAC 1969Ala Lys Gly LeuAsp Gly Ser Leu Thr Thr His Phe CysGln Asp575580585GAG TCC CGG TCG TCC AGC GCC AAC GAC ATC GTC AAG CGCGTC GTC 2014Glu Ser Arg SerSer Ser Ala Asn Asp lie Val Lys ArgVal Val590595600GGC AGC GCC GTC CCC GTG CTG GAG GAC GAG ACC ACA CTCTCG CTT 2059Gly Ser Ala ValPro Val Leu Glu Asp Glu Thr Thr LeuSer Leu605610615CGC GTG CTC GTC GAC CAC TCC ATC GTC GAG AGC TTC GCGCAG GGT 2104Arg Val Leu ValAsp His Ser lie Val Glu Ser Phe AlaGln Gly
620 625 630
GGA AGG TCA ACG GCC ACC TCG CGC GTC TAC CCC ACC AAG GCC ATC 2149
Gly Arg Ser Thr Ala Thr Ser Arg Val Tyr Pro Thr Lys Ala Ile
635 640 645
TAC GCC AAC GCC GGC GTG TTC CTC TTC AAC AAC GCC ACC GCC GCG 2194
Tyr Ala Asn Ala Gly Val Phe Leu Phe Asn Asn Ala Thr Ala Ala
650 655 660
CGC GTC ACC GCC AAG AAG CTC GTC GTC CAC GAG ATG GAC TCG TCC 2239
Arg Val Thr Ala Lys Lys Leu Val Val His Glu Met Asp Ser Ser
665 670 675
TAC AAC CAC GAC TAC ATG GTC ACG GAC ATC 2269
Tyr Asn His Asp Tyr Met Val Thr Asp Ile
680 685
TGATGCTGCT GCTGCTGCTG CTGCTGCTGA CCCGTCGTCC ATCCAACCCA CCGCTGCACC 2329
CAATTTTTTG AACCCATATA TAGCGAAGCA TCTTCTTGTA CCTAAAAAAA AAAAAAAA 2387
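The correspondence between the codon rows and the amino acid rows of this listing can be spot-checked by translating a stretch of the nucleotide sequence with the standard genetic code. The sketch below is illustrative only. It uses the last coding row of the listing (the ten codons ending at position 2269) followed by the next three bases, TGA, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.upper().replace(" ", "")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Last coding row of the SoINV1 listing plus the following three bases.
tail = "TAC AAC CAC GAC TAC ATG GTC ACG GAC ATC TGA"
print(translate(tail))  # YNHDYMVTDI*
```

The translated residues YNHDYMVTDI agree with the Tyr Asn His Asp Tyr Met Val Thr Asp Ile row and with the C-terminus of the SoINV1 protein sequence. In the listing, the three bases after position 2269 read TGA, which the standard code also reads as a stop.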
<210>2
<211>685
<212>PRT
<213> (Saccharum officinarum)
<400>2
Met Glu Thr Arg Asp Thr Thr Ala Pro Leu Pro Tyr Ser Tyr Thr
5 10 15
Pro Leu Pro Ala Ala Asp Ala Ala Ser Ala Glu Val Thr Gly Thr
20 25 30
Gly Gly Arg Ser Arg Arg Arg Ser Leu Cys Ala Ala Ala Leu Val
35 40 45
Leu Ser Ala Ala Leu Leu Leu Ala Val Ala Ala Leu Ala Ala Ala
50 55 60
Gly Arg Arg Pro Thr Thr Ala Val Gly Glu Thr Ala Gly Val Gly
65 70 75
Val Val Pro Gly Val Gly Thr Pro Gln Ala Thr Ser Thr Arg Ser
80 85 90
Ile Ser Arg Gly Pro Asp Ala Gly Val Ser Glu Lys Thr Ser Gly
95 100 105
Ala Trp Ser Gly Val Val Asp Asp Gly Gly Arg Leu Arg Ala Asp
110 115 120
Gly Gly Gly Asn Ala Phe Pro Trp Ser Asn Ala Met Leu Gln Trp
125 130 135
GlnArgThrGlyPheHisPheGlnProGlnArgAsnTrpMetAsnAsp Pro Asn Tyr Trp
Gln Tyr Gly His
Trp Thr Leu Tyr Ala Val Pro Lys Tyr Pro Arg
Asp Ser Gly Asp His Val His Phe Thr Gly Lys
Gly Lys Asn
Ser Met
Asp Val Lys
Ala Gly Thr
Gly Asn Ala
Pro Leu Ala Met Gly Ser Thr
Glu Asp Thr
Met
Ala Gly Asp Ala Thr Phe
Gly Ala Gly Phe Trp Ala Glu Trp Ser Ala Asp Ala Gly Tyr
140 Pro 155 Pro 170 Val 185 Leu 200 Ala 215 Ser 230 Asp 245 Asn 260 Arg 275 Arg 290 Gly 305 Leu 320 Glu 335 Gly 350 Val 365 Asp 380 Asn 395 Leu 410 Asp 425
Val Asp Ser Pro Thr Thr Asp
Tyr Gly Arg Asp Thr Asn Asp Ala
Tyr Lys Ala lie
Asp Leu lie His Gln Trp
Leu Pro
Thr Ser Asp Pro Leu Leu
Pro Ala Leu Tyr Asp Pro Thr Thr lie Val lie Gly lie Ala Val Val
Leu Pro
Cys Asn Val Arg Ala Trp Arg Tyr Pro Ala
lie Gly Gly His
Thr lie Val Asp Asp Val Asp Asp Thr Asp Lys
Phe Asp Val Tyr Pro Trp Arg
145 Gly 160 Trp 175 lie 190 Tyr 205 Asp 220 Val 235 Leu 250 Pro 265 Ala 280 Ser 295 Tyr 310 Leu 325 Tyr 340 Met 355 Val 370 Tyr 385 Leu 400 Gly 415 Arg 430
Trp Gly
Asp Gly Gln
Tyr Asn Trp Thr Arg Val Thr
Pro Pro
Trp Lys Arg
Leu Leu His
Pro Ser His Ala Asp Lys Arg
Ala Phe Val
His Leu Lys lie Arg His Asn Gly Ala Cys Trp lie
Phe Asp Thr Arg Val Asp Val
Leu Gln Asn Gly Asp Asp Arg Val Ala Ala Met
Leu Gly Glu Tyr
Leu
Pro Ala Asp Ala Thr Leu Lys Arg Lys Ala Trp
150 Phe 165 Ala 180 Leu 195 Val 210 Met 225 Leu 240 Thr 255 Gly 270 Ser 285 Glu 300 Phe 315 Gly 330 Arg 345 Ala 360 Ala 375 Tyr 390 Asp 405 Ser 420 Gly 435
Trp Val Gly Glu ThrAsp Ser Glu Arg Ala AspVal Ser Lys Gly440445450Trp Ala Ser Leu GlnGly lie Pro Arg Thr ValLeu Leu Asp Thr455460465Lys Thr Gly Ser AsnLeu Leu Gln Trp Pro ValGlu Glu Val Glu470475480Thr Leu Arg Thr AsnSer Thr Asp Leu Ser Glylie Thr lie Asp485490495Tyr Gly Ser Thr PhePro Leu Asn Leu Arg ArgAla Thr Gln Leu500505510Asp lie Glu Ala GluPhe Glu Leu Asp Arg ArgAla Val Met Ser515520525Leu Asn Glu Ala AspVal Gly Tyr Asn Cys SerThr Ser Gly Gly530535540Ala Ala Ala Arg GlyAla Leu Gly Pro Phe GlyLeu Leu Val Leu545550555Thr Asp Lys His LeuHis Glu Gln Thr Ala ValTyr Phe Tyr Val560565570Ala Lys Gly Leu AspGly Ser Leu Thr Thr HisPhe Cys Gln Asp575580585Glu Ser Arg Ser SerSer Ala Asn Asp lie ValLys Arg Val Val590595600Gly Ser Ala Val ProVal Leu Glu Asp Glu ThrThr Leu Ser Leu605610615Arg Val Leu Val AspHis Ser lie Val Glu SerPhe Ala Gln Gly620625630Gly Arg Ser Thr AlaThr Ser Arg Val Tyr ProThr Lys Ala lie635640645Tyr Ala Asn Ala GlyVal Phe Leu Phe Asn AsnAla Thr Ala Ala650655660Arg Val Thr Ala LysLys Leu Val Val His GluMet Asp Ser Ser665670675Tyr Asn His Asp TyrMet Val Thr Asp lie680685序列表序列表2<110>广西大学<120>甘蔗可溶性酸性转化酶基因(SoINV2)基因及蛋白序列<160>2
<210>1<211>2429<212>DNA<213> 甘胃(Saccharum officinarum)<220><221>CDS<222>(227). . . (2314)<220><221>5,UTP<222>(1). . . (226)<220><221>3,UTP<222> (2315)... (2429)<400>1GAAACCACAT CGCTCGTCCA ATGTGTTGTTGAAATCTAGTTACAATCGAG TCGAGTCGCC60
GCATTGCTCG CTTCCGCGGC TCCAGATCGAATCGGCCGGCGAGGAGTCGG TCATCGTCGC120
TCCGGCCGCC GCCGTAGCCG CCCCTAGAGAGAGTCGCCCGCCGTCGTAAC 丨CGTAACACAA180
GTCGCCCGGC GGCCTCCTTC CGATCGATCCTCTTCCGTCGTCGGCA226
ATGGAGACCCGGGACACGACGGCGCCGCTCCCCTACTCGTACACG271
MetGluThrArgAspThrThrAlaProLeuProTyrSerTyrThr
51015
CCGCTGCCGGCCGCCGACGCCGCGTCCGCCGAGGTCACCGGCACC316
ProLeuProAlaAlaAspAlaAlaSerAlaGluValThrGlyThr
202530
GGCCACCGCGGCGGCGGCAGGAGCAGGCGTAGTTCCCTCTGCGCC361
GlyHisArgGlyGlyGlyArgSerArgArgSerSerLeuCysAla
354045
GCGGCGCTCGTCCTCTCCGCCGCGCTGCTCCTCGCCGTGGCCGCG406
AlaAlaLeuValLeuSerAlaAlaLeuLeuLeuAlaValAlaAla
505560
CTCGCCGGCGTCGGCGGCCGCGTCGCCGTCGTCCCCCGCCCAACG451
LeuAlaGlyValGlyGlyArgValAlaValValProArgProThr
657075
ACCGCGGTGGGAGAAACGGCCGGCGTCGGCGTCGGCCCTGGCGCG496
ThrAlaValGlyGluThrAlaGlyValGlyValGlyProGlyAla
808590
GGGACACCACAGGCGACGTCGACCAGGAGCATCAGCAGGGGCCCC541
GlyThrProGlnAlaThrSerThrArgSerlieSerArgGlyPro
95100105
GAC GCC GGC GTG TCG GAG AAG ACGTCC GGC GCG TGG AGCGGC GTC 586Asp Ala Gly ValSer Glu Lys ThrSer Gly Ala Trp SerGly Val110115 120GTC GAC GAT GGC GGG AGG CTC CGTGCT GAC GGC GGC GGGAAC GCG 631Val Asp Asp GlyGly Arg Leu ArgAla Asp Gly Gly GlyAsn Ala125130 135TTC CCG TGG AGC AAT GCG ATG CTGCAG TGG CAG CGC ACGGGA TTC 676Phe Pro Trp SerAsn Ala Met LeuGln Trp Gln Arg ThrGly Phe140145 150CAC TTC CAG CCG CAG AGG AAC TGGATG AAC GAC CCC AATGGC CCG 721His Phe Gln ProGln Arg Asn TrpMet Asn Asp Pro AsnGly Pro155160 165GTG TAC TAC AAG GGC TGG TAC CACCTG TTC TAC CAA TACAAC CCG 766Val Tyr Tyr LysGly Trp Tyr HisLeu Phe Tyr Gln TyrAsn Pro170175 180GAC GGC GCC ATC TGG GGC AAC AAGATC GCG TGG GGC CACGCC GTC 811Asp Gly Ala lieTrp Gly Asn Lyslie Ala Trp Gly HisAla Val185190 195TCC CGC GAC CTC ATC CAC TGG CGCCAC CTC CCG CTG GCCATG CTG 856Ser Arg Asp Leulie His Trp ArgHis Leu Pro Leu AlaMet Leu200205 210CCC GAC CAG TGG TAC GAC ACC AACGGC GTC TGG ACG GGCTCC GCC 901Pro Asp Gln TrpTyr Asp Thr AsnGly Val Trp Thr GlySer Ala215220 225ACC ACG CTC CCC GAC GGC CGC CTCGCC ATG CTC TAC ACCGGC TCC 946Thr Thr Leu ProAsp Gly Arg LeuAla Met Leu Tyr ThrGly Ser230235 240ACC AAC ACC TCC GTG CAG GTG CAGTGC CTC GCC GTC CCCGCC GGC 991Thr Asn Thr SerVal Gln Val GlnCys Leu Ala Val ProAla Asp245250 255GAC GAC GAC CCG CTG CTC ACC AACTGG ACC AAG TAC GAGGGC AAC 1036Asp Asp Asp ProLeu Leu Thr AsnTrp Thr Lys Tyr GluGly Asn260265 270CCG GCG CTG TAC CCG CCG CCG GGGATC GGG CCC AGG GACTTC CGC 1081Pro Ala Leu TyrPro Pro Pro Glylie Gly Pro Arg AspPhe Arg275280 285GAC CCC ACC ACG GCG TGG TTC GACCCG TCG GAC TCC ACCTGG CGC 1126Asp Pro Thr ThrAla Trp Phe AspPro Ser Asp Ser ThrTrp Arg290295 300ATC GTC lie Val
ATC GCC lie Ala
CTC CCG Leu Pro
TGC ATC Cys Ile
AAC GGC Asn Gly
GTC GGG Val Gly
CGA CAT Arg His
GCG Ala
CGG Arg
CCG Pro
GAC Asp
GGT Gly
CTG Leu
TGG Trp
TAC Tyr
GCC Ala
TCG Ser
ATC lie
CTG Leu
ATC lie
GTG Val
GAC Asp
GAC Asp
GTC Val
GAC Asp
GAC Asp
ACG Thr
GAC Asp
AAG Lys
GAG Glu
CCC Pro
CAG Gln
GGC Gly
GTG Val
CTG Leu
TTC Phe
GAC Asp
GTG Val
TAC Tyr
CCG Pro
TGG Trp
CGC Arg
CGC Arg
CGG Arg
TGG Trp
TCC Ser 305 TAC Tyr 320 CTC Leu 335 TAC Tyr 350 ATG Met 365 GTG Val 380 TAC Tyr 395 CTC Leu 410 GGC Gly 425 CGC Arg 440 GCT Ala 455 ACG Thr 470 CCC Pro 485
AAG Lys
CGC Arg
CAC His
CCC Pro
TCC
Ser
CAC His
GCG Ala
GAC Asp
AAG Lys
CGC Arg
GAC Asp
GTG Val
GTG Val
GAC Asp
ACC Thr
CGC Arg
GTC Val
GAC Asp
GTC Val
CTC Leu
GCC Ala
TTC Phe
GTG Val
GTC Val
CTG Leu
GAG Glu
GAC Asp
AGG Arg
GTC Val
GCC Ala
GCC Ala
ATG Met
GGG Gly
GAG Glu
TAC Tyr
CTC Leu
TCC
Ser
CTG Leu
GAA Glu
GCC Ala
GAC Asp
GCG Ala
ACC Thr
CTC Leu
AAG Lys
AGG Arg
AAG Lys
GCG Ala
TGG Trp
AAG Lys
GAC Asp
GTG Val
GAG Glu 310 TTC Phe 325 GGG Gly 340 CGC Arg 355 GCC Ala 370 GCC Ala 385 TAT Tyr 400 GAC Asp 415 TCC Ser 430 GGA Gly 445 GGA Gly 460 ACC Thr 475 GAG Glu 490
GGC Gly
GTG Val
ACG Thr
GGC Gly
AAG Lys
AGC
Ser
GAC Asp
CAC His
GGG Gly
AAG Lys
AAC Asn
ATG Met
GAC GCG Asp Ala
GTC GGC Val Gly
AAG ACG Lys Thr
TGG GTC Trp Val
TGG Trp
AAG Lys
ACG Thr
GCA Ala
ACG Thr
CTG Leu
CAC His
TTC Phe
ATG Met
GCG Ala
GGC Gly
GAC Asp
GCT Ala
ACC Thr
TTC Phe
GGC Gly
TCG Ser
GGC Gly
CGC Arg
GCC Ala
GAG Glu
TGG Trp
TCC
Ser
GCC Ala
GAC Asp
GCC Ala
GGC Gly
TAC Tyr
GAG Glu
CTG Leu
AGC
Ser
ACC Thr
GGC Gly 315 CTC Leu 330 GAG Glu 345 GGG Gly 360 GTC Val 375 GAC Asp 390 AAC Asn 405 CTG Leu 420 GAC Asp 435 ACC Thr 450 CAG Gln 465 AAC Asn 480 AAC Asn 495
1171
1216
1261
1306
1351
1396
1441
1486
1531
1576
1621
1666
1711
TCC ACG GAC CTC AGC GGC ATC ACC ATCGAC TAC GGC TCC ACGTTC 1756500505510CCG CTC AAC CTC CGC CGC GCC ACG CAGCTG GAC ATC GAG GCGGAG 1801Pro Leu Asn Leu Arg Arg Ala Thr GlnLeu Asp lie Glu AlaGlu515520525TTC GAG CTG GAC CGC CGC GCC GTC ATGTCC CTC AAC GAG GCCGAC 1846Phe Glu Leu Asp Arg Arg Ala Val MetSer Leu Asn Glu AlaAsp530535540GTG GGC TAC AAC TGC AGC ACC AGC GGCGGC GCC GCC GCC CGCGGC 1891Val Gly Tyr Asn Cys Ser Thr Ser GlyGly Ala Ala Ala ArgGly545550555GCG CTG GGC CCC TTC GGC CTG CTC GTCCTC ACC GAC AAG CACCTG 1936Ala Leu Gly Pro Phe Gly Leu Leu ValLeu Thr Asp Lys HisLeu560565570CAC GAG CAG ACG GCC GTC TAC TTC TACGTG GCC AAA GGC CTGGAC 1981His Glu Gln Thr Ala Val Tyr Phe TyrVal Ala Lys Gly LeuAsp575580585GGC TCC CTC ACC ACG CAC TTC TGC CAGGAC GAG TCC CGG TCGTCC 2026Gly Ser Leu Thr Thr His Phe Cys GlnAsp Glu Ser Arg SerSer590595600AGC GCC AAC GAC ATC GTC AAG CGC GTCGTC GGC AGC GCC GTCCCC 2071Ser Ala Asn Asp lie Val Lys Arg ValVal Gly Ser Ala ValPro605610615GTG CTG GAG GAC GAG ACC ACA CTC TCGCTT CGC GTG CTC GTCGAC 2116Val Leu Glu Asp Glu Thr Thr Leu SerLeu Arg Val Leu ValAsp620625630CAC TCC ATC GTC GAG AGC TTC GCG CAGGGT GGA AGG TCA ACGGCC 2161His Ser lie Val Glu Ser Phe Ala GlnGly Gly Arg Ser ThrAla635640645ACC TCG CGC GTC TAC CCC ACC AAG GCCATC TAC GCC AAC GCCGGC 2206Thr Ser Arg Val Tyr Pro Thr Lys Alalie Tyr Ala Asn AlaGly650655660GTG TTC CTC TTC AAC AAC GCC ACC GCCGCG CGC GTC ACC GCCAAG 2251Val Phe Leu Phe Asn Asn Ala Thr AlaAla Arg Val Thr AlaLys665670675AAG CTC GTC GTC CAC GAG ATG GAC TCGTCC TAC AAC CAC GACTAC 2296Lys Leu Val Val His Glu Met Asp SerSer Tyr Asn His AspTyr680685690
ATGGTCACGGACATC2311
MetValThr Asplie
695
TGATGCTGCT GCTGCTGCTG CTGCTGCTGACCCGTCGTCC
CAATTTTTTG AACCCATATA TAGCGAAGCATCTTCTTGTA
<210>2
<211>695
<212>PRT
<213> (Saccharum officinarum)
<400>2
MetGluThrArgAspThrThrAlaProLeuPro
510
ProLeuProAlaAlaAspAlaAlaSerAlaGlu
2025
GlyHisArgGlyGlyGlyArgSerArgArgSer
3540
AlaAlaLeuValLeuSerAlaAlaLeuLeuLeu
5055
LeuAlaGlyValGlyGlyArgValAlaValVal
6570
ThrAlaValGlyGluThrAlaGlyValGlyVal
8085
GlyThrProGlnAlaThrSerThrArgSerlie
95100
AspAlaGlyValSerGluLysThrSerGlyAla
110115
ValAspAspGlyGlyArgLeuArgAlaAspGly
125130
PheProTrpSerAsnAlaMetLeuGlnTrpGln
140145
HisPheGlnProGlnArgAsnTrpMetAsnAsp
155160
ValTyrTyrLysGlyTrpTyrHisLeuPheTyr
170175
AspGlyAlalieTrpGlyAsnLyslieAlaTrp
185190
SerArgAspLeulieHisTrpArgHisLeuPro
200205
ProAspGlnTrpTyrAspThrAsnGlyValTrp
ATCCAACCCA CCGCTGCACC 2371 CCTMMMA AMMMA 2429
Tyr Ser Tyr Thr 15
Val Thr Gly Thr 30
Ser Leu Cys Ala 45
Ala Val Ala Ala 60
Pro Arg Pro Thr 75
Gly Pro Gly Ala 90
Ser Arg Gly Pro 105
Trp Ser Gly Val 120
Gly Gly Asn Ala 135
Arg Thr Gly Phe 150
Pro Asn Gly Pro 165
Gln Tyr Asn Pro 180
Gly His Ala Val 195
Leu Ala Met Leu 210
Thr Gly Ser Ala
215 220225Thr Thr Leu Pro Asp GlyArg Leu Ala Met Leu Tyr ThrGly Ser230 235240Thr Asn Thr Ser Val GlnVal Gln Cys Leu Ala Val ProAla Asp245 250255Asp Asp Asp Pro Leu LeuThr Asn Trp Thr Lys Tyr GluGly Asn260 265270Pro Ala Leu Tyr Pro ProPro Gly Ile Gly Pro Arg AspPhe Arg275 280285Asp Pro Thr Thr Ala TrpPhe Asp Pro Ser Asp Ser ThrTrp Arg290 295300Ile Val Ile Gly Ser LysAsp Asp Ala Glu Gly Asp HisAla Gly305 310315Ile Ala Val Val Tyr Arg Thr Arg Asp Phe Val His PheGlu Leu320 325330Leu Pro Asp Leu Leu HisArg Val Ala Gly Thr Gly MetTrp Glu335 340345Cys Ile Asp Phe Tyr ProVal Ala Thr Arg Gly Lys AlaSer Gly350 355360Asn Gly Val Asp Met SerAsp Ala Leu Ala Lys Asn GlyAla Val365 370375Val Gly Asp Val Val HisVal Met Lys Ala Ser Met AspAsp Asp380 385390Arg His Asp Tyr Tyr Ala Leu Gly Arg Tyr Asp Ala AlaAla Asn395 400405Ala Trp Thr Pro Leu AspAla Glu Lys Asp Val Gly ThrGly Leu410 415420Arg Tyr Asp Trp Gly LysPhe Tyr Ala Ser Lys Thr PheTyr Asp425 430435Pro Ala Lys Arg Arg Arg Val Leu Trp Gly Trp Val GlyGlu Thr440 445450Asp Ser Glu Arg Ala AspVal Ser Lys Gly Trp Ala SerLeu Gln455 460465Gly Ile Pro Arg Thr ValLeu Leu Asp Thr Lys Thr GlySer Asn470 475480Leu Leu Gln Trp Pro ValGlu Glu Val Glu Thr Leu ArgThr Asn485 490495Ser Thr Asp Leu Ser GlyIle Thr Ile Asp Tyr Gly SerThr Phe500 505510
Pro Leu AsnLeuArgArgAlaThrGlnLeuAsplieGluAlaGlu
515520525
Phe Glu LeuAspArgArgAlaValMetSerLeuAsnGluAlaAsp
530535540
Val Gly TyrAsnCysSerThrSerGlyGlyAlaAlaAlaArgGly
545550555
Ala Leu GlyProPheGlyLeuLeuValLeuThrAspLysHisLeu
560565570
His Glu GlnThrAlaValTyrPheTyrValAlaLysGlyLeuAsp
575580585
Gly Ser LeuThrThrHisPheCysGlnAspGluSerArgSerSer
590595600
Ser Ala AsnAsplieValLysArgValValGlySerAlaValPro
605610615
Val Leu GluAspGluThrThrLeuSerLeuArgValLeuValAsp
620625630
His Ser lieValGluSerPheAlaGlnGlyGlyArgSerThrAla
635640645
Thr Ser ArgValTyrProThrLysAlalieTyrAlaAsnAlaGly
650655660
Val Phe LeuPheAsnAsnAlaThrAlaAlaArgValThrAlaLys
665670675
Lys Leu ValValHisGluMetAspSerSerTyrAsnHisAspTyr
680685690
Met Val ThrAsplie
695
Sequence listing
Sequence Listing 3
<110> Guangxi University
<120> Sugarcane soluble acid invertase (SoINV3) gene and protein sequences
<160> 2
<210> 1
<211> 2373
<212> DNA
<213> Sugarcane (Saccharum officinarum)
<220>
<221> CDS
<222> (168)...(2258)
<220>
<221> 5'UTR
<222> (1)...(167)
<220>
<221> 3'UTR
<222> (2259)...(2373)
<400>1
GAMGTTGTTGTATGTACTA CTAGATTCTA GTTACMTCG AGTCGCATTGCGCGGCTCCA 60
GATCGAATCG GCCAGGAGTC GGTCATCGTCGTCCGCTCCGCTCCGGCCGCCACCGTAACC 120
GTAACACAAGTCGCCGGCGCCCTCCTTCCGATCCTCTTCCTTCGACA167
ATGGAGACCCGGGACACGACGGCGCCGCTCCCCTACTCGTACACG212
MetGluThrArgAspThrThrAlaProLeuProTyrSerTyrThr
51015
CCGCTGCCGGCCGCCGACGCCGCGTCCGCCGAGGTCACCGGCACC257
ProLeuProAlaAlaAspAlaAlaSerAlaGluValThrGlyThr
202530
GGCGGCAGCAGGAGCAGGCGGAGGCGGCCGCTCTGCGCCGCGGCG302
GlyGlySerArgSerArgArgArgArgProLeuCysAlaAlaAla
354045
CTCGTCCTCTCCGCCGCGCTGCTCCTCGCCGTGGCCGCGCTCGCC347
LeuValLeuSerAlaAlaLeuLeuLeuAlaValAlaAlaLeuAla
505560
GGCGTCGGCAGCCGCGTCGCCGCCGTCGTCCCCCGCCCAACGACC392
GlyValGlySerArgValAlaAlaValValProArgProThrThr
657075
GCGGTGGGAGAAACGGCCGGCGTCGGCGTCGGCGTCGTCCCTGGC437
AlaValGlyGluThrAlaGlyValGlyValGlyValValProGly
808590
GCGGGGACACCACAGGCGACGTCGACCAGGAGCCGCAGCAGGGGC482
AlaGlyThrProGlnAlaThrSerThrArgSerArgSerArgGly
95100105
CCCGATGCCGGCGTGTCGGAGAAGACGTCCGGCGTGTGGACCGGC527
ProAspAlaGlyValSerGluLysThrSerGlyValTrpThrGly
110115120
GTCATCGACGATGGCGCCAGGCTGCGGACTGACGCCGGCGGCAAC572
VallieAspAspGlyAlaArgLeuArgThrAspAlaGlyGlyAsn
125130135
GCGTTCCCGTGGAGCAATGCGATGCTGCAGTGGCAGCGCACGGGC617
AlaPheProTrpSerAsnAlaMetLeuGlnTrpGlnArgThrGly
140145150
TTC CAC TTC CAGCCG CAG AGG AAC TGG ATG AAC GAC CCC AAT GGC 662Phe His Phe GlnPro Gln Arg Asn TrpMet Asn Asp Pro Asn Gly155160165CCG GTG TAC TACAAG GGC TGG TAC CAC CTG TTC TAC CAA TAC AAC 707Pro Val Tyr TyrLys Gly Trp Tyr HisLeu Phe Tyr Gln Tyr Asn170175180CCG GAC GGC GCCATC TGG GGC AAC AAG ATC GCG TGG GGC CAC GCC 752Pro Asp Gly Alalie Trp Gly Asn Lyslie Ala Trp Gly His Ala185190195GTC TCC CGC GACCTC ATC CAC TGG CGC CAC CTC CCG CTG GCC ATG 797Val Ser Arg AspLeu lie His Trp Arg His Leu Pro Leu Ala Met200205210CTG CCC GAC CAGTGG TAC GAC ACC AAC GGC GTC TGG ACG GGC TCC 842Leu Pro Asp GlnTrp Tyr Asp Thr AsnGly Val Trp Thr Gly Ser215220225GCC ACC ACG CTCCCC GAC GGC CGC CTC GCC ATG CTC TAC ACC GGC 887Ala Thr Thr LeuPro Asp Gly Arg LeuAla Met Leu Tyr Thr Gly230235240TCC ACC AAC ACCTCC GTG CAG GTG CAG TGC CTC GCC GTC CCC GCC 932Ser Thr Asn ThrSer Val Gln Val GlnCys Leu Ala Val Pro Ala245250255GAC GAC GAC GACCCG CTG CTC ACC AAC TGG ACC AAG TAC GAG GGC 977Asp Asp Asp AspPro Leu Leu Thr AsnTrp Thr Lys Tyr Glu Gly260265270AAC CCG GCG CTGTAC CCG CCG CCG GGG ATC GGG CCC AGG GAC TTC 1022Asn Pro Ala LeuTyr Pro Pro Pro Glylie Gly Pro Arg Asp Phe275280285CGC GAC CCC ACCACG GCG TGG TTC GAC CCG TCG GAC TCC ACC TGG 1067Arg Asp Pro ThrThr Ala Trp Phe AspPro Ser Asp Ser Thr Trp290295300CGC ATC GTC ATCGGC TCC AAG GAC GAC GCC GAG GGC GAC CAC GCC 1112Arg lie Val lieGly Ser Lys Asp AspAla Glu Gly Asp His Ala305310315GGC ATC GCC GTGGTG TAC CGC ACC AGG GAC TTC GTG CAC TTC GAG 1157Gly lie Ala ValVal Tyr Arg Thr Arg Asp Phe Val His Phe Glu320325330CTC CTC CCG GACCTG CTC CAC CGC GTC GCG GGG ACG GGG ATG TGG 1202Leu Leu Pro AspLeu Leu His Arg ValAla Gly Thr Gly Met Trp335340345
GAG TGC ATC GACTTC TAC CCC GTC GCC ACC CGC GGC AAGGCG TCC 1247Glu Cys lie AspPhe Tyr Pro Val Ala Thr Arg Gly LysAla Ser350355360GGG AAC GGC GTCGAC ATG TCC GAC GCC CTC GCC AAG AACGGC GCC 1292Gly Asn Gly ValAsp Met Ser Asp Ala Leu Ala Lys AsnGly Ala365370375GTC GTC GGG GACGTG GTG CAC GTC ATG AAG GCC AGC ATGGAC GAC 1337Val Val Gly AspVal Val His Val MetLys Ala Ser MetAsp Asp380385390GAC CGA CAT GACTAC TAC GCG CTC GGG AGG TAT GAC GCGGCT GCC 1382Asp Arg His AspTyr Tyr Ala Leu GlyArg Tyr Asp AlaAla Ala395400405AAC GCG TGG ACGCCG CTC GAC GCC GAG AAG GAC GTC GGCACC GGC 1427Asn Ala Trp ThrPro Leu Asp Ala GluLys Asp Val GlyThr Gly410415420CTG CGG TAC GACTGG GGC AAG TTC TAC GCG TCC AAG ACGTTC TAC 1472Leu Arg Tyr AspTrp Gly Lys Phe TyrAla Ser Lys ThrPhe Tyr425430435GAC CCG GCC AAGCGC CGC CGC GTG CTC TGG GGA TGG GTCGGC GAG 1517Asp Pro Ala LysArg Arg Arg Val LeuTrp Gly Trp ValGly Glu440445450ACC GAC TCG GAGCGC GCT GAC GTC TCC AAG GGA TGG GCATCG CTG 1562Thr Asp Ser GluArg Ala Asp Val SerLys Gly Trp AlaSer Leu455460465CAG GGT ATC CCCCGG ACG GTG CTG CTG GAC ACC AAG ACGGGC AGC 1607Gln Gly lie ProArg Thr Val Leu LeuAsp Thr Lys ThrGly Ser470475480AAC CTG CTG CAGTGG CCC GTG GAG GAA GTG GAG ACG CTGCGC ACC 1652Asn Leu Leu GlnTrp Pro Val Glu GluVal Glu Thr LeuArg Thr485490495AAC TCC ACG GACCTC AGC GGC ATC ACC ATC GAC TAC GGCTCC ACG 1697Asn Ser Thr AspLeu Ser Gly lie Thrlie Asp Tyr GlySer Thr500505510TTC CCG CTC AACCTC CGC CGC GCC ACG CAG CTG GAC ATCGAG GCG 1742Phe Pro Leu AsnLeu Arg Arg Ala ThrGln Leu Asp lieGlu Ala515520525GAG TTC GAG CTGGAC CGC CGC GCC GTC ATG TCC CTC AACGAG GCC 1787Glu Phe Glu LeuAsp Arg Arg Ala ValMet Ser Leu AsnGlu Ala530535540
GAC GTG GGC TAC AAC TGC AGC ACC AGC GGC GGC GCC GCC GCC CGC     1832
Asp Val Gly Tyr Asn Cys Ser Thr Ser Gly Gly Ala Ala Ala Arg
                545                 550                 555
GGC GCG CTG GGC CCC TTC GGC CTG CTC GTC CTC ACC GAC AAG CAC     1877
Gly Ala Leu Gly Pro Phe Gly Leu Leu Val Leu Thr Asp Lys His
                560                 565                 570
CTG CAC GAG CAG ACG GCC GTC TAC TTC TAC GTG GCC AAA GGC CTG     1922
Leu His Glu Gln Thr Ala Val Tyr Phe Tyr Val Ala Lys Gly Leu
                575                 580                 585
GAC GGC TCC CTC ACC ACG CAC TTC TGC CAG GAC GAG TCC CGG TCG     1967
Asp Gly Ser Leu Thr Thr His Phe Cys Gln Asp Glu Ser Arg Ser
                590                 595                 600
TCC AGC GCC AAC GAC ATC GTC AAG CGC GTC GTC GGC AGC GCC GTC     2012
Ser Ser Ala Asn Asp Ile Val Lys Arg Val Val Gly Ser Ala Val
                605                 610                 615
CCC GTG CTG GAG GAC GAG ACC ACA CTC TCG CTT CGC GTG CTC GTC     2057
Pro Val Leu Glu Asp Glu Thr Thr Leu Ser Leu Arg Val Leu Val
                620                 625                 630
GAC CAC TCC ATC GTC GAG AGC TTC GCG CAG GGT GGA AGG TCA ACG     2102
Asp His Ser Ile Val Glu Ser Phe Ala Gln Gly Gly Arg Ser Thr
                635                 640                 645
GCC ACC TCG CGC GTC TAC CCC ACC AAG GCC ATC TAC GCC AAC GCC     2147
Ala Thr Ser Arg Val Tyr Pro Thr Lys Ala Ile Tyr Ala Asn Ala
                650                 655                 660
GGC GTG TTC CTC TTC AAC AAC GCC ACC GCC GCG CGC GTC ACC GCC     2192
Gly Val Phe Leu Phe Asn Asn Ala Thr Ala Ala Arg Val Thr Ala
                665                 670                 675
AAG AAG CTC GTC GTC CAC GAG ATG GAC TCG TCC TAC AAC CAC GAC     2237
Lys Lys Leu Val Val His Glu Met Asp Ser Ser Tyr Asn His Asp
                680                 685                 690
TAC ATG GTC ACG GAC ATC                                         2255
Tyr Met Val Thr Asp Ile
                695 696
TGATGCTGCT GCTGCTGCTG CTGCTGCTGA CCCGTCGTCC ATCCAACCCA CCGCTGCACC    2315
CAATTTTTTG AACCCATATA TAGCGAAGCA TCTTCTTGTA CCTAAAAAAA AAAAAAAA      2373

<210> 2
<211> 696
<212> PRT
<213> Sugarcane (Saccharum officinarum)
<400> 2
Met Glu Thr Arg Asp Thr Thr Ala Pro Leu Pro Tyr Ser Tyr Thr
1               5                   10                  15
Pro Leu Pro Ala Ala Asp Ala Ala Ser Ala Glu Val Thr Gly Thr
                20                  25                  30
Gly Gly Ser Arg Ser Arg Arg Arg Arg Pro Leu Cys Ala Ala Ala
                35                  40                  45
Leu Val Leu Ser Ala Ala Leu Leu Leu Ala Val Ala Ala Leu Ala
                50                  55                  60
Gly Val Gly Ser Arg Val Ala Ala Val Val Pro Arg Pro Thr Thr
                65                  70                  75
Ala Val Gly Glu Thr Ala Gly Val Gly Val Gly Val Val Pro Gly
                80                  85                  90
Ala Gly Thr Pro Gln Ala Thr Ser Thr Arg Ser Arg Ser Arg Gly
                95                  100                 105
Pro Asp Ala Gly Val Ser Glu Lys Thr Ser Gly Val Trp Thr Gly
                110                 115                 120
Val Ile Asp Asp Gly Ala Arg Leu Arg Thr Asp Ala Gly Gly Asn
                125                 130                 135
Ala Phe Pro Trp Ser Asn Ala Met Leu Gln Trp Gln Arg Thr Gly
                140                 145                 150
Phe His Phe Gln Pro Gln Arg Asn Trp Met Asn Asp Pro Asn Gly
                155                 160                 165
Pro Val Tyr Tyr Lys Gly Trp Tyr His Leu Phe Tyr Gln Tyr Asn
                170                 175                 180
Pro Asp Gly Ala Ile Trp Gly Asn Lys Ile Ala Trp Gly His Ala
                185                 190                 195
Val Ser Arg Asp Leu Ile His Trp Arg His Leu Pro Leu Ala Met
                200                 205                 210
Leu Pro Asp Gln Trp Tyr Asp Thr Asn Gly Val Trp Thr Gly Ser
                215                 220                 225
Ala Thr Thr Leu Pro Asp Gly Arg Leu Ala Met Leu Tyr Thr Gly
                230                 235                 240
Ser Thr Asn Thr Ser Val Gln Val Gln Cys Leu Ala Val Pro Ala
                245                 250                 255
Asp Asp Asp Asp Pro Leu Leu Thr Asn Trp Thr Lys Tyr Glu Gly
                260                 265                 270
Asn Pro Ala Leu Tyr Pro Pro Pro Gly Ile Gly Pro Arg Asp Phe
                275                 280                 285
Arg Asp Pro Thr Thr Ala Trp Phe Asp Pro Ser Asp Ser Thr Trp
                290                 295                 300
Arg Ile Val Ile Gly Ser Lys Asp Asp Ala Glu Gly Asp His Ala
                305                 310                 315
Gly Ile Ala Val Val Tyr Arg Thr Arg Asp Phe Val His Phe Glu
                320                 325                 330
Leu Leu Pro Asp Leu Leu His Arg Val Ala Gly Thr Gly Met Trp
                335                 340                 345
Glu Cys Ile Asp Phe Tyr Pro Val Ala Thr Arg Gly Lys Ala Ser
                350                 355                 360
Gly Asn Gly Val Asp Met Ser Asp Ala Leu Ala Lys Asn Gly Ala
                365                 370                 375
Val Val Gly Asp Val Val His Val Met Lys Ala Ser Met Asp Asp
                380                 385                 390
Asp Arg His Asp Tyr Tyr Ala Leu Gly Arg Tyr Asp Ala Ala Ala
                395                 400                 405
Asn Ala Trp Thr Pro Leu Asp Ala Glu Lys Asp Val Gly Thr Gly
                410                 415                 420
Leu Arg Tyr Asp Trp Gly Lys Phe Tyr Ala Ser Lys Thr Phe Tyr
                425                 430                 435
Asp Pro Ala Lys Arg Arg Arg Val Leu Trp Gly Trp Val Gly Glu
                440                 445                 450
Thr Asp Ser Glu Arg Ala Asp Val Ser Lys Gly Trp Ala Ser Leu
                455                 460                 465
Gln Gly Ile Pro Arg Thr Val Leu Leu Asp Thr Lys Thr Gly Ser
                470                 475                 480
Asn Leu Leu Gln Trp Pro Val Glu Glu Val Glu Thr Leu Arg Thr
                485                 490                 495
Asn Ser Thr Asp Leu Ser Gly Ile Thr Ile Asp Tyr Gly Ser Thr
                500                 505                 510
Phe Pro Leu Asn Leu Arg Arg Ala Thr Gln Leu Asp Ile Glu Ala
                515                 520                 525
Glu Phe Glu Leu Asp Arg Arg Ala Val Met Ser Leu Asn Glu Ala
                530                 535                 540
Asp Val Gly Tyr Asn Cys Ser Thr Ser Gly Gly Ala Ala Ala Arg
                545                 550                 555
Gly Ala Leu Gly Pro Phe Gly Leu Leu Val Leu Thr Asp Lys His
                560                 565                 570
Leu His Glu Gln Thr Ala Val Tyr Phe Tyr Val Ala Lys Gly Leu
                575                 580                 585
Asp Gly Ser Leu Thr Thr His Phe Cys Gln Asp Glu Ser Arg Ser
                590                 595                 600
Ser Ser Ala Asn Asp Ile Val Lys Arg Val Val Gly Ser Ala Val
                605                 610                 615
Pro Val Leu Glu Asp Glu Thr Thr Leu Ser Leu Arg Val Leu Val
                620                 625                 630
Asp His Ser Ile Val Glu Ser Phe Ala Gln Gly Gly Arg Ser Thr
                635                 640                 645
Ala Thr Ser Arg Val Tyr Pro Thr Lys Ala Ile Tyr Ala Asn Ala
                650                 655                 660
Gly Val Phe Leu Phe Asn Asn Ala Thr Ala Ala Arg Val Thr Ala
                665                 670                 675
Lys Lys Leu Val Val His Glu Met Asp Ser Ser Tyr Asn His Asp
                680                 685                 690
Tyr Met Val Thr Asp Ile
                695 696

Claims
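1. A sugarcane GH32 family soluble acid invertase (SoINV) gene and its protein sequence, characterized in that: on the basis of a phylogenetic analysis of the full-length sequences of all plant soluble acid invertase genes, the cDNA sequences of soluble acid invertase genes of the same family are aligned and primers for amplifying the core fragment of the SoINV gene are designed; total RNA is extracted from young sugarcane leaves and reverse-transcribed, and the core fragment of the SoINV gene is amplified by conventional PCR; several pairs of specific primers are then designed at the 5' end of the core fragment sequence, and one specific primer and two anchor primers are designed at the 3' end of the core fragment sequence; by RACE PCR, three different sequence fragments at the 5' end and one sequence fragment at the 3' end of the SoINV core region are amplified; the three 5'-end sequences are each assembled with the core region, the 3'-end sequence and the AY302083 fragment to obtain the full-length cDNA sequences of three sugarcane GH32 family soluble acid invertase (SoINV) genes, designated SoINV1, SoINV2 and SoINV3.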
2. The sugarcane GH32 family soluble acid invertase (SoINV) gene and its protein sequence according to claim 1, characterized in that: said SoINV1 has a total length of 2387 bp; as analyzed with Vector NTI Advance 11 software, the ORF of the sequence is 2055 bp and encodes 685 amino acids; the start codon (ATG) is located at 215 bp from the transcription start site and the stop codon (TAG) at 2272 bp, followed by a 115 bp non-coding sequence carrying the poly(A) tail typical of eukaryotes; the amino acid sequence of the protein encoded by the sugarcane GH32 family SoINV1 gene is: METRDTTAPLPYSYTPLPAADAASAEVTGTGGRSRRRSLCAAALVLSAALLLAVAALAAAGRRPTTAVGETAGVGVVPGVGTPQATSTRSISRGPDAGVSEKTSGAWSGVVDDGGRLRADGGGNAFPWSNAMLQWQRTGFHFQPQRNWMNDPNGPVYYKGWYHLFYQYNPDGAIWGNKIAWGHAVSRDLIHWRHLPLAMLPDQWYDTNGVWTGSATTLPDGRLAMLYTGSTNTSVQVQCLAVPADDDDPLLTNWTKYEGNPALYPPPGIGPRDFRDPTTAWFDPSDSTWRIVIGSKDDAEGDHAGIAVVYRTRDFVHFELLPDLLHRVAGTGMWECIDFYPVATRGKASGNGVDMSDALAKNGAVVGDVVHVMKASMDDDRHDYYALGRYDAAANAWTPLDAEKDVGTGLRYDWGKFYASKTFYDPAKRRRVLWGWVGETDSERADVSKGWASLQGIPRTVLLDTKTGSNLLQWPVEEVETLRTNSTDLSGITIDYGSTFPLNLRRATQLDIEAEFELDRRAVMSLNEADVGYNCSTSGGAAARGALGPFGLLVLTDKHLHEQTAVYFYVAKGLDGSLTTHFCQDESRSSSANDIVKRVVGSAVPVLEDETTLSLRVLVDHSIVESFAQGGRSTATSRVYPTKAIYANAGVFLFNNATAARVTAKKLVVHEMDSSYNHDYMVTDI.
3. The sugarcane GH32 family soluble acid invertase (SoINV) gene and its protein sequence according to claim 1, characterized in that: said SoINV2 has a total length of 2429 bp; as analyzed with Vector NTI Advance 11 software, the ORF of the sequence is 2085 bp and encodes 695 amino acids; the start codon (ATG) is located at 227 bp from the transcription start site and the stop codon (TAA) at 2314 bp, followed by a 115 bp non-coding sequence carrying the poly(A) tail typical of eukaryotes; the amino acid sequence of the protein encoded by the sugarcane GH32 family SoINV2 gene is: METRDTTAPLPYSYTPLPAADAASAEVTGTGHRGGGRSRRSSLCAAALVLSAALLLAVAALAGVGGRVAVVPRPTTAVGETAGVGVGPGAGTPQATSTRSISRGPDAGVSEKTSGAWSGVVDDGGRLRADGGGNAFPWSNAMLQWQRTGFHFQPQRNWMNDPNGPVYYKGWYHLFYQYNPDGAIWGNKIAWGHAVSRDLIHWRHLPLAMLPDQWYDTNGVWTGSATTLPDGRLAMLYTGSTNTSVQVQCLAVPADDDDPLLTNWTKYEGNPALYPPPGIGPRDFRDPTTAWFDPSDSTWRIVIGSKDDAEGDHAGIAVVYRTRDFVHFELLPDLLHRVAGTGMWECIDFYPVATRGKASGNGVDMSDALAKNGAVVGDVVHVMKASMDDDRHDYYALGRYDAAANAWTPLDAEKDVGTGLRYDWGKFYASKTFYDPAKRRRVLWGWVGETDSERADVSKGWASLQGIPRTVLLDTKTGSNLLQWPVEEVETLRTNSTDLSGITIDYGSTFPLNLRRATQLDIEAEFELDRRAVMSLNEADVGYNCSTSGGAAARGALGPFGLLVLTDKHLHEQTAVYFYVAKGLDGSLTTHFCQDESRSSSANDIVKRVVGSAVPVLEDETTLSLRVLVDHSIVESFAQGGRSTATSRVYPTKAIYANAGVFLFNNATAARVTAKKLVVHEMDSSYNHDYMVTDI.
4. The sugarcane GH32 family soluble acid invertase (SoINV) gene and its protein sequence according to claim 1, characterized in that: said SoINV3 has a total length of 2373 bp; as analyzed with Vector NTI Advance 11 software, the ORF of the sequence is 2088 bp and encodes 696 amino acids; the start codon (ATG) is located at 168 bp from the transcription start site and the stop codon (TAA) at 2258 bp, followed by a 115 bp non-coding sequence carrying the poly(A) tail typical of eukaryotes; the amino acid sequence of the protein encoded by the sugarcane GH32 family SoINV3 gene is: METRDTTAPLPYSYTPLPAADAASAEVTGTGGSRSRRRRPLCAAALVLSAALLLAVAALAGVGSRVAAVVPRPTTAVGETAGVGVGVVPGAGTPQATSTRSRSRGPDAGVSEKTSGVWTGVIDDGARLRTDAGGNAFPWSNAMLQWQRTGFHFQPQRNWMNDPNGPVYYKGWYHLFYQYNPDGAIWGNKIAWGHAVSRDLIHWRHLPLAMLPDQWYDTNGVWTGSATTLPDGRLAMLYTGSTNTSVQVQCLAVPADDDDPLLTNWTKYEGNPALYPPPGIGPRDFRDPTTAWFDPSDSTWRIVIGSKDDAEGDHAGIAVVYRTRDFVHFELLPDLLHRVAGTGMWECIDFYPVATRGKASGNGVDMSDALAKNGAVVGDVVHVMKASMDDDRHDYYALGRYDAAANAWTPLDAEKDVGTGLRYDWGKFYASKTFYDPAKRRRVLWGWVGETDSERADVSKGWASLQGIPRTVLLDTKTGSNLLQWPVEEVETLRTNSTDLSGITIDYGSTFPLNLRRATQLDIEAEFELDRRAVMSLNEADVGYNCSTSGGAAARGALGPFGLLVLTDKHLHEQTAVYFYVAKGLDGSLTTHFCQDESRSSSANDIVKRVVGSAVPVLEDETTLSLRVLVDHSIVESFAQGGRSTATSRVYPTKAIYANAGVFLFNNATAARVTAKKLVVHEMDSSYNHDYMVTDI.
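5. The sugarcane GH32 family soluble acid invertase (SoINV) gene and its protein sequence according to claim 1, characterized in that, for the full-length cDNA sequences of said three sugarcane GH32 family soluble acid invertase (SoINV) genes, the core region sequence (M-SoINV) is amplified using the following nucleotide sequences as primers: N f1: 5'-tctggggcaacaagatcgcgt-3'; N r1: 5'-aaattgggtgcagcggtgggt-3'.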
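6. The sugarcane GH32 family soluble acid invertase (SoINV) gene and its protein sequence according to claim 1, characterized in that said 3'-end sequence (3'-SoINV) is obtained by nested amplification using the following nucleotide sequence as a primer in combination with the primers supplied with the 3'-Full RACE Kit (TAKARA): GZ f2: 5'-cctcttcaacaacgccaccgccg-3'.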
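7. The sugarcane GH32 family soluble acid invertase (SoINV) gene and its protein sequence according to claim 1, characterized in that said 5'-end sequences (5'-SoINV) are each obtained by nested amplification using the following nucleotide sequences as primers in combination with the primers supplied with the 5'-Full RACE Kit (TAKARA): primers for amplifying the 5' end of the sugarcane GH32 family soluble acid invertase SoINV1 gene: INVSY R1 (5'-gttggtggagccggtgtagagcat-3'); INVR1 (5'-gcccctgctgatgctcctggtcg-3'); primers for amplifying the 5' end of the sugarcane GH32 family soluble acid invertase SoINV2 gene: INVSY R1 (5'-gttggtggagccggtgtagagcat-3'); INVR2 (5'-acgtcgcctgtggtgtcccc-3'); primers for amplifying the 5' end of the sugarcane GH32 family soluble acid invertase SoINV3 gene: INVSY R1 (5'-gttggtggagccggtgtagagcat-3'); INVR3 (5'-ggggtcgttcatccagttcctct-3').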
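As an illustration of how the primers recited in claims 5 and 7 relate to the sequence listing, the following minimal Python sketch searches a short stretch of the SoINV3 cDNA (nucleotides 618 to 752, copied from the listing) for primer N f1 on the sense strand and for primer INVR3 as its reverse complement. The helper names `reverse_complement` and `locate` are hypothetical and are not part of the patent; exact string matching is assumed, which ignores mismatches that real primers may tolerate.

```python
# Nucleotides 618-752 of the SoINV3 cDNA, copied from the sequence listing.
FRAGMENT_START = 618
FRAGMENT = (
    "TTCCACTTCCAGCCGCAGAGGAACTGGATGAACGACCCCAATGGC"
    "CCGGTGTACTACAAGGGCTGGTACCACCTGTTCTACCAATACAAC"
    "CCGGACGGCGCCATCTGGGGCAACAAGATCGCGTGGGGCCACGCC"
)

# Primer sequences as written in claims 5 and 7 (5' to 3').
PRIMERS = {
    "N f1": "tctggggcaacaagatcgcgt",
    "INVR3": "ggggtcgttcatccagttcctct",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def locate(primer: str, template: str = FRAGMENT, offset: int = FRAGMENT_START):
    """Find an exact primer match on either strand.

    Returns (strand, first_nt, last_nt) in listing coordinates (1-based),
    or None when the primer does not occur in the template.
    """
    p = primer.upper()
    for strand, query in (("+", p), ("-", reverse_complement(p))):
        i = template.find(query)
        if i != -1:
            return strand, offset + i, offset + i + len(query) - 1
    return None


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, locate(seq))
```

Under these assumptions N f1 reads in the sense direction and INVR3 in the antisense direction, which is what one would expect of a forward core primer and a gene-specific 5' RACE primer.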
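Abstract
The invention discloses a sugarcane GH32 family soluble acid invertase (SoINV) gene and its protein sequence. On the basis of a phylogenetic analysis of the full-length sequences of all plant soluble acid invertase genes, primers were designed from the conserved regions of soluble acid invertase gene sequences of the same family; total RNA was extracted from young sugarcane leaves, reverse-transcribed and amplified by conventional PCR, and, in combination with RACE PCR, the full-length cDNA sequences of three members of the sugarcane soluble acid invertase gene family were cloned. This lays a foundation for studying the transcription and expression mechanisms of sugarcane soluble acid invertase and for further exploring the mechanism of sucrose accumulation, and it allows biologically active purified protein to be obtained from the amino acid sequence. It resolves the technical difficulty that, within the glycosyl hydrolase family GH32 gene family, only some cDNA sequence fragments of the sugarcane INV gene had been cloned and no full-length nucleotide sequence had been obtained, and it provides a basis for studying the biological function of soluble acid invertase.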
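The lengths quoted in claims 2 to 4 for the three full-length cDNAs can be cross-checked with simple arithmetic. The sketch below is only an illustration: the class and function names are hypothetical, and it assumes that the stated ATG position is the 1-based position of the A, that the stated stop-codon position is the last base of the stop codon, and that the ORF length excludes the stop codon.

```python
from dataclasses import dataclass

TRAILING_NONCODING_BP = 115  # non-coding bases after the stop codon, per the claims


@dataclass(frozen=True)
class CdnaFigures:
    total_bp: int   # total cDNA length
    orf_bp: int     # ORF length, assumed to exclude the stop codon
    residues: int   # number of encoded amino acids
    atg_pos: int    # assumed 1-based position of the A of the start codon
    stop_end: int   # assumed position of the last base of the stop codon


# Values as stated in claims 2, 3 and 4.
FIGURES = {
    "SoINV1": CdnaFigures(2387, 2055, 685, 215, 2272),
    "SoINV2": CdnaFigures(2429, 2085, 695, 227, 2314),
    "SoINV3": CdnaFigures(2373, 2088, 696, 168, 2258),
}


def consistency(f: CdnaFigures) -> dict:
    """Check that the quoted figures agree with each other."""
    return {
        # three nucleotides per encoded residue
        "orf_is_3x_residues": f.orf_bp == 3 * f.residues,
        # last coding base is atg_pos + orf_bp - 1; the stop codon adds 3 more
        "stop_directly_after_orf": f.atg_pos + f.orf_bp + 2 == f.stop_end,
        # the 115 bp tail accounts for the rest of the transcript
        "length_is_stop_plus_tail": f.stop_end + TRAILING_NONCODING_BP == f.total_bp,
    }


if __name__ == "__main__":
    for name, figures in FIGURES.items():
        print(name, consistency(figures))
```

Under the stated reading of the coordinates, all three checks hold for each gene; a different coordinate convention would shift the middle check by a constant.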
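Document number: C12N15/56GK102505021SQ201110365748
Publication date: June 20, 2012. Filing date: November 17, 2011. Priority date: November 17, 2011.
Inventors: 刘铭, 李杨瑞, 杨丽涛, 牛俊奇, 王爱勤. Applicant: Guangxi University (广西大学)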