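Japanese flounder follistatin gene sequence and its encoded protein sequence

Document No.: 3575425  Source: China National Intellectual Property Administration
Patent title: Japanese flounder follistatin gene sequence and its encoded protein sequence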
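Technical field
The present invention relates to the "all-fish" marine organism Japanese flounder (Paralichthys olivaceus), and specifically to the genomic sequence, cDNA sequence and protein sequence of Japanese flounder FOLLISTATIN.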
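Background art
Follistatin was first identified in 1987 in mouse ovarian follicular fluid as a substance that inhibits follicle-stimulating hormone. Its biological function is to bind members of the TGF-β superfamily and thereby act during embryonic development and in adult tissues. During embryonic development, Follistatin (together with two other BMP antagonists, noggin and chordin) can block ventralization by binding BMP-4, so that the ectoderm develops into neural tissue. With regard to myogenesis, phenotypic studies of mice homozygous for a Follistatin null mutation indicate that Follistatin may play a role in myotome morphogenesis. Mice lacking the gene die soon after birth from multiple defects, including insufficient muscle development and skeletal malformations. Follistatin promotes muscle development and growth: mice lacking Follistatin are smaller than heterozygous mice, have less diaphragm muscle and smaller intercostal muscles, and die shortly after birth from respiratory failure. In addition, the interaction between Follistatin and Myostatin has been used to induce a "super-muscle" phenotype in cattle and mice. The Follistatin gene is now known to be widely distributed among vertebrates and structurally conserved, and it has been cloned from several fish species; however, the cloning and sequence of the Japanese flounder FOLLISTATIN gene and the protein sequence it encodes have not been reported.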

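Summary of the invention
The present invention clones the Japanese flounder Follistatin gene and provides its genomic sequence, promoter sequence, cDNA sequences and protein sequences. The sequences are derived from Japanese flounder and are designated Japanese flounder Follistatin.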
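To achieve the above object, the technical solution of the present invention is as follows: a Japanese flounder FOLLISTATIN gene sequence having the base sequence of SEQ ID NO.1 in the sequence listing.
A Japanese flounder Follistatin promoter having bases 1 to 1016 of the base sequence of SEQ ID NO.1 (or a base sequence with at least 85% sequence identity thereto).
A Japanese flounder FOLLISTATIN cDNA sequence 1 having the base sequence of SEQ ID NO.2 in the sequence listing, whose encoded protein has the amino acid sequence shown in SEQ ID NO.3 (or a protein sequence with at least 90% identity thereto); and a Japanese flounder FOLLISTATIN cDNA sequence 2 having the base sequence of SEQ ID NO.4 in the sequence listing, whose encoded protein has the amino acid sequence shown in SEQ ID NO.5 (or a protein sequence with at least 90% identity thereto).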
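The present invention has determined the Japanese flounder Follistatin gene sequence, promoter sequence, cDNA sequences and protein sequences. The Follistatin gene can be used to treat various diseases caused by overexpression of TGF-β family proteins. The two FOLLISTATIN cDNA sequences of the invention can be expressed at full length in a variety of expression systems (for example bacteria, yeast or cultured cells), and the two expressed proteins (or proteins with at least 90% identity to them) and their derivatives (products obtained by modifying the two proteins in various ways) can be developed and applied in many ways. The Follistatin promoter of the invention can drive specific expression of different genes in different tissues (for example muscle or brain) and cells. The gene can also be used for species improvement, which benefits aquaculture production.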
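The identity thresholds stated above (at least 85% for the promoter, at least 90% for the proteins) can be checked once two sequences have been aligned. The following is a minimal sketch in Python. The patent does not state how identity is calculated, so the definition used here (identical non-gap columns divided by alignment length) and the function names `percent_identity` and `meets_threshold` are assumptions for illustration only. The example uses a short hand-aligned stretch where protein sequence 1 carries two glycine residues that protein sequence 2 lacks.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption (not from the patent): identity is the number of identical
    non-gap columns divided by the total alignment length.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Fragments of protein sequence 1 (SEQ ID NO.3) and protein sequence 2
# (SEQ ID NO.5), aligned by hand; "-" marks residues absent from sequence 2.
frag_1 = "NCIPCKGGETCDN"
frag_2 = "NCIPCK--ETCDN"

if __name__ == "__main__":
    print(f"{percent_identity(frag_1, frag_2):.1f}% identity over this fragment")
```

Over full-length sequences the same function would be applied to the output of an alignment tool; the choice of denominator affects the result and should be fixed before comparing against a threshold.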


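Brief description of the drawings
Fig. 1 shows the digestion results for Japanese flounder genomic DNA; Fig. 2 shows the results of the first round of Genomewalking, in which lanes 1-7 are the first PCR, lanes 8-14 are the second PCR, and M is the DNA marker.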
Detailed description of the embodiments
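Example 1
1. Construction of the Japanese flounder Genomewalking genomic library
a. Genomic DNA extraction: fresh or -20℃-stored Japanese flounder tissue such as muscle, liver, fin, testis or milt is used as the source for genomic DNA extraction.
The digestion buffer consists of:
Tris-HCl, pH 8.0    10 mmol/L
EDTA, pH 8.0        100 mmol/L
SDS, pH 7.2         1%
Proteinase K        100 μg/ml
RNase A             20 μg/ml
The tissue is minced and ground, digestion buffer is added at 1:1 (weight in g : volume in ml), and the mixture is digested at 55℃ for 3 hours. It is then extracted once each with an equal volume of phenol, phenol:chloroform (1:1) and chloroform:isoamyl alcohol (24:1). Finally, 0.2 volume of 10 mol/L ammonium acetate and 2 volumes of pre-chilled absolute ethanol are added to the supernatant to precipitate the DNA. After centrifugation the supernatant is discarded, the pellet is washed twice with 70% ethanol and dried at room temperature, TE is added at 1:500 (original weight in g : volume in μl) to dissolve the DNA, and the solution is stored at -20℃.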
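b. The genomic DNA is digested to completion with seven blunt-end restriction enzymes. The digestion results are shown in Fig. 1, where the lanes are: G, genomic DNA; 1, Dra I; 2, Pvu II; 3, EcoR V; 4, Stu I; 5, Sma I; 6, Hpa I; 7, Sca I.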
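c. Adaptor ligation and construction of the Genomewalking library. The Genomewalker Adaptors supplied with the Clontech Universal Genomewalker kit are ligated to the digested genomic DNA to construct the library.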
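The blunt-end digestion in step b can be previewed in silico by scanning a sequence for the recognition sites of the seven enzymes. The sketch below is an illustration, not part of the patented procedure: the recognition sequences are the standard ones for these enzymes (each a 6-bp palindrome cut in the middle), while the function name `blunt_cut_positions` and the choice of test fragment (a short stretch taken from the promoter region of the genomic sequence in this document) are assumptions made here.

```python
# Recognition sites of the seven blunt-end enzymes named in step b.
# Each enzyme cuts in the middle of its palindromic 6-bp site, leaving blunt ends.
BLUNT_SITES = {
    "Dra I": "TTTAAA",
    "Pvu II": "CAGCTG",
    "EcoR V": "GATATC",
    "Stu I": "AGGCCT",
    "Sma I": "CCCGGG",
    "Hpa I": "GTTAAC",
    "Sca I": "AGTACT",
}


def blunt_cut_positions(seq: str) -> dict:
    """Map each enzyme to the 0-based indices of the first base after each cut.

    Because every site is palindromic, scanning one strand is sufficient.
    """
    seq = seq.upper()
    cuts = {}
    for enzyme, site in BLUNT_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start + len(site) // 2)
            start = seq.find(site, start + 1)  # allow overlapping sites
        cuts[enzyme] = positions
    return cuts


# Hypothetical test input: a short fragment copied from the genomic sequence.
FRAGMENT = "GTTTGATTATATTTAAAAGCTTGACCCGGGAGAGAAACC"

if __name__ == "__main__":
    for enzyme, positions in blunt_cut_positions(FRAGMENT).items():
        print(enzyme, positions)
```

Running the scan over a full genomic sequence gives the expected fragment boundaries for each library, which helps when interpreting Genomewalking band sizes.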
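2. RNA extraction
a. Fresh or -80℃-stored embryos at different stages, or tissues such as muscle, are used as the source for RNA extraction;
b. RNA is extracted with Trizol (Invitrogen) following the Trizol reagent manual provided by Invitrogen;
3. Reverse transcription to obtain the cDNA library
First-strand cDNA is synthesized with a Promega reverse transcription kit according to the kit manual, which yields the cDNA library.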
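2. Cloning of the Japanese flounder genomic Follistatin gene
Nested PCR uses one of the Clontech enzyme systems: Advantage Genomic PCR Kit, Advantage Genomic Polymerase Mix, Advantage cDNA PCR Kit or Advantage cDNA Polymerase Mix. Depending on the thermal cycler used, nested PCR is run with the corresponding program given in the Clontech Universal Genomewalker kit.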
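a. Based on conserved regions of known Follistatin gene sequences from zebrafish, other fish and other animals, a pair of forward nested PCR primers is synthesized: follyp-1-1 (5'-GGBGCNCCMAAYTGCATMCCNTGYA-3') and follyp-1-2 (5'-TRGACTGTGGNCCYGGRAARARATG-3'), where B = T, C or G; N = A, T, G or C; M = A or C; Y = C or T; R = G or A;
b. These are used together with the nested adaptor primers AP1 (5'-GTAATACGACTCACTATAGGGC-3') and AP2 (5'-ACTATAGGGCACGCGTGGT-3') supplied with the Clontech Universal Genomewalker kit (Clontech, USA) for the first round of Genomewalking. The results are shown in Fig. 2.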
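c. The single strong band obtained from the second PCR of the first round of Genomewalking is purified and recovered, cloned into the pUMc-T vector (Sangon, Shanghai), the plasmid is amplified, and the insert is sequenced with the universal sequencing primers T7 and Sp6.
d. Comparison of the sequencing result for this fragment shows that it belongs to the Follistatin gene.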
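e. Based on the Japanese flounder Follistatin gene fragment obtained, further nested PCR primers are synthesized: for walking downstream, Follyp-2s-1 (5'-CGGCTCAGATGGAAAGACCTACAAAG-3') and Follyp-2s-2 (5'-GCACTGCTGAAGGCTAAATGCAAAGG-3'); for walking upstream, Follyp-2as-1 (5'-CCTTTGCATTTAGCCTTCAGCAGTG-3') and Follyp-2as-2 (5'-CGTCTTTGTAGGTCTTTCCATCTGAG-3'). These are used together with the nested adaptor primers AP1 and AP2 supplied with the Clontech Universal GenomeWalker kit for the second round of Genomewalking.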
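f. The fragments obtained are sequenced, and from the resulting sequence new primers are designed: for walking downstream, FollYP-3s-1 (5'-GATCCTGCTTGAAAAGCGTTCGTG-3') and FollYP-3s-2 (5'-CTAAGTCGTGTGAGGACATCCAG-3'); for walking upstream, Follyp-3as-1 (5'-GTGAAGAGCGAACCCACTCTGAAAAG-3') and Follyp-3as-2 (5'-TTTTCCTGCGGCTGACAGCAGTGGAAG-3'). These are used with AP1 and AP2 for nested PCR;
g. The fragments obtained are again sequenced, and from the resulting sequence the downstream-walking primers Fsyp-4s-1 (5'-TCCCACGGAAACTGTGTGTCAATG-3') and Fsyp-4s-2 (5'-CTACAGAGAACTTGAGGTTCTGTC-3') are designed. Nested PCR is stopped once computer analysis indicates that the complete Follistatin gene has been obtained.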
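Two properties of the primers listed in steps a to g are easy to check by computer: how many distinct oligonucleotides a degenerate primer represents, and whether an upstream-walking primer is the reverse complement of a downstream-walking one. The sketch below is an illustration under stated assumptions: the helper names `degeneracy` and `reverse_complement` are introduced here, the ambiguity table is the standard IUPAC nucleotide code (of which the source uses only B, N, M, Y and R), and the primer strings are typed without internal whitespace.

```python
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def degeneracy(primer: str) -> int:
    """Number of distinct unambiguous sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Primers copied from steps a and e.
FOLLYP_1_1 = "GGBGCNCCMAAYTGCATMCCNTGYA"
FOLLYP_1_2 = "TRGACTGTGGNCCYGGRAARARATG"
FOLLYP_2S_2 = "GCACTGCTGAAGGCTAAATGCAAAGG"
FOLLYP_2AS_1 = "CCTTTGCATTTAGCCTTCAGCAGTG"

if __name__ == "__main__":
    print("follyp-1-1 degeneracy:", degeneracy(FOLLYP_1_1))
    print("follyp-1-2 degeneracy:", degeneracy(FOLLYP_1_2))
    # Follyp-2as-1 covers the same site as Follyp-2s-2 on the opposite strand,
    # one base shorter at its 3' end.
    print(reverse_complement(FOLLYP_2S_2).startswith(FOLLYP_2AS_1))
```

The reverse-complement check reflects the design in step e, where the upstream and downstream walking primers are anchored on the same known stretch of sequence and point in opposite directions.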
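(1) Sequence characteristics: genomic sequence, 4327 base pairs; type: nucleotide; strandedness: double-stranded; topology: linear
(2) Molecule type: DNA
(3) Hypothetical: no
(4) Antisense: no
(5) Original source: Japanese flounder (Paralichthys olivaceus).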
完整的Follistatin基因的基因组序列如下CTGTACCCAGGGTTCCGCTGAAGAGGTCGCGCAAGGACTTCTGAAGGAGTACAGCCTGTGCGTCAGTGAGTGGAGAGCTGTGTGTGTGTCTGTGTGTGTGTGCGTGTGCGTGTTTCATTCAATGAAACGCAACACCCCGCTCCTCTGTCTGAAGCCTCTTCGGAAACAGGTTTGATTATATTTAAAAGCTTGACCCGGGAGAGAAACCCAAACAACTGTCTGTTCTAATATTAGCTGTGCGGCTGCGGACTACTATGAATTCTATTATTGTCTGGACTGGGATATATTCCCAAAAGGCAACTCTCATCAAAAGATTTACACTTTTGTCTGCAGTTGGTGGAAAAAATATCAGTTACCATCAAGCAGTATACTCTTAATTTGAAGGTATTATGATGGCATGCATGGACAGGATGTTGGCTGTGGGCCTGTTTTATTATACAGTAATAAAGTTTTTAATGCATAAAAAGCTATTTATCATTACTATTATTAATAATTAATGTTATTATTAATATTTTCATTAATAGTATCATTATTATTGCTTTAAAAGCACATGTAGAGGTCGTGCGTAAAAACGAACCGTGCGCCTGTTGAGAAAACTAAAAGGATTAAAGTTAATAACAGCATTGTGCATTAAAAACTAAATTCACACCCACCGGCAGCTTTTATCTGCTCTGCACAGTTCCACTAGTTTCACAGCAGAGCCACAGACACTCGTGGCGTTTTTATTACAACACCAGAGCTGTTTCCATGTTCTTTAATTGTTGTGGCAGGAGAAAGTTGCCTGTAACCATGTCACCTGATTGGGATTGAAAGAACTTTGGTAAGAAAAAAAAAAAAATCTCCCCACCCCCACAGAAGAGGAGACCGCCCACCAGAGACAGTCAACCCGAGACCCCTTATAGATTTAAAAAGAGAGGCTGCATTCTCAGACTGACACTTGTTCAACACTGCCACCTTAAGCAGATTACTTTTGCGCTGCCCGTGTCAAATACGTGCTTCACTTTGCCTCTCCATCATGTTTAGGATGCTGAAACACCACCTCCACCCGGGCATTTTTCTCTTCTTCATATGGCTTTGTCACCTCATGGAACATCAAAAAGTTCAAGGTAAGGATCTTCCTCATCCGTTTTTTTTTTTTCATTTTTATTCTCTCCGTTTTTTGGAAAAAGAAAAAAAAACTTCCACTGCTGTCAGCCGCAGGAAAACGTGCACCAGTGCCGGATAATTGCTGCAGCCGACAAACTTTTCAGAGTG
GGTTCGCTCTTCACCCGCTGATAAAAGTCTCCACATCAGACAGTGCGTGCGTAAAGTGTCCGCAAAACAGGTGCCATATGGGACACTTCAAAGTGAGACTGTACGGCAAGTTTTTGCAGCCCTCTCCAGGGGCACGCACGGCGCTGGACGCAGCCTCAGCTTCTCTCAATATCTCAACATGTTCTCATACTTTTTCTCTCAGAATTCAATGTCTGTGCGTGTTTTCGCCAAACCTGTAACAAACTTGAAAAATGTGACTTCATGGGCGTGTTATTTGTATATTTCCCGCATCCAGCATTCATGCGTAAAATTGCTTTTATTGGCAATAGTTTTATTTTACTATGACAAAAATTATAAAACTGGAAACTTTTACACGCATCACTCTTTCTTGAGAGGTTCTAGTTGAGATAGATACAAATGGAAATGTGTCTTCAGTTTAGGAAACACACCATGAACACTTACTGTCATGTGTGTCTGTGTCGGTTTACACACCCTGCTGCTGTCTGGGTCGAATTATATTCCTGTGTCATTGATTTAAGATTCGTTTTAGTTTGAATGTGTTTAAAATGAACTTGGTACTGCTTTGTTAATAAACTAAAGAGGCGTGTTGGGTAAATGTCTGGGATATCGCCCTGAGGCGGCGGAGACAGTGGAGTCTTGCTACCTTAAGACGGTGGTTTCCTCTCTGAGGGAGACTTAACACTGTCTGTACGCTGCTCTCTGTCCAGTGAIAACAAGGAGTCTGGGAATAAAGCACAACCTCCCTCCTAAGATTCTTTTCACTTTTTGGGGTTTGTGGCTGCATGGCATGGGCTGTCAGTCACAGACTGGATAGTGTTAACAGTCTCGCAAAAAAAAAATCAGACAAAATAGTGCCGGGTGGGAGAAAGGGGAGGGAGCGTCTCGGATAATACGGGCTGCTTGTGTGTCTGGTGCGTCTGGTCCAGCGCTTTGAGGCGTTCAACATGCAACAATCGAGGTTTTCTTATGTTTTTTCTGTTTTCAGCCGGGAACTGCTGGTTGCAGCAGGGGAAGAACGGGAGGTGCCAGGTGCTGTACATGCCCGGTATGAGCAGGGAGGAGTGCTGCCGGAGCGGAAGACTGGGGACGTCCTGGACCGAGGAGGACGTCCCTAACAGCACGCTCTTTAGGTGGATGATCTTCAATGGCGGAGCCCCCAATTGCATACCTTGCAAAGGTGGAGGTGCACTCATTTTCCTTCGTTTTTTTCATAACATACATGCTCGATCATTTTCTCTCATGAACCATATGGTGATGCTCTGATTTGCGCATGCCAAAGTGCATAAAATGCCCATTTTGCAATGCGTAATTTCACGCACAAGGTCAAAATCCTCGCCAGCGTCACTTACCAGAGACCTCTTTATTTTTCAGAAACCTGCGATAATGTTGACTGTGGGCCGGGAAAGAGGTGCAAGATGAACAGAAGAAGCAAGCCGCGCTGCGTGTGCGCGCCAGACTGCTCCAACATCACCTGGAAAGGACCGGTCTGCGGCTCAGATGGAAAGACCTACAAAGACGAATGCGCACTGCTGAAGGCTAAATGCAAAGGCCACCCTGACCTGGACGTGCAGTACCAGGGAAAGTGCAAGAGTGAGTAAACATTACA
TTTAAACCTGCCAATTTATGAGATTACGCGTCGGCATTCGTGCATTTCGTGCCAGTTTTTAACAAAATCTTTAAAATTCCTTTGTCTGGACAGAAACGTGCCGTGACGTCTTGTGCCCCGGCAGCTCCACGTGCGTCGTGGACCAGACAAATAATGCATATTGTGTGACGCGTAATCGGATTTGCCCCGAGGTGACGTCGCCTGATCAGTACCTGTGTGGAAACGACGGGATCATCTATGCCAGCGCGTGTCACTGAGAGAGCTACCTGTCTCCTGGGCAGATCTATCGGAGTGGCGTATGAGGGCAAATGCATCAGTAAGTCTGCAGACATAAGAGACGAGATACTGAGCGAGACTTTGCTCCCTGAAAGCGCCTCCAGTCACTGACTTTGAATATTGTTTGAGTGCATGTTCTTCCTGGCCAGCTCTGCACTTTCTCACTGCTCATTTTTGCCAACATTCCACTGAGGAGGGGGTCTTAGAGAGAGAGAGAAAGAGGGAGGGAGTTAGTTTTGTATTGCTTGTGTTTGCTCAAGAAATGATAGACATCTTATTATTTCCTGATGTTGGCAGCACTATCCCATATGGGAGAGAGGAAAGAAGAGGAGGGGGGAGAAAGAGAGCAAAACAGGGGAGTGCTGGGGGCCATGAAGACATGCTTCAAGTTAATATTTGAGTCAGATGGACTCTTATCCAGAAAGCAGTCAATATAAGTCATGTAGACTTAAAAATGCTAATTAAAACATGACTTTTTGTTGCCTGCCACAAGGCATAAGCCTATAAAATAGGATTTGTTTTTTATTCCTTGCTTAGCTCTCCAATACTCAACAGTCATCTGATCCTGCTTGAAAAGCGTTCGTGATAATCCATCTAATCAAAGATTTCCCCTGTGCTGATCTCTTCCTCCCTCTCTGCAGAGGCTAAGTCGTGTGAGGACATCCAGTGCAGCGCAGGGAAAAAGTGTCTGTGGGATGCTCGAATGAGCCGAGGCCGCTGCTCACTGTGCGATGAGACCTGTCCGGAGAGCAGGACGGATGAGGCGGTGTGTGCCAGCGACAACACCACATATCCCAGTGAATGTGCCATGAAGCAAGCTGCTTGCTCTATGGGTGTGCTGCTTGAGGTCAAGCACTCTGGATCTTGCAACTGTAAGTAAATAACAAAAGCAAAATATGAAAAAGAATCAATCAAAACACCCCCCCCCTCCAAGCAAAAGACAATATTCCATGTTGCTTTCCCAACAAAAAACCTCCCCTGAAAGTGCCCCTGATGGCTGTGCGGTTCCCACGGAAACTGTGTGTCAATGATTATCACGACTAGATAAGCACTTTAAAAACAATTCTGATGTTCTACAGAGAACTTGAGGTTCTGTCATTTTAACAACTTGCTTGTGATTTTTGTTCATCAGAGACGTTTCCAGGGCAGCAGATGGTTCCCATGTCCAG对于该基因组序列(SEQ ID NO.1)根据其可生成的蛋白质序列有如下两种描述方式
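Description 1: in the Follistatin genomic sequence, positions 1-1016 are the promoter; 1017-1107, 2262-2459, 2648-2866, 2976-3198 and 3800-4038 are exons; 1108-2261, 2460-2647, 2867-2975 and 3199-3799 are introns; 4039-4327 is the 3'-UTR.
Description 2: in the Follistatin genomic sequence, positions 1-1016 are the promoter; 1017-1107, 2262-2453, 2648-2866, 2976-3198 and 3800-4038 are exons; 1108-2261, 2454-2647, 2867-2975 and 3199-3799 are introns; 4039-4327 is the 3'-UTR.
3. Obtaining the Follistatin cDNA sequences
a. Based on the genomic library and computer prediction, a pair of primers for amplifying the Follistatin cDNA is designed: Fs-cDNA-F (5'-GCCCGTGTCAAATACGTGCTTCAC-3') and Fs-cDNA-R (5'-AGGGGGGGGGTGTTTTGATTGATTC-3').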
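b. PCR is carried out with pfu DNA polymerase, using the cDNA library obtained by reverse transcription as the template.
c. The PCR fragments are cloned into the Sma I site of pBluescript II SK (Stratagene, USA), the plasmid is amplified, and the inserts are sequenced with the universal sequencing primers T7 and T3. The two cDNA sequences obtained are as follows.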
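The coding region of Follistatin cDNA sequence 1 starts at position 39 of this sequence (position 1017 of the genomic sequence) and ends at position 1010 of this sequence (position 4038 of the genomic sequence). The present invention discloses the Japanese flounder Follistatin genomic sequence, promoter sequence, cDNA sequences and protein sequences, with the following characteristics:
(1) Sequence characteristics: cDNA sequence 1, length 1059 base pairs; type: nucleotide; strandedness: double-stranded; topology: linear
(2) Molecule type: cDNA
(3) Hypothetical: no
(4) Antisense: no
(5) Original source: Japanese flounder (Paralichthys olivaceus).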
GCCCGTGTCAAATACGTGCTTCACTTTGCCTCTCCATCATGTTTAGGATGCTGAAACACCACCTCCACCCGGGCATTTTTCTCTTCTTCATATGGCTTTGTCACCTCATGGAACATCAAAAAGTTCAAGCCGGGAACTGCTGGTTGCAGCAGGGGAAGAACGGGAGGTGCCAGGTGCTGTACATGCCCGGTATGAGCAGGGAGGAGTGCTGCCGGAGCGGAAGACTGGGGACGTCCTGGACCGAGGAGGACGTCCCTAACAGCACGCTCTTTAGGTGGATGATCTTCAATGGCGGAGCCCCCAATTGCATACCTTGCAAAGGTGGAGAAACCTGCGATAATGTTGACTGTGGGCCGGGAAAGAGGTGCAAGATGAACAGA
AGAAGCAAGCCGCGCTGCGTGTGCGCGCCAGACTGCTCCAACATCACCTGGAAAGGACCGGTCTGCGGCTCAGATGGAAAGACCTACAAAGACGAATGCGCACTGCTGAAGGCTAAATGCAAAGGCCACCCTGACCTGGACGTGCAGTACCAGGGAAAGTGCAAGAAAACGTGCCGTGACGTCTTGTGCCCCGGCAGCTCCACGTGCGTCGTGGACCAGACAAATAATGCATATTGTGTGACGTGTAATCGGATTTGCCCCGAGGTGACGTCGCCTGATCAGTACCTGTGTGGAAACGACGGGATCATCTATGCCAGCGCGTGTCACCTGAGAAGAGCTACCTGTCTCCTGGGCAGATCTATCGGAGTGGCGTATGAGGGCAAATGCATCAAGGCTAAGTCGTGTGAGGACATCCAGTGCAGCGCAGGGAAAAAGTGTCTGTGGGATGCTCGAATGAGCCGAGGCCGCTGCTCACTGTGCGATGAGACCTGTCCGGAGAGCAGGACGGATGAGGCGGTGTGTGCCAGCGACAACACCACATATCCCAGTGAATGTGCCATGAAGCAAGCTGCTTGCTCTATGGGTGTGCTGCTTGAGGTCAAGCACTCTGGATCTTGCAACTGTAAGTAAATAACAAAAGCAAAATATGAAAAAGAATCAATCAAAACACCCCCCCCCT其编码的蛋白质序列如下蛋白质序列1长度323氨基酸;类型蛋白质;链型单链;Follistatin蛋白质序列1MFRMLKHHLHPGIFLFFIWLCHLMEHQKVQAGNCWLQQGKNGRCQVLYMPGMSREECCRSGRLGTSWTEEDVPNSTLFRWMIFNGGAPNCIPCKGGETCDNVDCGPGKRCKMNRRSKPRCVCAPDCSNITWKGPVCGSDGKTYKDECALLKAKCKGHPDLDVQYQGKCKKTCRDVLCPGSSTCVVDQTNNAYCVTCNRICPEVTSPDQYLCGNDGIIYASACHLRRATCLLGRSIGVAYEGKCIKAKSCEDIQCSAGKKCLWDARMSRGRCSLCDETCPESRTDEAVCASDNTTYPSECAMKQAACSMGVLLEVKHSGSCNCKFollistatin-cDNA序列2的编码起始位置在本序列的39位(基因组的1017位),终止于本序列1004位(基因组的4038位)本发明公开了牙鲆Follistatin基因组序列、cDNA序列与蛋白质序列。具有如下特征(1)序列特征cDNA序列2长度1053碱基对;类型核苷酸;链型双链;拓扑结构线性(2)分子类型cDNA(3)假设否(4)反义否
(5) Original source: Japanese flounder (Paralichthys olivaceus).
GCCCGTGTCAAATACGTGCTTCACTTTGCCTCTCCATCATGTTTAGGATGCTGAAACACCACCTCCACCCGGGCATTTTTCTCTTCTTCATATGGCTTTGTCACCTCATGGAACATCAAAAAGTTCAAGCCGGGAACTGCTGGTTGCAGCAGGGGAAGAACGGGAGGTGCCAGGTGCTGTACATGCCCGGTATGAGCAGGGAGGAGTGCTGCCGGAGCGGAAGACTGGGGACGTCCTGGACCGAGGAGGACGTCCCTAACAGCACGCTCTTTAGGTGGATGATCTTCAATGGCGGAGCCCCCAATTGCATACCTTGCAAAGAAACCTGCGATAATGTTGACTGTGGGCCGGGAAAGAGGTGCAAGATGAACAGAAGAAGCAAGCCGCGCTGCGTGTGCGCGCCAGACTGCTCCAACATCACCTGGAAAGGACCGGTCTGCGGCTCAGATGGAAAGACCTACAAAGACGAATGCGCACTGCTGAAGGCTAAATGCAAAGGCCACCCTGACCTGGACGTGCAGTACCAGGGAAAGTGCAAGAAAACGTGCCGTGACGTCTTGTGCCCCGGCAGCTCCACGTGCGTCGTGGACCAGACAAATAATGCATATTGTGTGACGTGTAATCGGATTTGCCCCGAGGTGACGTCGCCTGATCAGTACCTGTGTGGAAACGACGGGATCATCTATGCCAGCGCGTGTCACCTGAGAAGAGCTACCTGTCTCCTGGGCAGATCTATCGGAGTGGCGTATGAGGGCAAATGCATCAAGGCTAAGTCGTGTGAGGACATCCAGTGCAGCGCAGGGAAAAAGTGTCTGTGGGATGCTCGAATGAGCCGAGGCCGCTGCTCACTGTGCGATGAGACCTGTCCGGAGAGCAGGACGGATGAGGCGGTGTGTGCCAGCGACAACACCACATATCCCAGTGAATGTGCCATGAAGCAAGCTGCTTGCTCTATGGGTGTGCTGCTTGAGGTCAAGCACTCTGGATCTTGCAACTGTAAGTAAATAACAAAAGCAAAATATGAAAAAGAATCAATCAAAACACCCCCCCCCT其编码的蛋白质序列如下蛋白质序列2长度321氨基酸;类型蛋白质;链型单链;Follistatin蛋白质序列2MFRMLKHHLHPGIFLFFIWLCHLMEHQKVQAGNCWLQQGKNGRCQVLYMPGMSREECCRSGRLGTSWTEEDVPNSTLFRWMIFNGGAPNCIPCKETCDNVDCGPGKRCKMNRRSKPRCVCAPDCSNITWKGPVCGSDGKTYKDECALLKAKCKGHPDLDVQYQGKCKKTCRDVLCPGSSTCVVDQTNNAYCVTCNRICPEVTSPDQYLCGNDGIIYASACHLRRATCLLGRSIGVAYEGKCIKAKSCEDIQCSAGKKCLWDARMSRGRCSLCDETCPESRTDEAVCASDNTTYPSECAMKQAACSMGVLLEVKHSGSCNCK
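The coordinates given in this document can be cross-checked with a few lines of arithmetic: the intron intervals follow from the exon intervals, the two genomic descriptions differ by a fixed number of bases, and the coding ranges 39..1010 and 39..1004 imply the stated protein lengths. The sketch below is an illustration only. The function names are introduced here, the codon table is the standard genetic code, and the 30-nt fragment is copied from positions 39-68 of the cDNA sequences above.

```python
# Exon coordinates (1-based, inclusive) from the two descriptions of SEQ ID NO.1.
EXONS_1 = [(1017, 1107), (2262, 2459), (2648, 2866), (2976, 3198), (3800, 4038)]
EXONS_2 = [(1017, 1107), (2262, 2453), (2648, 2866), (2976, 3198), (3800, 4038)]


def introns_between(exons):
    """Intervals lying between consecutive exons."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


def total_length(intervals):
    """Summed length of 1-based inclusive intervals."""
    return sum(end - start + 1 for start, end in intervals)


def protein_length(cds_start, cds_end):
    """Residues encoded by a CDS whose range includes the stop codon."""
    n = cds_end - cds_start + 1
    if n % 3:
        raise ValueError("CDS length is not a multiple of three")
    return n // 3 - 1


# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(cds):
    """Translate from the first base until a stop codon or the end of input."""
    cds = cds.upper()
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First ten codons of the coding region (cDNA positions 39-68).
CDS_START = "ATGTTTAGGATGCTGAAACACCACCTCCAC"

if __name__ == "__main__":
    print(introns_between(EXONS_1))
    print(total_length(EXONS_1) - total_length(EXONS_2), "bp difference")
    print(protein_length(39, 1010), protein_length(39, 1004))
    print(translate(CDS_START))
```

The 6-bp difference between the two exon sets corresponds to two codons, which matches the two-residue difference between the 323- and 321-residue proteins.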
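SEQUENCE LISTING<110>Institute of Oceanology, Chinese Academy of Sciences<120>Japanese flounder FOLLISTATIN gene sequence and its encoded protein sequence<130>
<160>5<170>PatentIn version 3.1<210>1<211>4327<212>DNA<213>Japanese flounder (Paralichthys olivaceus)<220>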
<221>promoter<222>(1)..(1016)<223>
<220>
<221>3’UTR<222>(1017)..(4039)<223>
<220>
<221>exon<222>(1017)..(4038)<223>
<400>1ctgtacccag ggttccgctg aagaggtcgc gcaaggactt ctgaaggagt acagcctgtg 60cgtcagtgag tggagagctg tgtgtgtgtc tgtgtgtgtg tgcgtgtgcg tgtttcattc 120aatgaaacgc aacaccccgc tcctctgtct gaagcctctt cggaaacagg tttgattata 180tttaaaagct tgacccggga gagaaaccca aacaactgtc tgttctaata ttagctgtgc 240ggctgcggac tactatgaat tctattattg tctggactgg gatatattcc caaaaggcaa 300ctctcatcaa aagatttaca cttttgtctg cagttggtgg aaaaaatatc agttaccatc 360aagcagtata ctcttaattt gaaggtatta tgatggcatg catggacagg atgttggctg 420
tgggcctgtt ttattataca gtaataaagt ttttaatgca taaaaagcta ttatcattac 480tattattaat ataattaatg ttattattaa tattttcatt aatagtatca ttattattgc 540tttaaaagca catgtagagg tcgtgcgtaa aaacgaaccg tgcgcctgtt gagaaaacta 600aaaggattaa agttaataac agcattgtgc attaaaaact aaattcacac ccaccggcag 660cttttatctg ctctgcacag ttccactagt ttcacagcag agccacagac actcgtggcg 720tttttattac aacaccagag ctgtttccat gttctttaat tgttgtggca ggagaaagtt 780gcctgtaacc atgtcacctg attgggattg aaagaacttt ggtaagaaaa aaaaaaaaat 840ctccccaccc ccacagaaga ggagaccgcc caccagagac agtcaacccg agacccctta 900tagatttaaa aagagaggct gcattctcag actgacactt gttcaacact gccaccttaa 960gcagattact tttgcgctgc ccgtgtcaaa tacgtgcttc actttgcctc tccatc atg1019Met1ttt agg atg ctg aaa cac cac ctc cac ccg ggc att ttt ctc ttc ttc 1067Phe Arg Met Leu Lys His His Leu His Pro Gly Ile Phe Leu Phe Phe5 10 15ata tgg ctt tgt cac ctc atg gaa cat caa aaa gtt caa ggt aag gat 1115Ile Trp Leu Cys His Leu Met Glu His Gln Lys Val Gln Gly Lys Asp20 25 30ctt cct cat ccg ttt ttt ttt ttt cat ttt tat tct ctc cgt ttt ttg 1163Leu Pro His Pro Phe Phe Phe Phe His Phe Tyr Ser Leu Arg Phe Leu35 40 45gaa aaa gaa aaa aaa act tcc act gct gtc agc cgc agg aaa acg tgc 1211Glu Lys Glu Lys Lys Thr Ser Thr Ala Val Ser Arg Arg Lys Thr Cys50 55 60 65acc agt gcc gga taa ttg ctg cag ccg aca aac ttt tca gag tgg gtt 1259Thr Ser Ala Gly Leu Leu Gln Pro Thr Asn Phe Ser Glu Trp Val70 75 80cgc tct tca ccc gct gat aaa agt ctc cac atc aga cag tgc gtg cgt 1307Arg Ser Ser Pro Ala Asp Lys Ser Leu His Ile Arg Gln Cys Val Arg85 90 95aaa gtg tcc gca aaa cag gtg cca tat ggg aca ctt caa agt gag act 1355Lys Val Ser Ala Lys Gln Val Pro Tyr Gly Thr Leu Gln Ser Glu Thr100 105 110gta cgg caa gtt ttt gca gcc ctc tcc agg ggc acg cac ggc gct gga 1403Val Arg Gln Val Phe Ala Ala Leu Ser Arg Gly Thr His Gly Ala Gly115 120 125cgc agc ctc agc ttc tct caa tat ctc aac atg ttc tca tac ttt ttc 1451Arg Ser Leu Ser Phe Ser Gln Tyr Leu Asn Met Phe Ser Tyr Phe Phe130 135 140tct cag aat tca atg tct gtg cgt gtt ttc gcc aaa 
cct gta aca aac 1499Ser Gln Asn Ser Met Ser Val Arg Val Phe Ala Lys Pro Val Thr Asn145 150 155 160ttg aaa aat gtg act tca tgg gcg tgt tat ttg tat att tcc cgc atc 1547Leu Lys Asn Val Thr Ser Trp Ala Cys Tyr Leu Tyr Ile Ser Arg Ile165 170 175cag cat tca tgc gta aaa ttg ctt tta ttg gca ata gtt tta ttt tac 1595Gln His Ser Cys Val Lys Leu Leu Leu Leu Ala Ile Val Leu Phe Tyr180 185 190tat gac aaa aat tat aaa act gga aac ttt tac acg cat cac tct ttc 1643Tyr Asp Lys Asn Tyr Lys Thr Gly Asn Phe Tyr Thr His His Ser Phe195 200 205ttg aga ggt tct agt tga gat aga tac aaa tgg aaa tgt gtc ttc agt 1691Leu Arg Gly Ser Ser Asp Arg Tyr Lys Trp Lys Cys Val Phe Ser210 215 220
tta gga aac aca cca tga aca ctt act gtc atg tgt gtc tgt gtc ggt1739Leu Gly Asn Thr Pro Thr Leu Thr Val Met Cys Val Cys Val Gly225 230 235tta cac acc ctg ctg ctg tct ggg tcg aat tat att cct gtg tca ttg1787Leu His Thr Leu Leu Leu Ser Gly Ser Asn Tyr Ile Pro Val Ser Leu240 245 250att taa gat tcg ttt tag ttt gaa tgt gtt taa aat gaa ctt ggt act1835Ile Asp Ser Phe Phe Glu Cys Val Asn Glu Leu Gly Thr255 260 265gct ttg tta ata aac taa aga ggc gtg ttg ggt aaa tgt ctg gga tat1883A1a Leu Leu Ile Asn Arg Gly Val Leu Gly Lys Cys Leu Gly Tyr270 275 280cgc cct gag gcg gcg gag aca gtg gag tct tgc tac ctt aag acg gtg1931Arg Pro Glu Ala Ala Glu Thr Val Glu Ser Cys Tyr Leu Lys Thr Val285 290 295gtt tcc tct ctg agg gag act taa cac tgt ctg tac gct gct ctc tgt1979Val Ser Ser Leu Arg Glu Thr His Cys Leu Tyr Ala Ala Leu Cys300 305 310cca gtg ata aca agg agt ctg gga ata aag cac aac ctc cct cct aag2027Pro Val Ile Thr Arg Ser Leu Gly Ile Lys His Asn Leu Pro Pro Lys315 320 325att ctt ttc act ttt tgg ggt ttg tgg ctg cat ggc atg ggc tgt cag2075Ile Leu Phe Thr Phe Trp Gly Leu Trp Leu His Gly Met Gly Cys Gln330 335 340 345tca cag act gga tag tgt taa cag tct cgc aaa aaa aaa atc aga caa2123Ser Gln Thr Gly Cys Gln Ser Arg Lys Lys Lys Ile Arg Gln350 355aat agt gcc ggg tgg gag aaa ggg gag gga gcg tct cgg ata ata cgg2171Asn Ser Ala Gly Trp Glu Lys Gly Glu Gly Ala Ser Arg Ile Ile Arg360 365 370 375gct gct tgt gtg tct ggt gcg tct ggt cca gcg ctt tga ggc gtt caa2219Ala Ala Cys Val Ser Gly Ala Ser Gly Pro Ala Leu Gly Val Gln380 385 390cat gca aca atc gag gtt ttc tta tgt ttt ttc tgt ttt cag ccg gga2267His Ala Thr Ile Glu Val Phe Leu Cys Phe Phe Cys Phe Gln Pro Gly395 400 405act gct ggt tgc agc agg gga aga acg gga ggt gcc agg tgc tgt aca2315Thr Ala Gly Cys Ser Arg Gly Arg Thr Gly Gly Ala Arg Cys Cys Thr410 415 420tgc ccg gta tga gca ggg agg agt gct gcc gga gcg gaa gac tgg gga2363Cys Pro Val Ala Gly Arg Ser Ala Ala Gly Ala Glu Asp Trp Gly425 430 435cgt cct gga ccg agg agg acg tcc cta aca gca cgc tct tta ggt 
gga2411Arg Pro Gly Pro Arg Arg Thr Ser Leu Thr Ala Arg Ser Leu Gly Gly440 445 450tga tct tca atg gcg gag ccc cca att gca tac ctt gca aag gtg gag2459Ser Ser Met Ala Glu Pro Pro Ile Ala Tyr Leu Ala Lys Val Glu455 460 465gtg cac tca ttt tcc ttc gtt ttt ttc ata aca tac atg ctc gat cat2507Val His Ser Phe Ser Phe Val Phe Phe Ile Thr Tyr Met Leu Asp His470 475 480ttt ctc tca tga acc ata tgg tga tgc tct gat ttg cgc atg cca aag2555Phe Leu Ser Thr Ile Trp Cys Ser Asp Leu Arg Met Pro Lys485 490 495tgc ata aaa tgc cca ttt tgc aat gcg taa ttt cac gca caa ggt caa2603Cys Ile Lys Cys Pro Phe Cys Asn Ala Phe His Ala Gln Gly Gln500 505 510aat cct cgc cag cgt cac tta cca gag acc tct tta ttt ttc aga aac2651
Asn Pro Arg Gln Arg His Leu Pro Glu Thr Ser Leu Phe Phe Arg Asn515 520 525ctg cga taa tgt tga ctg tgg gcc ggg aaa gag gtg caa gat gaa cag2699Leu Arg Cys Leu Trp Ala Gly Lys Glu Val Gln Asp Glu Gln530 535 540aag aag caa gcc gcg ctg cgt gtg cgc gcc aga ctg ctc caa cat cac2747Lys Lys Gln Ala Ala Leu Arg Val Arg Ala Arg Leu Leu Gln His His545 550 555ctg gaa agg acc ggt ctg cgg ctc aga tgg aaa gac cta caa aga cga2795Leu Glu Arg Thr Gly Leu Arg Leu Arg Trp Lys Asp Leu Gln Arg Arg560 565 570 575atg cgc act gct gaa ggc taa atg caa agg cca ccc tga cct gga cgt2843Met Arg Thr Ala Glu Gly Met Gln Arg Pro Pro Pro Gly Arg580 585gca gta cca ggg aaa gtg caa gag tga gta aac att aca ttt aaa cct2891Ala Val Pro Gly Lys Val Gln Glu Val Asn Ile Thr Phe Lys Pro590 595 600gcc aat tta tga gat tac gcg tcg gca ttc gtg cat ttc gtg cca gtt2939Ala Asn Leu Asp Tyr Ala Ser Ala Phe Val His Phe Val Pro Val605 610 615ttt aac aaa atc ttt aaa att cct ttg tct gga cag aaa cgt gcc gtg2987Phe Asn Lys Ile Phe Lys Ile Pro Leu Ser Gly Gln Lys Arg Ala Val620 625 630 635acg tct tgt gcc ccg gca gct cca cgt gcg tcg tgg acc aga caa ata3035Thr Ser Cys Ala Pro Ala Ala Pro Arg Ala Ser Trp Thr Arg Gln Ile640 645 650atg cat att gtg tga cgc gta atc gga ttt gcc ccg agg tga cgt cgc3083Met His Ile Val Arg Val Ile Gly Phe Ala Pro Arg Arg Arg655 660 665ctg atc agt acc tgt gtg gaa acg acg gga tca tct atg cca gcg cgt3131Leu Ile Ser Thr Cys Val Glu Thr Thr Gly Ser Ser Met Pro Ala Arg670 675 680gtc act gag aga gct acc tgt ctc ctg ggc aga tct atc gga gtg gcg3179Val Thr Glu Arg Ala Thr Cys Leu Leu Gly Arg Ser Ile Gly Val Ala685 690 695tat gag ggc aaa tgc atc agt aag tct gca gac ata aga gac gag ata3227Tyr Glu Gly Lys Cys Ile Ser Lys Ser Ala Asp Ile Arg Asp Glu Ile700 705 710ctg agc gag act ttg ctc cct gaa agc gcc tcc agt cac tga ctt tga3275Leu Ser Glu Thr Leu Leu Pro Glu Ser Ala Ser Ser His Leu715 720 725ata ttg ttt gag tgc atg ttc ttc ctg gcc agc tct gca ctt tct cac3323Ile Leu Phe Glu Cys Met Phe Phe Leu Ala Ser Ser Ala Leu Ser 
His730 735 740tgc tca ttt ttg cca aca ttc cac tga gga ggg ggt ctt aga gag aga3371Cys Ser Phe Leu Pro Thr Phe His Gly Gly Gly Leu Arg Glu Arg745 750 755gag aaa gag gga ggg agt tag ttt tgt att gct tgt gtt tgc tca aga3419Glu Lys Glu Gly Gly Ser Phe Cys Ile Ala Cys Val Cys Ser Arg760 765 770aat gat aga cat ctt att att tcc tga tgt tgg cag cac tat ccc ata3467Asn Asp Arg His Leu Ile Ile Ser Cys Trp Gln His Tyr Pro Ile775 780 785tgg gag aga gga aag aag agg agg ggg gag aaa gag agc aaa aca ggg3515Trp Glu Arg Gly Lys Lys Arg Arg Gly Glu Lys Glu Ser Lys Thr Gly790 795 800gag tgc tgg ggg cca tga aga cat gct tca agt taa tat ttg agt cag3563Glu Cys Trp Gly Pro Arg His Ala Ser Ser Tyr Leu Ser Gln805 8l0 815
atg gac tct tat cca gaa agc agt caa tat aag tca tgt aga ctt aaa3611Met Asp Ser Tyr Pro Glu Ser Ser Gln Tyr Lys Ser Cys Arg Leu Lys820 825 830aat gct aat taa aac atg act ttt tgt tgc ctg cca caa ggc ata agc3659Asn Ala Asn Asn Met Thr Phe Cys Cys Leu Pro Gln Gly Ile Ser835 840 845cra taa aat agg att tgt ttt tta ttc ctt gct tag ctc tcc aat act3707Leu Asn Arg Ile Cys Phe Leu Phe Leu Ala Leu Ser Asn Thr850 855 860caa cag rca tct gat cct gct tga aaa gcg ttc gtg ata atc cat cra3755Gln Gln Ser Ser Asp Pro Ala Lys Ala Phe Val Ile Ile His Leu865 870 875atc aaa gat ttc ccc tgt gct gat ctc ttc ctc cct crc tgc aga ggc3803Ile Lys Asp Phe Pro Cys Ala Asp Leu Phe Leu Pro Leu Cys Arg Gly880 885 890taa gtc gtg tga gga cat cca gtg cag cgc agg gaa aaa gtg tct gtg3851Val Val Gly His Pro Val Gln Arg Arg Glu Lys Val Ser Val895 900 905gga tgc tcg aat gag ccg agg ccg ctg ctc act gtg cga tga gac ctg3899Gly Cys Ser Asn Glu Pro Arg Pro Leu Leu Thr Val Arg Asp Leu910 915 920tcc gga gag cag gac gga tga ggc ggt gtg tgc cag cga caa cac cac3947Ser Gly Glu Gln Asp Gly Gly Gly Val Cys Gln Arg Gln His His925 930 935ata tcc cag tga atg tgc cat gaa gca agc tgc ttg ctc tat ggg tgt3995Ile Ser Gln Met Cys His Glu Ala Ser Cys Leu Leu Tyr Gly Cys940 945 950gct gct tga ggt caa gca ctc tgg atc ttg caa ctg taa gta a 4038Ala Ala Gly Gln Ala Leu Trp Ile Leu Gln Leu Val955 960 965ataacaaaag caaaatatga aaaagaatca atcaaaacac cccccccctc caagcaaaag 4098acaatattcc atgttgcttt cccaacaaaa aacctcccct gaaagtgccc ctgatggctg 4158tgcggttccc acggaaactg tgtgtcaatg attatcacga ctagataagc actttaaaaa 4218caattctgat gttctacaga gaacttgagg ttctgtcatt ttaacaactt gcttgtgatt 4278tttgttcatc agagacgttt ccagggcagc agatggttcc catgtccag 4327<210>2<211>1059<212>DNA<213>牙鲆鱼(Paralichthys olivaceus)<220>
<221>CDS<222>(39)..(1010)<223>
<400>2gcccgtgtca aatacgtgct tcactttgcc tctccatc atg ttt agg atg ctg aaa 56Met Phe Arg Met Leu Lys1 5cac cac ctc cac ccg ggc att ttt crc ttc ttc ata tgg ctt tgt cac104
His His Leu His Pro Gly Ile Phe Leu Phe Phe Ile Trp Leu Cys His10 15 20ctc atg gaa cat caa aaa gtt caa gcc ggg aac tgc tgg ttg cag cag152Leu Met Glu His Gln Lys Val Gln Ala Gly Asn Cys Trp Leu Gln Gln25 30 35ggg aag aac ggg agg tgc cag gtg ctg tac atg ccc ggt atg agc agg200Gly Lys Asn Gly Arg Cys Gln Val Leu Tyr Met Pro Gly Met Ser Arg40 45 50gag gag tgc tgc cgg agc gga aga ctg ggg acg tcc tgg acc gag gag248Glu Glu Cys Cys Arg Ser Gly Arg Leu Gly Thr Ser Trp Thr Glu Glu55 60 65 70gac gtc cct aac agc acg ctc ttt agg tgg atg atc ttc aat ggc gga296Asp Val Pro Asn Ser Thr Leu Phe Arg Trp Met Ile Phe Asn Gly Gly75 80 85gcc ccc aat tgc ata cct tgc aaa ggt gga gaa acc tgc gat aat gtt344Ala Pro Asn Cys Ile Pro Cys Lys Gly Gly Glu Thr Cys Asp Asn Val90 95 100gac tgt ggg ccg gga aag agg tgc aag atg aac aga aga agc aag ccg392Asp Cys Gly Pro Gly Lys Arg Cys Lys Met Asn Arg Arg Ser Lys Pro105 110 115cgc tgc gtg tgc gcg cca gac tgc tcc aac atc acc tgg aaa gga ccg440Arg Cys Val Cys Ala Pro Asp Cys Ser Asn Ile Thr Trp Lys Gly Pro120 125 130gtc tgc ggc tca gat gga aag acc tac aaa gac gaa tgc gca ctg ctg488Val Cys Gly Ser Asp Gly Lys Thr Tyr Lys Asp Glu Cys Ala Leu Leu135 140 145 150aag gct aaa tgc aaa ggc cac cct gac ctg gac gtg cag tac cag gga536Lys Ala Lys Cys Lys Gly His Pro Asp Leu Asp Val Gln Tyr Gln Gly155 160 165aag tgc aag aaa acg tgc cgt gac gtc ttg tgc ccc ggc agc tcc acg584Lys Cys Lys Lys Thr Cys Arg Asp Val Leu Cys Pro Gly Ser Ser Thr170 175 180tgc gtc gtg gac cag aca aat aat gca tat tgt gtg acg tgt aat cgg632Cys Val Val Asp Gln Thr Asn Asn Ala Tyr Cys Val Thr Cys Asn Arg185 190 195att tgc ccc gag gtg acg tcg cct gat cag tac ctg tgt gga aac gac680Ile Cys Pro Glu Val Thr Ser Pro Asp Gln Tyr Leu Cys Gly Asn Asp200 205 210ggg atc atc tat gcc agc gcg tgt cac ctg aga aga gct acc tgt ctc728Gly Ile Ile Tyr Ala Ser Ala Cys His Leu Arg Arg Ala Thr Cys Leu215 220 225 230ctg ggc aga tct atc gga gtg gcg tat gag ggc aaa tgc atc aag gct776Leu Gly Arg Ser Ile Gly Val Ala Tyr Glu Gly 
Lys Cys Ile Lys Ala235 240 245aag tcg tgt gag gac atc cag tgc agc gca ggg aaa aag tgt ctg tgg824Lys Ser Cys Glu Asp Ile Gln Cys Ser Ala Gly Lys Lys Cys Leu Trp250 255 260gat gct cga atg agc cga ggc cgc tgc tca ctg tgc gat gag acc tgt872Asp Ala Arg Met Ser Arg Gly Arg Cys Ser Leu Cys Asp Glu Thr Cys265 270 275ccg gag agc agg acg gat gag gcg gtg tgt gcc agc gac aac acc aca920Pro Glu Ser Arg Thr Asp Glu Ala Val Cys Ala Ser Asp Asn Thr Thr280 285 290tat ccc agt gaa tgt gcc atg aag caa gct gct tgc tct atg ggt gtg968Tyr Pro Ser Glu Cys Ala Met Lys Gln Ala Ala Cys Ser Met Gly Val295 300 305 310ctg ctt gag gtc aag cac tct gga tct tgc gag tgt aag taa1010Leu Leu Glu Val Lys His Ser Gly Ser Cys Asn Cys Lys315 320
ataacaaaag caaaatatga aaaagaatca atcaaaacac cccccccct 1059<210>3<211>323<212>PRT<213>牙鲆鱼(Paralichthys olivaceus)<400>3Met Phe Arg Met Leu Lys His His Leu His Pro Gly Ile Phe Leu Phe1 5 10 15Phe Ile Trp Leu Cys His Leu Met Glu His Gln Lys Val Gln Ala Gly20 25 30Asn Cys Trp Leu Gln Gln Gly Lys Asn Gly Arg Cys Gln Val Leu Tyr35 40 45Met Pro Gly Met Ser Arg Glu Glu Cys Cys Arg Ser Gly Arg Leu Gly50 55 60Thr Ser Trp Thr Glu Glu Asp Val Pro Asn Ser Thr Leu Phe Arg Trp65 70 75 80Met Ile Phe Asn Gly Gly Ala Pro Asn Cys Ile Pro Cys Lys Gly Gly85 90 95Glu Thr Cys Asp Asn Val Asp Cys Gly Pro Gly Lys Arg Cys Lys Met100 105 110Asn Arg Arg Ser Lys Pro Arg Cys Val Cys Ala Pro Asp Cys Ser Asn115 120 125Ile Thr Trp Lys Gly Pro Val Cys Gly Ser Asp Gly Lys Thr Tyr Lys130 135 140Asp Glu Cys Ala Leu Leu Lys Ala Lys Cys Lys Gly His Pro Asp Leu145 150 155 160Asp Val Gln Tyr Gln Gly Lys Cys Lys Lys Thr Cys Arg Asp Val Leu165 170 175Cys Pro Gly Ser Ser Thr Cys Val Val Asp Gln Thr Asn Asn Ala Tyr180 185 190Cys Val Thr Cys Asn Arg Ile Cys Pro Glu Val Thr Ser Pro Asp Gln195 200 205Tyr Leu Cys Gly Asn Asp Gly Ile Ile Tyr Ala Ser Ala Cys His Leu210 215 220Arg Arg Ala Thr Cys Leu Leu Gly Arg Ser Ile Gly Val Ala Tyr Glu225 230 235 240Gly Lys Cys Ile Lys Ala Lys Ser Cys Glu Asp Ile Gln Cys Ser Ala245 250 255
Gly Lys Lys Cys Leu Trp Asp Ala Arg Met Ser Arg Gly Arg Cys Ser260 265 270Leu Cys Asp Glu Thr Cys Pro Glu Ser Arg Thr Asp Glu Ala Val Cys275 280 285Ala Ser Asp Asn Thr Thr Tyr Pro Ser Glu Cys Ala Met Lys Gln Ala290 295 300Ala Cys Ser Met Gly Val Leu Leu Glu Val Lys His Ser Gly Ser Cys305 310 315 320Asn Cys Lys<210>4<211>1053<212>DNA<213>牙鲆鱼(Paralichthys olivaceus)<220>
<221>CDS<222>(39)..(1004)<223>
<400>4gcccgtgtca aatacgtgct tcactttgcc tctccatc atg ttt agg atg ctg aaa 56Met Phe Arg Met Leu Lys1 5cac cac ctc cac ccg ggc att ttt ctc ttc ttc ata tgg ctt tgt cac104His His Leu His Pro Gly Ile Phe Leu Phe Phe Ile Trp Leu Cys His10 15 20ctc atg gaa cat caa aaa gtt caa gcc ggg aac tgc tgg ttg cag cag152Leu Met Glu His Gln Lys Val Gln Ala Gly Asn Cys Trp Leu Gln Gln25 30 35ggg aag aac ggg agg tgc cag gtg ctg tac atg ccc ggt atg agc agg200Gly Lys Asn Gly Arg Cys Gln Val Leu Tyr Met Pro Gly Met Ser Arg40 45 50gag gag tgc tgc cgg agc gga aga ctg ggg acg tcc tgg acc gag gag248Glu Glu Cys Cys Arg Ser Gly Arg Leu Gly Thr Ser Trp Thr Glu Glu55 60 65 70gac gtc cct aac agc acg ctc ttt agg tgg atg atc ttc aat ggc gga296Asp Val Pro Asn Ser Thr Leu Phe Arg Trp Met Ile Phe Asn Gly Gly75 80 85gcc ccc aat tgc ata cct tgc aaa gaa acc tgc gat aat gtt gac tgt344Ala Pro Asn Cys Ile Pro Cys Lys Glu Thr Cys Asp Asn Val Asp Cys90 95 100ggg ccg gga aag agg tgc aag atg aac aga aga agc aag ccg cgc tgc392Gly Pro Gly Lys Arg Cys Lys Met Asn Arg Arg Ser Lys Pro Arg Cys105 110 115gtg tgc gcg cca gac tgc tcc aac atc acc tgg aaa gga ccg gtc tgc440Val Cys Ala Pro Asp Cys Ser Asn Ile Thr Trp Lys Gly Pro Val Cys120 125 130
ggc tca gat gga aag acc tac aaa gac gaa tgc gca ctg ctg aag gct488Gly Ser Asp Gly Lys Thr Tyr Lys Asp Glu Cys Ala Leu Leu Lys Ala135 140 145 150aaa tgc aaa ggc cac cct gac ctg gac gtg cag tac cag gga aag tgc536Lys Cys Lys Gly His Pro Asp Leu Asp Val Gln Tyr Gln Gly Lys Cys155 160 165aag aaa acg tgc cgt gac gtc ttg tgc ccc ggc agc tcc acg tgc gtc584Lys Lys Thr Cys Arg Asp Val Leu Cys Pro Gly Ser Ser Thr Cys Val170 175 180gtg gac cag aca aat aat gca tat tgt gtg acg tgt aat cgg att tgc632Val Asp Gln Thr Asn Asn Ala Tyr Cys Val Thr Cys Asn Arg Ile Cys185 190 195ccc gag gtg acg tcg cct gat cag tac ctg tgt gga aac gac ggg atc680Pro Glu Val Thr Ser Pro Asp Gln Tyr Leu Cys Gly Asn Asp Gly Ile200 205 210atc tat gcc agc gcg tgt cac ctg aga aga gct acc tgt crc ctg ggc728Ile Tyr Ala Ser Ala Cys His Leu Arg Arg Ala Thr Cys Leu Leu Gly215 220 225 230aga tct atc gga gtg gcg tat gag ggc aaa tgc arc aag gct aag tcg776Arg Ser Ile Gly Val Ala Tyr Glu Gly Lys Cys Ile Lys Ala Lys Ser235 240 245tgt gag gac atc cag tgc agc gca ggg aaa aag tgt ctg tgg gat gct824Cys Glu Asp Ile Gln Cys Ser Ala Gly Lys Lys Cys Leu Trp Asp Ala250 255 260cga atg agc cga ggc cgc tgc rca ctg tgc gat gag acc tgt ccg gag872Arg Met Ser Arg Gly Arg Cys Ser Leu Cys Asp Glu Thr Cys Pro Glu265 270 275agc agg acg gat gag gcg gtg tgt gcc agc gac aac acc aca tat ccc920Ser Arg Thr Asp Glu Ala Val Cys Ala Ser Asp Ash Thr Thr Tyr Pro280 285 290agt gaa tgt gcc atg aag caa gct gct tgc tct atg ggt gtg ctg ctt968Ser Glu Cys Ala Met Lys Gln Ala Ala Cys Ser Met Gly Val Leu Leu295 300 305 310gag gtc aag cac tct gga tct tgc aac tgt aag taa ataacaaaag 1014Glu Val Lys His Ser Gly Ser Cys Asn Cys Lys315 320caaaatatga aaaagaatca atcaaaacac cccccccct 1053<210>5<211>321<212>PRT<213>牙鲆鱼(Paralichthys olivaceus)<400>5Met Phe Arg Met Leu Lys His His Leu His Pro Gly Ile Phe Leu Phe1 5 10 15Phe Ile Trp Leu Cys His Leu Met Glu His Gln Lys Val Gln Ala Gly20 25 30Asn Cys Trp Leu Gln Gln Gly Lys Asn Gly Arg Cys Gln Val Leu Tyr35 40 45Met Pro Gly 
Met Ser Arg Glu Glu Cys Cys Arg Ser Gly Arg Leu Gly50 55 60
Thr Ser Trp Thr Glu Glu Asp Val Pro Asn Ser Thr Leu Phe Arg Trp65 70 75 80Met Ile Phe Asn Gly Gly Ala Pro Asn Cys Ile Pro Cys Lys Glu Thr85 90 95Cys Asp Asn Val Asp Cys Gly Pro Gly Lys Arg Cys Lys Met Asn Arg100 105 110Arg Ser Lys Pro Arg Cys Val Cys Ala Pro Asp Cys Ser Asn Ile Thr115 120 125Trp Lys Gly Pro Val Cys Gly Ser Asp Gly Lys Thr Tyr Lys Asp Glu130 135 140Cys Ala Leu Leu Lys Ala Lys Cys Lys Gly His Pro Asp Leu Asp Val145 150 155 160Gln Tyr Gln Gly Lys Cys Lys Lys Thr Cys Arg Asp Val Leu Cys Pro165 170 175Gly Ser Ser Thr Cys Val Val Asp Gln Thr Asn Asn Ala Tyr Cys Val180 185 190Thr Cys Asn Arg Ile Cys Pro Glu Val Thr Ser Pro Asp Gln Tyr Leu195 200 205Cys Gly Asn Asp Gly Ile Ile Tyr Ala Ser Ala Cys His Leu Arg Arg210 215 220Ala Thr Cys Leu Leu Gly Arg Ser Ile Gly Val Ala Tyr Glu Gly Lys225 230 235 240Cys Ile Lys Ala Lys Ser Cys Glu Asp Ile Gln Cys Ser Ala Gly Lys245 250 255Lys Cys Leu Trp Asp Ala Arg Met Ser Arg Gly Arg Cys Ser Leu Cys260 265 270Asp Glu Thr Cys Pro Glu Ser Arg Thr Asp Glu Ala Val Cys Ala Ser275 280 285Asp Asn Thr Thr Tyr Pro Ser Glu Cys Ala Met Lys Gln Ala Ala Cys290 295 300Ser Met Gly Val Leu Leu Glu Val Lys His Ser Gly Ser Cys Asn Cys305 310 315 320Lys
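Claims
1. A Japanese flounder FOLLISTATIN gene sequence, characterized by having the base sequence of SEQ ID NO.1 in the sequence listing.
2. A Japanese flounder FOLLISTATIN cDNA sequence, characterized by having the base sequence of SEQ ID NO.2 in the sequence listing.
3. A Japanese flounder FOLLISTATIN cDNA sequence, characterized by having the base sequence of SEQ ID NO.4 in the sequence listing.
4. A Japanese flounder Follistatin promoter according to claim 1, characterized by having bases 1 to 1016 of the base sequence of SEQ ID NO.1, or a base sequence with at least 85% sequence identity thereto.
5. A protein encoded by the Japanese flounder FOLLISTATIN cDNA sequence according to claim 1, characterized by having the amino acid sequence shown in SEQ ID NO.3, or a protein sequence with at least 90% sequence identity thereto.
6. A protein encoded by the Japanese flounder FOLLISTATIN cDNA sequence according to claim 1, characterized by having the amino acid sequence shown in SEQ ID NO.5, or a protein sequence with at least 90% sequence identity thereto.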
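Abstract
The present invention relates to the "all-fish" marine organism Japanese flounder, and specifically to the genomic sequence, cDNA sequences and protein sequences of Japanese flounder FOLLISTATIN, which have the base or amino acid sequences of SEQ ID NO.1-6 in the sequence listing. The invention has determined the Japanese flounder Follistatin genomic sequence, promoter sequence, cDNA sequences and protein sequences. The Follistatin gene can be used to treat various diseases caused by overexpression of TGF-β family proteins; the gene can also be used for species improvement, which benefits aquaculture production.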
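Document No.: C07K14/435GK1847395SQ200510046259
Publication date: October 18, 2006  Filing date: April 15, 2005  Priority date: April 15, 2005
Inventors: Tan Xungang (谭训刚), Liu Qinghua (刘庆华), Zhang Peijun (张培军), Xu Yongli (徐永立)  Applicant: Institute of Oceanology, Chinese Academy of Sciences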