Drawing a Molecular Phylogenetic Tree (Plant Genes)
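This time, let's draw a molecular phylogenetic tree.
If you prepare a FASTA file and submit the sequences to MAFFT version 7 (https://mafft.cbrc.jp/alignment/software/) or to Clustal W on GenomeNet (genome.jp; https://www.genome.jp/tools-bin/clustalw), you get an alignment, and some kind of tree can be drawn from it. However, when the proteins contain multiple domains or many insertions and deletions, it can become unclear what is actually being compared and what the tree means biologically. For that reason, it is common to trim the alignment to a conserved region and compare the sequences domain by domain.
As the example, we will draw a tree for the paralogs and orthologs of ATHB25, the gene whose schematic diagram we drew last time.
The workflow is: ① prepare a FASTA file, ② align, ③ trim, ④ realign, ⑤ draw a phylogenetic tree from the aligned sequences.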
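To make the idea behind step ③ concrete, here is a minimal sketch of one possible automatic trimming rule: drop every alignment column in which more than a given fraction of the sequences has a gap. This is an illustration under stated assumptions, not necessarily the procedure used in this article (trimming to a conserved domain is often done by eye). The function name `trim_gappy_columns`, the 0.5 threshold, and the toy sequences are all hypothetical.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Return a copy of the alignment without columns that are mostly gaps.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    max_gap_fraction: a column is kept only if its gap fraction is <= this value.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")

    # Indices of the columns that pass the gap-fraction test.
    keep = [
        i for i in range(length)
        if sum(seq[i] == "-" for seq in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in zip(names, seqs)}


# Toy alignment (made-up sequences, for illustration only).
toy = {
    "seqA": "M--ECLKNHA",
    "seqB": "M-EECLKNHA",
    "seqC": "---ECQKNHA",
}
trimmed = trim_gappy_columns(toy, max_gap_fraction=0.5)
for name, seq in trimmed.items():
    print(name, seq)
```

After trimming, the sequences should be realigned (step ④), because removing columns changes the context in which the remaining residues were aligned.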
① Prepare a FASTA file
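First, prepare the sequences you want to compare in FASTA format. The simplest approach is to run a BLAST search at NCBI (https://www.ncbi.nlm.nih.gov) and download the sequences with high similarity. For Arabidopsis thaliana, you can also download sequences from The Arabidopsis Information Resource (https://www.arabidopsis.org).
If you want to compare across plant species, it is convenient to download FASTA files from PLAZA 5.0 (https://bioinformatics.psb.ugent.be/plaza/).
For example, searching Dicots PLAZA 5.0 for ATHB25 returns the homologous gene family HOM05D000243, from which 1,731 genes from 96 plant species can be downloaded. In practice, this family contains several subgroups, such as genes that have a zinc finger but no homeobox and, conversely, genes that have a homeobox but no zinc finger, so I recommend downloading the FASTA file from the orthologous gene family instead. In addition, a tree drawn from several hundred to a thousand genes tends to be hard to interpret, so it is wise to restrict the comparison to a few species.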
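For this example, I prepared a FASTA file with the following content.
It contains paralogs and orthologs of ATHB from Arabidopsis thaliana (AT), rice (Os), Lotus japonicus (Lj), and the moss Physcomitrium patens (Pp).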
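Restricting a large download to a few species can be done by hand, but it is also easy to script. The following is a minimal sketch that assumes the sequence IDs start with a species prefix such as AT, Os, Lj, or Pp, as in the file used here. The function names, the demo IDs, and the demo sequences are hypothetical and made up for illustration; IDs in a real PLAZA download may be formatted differently.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of {sequence ID: sequence}.

    The ID is the first word after '>'; wrapped sequence lines are joined.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def filter_by_prefix(records, prefixes):
    """Keep only the records whose ID starts with one of the given prefixes."""
    prefixes = tuple(prefixes)
    return {n: s for n, s in records.items() if n.startswith(prefixes)}


def to_fasta(records, width=60):
    """Format records as FASTA text, wrapping sequences at `width` characters."""
    lines = []
    for name, seq in records.items():
        lines.append(">" + name)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Made-up demo input (not real gene IDs or sequences).
example_fasta = """>AT_demo1 made-up sequence
MEIAS
QED
>Os_demo1 made-up sequence
MDLSGAQ
>Zz_demo1 made-up sequence
MESLV
"""

records = read_fasta(example_fasta)
subset = filter_by_prefix(records, ["AT", "Os"])
print(to_fasta(subset), end="")
```

For a real file, the text could be read with `open(path).read()` and the filtered result written back to a new file before uploading it to the alignment server.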
② Align the sequences
Let's align the sequences on the MAFFT version 7 online server (https://mafft.cbrc.jp/alignment/server/).
For the alignment, I select G-INS-1 (Slow; progressive method with an accurate guide tree).
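The result can be viewed in CLUSTAL format, which prints the alignment in blocks of 60 columns with the sequence name repeated in each block. If you want to inspect the result programmatically, the blocks first have to be joined per sequence. The following is a minimal sketch of such a parser, assuming each alignment line has the form "name, whitespace, residues". The function name `parse_clustal` and the toy input are hypothetical; for real work, an established library such as Biopython may be a better choice.

```python
def parse_clustal(text):
    """Join the blocks of a CLUSTAL-format alignment into full-length rows.

    Returns a dict of {sequence name: aligned sequence}.
    """
    chunks = {}
    for line in text.splitlines():
        # Skip blank lines and the "CLUSTAL ..." header line.
        if not line.strip() or line.startswith("CLUSTAL"):
            continue
        # Conservation lines (*, :, .) start with whitespace; skip them.
        if line[0].isspace():
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        name, residues = parts[0], parts[1]
        chunks.setdefault(name, []).append(residues)
    return {name: "".join(parts) for name, parts in chunks.items()}


# Toy two-block alignment (made-up sequences, for illustration only).
toy_clustal = "\n".join([
    "CLUSTAL format alignment by MAFFT (v7.511)",
    "",
    "",
    "seqA            M-EIA",
    "seqB            M-EVS",
    "                * *  ",
    "",
    "seqA            SQED",
    "seqB            SSQE",
])

aligned = parse_clustal(toy_clustal)
lengths = {len(seq) for seq in aligned.values()}
print(aligned)
print("all rows have equal length:", len(lengths) == 1)
```

Checking that every row has the same length is a quick sanity test that no block was lost when copying the output.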
CLUSTAL format alignment by MAFFT (v7.511)
AT1G14440 --------------------------------------------------M-EIA-----
AT2G02540 --------------------------------------------------M-EIA-----
Lj4g002049 --------------------------------------------------M-EVS-----
Lj5g000796 ------------------------------------------------------------
Os11g02433 MVSILQLQTRTEASPASSASAAATRIFAVRRQQQEQEGEEEEEEFEFQERM-DLS-----
AT1G18835 ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917 ------------------------------------------------------------
AT1G74660 ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 --------------------------------------------------M-FFD-----
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660 --------------------------------------------------M-NFE-----
Lj1g000953 --------------------------------------------------M-EFDDHEDL
Lj1g002084 --------------------------------------------------M-EFD-----
Os08g04794 --------------------------------------------------M-DFD-----
Os09g04664 --------------------------------------------------M-DFD-----
Pp3c1_1529 --------------------------------------------------M-ESL-----
Pp3c2_2116 --------------------------------------------------M-ESL-----
Pp3c7_1500 --------------------------------------------------M-ESL-----
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 --------------------------------------------------M-DLG-----
Pp3c21_110 --------------------------------------------------M-DLG-----
Pp3c19_204 --------------------------------------------------M-DLG-----
Pp3c22_960 --------------------------------------------------M-DLG-----
Lj2g001509 --------------------------------------------------M-EFE-----
Lj4g002796 --------------------------------------------------M-DY------
AT5G65410 --------------------------------------------------M-EFE-----
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890 --------------------------------------------------M-ELG-----
AT2G18350 --------------------------------------------------M-EV------
Lj2g001028 --------------------------------------------------M-EAVRGQGS
Lj4g002150 --------------------------------------------------M-EM------
AT1G75240 --------------------------------------------------M-DMR-----
Os01g06355 --------------------------------------------------M-ELS-----
Os05g05793 --------------------------------------------------M-EFR-----
AT1G69600 --------------------------------------------------M-DLS-----
AT5G60480 --------------------------------------------------M-SSL-----
AT3G28920 --------------------------------------------------MLEV------
AT5G39760 --------------------------------------------------MMDMT-----
AT5G15210 --------------------------------------------------M-DVI-----
Lj1g000619 --------------------------------------------------M-DIT-----
Lj2g000448 --------------------------------------------------M-EVL-----
Lj4g001229 --------------------------------------------------M---------
Lj2g002469 --------------------------------------------------M-DLK-----
Os03g07185 --------------------------------------------------M-E-------
Lj2g000198 --------------------------------------------------M-EGG-----
Os04g04345 --------------------------------------------------M---------
Os08g04384 --------------------------------------------------M-EAV-----
Os02g07066 --------------------------------------------------M-EYKRSSHV
Os06g03372 --------------------------------------------------L---------
AT1G14687 ------------------------------------------------------------
AT5G42780 --------------------------------------------------MDEIK-----
Os12g02089 --------------------------------------------------M-DL------
Pp3c11_223 MG------------------------------------------------M-DISRY---
Pp3c5_860 MN------------------------------------------------------Y---
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 ------------------------------------------------------------
AT1G14440 ------SQEDHD------------------------------------------------
AT2G02540 ------SQED--------------------------------------------------
Lj4g002049 ------SSQEAGE------------------------------------------IPIP-
Lj5g000796 ------------------------------------------------------------
Os11g02433 ------GAQGE-------------------------------------------------
AT1G18835 ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917 ------------------------------------------------------------
AT1G74660 ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------SQKT------------HK------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660 ------DQEE------------DM------------------------------------
Lj1g000953 EEEEEEEEEE------------EE------------------------------------
Lj1g002084 ------EQEE------------QD------------------------------------
Os08g04794 ------DHDEGDG---------DE------------------------------------
Os09g04664 ------DHD--DG---------DE------------------------------------
Pp3c1_1529 ------VSHKID------------------------------------------------
Pp3c2_2116 ------VSYKID------------------------------------------------
Pp3c7_1500 ------VSHKVD------------------------------------------------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ------TGREGTD---------PQQQQKSS-------HQTQQQQ----QPQPLSPLPAPL
Pp3c21_110 ------SGRDGTD--------QQQQQQRGP-------HQTQQQQQQLPQPQPLASLPTSI
Pp3c19_204 ------SGHESSN---------DQPQP------------EQPQM----QTSPLPSLPAPI
Pp3c22_960 ------SGHESNN---------NQ----------------QQQV----QAHPLPISPLPA
Lj2g001509 ------DQEE------------QE------------------------------------
Lj4g002796 --------DE------------QE------------------------------------
AT5G65410 ------DNNNNND---------EEQE----------------------------------
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890 ------GKCNAIT-----------------------------------------------
AT2G18350 --------RE--------------------------------------------------
Lj2g001028 KDIEI-MSTATTTTLGYNLPIRNSSSSSSKLSSPTAGHRTSTDQ------------PAPV
Lj4g002150 --------REMPSTLIYNLPNRDSSSPSLP---------SSSDQ------------P---
AT1G75240 ------SHEMIER--------RREDN----------------------------------
Os01g06355 ------EHEEDAGDVG--------------------------------------------
Os05g05793 ------GHDEPVDEM---------------------------------------------
AT1G69600 ------------------------------------------------------------
AT5G60480 ------------------------------------------------------------
AT3G28920 ------------RSMD------MTP-----------------------------------
AT5G39760 ------PTIT--TTTT------PTP-----------------------------------
AT5G15210 ------AT-T--TTIV------SDL-----------------------------------
Lj1g000619 ------PTTTIITNIN------NTATPTTTIA----------------------------
Lj2g000448 ------TTAT--NNIT------STA-----------------------------------
Lj4g001229 ------------------------------------------------------------
Lj2g002469 ------T-----------------------------------------------------
Os03g07185 ------------------------------------------------------------
Lj2g000198 ------A-----------------------------------------------------
Os04g04345 ------------------------------------------------------------
Os08g04384 ------------------------------------------------------------
Os02g07066 EEEEEEEEEEDDEEED------EEEQGHHQYTT------AAAQQ----QLHP--------
Os06g03372 ------------------------------------------------------------
AT1G14687 ------------------------------------------------------------
AT5G42780 ------PKKEENSKRR--------------------------------------------
Os12g02089 ------------------------------------------------------------
Pp3c11_223 ------NHCEVHTNEVKAYKVQDGMLPTVS-------SDSGADQQGGDVAGRMEFWPPGV
Pp3c5_860 ------N----------------------------------CDRRGG-------------
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 ------------------------------------------------------------
AT1G14440 ---------------------------------MPIP------------------LNTT-
AT2G02540 ----------------------------------PIP------------------INTS-
Lj4g002049 -------------------------------IPIPIP------------------INSS-
Lj5g000796 ------------------------------------------------------------
Os11g02433 ---------------------------------LPIP------------------MHASA
AT1G18835 ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917 ------------------------------------------------------------
AT1G74660 ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ----------------------------FHFIFAGFE------------------VEE--
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660 -------------------------------EMSGVN------------------PPC--
Lj1g000953 -------------------------------EEGEVEMGFT-------------VAPP--
Lj1g002084 -------------------------------EEMGIPESPP-------------PVPA--
Os08g04794 -------------------------------EMPPMP------------------LSS--
Os09g04664 -------------------------------EMPPMP------------------VSS--
Pp3c1_1529 --------------------------------YTPMP------------------ITA--
Pp3c2_2116 --------------------------------YTPMP------------------ITA--
Pp3c7_1500 --------------------------------YSTMS------------------IAA--
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 PL------------------LMPQPLAVSNFHSAPPP------------------LQQ--
Pp3c21_110 PL------------------LLPPALAVSNYHPTPVS------------------VQP--
Pp3c19_204 PM------------------MLP-SLGASNYHATPTS------------------LHQQQ
Pp3c22_960 PM------------------MLP-SLAASNYHSTPTS------------------LYQQL
Lj2g001509 -------------------------------EELCMA------------------TAP--
Lj4g002796 -------------------------------EELVMAG----------------GGGA--
AT5G65410 -------------------------------EDMNLHEEEE-------------DDDA--
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890 -------------------------------TTTMISTE----------------VKP--
AT2G18350 ----------------------------KKDEKMEMTRRKSSALD-------HHRLPPYT
Lj2g001028 P--------------------VP----HTNNNTLIFNDSAQPSHH-------HHHLSAPP
Lj4g002150 ----------------------------SQTHTIIFNHPPKLSH----------------
AT1G75240 ---------------------------GNNNGGVVIS----------------NIIST--
Os01g06355 -----------------------------------------------------GGCSSPP
Os05g05793 ------------------------------------------------------GVAYGR
AT1G69600 --------------------------------SKPQQQLL--------------------
AT5G60480 --------------------------------SKPNRQFLSPT-----------------
AT3G28920 --------------------------------KSPEPESETPTR-----------IQ---
AT5G39760 --------------------------------KSPEPESETPTR-----------IQ---
AT5G15210 --------------------------------DSRQPEIEAPIR-----------IQ---
Lj1g000619 ------------------------IAAAATSSKSPEHETETPPR-----------IAN--
Lj2g000448 --------------------------------KSPEPETETPTR-----------IQQ--
Lj4g001229 ------------------------------------------------------------
Lj2g002469 --------------------------------ETPPPPTQ--------------------
Os03g07185 ------------------------------------------------------------
Lj2g000198 ------------------------------------------------------------
Os04g04345 ------------------------------------------------------------
Os08g04384 ------------------------------------------------------------
Os02g07066 -------------------------QVLGSSASSPSSLMDSAAFSRPLLPPNLSLVSPSA
Os06g03372 ------------------------------------------------------------
AT1G14687 ------------------------------------------------------------
AT5G42780 -----------------------------------------------------RNVKPIC
Os12g02089 ------------------------------------------------------------
Pp3c11_223 GVADHANKHCVEGGIDLCGSVIHGNDALDQMLQFPKAGDVRSWRDL----TGASRTNSES
Pp3c5_860 ----------------------YGDSAEEAANLFLAASTRNPWQ--------VGPMNPVI
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 ------------------------------------------------------------
AT1G14440 ----FG-GGGSH-----GHMIHHHDH--------------HAANSAP--PTHNNN-----
AT2G02540 ----YGNSGGGH-----GNMNHHHH-----------------ANSAP--SSL-NI-----
Lj4g002049 ----TN-YGGGHA-AGNGHDHHMNMH--------------HIHDPAP-HHNHNHN-----
Lj5g000796 ------------------------------------------------------------
Os11g02433 AASPFA-GMGAHGGAGGGHVVELHRH--------------EHVGNNG--QAM-AM-----
AT1G18835 ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917 ------------------------------------------------------------
AT1G74660 ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 -----------------------------------------SRE------EFEGN-----
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660 -----------------------------------------GYD------SLSG------
Lj1g000953 -----------------------------------------GFD------SLGN------
Lj1g002084 -----------------------------------------SYD------PLLN------
Os08g04794 -----------------------------------------GY----------D------
Os09g04664 -----------------------------------------SY----------E------
Pp3c1_1529 -----------------------------------------TFA------GLHE------
Pp3c2_2116 -----------------------------------------TFA------GLHE------
Pp3c7_1500 -----------------------------------------TFA------GLHD------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ---------------------HHQHH--------------QLHH------EIPS------
Pp3c21_110 ---------------------QHQQ-----------------HH------EMPG------
Pp3c19_204 -----------------QHHHHHHPH--------------ALHH------DLPS------
Pp3c22_960 ---------------------HHHQH--------------ALHH------DLAS------
Lj2g001509 -----------------------------------------SYD------SLTH------
Lj4g002796 -----------------------------------------SYDDD--DSSLAN------
AT5G65410 -----------------------------------------VYDSPPLSRVLPK------
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890 -----------------------------------------HTDPEPE------------
AT2G18350 ----YS----------QTANKEKPTTKR----------NGSDPDPDPD------------
Lj2g001028 ----LP----------QTQNHHHQSQ--------------RPTTTDPD------------
Lj4g002150 -------------------NHHHHIY--------------TPSSTSPP------------
AT1G75240 -----------------------------------------NIDDNC--NGNNN------
Os01g06355 ------------------------------------------------------------
Os05g05793 ------------------------------------------------------------
AT1G69600 -------------------------------------------NSLPI------------
AT5G60480 ------------------------------------------TNNQDT------------
AT3G28920 ----------------------------------------------PA------------
AT5G39760 ----------------------------------------------PA------------
AT5G15210 ----------------------------------------------PA------------
Lj1g000619 -----------------------------------------TTTPPPT------------
Lj2g000448 -----------------------------------------PGNVNAT------------
Lj4g001229 ------------------------------------------------------------
Lj2g002469 ------------------------------------------------------------
Os03g07185 ------------------------------------------------------------
Lj2g000198 ------------------------------------------------------------
Os04g04345 ------------------------------------------------------------
Os08g04384 ----VG------------------------------------------------------
Os02g07066 AA--AAAPGGSY-----LHAAHHHGQGRRVEAPGGESQHHLQRHHEPARNGVLGG-----
Os06g03372 ------------------------------------------------------------
AT1G14687 ------------------------------------------------------------
AT5G42780 RE--TG----------------------------------DHVHYLPT------------
Os12g02089 ------------------------------------------------------------
Pp3c11_223 KAFRRG--------VIGGRLGHHMS--------------------CPCDSAMMND-----
Pp3c5_860 SA------------------GHHAGN------------------VTNCNAASAGDAGTAQ
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 ------------------------------------------------------------
AT1G14440 ---------------------------NTTQPPPMP-----------------LHGNGHG
AT2G02540 ---------------------------TTSNPLLVS-----------------SNSNGLG
Lj4g002049 ---------------------------NIISPTSAA-----------------VPSNGSS
Lj5g000796 ------------------------------------------------------------
Os11g02433 ---------------------------ASPPPTNVA-----------------VAAE---
AT1G18835 -----------------------------MKKRQVV------------------------
Lj1g001161 -----------------------------MKKRQVV------------------------
AT3G28917 -----------------------------MRKRQVV------------------------
AT1G74660 ----------------------------MMKKRQMV------------------------
Lj2g002499 -----------------------------MRKRQVV------------------------
Lj4g000050 -----------------------------MKKKQVV------------------------
Lj1g001230 ---------------------------CTMRKSHVV------------------------
Os11g01283 -----------------------------MGPQQ--------------------------
Os12g01245 -----------------------------MGPQQ--------------------------
AT4G24660 -----------------------------EGATSSG--------------------GG--
Lj1g000953 -----------------------------SAARSKT--------------------GGGI
Lj1g002084 -----------------------------SAPRSKI------------------------
Os08g04794 -----------------------------APMQPGL------------------GGGGGG
Os09g04664 -----------------------------TPPQHGL------------------AGGGMA
Pp3c1_1529 -----------------------------FSKLKLL------------------SSTGNG
Pp3c2_2116 -----------------------------FSKLKLF------------------SNTGNR
Pp3c7_1500 -----------------------------SSKLKFF-------------------NSGFG
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 -----------------------------STKLKLL---------------NTSSGNGGS
Pp3c21_110 -----------------------------AAKPKLL---------------NTPSGNGGS
Pp3c19_204 -----------------------------ATKLKLLNQ------------NNTPSGNGGS
Pp3c22_960 -----------------------------TTKLKLQNQ------------NNTPSGNGGS
Lj2g001509 -----------------------------PSRVKMP--------------------GGGA
Lj4g002796 -----------------------------PTRVKMP--------------------SPVD
AT5G65410 -----------------------------ASTESHETT------------GTTSTGGGGG
Os08g04381 ------------------------------------------------------------
Os09g04145 -----------------------------MMKRLVV------------------------
AT3G50890 -----------------------------AKPESDP------------------------
AT2G18350 -----------------------------LDTNPIS------------------------
Lj2g001028 -----------------------------LTPSSSP------------------------
Lj4g002150 -----------------------------LPPNSVQ------------------------
AT1G75240 -----------------------------NTRVSCN------------------------
Os01g06355 -----------------------------TPPHRVLTS----------------------
Os05g05793 -----------------------------TPPSSSSSP----------------------
AT1G69600 -----------------------------AGELTV-------------------------
AT5G60480 -----------------------------GREQTI-------------------------
AT3G28920 ------------------------------KPISFS--------------------NGI-
AT5G39760 ------------------------------KPISFS--------------------NGI-
AT5G15210 ------------------------------KPISFS--------------------NG--
Lj1g000619 ------------------------------KALSFS--------------------NGV-
Lj2g000448 -----------------------------AKPLSFS--------------------NGV-
Lj4g001229 ------------------------------------------------------------
Lj2g002469 ------------------------------------------------------------
Os03g07185 ------------------------------------------------------------
Lj2g000198 ------------------------------------------------------------
Os04g04345 -----------------------------MDHLSLV------------------------
Os08g04384 -----------------------------VKYRPVVFP------------------NGGA
Os02g07066 ----VAG-------------------AHAASTLALV-------------------GGGGG
Os06g03372 ------------------------------------------------------------
AT1G14687 ------------------------------------------------------------
AT5G42780 ---------------------------CKTKPKPTR------------------------
Os12g02089 ------------------------------------------------------------
Pp3c11_223 ----FKGSSYLLGKSFRIGVGESEDVDGSVEGRQGEAGLGRWE-------EAATSQNEGD
Pp3c5_860 GAILFQG--FLGGRGY----------GGSLQ-PSSAALHARWDLNPVQPGENQTSGNQRD
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 ----------------------------------MTNKLFRWDLNPIHLGENQTSGNLSD
AT1G14440 --------NNYDHH--------------HHQDPH--HVGYNAII---------KKPMIKY
AT2G02540 --------KNHDHS--------------HH---H--HVGYNIMVTNIKK---EKPVVIKY
Lj4g002049 --------MQLQAAA------------GLQQED-----DGAY----------NKKVAIRY
Lj5g000796 ----------------------------MECSDF--HVDKSL----------EKKIIISY
Os11g02433 ----------------------------QEGSPV--AGKKRGGMAVVGG---GGGVAVKY
AT1G18835 ----------------------IKQ-------RK-----SSYTM----T---SSSSNVRY
Lj1g001161 ----------------------VK---------K-----LSNTT----S---SVMRNIRY
AT3G28917 ----------------------LRRASPEEPSRS-----SSTAS----S---LTVRTVRY
AT1G74660 ----------------------IKQRSRNSNTSSSWTTTSSSSS----S---SEISNVRY
Lj2g002499 ----------------------VRREDPQR-----------------------NVRSVKY
Lj4g000050 ----------------------V-------------------------------------
Lj1g001230 ----------------------VRR-----------------VE----S---PTGRNVRY
Os11g01283 --------------------------DRSAAKPYANGSTAAAAA----AGRKENNKVVRY
Os12g01245 --------------------------DRSAAKPYANGSTAAAAA----AGRKENNKVVRY
AT4G24660 ------------------------------------GVGRSKGV----------GAKIRY
Lj1g000953 EPEG-----------------------GAAATAL--GVGRKNGS----------TGTVRY
Lj1g002084 ----------------------------AEVSAP--VIGRKGGSFT-PP---VAAGVVRY
Os08g04794 VPKP--------------------------GGGV--GGGGGGGG----G---GGGGGARY
Os09g04664 -PKP--------------------------PGEI--GSRVKGPS----C---GGG---RY
Pp3c1_1529 -VTT--------------------------MDEP--LLLEAPSV----K---AKAKVIRY
Pp3c2_2116 -VTN--------------------------MDEP--RPMEAAGA----K---AKSKAIRY
Pp3c7_1500 -VIK--------------------------MDEL--KRIEAENV----S---AKDKAISY
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 VPSK------GDHVGADQAREILRQAVQGAAAGN--ESASTKPS----N---VKKGTVRY
Pp3c21_110 VQSK------SDHVAADQAREIVRQAVQVAGAAS--ESASAKLS----N---VKKGAFRY
Pp3c19_204 VPTKYNDVKNSDQVATDQAREILRQAVVTAVTES--NAASTKAA----N--AAKKGAVRY
Pp3c22_960 LPTKNSDVKTSDQVATDQAREILRQAVVTAVTES--NAASAKAA----H---AKKGAVRY
Lj2g001509 EPIM--------------------------AHPL--RNNSSNNG----A---AKG---RY
Lj4g002796 DPAA--------------------------MMVV--VRNT------------GKG---KY
AT5G65410 FMVVH-------------------------------------------G---GGGSRFRF
Os08g04381 --------------------------------------------------------RVRY
Os09g04145 ----------------------LRR--------------REPAV----R---FSCCGVRY
AT3G50890 ---------------------------SMALFPI--KKENQKPK----T---RVDQGAKY
AT2G18350 ----------------------ISHA----------PRSYARPQ----T---TSPGKARY
Lj2g001028 ----------------------LATT----------RITAPPPP----P---PTP-LVRY
Lj4g002150 ----------------------LQQQ----------PTRDPDPS----S---SSSLLIRY
AT1G75240 ----------------------------SQTLDH--HQSKSPSSFSISA---AAKPTVRY
Os01g06355 ------------------------------------AAPET--------------IRCRY
Os05g05793 ------------------------------------AASASAGN----G---AGAAEVRY
AT1G69600 ----------------------------------------------------TGEMGVCY
AT5G60480 -------------------------------------------A----C---ARDMVVLY
AT3G28920 ----------------------IKRH--H----HH-HH-----N----N------NKVTY
AT5G39760 ----------------------IKRH--H----HH-HH----------P------LLFTY
AT5G15210 -----------------------KRC--H----HH-HL-----A----S---EAVAVATY
Lj1g000619 ----------------------LKRH--HPSSYHH-HHHHPLSA----N---HTTMAVAY
Lj2g000448 ----------------------LKRH--HPPAPH---------A----N---HSPVTVTY
Lj4g001229 -------------------------------------------------------VVVSF
Lj2g002469 ---------------------------------HR-HLITATPS----P---PSTVAVSY
Os03g07185 -------------------------------------------Q----Q---QERPREVY
Lj2g000198 ---------------------------------------MISSS----E---NSSSNCLY
Os04g04345 --------------------------------PY-----EGGSA----G---GGGGGGKY
Os08g04384 AAA---------------------------------AAGKSKAT----P---ASATAAVY
Os02g07066 ------------------------------------GPRGGEGA----A---GEAPTWRY
Os06g03372 ------------------------------------QLRRAQPA--------VGGGETVY
AT1G14687 -----------------------------------------------------MQSTCVY
AT5G42780 ------------------------------------THHAPPPI--LDS-IFKVTHKPHY
Os12g02089 ------------------------------------------------------------
Pp3c11_223 PHAQF---NVVQEE-------ELSNRVNDLCSQGDRSNEHRLQE-------FSRDMIDEC
Pp3c5_860 DQEQA---KWANQNTTSRFRGQLDEDDLLGFSMDQRSAQPNLQA-------KSGTCTVVY
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 NHNPA---KWMNTNTAPRLRGQLANDDFLGFSPDQRSVQQNLQP-------KSGHCVIVC
AT1G14440 K------------------------------------ECLKNHAAAMGGNATDGCGEF--
AT2G02540 K------------------------------------ECLKNHAATMGGNAIDGCGEF--
Lj4g002049 R------------------------------------ECLKNHAAGMGGNATDGCGEF--
Lj5g000796 K------------------------------------ECLKNHAAAIGGNATDGCCEF--
Os11g02433 R------------------------------------ECLKNHAAAIGGNATDGCGEF--
AT1G18835 V------------------------------------ECQKNHAANIGGYAVDGCREF--
Lj1g001161 G------------------------------------ECQKNHAASIGGYAVDGCREF--
AT3G28917 G------------------------------------ECQKNHAAAVGGYAVDGCREF--
AT1G74660 V------------------------------------ECQKNHAANIGGYAVDGCREF--
Lj2g002499 G------------------------------------ECQKNHAANVGGYAVDGCREF--
Lj4g000050 -----------------------------------------NHAAYSGGYAVDGCREF--
Lj1g001230 G------------------------------------ECQKNHAVNVGGYAVDGCREF--
Os11g01283 R------------------------------------ECQRNHAASIGGHAVDGCREF--
Os12g01245 R------------------------------------ECQRNHAASIGGHAVDGCREF--
AT4G24660 R------------------------------------ECLKNHAVNIGGHAVDGCCEF--
Lj1g000953 R------------------------------------ECQKNHAVGIGGHAVDGCCEF--
Lj1g002084 R------------------------------------ECQKNHAVSFGGHAVDGCCEF--
Os08g04794 R------------------------------------ECLKNHAVGIGGHAVDGCGEF--
Os09g04664 R------------------------------------ECLKNHAVGIGGHAVDGCGEF--
Pp3c1_1529 R------------------------------------ECNRNHAITTGGYVVDGCGEF--
Pp3c2_2116 R------------------------------------ECNRNHAISTGGYAVDGCGEF--
Pp3c7_1500 K------------------------------------ECNRNHAIFSGGYAVDGCGEF--
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 R------------------------------------ECQKNHAASIGGHALDGCGEF--
Pp3c21_110 R------------------------------------ECQKNHAASIGGHALDGCGEF--
Pp3c19_204 R------------------------------------ECQKNHAAGMGGHAMDGCGEF--
Pp3c22_960 R------------------------------------ECQKNHAAGIGAHAIDGCGEF--
Lj2g001509 R------------------------------------ECLKNHAVGIGGHALDGCGEF--
Lj4g002796 R------------------------------------ECLKNHAVGMGGYALDGCLEF--
AT5G65410 R------------------------------------ECLKNQAVNIGGHAVDGCGEF--
Os08g04381 G------------------------------------ECRRNHAARMGGHAVDGCREF--
Os09g04145 G------------------------------------ECRRNHAASTGGHAVDGCREF--
AT3G50890 R------------------------------------ECQKNHAASTGGHVVDGCCEF--
AT2G18350 R------------------------------------ECQKNHAASSGGHVVDGCGEF--
Lj2g001028 R------------------------------------ECLRNHAASMGSHVVDGCGEF--
Lj4g002150 R------------------------------------ECLRNHAARLGSHVTDGCGEF--
AT1G75240 R------------------------------------ECLKNHAASVGGSVHDGCGEF--
Os01g06355 H------------------------------------ECLRNHAAASGGHVVDGCGEF--
Os05g05793 H------------------------------------ECLRNHAAAMGGHVVDGCREFMP
AT1G69600 K------------------------------------ECLKNHAANLGGHALDGCGEF--
AT5G60480 N------------------------------------ECLKNHAVSLGGHALDGCGEF--
AT3G28920 K------------------------------------ECLKNHAAAIGGHALDGCGEF--
AT5G39760 K------------------------------------ECLKNHAAALGGHALDGCGEF--
AT5G15210 K------------------------------------ECLKNHAAGIGGHALDGCGEF--
Lj1g000619 K------------------------------------ECLKNHAANLGGHALDGCGEF--
Lj2g000448 K------------------------------------ECLKNHVASLGGHALDGCGEF--
Lj4g001229 K------------------------------------ECLKNHAASLGGHALDGCGEF--
Lj2g002469 K------------------------------------ECLRNHAASLGAHALDGCGEF--
Os03g07185 R------------------------------------ECMRNHAAKLGTYANDGCCEY--
Lj2g000198 R------------------------------------ECLRNHAATLGSYATDGCGEF--
Os04g04345 K------------------------------------ECMRNHAAAMGGQAFDGCGEY--
Os08g04384 R------------------------------------ECLKNHAASLGGHAVDGCGEF--
Os02g07066 R------------------------------------ECLKNHAARMGAHVLDGCGEF--
Os06g03372 Q------------------------------------ECPKNHAASLGGHGAGRLRGVHA
AT1G14687 R------------------------------------ECMRNHAAKLGSYAIDGCREY--
AT5G42780 Y------------------------------------ECRKNHAADIGTTAYDGCGEF--
Os12g02089 ------------------------------------------------------------
Pp3c11_223 NMVGIDLRRRNHPEDLDGLDLREAGLKGVSRDCDPYAQCQKNTCVARGPSSVDRFTKF--
Pp3c5_860 K------------------------------------ECQKNQALDTANHCVDGCGEF--
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 K------------------------------------ECQNNHALDGVNHCIDGCGEF--
AT1G14440 MP-SGED-GSIE-----------A-LTCSACNCHRNFHRKE------VEG------E---
AT2G02540 MP-SGEE-GSIE-----------A-LTCSVCNCHRNFHRRE------TEG------E---
Lj4g002049 MP-SGEE-GTIE-----------A-LNCSACNCHRNFHRKE------VEG------E---
Lj5g000796 MP-AGDE-GTLE-----------A-LKCSACNCHRNFHRKE------VD-----------
Os11g02433 MP-SGEE-GSLE-----------A-LKCSACGCHRNFHRKE------ADD------L---
AT1G18835 MA-SGGDD---------------A-LTCAACGCHRNFHRRE------VDT------E---
Lj1g001161 MA-SAGDE---------------A-LTCAACGCHRNFHRRE------VQT------E---
AT3G28917 MA-SRGEEGTVA-----------A-LTCAACGCHRSFHRRE------IET------E---
AT1G74660 MA-AGVE-GTVD-----------A-LRCAACGCHRNFHRKE------VDT------E---
Lj2g002499 MA-SGEE-GTSD-----------S-LACAACGCHRNFHKKE------VQT------EGS-
Lj4g000050 MA-SAGE-GTEG-----------A-LTCAACGCHRNFHKRELTFNSTLKT------K---
Lj1g001230 MA-SGAE-GTSV-----------A-LTCAAYGCHRSFYKKE------VWP------E---
Os11g01283 MA-SGAE-GTAA-----------A-LLCAACGCHRSFHRRE------VEA------AA--
Os12g01245 MA-SGAD-GTAA-----------A-LLCAACGCHQSFHRRE------VEA------AA--
AT4G24660 MP-SGED-GTLD-----------A-LKCAACGCHRNFHRKE------TESIGGRAHR---
Lj1g000953 LA-AGQE-GTLE-----------A-VICAACNCHRNFHRKE------T---GG---E---
Lj1g002084 MA-AGDE-GTLE-----------A------------------------------------
Os08g04794 MA-SGEE-GSID-----------A-LRCAACGCHRNFHRKE------SES------P---
Os09g04664 MA-AGEE-GTID-----------A-LRCAACNCHRNFHRKE------SES------L---
Pp3c1_1529 MP-GGEE-GTVA-----------A-LRCAACDCHRNFHRKE------TEG------E---
Pp3c2_2116 MP-GGEE-GTVA-----------A-LKCAACDCHRNFHRKE------VEG------E---
Pp3c7_1500 MP-SGEE-GTIE-----------S-LKCAACDCHRNYHRKE------VEV------E---
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 MP-GGEE-GTVD-----------A-LRCAACDCHRNFHRRE------VEG------E---
Pp3c21_110 MP-GGQE-GTVG-----------A-LRCAACDCHRNFHRRE------VEG------E---
Pp3c19_204 MP-GGGE-GSVD-----------A-LRCAACNCHRNFHRRE------VEG------E---
Pp3c22_960 MP-GGEE-GSVD-----------A-LRCAACNCHRNFHRRE------VEG------E---
Lj2g001509 MP-AGSE-GTLD-----------A-LKCAACNCHRNFHRRE------NDS------S---
Lj4g002796 MA-AGPE-GTID-----------A-LKCAACDCHRNFHRK--------DA------A---
AT5G65410 MP-AGIE-GTID-----------A-LKCAACGCHRNFHRKE------LPY------F---
Os08g04381 LA-EGEE-GTGG-----------A-LRCAACGCHRSFHRRV------VVV------Q---
Os09g04145 IA--AED-GGGGNSTSAVGVAAAA-LKCAACGCHRSFHRRV------QVY------E---
AT3G50890 MA-GGEE-GTLG-----------A-LKCAACNCHRSFHRKE------VYG----------
AT2G18350 MS-SGEE-GTVE-----------S-LLCAACDCHRSFHRKE------IDG------L---
Lj2g001028 MP-SGEE-GT-E-----------A-LKCAACECHRNFHRKE------VEG------E---
Lj4g002150 MP-NGEQ-GTPE-----------S-LICAACECHRNFHRKE------AQG------EP--
AT1G75240 MP-SGEE-GTIE-----------A-LRCAACDCHRNFHRKE------MDG------V---
Os01g06355 MP-ASTE----E-----------P-LACAACGCHRSFHRRD------PSP------G---
Os05g05793 MP-GDAA----D-----------A-LKCAACGCHRSFHRKD------DGQ------Q---
AT1G69600 MP-SPTA-TSTD---------PSS-LRCAACGCHRNFHRRD------PSE------N---
AT5G60480 TP-KSTT-ILTD---------PPS-LRCDACGCHRNFHRRS------PSD------G---
AT3G28920 MP-SPSS-TPSD---------PTS-LKCAACGCHRNFHRRE------TD-----------
AT5G39760 MP-SPSS-ISSD---------PTS-LKCAACGCHRNFHRRD------PDN------N---
AT5G15210 MP-SPSF-NSND---------PAS-LTCAACGCHRNFHRRE------EDP------S---
Lj1g000619 MP-APSA-TAAD---------PSS-LKCAACGCHRNFHRRE------PEE------P---
Lj2g000448 MP-SPTA-TADD---------PSS-IKCAACGCHRNFHRRE------PEE------P---
Lj4g001229 MP-SSST-NPTD---------PRS-LKCAACGCHRNFHRRD------P------------
Lj2g002469 MP-SA------E---------PRSQLTCAACGCHRNFHRRD------TKQ------Q---
Os03g07185 TP-DDG-----H---------PAG-LLCAACGCHRNFHRKD------FLD------G---
Lj2g000198 TL-DD----------------PAGSLQCAACGCHRNFHRK--------------------
Os04g04345 MP-ASPD----------------S-LKCAACGCHRSFHRRA------AAG------I---
Os08g04384 MP-SPAA-DAAD---------PAS-LKCAACGCHRNFHRRL------PEA------P---
Os02g07066 MS-SPGD-GAA------------A-LACAACGCHRSFHRRE------PAV------V---
Os06g03372 VV-GGEP-TDPT-----------S-LMCAACGCHCNFHCWL------LEG------S---
AT1G14687 SQPST---GDL----------------CVACGCHRSYHRRI------DVI------S---
AT5G42780 VS-ST---GEED-----------S-LNCAACGCHRNFHREE------LIP------E---
Os12g02089 ---SGAQ-GELPLPMH-------A-AASPYLGLHHDHHHQLLG----VGA----------
Pp3c11_223 LS-SGKD-EKL------------A-LTCPPCGCHRNFHQRV------VDA-CEEGEEEEL
Pp3c5_860 MR-RGRE-GQE------------A-LQCMACGCHRSYH-RS------VLV-GDNGKELD-
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 MR-RGRD-GPE------------A-LQCMACGCHRRYH-RC------VGV-GDNGNEPQ-
AT1G14440 --------------------------------L---------------------------
AT2G02540 ------------------------------------------------------------
Lj4g002049 --------------------------------PPDY------------------------
Lj5g000796 ------------------------------------------------------------
Os11g02433 --------------------------------DADS------------------------
AT1G18835 --------------------------------VVCE------------------------
Lj1g001161 --------------------------------VVCE------------------------
AT3G28917 --------------------------------VVCD------------------------
AT1G74660 --------------------------------VVCG------------------------
Lj2g002499 -------------------------------HLLVP------------------------
Lj4g000050 --------------------------------MI--------------------------
Lj1g001230 --------------------------------AECD------------------------
Os11g01283 --------------------------------AECD------------------------
Os12g01245 --------------------------------AECD------------------------
AT4G24660 --------------------------------VPTY------------------------
Lj1g000953 --------------------------------ITSY------------------------
Lj1g002084 ------------------------------------------------------------
Os08g04794 --------------------------------TGVGP-------------------AE--
Os09g04664 --------------------------------AGEG------------------------
Pp3c1_1529 --------------------------------TSCD------------------------
Pp3c2_2116 --------------------------------ATCD------------------------
Pp3c7_1500 --------------------------------ESCD------------------------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 --------------------------------VLCE------------------------
Pp3c21_110 --------------------------------VLCE------------------------
Pp3c19_204 --------------------------------VLCD------------------------
Pp3c22_960 --------------------------------VLCD------------------------
Lj2g001509 --------------------------------NSPGD-------------------GGQF
Lj4g002796 --------------------------------AFPGD-------------------HPYH
AT5G65410 --------------------------------HHAP------------------------
Os08g04381 --------------------------------QCCA------------------------
Os09g04145 --------------------------------VAWD------------------------
AT3G50890 ------------------------------------------------------------
AT2G18350 --------------------------------FVVN------------------------
Lj2g001028 --------------------------------QQVP------------------------
Lj4g002150 --------------------------------QQVS------------------------
AT1G75240 --------------------------------GSSD------------------------
Os01g06355 --------------------------------RAGA---------------------ARL
Os05g05793 --------------------------------Q-------------------------QQ
AT1G69600 --------------------------------LNFL------------------------
AT5G60480 --------------------------------F---------------------------
AT3G28920 ---------------------------------DSS------------------------
AT5G39760 --------------------------------NDSS------------------------
AT5G15210 --------------------------------SLSA------------------------
Lj1g000619 --------------------------------PIST------------------------
Lj2g000448 --------------------------------PITA------------------------
Lj4g001229 ----------------------------------SA------------------------
Lj2g002469 --------------------------------YSNP------------------------
Os03g07185 --------------------------------RATA------------------------
Lj2g000198 ------------------------------------------------------------
Os04g04345 --------------------------------GGGP------------------------
Os08g04384 --------------------------------PSPPL-------------------LALP
Os02g07066 --------------------------------APAS------------------------
Os06g03372 --------------------------------PPPP------------------------
AT1G14687 ------------------------------------------------------------
AT5G42780 ------------------------------------------------------------
Os12g02089 ------------------------------------------------------------
Pp3c11_223 TVKAKREKLNSGNYFSSFVDHCNIDRVAHELMAVANEALALA-------QDSICHGEGRG
Pp3c5_860 ----------------------TIGEVDAVQPRLINNDLHLSLSRIETVALNLMEATGRA
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 ----------------------NIDEADGAAPRISNDDLQLSLSRTETVSPNLMEATGRA
AT1G14440 ------------------A--AT-AM--------------SPY---HQHP-------P-H
AT2G02540 ------------------E--KT-FF--------------SPYLNHHQPP-------PQQ
Lj4g002049 ----QHFNRLG----LGGR--KF-IL--------------GGHHHHHHKN-----ILTTP
Lj5g000796 ------------------------------------------------------------
Os11g02433 ----CAAALRA----AAGR--HHHLL--------------GPALPHHHHKNGGGLLVAGG
AT1G18835 -----YSPPN--------------------------------------------------
Lj1g001161 -----YSPPN--------------------------------------------------
AT3G28917 ----CNSPPS--------------------------------------------------
AT1G74660 ------------------------------------------------------------
Lj2g002499 -----YS-----------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ----CSSSPSA-------------------------------------------------
Os11g01283 ----CSSDTS--------------------------------------------------
Os12g01245 ----CSSDTS--------------------------------------------------
AT4G24660 ----YNRPPQP-H--------------------QPP-----GYL-H---------LTSP-
Lj1g000953 ----QPRPPQQ-QPAY------HHQFSPYYPRAEPP--PSAGYL-HH-------LVTPP-
Lj1g002084 ----------------------HHA-------------ASGGYLHHH-------LTTSP-
Os08g04794 ---PSAVSPAA-ISAYGAS--PHHQFSPYY-R-TP-----AGYL-HHQQHQMAAAAAAAA
Os09g04664 ----SPFSPAA-VVPYGAT--PHHQFSPYY-R-TP-----AGYL-HHHQHHM-AAAAAAA
Pp3c1_1529 ----------------------------------------CKYI-NRNDPRKRGMM-VPG
Pp3c2_2116 ----------------------------------------CQNI-KRNDPRKRGLM-APG
Pp3c7_1500 ----------------------------------------WQ-I-FRCDDRKRGQMTAPG
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ----CKRKQKP-GVQLGA-----------------------------------AVITSQH
Pp3c21_110 ----CKRKPKP-GMQLGA-----------------------------------GIVTPHQ
Pp3c19_204 ----CKRKPKM-GAPLGT-----------------------------------GIVNTGQ
Pp3c22_960 ----CKRKPKP-GVQLGA-----------------------------------GIVTPGM
Lj2g001509 LLTHLPHVPPP-PPQFQA----------YYGR-GP-----AGYL-HMSGQHR--------
Lj4g002796 PFHHRRHQPPPGQPQYAA----------CYRA-TP-----AGYL-HVAGPNR--------
AT5G65410 ----PQHQPPPPPPGF-------------Y-R-LP-----APVS-YRPPPS---------
Os08g04381 ----CDTAAAA-AAAGGW------------------------------------------
Os09g04145 ----DDCASG--------------------------------------------------
AT3G50890 ---------------------------------------------HRNSKQDHQLMITPA
AT2G18350 ----FNSF---------------------------------GHS-QRP------------
Lj2g001028 ----NPSFHSY------------------YKH-------SNGHL-QLPAPQP----LPPP
Lj4g002150 ----N----YH------------------HNK-------SNGQN-RI-------------
AT1G75240 ---------------------------------------LISHHRHHH------------
Os01g06355 PQLHLPASINS-------------------------------------------------
Os05g05793 LRLLIPSPPTP-------------------------------------------------
AT1G69600 ----TAPPIS------------------------------SPS-----------------
AT5G60480 ------------------------------------------------------------
AT3G28920 ----AVPPPSL---LPSST--TTAAIE------------YQPHHRHHPPP-------PLA
AT5G39760 ----QIPPPP------------STAVE------------YQPHHRHHPPP-------PPP
AT5G15210 ----IVP-----------------AIE------------FRPHNRHQLPP-------PPP
Lj1g000619 ----A-------------------VIE------------YQPHHRHHPPP-------PPS
Lj2g000448 ----AHH-----------------VFE------------YQPHHRHHPPP-------PVP
Lj4g001229 ----QTPPQP------------------------------LPHHGMSRST-------SPS
Lj2g002469 ----T-------------------FIS------------FYP-----------------S
Os03g07185 ----AAG-----------------------------------------------------
Lj2g000198 ------------------------------------------------------------
Os04g04345 ----VFFRPPP----------------------PP-----QPHS-HHAALQG--------
Os08g04384 PPPPPPPPPPP-PPQPQQHLPRTAAVAV-----APQLLLHGSHQRREQSPET-DRVRGPG
Os02g07066 -LSLCPASASA-SAAAG-------LVS---------------------------------
Os06g03372 PPLALPAPPMP-----------ANVL-------------HGQLHREEETPE----VRLPG
AT1G14687 ----------------------------------------SPQINHT-------------
AT5G42780 ----------------------------------------NGGVTETVLEV---------
Os12g02089 ---------------------------------HPR---GHGHHHHHLLV----------
Pp3c11_223 L-------PENGV--YSVE--EHTIIA----------KISLENLDHITKV-----ISSTT
Pp3c5_860 LPLLAADHPPRGSDDLATK--ELDTVM----------KISIENLDHISKV-----TLSTV
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 LSLLATDHPSRGADDLAIQ--DLDTVM----------KISIDNLDHIST-----------
AT1G14440 RKLMLNHQK---------------------IRS------AMP------------------
AT2G02540 RKLMFHHKM---------------------IKS------PLP------------------
Lj4g002049 EALGYHHHPTTAGNNN-----------ILPSRT------ILPP-----------------
Lj5g000796 ------------------------------SDS------NIPS-----------------
Os11g02433 DPYGAAYAAARA----------------LPPPP------PPPPH------------GHH-
AT1G18835 ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917 ------------------------------------------------------------
AT1G74660 ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660 ---------------------------------------AAPY-----------------
Lj1g000953 ---------------------------------------LSQH-----------------
Lj1g002084 ---------------------------------------TAPH-----------------
Os08g04794 ----------------------------AAAAG------GYPQ-----------------
Os09g04664 ----------------------------AAAAG------GYPQ-----------------
Pp3c1_1529 ----------------------------APSQP------GGPQ-----------------
Pp3c2_2116 ----------------------------GPSQP------GSSQ-----------------
Pp3c7_1500 ----------------------------FPSQT------ATPH-----------------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ----------------------------PPGGT----IPSTPM-----------------
Pp3c21_110 ----------------------------LPGGT----NTSTPM-----------------
Pp3c19_204 ----------------------------PPTLT-----STTPV-----------------
Pp3c22_960 ----------------------------PTTLT-----SATPV-----------------
Lj2g001509 ----------------------------------------VGA-----------------
Lj4g002796 ---------------------------------------GSGA-----------------
AT5G65410 ---------------------------------------QAPP-----------------
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890 ----------FYS-------------------------SNSSY-----------------
AT2G18350 -------------------------------LG------SRHV-----------------
Lj2g001028 QNSSLRFSHAAAGT--------------TPTAG------PVPV-----------------
Lj4g002150 HPSSLHHKHGFSSS----------------------------------------------
AT1G75240 ----YHHNQYGGG---------------GGRRP------PPPN---------------M-
Os01g06355 ---------------------------------------RAPP-----------------
Os05g05793 ---------------------------------------RVP------------------
AT1G69600 ------------GTE-------------SPPSR----HVSSPV----------P-CSYY-
AT5G60480 -------------------------------SQ----HRSPPS----------P-LQLQ-
AT3G28920 P-PL-------------------------PRSP----NSSSPP----------P-ISSS-
AT5G39760 P-PP-------------------------PRSP----NSASPP----------P-ISSS-
AT5G15210 P-HLA-------GIR-------------SPDDD----DSASPP----------P-ISSS-
Lj1g000619 FQPS-------------------------SRSP----NSASPP----------P-ISSSY
Lj2g000448 F----------------------------NRSP----NSASPP----------P-ISS--
Lj4g001229 LSSS----------Q-------------SPSPI----SSPSPP----------P-LSHM-
Lj2g002469 ISTS-------------------------PSSS----PSRSPP----------P-LSHHF
Os03g07185 ------------G----------------AGGA----GVGVAPML-PAPGGGGP-PGYM-
Lj2g000198 --------------------------------------VTCPP----------P-SSNM-
Os04g04345 ---------------------------FLPSSV----PAPAPP-----------------
Os08g04384 H----HHDDDAAADDDDSEDSEMSDYDDDRSAS----PLQAPP----------PVLSPGY
Os02g07066 ---------------------------LSPSAT----PTGANS-----------------
Os06g03372 VD----------GDESD-NNSDGSEY-YDERSV----SPPSPPHL-PA-----PVVHQPY
AT1G14687 ---------------------------------------RFPF-----------------
AT5G42780 --------------------------------------LKISS-----------------
Os12g02089 --------------------------------------AALPP----------P------
Pp3c11_223 ECLTLLRKR-----KS-----------VNPRDKDVNKSKSCPHEGGPFSTDASPVITVE-
Pp3c5_860 ECIGILQTAA-STQQS-----------ASSRD-----RPSVPDHG-RLSSEVSPVMNKE-
Pp3c5_889 -------------------------------------------------------MNKE-
Pp3c6_2830 --------------QR-----------ASSRE-----SQTMPDY------PLCPVRTTD-
AT1G14440 ----------------HQMIMPI------------------------G-VSNYR------
AT2G02540 ----------------QQMIMPI------------------------G-VTT--------
Lj4g002049 ----------------HHMIMPY-------------------------------------
Lj5g000796 ----------------QHYVLPLL---------------------PIL-AHSSL------
Os11g02433 --------------HHHQIIMPL-------------------------------------
AT1G18835 -------------------------------------------------AN---------
Lj1g001161 ------------------------------------------------------------
AT3G28917 -------------------------------------------------TG---------
AT1G74660 ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 -------------------------------------------------SG---------
Os11g01283 -------------------------------------------------SG---------
Os12g01245 -------------------------------------------------SG---------
AT4G24660 ----------------R----PP------------------------A-AS---------
Lj1g000953 ----------------RPLALPP------------------------A-GS---------
Lj1g002084 ----------------RPLALPP------------------------A-TS---------
Os08g04794 ----------------RPLALPS------------------------T-SHSGRD--EGD
Os09g04664 ----------------RPLALPS------------------------T-SHSGRD--DGD
Pp3c1_1529 -----------------QLALLS------------------------P-AP---------
Pp3c2_2116 -----------------QLALPT------------------------P-AQ---------
Pp3c7_1500 -----------------PLALPS------------------------P-SQ---------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ----------------ATLALPP------------------------S-AG---------
Pp3c21_110 ----------------GALALPP------------------------S-AG---------
Pp3c19_204 ----------------TTLALTA------------------------SVAG---------
Pp3c22_960 ----------------STLALTV------------------------GGAG---------
Lj2g001509 ----------------GTLALPS------------------------I-SG---------
Lj4g002796 ----------------ATLALPI------------------------G-GG---------
AT5G65410 ----------------LQLALPP------------------------P-QR---------
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890 ----------------KPRVMHP--------------------------TG---------
AT2G18350 ----------------SPIMMSF------------------------G-GG---------
Lj2g001028 ----------------PPLMMAF------------------------G-GA---------
Lj4g002150 ----------------PGLMMGF------------------------G-GG---------
AT1G75240 --------------MLNPLMLPPPPNYQPIH-------HHKYGMSPPGGGG---------
Os01g06355 -----------------ALLLPP------------------------A-AAASKQ-----
Os05g05793 ------------------LLMPP------------------------P-QPQPHPHPQHP
AT1G69600 -TSA----------PPHHVILSL------------------------S-SGFP-------
AT5G60480 -PLA----------PVPNLLLSL------------------------S-SGFF-------
AT3G28920 -----------------YMLLAL------------------------SG----NN-----
AT5G39760 -----------------YMLLSL------------------------SGTNNNNN-----
AT5G15210 -----------------YMLLAL------------------------SGGRGGAN-----
Lj1g000619 YPSA------------PHMLLAL------------------------S-TGLA-------
Lj2g000448 YPSA------------PHMLLAL------------------------SGAGLP-------
Lj4g001229 PPSA---------SAVPQMLLAL-------------------------GTAFS-------
Lj2g002469 PPSHHHLHSQISKSAPPHVQLAL-------------------------GT----------
Os03g07185 --------------HMAAMGGAV------------------------GGGGGV-------
Lj2g000198 --------------RDLSELTEY------------------------SGGG---------
Os04g04345 ----------------PQLALPY-------HAVPAAAWHHAAA----AAAG------R--
Os08g04384 LPSA------------THMLLSL------------------------GSAS---------
Os02g07066 -------------SRLMPLLLAPP------H-------MQK---RP--------------
Os06g03372 YPSA------------QHMLLSL------------------------GSSG------Q--
AT1G14687 -------------TSLRRV--------------------------------------K--
AT5G42780 -------------CQFRRIFCSP------------------------YGGG------K--
Os12g02089 ----------------TRMVMPL-------------------------------------
Pp3c11_223 -------------NGAISLLVPP------------------------S-PGRQFR--ED-
Pp3c5_860 -------------H---------------------------------------RR--RA-
Pp3c5_889 -------------H---------------------------------------RR--RA-
Pp3c6_2830 -------------N---------------------------------------RR--RS-
AT1G14440 -----------YMHNNSESEDFM-EEDGVTT-------------ASRSLPNLP-------
AT2G02540 ------------AGSNSESEDLM-EEEGGGS-------------LTFRQPPPPPSPYSYG
Lj4g002049 -----------NIGSLLPSESDE-QEDVAAG------------GMVGRPAGQN-------
Lj5g000796 ----------NKSGSISPSDQSD-EKDCDYG--------------IKRVENPK-------
Os11g02433 -----------NMIHTSESDEMD-VSGGGGG--------------VGRGGGSS-------
AT1G18835 ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917 ------------------------------------------------------------
AT1G74660 ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660 ---------------G----DEE-DTSNPSS----------S------------------
Lj1g000953 ---------------GGFRREEE-DMSNPSS----------S----------G-------
Lj1g002084 --------------GGGFSREEE-DMSNPSS----------SGGG------GG-------
Os08g04794 DMSGMVGP----M-VIGPMVGMSLGSAGPSG--------------------SG-------
Os09g04664 DLSGMVGP----MSAVGPLSGMSLG-AGPSG--------------------SG-------
Pp3c1_1529 -IMGRVTP-APYMLAHGLVDSDD-GDGGLSG--------------------SP-------
Pp3c2_2116 -IISRVTA-APFLLGPGPTDSDD-GDGGLSG--------------------SP-------
Pp3c7_1500 -MISPVNQFQHYLLGPRPANSGD-GDGGFGR--------------------SP-------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 -VMTPLTMAA--LSTGGPTDSDE-QDDGLGN----------SGGGMMMSMRSP-------
Pp3c21_110 -AMTPLTTAA--LSAGGLTDSDE-QDDGLGN----------SAGGMMISMRSP-------
Pp3c19_204 -QMTPLAMAA--LSAGGPTDSDE-QDDGPGNV--------TSGGGMMMSMRSP-------
Pp3c22_960 -QMSPLAMAA--LSAGGPTDSDE-QDDGPGNV--------TSGGGMMMSMRSP-------
Lj2g001509 --------------GGGPRE--D-QED-ISNL--------SAGGGS--------------
Lj4g002796 --------------PQGARELMD-HDD-VSDR--------SGGGGEG----SN-------
AT5G65410 ------------------ERSED-PMETSSA-------------------EAG-------
Os08g04381 ----------------------EWRDCSPES-----------------------------
Os09g04145 -------------------------DTSSSS-----------------------------
AT3G50890 -----------EIGRRTSSSSED-MKKILSHRNQN----VDGKSLMMMM-----------
AT2G18350 ---------------GGCAAESS-TEDLNKFH--------QSFSGYGVDQFHH-------
Lj2g001028 ------------------AAESS-SEDLNMF---------QRNAGAQ---EAA-------
Lj4g002150 --------------SGGPAAESS-SEDLNMFH--------QSNDGGQLSVQPP-------
AT1G75240 -MVTPMSVA--YGGGGGGAESSS-EDLNLYGQSSG----EGAGAAAGQMAFSM-------
Os01g06355 --GLPFPGYGTPSGGTGTTTASS-SDERLRP--------------------SP-------
Os05g05793 YLHPPFPYHHTPSGSGGTTTESS-SEERGPP--------------------SS-------
AT1G69600 ----------------GPSDQDP-----------------T---VVRSENSSR-------
AT5G60480 ----------------GPSDQEV-----------------KNKFTVERDVRKT-------
AT3G28920 -KTAPFSDL-----NFAAAAN---------------------------HLSAT-------
AT5G39760 -NLASFSDL-----NFSAGNNHH-----------------H------HHQHTL-------
AT5G15210 --------------------------------------------------TAV-------
Lj1g000619 ----------------APPENAA-----------------G------PT-----------
Lj2g000448 ----------------VPPENTA-----------------A------PTQTPG-------
Lj4g001229 ----------------TPPEN---------------------------------------
Lj2g002469 -------------------EN---------------------------------------
Os03g07185 ------------------------------------------------DGGGG-------
Lj2g000198 -------------------------------------------------GSGG-------
Os04g04345 --------------AGSETPPRM-DDFGPGS------------AGGSGSGGGG-------
Os08g04384 ----------------APAVAASRPHAAAAAM-----------GPPPPPGAAT-------
Os02g07066 -PVLPVSPA---SAPAALAESSS-EELRPPPLPSSHPHAHAAAVVAASASAPP-------
Os06g03372 ------------------------------------------------------------
AT1G14687 -----------QLARLKWKTAEE-------------RNEEEEDDTEETSTEEK-------
AT5G42780 -----------SEGKKKKKEKES-------------YGGDPIIKDRFGGAEEE-------
Os12g02089 -----------SAMHTSESDDAA------------------------ARPGGG-------
Pp3c11_223 -----------A-IGGSPTLIIS-ESCGIRNCN------LGMPRKLPLERVSH-------
Pp3c5_860 -----------Q-LQLSPSHLHI-QSNLLQVDR------ISAPNG-QAQNGSH-------
Pp3c5_889 -----------Q-LQLSPSHLHI-QSNLLQVDR------ISAPNG-QAQNGSH-------
Pp3c6_2830 -----------QSLQFN-------HNNLQQADR------TSAPNW-NAQQGSH-------
AT1G14440 YNQ-----KKRFRTKFTPEQKEKMLSFAEKVGW----KIQRQE------DCVVQRFCEEI
AT2G02540 HNQ-----KKRFRTKFTQEQKEKMISFAERVGW----KIQRQE------ESVVQQLCQEI
Lj4g002049 QIV-----KKRFRTKFTQEQKEKMLNFADKVGW----KIQKQE------ESAVQQFCQEI
Lj5g000796 ENV-----KKRSRTKFTQEQKEKLLGFAEKAGW----RIQKLE------ESVVHKFCQEV
Os11g02433 SSS-----KKRFRTKFTAEQKARMLEFAERVGW----RLQKLD------DAMVHHFCQEI
AT1G18835 ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917 ------------------------------------------------------------
AT1G74660 ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660 GGT-----TKRFRTKFTAEQKEKMLAFAERLGW----RIQKHD------DVAVEQFCAET
Lj1g000953 GGT-----KKRHRTRFTPEQKDKMLEFAERVGW----RIQKHD------EAAVEQFCEEA
Lj1g002084 GGM-----KKRYRTKFTPEQKEKMLAFAEELGW----RIQKHQ------EAAVEQFCAET
Os08g04794 SG------KKRFRTKFTQEQKDKMLAFAERLGW----RIQKHD------EAAVQQFCEEV
Os09g04664 SG------KKRFRTKFTQEQKDKMLAFAERVGW----RIQKHD------EAAVQQFCDEV
Pp3c1_1529 STM-----KKRFRTKFTNEQKEKMGVFAEKLGW----KIQKHD------EAAVQEFCAEV
Pp3c2_2116 STI-----KKRFRTRFNNEQKEKMGVFAEKLGW----KIQKHD------EAAVQEFCAEV
Pp3c7_1500 STM-----KKRFRTKFTSNQREKMGAFSEKLGW----RIQKHD------EPAVQEFCSDV
Pp3c3_2260 ---------------HTQVQPKLFLSFSEKLGW----RIQKHD------EPAVQEFCSVV
Pp3c18_128 SAI-----KKRFRTKFTNEQKDQMCAFAEKVGW----RIQKHD------EASVQEFCATA
Pp3c21_110 SAI-----KKRFRTKFSTEQKDQMCAFAEELGW----RIQKHD------EAAVQEFCTTV
Pp3c19_204 SAI-----KKRFRTKFTTEQKDKMCAFAEKLGW----RIQKHD------EAAVQEFCTTV
Pp3c22_960 SAI-----KKRFRTKFTTGQKDKMFAFAEN-AWVAHPEARRGC------RAGVLLTCGSQ
Lj2g001509 --S-----KKRFRTKFTQEQRDKMLDLAERLGW----RMQKHD------EGVVQDFCNET
Lj4g002796 SRA-----GKRFRTKFTHEQKEKMLEFAESAGW----RIQRHD------DNVVEEFCNEI
AT5G65410 GGI-----RKRHRTKFTAEQKERMLALAERIGW----RIQRQD------DEVIQRFCQET
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890 MRK-----KKRVRTKINEEQKEKMKEFAERLGW----RMQKKD------EEEIDKFCRMV
AT2G18350 YQP-----KKRFRTKFNEEQKEKMMEFAEKIGW----RMTKLE------DDEVNRFCREI
Lj2g001028 ALS-----RKRHRTKFSSQQKDRMMEFAEKIGW----RIQKQD------EEEVQQFCSQV
Lj4g002150 LSS-----KKRFRTKFTQQQKDRMMEFAEKLGW----KIQKQD------EEEVKQFCSHV
AT1G75240 SSS-----KKRFRTKFTTDQKERMMDFAEKLGW----RMNKQD------EEELKRFCGEI
Os01g06355 VQP-----RRRSRTTFTREQKEQMLAFAERVGW----RIQRQE------EATVEHFCAQV
Os05g05793 SAAAAQGRRKRFRTKFTPEQKEQMLAFAERVGW----RMQKQD------EALVEQFCAQV
AT1G69600 GAM-----RKRTRTKFTPEQKIKMRAFAEKAGW----KINGCD------EKSVREFCNEV
AT5G60480 AMI-----KKHKRTKFTAEQKVKMRGFAERAGW----KINGWD------EKWVREFCSEV
AT3G28920 PGS-----RKRFRTKFSSNQKEKMHEFADRIGW----KIQKRD------EDEVRDFCREI
AT5G39760 HGS-----RKRFRTKFSQFQKEKMHEFAERVGW----KMQKRD------EDDVRDFCRQI
AT5G15210 PMS-----RKRFRTKFSQYQKEKMFEFSERVGW----RMPKAD------DVVVKEFCREI
Lj1g000619 SSS-----RKRFRTKFSQEQKEKMLKFAERVGW----KMQKKD------EDFVQDFCNEI
Lj2g000448 SDS-----RKRFRTKFTPGQKEKMLEFAERVGW----KMQKRD------EDLVMEFCNEV
Lj4g001229 PTA-----KKRQRTKFTMEQKEKMQSFSEKLGW----RMQK-D------DGLVQKFCNDI
Lj2g002469 PSK-----KKRCRTKFSEEQKGKMLEFSEKLGW----RMQREE------EGSIQKFCDGI
Os03g07185 SGG-----RRRTRTKFTEEQKARMLRFAERLGW----RMPKREPGRAPGDDEVARFCREI
Lj2g000198 EGR-----RKRYRTKFTVEQKEKMLGFAEKLGW----KLQRKEL-----EGEIEAFCRSV
Os04g04345 IFG-----RKRFRTKFTPEQKERMREFAEKQGW----RINRND------DGALDRFCVEI
Os08g04384 SAS-----RKRFRTKFSPEQKQRMQALSERLGW----RLQKRD------EAVVDECCREI
Os02g07066 GPS-----KKRFRTKFTAEQKERMREFAHRVGW----RIHKPD------AAAVDAFCAQV
Os06g03372 ---------------------------AQRLPL---------------------------
AT1G14687 MTV-----QRRRKSKFTAEQREAMKDYAAKLGW----TLKDKRAL----REEIRVFCEGI
AT5G42780 EGI-----VKRLKTKFTAEQTEKMRDYAEKLRW----KVRPER------QEEVEEFCVEI
Os12g02089 AAA-----RKRFRTKFTAEQKARMLGFAEEVGW----RLQKLE------DAVVQRFCQEV
Pp3c11_223 --K-----LKRTRTRISLEQREKLNAFAEKAGW----TVVGQR------KETIDATCQYI
Pp3c5_860 PGK-----PKRKRTQLTDEQREKMKSYAEHAGW----TIVGQR------KENIAAACKDI
Pp3c5_889 PGK-----PKRKRTQLTDEQREKMKSYAEHAGW----TIVGQR------KENIAAACKDI
Pp3c6_2830 PGG-----PEIKRKQFFDEQRRKMKAYAEHVGW----TNFGQR------KENIAAACKDI
AT1G14440 GVKRRVLKVWM--HNNKIHF---S----------KKNNINLEDNDNEK------------
AT2G02540 GIRRRVLKVWM--HNNKQNL---S----------KKSN----------------------
Lj4g002049 GVKRRVLKVWM--HNNKHNL---A----------KKNNLPTTPSQP--------------
Lj5g000796 GIKRRVLKVWM--HNNKNTF---S----------KR-KLSTT------------------
Os11g02433 GVKRRVLKVWM--HNNKHNL---A----------KK-PLPSSPPPP--------------
AT1G18835 ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917 ------------------------------------------------------------
AT1G74660 ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660 GVRRQVLKIWM--HNNKNSL---D----------GYQVFK--------------------
Lj1g000953 CIKRHVLKVWM--HNNKHTL---G----------KKP-----------------------
Lj1g002084 CVRRNVLKVWM--HNNKNTL---G----------KKP-----------------------
Os08g04794 CVKRHVLKVWM--HNNKHTL---G----------KKAP----------------------
Os09g04664 GVKRHVLKVWM--HNNKHTL---G----------KKLP----------------------
Pp3c1_1529 GVKRHVLKVWM--HNNKNTI---G----------KKPA----------------------
Pp3c2_2116 GVKRHVLKVWM--HNNKHTI---G----------KKPP----------------------
Pp3c7_1500 GVKRHVLKVWM--HNNKNTL---G----------KKVD----------------------
Pp3c3_2260 GVKRHVLKVWM--HNNKNTL---G----------KKITC---------------------
Pp3c18_128 GIKRHVLKVWM--HNNKHTM---G----------KKPT----------------------
Pp3c21_110 GVKRHVLKVWM--HNNKHTV---G----------KKP-----------------------
Pp3c19_204 GVKRHVLKVWM--HNNKHTV---G----------KKP-----------------------
Pp3c22_960 AARSEGLDAQQQAHYGQEAI---NFLA------ETRR-----------------------
Lj2g001509 GVKRHVLKVWM--HNNKHTL---G----------KKP-----------------------
Lj4g002796 GVKRHVLKVWM--HNNKHTL---G----------KKP-----------------------
AT5G65410 GVPRQVLKVWL--HNNKHTL---G----------KSPSPLHHHQAP--------------
Os08g04381 -------------SSSASST---T------------------------------------
Os09g04145 -------------PSSSSSL---S------------------------------------
AT3G50890 NLRRQVFKVWM--HNNKQAM---K----------RNNSNI--------------------
AT2G18350 KVKRQVFKVWM--HNNKQAA---K----------KKD-L---------------------
Lj2g001028 GVRRKVFKVWM--HNNKQAM---K----------KLHQM---------------------
Lj4g002150 GVKRQAFKVWM--HNSKQAM---K----------KKQIM---------------------
AT1G75240 GVKRQVFKVWM--HNNKNNA---K----------KPPTPTT-------------------
Os01g06355 GVRRQALKVWM--HNNKHSF-------------KQKQQQENRQEQ---------------
Os05g05793 GVRRQVFKVWM--HNNKSSI---GSSSGGGSRRQPQEQQSQQQQQ---------------
AT1G69600 GIERGVLKVWM--HNNKYSLLNGK----------IREI----------------------
AT5G60480 GIERKVLKVWI--HNNKY-FNNGR----------SRDT----------------------
AT3G28920 GVDKGVLKVWM--HNNKNSF---K----------FSGGGATTVQRNDNGIGGE-------
AT5G39760 GVDKSVLKVWM--HNNKNTF---N----------RRDIAGNEIRQIDNGGGNHTPILAGE
AT5G15210 GVDKSVFKVWM--HNNKIS--------------------GRSGARRANGGVV--------
Lj1g000619 GIERGVLKVWM--HNNKNTF---G----------KRDG-----SNEINNNNINNN-----
Lj2g000448 GVDRGVLKVWM--HNNKNTF---G----------KRDH-----AAANGGAGGGDD-----
Lj4g001229 GVSRGVFKVWM--HNNKNSF---R----------RRSQ--------DQGDAPPPP--PQT
Lj2g002469 GVSREVFKVWM--HNNKSR-----------------------------------------
Os03g07185 GVNRQVFKVWM--HNHKAGG---G----------GGGGG-------SGGPG---------
Lj2g000198 GVSRQVFKVWM--HNHKNSC---S----------SNASA-------SNG-----------
Os04g04345 GVKRHVLKVWM--HNHKNQL---A----------SSPTSAAAAAAGVMNPGAG-------
Os08g04384 GVGKGVFKVWM--HNNKHNFL--G----GHSA--RRSAAAAAAAP---------------
Os02g07066 GVSRRVLKVWM--HNNKHLA---K----------TPPSPTSQPPPPPLHHDPSPP---PP
Os06g03372 ------------------------------------------------------------
AT1G14687 GVTRYHFKTWV--NNNKK------------------------------------------
AT5G42780 GVNRKNFRIWM--NNHKDKI---I------------------------------------
Os12g02089 GVKRRVLKVWM--HNNKHTL---A----------RRHLHPSPAAAAGDDDDDGAP-----
Pp3c11_223 GIEPKTLKYWI--HNSKQKW---K----------RQPSLSEDPSK---------------
Pp3c5_860 GVTPKTLKYWI--HNAKQKL---K----------R----SHDQAL---------------
Pp3c5_889 GVTPKTLKYWI--HNAKQKL---K----------R----SHDQAL---------------
Pp3c6_2830 GVTPKTLKYWI--HNAKQKL---K----------R----SHEQPLQHSHTAHTQT-----
AT1G14440 ------------------------------------------------------------
AT2G02540 ------------------------------------------------------------
Lj4g002049 ------------------------------------------------------------
Lj5g000796 ------------------------------------------------------------
Os11g02433 ------------------------------------------------------------
AT1G18835 ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917 ------------------------------------------------------------
AT1G74660 ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660 ------------------------------------------------------------
Lj1g000953 ------------------------------------------------------------
Lj1g002084 ------------------------------------------------------------
Os08g04794 ------------------------------------------------------------
Os09g04664 ------------------------------------------------------------
Pp3c1_1529 ------------------------------------------------------------
Pp3c2_2116 ------------------------------------------------------------
Pp3c7_1500 ------------------------------------------------------------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ------------------------------------------------------------
Pp3c21_110 ------------------------------------------------------------
Pp3c19_204 ------------------------------------------------------------
Pp3c22_960 ------------------------------------------------------------
Lj2g001509 ------------------------------------------------------------
Lj4g002796 ------------------------------------------------------------
AT5G65410 ------------------------------------------------------------
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890 ------------------------------------------------------------
AT2G18350 ------------------------------------------------------------
Lj2g001028 ------------------------------------------------------------
Lj4g002150 ------------------------------------------------------------
AT1G75240 ------------------------------------------------------------
Os01g06355 ------------------------------------------------------------
Os05g05793 ------------------------------------------------------------
AT1G69600 ------------------------------------------------------------
AT5G60480 ------------------------------------------------------------
AT3G28920 -NSNDDGVRG---------------------LA-NDGD----------------------
AT5G39760 INNHNNGHHG---------------------VG-GGGELHQSVSS---------------
AT5G15210 -------------------------------VG-GVGDSRQSVVP---------------
Lj1g000619 -INNAGSAKSFFAKENHDPVITN--ISSASEIN-GNGNRNHGAED---------------
Lj2g000448 -DGAGGGERA---------------------IN-GNGNGSAASQD---------------
Lj4g001229 EKNH--------------------------------------------------------
Lj2g002469 SSSEIGNEKK---------------------IN-GGGY----------------------
Os03g07185 -------------------------------AG-GGAQTSSSTTRG--------------
Lj2g000198 --------------------------------------NASSLTQ---------------
Os04g04345 -------------------------------IGLGTG-----------------------
Os08g04384 --------------LAPPPVL---------------------------------------
Os02g07066 PHHHHHHHHHH-------------------------------------------------
Os06g03372 ------------------------------------------------------------
AT1G14687 ------------------------------------------------------------
AT5G42780 ------------------------------------------------------------
Os12g02089 -------------PPHPDPRR--------RELAAAAAPPPAPVTQ---------------
Pp3c11_223 ------------------------------------------------------------
Pp3c5_860 ------------------------------------------------------------
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 ----------------------DWKFRVLSKIVEGVGTKYHTCSCCNLQRMFHSKNIHLG
AT1G14440 ------------------------------------------------INNLNNVDLS-G
AT2G02540 -------------------------------------------------NVSNNVDLSAG
Lj4g002049 ------------------------------------------------------------
Lj5g000796 ------------------------------------------------------------
Os11g02433 -------------------------------------------------PQIPPMSMP-P
AT1G18835 ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917 ------------------------------------------------------------
AT1G74660 ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660 ------------------------------------------------------------
Lj1g000953 ------------------------------------------------------------
Lj1g002084 ------------------------------------------------------------
Os08g04794 ------------------------------------------------------------
Os09g04664 ------------------------------------------------------------
Pp3c1_1529 ------------------------------------------------------------
Pp3c2_2116 ------------------------------------------------------------
Pp3c7_1500 ------------------------------------------------------------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ------------------------------------------------------------
Pp3c21_110 ------------------------------------------------------------
Pp3c19_204 ------------------------------------------------------------
Pp3c22_960 ------------------------------------------------------------
Lj2g001509 ------------------------------------------------------------
Lj4g002796 ------------------------------------------------------------
AT5G65410 ---------------------------------------------------------PPP
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890 ------------------------------------------------------------
AT2G18350 ------------------------------------------------------------
Lj2g001028 ------------------------------------------------------------
Lj4g002150 ------------------------------------------------------------
AT1G75240 ------------------------------------------------------------
Os01g06355 ------------------------------------------------------------
Os05g05793 ------------------------------------------------------------
AT1G69600 ------------------------------------------EHGLCLNTHSNDGD----
AT5G60480 ------------------------------------------TSSMSLNLKL--------
AT3G28920 ------------------------------------------GGGGRFESDSGGAD----
AT5G39760 -----------------------------------------GGGGGGFDSDSGGAN----
AT5G15210 ------------------------------------------------------------
Lj1g000619 ------------------------------------------PIHHHFQNDGGA------
Lj2g000448 ------------------------------------------P--NQYENDSGT------
Lj4g001229 --------------------------------------------GGCFDSDINNNDIHMN
Lj2g002469 --------------------------------------------GFQLLSDINNPH----
Os03g07185 ----------------------------------------GGDVGVGLSPAMGGDG----
Lj2g000198 ------------------------------------------------------------
Os04g04345 -----------------------------------------LGTGISGDGDGDDDDTDDS
Os08g04384 ------------------------------------------------------------
Os02g07066 -----------------------------------------------------------H
Os06g03372 ------------------------------------------------------------
AT1G14687 ------------------------------------------------------------
AT5G42780 ------------------------------------------------------------
Os12g02089 -------------------------------------------------------HIKKS
Pp3c11_223 ------------------------------------------------------------
Pp3c5_860 ------------------------------------------------------------
Pp3c5_889 ------------------------------------------------------------
Pp3c6_2830 AVAKNCQAAVRHVFSHGQNACMLRTSAQELHDVPVPDGCKHPFLGSLITITLFKSNVATD
AT1G14440 NNDMTKIV------------------------P
AT2G02540 NNDITENLAST--------------------NP
Lj4g002049 --------------------------------S
Lj5g000796 ---------------------------------
Os11g02433 SPPPPQIPPMSMPPSPPPMPMPMPPSPPQLKLE
AT1G18835 --------------------------------N
Lj1g001161 ---------------------------------
AT3G28917 --------------------------------N
AT1G74660 --------------------------------M
Lj2g002499 --------------------------------Y
Lj4g000050 ---------------------------------
Lj1g001230 --------------------------------R
Os11g01283 ---------TG--------------------RR
Os12g01245 ---------TG--------------------RR
AT4G24660 -----RYEATS--------------------AH
Lj1g000953 ---------------------------------
Lj1g002084 ---------------------------------
Os08g04794 ---------------------------------
Os09g04664 ---------------------------------
Pp3c1_1529 ---------------------------------
Pp3c2_2116 ---------------------------------
Pp3c7_1500 ----------Q--------------------VE
Pp3c3_2260 ------HGNAG--------------------LS
Pp3c18_128 ---------------------------------
Pp3c21_110 ---------------------------------
Pp3c19_204 ---------------------------------
Pp3c22_960 ---------------------------------
Lj2g001509 ---------------------------------
Lj4g002796 ---------------------------------
AT5G65410 PPQSSFHHEQD--------------------QP
Os08g04381 -------------------------------AS
Os09g04145 -------------------------------SE
AT3G50890 -------------------------------SE
AT2G18350 ---------------------------------
Lj2g001028 ---------------------------------
Lj4g002150 ---------------------------------
AT1G75240 -------------------------------TL
Os01g06355 -------------------------------QQ
Os05g05793 -------------------------------QQ
AT1G69600 --------GSS--------------------SS
AT5G60480 ---------------------------------
AT3G28920 -GGGNVNASSS--------------------SS
AT5G39760 --GGNVNGSSS--------------------S-
AT5G15210 -----TNGSFS--------------------ST
Lj1g000619 --TVRANGSSS--------------------S-
Lj2g000448 --NGATNGSSS--------------------SS
Lj4g001229 QDHSTVNVHFS--------------------S-
Lj2g002469 ------SRNSS--------------------TD
Os03g07185 EDDEEVRGSEM--------------------CM
Lj2g000198 ---------------------------------
Os04g04345 PPRAAVSSPSPSPI-----------------SV
Os08g04384 -TDFSINGSPQ--------------------ST
Os02g07066 PPQHHQQQQQQ-------------------HDA
Os06g03372 ---------------------------------
AT1G14687 ------------------------------FYH
AT5G42780 ------------------------------IDE
Os12g02089 VDNKSLISSLAALHCIALLLF-------HQIDA
Pp3c11_223 ---------------------------------
Pp3c5_860 ---------------------------------
Pp3c5_889 ---------------------------------
Pp3c6_2830 SDALSKIHETS------------------RNQC
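Looking at this alignment, gaps are very frequent everywhere except the zinc finger (CHCC3H2, CX2NHAX3GX4DGCXEFX8~15CX2CXCHRXFH) and homeobox regions. With this many gaps it is unclear what is actually being compared, so let's trim the alignment down to the region we want to compare.
To download the MAFFT 7 result, choose clustal under Reformat -> Output Sequence format. If you used CLUSTAL W on Genome Net, download the .aln file.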
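As an aside, the zinc finger consensus quoted above can be written as a regular expression to check whether a given (ungapped) protein sequence carries the motif. This is a minimal Python sketch, assuming that X means any residue and that X8~15 means 8 to 15 residues. The function name find_zinc_finger is made up for illustration, and the pattern is just my reading of the notation, not a curated domain profile.

```python
import re

# CX2NHAX3GX4DGCXEFX8~15CX2CXCHRXFH rewritten as a regular expression.
# Assumption: "X" = any single residue, "Xn" = n residues, "X8~15" = 8 to 15 residues.
ZF_MOTIF = re.compile(r"C.{2}NHA.{3}G.{4}DGC.EF.{8,15}C.{2}C.CHR.FH")


def find_zinc_finger(seq):
    """Return (start, end) of the first motif match in seq, or None if absent.

    Gap characters ("-") are removed first, so an aligned row can be passed in,
    but the coordinates then refer to the ungapped sequence.
    """
    ungapped = seq.replace("-", "").upper()
    match = ZF_MOTIF.search(ungapped)
    return (match.start(), match.end()) if match else None
```

Note that this only makes sense on full-length sequences. Once columns have been trimmed out of an alignment, the spacing between the conserved residues no longer corresponds to the consensus, so a trimmed row may fail to match.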
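③ Trim
You could trim by eye with reference to the literature, but that gets laborious when there are many sequences, so I recommend trimAl (https://vicfero.github.io/trimal/ ).
Following the instructions on the trimAl site, download the files via GitHub and install the program.
Use trimAl to trim the downloaded file (readseq.txt) with the parameter -gt 0.95.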
(base) hanano@172 ~ % trimal -in readseq.txt -out demo-trimmed095.out -gt 0.95
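To get a feel for what -gt does: in the trimAl help, the gap threshold is described as 1 minus the fraction of sequences allowed to have a gap, so -gt 0.95 keeps only the columns in which at least 95% of the sequences have a residue. The idea can be sketched in Python as below. This is a simplified illustration and not trimAl's actual implementation; the function name trim_by_gap_threshold and the dict-based input are assumptions made for the example.

```python
def trim_by_gap_threshold(aligned, gt):
    """Keep only columns whose fraction of non-gap residues is at least gt.

    aligned: dict mapping sequence name -> aligned sequence (all the same length).
    gt:      value between 0 and 1, used like the -gt option (simplified).
    """
    names = list(aligned)
    n_seqs = len(names)
    length = len(aligned[names[0]])
    if any(len(aligned[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")

    kept_columns = []
    for col in range(length):
        non_gap = sum(1 for name in names if aligned[name][col] != "-")
        if non_gap / n_seqs >= gt:
            kept_columns.append(col)

    # Rebuild every sequence from the kept columns only.
    return {name: "".join(aligned[name][c] for c in kept_columns) for name in names}
```

A lower threshold tolerates more gaps per column, so more columns survive; that is why relaxing the value slightly can bring back positions at the edge of a conserved region.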
The trimal run produced the following file (demo-trimmed095.out). The output appears to be the alignment trimmed down to the zinc finger region.
CLUSTAL W (1.8) multiple sequence alignment
AT1G14440 NHAAAMGGNATDGCGEFMPSGACSACNCHRNFHK
AT2G02540 NHAATMGGNAIDGCGEFMPSGACSVCNCHRNFHR
Lj4g0020493 NHAAGMGGNATDGCGEFMPSGACSACNCHRNFHK
Lj5g0007965 NHAAAIGGNATDGCCEFMPAGACSACNCHRNFHK
Os11g0243300 NHAAAIGGNATDGCGEFMPSGACSACGCHRNFHK
AT1G18835 NHAANIGGYAVDGCREFMASGACAACGCHRNFHR
Lj1g0011612 NHAASIGGYAVDGCREFMASAACAACGCHRNFHR
AT3G28917 NHAAAVGGYAVDGCREFMASRACAACGCHRSFHR
AT1G74660 NHAANIGGYAVDGCREFMAAGACAACGCHRNFHK
Lj2g0024999 NHAANVGGYAVDGCREFMASGSCAACGCHRNFHK
Lj4g0000502 NHAAYSGGYAVDGCREFMASAACAACGCHRNFHR
Lj1g0012309 NHAVNVGGYAVDGCREFMASGACAAYGCHRSFYK
Os11g0128300 NHAASIGGHAVDGCREFMASGACAACGCHRSFHR
Os12g0124500 NHAASIGGHAVDGCREFMASGACAACGCHQSFHR
AT4G24660 NHAVNIGGHAVDGCCEFMPSGACAACGCHRNFHK
Lj1g0009539 NHAVGIGGHAVDGCCEFLAAGACAACNCHRNFHK
Lj1g0020844 NHAVSFGGHAVDGCCEFMAAGA------------
Os08g0479400 NHAVGIGGHAVDGCGEFMASGACAACGCHRNFHK
Os09g0466400 NHAVGIGGHAVDGCGEFMAAGACAACNCHRNFHK
Pp3c1_15290 NHAITTGGYVVDGCGEFMPGGACAACDCHRNFHK
Pp3c2_21160 NHAISTGGYAVDGCGEFMPGGACAACDCHRNFHK
Pp3c7_15000 NHAIFSGGYAVDGCGEFMPSGSCAACDCHRNYHK
Pp3c18_12880 NHAASIGGHALDGCGEFMPGGACAACDCHRNFHR
Pp3c21_11010 NHAASIGGHALDGCGEFMPGGACAACDCHRNFHR
Pp3c19_20410 NHAAGMGGHAMDGCGEFMPGGACAACNCHRNFHR
Pp3c22_9600 NHAAGIGAHAIDGCGEFMPGGACAACNCHRNFHR
Lj2g0015097 NHAVGIGGHALDGCGEFMPAGACAACNCHRNFHR
Lj4g0027969 NHAVGMGGYALDGCLEFMAAGACAACDCHRNFHK
AT5G65410 NQAVNIGGHAVDGCGEFMPAGACAACGCHRNFHK
Os08g0438100 NHAARMGGHAVDGCREFLAEGACAACGCHRSFHR
Os09g0414500 NHAASTGGHAVDGCREFIA-AACAACGCHRSFHR
AT3G50890 NHAASTGGHVVDGCCEFMAGGACAACNCHRSFHK
AT2G18350 NHAASSGGHVVDGCGEFMSSGSCAACDCHRSFHK
Lj2g0010289 NHAASMGSHVVDGCGEFMPSGACAACECHRNFHK
Lj4g0021504 NHAARLGSHVTDGCGEFMPNGSCAACECHRNFHK
AT1G75240 NHAASVGGSVHDGCGEFMPSGACAACDCHRNFHK
Os01g0635550 NHAAASGGHVVDGCGEFMPASPCAACGCHRSFHR
Os05g0579300 NHAAAMGGHVVDGCREFMPGDACAACGCHRSFHK
AT1G69600 NHAANLGGHALDGCGEFMPSPSCAACGCHRNFHR
AT5G60480 NHAVSLGGHALDGCGEFTPKSSCDACGCHRNFHR
AT3G28920 NHAAAIGGHALDGCGEFMPSPSCAACGCHRNFHR
AT5G39760 NHAAALGGHALDGCGEFMPSPSCAACGCHRNFHR
AT5G15210 NHAAGIGGHALDGCGEFMPSPSCAACGCHRNFHR
Lj1g0006197 NHAANLGGHALDGCGEFMPAPSCAACGCHRNFHR
Lj2g0004482 NHVASLGGHALDGCGEFMPSPSCAACGCHRNFHR
Lj4g0012297 NHAASLGGHALDGCGEFMPSSSCAACGCHRNFHR
Lj2g0024695 NHAASLGAHALDGCGEFMPSASCAACGCHRNFHR
Os03g0718500 NHAAKLGTYANDGCCEYTPDDGCAACGCHRNFHK
Lj2g0001982 NHAATLGSYATDGCGEFTLDDGCAACGCHRNFHK
Os04g0434500 NHAAAMGGQAFDGCGEYMPASSCAACGCHRSFHR
Os08g0438400 NHAASLGGHAVDGCGEFMPSPSCAACGCHRNFHR
Os02g0706600 NHAARMGAHVLDGCGEFMSSPACAACGCHRSFHR
Os06g0337200 NHAASLGGHGAGRLRGVVVGGSCAACGCHCNFHW
AT1G14687 NHAAKLGSYAIDGCREYSQST-CVACGCHRSYHR
AT5G42780 NHAADIGTTAYDGCGEFVSSTSCAACGCHRNFHE
Os12g0208900 -------------------SGASPYLGLHHDHHQ
Pp3c11_22370 NTCVARGPSSVDRFTKFLSSGACPPCGCHRNFHR
Pp3c5_860 NQALDTANHCVDGCGEFMRRGACMACGCHRSYHR
Pp3c6_28300 NHALDGVNHCIDGCGEFMRRGACMACGCHRRYHR
The first C of CHCC3H2 is missing, so it may be worth relaxing the threshold a little. Let's try -gt 0.92.
(base) hanano@172 ~ % trimal -in readseq.txt -out demo-trimmed092.out -gt 0.92
We will draw the phylogenetic tree from this output.
CLUSTAL W (1.8) multiple sequence alignment
AT1G14440 KYKECLKNHAAAMGGNATDGCGEFMPSGALTCSACNCHRNFHK
AT2G02540 KYKECLKNHAATMGGNAIDGCGEFMPSGALTCSVCNCHRNFHR
Lj4g0020493 RYRECLKNHAAGMGGNATDGCGEFMPSGALNCSACNCHRNFHK
Lj5g0007965 SYKECLKNHAAAIGGNATDGCCEFMPAGALKCSACNCHRNFHK
Os11g0243300 KYRECLKNHAAAIGGNATDGCGEFMPSGALKCSACGCHRNFHK
AT1G18835 RYVECQKNHAANIGGYAVDGCREFMASGALTCAACGCHRNFHR
Lj1g0011612 RYGECQKNHAASIGGYAVDGCREFMASAALTCAACGCHRNFHR
AT3G28917 RYGECQKNHAAAVGGYAVDGCREFMASRALTCAACGCHRSFHR
AT1G74660 RYVECQKNHAANIGGYAVDGCREFMAAGALRCAACGCHRNFHK
Lj2g0024999 KYGECQKNHAANVGGYAVDGCREFMASGSLACAACGCHRNFHK
Lj4g0000502 -------NHAAYSGGYAVDGCREFMASAALTCAACGCHRNFHR
Lj1g0012309 RYGECQKNHAVNVGGYAVDGCREFMASGALTCAAYGCHRSFYK
Os11g0128300 RYRECQRNHAASIGGHAVDGCREFMASGALLCAACGCHRSFHR
Os12g0124500 RYRECQRNHAASIGGHAVDGCREFMASGALLCAACGCHQSFHR
AT4G24660 RYRECLKNHAVNIGGHAVDGCCEFMPSGALKCAACGCHRNFHK
Lj1g0009539 RYRECQKNHAVGIGGHAVDGCCEFLAAGAVICAACNCHRNFHK
Lj1g0020844 RYRECQKNHAVSFGGHAVDGCCEFMAAGA--------------
Os08g0479400 RYRECLKNHAVGIGGHAVDGCGEFMASGALRCAACGCHRNFHK
Os09g0466400 RYRECLKNHAVGIGGHAVDGCGEFMAAGALRCAACNCHRNFHK
Pp3c1_15290 RYRECNRNHAITTGGYVVDGCGEFMPGGALRCAACDCHRNFHK
Pp3c2_21160 RYRECNRNHAISTGGYAVDGCGEFMPGGALKCAACDCHRNFHK
Pp3c7_15000 SYKECNRNHAIFSGGYAVDGCGEFMPSGSLKCAACDCHRNYHK
Pp3c18_12880 RYRECQKNHAASIGGHALDGCGEFMPGGALRCAACDCHRNFHR
Pp3c21_11010 RYRECQKNHAASIGGHALDGCGEFMPGGALRCAACDCHRNFHR
Pp3c19_20410 RYRECQKNHAAGMGGHAMDGCGEFMPGGALRCAACNCHRNFHR
Pp3c22_9600 RYRECQKNHAAGIGAHAIDGCGEFMPGGALRCAACNCHRNFHR
Lj2g0015097 RYRECLKNHAVGIGGHALDGCGEFMPAGALKCAACNCHRNFHR
Lj4g0027969 KYRECLKNHAVGMGGYALDGCLEFMAAGALKCAACDCHRNFHK
AT5G65410 RFRECLKNQAVNIGGHAVDGCGEFMPAGALKCAACGCHRNFHK
Os08g0438100 RYGECRRNHAARMGGHAVDGCREFLAEGALRCAACGCHRSFHR
Os09g0414500 RYGECRRNHAASTGGHAVDGCREFIA-AALKCAACGCHRSFHR
AT3G50890 KYRECQKNHAASTGGHVVDGCCEFMAGGALKCAACNCHRSFHK
AT2G18350 RYRECQKNHAASSGGHVVDGCGEFMSSGSLLCAACDCHRSFHK
Lj2g0010289 RYRECLRNHAASMGSHVVDGCGEFMPSGALKCAACECHRNFHK
Lj4g0021504 RYRECLRNHAARLGSHVTDGCGEFMPNGSLICAACECHRNFHK
AT1G75240 RYRECLKNHAASVGGSVHDGCGEFMPSGALRCAACDCHRNFHK
Os01g0635550 RYHECLRNHAAASGGHVVDGCGEFMPASPLACAACGCHRSFHR
Os05g0579300 RYHECLRNHAAAMGGHVVDGCREFMPGDALKCAACGCHRSFHK
AT1G69600 CYKECLKNHAANLGGHALDGCGEFMPSPSLRCAACGCHRNFHR
AT5G60480 LYNECLKNHAVSLGGHALDGCGEFTPKSSLRCDACGCHRNFHR
AT3G28920 TYKECLKNHAAAIGGHALDGCGEFMPSPSLKCAACGCHRNFHR
AT5G39760 TYKECLKNHAAALGGHALDGCGEFMPSPSLKCAACGCHRNFHR
AT5G15210 TYKECLKNHAAGIGGHALDGCGEFMPSPSLTCAACGCHRNFHR
Lj1g0006197 AYKECLKNHAANLGGHALDGCGEFMPAPSLKCAACGCHRNFHR
Lj2g0004482 TYKECLKNHVASLGGHALDGCGEFMPSPSIKCAACGCHRNFHR
Lj4g0012297 SFKECLKNHAASLGGHALDGCGEFMPSSSLKCAACGCHRNFHR
Lj2g0024695 SYKECLRNHAASLGAHALDGCGEFMPSASLTCAACGCHRNFHR
Os03g0718500 VYRECMRNHAAKLGTYANDGCCEYTPDDGLLCAACGCHRNFHK
Lj2g0001982 LYRECLRNHAATLGSYATDGCGEFTLDDGLQCAACGCHRNFHK
Os04g0434500 KYKECMRNHAAAMGGQAFDGCGEYMPASSLKCAACGCHRSFHR
Os08g0438400 VYRECLKNHAASLGGHAVDGCGEFMPSPSLKCAACGCHRNFHR
Os02g0706600 RYRECLKNHAARMGAHVLDGCGEFMSSPALACAACGCHRSFHR
Os06g0337200 VYQECPKNHAASLGGHGAGRLRGVVVGGSLMCAACGCHCNFHW
AT1G14687 VYRECMRNHAAKLGSYAIDGCREYSQST---CVACGCHRSYHR
AT5G42780 HYYECRKNHAADIGTTAYDGCGEFVSSTSLNCAACGCHRNFHE
Os12g0208900 --------------------------SGAAASPYLGLHHDHHQ
Pp3c11_22370 ECNQCQKNTCVARGPSSVDRFTKFLSSGALTCPPCGCHRNFHR
Pp3c5_860 VYKECQKNQALDTANHCVDGCGEFMRRGALQCMACGCHRSYHR
Pp3c6_28300 VCKECQNNHALDGVNHCIDGCGEFMRRGALQCMACGCHRRYHR
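Next, rewrite this file in FASTA format.
I save a copy under the name demo-trimmed0.92.fasta and then do the following in a text editor (Atom-2).
1. Delete the first line, CLUSTAL W (1.8) multiple sequence alignment.
2. Replace every line break with a line break followed by ">".
3. Replace each run of spaces with a line break.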
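The same conversion can also be scripted instead of done by search-and-replace. Below is a minimal Python sketch, assuming the trimmed CLUSTAL file has the layout shown above (a header line starting with CLUSTAL, then one "name sequence" pair per line, possibly split into several blocks). The function name clustal_to_fasta is hypothetical.

```python
def clustal_to_fasta(clustal_text):
    """Convert CLUSTAL-format alignment text into FASTA-format text.

    Sequence chunks that belong to the same name (in multi-block files)
    are concatenated in the order they appear.
    """
    records = {}  # name -> sequence; dicts keep insertion order
    for line in clustal_text.splitlines():
        if not line.strip() or line.startswith("CLUSTAL"):
            continue  # skip blank lines and the header
        if line[0].isspace():
            continue  # skip conservation lines, which start with spaces
        parts = line.split()
        if len(parts) < 2:
            continue  # not a "name sequence" line
        name, chunk = parts[0], parts[1]
        records[name] = records.get(name, "") + chunk
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())
```

For example, reading demo-trimmed092.out as text and writing the return value to demo-trimmed0.92.fasta should give the same kind of file as the manual steps.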
The edited alignment file looks like this.
>AT1G14440
KYKECLKNHAAAMGGNATDGCGEFMPSGALTCSACNCHRNFHK
>AT2G02540
KYKECLKNHAATMGGNAIDGCGEFMPSGALTCSVCNCHRNFHR
>Lj4g0020493
RYRECLKNHAAGMGGNATDGCGEFMPSGALNCSACNCHRNFHK
>Lj5g0007965
SYKECLKNHAAAIGGNATDGCCEFMPAGALKCSACNCHRNFHK
>Os11g0243300
KYRECLKNHAAAIGGNATDGCGEFMPSGALKCSACGCHRNFHK
>AT1G18835
RYVECQKNHAANIGGYAVDGCREFMASGALTCAACGCHRNFHR
>Lj1g0011612
RYGECQKNHAASIGGYAVDGCREFMASAALTCAACGCHRNFHR
>AT3G28917
RYGECQKNHAAAVGGYAVDGCREFMASRALTCAACGCHRSFHR
>AT1G74660
RYVECQKNHAANIGGYAVDGCREFMAAGALRCAACGCHRNFHK
>Lj2g0024999
KYGECQKNHAANVGGYAVDGCREFMASGSLACAACGCHRNFHK
>Lj4g0000502
-------NHAAYSGGYAVDGCREFMASAALTCAACGCHRNFHR
>Lj1g0012309
RYGECQKNHAVNVGGYAVDGCREFMASGALTCAAYGCHRSFYK
>Os11g0128300
RYRECQRNHAASIGGHAVDGCREFMASGALLCAACGCHRSFHR
>Os12g0124500
RYRECQRNHAASIGGHAVDGCREFMASGALLCAACGCHQSFHR
>AT4G24660
RYRECLKNHAVNIGGHAVDGCCEFMPSGALKCAACGCHRNFHK
>Lj1g0009539
RYRECQKNHAVGIGGHAVDGCCEFLAAGAVICAACNCHRNFHK
>Lj1g0020844
RYRECQKNHAVSFGGHAVDGCCEFMAAGA--------------
>Os08g0479400
RYRECLKNHAVGIGGHAVDGCGEFMASGALRCAACGCHRNFHK
>Os09g0466400
RYRECLKNHAVGIGGHAVDGCGEFMAAGALRCAACNCHRNFHK
>Pp3c1_15290
RYRECNRNHAITTGGYVVDGCGEFMPGGALRCAACDCHRNFHK
>Pp3c2_21160
RYRECNRNHAISTGGYAVDGCGEFMPGGALKCAACDCHRNFHK
>Pp3c7_15000
SYKECNRNHAIFSGGYAVDGCGEFMPSGSLKCAACDCHRNYHK
>Pp3c18_12880
RYRECQKNHAASIGGHALDGCGEFMPGGALRCAACDCHRNFHR
>Pp3c21_11010
RYRECQKNHAASIGGHALDGCGEFMPGGALRCAACDCHRNFHR
>Pp3c19_20410
RYRECQKNHAAGMGGHAMDGCGEFMPGGALRCAACNCHRNFHR
>Pp3c22_9600
RYRECQKNHAAGIGAHAIDGCGEFMPGGALRCAACNCHRNFHR
>Lj2g0015097
RYRECLKNHAVGIGGHALDGCGEFMPAGALKCAACNCHRNFHR
>Lj4g0027969
KYRECLKNHAVGMGGYALDGCLEFMAAGALKCAACDCHRNFHK
>AT5G65410
RFRECLKNQAVNIGGHAVDGCGEFMPAGALKCAACGCHRNFHK
>Os08g0438100
RYGECRRNHAARMGGHAVDGCREFLAEGALRCAACGCHRSFHR
>Os09g0414500
RYGECRRNHAASTGGHAVDGCREFIA-AALKCAACGCHRSFHR
>AT3G50890
KYRECQKNHAASTGGHVVDGCCEFMAGGALKCAACNCHRSFHK
>AT2G18350
RYRECQKNHAASSGGHVVDGCGEFMSSGSLLCAACDCHRSFHK
>Lj2g0010289
RYRECLRNHAASMGSHVVDGCGEFMPSGALKCAACECHRNFHK
>Lj4g0021504
RYRECLRNHAARLGSHVTDGCGEFMPNGSLICAACECHRNFHK
>AT1G75240
RYRECLKNHAASVGGSVHDGCGEFMPSGALRCAACDCHRNFHK
>Os01g0635550
RYHECLRNHAAASGGHVVDGCGEFMPASPLACAACGCHRSFHR
>Os05g0579300
RYHECLRNHAAAMGGHVVDGCREFMPGDALKCAACGCHRSFHK
>AT1G69600
CYKECLKNHAANLGGHALDGCGEFMPSPSLRCAACGCHRNFHR
>AT5G60480
LYNECLKNHAVSLGGHALDGCGEFTPKSSLRCDACGCHRNFHR
>AT3G28920
TYKECLKNHAAAIGGHALDGCGEFMPSPSLKCAACGCHRNFHR
>AT5G39760
TYKECLKNHAAALGGHALDGCGEFMPSPSLKCAACGCHRNFHR
>AT5G15210
TYKECLKNHAAGIGGHALDGCGEFMPSPSLTCAACGCHRNFHR
>Lj1g0006197
AYKECLKNHAANLGGHALDGCGEFMPAPSLKCAACGCHRNFHR
>Lj2g0004482
TYKECLKNHVASLGGHALDGCGEFMPSPSIKCAACGCHRNFHR
>Lj4g0012297
SFKECLKNHAASLGGHALDGCGEFMPSSSLKCAACGCHRNFHR
>Lj2g0024695
SYKECLRNHAASLGAHALDGCGEFMPSASLTCAACGCHRNFHR
>Os03g0718500
VYRECMRNHAAKLGTYANDGCCEYTPDDGLLCAACGCHRNFHK
>Lj2g0001982
LYRECLRNHAATLGSYATDGCGEFTLDDGLQCAACGCHRNFHK
>Os04g0434500
KYKECMRNHAAAMGGQAFDGCGEYMPASSLKCAACGCHRSFHR
>Os08g0438400
VYRECLKNHAASLGGHAVDGCGEFMPSPSLKCAACGCHRNFHR
>Os02g0706600
RYRECLKNHAARMGAHVLDGCGEFMSSPALACAACGCHRSFHR
>Os06g0337200
VYQECPKNHAASLGGHGAGRLRGVVVGGSLMCAACGCHCNFHW
>AT1G14687
VYRECMRNHAAKLGSYAIDGCREYSQST---CVACGCHRSYHR
>AT5G42780
HYYECRKNHAADIGTTAYDGCGEFVSSTSLNCAACGCHRNFHE
>Os12g0208900
--------------------------SGAAASPYLGLHHDHHQ
>Pp3c11_22370
ECNQCQKNTCVARGPSSVDRFTKFLSSGALTCPPCGCHRNFHR
>Pp3c5_860
VYKECQKNQALDTANHCVDGCGEFMRRGALQCMACGCHRSYHR
>Pp3c6_28300
VCKECQNNHALDGVNHCIDGCGEFMRRGALQCMACGCHRRYHR
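We use this file for the re-alignment.
④ Re-align
Upload demo-trimmed0.92.fasta to the MAFFT version 7 online server (https://mafft.cbrc.jp/alignment/server/), select G-INS-1 (Slow; progressive method with an accurate guide tree), and run the alignment.
Click Phylogenetic Tree to have a tree drawn.
This gives a phylogenetic tree based on the conserved zinc finger region.
⑤ Draw the phylogenetic tree from the aligned sequences
You could treat the figure above as finished, but iTOL Visualize (https://itol.embl.de) produces a much cleaner figure.
Copy the text shown on the left side of the previous window and paste it into a text file named demo.dnd. If you aligned with CLUSTAL W on Genome Net, download the clustalw.dnd file instead.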
((((((((((((((((
1_AT1G14440
:0.04988,
3_Lj4g0020493
:0.04988):0.01677,
5_Os11g0243300
:0.06665):0.02832,
4_Lj5g0007965
:0.09497):0.02779,
2_AT2G02540
:0.12276):0.05054,((((
15_AT4G24660
:0.06653,
29_AT5G65410
:0.06653):0.03174,((
18_Os08g0479400
:0.03087,
19_Os09g0466400
:0.03087):0.03448,
27_Lj2g0015097
:0.06535):0.03292):0.03277,((
23_Pp3c18_12880
:0.00000,
24_Pp3c21_11010
:0.00000):0.05124,(
25_Pp3c19_20410
:0.03264,
26_Pp3c22_9600
:0.03264):0.01860):0.07979):0.02577,((
34_Lj2g0010289
:0.08986,
35_Lj4g0021504
:0.08986):0.05512,
36_AT1G75240
:0.14498):0.01182):0.01650):0.01688,
28_Lj4g0027969
:0.19018):0.01166,((
20_Pp3c1_15290
:0.03264,
21_Pp3c2_21160
:0.03264):0.08093,
22_Pp3c7_15000
:0.11357):0.08827):0.03285,((
16_Lj1g0009539
:0.12082,
17_Lj1g0020844
:0.12082):0.06694,(
32_AT3G50890
:0.12443,
33_AT2G18350
:0.12443):0.06332):0.04693):0.04360,((((((
6_AT1G18835
:0.04287,
9_AT1G74660
:0.04287):0.03828,
10_Lj2g0024999
:0.08115):0.01458,(
7_Lj1g0011612
:0.05027,
8_AT3G28917
:0.05027):0.04546):0.02993,
12_Lj1g0012309
:0.12565):0.04811,((
13_Os11g0128300
:0.01673,
14_Os12g0124500
:0.01673):0.12467,(
30_Os08g0438100
:0.10497,
31_Os09g0414500
:0.10497):0.03643):0.03237):0.02601,
11_Lj4g0000502
:0.19978):0.07851):0.00752,((((((((
39_AT1G69600
:0.03520,
44_Lj1g0006197
:0.03520):0.02486,
51_Os08g0438400
:0.06006):0.00256,(((
41_AT3G28920
:0.00941,
42_AT5G39760
:0.00941):0.02821,
43_AT5G15210
:0.03762):0.01414,
45_Lj2g0004482
:0.05176):0.01086):0.00285,
46_Lj4g0012297
:0.06548):0.01585,
47_Lj2g0024695
:0.08132):0.07181,
50_Os04g0434500
:0.15313):0.03766,(
52_Os02g0706600
:0.18408,(
37_Os01g0635550
:0.12848,
38_Os05g0579300
:0.12848):0.05560):0.00671):0.03615,
40_AT5G60480
:0.22695):0.05886):0.04077,((
48_Os03g0718500
:0.15402,
49_Lj2g0001982
:0.15402):0.10005,
54_AT1G14687
:0.25407):0.07251):0.01626,
55_AT5G42780
:0.34285):0.07269,(
58_Pp3c5_860
:0.12092,
59_Pp3c6_28300
:0.12092):0.29462):0.11255,
57_Pp3c11_22370
:0.52809):0.04017,
53_Os06g0337200
:0.56826):0.18396,
56_Os12g0208900
:0.75222);
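Before uploading, it can be handy to check which sequences actually ended up in the tree. The pasted text is in Newick format with each leaf label on its own line, and the labels carry a numeric prefix (1_, 3_, ...) that appears to be added by the MAFFT server. The Python sketch below collapses the whitespace, lists the leaf labels, and optionally strips those prefixes. It assumes the labels contain no spaces, quotes, or parentheses, and the function names are made up for illustration.

```python
import re


def compact_newick(newick_text):
    """Remove all whitespace and line breaks from Newick text."""
    return "".join(newick_text.split())


def leaf_labels(newick_text):
    """Return leaf labels: names that follow "(" or "," and precede ":"."""
    return re.findall(r"(?<=[(,])([^(),:;]+)(?=:)", compact_newick(newick_text))


def strip_numeric_prefix(newick_text):
    """Return compact Newick text with prefixes such as "29_" removed from leaves."""
    return re.sub(r"(?<=[(,])\d+_", "", compact_newick(newick_text))
```

Stripping the prefix is optional; it only changes how the labels are displayed, not the branch lengths or topology.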
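Upload demo.dnd with Upload Tree in iTOL Visualize, and you get a figure like the one below.
On this site you can, directly in the web browser, change line width and color, add colors, switch the tree type, and export the result as a high-resolution file.
As an example, I drew an unrooted tree, colored only ATHB25 (AT5G65410), and exported the file.
In the tree based on the zinc finger region, ATHB25 (AT5G65410) is closest to ATHB22 (AT4G24660). The clades at the bottom and on the left of the figure contain Physcomitrella orthologs, whereas the clades on the right and at the top do not. This suggests, among other things, that the zinc finger regions of those groups evolved independently after the moss lineage diverged from the lineage leading to ferns and seed plants (or that they were lost in mosses).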
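The workflow and the links are summarized below.
① Prepare a FASTA-format file.
For comparisons across plant species, PLAZA 5.0 is recommended
② Align.
MAFFT version 7
or Clustal W on Genome Net (genome.jp)
③ Trim.
trimAl
④ Re-align.
Same as the alignment in ②
⑤ Draw the phylogenetic tree from the aligned sequences.
iTOL Visualize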