
Drawing a Molecular Phylogenetic Tree (Plant Genes)

This time, let's draw a molecular phylogenetic tree.

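If you prepare a FASTA file and submit the sequences to MAFFT version 7 (https://mafft.cbrc.jp/alignment/software/) or to Clustal W on GenomeNet (genome.jp; https://www.genome.jp/tools-bin/clustalw), you get an alignment, and some kind of phylogenetic tree can be drawn from it. However, when the proteins contain multiple domains, or many deletions and insertions, it can become unclear what is actually being compared, and the biological meaning of the tree is lost. For that reason, it is common to trim the sequences to a conserved region and compare them domain by domain.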

As the example, we will draw a tree for the paralogs and orthologs of ATHB25, the protein whose domain diagram we drew in the previous post.

The workflow is: ① prepare a FASTA file, ② align the sequences, ③ trim the alignment, ④ realign, and ⑤ draw a phylogenetic tree from the aligned sequences.
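To make the idea behind step ③ concrete, here is a minimal Python sketch of one possible way to trim an alignment. It is not necessarily the procedure used in this post. It measures how many sequences have a residue in each column and keeps the span between the first and last well-occupied columns. The function names, the 0.8 threshold, and the toy sequences are all assumptions made for illustration.

```python
def column_occupancy(aligned):
    """Return, for each alignment column, the fraction of sequences that
    have a residue there (anything other than the gap character '-')."""
    seqs = list(aligned.values())
    if not seqs:
        raise ValueError("alignment is empty")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    return [sum(s[i] != "-" for s in seqs) / len(seqs) for i in range(length)]


def trim_to_conserved_block(aligned, min_occupancy=0.8):
    """Cut every sequence to the span between the first and the last column
    whose occupancy is at least min_occupancy (an assumed threshold)."""
    occupancy = column_occupancy(aligned)
    kept = [i for i, frac in enumerate(occupancy) if frac >= min_occupancy]
    if not kept:
        raise ValueError("no column reaches the occupancy threshold")
    start, end = kept[0], kept[-1] + 1
    return {name: seq[start:end] for name, seq in aligned.items()}


# Toy alignment invented for this example (not real gene sequences).
toy_alignment = {
    "seqA": "MEIA--ECLKNHAA--",
    "seqB": "----K-ECLKNHAAGG",
    "seqC": "-----RECQKNHAA--",
}

trimmed = trim_to_conserved_block(toy_alignment)
for name, seq in trimmed.items():
    print(name, seq)
```

Because the sketch keeps one continuous span, gappy columns inside that span are retained. Before realigning (step ④), the remaining gap characters would need to be removed, for example with `seq.replace("-", "")`.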

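① Prepare a FASTA file
First, prepare the sequences you want to compare in FASTA format. The simplest approach is to run a BLAST search at NCBI (https://www.ncbi.nlm.nih.gov) and download the sequences with high homology. For Arabidopsis thaliana, you can also download sequences from The Arabidopsis Information Resource (https://www.arabidopsis.org).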

If you want to compare across plant species, it is convenient to download FASTA files from PLAZA (https://bioinformatics.psb.ugent.be/plaza/).
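For example, searching Dicots PLAZA 5.0 for ATHB25 leads to the homologous gene family HOM05D000243, from which 1,731 genes across 96 plant species can be downloaded. In practice this family contains several subgroups, such as proteins that have the zinc finger but lack the homeobox, or the reverse, so I recommend downloading the FASTA file from the orthologous gene family instead. In addition, a tree built from several hundred to a thousand genes tends to be hard to interpret, so it is wise to restrict the comparison to a handful of species.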

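Here, as an example, I put together a FASTA file of ATHB paralogs and orthologs.
The prefixes indicate the species: AT is Arabidopsis thaliana, Os is rice, Lj is Lotus japonicus, and Pp is the moss Physcomitrium patens.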
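Narrowing a large download to a few species can be done by hand, but a small script makes it repeatable. The following is a minimal sketch that assumes sequence names start with a species prefix such as AT, Os, Lj, or Pp. The helper names and the records in `toy_fasta` are hypothetical placeholders, not real genes.

```python
def parse_fasta(text):
    """Parse FASTA text into {name: sequence}. The name is the first
    whitespace-delimited word after '>'."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def filter_by_species(records, prefixes):
    """Keep only records whose name starts with one of the given prefixes
    (case-sensitive)."""
    prefixes = tuple(prefixes)
    return {n: s for n, s in records.items() if n.startswith(prefixes)}


def to_fasta(records, width=60):
    """Serialize {name: sequence} back to FASTA text, wrapping at width."""
    lines = []
    for name, seq in records.items():
        lines.append(">" + name)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Placeholder records invented for this example (hypothetical names).
toy_fasta = """>ATdemo1 placeholder record
MKKRQVV
ECQKNHAA
>Osdemo1
MGPQQ
>Ljdemo1
MRKRQVV
>Ppdemo1
MESL
"""

records = parse_fasta(toy_fasta)
subset = filter_by_species(records, ["AT", "Pp"])
print(to_fasta(subset))
```

The text returned by `to_fasta` can be pasted directly into an alignment server.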

② Align the sequences
Let's align the sequences on the MAFFT version 7 online server (https://mafft.cbrc.jp/alignment/server/).
For the alignment strategy, I select G-INS-1 (Slow; progressive method with an accurate guide tree).
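The server returns the alignment in CLUSTAL format, as in the output that follows. If you want to process that text further, a minimal Python sketch for reading it could look like the one below. The function name `parse_clustal` and the toy text are assumptions for illustration. The parser only relies on the layout visible in the output: a header line starting with "CLUSTAL", blocks of "name, whitespace, sequence chunk", and annotation lines that begin with whitespace.

```python
def parse_clustal(text):
    """Read CLUSTAL-format alignment text into {name: aligned_sequence},
    joining the chunks of each sequence across blocks."""
    aligned = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator between blocks
        if line.startswith("CLUSTAL"):
            continue  # header line
        if line[0].isspace():
            continue  # conservation/annotation line under a block
        parts = line.split()
        if len(parts) < 2:
            continue  # not a "name sequence" row
        name, chunk = parts[0], parts[1]
        aligned[name] = aligned.get(name, "") + chunk
    return aligned


# Toy text invented for this example, imitating the layout of the output.
toy_clustal = "\n".join([
    "CLUSTAL format alignment by MAFFT (v7.511)",
    "",
    "",
    "seqA       --M-EIA---",
    "seqB       --M-EVS---",
    "             * *      ",
    "",
    "seqA       ECLKNHAA",
    "seqB       ECQKNHAA",
    "           ** *****",
])

alignment = parse_clustal(toy_clustal)
for name, seq in alignment.items():
    print(name, seq)
```

The sequence names in the output appear to be cut to ten characters. If two names became identical after such truncation, this parser would merge their rows, so it is worth checking that the number of parsed sequences matches the number submitted.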

CLUSTAL format alignment by MAFFT (v7.511)


AT1G14440  --------------------------------------------------M-EIA-----
AT2G02540  --------------------------------------------------M-EIA-----
Lj4g002049 --------------------------------------------------M-EVS-----
Lj5g000796 ------------------------------------------------------------
Os11g02433 MVSILQLQTRTEASPASSASAAATRIFAVRRQQQEQEGEEEEEEFEFQERM-DLS-----
AT1G18835  ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917  ------------------------------------------------------------
AT1G74660  ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 --------------------------------------------------M-FFD-----
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660  --------------------------------------------------M-NFE-----
Lj1g000953 --------------------------------------------------M-EFDDHEDL
Lj1g002084 --------------------------------------------------M-EFD-----
Os08g04794 --------------------------------------------------M-DFD-----
Os09g04664 --------------------------------------------------M-DFD-----
Pp3c1_1529 --------------------------------------------------M-ESL-----
Pp3c2_2116 --------------------------------------------------M-ESL-----
Pp3c7_1500 --------------------------------------------------M-ESL-----
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 --------------------------------------------------M-DLG-----
Pp3c21_110 --------------------------------------------------M-DLG-----
Pp3c19_204 --------------------------------------------------M-DLG-----
Pp3c22_960 --------------------------------------------------M-DLG-----
Lj2g001509 --------------------------------------------------M-EFE-----
Lj4g002796 --------------------------------------------------M-DY------
AT5G65410  --------------------------------------------------M-EFE-----
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890  --------------------------------------------------M-ELG-----
AT2G18350  --------------------------------------------------M-EV------
Lj2g001028 --------------------------------------------------M-EAVRGQGS
Lj4g002150 --------------------------------------------------M-EM------
AT1G75240  --------------------------------------------------M-DMR-----
Os01g06355 --------------------------------------------------M-ELS-----
Os05g05793 --------------------------------------------------M-EFR-----
AT1G69600  --------------------------------------------------M-DLS-----
AT5G60480  --------------------------------------------------M-SSL-----
AT3G28920  --------------------------------------------------MLEV------
AT5G39760  --------------------------------------------------MMDMT-----
AT5G15210  --------------------------------------------------M-DVI-----
Lj1g000619 --------------------------------------------------M-DIT-----
Lj2g000448 --------------------------------------------------M-EVL-----
Lj4g001229 --------------------------------------------------M---------
Lj2g002469 --------------------------------------------------M-DLK-----
Os03g07185 --------------------------------------------------M-E-------
Lj2g000198 --------------------------------------------------M-EGG-----
Os04g04345 --------------------------------------------------M---------
Os08g04384 --------------------------------------------------M-EAV-----
Os02g07066 --------------------------------------------------M-EYKRSSHV
Os06g03372 --------------------------------------------------L---------
AT1G14687  ------------------------------------------------------------
AT5G42780  --------------------------------------------------MDEIK-----
Os12g02089 --------------------------------------------------M-DL------
Pp3c11_223 MG------------------------------------------------M-DISRY---
Pp3c5_860  MN------------------------------------------------------Y---
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 ------------------------------------------------------------
                                                                       

AT1G14440  ------SQEDHD------------------------------------------------
AT2G02540  ------SQED--------------------------------------------------
Lj4g002049 ------SSQEAGE------------------------------------------IPIP-
Lj5g000796 ------------------------------------------------------------
Os11g02433 ------GAQGE-------------------------------------------------
AT1G18835  ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917  ------------------------------------------------------------
AT1G74660  ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------SQKT------------HK------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660  ------DQEE------------DM------------------------------------
Lj1g000953 EEEEEEEEEE------------EE------------------------------------
Lj1g002084 ------EQEE------------QD------------------------------------
Os08g04794 ------DHDEGDG---------DE------------------------------------
Os09g04664 ------DHD--DG---------DE------------------------------------
Pp3c1_1529 ------VSHKID------------------------------------------------
Pp3c2_2116 ------VSYKID------------------------------------------------
Pp3c7_1500 ------VSHKVD------------------------------------------------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ------TGREGTD---------PQQQQKSS-------HQTQQQQ----QPQPLSPLPAPL
Pp3c21_110 ------SGRDGTD--------QQQQQQRGP-------HQTQQQQQQLPQPQPLASLPTSI
Pp3c19_204 ------SGHESSN---------DQPQP------------EQPQM----QTSPLPSLPAPI
Pp3c22_960 ------SGHESNN---------NQ----------------QQQV----QAHPLPISPLPA
Lj2g001509 ------DQEE------------QE------------------------------------
Lj4g002796 --------DE------------QE------------------------------------
AT5G65410  ------DNNNNND---------EEQE----------------------------------
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890  ------GKCNAIT-----------------------------------------------
AT2G18350  --------RE--------------------------------------------------
Lj2g001028 KDIEI-MSTATTTTLGYNLPIRNSSSSSSKLSSPTAGHRTSTDQ------------PAPV
Lj4g002150 --------REMPSTLIYNLPNRDSSSPSLP---------SSSDQ------------P---
AT1G75240  ------SHEMIER--------RREDN----------------------------------
Os01g06355 ------EHEEDAGDVG--------------------------------------------
Os05g05793 ------GHDEPVDEM---------------------------------------------
AT1G69600  ------------------------------------------------------------
AT5G60480  ------------------------------------------------------------
AT3G28920  ------------RSMD------MTP-----------------------------------
AT5G39760  ------PTIT--TTTT------PTP-----------------------------------
AT5G15210  ------AT-T--TTIV------SDL-----------------------------------
Lj1g000619 ------PTTTIITNIN------NTATPTTTIA----------------------------
Lj2g000448 ------TTAT--NNIT------STA-----------------------------------
Lj4g001229 ------------------------------------------------------------
Lj2g002469 ------T-----------------------------------------------------
Os03g07185 ------------------------------------------------------------
Lj2g000198 ------A-----------------------------------------------------
Os04g04345 ------------------------------------------------------------
Os08g04384 ------------------------------------------------------------
Os02g07066 EEEEEEEEEEDDEEED------EEEQGHHQYTT------AAAQQ----QLHP--------
Os06g03372 ------------------------------------------------------------
AT1G14687  ------------------------------------------------------------
AT5G42780  ------PKKEENSKRR--------------------------------------------
Os12g02089 ------------------------------------------------------------
Pp3c11_223 ------NHCEVHTNEVKAYKVQDGMLPTVS-------SDSGADQQGGDVAGRMEFWPPGV
Pp3c5_860  ------N----------------------------------CDRRGG-------------
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 ------------------------------------------------------------
                                                                       

AT1G14440  ---------------------------------MPIP------------------LNTT-
AT2G02540  ----------------------------------PIP------------------INTS-
Lj4g002049 -------------------------------IPIPIP------------------INSS-
Lj5g000796 ------------------------------------------------------------
Os11g02433 ---------------------------------LPIP------------------MHASA
AT1G18835  ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917  ------------------------------------------------------------
AT1G74660  ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ----------------------------FHFIFAGFE------------------VEE--
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660  -------------------------------EMSGVN------------------PPC--
Lj1g000953 -------------------------------EEGEVEMGFT-------------VAPP--
Lj1g002084 -------------------------------EEMGIPESPP-------------PVPA--
Os08g04794 -------------------------------EMPPMP------------------LSS--
Os09g04664 -------------------------------EMPPMP------------------VSS--
Pp3c1_1529 --------------------------------YTPMP------------------ITA--
Pp3c2_2116 --------------------------------YTPMP------------------ITA--
Pp3c7_1500 --------------------------------YSTMS------------------IAA--
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 PL------------------LMPQPLAVSNFHSAPPP------------------LQQ--
Pp3c21_110 PL------------------LLPPALAVSNYHPTPVS------------------VQP--
Pp3c19_204 PM------------------MLP-SLGASNYHATPTS------------------LHQQQ
Pp3c22_960 PM------------------MLP-SLAASNYHSTPTS------------------LYQQL
Lj2g001509 -------------------------------EELCMA------------------TAP--
Lj4g002796 -------------------------------EELVMAG----------------GGGA--
AT5G65410  -------------------------------EDMNLHEEEE-------------DDDA--
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890  -------------------------------TTTMISTE----------------VKP--
AT2G18350  ----------------------------KKDEKMEMTRRKSSALD-------HHRLPPYT
Lj2g001028 P--------------------VP----HTNNNTLIFNDSAQPSHH-------HHHLSAPP
Lj4g002150 ----------------------------SQTHTIIFNHPPKLSH----------------
AT1G75240  ---------------------------GNNNGGVVIS----------------NIIST--
Os01g06355 -----------------------------------------------------GGCSSPP
Os05g05793 ------------------------------------------------------GVAYGR
AT1G69600  --------------------------------SKPQQQLL--------------------
AT5G60480  --------------------------------SKPNRQFLSPT-----------------
AT3G28920  --------------------------------KSPEPESETPTR-----------IQ---
AT5G39760  --------------------------------KSPEPESETPTR-----------IQ---
AT5G15210  --------------------------------DSRQPEIEAPIR-----------IQ---
Lj1g000619 ------------------------IAAAATSSKSPEHETETPPR-----------IAN--
Lj2g000448 --------------------------------KSPEPETETPTR-----------IQQ--
Lj4g001229 ------------------------------------------------------------
Lj2g002469 --------------------------------ETPPPPTQ--------------------
Os03g07185 ------------------------------------------------------------
Lj2g000198 ------------------------------------------------------------
Os04g04345 ------------------------------------------------------------
Os08g04384 ------------------------------------------------------------
Os02g07066 -------------------------QVLGSSASSPSSLMDSAAFSRPLLPPNLSLVSPSA
Os06g03372 ------------------------------------------------------------
AT1G14687  ------------------------------------------------------------
AT5G42780  -----------------------------------------------------RNVKPIC
Os12g02089 ------------------------------------------------------------
Pp3c11_223 GVADHANKHCVEGGIDLCGSVIHGNDALDQMLQFPKAGDVRSWRDL----TGASRTNSES
Pp3c5_860  ----------------------YGDSAEEAANLFLAASTRNPWQ--------VGPMNPVI
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 ------------------------------------------------------------
                                                                       

AT1G14440  ----FG-GGGSH-----GHMIHHHDH--------------HAANSAP--PTHNNN-----
AT2G02540  ----YGNSGGGH-----GNMNHHHH-----------------ANSAP--SSL-NI-----
Lj4g002049 ----TN-YGGGHA-AGNGHDHHMNMH--------------HIHDPAP-HHNHNHN-----
Lj5g000796 ------------------------------------------------------------
Os11g02433 AASPFA-GMGAHGGAGGGHVVELHRH--------------EHVGNNG--QAM-AM-----
AT1G18835  ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917  ------------------------------------------------------------
AT1G74660  ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 -----------------------------------------SRE------EFEGN-----
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660  -----------------------------------------GYD------SLSG------
Lj1g000953 -----------------------------------------GFD------SLGN------
Lj1g002084 -----------------------------------------SYD------PLLN------
Os08g04794 -----------------------------------------GY----------D------
Os09g04664 -----------------------------------------SY----------E------
Pp3c1_1529 -----------------------------------------TFA------GLHE------
Pp3c2_2116 -----------------------------------------TFA------GLHE------
Pp3c7_1500 -----------------------------------------TFA------GLHD------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ---------------------HHQHH--------------QLHH------EIPS------
Pp3c21_110 ---------------------QHQQ-----------------HH------EMPG------
Pp3c19_204 -----------------QHHHHHHPH--------------ALHH------DLPS------
Pp3c22_960 ---------------------HHHQH--------------ALHH------DLAS------
Lj2g001509 -----------------------------------------SYD------SLTH------
Lj4g002796 -----------------------------------------SYDDD--DSSLAN------
AT5G65410  -----------------------------------------VYDSPPLSRVLPK------
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890  -----------------------------------------HTDPEPE------------
AT2G18350  ----YS----------QTANKEKPTTKR----------NGSDPDPDPD------------
Lj2g001028 ----LP----------QTQNHHHQSQ--------------RPTTTDPD------------
Lj4g002150 -------------------NHHHHIY--------------TPSSTSPP------------
AT1G75240  -----------------------------------------NIDDNC--NGNNN------
Os01g06355 ------------------------------------------------------------
Os05g05793 ------------------------------------------------------------
AT1G69600  -------------------------------------------NSLPI------------
AT5G60480  ------------------------------------------TNNQDT------------
AT3G28920  ----------------------------------------------PA------------
AT5G39760  ----------------------------------------------PA------------
AT5G15210  ----------------------------------------------PA------------
Lj1g000619 -----------------------------------------TTTPPPT------------
Lj2g000448 -----------------------------------------PGNVNAT------------
Lj4g001229 ------------------------------------------------------------
Lj2g002469 ------------------------------------------------------------
Os03g07185 ------------------------------------------------------------
Lj2g000198 ------------------------------------------------------------
Os04g04345 ------------------------------------------------------------
Os08g04384 ----VG------------------------------------------------------
Os02g07066 AA--AAAPGGSY-----LHAAHHHGQGRRVEAPGGESQHHLQRHHEPARNGVLGG-----
Os06g03372 ------------------------------------------------------------
AT1G14687  ------------------------------------------------------------
AT5G42780  RE--TG----------------------------------DHVHYLPT------------
Os12g02089 ------------------------------------------------------------
Pp3c11_223 KAFRRG--------VIGGRLGHHMS--------------------CPCDSAMMND-----
Pp3c5_860  SA------------------GHHAGN------------------VTNCNAASAGDAGTAQ
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 ------------------------------------------------------------
                                                                       

AT1G14440  ---------------------------NTTQPPPMP-----------------LHGNGHG
AT2G02540  ---------------------------TTSNPLLVS-----------------SNSNGLG
Lj4g002049 ---------------------------NIISPTSAA-----------------VPSNGSS
Lj5g000796 ------------------------------------------------------------
Os11g02433 ---------------------------ASPPPTNVA-----------------VAAE---
AT1G18835  -----------------------------MKKRQVV------------------------
Lj1g001161 -----------------------------MKKRQVV------------------------
AT3G28917  -----------------------------MRKRQVV------------------------
AT1G74660  ----------------------------MMKKRQMV------------------------
Lj2g002499 -----------------------------MRKRQVV------------------------
Lj4g000050 -----------------------------MKKKQVV------------------------
Lj1g001230 ---------------------------CTMRKSHVV------------------------
Os11g01283 -----------------------------MGPQQ--------------------------
Os12g01245 -----------------------------MGPQQ--------------------------
AT4G24660  -----------------------------EGATSSG--------------------GG--
Lj1g000953 -----------------------------SAARSKT--------------------GGGI
Lj1g002084 -----------------------------SAPRSKI------------------------
Os08g04794 -----------------------------APMQPGL------------------GGGGGG
Os09g04664 -----------------------------TPPQHGL------------------AGGGMA
Pp3c1_1529 -----------------------------FSKLKLL------------------SSTGNG
Pp3c2_2116 -----------------------------FSKLKLF------------------SNTGNR
Pp3c7_1500 -----------------------------SSKLKFF-------------------NSGFG
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 -----------------------------STKLKLL---------------NTSSGNGGS
Pp3c21_110 -----------------------------AAKPKLL---------------NTPSGNGGS
Pp3c19_204 -----------------------------ATKLKLLNQ------------NNTPSGNGGS
Pp3c22_960 -----------------------------TTKLKLQNQ------------NNTPSGNGGS
Lj2g001509 -----------------------------PSRVKMP--------------------GGGA
Lj4g002796 -----------------------------PTRVKMP--------------------SPVD
AT5G65410  -----------------------------ASTESHETT------------GTTSTGGGGG
Os08g04381 ------------------------------------------------------------
Os09g04145 -----------------------------MMKRLVV------------------------
AT3G50890  -----------------------------AKPESDP------------------------
AT2G18350  -----------------------------LDTNPIS------------------------
Lj2g001028 -----------------------------LTPSSSP------------------------
Lj4g002150 -----------------------------LPPNSVQ------------------------
AT1G75240  -----------------------------NTRVSCN------------------------
Os01g06355 -----------------------------TPPHRVLTS----------------------
Os05g05793 -----------------------------TPPSSSSSP----------------------
AT1G69600  -----------------------------AGELTV-------------------------
AT5G60480  -----------------------------GREQTI-------------------------
AT3G28920  ------------------------------KPISFS--------------------NGI-
AT5G39760  ------------------------------KPISFS--------------------NGI-
AT5G15210  ------------------------------KPISFS--------------------NG--
Lj1g000619 ------------------------------KALSFS--------------------NGV-
Lj2g000448 -----------------------------AKPLSFS--------------------NGV-
Lj4g001229 ------------------------------------------------------------
Lj2g002469 ------------------------------------------------------------
Os03g07185 ------------------------------------------------------------
Lj2g000198 ------------------------------------------------------------
Os04g04345 -----------------------------MDHLSLV------------------------
Os08g04384 -----------------------------VKYRPVVFP------------------NGGA
Os02g07066 ----VAG-------------------AHAASTLALV-------------------GGGGG
Os06g03372 ------------------------------------------------------------
AT1G14687  ------------------------------------------------------------
AT5G42780  ---------------------------CKTKPKPTR------------------------
Os12g02089 ------------------------------------------------------------
Pp3c11_223 ----FKGSSYLLGKSFRIGVGESEDVDGSVEGRQGEAGLGRWE-------EAATSQNEGD
Pp3c5_860  GAILFQG--FLGGRGY----------GGSLQ-PSSAALHARWDLNPVQPGENQTSGNQRD
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 ----------------------------------MTNKLFRWDLNPIHLGENQTSGNLSD
                                                                       

AT1G14440  --------NNYDHH--------------HHQDPH--HVGYNAII---------KKPMIKY
AT2G02540  --------KNHDHS--------------HH---H--HVGYNIMVTNIKK---EKPVVIKY
Lj4g002049 --------MQLQAAA------------GLQQED-----DGAY----------NKKVAIRY
Lj5g000796 ----------------------------MECSDF--HVDKSL----------EKKIIISY
Os11g02433 ----------------------------QEGSPV--AGKKRGGMAVVGG---GGGVAVKY
AT1G18835  ----------------------IKQ-------RK-----SSYTM----T---SSSSNVRY
Lj1g001161 ----------------------VK---------K-----LSNTT----S---SVMRNIRY
AT3G28917  ----------------------LRRASPEEPSRS-----SSTAS----S---LTVRTVRY
AT1G74660  ----------------------IKQRSRNSNTSSSWTTTSSSSS----S---SEISNVRY
Lj2g002499 ----------------------VRREDPQR-----------------------NVRSVKY
Lj4g000050 ----------------------V-------------------------------------
Lj1g001230 ----------------------VRR-----------------VE----S---PTGRNVRY
Os11g01283 --------------------------DRSAAKPYANGSTAAAAA----AGRKENNKVVRY
Os12g01245 --------------------------DRSAAKPYANGSTAAAAA----AGRKENNKVVRY
AT4G24660  ------------------------------------GVGRSKGV----------GAKIRY
Lj1g000953 EPEG-----------------------GAAATAL--GVGRKNGS----------TGTVRY
Lj1g002084 ----------------------------AEVSAP--VIGRKGGSFT-PP---VAAGVVRY
Os08g04794 VPKP--------------------------GGGV--GGGGGGGG----G---GGGGGARY
Os09g04664 -PKP--------------------------PGEI--GSRVKGPS----C---GGG---RY
Pp3c1_1529 -VTT--------------------------MDEP--LLLEAPSV----K---AKAKVIRY
Pp3c2_2116 -VTN--------------------------MDEP--RPMEAAGA----K---AKSKAIRY
Pp3c7_1500 -VIK--------------------------MDEL--KRIEAENV----S---AKDKAISY
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 VPSK------GDHVGADQAREILRQAVQGAAAGN--ESASTKPS----N---VKKGTVRY
Pp3c21_110 VQSK------SDHVAADQAREIVRQAVQVAGAAS--ESASAKLS----N---VKKGAFRY
Pp3c19_204 VPTKYNDVKNSDQVATDQAREILRQAVVTAVTES--NAASTKAA----N--AAKKGAVRY
Pp3c22_960 LPTKNSDVKTSDQVATDQAREILRQAVVTAVTES--NAASAKAA----H---AKKGAVRY
Lj2g001509 EPIM--------------------------AHPL--RNNSSNNG----A---AKG---RY
Lj4g002796 DPAA--------------------------MMVV--VRNT------------GKG---KY
AT5G65410  FMVVH-------------------------------------------G---GGGSRFRF
Os08g04381 --------------------------------------------------------RVRY
Os09g04145 ----------------------LRR--------------REPAV----R---FSCCGVRY
AT3G50890  ---------------------------SMALFPI--KKENQKPK----T---RVDQGAKY
AT2G18350  ----------------------ISHA----------PRSYARPQ----T---TSPGKARY
Lj2g001028 ----------------------LATT----------RITAPPPP----P---PTP-LVRY
Lj4g002150 ----------------------LQQQ----------PTRDPDPS----S---SSSLLIRY
AT1G75240  ----------------------------SQTLDH--HQSKSPSSFSISA---AAKPTVRY
Os01g06355 ------------------------------------AAPET--------------IRCRY
Os05g05793 ------------------------------------AASASAGN----G---AGAAEVRY
AT1G69600  ----------------------------------------------------TGEMGVCY
AT5G60480  -------------------------------------------A----C---ARDMVVLY
AT3G28920  ----------------------IKRH--H----HH-HH-----N----N------NKVTY
AT5G39760  ----------------------IKRH--H----HH-HH----------P------LLFTY
AT5G15210  -----------------------KRC--H----HH-HL-----A----S---EAVAVATY
Lj1g000619 ----------------------LKRH--HPSSYHH-HHHHPLSA----N---HTTMAVAY
Lj2g000448 ----------------------LKRH--HPPAPH---------A----N---HSPVTVTY
Lj4g001229 -------------------------------------------------------VVVSF
Lj2g002469 ---------------------------------HR-HLITATPS----P---PSTVAVSY
Os03g07185 -------------------------------------------Q----Q---QERPREVY
Lj2g000198 ---------------------------------------MISSS----E---NSSSNCLY
Os04g04345 --------------------------------PY-----EGGSA----G---GGGGGGKY
Os08g04384 AAA---------------------------------AAGKSKAT----P---ASATAAVY
Os02g07066 ------------------------------------GPRGGEGA----A---GEAPTWRY
Os06g03372 ------------------------------------QLRRAQPA--------VGGGETVY
AT1G14687  -----------------------------------------------------MQSTCVY
AT5G42780  ------------------------------------THHAPPPI--LDS-IFKVTHKPHY
Os12g02089 ------------------------------------------------------------
Pp3c11_223 PHAQF---NVVQEE-------ELSNRVNDLCSQGDRSNEHRLQE-------FSRDMIDEC
Pp3c5_860  DQEQA---KWANQNTTSRFRGQLDEDDLLGFSMDQRSAQPNLQA-------KSGTCTVVY
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 NHNPA---KWMNTNTAPRLRGQLANDDFLGFSPDQRSVQQNLQP-------KSGHCVIVC
                                                                       

AT1G14440  K------------------------------------ECLKNHAAAMGGNATDGCGEF--
AT2G02540  K------------------------------------ECLKNHAATMGGNAIDGCGEF--
Lj4g002049 R------------------------------------ECLKNHAAGMGGNATDGCGEF--
Lj5g000796 K------------------------------------ECLKNHAAAIGGNATDGCCEF--
Os11g02433 R------------------------------------ECLKNHAAAIGGNATDGCGEF--
AT1G18835  V------------------------------------ECQKNHAANIGGYAVDGCREF--
Lj1g001161 G------------------------------------ECQKNHAASIGGYAVDGCREF--
AT3G28917  G------------------------------------ECQKNHAAAVGGYAVDGCREF--
AT1G74660  V------------------------------------ECQKNHAANIGGYAVDGCREF--
Lj2g002499 G------------------------------------ECQKNHAANVGGYAVDGCREF--
Lj4g000050 -----------------------------------------NHAAYSGGYAVDGCREF--
Lj1g001230 G------------------------------------ECQKNHAVNVGGYAVDGCREF--
Os11g01283 R------------------------------------ECQRNHAASIGGHAVDGCREF--
Os12g01245 R------------------------------------ECQRNHAASIGGHAVDGCREF--
AT4G24660  R------------------------------------ECLKNHAVNIGGHAVDGCCEF--
Lj1g000953 R------------------------------------ECQKNHAVGIGGHAVDGCCEF--
Lj1g002084 R------------------------------------ECQKNHAVSFGGHAVDGCCEF--
Os08g04794 R------------------------------------ECLKNHAVGIGGHAVDGCGEF--
Os09g04664 R------------------------------------ECLKNHAVGIGGHAVDGCGEF--
Pp3c1_1529 R------------------------------------ECNRNHAITTGGYVVDGCGEF--
Pp3c2_2116 R------------------------------------ECNRNHAISTGGYAVDGCGEF--
Pp3c7_1500 K------------------------------------ECNRNHAIFSGGYAVDGCGEF--
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 R------------------------------------ECQKNHAASIGGHALDGCGEF--
Pp3c21_110 R------------------------------------ECQKNHAASIGGHALDGCGEF--
Pp3c19_204 R------------------------------------ECQKNHAAGMGGHAMDGCGEF--
Pp3c22_960 R------------------------------------ECQKNHAAGIGAHAIDGCGEF--
Lj2g001509 R------------------------------------ECLKNHAVGIGGHALDGCGEF--
Lj4g002796 R------------------------------------ECLKNHAVGMGGYALDGCLEF--
AT5G65410  R------------------------------------ECLKNQAVNIGGHAVDGCGEF--
Os08g04381 G------------------------------------ECRRNHAARMGGHAVDGCREF--
Os09g04145 G------------------------------------ECRRNHAASTGGHAVDGCREF--
AT3G50890  R------------------------------------ECQKNHAASTGGHVVDGCCEF--
AT2G18350  R------------------------------------ECQKNHAASSGGHVVDGCGEF--
Lj2g001028 R------------------------------------ECLRNHAASMGSHVVDGCGEF--
Lj4g002150 R------------------------------------ECLRNHAARLGSHVTDGCGEF--
AT1G75240  R------------------------------------ECLKNHAASVGGSVHDGCGEF--
Os01g06355 H------------------------------------ECLRNHAAASGGHVVDGCGEF--
Os05g05793 H------------------------------------ECLRNHAAAMGGHVVDGCREFMP
AT1G69600  K------------------------------------ECLKNHAANLGGHALDGCGEF--
AT5G60480  N------------------------------------ECLKNHAVSLGGHALDGCGEF--
AT3G28920  K------------------------------------ECLKNHAAAIGGHALDGCGEF--
AT5G39760  K------------------------------------ECLKNHAAALGGHALDGCGEF--
AT5G15210  K------------------------------------ECLKNHAAGIGGHALDGCGEF--
Lj1g000619 K------------------------------------ECLKNHAANLGGHALDGCGEF--
Lj2g000448 K------------------------------------ECLKNHVASLGGHALDGCGEF--
Lj4g001229 K------------------------------------ECLKNHAASLGGHALDGCGEF--
Lj2g002469 K------------------------------------ECLRNHAASLGAHALDGCGEF--
Os03g07185 R------------------------------------ECMRNHAAKLGTYANDGCCEY--
Lj2g000198 R------------------------------------ECLRNHAATLGSYATDGCGEF--
Os04g04345 K------------------------------------ECMRNHAAAMGGQAFDGCGEY--
Os08g04384 R------------------------------------ECLKNHAASLGGHAVDGCGEF--
Os02g07066 R------------------------------------ECLKNHAARMGAHVLDGCGEF--
Os06g03372 Q------------------------------------ECPKNHAASLGGHGAGRLRGVHA
AT1G14687  R------------------------------------ECMRNHAAKLGSYAIDGCREY--
AT5G42780  Y------------------------------------ECRKNHAADIGTTAYDGCGEF--
Os12g02089 ------------------------------------------------------------
Pp3c11_223 NMVGIDLRRRNHPEDLDGLDLREAGLKGVSRDCDPYAQCQKNTCVARGPSSVDRFTKF--
Pp3c5_860  K------------------------------------ECQKNQALDTANHCVDGCGEF--
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 K------------------------------------ECQNNHALDGVNHCIDGCGEF--
                                                                       

AT1G14440  MP-SGED-GSIE-----------A-LTCSACNCHRNFHRKE------VEG------E---
AT2G02540  MP-SGEE-GSIE-----------A-LTCSVCNCHRNFHRRE------TEG------E---
Lj4g002049 MP-SGEE-GTIE-----------A-LNCSACNCHRNFHRKE------VEG------E---
Lj5g000796 MP-AGDE-GTLE-----------A-LKCSACNCHRNFHRKE------VD-----------
Os11g02433 MP-SGEE-GSLE-----------A-LKCSACGCHRNFHRKE------ADD------L---
AT1G18835  MA-SGGDD---------------A-LTCAACGCHRNFHRRE------VDT------E---
Lj1g001161 MA-SAGDE---------------A-LTCAACGCHRNFHRRE------VQT------E---
AT3G28917  MA-SRGEEGTVA-----------A-LTCAACGCHRSFHRRE------IET------E---
AT1G74660  MA-AGVE-GTVD-----------A-LRCAACGCHRNFHRKE------VDT------E---
Lj2g002499 MA-SGEE-GTSD-----------S-LACAACGCHRNFHKKE------VQT------EGS-
Lj4g000050 MA-SAGE-GTEG-----------A-LTCAACGCHRNFHKRELTFNSTLKT------K---
Lj1g001230 MA-SGAE-GTSV-----------A-LTCAAYGCHRSFYKKE------VWP------E---
Os11g01283 MA-SGAE-GTAA-----------A-LLCAACGCHRSFHRRE------VEA------AA--
Os12g01245 MA-SGAD-GTAA-----------A-LLCAACGCHQSFHRRE------VEA------AA--
AT4G24660  MP-SGED-GTLD-----------A-LKCAACGCHRNFHRKE------TESIGGRAHR---
Lj1g000953 LA-AGQE-GTLE-----------A-VICAACNCHRNFHRKE------T---GG---E---
Lj1g002084 MA-AGDE-GTLE-----------A------------------------------------
Os08g04794 MA-SGEE-GSID-----------A-LRCAACGCHRNFHRKE------SES------P---
Os09g04664 MA-AGEE-GTID-----------A-LRCAACNCHRNFHRKE------SES------L---
Pp3c1_1529 MP-GGEE-GTVA-----------A-LRCAACDCHRNFHRKE------TEG------E---
Pp3c2_2116 MP-GGEE-GTVA-----------A-LKCAACDCHRNFHRKE------VEG------E---
Pp3c7_1500 MP-SGEE-GTIE-----------S-LKCAACDCHRNYHRKE------VEV------E---
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 MP-GGEE-GTVD-----------A-LRCAACDCHRNFHRRE------VEG------E---
Pp3c21_110 MP-GGQE-GTVG-----------A-LRCAACDCHRNFHRRE------VEG------E---
Pp3c19_204 MP-GGGE-GSVD-----------A-LRCAACNCHRNFHRRE------VEG------E---
Pp3c22_960 MP-GGEE-GSVD-----------A-LRCAACNCHRNFHRRE------VEG------E---
Lj2g001509 MP-AGSE-GTLD-----------A-LKCAACNCHRNFHRRE------NDS------S---
Lj4g002796 MA-AGPE-GTID-----------A-LKCAACDCHRNFHRK--------DA------A---
AT5G65410  MP-AGIE-GTID-----------A-LKCAACGCHRNFHRKE------LPY------F---
Os08g04381 LA-EGEE-GTGG-----------A-LRCAACGCHRSFHRRV------VVV------Q---
Os09g04145 IA--AED-GGGGNSTSAVGVAAAA-LKCAACGCHRSFHRRV------QVY------E---
AT3G50890  MA-GGEE-GTLG-----------A-LKCAACNCHRSFHRKE------VYG----------
AT2G18350  MS-SGEE-GTVE-----------S-LLCAACDCHRSFHRKE------IDG------L---
Lj2g001028 MP-SGEE-GT-E-----------A-LKCAACECHRNFHRKE------VEG------E---
Lj4g002150 MP-NGEQ-GTPE-----------S-LICAACECHRNFHRKE------AQG------EP--
AT1G75240  MP-SGEE-GTIE-----------A-LRCAACDCHRNFHRKE------MDG------V---
Os01g06355 MP-ASTE----E-----------P-LACAACGCHRSFHRRD------PSP------G---
Os05g05793 MP-GDAA----D-----------A-LKCAACGCHRSFHRKD------DGQ------Q---
AT1G69600  MP-SPTA-TSTD---------PSS-LRCAACGCHRNFHRRD------PSE------N---
AT5G60480  TP-KSTT-ILTD---------PPS-LRCDACGCHRNFHRRS------PSD------G---
AT3G28920  MP-SPSS-TPSD---------PTS-LKCAACGCHRNFHRRE------TD-----------
AT5G39760  MP-SPSS-ISSD---------PTS-LKCAACGCHRNFHRRD------PDN------N---
AT5G15210  MP-SPSF-NSND---------PAS-LTCAACGCHRNFHRRE------EDP------S---
Lj1g000619 MP-APSA-TAAD---------PSS-LKCAACGCHRNFHRRE------PEE------P---
Lj2g000448 MP-SPTA-TADD---------PSS-IKCAACGCHRNFHRRE------PEE------P---
Lj4g001229 MP-SSST-NPTD---------PRS-LKCAACGCHRNFHRRD------P------------
Lj2g002469 MP-SA------E---------PRSQLTCAACGCHRNFHRRD------TKQ------Q---
Os03g07185 TP-DDG-----H---------PAG-LLCAACGCHRNFHRKD------FLD------G---
Lj2g000198 TL-DD----------------PAGSLQCAACGCHRNFHRK--------------------
Os04g04345 MP-ASPD----------------S-LKCAACGCHRSFHRRA------AAG------I---
Os08g04384 MP-SPAA-DAAD---------PAS-LKCAACGCHRNFHRRL------PEA------P---
Os02g07066 MS-SPGD-GAA------------A-LACAACGCHRSFHRRE------PAV------V---
Os06g03372 VV-GGEP-TDPT-----------S-LMCAACGCHCNFHCWL------LEG------S---
AT1G14687  SQPST---GDL----------------CVACGCHRSYHRRI------DVI------S---
AT5G42780  VS-ST---GEED-----------S-LNCAACGCHRNFHREE------LIP------E---
Os12g02089 ---SGAQ-GELPLPMH-------A-AASPYLGLHHDHHHQLLG----VGA----------
Pp3c11_223 LS-SGKD-EKL------------A-LTCPPCGCHRNFHQRV------VDA-CEEGEEEEL
Pp3c5_860  MR-RGRE-GQE------------A-LQCMACGCHRSYH-RS------VLV-GDNGKELD-
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 MR-RGRD-GPE------------A-LQCMACGCHRRYH-RC------VGV-GDNGNEPQ-
                                                                       

AT1G14440  --------------------------------L---------------------------
AT2G02540  ------------------------------------------------------------
Lj4g002049 --------------------------------PPDY------------------------
Lj5g000796 ------------------------------------------------------------
Os11g02433 --------------------------------DADS------------------------
AT1G18835  --------------------------------VVCE------------------------
Lj1g001161 --------------------------------VVCE------------------------
AT3G28917  --------------------------------VVCD------------------------
AT1G74660  --------------------------------VVCG------------------------
Lj2g002499 -------------------------------HLLVP------------------------
Lj4g000050 --------------------------------MI--------------------------
Lj1g001230 --------------------------------AECD------------------------
Os11g01283 --------------------------------AECD------------------------
Os12g01245 --------------------------------AECD------------------------
AT4G24660  --------------------------------VPTY------------------------
Lj1g000953 --------------------------------ITSY------------------------
Lj1g002084 ------------------------------------------------------------
Os08g04794 --------------------------------TGVGP-------------------AE--
Os09g04664 --------------------------------AGEG------------------------
Pp3c1_1529 --------------------------------TSCD------------------------
Pp3c2_2116 --------------------------------ATCD------------------------
Pp3c7_1500 --------------------------------ESCD------------------------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 --------------------------------VLCE------------------------
Pp3c21_110 --------------------------------VLCE------------------------
Pp3c19_204 --------------------------------VLCD------------------------
Pp3c22_960 --------------------------------VLCD------------------------
Lj2g001509 --------------------------------NSPGD-------------------GGQF
Lj4g002796 --------------------------------AFPGD-------------------HPYH
AT5G65410  --------------------------------HHAP------------------------
Os08g04381 --------------------------------QCCA------------------------
Os09g04145 --------------------------------VAWD------------------------
AT3G50890  ------------------------------------------------------------
AT2G18350  --------------------------------FVVN------------------------
Lj2g001028 --------------------------------QQVP------------------------
Lj4g002150 --------------------------------QQVS------------------------
AT1G75240  --------------------------------GSSD------------------------
Os01g06355 --------------------------------RAGA---------------------ARL
Os05g05793 --------------------------------Q-------------------------QQ
AT1G69600  --------------------------------LNFL------------------------
AT5G60480  --------------------------------F---------------------------
AT3G28920  ---------------------------------DSS------------------------
AT5G39760  --------------------------------NDSS------------------------
AT5G15210  --------------------------------SLSA------------------------
Lj1g000619 --------------------------------PIST------------------------
Lj2g000448 --------------------------------PITA------------------------
Lj4g001229 ----------------------------------SA------------------------
Lj2g002469 --------------------------------YSNP------------------------
Os03g07185 --------------------------------RATA------------------------
Lj2g000198 ------------------------------------------------------------
Os04g04345 --------------------------------GGGP------------------------
Os08g04384 --------------------------------PSPPL-------------------LALP
Os02g07066 --------------------------------APAS------------------------
Os06g03372 --------------------------------PPPP------------------------
AT1G14687  ------------------------------------------------------------
AT5G42780  ------------------------------------------------------------
Os12g02089 ------------------------------------------------------------
Pp3c11_223 TVKAKREKLNSGNYFSSFVDHCNIDRVAHELMAVANEALALA-------QDSICHGEGRG
Pp3c5_860  ----------------------TIGEVDAVQPRLINNDLHLSLSRIETVALNLMEATGRA
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 ----------------------NIDEADGAAPRISNDDLQLSLSRTETVSPNLMEATGRA
                                                                       

AT1G14440  ------------------A--AT-AM--------------SPY---HQHP-------P-H
AT2G02540  ------------------E--KT-FF--------------SPYLNHHQPP-------PQQ
Lj4g002049 ----QHFNRLG----LGGR--KF-IL--------------GGHHHHHHKN-----ILTTP
Lj5g000796 ------------------------------------------------------------
Os11g02433 ----CAAALRA----AAGR--HHHLL--------------GPALPHHHHKNGGGLLVAGG
AT1G18835  -----YSPPN--------------------------------------------------
Lj1g001161 -----YSPPN--------------------------------------------------
AT3G28917  ----CNSPPS--------------------------------------------------
AT1G74660  ------------------------------------------------------------
Lj2g002499 -----YS-----------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ----CSSSPSA-------------------------------------------------
Os11g01283 ----CSSDTS--------------------------------------------------
Os12g01245 ----CSSDTS--------------------------------------------------
AT4G24660  ----YNRPPQP-H--------------------QPP-----GYL-H---------LTSP-
Lj1g000953 ----QPRPPQQ-QPAY------HHQFSPYYPRAEPP--PSAGYL-HH-------LVTPP-
Lj1g002084 ----------------------HHA-------------ASGGYLHHH-------LTTSP-
Os08g04794 ---PSAVSPAA-ISAYGAS--PHHQFSPYY-R-TP-----AGYL-HHQQHQMAAAAAAAA
Os09g04664 ----SPFSPAA-VVPYGAT--PHHQFSPYY-R-TP-----AGYL-HHHQHHM-AAAAAAA
Pp3c1_1529 ----------------------------------------CKYI-NRNDPRKRGMM-VPG
Pp3c2_2116 ----------------------------------------CQNI-KRNDPRKRGLM-APG
Pp3c7_1500 ----------------------------------------WQ-I-FRCDDRKRGQMTAPG
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ----CKRKQKP-GVQLGA-----------------------------------AVITSQH
Pp3c21_110 ----CKRKPKP-GMQLGA-----------------------------------GIVTPHQ
Pp3c19_204 ----CKRKPKM-GAPLGT-----------------------------------GIVNTGQ
Pp3c22_960 ----CKRKPKP-GVQLGA-----------------------------------GIVTPGM
Lj2g001509 LLTHLPHVPPP-PPQFQA----------YYGR-GP-----AGYL-HMSGQHR--------
Lj4g002796 PFHHRRHQPPPGQPQYAA----------CYRA-TP-----AGYL-HVAGPNR--------
AT5G65410  ----PQHQPPPPPPGF-------------Y-R-LP-----APVS-YRPPPS---------
Os08g04381 ----CDTAAAA-AAAGGW------------------------------------------
Os09g04145 ----DDCASG--------------------------------------------------
AT3G50890  ---------------------------------------------HRNSKQDHQLMITPA
AT2G18350  ----FNSF---------------------------------GHS-QRP------------
Lj2g001028 ----NPSFHSY------------------YKH-------SNGHL-QLPAPQP----LPPP
Lj4g002150 ----N----YH------------------HNK-------SNGQN-RI-------------
AT1G75240  ---------------------------------------LISHHRHHH------------
Os01g06355 PQLHLPASINS-------------------------------------------------
Os05g05793 LRLLIPSPPTP-------------------------------------------------
AT1G69600  ----TAPPIS------------------------------SPS-----------------
AT5G60480  ------------------------------------------------------------
AT3G28920  ----AVPPPSL---LPSST--TTAAIE------------YQPHHRHHPPP-------PLA
AT5G39760  ----QIPPPP------------STAVE------------YQPHHRHHPPP-------PPP
AT5G15210  ----IVP-----------------AIE------------FRPHNRHQLPP-------PPP
Lj1g000619 ----A-------------------VIE------------YQPHHRHHPPP-------PPS
Lj2g000448 ----AHH-----------------VFE------------YQPHHRHHPPP-------PVP
Lj4g001229 ----QTPPQP------------------------------LPHHGMSRST-------SPS
Lj2g002469 ----T-------------------FIS------------FYP-----------------S
Os03g07185 ----AAG-----------------------------------------------------
Lj2g000198 ------------------------------------------------------------
Os04g04345 ----VFFRPPP----------------------PP-----QPHS-HHAALQG--------
Os08g04384 PPPPPPPPPPP-PPQPQQHLPRTAAVAV-----APQLLLHGSHQRREQSPET-DRVRGPG
Os02g07066 -LSLCPASASA-SAAAG-------LVS---------------------------------
Os06g03372 PPLALPAPPMP-----------ANVL-------------HGQLHREEETPE----VRLPG
AT1G14687  ----------------------------------------SPQINHT-------------
AT5G42780  ----------------------------------------NGGVTETVLEV---------
Os12g02089 ---------------------------------HPR---GHGHHHHHLLV----------
Pp3c11_223 L-------PENGV--YSVE--EHTIIA----------KISLENLDHITKV-----ISSTT
Pp3c5_860  LPLLAADHPPRGSDDLATK--ELDTVM----------KISIENLDHISKV-----TLSTV
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 LSLLATDHPSRGADDLAIQ--DLDTVM----------KISIDNLDHIST-----------
                                                                       

AT1G14440  RKLMLNHQK---------------------IRS------AMP------------------
AT2G02540  RKLMFHHKM---------------------IKS------PLP------------------
Lj4g002049 EALGYHHHPTTAGNNN-----------ILPSRT------ILPP-----------------
Lj5g000796 ------------------------------SDS------NIPS-----------------
Os11g02433 DPYGAAYAAARA----------------LPPPP------PPPPH------------GHH-
AT1G18835  ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917  ------------------------------------------------------------
AT1G74660  ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660  ---------------------------------------AAPY-----------------
Lj1g000953 ---------------------------------------LSQH-----------------
Lj1g002084 ---------------------------------------TAPH-----------------
Os08g04794 ----------------------------AAAAG------GYPQ-----------------
Os09g04664 ----------------------------AAAAG------GYPQ-----------------
Pp3c1_1529 ----------------------------APSQP------GGPQ-----------------
Pp3c2_2116 ----------------------------GPSQP------GSSQ-----------------
Pp3c7_1500 ----------------------------FPSQT------ATPH-----------------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ----------------------------PPGGT----IPSTPM-----------------
Pp3c21_110 ----------------------------LPGGT----NTSTPM-----------------
Pp3c19_204 ----------------------------PPTLT-----STTPV-----------------
Pp3c22_960 ----------------------------PTTLT-----SATPV-----------------
Lj2g001509 ----------------------------------------VGA-----------------
Lj4g002796 ---------------------------------------GSGA-----------------
AT5G65410  ---------------------------------------QAPP-----------------
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890  ----------FYS-------------------------SNSSY-----------------
AT2G18350  -------------------------------LG------SRHV-----------------
Lj2g001028 QNSSLRFSHAAAGT--------------TPTAG------PVPV-----------------
Lj4g002150 HPSSLHHKHGFSSS----------------------------------------------
AT1G75240  ----YHHNQYGGG---------------GGRRP------PPPN---------------M-
Os01g06355 ---------------------------------------RAPP-----------------
Os05g05793 ---------------------------------------RVP------------------
AT1G69600  ------------GTE-------------SPPSR----HVSSPV----------P-CSYY-
AT5G60480  -------------------------------SQ----HRSPPS----------P-LQLQ-
AT3G28920  P-PL-------------------------PRSP----NSSSPP----------P-ISSS-
AT5G39760  P-PP-------------------------PRSP----NSASPP----------P-ISSS-
AT5G15210  P-HLA-------GIR-------------SPDDD----DSASPP----------P-ISSS-
Lj1g000619 FQPS-------------------------SRSP----NSASPP----------P-ISSSY
Lj2g000448 F----------------------------NRSP----NSASPP----------P-ISS--
Lj4g001229 LSSS----------Q-------------SPSPI----SSPSPP----------P-LSHM-
Lj2g002469 ISTS-------------------------PSSS----PSRSPP----------P-LSHHF
Os03g07185 ------------G----------------AGGA----GVGVAPML-PAPGGGGP-PGYM-
Lj2g000198 --------------------------------------VTCPP----------P-SSNM-
Os04g04345 ---------------------------FLPSSV----PAPAPP-----------------
Os08g04384 H----HHDDDAAADDDDSEDSEMSDYDDDRSAS----PLQAPP----------PVLSPGY
Os02g07066 ---------------------------LSPSAT----PTGANS-----------------
Os06g03372 VD----------GDESD-NNSDGSEY-YDERSV----SPPSPPHL-PA-----PVVHQPY
AT1G14687  ---------------------------------------RFPF-----------------
AT5G42780  --------------------------------------LKISS-----------------
Os12g02089 --------------------------------------AALPP----------P------
Pp3c11_223 ECLTLLRKR-----KS-----------VNPRDKDVNKSKSCPHEGGPFSTDASPVITVE-
Pp3c5_860  ECIGILQTAA-STQQS-----------ASSRD-----RPSVPDHG-RLSSEVSPVMNKE-
Pp3c5_889  -------------------------------------------------------MNKE-
Pp3c6_2830 --------------QR-----------ASSRE-----SQTMPDY------PLCPVRTTD-
                                                                       

AT1G14440  ----------------HQMIMPI------------------------G-VSNYR------
AT2G02540  ----------------QQMIMPI------------------------G-VTT--------
Lj4g002049 ----------------HHMIMPY-------------------------------------
Lj5g000796 ----------------QHYVLPLL---------------------PIL-AHSSL------
Os11g02433 --------------HHHQIIMPL-------------------------------------
AT1G18835  -------------------------------------------------AN---------
Lj1g001161 ------------------------------------------------------------
AT3G28917  -------------------------------------------------TG---------
AT1G74660  ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 -------------------------------------------------SG---------
Os11g01283 -------------------------------------------------SG---------
Os12g01245 -------------------------------------------------SG---------
AT4G24660  ----------------R----PP------------------------A-AS---------
Lj1g000953 ----------------RPLALPP------------------------A-GS---------
Lj1g002084 ----------------RPLALPP------------------------A-TS---------
Os08g04794 ----------------RPLALPS------------------------T-SHSGRD--EGD
Os09g04664 ----------------RPLALPS------------------------T-SHSGRD--DGD
Pp3c1_1529 -----------------QLALLS------------------------P-AP---------
Pp3c2_2116 -----------------QLALPT------------------------P-AQ---------
Pp3c7_1500 -----------------PLALPS------------------------P-SQ---------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ----------------ATLALPP------------------------S-AG---------
Pp3c21_110 ----------------GALALPP------------------------S-AG---------
Pp3c19_204 ----------------TTLALTA------------------------SVAG---------
Pp3c22_960 ----------------STLALTV------------------------GGAG---------
Lj2g001509 ----------------GTLALPS------------------------I-SG---------
Lj4g002796 ----------------ATLALPI------------------------G-GG---------
AT5G65410  ----------------LQLALPP------------------------P-QR---------
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890  ----------------KPRVMHP--------------------------TG---------
AT2G18350  ----------------SPIMMSF------------------------G-GG---------
Lj2g001028 ----------------PPLMMAF------------------------G-GA---------
Lj4g002150 ----------------PGLMMGF------------------------G-GG---------
AT1G75240  --------------MLNPLMLPPPPNYQPIH-------HHKYGMSPPGGGG---------
Os01g06355 -----------------ALLLPP------------------------A-AAASKQ-----
Os05g05793 ------------------LLMPP------------------------P-QPQPHPHPQHP
AT1G69600  -TSA----------PPHHVILSL------------------------S-SGFP-------
AT5G60480  -PLA----------PVPNLLLSL------------------------S-SGFF-------
AT3G28920  -----------------YMLLAL------------------------SG----NN-----
AT5G39760  -----------------YMLLSL------------------------SGTNNNNN-----
AT5G15210  -----------------YMLLAL------------------------SGGRGGAN-----
Lj1g000619 YPSA------------PHMLLAL------------------------S-TGLA-------
Lj2g000448 YPSA------------PHMLLAL------------------------SGAGLP-------
Lj4g001229 PPSA---------SAVPQMLLAL-------------------------GTAFS-------
Lj2g002469 PPSHHHLHSQISKSAPPHVQLAL-------------------------GT----------
Os03g07185 --------------HMAAMGGAV------------------------GGGGGV-------
Lj2g000198 --------------RDLSELTEY------------------------SGGG---------
Os04g04345 ----------------PQLALPY-------HAVPAAAWHHAAA----AAAG------R--
Os08g04384 LPSA------------THMLLSL------------------------GSAS---------
Os02g07066 -------------SRLMPLLLAPP------H-------MQK---RP--------------
Os06g03372 YPSA------------QHMLLSL------------------------GSSG------Q--
AT1G14687  -------------TSLRRV--------------------------------------K--
AT5G42780  -------------CQFRRIFCSP------------------------YGGG------K--
Os12g02089 ----------------TRMVMPL-------------------------------------
Pp3c11_223 -------------NGAISLLVPP------------------------S-PGRQFR--ED-
Pp3c5_860  -------------H---------------------------------------RR--RA-
Pp3c5_889  -------------H---------------------------------------RR--RA-
Pp3c6_2830 -------------N---------------------------------------RR--RS-
                                                                       

AT1G14440  -----------YMHNNSESEDFM-EEDGVTT-------------ASRSLPNLP-------
AT2G02540  ------------AGSNSESEDLM-EEEGGGS-------------LTFRQPPPPPSPYSYG
Lj4g002049 -----------NIGSLLPSESDE-QEDVAAG------------GMVGRPAGQN-------
Lj5g000796 ----------NKSGSISPSDQSD-EKDCDYG--------------IKRVENPK-------
Os11g02433 -----------NMIHTSESDEMD-VSGGGGG--------------VGRGGGSS-------
AT1G18835  ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917  ------------------------------------------------------------
AT1G74660  ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660  ---------------G----DEE-DTSNPSS----------S------------------
Lj1g000953 ---------------GGFRREEE-DMSNPSS----------S----------G-------
Lj1g002084 --------------GGGFSREEE-DMSNPSS----------SGGG------GG-------
Os08g04794 DMSGMVGP----M-VIGPMVGMSLGSAGPSG--------------------SG-------
Os09g04664 DLSGMVGP----MSAVGPLSGMSLG-AGPSG--------------------SG-------
Pp3c1_1529 -IMGRVTP-APYMLAHGLVDSDD-GDGGLSG--------------------SP-------
Pp3c2_2116 -IISRVTA-APFLLGPGPTDSDD-GDGGLSG--------------------SP-------
Pp3c7_1500 -MISPVNQFQHYLLGPRPANSGD-GDGGFGR--------------------SP-------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 -VMTPLTMAA--LSTGGPTDSDE-QDDGLGN----------SGGGMMMSMRSP-------
Pp3c21_110 -AMTPLTTAA--LSAGGLTDSDE-QDDGLGN----------SAGGMMISMRSP-------
Pp3c19_204 -QMTPLAMAA--LSAGGPTDSDE-QDDGPGNV--------TSGGGMMMSMRSP-------
Pp3c22_960 -QMSPLAMAA--LSAGGPTDSDE-QDDGPGNV--------TSGGGMMMSMRSP-------
Lj2g001509 --------------GGGPRE--D-QED-ISNL--------SAGGGS--------------
Lj4g002796 --------------PQGARELMD-HDD-VSDR--------SGGGGEG----SN-------
AT5G65410  ------------------ERSED-PMETSSA-------------------EAG-------
Os08g04381 ----------------------EWRDCSPES-----------------------------
Os09g04145 -------------------------DTSSSS-----------------------------
AT3G50890  -----------EIGRRTSSSSED-MKKILSHRNQN----VDGKSLMMMM-----------
AT2G18350  ---------------GGCAAESS-TEDLNKFH--------QSFSGYGVDQFHH-------
Lj2g001028 ------------------AAESS-SEDLNMF---------QRNAGAQ---EAA-------
Lj4g002150 --------------SGGPAAESS-SEDLNMFH--------QSNDGGQLSVQPP-------
AT1G75240  -MVTPMSVA--YGGGGGGAESSS-EDLNLYGQSSG----EGAGAAAGQMAFSM-------
Os01g06355 --GLPFPGYGTPSGGTGTTTASS-SDERLRP--------------------SP-------
Os05g05793 YLHPPFPYHHTPSGSGGTTTESS-SEERGPP--------------------SS-------
AT1G69600  ----------------GPSDQDP-----------------T---VVRSENSSR-------
AT5G60480  ----------------GPSDQEV-----------------KNKFTVERDVRKT-------
AT3G28920  -KTAPFSDL-----NFAAAAN---------------------------HLSAT-------
AT5G39760  -NLASFSDL-----NFSAGNNHH-----------------H------HHQHTL-------
AT5G15210  --------------------------------------------------TAV-------
Lj1g000619 ----------------APPENAA-----------------G------PT-----------
Lj2g000448 ----------------VPPENTA-----------------A------PTQTPG-------
Lj4g001229 ----------------TPPEN---------------------------------------
Lj2g002469 -------------------EN---------------------------------------
Os03g07185 ------------------------------------------------DGGGG-------
Lj2g000198 -------------------------------------------------GSGG-------
Os04g04345 --------------AGSETPPRM-DDFGPGS------------AGGSGSGGGG-------
Os08g04384 ----------------APAVAASRPHAAAAAM-----------GPPPPPGAAT-------
Os02g07066 -PVLPVSPA---SAPAALAESSS-EELRPPPLPSSHPHAHAAAVVAASASAPP-------
Os06g03372 ------------------------------------------------------------
AT1G14687  -----------QLARLKWKTAEE-------------RNEEEEDDTEETSTEEK-------
AT5G42780  -----------SEGKKKKKEKES-------------YGGDPIIKDRFGGAEEE-------
Os12g02089 -----------SAMHTSESDDAA------------------------ARPGGG-------
Pp3c11_223 -----------A-IGGSPTLIIS-ESCGIRNCN------LGMPRKLPLERVSH-------
Pp3c5_860  -----------Q-LQLSPSHLHI-QSNLLQVDR------ISAPNG-QAQNGSH-------
Pp3c5_889  -----------Q-LQLSPSHLHI-QSNLLQVDR------ISAPNG-QAQNGSH-------
Pp3c6_2830 -----------QSLQFN-------HNNLQQADR------TSAPNW-NAQQGSH-------
                                                                       

AT1G14440  YNQ-----KKRFRTKFTPEQKEKMLSFAEKVGW----KIQRQE------DCVVQRFCEEI
AT2G02540  HNQ-----KKRFRTKFTQEQKEKMISFAERVGW----KIQRQE------ESVVQQLCQEI
Lj4g002049 QIV-----KKRFRTKFTQEQKEKMLNFADKVGW----KIQKQE------ESAVQQFCQEI
Lj5g000796 ENV-----KKRSRTKFTQEQKEKLLGFAEKAGW----RIQKLE------ESVVHKFCQEV
Os11g02433 SSS-----KKRFRTKFTAEQKARMLEFAERVGW----RLQKLD------DAMVHHFCQEI
AT1G18835  ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917  ------------------------------------------------------------
AT1G74660  ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660  GGT-----TKRFRTKFTAEQKEKMLAFAERLGW----RIQKHD------DVAVEQFCAET
Lj1g000953 GGT-----KKRHRTRFTPEQKDKMLEFAERVGW----RIQKHD------EAAVEQFCEEA
Lj1g002084 GGM-----KKRYRTKFTPEQKEKMLAFAEELGW----RIQKHQ------EAAVEQFCAET
Os08g04794 SG------KKRFRTKFTQEQKDKMLAFAERLGW----RIQKHD------EAAVQQFCEEV
Os09g04664 SG------KKRFRTKFTQEQKDKMLAFAERVGW----RIQKHD------EAAVQQFCDEV
Pp3c1_1529 STM-----KKRFRTKFTNEQKEKMGVFAEKLGW----KIQKHD------EAAVQEFCAEV
Pp3c2_2116 STI-----KKRFRTRFNNEQKEKMGVFAEKLGW----KIQKHD------EAAVQEFCAEV
Pp3c7_1500 STM-----KKRFRTKFTSNQREKMGAFSEKLGW----RIQKHD------EPAVQEFCSDV
Pp3c3_2260 ---------------HTQVQPKLFLSFSEKLGW----RIQKHD------EPAVQEFCSVV
Pp3c18_128 SAI-----KKRFRTKFTNEQKDQMCAFAEKVGW----RIQKHD------EASVQEFCATA
Pp3c21_110 SAI-----KKRFRTKFSTEQKDQMCAFAEELGW----RIQKHD------EAAVQEFCTTV
Pp3c19_204 SAI-----KKRFRTKFTTEQKDKMCAFAEKLGW----RIQKHD------EAAVQEFCTTV
Pp3c22_960 SAI-----KKRFRTKFTTGQKDKMFAFAEN-AWVAHPEARRGC------RAGVLLTCGSQ
Lj2g001509 --S-----KKRFRTKFTQEQRDKMLDLAERLGW----RMQKHD------EGVVQDFCNET
Lj4g002796 SRA-----GKRFRTKFTHEQKEKMLEFAESAGW----RIQRHD------DNVVEEFCNEI
AT5G65410  GGI-----RKRHRTKFTAEQKERMLALAERIGW----RIQRQD------DEVIQRFCQET
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890  MRK-----KKRVRTKINEEQKEKMKEFAERLGW----RMQKKD------EEEIDKFCRMV
AT2G18350  YQP-----KKRFRTKFNEEQKEKMMEFAEKIGW----RMTKLE------DDEVNRFCREI
Lj2g001028 ALS-----RKRHRTKFSSQQKDRMMEFAEKIGW----RIQKQD------EEEVQQFCSQV
Lj4g002150 LSS-----KKRFRTKFTQQQKDRMMEFAEKLGW----KIQKQD------EEEVKQFCSHV
AT1G75240  SSS-----KKRFRTKFTTDQKERMMDFAEKLGW----RMNKQD------EEELKRFCGEI
Os01g06355 VQP-----RRRSRTTFTREQKEQMLAFAERVGW----RIQRQE------EATVEHFCAQV
Os05g05793 SAAAAQGRRKRFRTKFTPEQKEQMLAFAERVGW----RMQKQD------EALVEQFCAQV
AT1G69600  GAM-----RKRTRTKFTPEQKIKMRAFAEKAGW----KINGCD------EKSVREFCNEV
AT5G60480  AMI-----KKHKRTKFTAEQKVKMRGFAERAGW----KINGWD------EKWVREFCSEV
AT3G28920  PGS-----RKRFRTKFSSNQKEKMHEFADRIGW----KIQKRD------EDEVRDFCREI
AT5G39760  HGS-----RKRFRTKFSQFQKEKMHEFAERVGW----KMQKRD------EDDVRDFCRQI
AT5G15210  PMS-----RKRFRTKFSQYQKEKMFEFSERVGW----RMPKAD------DVVVKEFCREI
Lj1g000619 SSS-----RKRFRTKFSQEQKEKMLKFAERVGW----KMQKKD------EDFVQDFCNEI
Lj2g000448 SDS-----RKRFRTKFTPGQKEKMLEFAERVGW----KMQKRD------EDLVMEFCNEV
Lj4g001229 PTA-----KKRQRTKFTMEQKEKMQSFSEKLGW----RMQK-D------DGLVQKFCNDI
Lj2g002469 PSK-----KKRCRTKFSEEQKGKMLEFSEKLGW----RMQREE------EGSIQKFCDGI
Os03g07185 SGG-----RRRTRTKFTEEQKARMLRFAERLGW----RMPKREPGRAPGDDEVARFCREI
Lj2g000198 EGR-----RKRYRTKFTVEQKEKMLGFAEKLGW----KLQRKEL-----EGEIEAFCRSV
Os04g04345 IFG-----RKRFRTKFTPEQKERMREFAEKQGW----RINRND------DGALDRFCVEI
Os08g04384 SAS-----RKRFRTKFSPEQKQRMQALSERLGW----RLQKRD------EAVVDECCREI
Os02g07066 GPS-----KKRFRTKFTAEQKERMREFAHRVGW----RIHKPD------AAAVDAFCAQV
Os06g03372 ---------------------------AQRLPL---------------------------
AT1G14687  MTV-----QRRRKSKFTAEQREAMKDYAAKLGW----TLKDKRAL----REEIRVFCEGI
AT5G42780  EGI-----VKRLKTKFTAEQTEKMRDYAEKLRW----KVRPER------QEEVEEFCVEI
Os12g02089 AAA-----RKRFRTKFTAEQKARMLGFAEEVGW----RLQKLE------DAVVQRFCQEV
Pp3c11_223 --K-----LKRTRTRISLEQREKLNAFAEKAGW----TVVGQR------KETIDATCQYI
Pp3c5_860  PGK-----PKRKRTQLTDEQREKMKSYAEHAGW----TIVGQR------KENIAAACKDI
Pp3c5_889  PGK-----PKRKRTQLTDEQREKMKSYAEHAGW----TIVGQR------KENIAAACKDI
Pp3c6_2830 PGG-----PEIKRKQFFDEQRRKMKAYAEHVGW----TNFGQR------KENIAAACKDI
                                                                       

AT1G14440  GVKRRVLKVWM--HNNKIHF---S----------KKNNINLEDNDNEK------------
AT2G02540  GIRRRVLKVWM--HNNKQNL---S----------KKSN----------------------
Lj4g002049 GVKRRVLKVWM--HNNKHNL---A----------KKNNLPTTPSQP--------------
Lj5g000796 GIKRRVLKVWM--HNNKNTF---S----------KR-KLSTT------------------
Os11g02433 GVKRRVLKVWM--HNNKHNL---A----------KK-PLPSSPPPP--------------
AT1G18835  ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917  ------------------------------------------------------------
AT1G74660  ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660  GVRRQVLKIWM--HNNKNSL---D----------GYQVFK--------------------
Lj1g000953 CIKRHVLKVWM--HNNKHTL---G----------KKP-----------------------
Lj1g002084 CVRRNVLKVWM--HNNKNTL---G----------KKP-----------------------
Os08g04794 CVKRHVLKVWM--HNNKHTL---G----------KKAP----------------------
Os09g04664 GVKRHVLKVWM--HNNKHTL---G----------KKLP----------------------
Pp3c1_1529 GVKRHVLKVWM--HNNKNTI---G----------KKPA----------------------
Pp3c2_2116 GVKRHVLKVWM--HNNKHTI---G----------KKPP----------------------
Pp3c7_1500 GVKRHVLKVWM--HNNKNTL---G----------KKVD----------------------
Pp3c3_2260 GVKRHVLKVWM--HNNKNTL---G----------KKITC---------------------
Pp3c18_128 GIKRHVLKVWM--HNNKHTM---G----------KKPT----------------------
Pp3c21_110 GVKRHVLKVWM--HNNKHTV---G----------KKP-----------------------
Pp3c19_204 GVKRHVLKVWM--HNNKHTV---G----------KKP-----------------------
Pp3c22_960 AARSEGLDAQQQAHYGQEAI---NFLA------ETRR-----------------------
Lj2g001509 GVKRHVLKVWM--HNNKHTL---G----------KKP-----------------------
Lj4g002796 GVKRHVLKVWM--HNNKHTL---G----------KKP-----------------------
AT5G65410  GVPRQVLKVWL--HNNKHTL---G----------KSPSPLHHHQAP--------------
Os08g04381 -------------SSSASST---T------------------------------------
Os09g04145 -------------PSSSSSL---S------------------------------------
AT3G50890  NLRRQVFKVWM--HNNKQAM---K----------RNNSNI--------------------
AT2G18350  KVKRQVFKVWM--HNNKQAA---K----------KKD-L---------------------
Lj2g001028 GVRRKVFKVWM--HNNKQAM---K----------KLHQM---------------------
Lj4g002150 GVKRQAFKVWM--HNSKQAM---K----------KKQIM---------------------
AT1G75240  GVKRQVFKVWM--HNNKNNA---K----------KPPTPTT-------------------
Os01g06355 GVRRQALKVWM--HNNKHSF-------------KQKQQQENRQEQ---------------
Os05g05793 GVRRQVFKVWM--HNNKSSI---GSSSGGGSRRQPQEQQSQQQQQ---------------
AT1G69600  GIERGVLKVWM--HNNKYSLLNGK----------IREI----------------------
AT5G60480  GIERKVLKVWI--HNNKY-FNNGR----------SRDT----------------------
AT3G28920  GVDKGVLKVWM--HNNKNSF---K----------FSGGGATTVQRNDNGIGGE-------
AT5G39760  GVDKSVLKVWM--HNNKNTF---N----------RRDIAGNEIRQIDNGGGNHTPILAGE
AT5G15210  GVDKSVFKVWM--HNNKIS--------------------GRSGARRANGGVV--------
Lj1g000619 GIERGVLKVWM--HNNKNTF---G----------KRDG-----SNEINNNNINNN-----
Lj2g000448 GVDRGVLKVWM--HNNKNTF---G----------KRDH-----AAANGGAGGGDD-----
Lj4g001229 GVSRGVFKVWM--HNNKNSF---R----------RRSQ--------DQGDAPPPP--PQT
Lj2g002469 GVSREVFKVWM--HNNKSR-----------------------------------------
Os03g07185 GVNRQVFKVWM--HNHKAGG---G----------GGGGG-------SGGPG---------
Lj2g000198 GVSRQVFKVWM--HNHKNSC---S----------SNASA-------SNG-----------
Os04g04345 GVKRHVLKVWM--HNHKNQL---A----------SSPTSAAAAAAGVMNPGAG-------
Os08g04384 GVGKGVFKVWM--HNNKHNFL--G----GHSA--RRSAAAAAAAP---------------
Os02g07066 GVSRRVLKVWM--HNNKHLA---K----------TPPSPTSQPPPPPLHHDPSPP---PP
Os06g03372 ------------------------------------------------------------
AT1G14687  GVTRYHFKTWV--NNNKK------------------------------------------
AT5G42780  GVNRKNFRIWM--NNHKDKI---I------------------------------------
Os12g02089 GVKRRVLKVWM--HNNKHTL---A----------RRHLHPSPAAAAGDDDDDGAP-----
Pp3c11_223 GIEPKTLKYWI--HNSKQKW---K----------RQPSLSEDPSK---------------
Pp3c5_860  GVTPKTLKYWI--HNAKQKL---K----------R----SHDQAL---------------
Pp3c5_889  GVTPKTLKYWI--HNAKQKL---K----------R----SHDQAL---------------
Pp3c6_2830 GVTPKTLKYWI--HNAKQKL---K----------R----SHEQPLQHSHTAHTQT-----
                                                                       

AT1G14440  ------------------------------------------------------------
AT2G02540  ------------------------------------------------------------
Lj4g002049 ------------------------------------------------------------
Lj5g000796 ------------------------------------------------------------
Os11g02433 ------------------------------------------------------------
AT1G18835  ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917  ------------------------------------------------------------
AT1G74660  ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660  ------------------------------------------------------------
Lj1g000953 ------------------------------------------------------------
Lj1g002084 ------------------------------------------------------------
Os08g04794 ------------------------------------------------------------
Os09g04664 ------------------------------------------------------------
Pp3c1_1529 ------------------------------------------------------------
Pp3c2_2116 ------------------------------------------------------------
Pp3c7_1500 ------------------------------------------------------------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ------------------------------------------------------------
Pp3c21_110 ------------------------------------------------------------
Pp3c19_204 ------------------------------------------------------------
Pp3c22_960 ------------------------------------------------------------
Lj2g001509 ------------------------------------------------------------
Lj4g002796 ------------------------------------------------------------
AT5G65410  ------------------------------------------------------------
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890  ------------------------------------------------------------
AT2G18350  ------------------------------------------------------------
Lj2g001028 ------------------------------------------------------------
Lj4g002150 ------------------------------------------------------------
AT1G75240  ------------------------------------------------------------
Os01g06355 ------------------------------------------------------------
Os05g05793 ------------------------------------------------------------
AT1G69600  ------------------------------------------------------------
AT5G60480  ------------------------------------------------------------
AT3G28920  -NSNDDGVRG---------------------LA-NDGD----------------------
AT5G39760  INNHNNGHHG---------------------VG-GGGELHQSVSS---------------
AT5G15210  -------------------------------VG-GVGDSRQSVVP---------------
Lj1g000619 -INNAGSAKSFFAKENHDPVITN--ISSASEIN-GNGNRNHGAED---------------
Lj2g000448 -DGAGGGERA---------------------IN-GNGNGSAASQD---------------
Lj4g001229 EKNH--------------------------------------------------------
Lj2g002469 SSSEIGNEKK---------------------IN-GGGY----------------------
Os03g07185 -------------------------------AG-GGAQTSSSTTRG--------------
Lj2g000198 --------------------------------------NASSLTQ---------------
Os04g04345 -------------------------------IGLGTG-----------------------
Os08g04384 --------------LAPPPVL---------------------------------------
Os02g07066 PHHHHHHHHHH-------------------------------------------------
Os06g03372 ------------------------------------------------------------
AT1G14687  ------------------------------------------------------------
AT5G42780  ------------------------------------------------------------
Os12g02089 -------------PPHPDPRR--------RELAAAAAPPPAPVTQ---------------
Pp3c11_223 ------------------------------------------------------------
Pp3c5_860  ------------------------------------------------------------
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 ----------------------DWKFRVLSKIVEGVGTKYHTCSCCNLQRMFHSKNIHLG
                                                                       

AT1G14440  ------------------------------------------------INNLNNVDLS-G
AT2G02540  -------------------------------------------------NVSNNVDLSAG
Lj4g002049 ------------------------------------------------------------
Lj5g000796 ------------------------------------------------------------
Os11g02433 -------------------------------------------------PQIPPMSMP-P
AT1G18835  ------------------------------------------------------------
Lj1g001161 ------------------------------------------------------------
AT3G28917  ------------------------------------------------------------
AT1G74660  ------------------------------------------------------------
Lj2g002499 ------------------------------------------------------------
Lj4g000050 ------------------------------------------------------------
Lj1g001230 ------------------------------------------------------------
Os11g01283 ------------------------------------------------------------
Os12g01245 ------------------------------------------------------------
AT4G24660  ------------------------------------------------------------
Lj1g000953 ------------------------------------------------------------
Lj1g002084 ------------------------------------------------------------
Os08g04794 ------------------------------------------------------------
Os09g04664 ------------------------------------------------------------
Pp3c1_1529 ------------------------------------------------------------
Pp3c2_2116 ------------------------------------------------------------
Pp3c7_1500 ------------------------------------------------------------
Pp3c3_2260 ------------------------------------------------------------
Pp3c18_128 ------------------------------------------------------------
Pp3c21_110 ------------------------------------------------------------
Pp3c19_204 ------------------------------------------------------------
Pp3c22_960 ------------------------------------------------------------
Lj2g001509 ------------------------------------------------------------
Lj4g002796 ------------------------------------------------------------
AT5G65410  ---------------------------------------------------------PPP
Os08g04381 ------------------------------------------------------------
Os09g04145 ------------------------------------------------------------
AT3G50890  ------------------------------------------------------------
AT2G18350  ------------------------------------------------------------
Lj2g001028 ------------------------------------------------------------
Lj4g002150 ------------------------------------------------------------
AT1G75240  ------------------------------------------------------------
Os01g06355 ------------------------------------------------------------
Os05g05793 ------------------------------------------------------------
AT1G69600  ------------------------------------------EHGLCLNTHSNDGD----
AT5G60480  ------------------------------------------TSSMSLNLKL--------
AT3G28920  ------------------------------------------GGGGRFESDSGGAD----
AT5G39760  -----------------------------------------GGGGGGFDSDSGGAN----
AT5G15210  ------------------------------------------------------------
Lj1g000619 ------------------------------------------PIHHHFQNDGGA------
Lj2g000448 ------------------------------------------P--NQYENDSGT------
Lj4g001229 --------------------------------------------GGCFDSDINNNDIHMN
Lj2g002469 --------------------------------------------GFQLLSDINNPH----
Os03g07185 ----------------------------------------GGDVGVGLSPAMGGDG----
Lj2g000198 ------------------------------------------------------------
Os04g04345 -----------------------------------------LGTGISGDGDGDDDDTDDS
Os08g04384 ------------------------------------------------------------
Os02g07066 -----------------------------------------------------------H
Os06g03372 ------------------------------------------------------------
AT1G14687  ------------------------------------------------------------
AT5G42780  ------------------------------------------------------------
Os12g02089 -------------------------------------------------------HIKKS
Pp3c11_223 ------------------------------------------------------------
Pp3c5_860  ------------------------------------------------------------
Pp3c5_889  ------------------------------------------------------------
Pp3c6_2830 AVAKNCQAAVRHVFSHGQNACMLRTSAQELHDVPVPDGCKHPFLGSLITITLFKSNVATD
                                                                       

AT1G14440  NNDMTKIV------------------------P
AT2G02540  NNDITENLAST--------------------NP
Lj4g002049 --------------------------------S
Lj5g000796 ---------------------------------
Os11g02433 SPPPPQIPPMSMPPSPPPMPMPMPPSPPQLKLE
AT1G18835  --------------------------------N
Lj1g001161 ---------------------------------
AT3G28917  --------------------------------N
AT1G74660  --------------------------------M
Lj2g002499 --------------------------------Y
Lj4g000050 ---------------------------------
Lj1g001230 --------------------------------R
Os11g01283 ---------TG--------------------RR
Os12g01245 ---------TG--------------------RR
AT4G24660  -----RYEATS--------------------AH
Lj1g000953 ---------------------------------
Lj1g002084 ---------------------------------
Os08g04794 ---------------------------------
Os09g04664 ---------------------------------
Pp3c1_1529 ---------------------------------
Pp3c2_2116 ---------------------------------
Pp3c7_1500 ----------Q--------------------VE
Pp3c3_2260 ------HGNAG--------------------LS
Pp3c18_128 ---------------------------------
Pp3c21_110 ---------------------------------
Pp3c19_204 ---------------------------------
Pp3c22_960 ---------------------------------
Lj2g001509 ---------------------------------
Lj4g002796 ---------------------------------
AT5G65410  PPQSSFHHEQD--------------------QP
Os08g04381 -------------------------------AS
Os09g04145 -------------------------------SE
AT3G50890  -------------------------------SE
AT2G18350  ---------------------------------
Lj2g001028 ---------------------------------
Lj4g002150 ---------------------------------
AT1G75240  -------------------------------TL
Os01g06355 -------------------------------QQ
Os05g05793 -------------------------------QQ
AT1G69600  --------GSS--------------------SS
AT5G60480  ---------------------------------
AT3G28920  -GGGNVNASSS--------------------SS
AT5G39760  --GGNVNGSSS--------------------S-
AT5G15210  -----TNGSFS--------------------ST
Lj1g000619 --TVRANGSSS--------------------S-
Lj2g000448 --NGATNGSSS--------------------SS
Lj4g001229 QDHSTVNVHFS--------------------S-
Lj2g002469 ------SRNSS--------------------TD
Os03g07185 EDDEEVRGSEM--------------------CM
Lj2g000198 ---------------------------------
Os04g04345 PPRAAVSSPSPSPI-----------------SV
Os08g04384 -TDFSINGSPQ--------------------ST
Os02g07066 PPQHHQQQQQQ-------------------HDA
Os06g03372 ---------------------------------
AT1G14687  ------------------------------FYH
AT5G42780  ------------------------------IDE
Os12g02089 VDNKSLISSLAALHCIALLLF-------HQIDA
Pp3c11_223 ---------------------------------
Pp3c5_860  ---------------------------------
Pp3c5_889  ---------------------------------
Pp3c6_2830 SDALSKIHETS------------------RNQC

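Looking over this alignment, you can see that gaps are extremely frequent outside the zinc finger (CHCC3H2, CX2NHAX3GX4DGCXEFX8~15CX2CXCHRXFH) and homeobox regions. With this many gaps, it is unclear what is actually being compared. So let's trim the alignment down to just the region we want to compare.

On the MAFFT version 7 results page, choose Reformat -> Output Sequence format, select clustal, and download the result. If you used CLUSTAL W on Genome Net, download the .aln file instead.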

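③ Trim
You could trim by eye with reference to the literature, but that is laborious when there are many sequences, so I recommend using trimAl (https://vicfero.github.io/trimal/).
Following the instructions on the trimAl site, download the files via GitHub and install it.

Using trimAl, trim the downloaded file (readseq.txt) with the parameter -gt 0.95.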

(base) hanano@172 ~ % trimal -in readseq.txt -out demo-trimmed095.out -gt 0.95

As a result, the following file (demo-trimmed095.out) was written. It appears that the alignment has been trimmed down to the zinc finger region.

CLUSTAL W (1.8) multiple sequence alignment

AT1G14440        NHAAAMGGNATDGCGEFMPSGACSACNCHRNFHK
AT2G02540        NHAATMGGNAIDGCGEFMPSGACSVCNCHRNFHR
Lj4g0020493      NHAAGMGGNATDGCGEFMPSGACSACNCHRNFHK
Lj5g0007965      NHAAAIGGNATDGCCEFMPAGACSACNCHRNFHK
Os11g0243300     NHAAAIGGNATDGCGEFMPSGACSACGCHRNFHK
AT1G18835        NHAANIGGYAVDGCREFMASGACAACGCHRNFHR
Lj1g0011612      NHAASIGGYAVDGCREFMASAACAACGCHRNFHR
AT3G28917        NHAAAVGGYAVDGCREFMASRACAACGCHRSFHR
AT1G74660        NHAANIGGYAVDGCREFMAAGACAACGCHRNFHK
Lj2g0024999      NHAANVGGYAVDGCREFMASGSCAACGCHRNFHK
Lj4g0000502      NHAAYSGGYAVDGCREFMASAACAACGCHRNFHR
Lj1g0012309      NHAVNVGGYAVDGCREFMASGACAAYGCHRSFYK
Os11g0128300     NHAASIGGHAVDGCREFMASGACAACGCHRSFHR
Os12g0124500     NHAASIGGHAVDGCREFMASGACAACGCHQSFHR
AT4G24660        NHAVNIGGHAVDGCCEFMPSGACAACGCHRNFHK
Lj1g0009539      NHAVGIGGHAVDGCCEFLAAGACAACNCHRNFHK
Lj1g0020844      NHAVSFGGHAVDGCCEFMAAGA------------
Os08g0479400     NHAVGIGGHAVDGCGEFMASGACAACGCHRNFHK
Os09g0466400     NHAVGIGGHAVDGCGEFMAAGACAACNCHRNFHK
Pp3c1_15290      NHAITTGGYVVDGCGEFMPGGACAACDCHRNFHK
Pp3c2_21160      NHAISTGGYAVDGCGEFMPGGACAACDCHRNFHK
Pp3c7_15000      NHAIFSGGYAVDGCGEFMPSGSCAACDCHRNYHK
Pp3c18_12880     NHAASIGGHALDGCGEFMPGGACAACDCHRNFHR
Pp3c21_11010     NHAASIGGHALDGCGEFMPGGACAACDCHRNFHR
Pp3c19_20410     NHAAGMGGHAMDGCGEFMPGGACAACNCHRNFHR
Pp3c22_9600      NHAAGIGAHAIDGCGEFMPGGACAACNCHRNFHR
Lj2g0015097      NHAVGIGGHALDGCGEFMPAGACAACNCHRNFHR
Lj4g0027969      NHAVGMGGYALDGCLEFMAAGACAACDCHRNFHK
AT5G65410        NQAVNIGGHAVDGCGEFMPAGACAACGCHRNFHK
Os08g0438100     NHAARMGGHAVDGCREFLAEGACAACGCHRSFHR
Os09g0414500     NHAASTGGHAVDGCREFIA-AACAACGCHRSFHR
AT3G50890        NHAASTGGHVVDGCCEFMAGGACAACNCHRSFHK
AT2G18350        NHAASSGGHVVDGCGEFMSSGSCAACDCHRSFHK
Lj2g0010289      NHAASMGSHVVDGCGEFMPSGACAACECHRNFHK
Lj4g0021504      NHAARLGSHVTDGCGEFMPNGSCAACECHRNFHK
AT1G75240        NHAASVGGSVHDGCGEFMPSGACAACDCHRNFHK
Os01g0635550     NHAAASGGHVVDGCGEFMPASPCAACGCHRSFHR
Os05g0579300     NHAAAMGGHVVDGCREFMPGDACAACGCHRSFHK
AT1G69600        NHAANLGGHALDGCGEFMPSPSCAACGCHRNFHR
AT5G60480        NHAVSLGGHALDGCGEFTPKSSCDACGCHRNFHR
AT3G28920        NHAAAIGGHALDGCGEFMPSPSCAACGCHRNFHR
AT5G39760        NHAAALGGHALDGCGEFMPSPSCAACGCHRNFHR
AT5G15210        NHAAGIGGHALDGCGEFMPSPSCAACGCHRNFHR
Lj1g0006197      NHAANLGGHALDGCGEFMPAPSCAACGCHRNFHR
Lj2g0004482      NHVASLGGHALDGCGEFMPSPSCAACGCHRNFHR
Lj4g0012297      NHAASLGGHALDGCGEFMPSSSCAACGCHRNFHR
Lj2g0024695      NHAASLGAHALDGCGEFMPSASCAACGCHRNFHR
Os03g0718500     NHAAKLGTYANDGCCEYTPDDGCAACGCHRNFHK
Lj2g0001982      NHAATLGSYATDGCGEFTLDDGCAACGCHRNFHK
Os04g0434500     NHAAAMGGQAFDGCGEYMPASSCAACGCHRSFHR
Os08g0438400     NHAASLGGHAVDGCGEFMPSPSCAACGCHRNFHR
Os02g0706600     NHAARMGAHVLDGCGEFMSSPACAACGCHRSFHR
Os06g0337200     NHAASLGGHGAGRLRGVVVGGSCAACGCHCNFHW
AT1G14687        NHAAKLGSYAIDGCREYSQST-CVACGCHRSYHR
AT5G42780        NHAADIGTTAYDGCGEFVSSTSCAACGCHRNFHE
Os12g0208900     -------------------SGASPYLGLHHDHHQ
Pp3c11_22370     NTCVARGPSSVDRFTKFLSSGACPPCGCHRNFHR
Pp3c5_860        NQALDTANHCVDGCGEFMRRGACMACGCHRSYHR
Pp3c6_28300      NHALDGVNHCIDGCGEFMRRGACMACGCHRRYHR
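
As a rough mental model of what the -gt threshold does, here is a minimal Python sketch. It is not trimAl's actual implementation. It assumes that -gt is the minimum fraction of sequences that must have a residue (no gap) in a column for that column to be kept. The function name gap_threshold_filter and the toy sequences are hypothetical and are not taken from the data above.

```python
def gap_threshold_filter(alignment, gt):
    """Keep only columns where the fraction of gap-free sequences is >= gt.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    gt: value between 0 and 1, interpreted like trimAl's -gt (assumption).
    """
    names = list(alignment)
    n_seqs = len(names)
    n_cols = len(alignment[names[0]])
    kept = [
        col
        for col in range(n_cols)
        if sum(alignment[name][col] != "-" for name in names) / n_seqs >= gt
    ]
    return {name: "".join(alignment[name][c] for c in kept) for name in names}


# Toy alignment (hypothetical sequences, for illustration only).
toy = {
    "seqA": "MKC-NHA",
    "seqB": "-KCANHA",
    "seqC": "--C-NHA",
    "seqD": "MKC-NHA",
}

strict = gap_threshold_filter(toy, 0.95)  # only fully occupied columns survive
loose = gap_threshold_filter(toy, 0.70)   # a column with one gap in four is also kept
print(strict["seqA"], loose["seqA"])
```

In this toy case, lowering the threshold brings back one extra column. Relaxing -gt on real data has the same kind of effect: columns with a few more gaps are allowed through.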

The first C of CHCC3H2 is missing, so it may be worth loosening the condition a little. Let's try -gt 0.92.

(base) hanano@172 ~ % trimal -in readseq.txt -out demo-trimmed092.out -gt 0.92

Let's use this result to draw the phylogenetic tree.

CLUSTAL W (1.8) multiple sequence alignment

AT1G14440        KYKECLKNHAAAMGGNATDGCGEFMPSGALTCSACNCHRNFHK
AT2G02540        KYKECLKNHAATMGGNAIDGCGEFMPSGALTCSVCNCHRNFHR
Lj4g0020493      RYRECLKNHAAGMGGNATDGCGEFMPSGALNCSACNCHRNFHK
Lj5g0007965      SYKECLKNHAAAIGGNATDGCCEFMPAGALKCSACNCHRNFHK
Os11g0243300     KYRECLKNHAAAIGGNATDGCGEFMPSGALKCSACGCHRNFHK
AT1G18835        RYVECQKNHAANIGGYAVDGCREFMASGALTCAACGCHRNFHR
Lj1g0011612      RYGECQKNHAASIGGYAVDGCREFMASAALTCAACGCHRNFHR
AT3G28917        RYGECQKNHAAAVGGYAVDGCREFMASRALTCAACGCHRSFHR
AT1G74660        RYVECQKNHAANIGGYAVDGCREFMAAGALRCAACGCHRNFHK
Lj2g0024999      KYGECQKNHAANVGGYAVDGCREFMASGSLACAACGCHRNFHK
Lj4g0000502      -------NHAAYSGGYAVDGCREFMASAALTCAACGCHRNFHR
Lj1g0012309      RYGECQKNHAVNVGGYAVDGCREFMASGALTCAAYGCHRSFYK
Os11g0128300     RYRECQRNHAASIGGHAVDGCREFMASGALLCAACGCHRSFHR
Os12g0124500     RYRECQRNHAASIGGHAVDGCREFMASGALLCAACGCHQSFHR
AT4G24660        RYRECLKNHAVNIGGHAVDGCCEFMPSGALKCAACGCHRNFHK
Lj1g0009539      RYRECQKNHAVGIGGHAVDGCCEFLAAGAVICAACNCHRNFHK
Lj1g0020844      RYRECQKNHAVSFGGHAVDGCCEFMAAGA--------------
Os08g0479400     RYRECLKNHAVGIGGHAVDGCGEFMASGALRCAACGCHRNFHK
Os09g0466400     RYRECLKNHAVGIGGHAVDGCGEFMAAGALRCAACNCHRNFHK
Pp3c1_15290      RYRECNRNHAITTGGYVVDGCGEFMPGGALRCAACDCHRNFHK
Pp3c2_21160      RYRECNRNHAISTGGYAVDGCGEFMPGGALKCAACDCHRNFHK
Pp3c7_15000      SYKECNRNHAIFSGGYAVDGCGEFMPSGSLKCAACDCHRNYHK
Pp3c18_12880     RYRECQKNHAASIGGHALDGCGEFMPGGALRCAACDCHRNFHR
Pp3c21_11010     RYRECQKNHAASIGGHALDGCGEFMPGGALRCAACDCHRNFHR
Pp3c19_20410     RYRECQKNHAAGMGGHAMDGCGEFMPGGALRCAACNCHRNFHR
Pp3c22_9600      RYRECQKNHAAGIGAHAIDGCGEFMPGGALRCAACNCHRNFHR
Lj2g0015097      RYRECLKNHAVGIGGHALDGCGEFMPAGALKCAACNCHRNFHR
Lj4g0027969      KYRECLKNHAVGMGGYALDGCLEFMAAGALKCAACDCHRNFHK
AT5G65410        RFRECLKNQAVNIGGHAVDGCGEFMPAGALKCAACGCHRNFHK
Os08g0438100     RYGECRRNHAARMGGHAVDGCREFLAEGALRCAACGCHRSFHR
Os09g0414500     RYGECRRNHAASTGGHAVDGCREFIA-AALKCAACGCHRSFHR
AT3G50890        KYRECQKNHAASTGGHVVDGCCEFMAGGALKCAACNCHRSFHK
AT2G18350        RYRECQKNHAASSGGHVVDGCGEFMSSGSLLCAACDCHRSFHK
Lj2g0010289      RYRECLRNHAASMGSHVVDGCGEFMPSGALKCAACECHRNFHK
Lj4g0021504      RYRECLRNHAARLGSHVTDGCGEFMPNGSLICAACECHRNFHK
AT1G75240        RYRECLKNHAASVGGSVHDGCGEFMPSGALRCAACDCHRNFHK
Os01g0635550     RYHECLRNHAAASGGHVVDGCGEFMPASPLACAACGCHRSFHR
Os05g0579300     RYHECLRNHAAAMGGHVVDGCREFMPGDALKCAACGCHRSFHK
AT1G69600        CYKECLKNHAANLGGHALDGCGEFMPSPSLRCAACGCHRNFHR
AT5G60480        LYNECLKNHAVSLGGHALDGCGEFTPKSSLRCDACGCHRNFHR
AT3G28920        TYKECLKNHAAAIGGHALDGCGEFMPSPSLKCAACGCHRNFHR
AT5G39760        TYKECLKNHAAALGGHALDGCGEFMPSPSLKCAACGCHRNFHR
AT5G15210        TYKECLKNHAAGIGGHALDGCGEFMPSPSLTCAACGCHRNFHR
Lj1g0006197      AYKECLKNHAANLGGHALDGCGEFMPAPSLKCAACGCHRNFHR
Lj2g0004482      TYKECLKNHVASLGGHALDGCGEFMPSPSIKCAACGCHRNFHR
Lj4g0012297      SFKECLKNHAASLGGHALDGCGEFMPSSSLKCAACGCHRNFHR
Lj2g0024695      SYKECLRNHAASLGAHALDGCGEFMPSASLTCAACGCHRNFHR
Os03g0718500     VYRECMRNHAAKLGTYANDGCCEYTPDDGLLCAACGCHRNFHK
Lj2g0001982      LYRECLRNHAATLGSYATDGCGEFTLDDGLQCAACGCHRNFHK
Os04g0434500     KYKECMRNHAAAMGGQAFDGCGEYMPASSLKCAACGCHRSFHR
Os08g0438400     VYRECLKNHAASLGGHAVDGCGEFMPSPSLKCAACGCHRNFHR
Os02g0706600     RYRECLKNHAARMGAHVLDGCGEFMSSPALACAACGCHRSFHR
Os06g0337200     VYQECPKNHAASLGGHGAGRLRGVVVGGSLMCAACGCHCNFHW
AT1G14687        VYRECMRNHAAKLGSYAIDGCREYSQST---CVACGCHRSYHR
AT5G42780        HYYECRKNHAADIGTTAYDGCGEFVSSTSLNCAACGCHRNFHE
Os12g0208900     --------------------------SGAAASPYLGLHHDHHQ
Pp3c11_22370     ECNQCQKNTCVARGPSSVDRFTKFLSSGALTCPPCGCHRNFHR
Pp3c5_860        VYKECQKNQALDTANHCVDGCGEFMRRGALQCMACGCHRSYHR
Pp3c6_28300      VCKECQNNHALDGVNHCIDGCGEFMRRGALQCMACGCHRRYHR

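Next, rewrite this file in fasta format.
I save a copy under the name demo-trimmed0.92.fasta and do the following in a text editor (Atom-2):
    1. Delete the first line, CLUSTAL W (1.8) multiple sequence alignment.
    2. Replace every line break with a line break followed by ">".
    3. Replace each run of spaces with a line break.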
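
If you would rather not do the search-and-replace by hand, the same conversion can be sketched in Python. This is a minimal sketch under assumptions, not the method used in this article: it assumes each sequence line is "name, spaces, residues", that conservation lines start with whitespace, and that the header starts with CLUSTAL. The function name clustal_to_fasta is hypothetical.

```python
def clustal_to_fasta(clustal_text):
    """Convert Clustal-formatted alignment text into FASTA text."""
    sequences = {}  # name -> sequence; dicts preserve insertion order
    for line in clustal_text.splitlines():
        # Skip blank lines, the CLUSTAL header, and conservation lines
        # (assumed to start with whitespace).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, chunk = fields[0], fields[1]
        # Concatenate chunks so that multi-block alignments also work.
        sequences[name] = sequences.get(name, "") + chunk
    return "".join(f">{name}\n{seq}\n" for name, seq in sequences.items())


# First two sequences of the -gt 0.92 output shown above.
example = """CLUSTAL W (1.8) multiple sequence alignment

AT1G14440        KYKECLKNHAAAMGGNATDGCGEFMPSGALTCSACNCHRNFHK
AT2G02540        KYKECLKNHAATMGGNAIDGCGEFMPSGALTCSVCNCHRNFHR
"""

fasta_text = clustal_to_fasta(example)
print(fasta_text)
```

To convert the real file, you would read demo-trimmed092.out into a string, pass it to the function, and write the result to demo-trimmed0.92.fasta.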

This produced the following alignment file.

>AT1G14440
KYKECLKNHAAAMGGNATDGCGEFMPSGALTCSACNCHRNFHK
>AT2G02540
KYKECLKNHAATMGGNAIDGCGEFMPSGALTCSVCNCHRNFHR
>Lj4g0020493
RYRECLKNHAAGMGGNATDGCGEFMPSGALNCSACNCHRNFHK
>Lj5g0007965
SYKECLKNHAAAIGGNATDGCCEFMPAGALKCSACNCHRNFHK
>Os11g0243300
KYRECLKNHAAAIGGNATDGCGEFMPSGALKCSACGCHRNFHK
>AT1G18835
RYVECQKNHAANIGGYAVDGCREFMASGALTCAACGCHRNFHR
>Lj1g0011612
RYGECQKNHAASIGGYAVDGCREFMASAALTCAACGCHRNFHR
>AT3G28917
RYGECQKNHAAAVGGYAVDGCREFMASRALTCAACGCHRSFHR
>AT1G74660
RYVECQKNHAANIGGYAVDGCREFMAAGALRCAACGCHRNFHK
>Lj2g0024999
KYGECQKNHAANVGGYAVDGCREFMASGSLACAACGCHRNFHK
>Lj4g0000502
-------NHAAYSGGYAVDGCREFMASAALTCAACGCHRNFHR
>Lj1g0012309
RYGECQKNHAVNVGGYAVDGCREFMASGALTCAAYGCHRSFYK
>Os11g0128300
RYRECQRNHAASIGGHAVDGCREFMASGALLCAACGCHRSFHR
>Os12g0124500
RYRECQRNHAASIGGHAVDGCREFMASGALLCAACGCHQSFHR
>AT4G24660
RYRECLKNHAVNIGGHAVDGCCEFMPSGALKCAACGCHRNFHK
>Lj1g0009539
RYRECQKNHAVGIGGHAVDGCCEFLAAGAVICAACNCHRNFHK
>Lj1g0020844
RYRECQKNHAVSFGGHAVDGCCEFMAAGA--------------
>Os08g0479400
RYRECLKNHAVGIGGHAVDGCGEFMASGALRCAACGCHRNFHK
>Os09g0466400
RYRECLKNHAVGIGGHAVDGCGEFMAAGALRCAACNCHRNFHK
>Pp3c1_15290
RYRECNRNHAITTGGYVVDGCGEFMPGGALRCAACDCHRNFHK
>Pp3c2_21160
RYRECNRNHAISTGGYAVDGCGEFMPGGALKCAACDCHRNFHK
>Pp3c7_15000
SYKECNRNHAIFSGGYAVDGCGEFMPSGSLKCAACDCHRNYHK
>Pp3c18_12880
RYRECQKNHAASIGGHALDGCGEFMPGGALRCAACDCHRNFHR
>Pp3c21_11010
RYRECQKNHAASIGGHALDGCGEFMPGGALRCAACDCHRNFHR
>Pp3c19_20410
RYRECQKNHAAGMGGHAMDGCGEFMPGGALRCAACNCHRNFHR
>Pp3c22_9600
RYRECQKNHAAGIGAHAIDGCGEFMPGGALRCAACNCHRNFHR
>Lj2g0015097
RYRECLKNHAVGIGGHALDGCGEFMPAGALKCAACNCHRNFHR
>Lj4g0027969
KYRECLKNHAVGMGGYALDGCLEFMAAGALKCAACDCHRNFHK
>AT5G65410
RFRECLKNQAVNIGGHAVDGCGEFMPAGALKCAACGCHRNFHK
>Os08g0438100
RYGECRRNHAARMGGHAVDGCREFLAEGALRCAACGCHRSFHR
>Os09g0414500
RYGECRRNHAASTGGHAVDGCREFIA-AALKCAACGCHRSFHR
>AT3G50890
KYRECQKNHAASTGGHVVDGCCEFMAGGALKCAACNCHRSFHK
>AT2G18350
RYRECQKNHAASSGGHVVDGCGEFMSSGSLLCAACDCHRSFHK
>Lj2g0010289
RYRECLRNHAASMGSHVVDGCGEFMPSGALKCAACECHRNFHK
>Lj4g0021504
RYRECLRNHAARLGSHVTDGCGEFMPNGSLICAACECHRNFHK
>AT1G75240
RYRECLKNHAASVGGSVHDGCGEFMPSGALRCAACDCHRNFHK
>Os01g0635550
RYHECLRNHAAASGGHVVDGCGEFMPASPLACAACGCHRSFHR
>Os05g0579300
RYHECLRNHAAAMGGHVVDGCREFMPGDALKCAACGCHRSFHK
>AT1G69600
CYKECLKNHAANLGGHALDGCGEFMPSPSLRCAACGCHRNFHR
>AT5G60480
LYNECLKNHAVSLGGHALDGCGEFTPKSSLRCDACGCHRNFHR
>AT3G28920
TYKECLKNHAAAIGGHALDGCGEFMPSPSLKCAACGCHRNFHR
>AT5G39760
TYKECLKNHAAALGGHALDGCGEFMPSPSLKCAACGCHRNFHR
>AT5G15210
TYKECLKNHAAGIGGHALDGCGEFMPSPSLTCAACGCHRNFHR
>Lj1g0006197
AYKECLKNHAANLGGHALDGCGEFMPAPSLKCAACGCHRNFHR
>Lj2g0004482
TYKECLKNHVASLGGHALDGCGEFMPSPSIKCAACGCHRNFHR
>Lj4g0012297
SFKECLKNHAASLGGHALDGCGEFMPSSSLKCAACGCHRNFHR
>Lj2g0024695
SYKECLRNHAASLGAHALDGCGEFMPSASLTCAACGCHRNFHR
>Os03g0718500
VYRECMRNHAAKLGTYANDGCCEYTPDDGLLCAACGCHRNFHK
>Lj2g0001982
LYRECLRNHAATLGSYATDGCGEFTLDDGLQCAACGCHRNFHK
>Os04g0434500
KYKECMRNHAAAMGGQAFDGCGEYMPASSLKCAACGCHRSFHR
>Os08g0438400
VYRECLKNHAASLGGHAVDGCGEFMPSPSLKCAACGCHRNFHR
>Os02g0706600
RYRECLKNHAARMGAHVLDGCGEFMSSPALACAACGCHRSFHR
>Os06g0337200
VYQECPKNHAASLGGHGAGRLRGVVVGGSLMCAACGCHCNFHW
>AT1G14687
VYRECMRNHAAKLGSYAIDGCREYSQST---CVACGCHRSYHR
>AT5G42780
HYYECRKNHAADIGTTAYDGCGEFVSSTSLNCAACGCHRNFHE
>Os12g0208900
--------------------------SGAAASPYLGLHHDHHQ
>Pp3c11_22370
ECNQCQKNTCVARGPSSVDRFTKFLSSGALTCPPCGCHRNFHR
>Pp3c5_860
VYKECQKNQALDTANHCVDGCGEFMRRGALQCMACGCHRSYHR
>Pp3c6_28300
VCKECQNNHALDGVNHCIDGCGEFMRRGALQCMACGCHRRYHR

Use this file for the re-alignment.

④ Re-align

Upload demo-trimmed0.92.fasta to the MAFFT version 7 online server (https://mafft.cbrc.jp/alignment/server/), select G-INS-1 (Slow; progressive method with an accurate guide tree), and run the alignment.

Click Phylogenetic Tree to have it draw a tree.

This gives a phylogenetic tree based on the conserved zinc finger region.

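⑤ Draw a phylogenetic tree from the aligned sequences
The previous figure could serve as the finished product, but iTOL Visualize (https://itol.embl.de) produces a cleaner figure.

Copy the text that appeared on the left side of the previous window and paste it into a text file named demo.dnd. If you aligned with CLUSTAL W on Genome Net, download the clustalw.dnd file instead.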

((((((((((((((((
1_AT1G14440
:0.04988,
3_Lj4g0020493
:0.04988):0.01677,
5_Os11g0243300
:0.06665):0.02832,
4_Lj5g0007965
:0.09497):0.02779,
2_AT2G02540
:0.12276):0.05054,((((
15_AT4G24660
:0.06653,
29_AT5G65410
:0.06653):0.03174,((
18_Os08g0479400
:0.03087,
19_Os09g0466400
:0.03087):0.03448,
27_Lj2g0015097
:0.06535):0.03292):0.03277,((
23_Pp3c18_12880
:0.00000,
24_Pp3c21_11010
:0.00000):0.05124,(
25_Pp3c19_20410
:0.03264,
26_Pp3c22_9600
:0.03264):0.01860):0.07979):0.02577,((
34_Lj2g0010289
:0.08986,
35_Lj4g0021504
:0.08986):0.05512,
36_AT1G75240
:0.14498):0.01182):0.01650):0.01688,
28_Lj4g0027969
:0.19018):0.01166,((
20_Pp3c1_15290
:0.03264,
21_Pp3c2_21160
:0.03264):0.08093,
22_Pp3c7_15000
:0.11357):0.08827):0.03285,((
16_Lj1g0009539
:0.12082,
17_Lj1g0020844
:0.12082):0.06694,(
32_AT3G50890
:0.12443,
33_AT2G18350
:0.12443):0.06332):0.04693):0.04360,((((((
6_AT1G18835
:0.04287,
9_AT1G74660
:0.04287):0.03828,
10_Lj2g0024999
:0.08115):0.01458,(
7_Lj1g0011612
:0.05027,
8_AT3G28917
:0.05027):0.04546):0.02993,
12_Lj1g0012309
:0.12565):0.04811,((
13_Os11g0128300
:0.01673,
14_Os12g0124500
:0.01673):0.12467,(
30_Os08g0438100
:0.10497,
31_Os09g0414500
:0.10497):0.03643):0.03237):0.02601,
11_Lj4g0000502
:0.19978):0.07851):0.00752,((((((((
39_AT1G69600
:0.03520,
44_Lj1g0006197
:0.03520):0.02486,
51_Os08g0438400
:0.06006):0.00256,(((
41_AT3G28920
:0.00941,
42_AT5G39760
:0.00941):0.02821,
43_AT5G15210
:0.03762):0.01414,
45_Lj2g0004482
:0.05176):0.01086):0.00285,
46_Lj4g0012297
:0.06548):0.01585,
47_Lj2g0024695
:0.08132):0.07181,
50_Os04g0434500
:0.15313):0.03766,(
52_Os02g0706600
:0.18408,(
37_Os01g0635550
:0.12848,
38_Os05g0579300
:0.12848):0.05560):0.00671):0.03615,
40_AT5G60480
:0.22695):0.05886):0.04077,((
48_Os03g0718500
:0.15402,
49_Lj2g0001982
:0.15402):0.10005,
54_AT1G14687
:0.25407):0.07251):0.01626,
55_AT5G42780
:0.34285):0.07269,(
58_Pp3c5_860
:0.12092,
59_Pp3c6_28300
:0.12092):0.29462):0.11255,
57_Pp3c11_22370
:0.52809):0.04017,
53_Os06g0337200
:0.56826):0.18396,
56_Os12g0208900
:0.75222);
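
Before uploading, it can be useful to check that the pasted text really contains every sequence. The following Python sketch is an illustration only: it joins the lines of the Newick text and pulls out the leaf labels with a regular expression. It assumes that every leaf is followed by a branch length, as in the text above. The helper names leaf_labels and strip_index are hypothetical, and the excerpt is a small subtree copied from the tree above with a closing semicolon added.

```python
import re


def leaf_labels(newick_text):
    """Return the leaf labels of a Newick string, ignoring line breaks."""
    flat = "".join(newick_text.split())
    # A leaf label follows "(" or "," and is followed by ":<branch length>".
    return re.findall(r"(?<=[(,])[^():,;]+(?=:)", flat)


def strip_index(label):
    """Remove the leading number and underscore (e.g. '23_') from a label."""
    return re.sub(r"^\d+_", "", label)


# Subtree copied from the tree text above (semicolon added to close it).
excerpt = """((
23_Pp3c18_12880
:0.00000,
24_Pp3c21_11010
:0.00000):0.05124,(
25_Pp3c19_20410
:0.03264,
26_Pp3c22_9600
:0.03264):0.01860);"""

labels = leaf_labels(excerpt)
gene_ids = [strip_index(label) for label in labels]
print(len(labels), gene_ids)
```

Running leaf_labels on the whole of demo.dnd and comparing the count with the number of sequences in the fasta file is a quick sanity check.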

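Uploading this file with Upload Tree in iTOL Visualize produces a figure like the one below.

On this site you can change line widths and colors, add color highlights, switch the tree type, and export a high-resolution file, all in the web browser.

For example, I exported an unrooted tree with only ATHB25 (AT5G65410) colored.

In the tree based on the zinc finger region, ATHB25 (AT5G65410) is closest to ATHB22 (AT4G24660). In addition, the clades at the bottom and on the left of the figure contain Physcomitrella patens orthologs, whereas the clades on the right and at the top do not. This suggests, among other things, that the zinc finger regions of those groups evolved independently after bryophytes and ferns diverged (or were lost in bryophytes).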
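
The "closest sequence" reading can also be checked directly from the Newick text rather than by eye. The sketch below is a minimal, hypothetical parser and not a full Newick implementation: it assumes well-formed input with unquoted labels, and it discards branch lengths. The function names parse_newick, leaves and sister_leaves are made up for this example. The excerpt is the subtree around AT5G65410 copied from demo.dnd, with a semicolon added.

```python
def parse_newick(text):
    """Parse Newick text into nested lists of leaf names (lengths dropped)."""
    s = "".join(text.split()).rstrip(";")
    pos = 0

    def skip_length():
        nonlocal pos
        if pos < len(s) and s[pos] == ":":
            pos += 1
            while pos < len(s) and s[pos] not in ",()":
                pos += 1

    def node():
        nonlocal pos
        if s[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while s[pos] == ",":
                pos += 1
                children.append(node())
            pos += 1  # consume ")"
            while pos < len(s) and s[pos] not in ":,()":
                pos += 1  # skip an optional internal node label
            skip_length()
            return children
        start = pos
        while pos < len(s) and s[pos] not in ":,()":
            pos += 1
        name = s[start:pos]
        skip_length()
        return name

    return node()


def leaves(tree):
    """Flatten a parsed tree into a list of leaf names."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def sister_leaves(tree, target):
    """Return the leaves in the sibling subtree(s) of target, or None."""
    if isinstance(tree, str):
        return None
    if target in tree:  # target is a direct child of this node
        return [leaf for child in tree if child != target for leaf in leaves(child)]
    for child in tree:
        found = sister_leaves(child, target)
        if found is not None:
            return found
    return None


# Subtree copied from demo.dnd above (semicolon added to close it).
excerpt = (
    "((15_AT4G24660:0.06653,29_AT5G65410:0.06653):0.03174,"
    "((18_Os08g0479400:0.03087,19_Os09g0466400:0.03087):0.03448,"
    "27_Lj2g0015097:0.06535):0.03292);"
)

tree = parse_newick(excerpt)
print(sister_leaves(tree, "29_AT5G65410"))
```

For this excerpt, the sister of 29_AT5G65410 (ATHB25) is 15_AT4G24660 (ATHB22), which agrees with the reading of the figure.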

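The workflow and links are summarized below.

① Prepare a fasta-format file.
For comparisons across plant species, PLAZA .0 is recommended.

② Align.
MAFFT version 7 (https://mafft.cbrc.jp/alignment/server/)

Alternatively, Clustal W on Genome Net (genome.jp) (https://www.genome.jp/tools-bin/clustalw)

③ Trim.
trimAl (https://vicfero.github.io/trimal/)

④ Re-align.
Same as the alignment in ②.

⑤ Draw a phylogenetic tree from the aligned sequences.
iTOL Visualize (https://itol.embl.de)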

