UNIVERSIDADE FEDERAL DE PERNAMBUCO
CENTER FOR BIOLOGICAL SCIENCES
DEPARTMENT OF GENETICS
GRADUATE PROGRAM IN GENETICS AND MOLECULAR BIOLOGY

MASTER'S DISSERTATION

COMPUTATIONAL ANALYSIS OF GENES ASSOCIATED WITH NITROGEN FIXATION METABOLISM IN COWPEA (Vigna unguiculata) AND SUGARCANE (Saccharum spp.)

GABRIELA SOUTO VIEIRA DE MELLO

Dissertation presented to the Graduate Program in Genetics and Molecular Biology of the Universidade Federal de Pernambuco as a requirement for the degree of Master in Genetics from UFPE.

Advisor: Profª. Drª. Ana Maria Benko-Iseppon
Co-advisor: Prof. Dr. Tercílio Calsa Júnior

RECIFE 2009

Catalog record: Mello, Gabriela Souto Vieira de. Análise computacional de genes associados ao metabolismo de fixação de nitrogênio no feijão-caupi (Vigna unguiculata) e cana-de-açúcar (Saccharum spp.) / Gabriela Souto Vieira de Mello. Recife: O Autor, 2009. 160 leaves: ill., fig., tab. Dissertation (master's), Universidade Federal de Pernambuco. CCB. Genética, 2009. Includes bibliography and appendix. 1. Molecular Genetics 2. Bioinformatics 3. Cowpea 4. Sugarcane I. Title. 577.21 CDU (2.ed.) UFPE; 572.8 CDD (22.ed.) CCB - 2009-142

“Change your opinions, keep to your principles; change your leaves, keep intact your roots.” Victor Hugo

Acknowledgments

To God, for everything and for having placed so many wonderful people in my life.
To my mother, Uitamira, for her moral and financial support, for being present at every moment of my life, for her constant encouragement and for keeping our family united; together with my brother João, she gave me the strength to pursue everything I have always wanted. To my father, Ricardo José (in memoriam), for his example of life and spiritual strength. To my grandmother Irene, for the wonderful welcome in her home. To my godparents Roberto Vieira de Mello and Maristela Ferraz, for always believing in me. To all my relatives: each of you, in your own way, lit my path. To my sister at heart, Petra, who taught me everything with enormous patience, for always being at my side in the worst hours, for the laughter and for the wonderful moments of relaxation. For EVERYTHING; without her this work would never have been done. To my friends Marcela Randau, Mirella Soares and Moacyr Barreto, for filling my graduate studies with happy, relaxed moments that I will carry in my memory forever. To Túlio, for his unconditional support, for always helping me with a word of comfort and an idea, for making me laugh so much and, above all, for his endless patience. To my advisor, Profa. Ana Maria Benko-Iseppon, for the opportunity and for teaching me to believe more in myself. To my co-advisor, Tercílio Calsa Júnior, for his teaching. To all my friends and fellow members of the Laboratório de Genética e Biotecnologia Vegetal, who made the course of this work more pleasant. To Carol and Luís for their teaching, and especially to Nina for her countless acts of help and for always supporting me. To the professors of the Graduate Program in Genetics. To CAPES, for the scholarship granted during the development of the project.

CONTENTS

LIST OF ABBREVIATIONS
LIST OF FIGURES
LIST OF TABLES
RESUMO
ABSTRACT
INTRODUCTION
CHAPTER 1. Literature Review
1.1. Plant-Microorganism Interactions
1.2. Biological Nitrogen Fixation (BNF)
1.2.1. Economic and Environmental Importance
1.2.2. Mechanisms of BNF in Angiosperms
1.2.3. BNF in Vigna unguiculata
1.2.4. BNF in Saccharum sp.
1.2.5. Genetic Aspects of BNF
1.2.6. Main Early Nodulins
1.2.7. Main Late Nodulins
1.3. The Cowpea
1.3.1. Economic Importance
1.3.2. Origin and Geographic Distribution
1.3.3. Cowpea Breeding
1.3.4. Botanical and Genetic Aspects
1.3.5. The HarvEST, NordEST and CGKB Projects
1.4. Sugarcane
1.4.1. Economic Importance
1.4.2. Origin and Geographic Distribution
1.4.3. Sugarcane Breeding
1.4.4. Botanical and Genetic Aspects
1.4.5. The SUCEST Project
1.5. Bioinformatic Analysis
1.5.1. Retrospective and Current Applications
1.5.2. Databases, Tools and Programs
2. References
CHAPTER 2. Computational Analysis of Genes Associated with Symbiotic Nitrogen Fixation in the Cowpea (Vigna unguiculata) Transcriptome (article to be submitted to the journal Genetics and Molecular Research)
CHAPTER 3. Expression of Nodulins in Sugarcane Transcriptome Revealed by Computational Analysis (article to be submitted to the journal Genetics and Molecular Research)
GENERAL CONCLUSIONS
APPENDIX

LIST OF ABBREVIATIONS

APC: Anaphase-Promoting Complex
ATP: Adenosine Triphosphate
BLAST: Basic Local Alignment Search Tool
CCaMK: Calcium/Calmodulin-Dependent Protein Kinase-like
CCS52: Cell Cycle Switch Protein
CD: Conserved Domain
CDK: Cyclin-Dependent Kinase
cDNA: Complementary Deoxyribonucleic Acid
DDBJ: DNA Database of Japan
DEFH125: Deficient Homolog 125
DMI: Does Not Make Infection
DMT: Divalent Metal Transporter
EMBL: European Molecular Biology Laboratory
EMBRAPA: Empresa Brasileira de Pesquisa Agropecuária (Brazilian Agricultural Research Corporation)
ENOD: Early Nodulin
EST: Expressed Sequence Tag
FAPESP: Fundação de Amparo à Pesquisa do Estado de São Paulo (São Paulo Research Foundation)
BNF (FBN): Biological Nitrogen Fixation (Fixação Biológica de Nitrogênio)
GenBank: NCBI gene database
GOGAT: Glutamate Synthase
GS: Glutamine Synthetase
IITA: International Institute of Tropical Agriculture
KEGG: Kyoto Encyclopedia of Genes and Genomes
LRR: Leucine-Rich Repeats
LysM: Lysin Motif
MADS-box: Box of the proteins MCM1 from Saccharomyces cerevisiae, AGAMOUS from Arabidopsis thaliana, DEFICIENS from Antirrhinum majus and SRF from Homo sapiens
MEGA: Molecular Evolutionary Genetics Analysis
MFS: Major Facilitator Superfamily
MIP: Major Intrinsic Protein
MS: Symbiosome Membrane (Membrana do Simbiossomo)
MtAnn: Annexin from Medicago truncatula
N: Nitrogen
NASA: National Aeronautics and Space Administration
NCBI: National Center for Biotechnology Information
NFP: Nod Factor Perception
NFR: Nod Factor Receptor
NH4+: Ammonium ion
NIN: Nodule Inception Protein
NJ: Neighbor Joining
NO3-: Nitrate ion
NOD: Nodulin
NORDEST: Rede Nordeste de Biotecnologia (Northeast Biotechnology Network)
NORK: Nodulation Receptor Kinase
Nramp: Natural Resistance-Associated Macrophage Protein
NSP: Nodulation Signaling Pathway
ONSA: Organization for Nucleotide Sequencing and Analysis
ORF: Open Reading Frame
PDB: Protein Data Bank
PIR: Protein Information Resource
RNA: Ribonucleic Acid
SAGE: Serial Analysis of Gene Expression
SUCEST: Sugarcane EST Project
SYMRK: Symbiosis Receptor-Like Kinase
UPGMA: Unweighted Pair Group Method with Arithmetic Mean

LIST OF FIGURES

CHAPTER 1
Figure 1. Overview of the nitrogen cycle.
Figure 2. Schematic representation of the signal transduction pathway activated by Nod factors, as well as the types and main proteins found in the ...

CHAPTER 2
Figure 1. Dendrograms generated by Maximum Parsimony analysis showing relationships among conserved domains in early nodulins (A) ENOD8 and (B) Annexin sequences, including Vigna unguiculata orthologs.
Figure 2. Dendrograms generated by Maximum Parsimony analysis showing relationships among conserved domains of late nodulins (A) Sucrose synthase and (B) Glutamine synthase sequences with Vigna unguiculata orthologs.
Figure 3. General distribution of transcripts found in the NordEST libraries. (A) Prevalence of early nodulin genes. (B) Prevalence of late nodulin genes.
Figure 4. Comparative prevalence of early and late nodulin genes in the cowpea NordEST libraries.
Figure 5. Expression pattern of cowpea transcripts for the nodulin genes studied here. (A) Graphic representation of the early nodulin CCS52a, Annexin, NSP1, DMI3, ENOD8 and NORK clusters. (B) Graphic representation of the late nodulins NOD70, SS, NOD26, NOD35, GS and Lgb.

CHAPTER 3
Figure 1. (A) Comparative prevalence of early and late nodulin genes in the SUCEST libraries. (B) Prevalence of reads per nodulin category.
Figure 2. Prevalence of sugarcane nodulins in the SUCEST libraries. (A) Occurrence of early nodulin reads. (B) Occurrence of late nodulin reads.
Figure 3. Differential display of standard sugarcane transcripts representing selected nodulin genes. Graph A represents the expression of early nodulins and graph B that of late nodulins.

LIST OF TABLES

CHAPTER 1
Table 1. Brief description of the libraries generated in the NORDEST project.
Table 2. Brief description of the libraries generated in the SUCEST project.

CHAPTER 2
Table 1. Type and features of nodulin genes used as queries against the cowpea databases.
Table 2. Main cowpea clusters significantly similar to known nodulins: tBLASTn results including the best match for each nodulin type.
Table 3. Conserved-domain description of the best hits in the cowpea database for each nodulin type.

CHAPTER 3
Table 1. Description of the SUCEST libraries.
Table 2. Type and features of nodulin genes used as queries against the sugarcane database.
Table 3. Main sugarcane clusters similar to nodulin genes: tBLASTn results and sequence evaluation of sugarcane nodulin genes, including the best match for each gene.

RESUMO

A fixação biológica de nitrogênio tem sido um dos principais focos de interesse no que se refere à nutrição mineral vegetal, sendo explorada na agricultura como uma fonte ecologicamente benigna de nitrogênio, além de reduzir o uso de fertilizantes químicos, que aumentam o custo da produção e causam danos ao meio ambiente. Nesse contexto, destaca-se a relação simbiótica entre bactérias da família Rhizobiaceae e raízes de leguminosas, que permite à planta, através dos nódulos radiculares, a absorção do nitrogênio fornecido pela bactéria, enquanto esta faz uso dos fotossintatos e de um ambiente microaeróbico fornecido pela planta. Esse trabalho teve como objetivo identificar, através de ferramentas computacionais, seqüências dos genes que participam da fixação de nitrogênio (NORK, DMI3, NIN, NSP1, Anexina, CCS52a, ENOD40, ENOD8, NOD26, DMT1, NOD70, Glutamina sintase, Leghemoglobina, NOD35 e Sucrose sintase) nos transcriptomas de Vigna unguiculata e de Saccharum officinarum. Foi possível a identificação de 263 ortólogos às nodulinas estudadas no transcriptoma do feijão-caupi, com destaque para as Leghemoglobinas, que corresponderam a 95% dos clusters identificados. Com relação aos genes estudados na cana-de-açúcar, foram observados 195 clusters ortólogos, apresentando em sua maioria uma alta similaridade com nodulinas de outras monocotiledôneas.
De uma forma geral, o estudo pôde constatar a presença das nodulinas em todos os tecidos analisados, com diferentes níveis de expressão. A maioria dos transcritos se encontrava nas bibliotecas de folhas infectadas e de raiz sob estresse salino no caso do caupi e em flores e raízes no caso da cana. Quando analisadas através de alinhamentos múltiplos, as nodulinas oriundas de diferentes organismos e aquelas encontradas no caupi apresentaram maior semelhança entre espécies pertencentes à mesma classe. Com relação às Angiospermas em geral, a família Fabaceae foi separada do restante, confirmando a função divergente e específica destes genes no grupo. Os resultados do presente estudo sugerem o envolvimento das nodulinas em vias amplamente conservadas do desenvolvimento vegetal, confirmando o papel multifuncional desses genes além da interação benéfica com microorganismos. De uma forma geral, esse trabalho tem potencial para colaborar com o desenvolvimento de marcadores moleculares para o melhoramento das espécies estudadas, assim como para o entendimento da abundância, diversidade e evolução destes genes. O estudo pode ainda fornecer meios de elucidar os mecanismos envolvendo esses genes em outras vias, que não a de fixação, promovendo não só o controle adicional desses processos, como também uma possível expansão dessa vantajosa relação para plantas não-leguminosas de importância econômica.

Palavras-chave: Fixação biológica de nitrogênio; feijão-caupi; cana-de-açúcar; bioinformática; EST.

ABSTRACT

Biological nitrogen fixation has been one of the main topics of interest in plant mineral nutrition. It is exploited in agriculture as an environmentally benign nitrogen source that also reduces the use of chemical fertilizers, which increase production costs and damage the environment.
In this context, the symbiotic relationship between bacteria of the family Rhizobiaceae and legume roots stands out: through root nodules the plant absorbs nitrogen supplied by the bacteria, while the microorganism uses the photosynthates and the microaerobic environment provided by the plant. This work aimed to identify, through computational tools, sequences of genes involved in nitrogen fixation (including NORK, DMI3, NIN, NSP1, Annexin, CCS52a, ENOD40, ENOD8, NOD26, DMT1, NOD70, Glutamine synthetase, Leghemoglobin, NOD35 and Sucrose synthase) in the transcriptomes of Vigna unguiculata and Saccharum officinarum. In the cowpea transcriptome, 263 orthologs of the studied nodulins were identified, notably the leghemoglobins, which accounted for 95% of the clusters found. In sugarcane, 195 orthologous clusters were observed, most of them highly similar to nodulins from other monocots. Overall, nodulins were present in all tissues analyzed, at different expression levels. Most transcripts were found in the libraries of infected leaves and of roots under salt stress in cowpea, and in flowers and roots in sugarcane. In multiple alignments, the nodulins from different organisms and those found in cowpea showed greater similarity among species belonging to the same class. Among the angiosperms in general, the family Fabaceae was separated from the rest, confirming the divergent and specific function of these genes within this group. The results of the present study suggest the involvement of nodulins in widely conserved pathways of plant development, confirming the multifunctional role of these genes beyond the beneficial interaction with microorganisms.
Overall, this work may contribute to the development of molecular markers for the improvement of the species studied, as well as to the understanding of the abundance, diversity and evolution of these genes. The study may also provide ways to elucidate the mechanisms involving these genes in pathways other than nitrogen fixation, promoting not only additional control of these processes but also a possible extension of this beneficial relationship to economically important non-leguminous plants.

Keywords: Biological nitrogen fixation; cowpea; sugarcane; bioinformatics; EST.

INTRODUCTION

Biological nitrogen fixation (BNF) is one of the main processes in plant mineral nutrition and is therefore extensively exploited in agriculture. However, this important primary source of nitrogen has lost ground in recent decades to the growing use of chemical fertilizers, which has made modern agriculture extremely dependent on them and costly, at more than US$ 300 million per year worldwide. In addition, the production and application of these fertilizers have become a major problem, both because of the inefficient methods used to apply them, which often lead to unacceptable levels of pollution in water reservoirs and to the eutrophication of lakes and rivers, and because of their high cost, which puts them out of reach of low-income farmers. The current and growing interest in sustainable development and renewable energy sources has drawn attention to the fact that BNF is environmentally sound: its wider exploitation could reduce the use of fossil fuels and assist in reforestation and in the reclamation of degraded land by enriching the nitrogen available in the soil.
It is therefore essential to understand the mechanisms involved in this process in cultivars that show not only a high BNF capacity but also better adaptation to adverse environmental conditions. Among plants with efficient BNF, cowpea (Vigna unguiculata (L.) Walp.) stands out for its high protein content; it is a crop widely used in both human food and livestock feed, owing to its hardiness and notable capacity to adapt to environments under water, heat and salt stress. Cowpea can therefore be used as green manure under a range of environmental conditions, since it fixes nitrogen efficiently in association with bacteria of the genera Rhizobium and Bradyrhizobium.

Sugarcane (Saccharum officinarum), on the other hand, being a monocot, does not have BNF processes as efficient as those of cowpea and associates mainly with endophytic bacteria such as Gluconacetobacter diazotrophicus and Herbaspirillum seropedicae, which supply the plant with plant hormones and nitrogen. It is certainly one of the most economically important crops for humankind, grown in tropical and subtropical regions in more than 80 countries. Brazil accounts for approximately 25% of world production, and the state of Pernambuco is one of the country's largest producers, with sugarcane representing 40% of the local economy. The importance of sugarcane can be attributed to its multiple uses: it can be used in natura, as forage for animal feed, or as raw material for various products, chiefly sugar and alcohol.

In this context, there is a clear need for more information and new knowledge on the genetic mechanisms these plants use in nitrogen fixation, since there is no estimate, in economic terms, of their contribution to BNF.
The main objectives of the present work were to identify and characterize the sequences of the main genes responsible for nitrogen fixation in cowpea and sugarcane, analyzing their structure and differential expression profile and comparing them with other sequences from restricted-access databases, as well as with those described in the literature and deposited in public databases.

Chapter 1

1. LITERATURE REVIEW

1.1. Plant-Microorganism Interactions

A wide variety of interactions occur between plants and microorganisms such as viruses, bacteria, fungi and nematodes. Some are harmful to plants, resulting in billions of dollars lost each year to damage and to treatment with fungicides and pesticides. Others are symbiotic and beneficial, increasing nutrient uptake and use and/or plant growth. These relationships arise from dynamic exchanges of multiple signals between plants and microorganisms, which can result in resistance or susceptibility to infection or symbiosis (Birch and Kamoun, 2000).

One of the most important beneficial plant-microorganism interactions is that between legume roots and rhizobial bacteria, which leads to formation of the root nodule responsible for biological nitrogen fixation. The association with mycorrhizal fungi also deserves mention, since these are ancestral plant symbionts (Simon et al., 1993) and colonize about 90% of land plants (Zhu et al., 2005). Other symbiotic associations are formed with microorganisms of the rhizosphere and are of great importance in processes such as decomposition, mineralization, denitrification, nutrient storage and mobilization, and phosphate solubilization (Khan et al., 2007).
Organisms that live inside plants, in tissues such as leaves, branches and roots, without producing visible external structures, are called endophytes (Azevedo and Araújo, 2007). They comprise mainly fungi and bacteria which, unlike pathogenic microorganisms, cause no harm to the host plant (Neto et al., 2003). Endophytic microorganisms have been found in numerous plant species of economic interest, such as cotton (Misaghi and Donndelinger, 1990), maize (Araújo et al., 2000), sugarcane (Rosenblueth et al., 2004), soybean (Kuklinsky-Sobral et al., 2004), rice (Sandhiya et al., 2005) and cacao (Rubini et al., 2005). Several positive effects have been attributed to endophytes in host plants, such as promotion of plant growth (Tsavkelova et al., 2007), nitrogen fixation, suppression of nematode development (Sturz and Kimpinski, 2004), induction of systemic resistance (Madhaiyan et al., 2004) and protection of plants against herbivores (Schardl et al., 2004).

1.2. Biological Nitrogen Fixation (BNF)

1.2.1. Economic and Environmental Importance

The process by which nitrogen circulates through plants and soil by the action of living organisms is known as the nitrogen cycle (Figure 1). It is considered one of the most important cycles in terrestrial ecosystems, since nitrogen is part of many molecules, such as nucleic acids and proteins, and is regarded as the most limiting nutrient for plant growth, with the exception of water. Although living organisms require it in significant quantities, this element occurs in nature in a chemically stable form owing to the N≡N triple bond, which limits its immediate use: it must first be converted into a combined form that can be assimilated (Sprent and Sprent, 1990).
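For orientation, the conversion of the stable N≡N molecule into a combined form can be written as the overall reaction catalyzed by nitrogenase. The stoichiometry below is the commonly cited textbook form; it is not given in the source and is included here only as a reference point.

```latex
\[
\mathrm{N_2} + 8\,\mathrm{H^+} + 8\,e^- + 16\,\mathrm{ATP}
\;\longrightarrow\;
2\,\mathrm{NH_3} + \mathrm{H_2} + 16\,\mathrm{ADP} + 16\,\mathrm{P_i}
\]
```

The sixteen ATP consumed per N2 molecule illustrate the energetic cost of breaking the triple bond. At physiological pH the ammonia produced is largely protonated to the ammonium ion.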
Plants use nitrogen in the form of nitrate (NO3-) or ammonium (NH4+) ions to synthesize amino acids; however, the natural supply of these forms comes mainly from the decomposition of plants and animals, which makes it impossible to rely on them in intensive agriculture. Given this lack of natural fertilization, chemical fertilizers must be added to crops, creating dependence on external inputs, raising production costs and potentially damaging the environment. In this context, biological nitrogen fixation is important; it is characterized by the symbiotic relationship of bacteria and mycorrhizae with plant roots (Raven et al., 2001).

BNF is not only recognized as an advantageous strategy for legumes; it is also an important alternative for soil enrichment. Including these plants in crop rotations is considered an efficient way to increase total soil nitrogen stocks, with consequent improvement of the soil and of crop productivity (Vezzani, 2001). Legumes are generally used as a pre-crop: they are planted before the main crop, which later benefits from the mineralization of nitrogen. This practice is currently widely used by small farmers, who cannot afford artificial fertilization, and in organic production systems, where synthetic chemical fertilizers are not permitted (Calegari, 2000). BNF is regulated by the needs of the organism, in contrast to chemical fertilizers, which are generally applied in large doses and undergo 50% leaching; this not only raises costs but also leads to serious pollution problems, particularly in water reservoirs (Zahran, 1999).
Thus, the economic and environmental consequences together make clear the need to invest in research aimed at understanding the physiological, biochemical and molecular mechanisms of BNF, so that the knowledge obtained can benefit both the agricultural sector and reforestation strategies.

Figure 1. Overview of the nitrogen cycle.

1.2.2. Mechanisms of BNF in Angiosperms

In nature, BNF can be carried out by different groups of prokaryotic microorganisms, notably soil bacteria of the family Rhizobiaceae belonging to the genera Bradyrhizobium, Azorhizobium and Rhizobium, collectively called rhizobia. Rhizobia are characterized by their capacity for symbiotic interaction with the root system of legumes through the formation of structures called root nodules (Jordan, 1984). In the root nodules, these microorganisms supply the plant with nitrogen in the form of NH4+ through the action of nitrogenase. These ions are then converted into the amino acids glutamine and glutamate by the plant enzymes glutamine synthetase and glutamate synthase, respectively. The plant, in turn, supplies the bacteria with products of photosynthesis and a microaerobic environment ideal for nitrogenase, which is oxygen-sensitive (Spaink, 2000).

The fixation that takes place in root nodules is the final stage of a developmental process that begins with molecular recognition at the plant's root hairs, which admits only the rhizobia and prevents colonization by opportunistic microorganisms (Parniske and Downie, 2003). The first step in the molecular interaction between plant and bacterium is the detection by the rhizobium of chemotactic phenolic compounds, called flavonoids, secreted by plant roots (Oldroyd and Downie, 2004).
The flavonoid signals are recognized by rhizobial NodD transcriptional regulators, proteins that bind the plant's signaling molecules and activate expression of the bacterial nodulation (nod) genes, which in turn produce the Nod factors (Long, 1996). Nod factors induce many responses in root cells during recognition of the rhizobium, that is, in the stage preceding symbiosis: changes in the actin filaments near the root hair tip, depolarization of the cell membrane, a rise in cytoplasmic pH, induction of calcium spiking, activation of root cortical cell division and induction of specific gene expression in epidermal and cortical tissues (Long, 2001). Because the chemical structure of Nod factors varies and is characteristic of each rhizobium, this recognition is specific: the different factors determine the specificity between host plant and symbiotic bacterium (Perret et al., 2000).

After recognition, the rhizobium enters the root through a tubular structure called the infection thread, which crosses the root epidermis and cortex, forming the primary nodule. The bacteria are then released from the infection thread into the cytoplasm of the cells and, in parallel, cell division begins in the root cortex, leading to formation of the mature nodule (Franssen et al., 1992). There, the bacteria enlarge and differentiate into the nitrogen-fixing form known as bacteroids. The bacteroids are surrounded by a specialized plant membrane that allows metabolic exchange, forming the symbiosome (Oldroyd and Downie, 2004). Once the nodule is induced, the plant uses a homeostatic control system to regulate the number of nodules being formed.
In legumes, this control is achieved by regulatory mechanisms known as autoregulation of nodulation (Gresshoff, 1993), in which a signal from the first nodules develops over the time needed to establish a feedback loop, after which further nodule primordia are initiated but fail to develop (Gresshoff, 2003).

BNF occurs in only 10 of the roughly 380 angiosperm families; it is found in more than 90% of legumes of the subfamilies Mimosoideae and Papilionoideae and in 30% of the Caesalpinioideae (Sprent and Sprent, 1990). Nitrogen-fixing root symbiosis is also observed in some members of the families Betulaceae, Casuarinaceae, Coriariaceae, Datiscaceae, Elaeagnaceae, Myricaceae, Rhamnaceae, Rosaceae and Ulmaceae (Mullin et al., 1990).

In addition to the rhizobial symbiosis, about 90% of land plants form an endosymbiotic association with arbuscular mycorrhizal fungi of the order Glomales (Brundrett, 2002). This interaction shares many signaling features with the rhizobial symbiosis (Oldroyd et al., 2005). It involves invasion of root cortical cells by fungal hyphae, creating arbuscules, where nutrient exchange appears to take place. This plant-fungus relationship allows better uptake of phosphate, nitrogen and other macro- and micronutrients from the soil (Hodge et al., 2001; Rausch et al., 2001). Cyanobacteria of the genus Nostoc also carry out BNF with plants of the genus Gunnera, although this relationship is extracellular (Parniske, 2000).

1.2.3. BNF in Vigna unguiculata

Nitrogen fixation in V. unguiculata, as in other legumes, involves recognition of the rhizobia by root cells, followed by formation of primary and mature nodules.
Cowpea carries out BNF with bacteria of the genera Rhizobium and Bradyrhizobium (Fernandes et al., 2003) and associates with at least six species: B. japonicum (Jordan, 1984), B. elkanii (Kuykendall et al., 1992), Sinorhizobium fredii (Lajudie et al., 1994), S. xinjiangensis (Chen et al., 1988), R. hainanense (Chen et al., 1997) and R. tropici IIA (Zilli et al., 2006). Regarding the efficiency of nitrogen fixation in cowpea, studies have pointed mainly to B. japonicum and B. elkanii as the species most competent at providing an adequate nitrogen supply for the nutrition of most herbaceous legumes in tropical regions (Moreira and Siqueira, 2002).

1.2.4. BNF in Saccharum sp.

Sugarcane, compared with legumes, interacts with nitrogen-fixing bacteria in a very distinctive way. Some endophytic bacteria have been isolated from sugarcane, including Gluconacetobacter diazotrophicus, Herbaspirillum seropedicae and H. rubrisubalbicans; colonies have been observed in the intercellular spaces and vascular tissues of most organs of infected cane, without causing visible anatomical changes or disease symptoms (Reinhold-Hurek and Hurek, 1998). Although little is known about the mechanisms involved in establishing this particular type of interaction or the molecules that mediate signaling between plant and bacterium, the plant is known to be able to control bacterial colonization by sending appropriate molecular signals and/or by providing a microenvironment favorable to bacterial establishment. In return, these bacteria promote better plant development, possibly by supplying nitrogen (Sevilla et al., 2001). According to Vargas et al. (2003), many of the genes involved in plant-bacterium signaling during the association and in nitrogen metabolism are probably activated by the endophytic bacteria in the initial stage of plant colonization, which enables sugarcane to assimilate and metabolize the nitrogen fixed by the bacteria. These genes also appear to act as activators of nodules, since they show homology with some legume nodulins (Nogueira et al., 2001).

1.2.5. Genetic Aspects of BNF

The first genetic studies of the mechanisms underlying BNF began in the 1980s, when the first Rhizobium genes for nitrogen fixation and nodulation were cloned (Spaink et al., 1998). With the identification of the great majority of the bacterial genes required for symbiotic nitrogen fixation, important progress was made in elucidating the plant's contribution to this interaction (Long, 2001). The use of Medicago truncatula and Lotus japonicus as model plants for legume research has greatly accelerated genetic mapping and the isolation of genes required for nitrogen fixation, especially those involved in plant-rhizobium signaling and nodule development (Oldroyd and Downie, 2004; Oldroyd et al., 2005).

Differential analysis of cDNA libraries from nodulated roots identified nodule-enhanced genes, called nodulins, which were divided into two main classes: early nodulins, designated ENOD (Early Nodulins), such as ENOD2, ENOD12 and ENOD40, and late nodulins, associated with nitrogen fixation, such as leghemoglobin, glutamine synthetase and NOD35 (Mylona et al., 1995).
Early nodulin genes are generally expressed during the first stages of nodulation and appear to be involved in the infection process and/or in nodule organogenesis, whereas late nodulins are expressed mainly in the structures of mature nodules around the site of nitrogen fixation, where they take part in that process (Niebel et al., 1998). For many nodulins, more information is still needed on gene structure, spatial expression and even promoter structure. Moreover, no study with mutants has been able to identify damage to the symbiosis in comparison with the wild type/control (Gresshoff, 2003).

The evolutionary process that allowed some plants, legumes or not, to establish nodular symbiosis remains unknown. However, the existence of homologs of the ENOD genes in non-legume plants suggests that the establishment of nodular symbiosis involved both genetic machinery already present in an ancestral organism, unrelated to the symbiotic pathway, and the specialization of some genes and/or the emergence of new genes (Szczyglowski and Amyot, 2003). In support of this hypothesis, nodulation genes have been found to be expressed in non-symbiotic plant tissues; one example is the ENOD12 gene, which encodes a cell wall component and is also expressed in stems and flowers (Scheres et al., 1990).

MADS-box genes, involved in the mechanism of pollen tube tip growth, are now emerging as important regulators of other plant processes, including nodular symbiosis (Zucchero et al., 2001). In addition, the alfalfa nodule-specific gene nmhC5, as well as the late pollen-expressed genes DEFH125 of Antirrhinum majus and ZmMADS2 of Zea mays, belong to the same MADS-box category and are therefore considered functionally similar (Theissen et al., 2000).
Thus, the molecular machinery used in nodule formation also appears to be involved in other functions, and other factors not yet fully understood probably act in the symbiotic process (Hirsch et al., 2001).

1.2.6. Main Early Nodulins

The plant genes that take part in rhizobial signaling and in root nodule formation during symbiosis are collectively called early nodulins. Although the initial process of nodulation is not yet well understood, many genes have been described as early nodulins and are clearly divided according to the process in which they participate, such as the recognition and transduction of rhizobial Nod factor signals, culminating in the expression of other genes required for nodulation and in the formation of the root nodule primordium.

Initially, rhizobium-derived signals are perceived by membrane receptors, mainly LysM (lysin motif) kinase proteins such as NFR1 (Nod Factor Receptor), NFR5 and NFP (Nod Factor Perception) (Limpens et al., 2003; Madsen et al., 2003; Radutoiu et al., 2003), followed by recognition by other receptor kinases, such as NORK (Nodulation Receptor Kinase), SYMRK (Symbiosis Receptor-like Kinase), DMI2 (Does not Make Infection) and/or SYM19 (Endre et al., 2002; Stracke et al., 2002; Mitra et al., 2004; Capoen et al., 2005). Next, the message is processed via ion channels formed by membrane-anchored proteins such as DMI1 of M. truncatula and CASTOR and POLLUX of L. japonicus (Ané et al., 2004; Imaizumi-Anraku et al., 2005; Kanamori et al., 2006), which activate calcium/calmodulin-dependent protein kinases (DMI3 and SYM9) (Lévy et al., 2004). These in turn activate transcription factors of the GRAS type (NSP1 and NSP2; Nodulation-signaling pathway) and NIN (Nodule Inception), thereby allowing the expression of other nodulins (Schauser et al., 1999; Borisov et al., 2003; Kalo et al., 2005; Smit et al., 2005) (Figure 1).

NORK and its paralogs, which have highly similar functions and sequences, consist of an extracellular domain with a unique sequence of 400 amino acids and three leucine-rich repeat (LRR) domains that mediate protein interactions, followed by a transmembrane domain and a typical intracellular serine/threonine protein kinase domain (Shiu and Bleecker, 2001). This overall structure allows NORK to take part in perceiving the signal originating from the extracellular LysM proteins and in transducing it through the intracellular domain (Kistner and Parniske, 2002).

Some proteins similar to the NORK ectodomain have been found in Arabidopsis, monocots and gymnosperms, suggesting that this region has a biological role beyond nodulation. However, the function of this segment of the extracellular domain remains unknown (Endre et al., 2002). In addition, receptor kinases with leucine-rich repeat domains have been identified in numerous plant signaling pathways, including the perception of pathogen signals (Dangl and Jones, 2001).

In general, the external signal recognized by NORK activates ion channels in the membranes of the nucleus and of organelles, which allow calcium entry (Oldroyd and Downie, 2006).
The rise in calcium ions is recognized in the nucleus by the product of the DMI3 gene, a calcium/calmodulin-dependent protein kinase (CCaMK) responsible for transducing the signal generated by the increase in Ca2+, culminating in changes in the expression of the genes implicated in symbiosis (Lévy et al., 2004; Mitra et al., 2004). CCaMK has been identified in several plant species and is considered multifunctional. The protein comprises five domains: a serine/threonine kinase domain, a CaM-binding domain and three calcium-binding EF-hand motifs (Patil et al., 1995). Calcium binding can occur in two ways: directly, through the EF-hand domains, or indirectly, through calmodulin, with which a complex is formed (Takezawa et al., 1996). These two routes of calcium binding should allow the protein to decode the information carried by calcium variation, as do the calmodulin-dependent protein kinases of animal systems, whose kinase activity is induced stepwise in response to calcium oscillation (De Koninck and Schulman, 1998; Dal Santo et al., 1999).

These calcium-activated kinases regulate the expression of the genes required for nodule development by activating the NSP transcription factors, which belong to the GRAS family (Kalo et al., 2005). How this activation occurs has not been clarified; one suggested hypothesis is phosphorylation involving CCaMK, which is located in the nucleus (Smit et al., 2005).

The NSP gene is essential for the Nod factor-induced changes in gene expression, as in infection thread formation and cortical cell division (Catoira et al., 2000; Oldroyd and Long, 2003; Mitra et al., 2004). In addition, the NSP1 gene participates in processes beyond early signaling and is possibly required for maintaining nodule development and/or infection (Smit et al., 2005).
The proteins encoded by NSP have a GRAS domain, which contains a variable N-terminal region and a conserved C-terminal region comprising five domains (Kalo et al., 2005; Smit et al., 2005). These domains occur only in plants, with homologs in many higher plants such as rice (Oryza sativa), arabidopsis (Arabidopsis thaliana), tomato (Lycopersicon esculentum), petunia (Petunia hybrida) and lily (Lilium longiflorum), and participate in processes such as signal transduction, meristem maintenance and plant development (Bolle, 2004).

Another transcription factor activated by DMI3 is encoded by the NIN gene (Schauser et al., 1999), which is responsible for bacterial entry and for the responses to Nod factors of cortical cells (nodule morphogenesis) and epidermal cells (infection and gene expression). It has also been proposed that this gene takes part in nutritional, hormonal or other endogenous and exogenous signaling during nodulation (Marsh et al., 2007). However, the recent identification of NIN-like proteins in arabidopsis and rice suggests that these proteins act both during nodulation and in the signaling of other processes (Schauser et al., 2005).

[Figure 2 labels: Nod factors; LysM receptors (NFR1, NFR5, NFP); receptor kinases (PsSYM19, MtDMI2, MsSYMRK, LjNORK); ion channels (LjPOLLUX, LjCASTOR, MtDMI1); calcium and calcium oscillations; CCaMK (DMI3, PsSYM9); transcription factors (NIN, GRAS); RNA Pol; nodulins]

Figure 2. Schematic representation of the signal transduction pathway activated by Nod factors, with the types and main proteins found in legumes at each step. The blue and yellow circles represent the cytoplasm and the nucleus of the plant cell, respectively. Abbreviations: Lj, Lotus japonicus; Ms, Medicago sativa; Mt, Medicago truncatula; Ps, Pisum sativum. (Drawn by the author, based on Oldroyd et al., 2005 and Oldroyd and Downie, 2006.)
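The order of events described in this section (LysM receptors, receptor kinases, ion channels, CCaMK, transcription factors) can be captured in a small lookup structure. The Python sketch below is only illustrative: the `NOD_PATHWAY` list, the stage labels and both helper functions are hypothetical names introduced here, not part of the original work, and the strictly linear ordering is a simplification of the pathway as summarized in the text.

```python
# Illustrative sketch only: a linearized view of the Nod-factor signalling
# pathway as summarized in the text. All names below are hypothetical helpers.

NOD_PATHWAY = [
    ("LysM receptor kinases", ("NFR1", "NFR5", "NFP")),
    ("receptor kinases", ("NORK", "SYMRK", "DMI2", "SYM19")),
    ("ion channels", ("DMI1", "CASTOR", "POLLUX")),
    ("CCaMK", ("DMI3", "SYM9")),
    ("transcription factors", ("NSP1", "NSP2", "NIN")),
]


def stage_index(protein):
    """Return the 0-based position of the stage that contains `protein`."""
    for index, (_stage, members) in enumerate(NOD_PATHWAY):
        if protein in members:
            return index
    raise KeyError(f"{protein!r} is not part of this simplified pathway")


def downstream_stages(protein):
    """List the names of the stages that act after the one containing `protein`."""
    return [stage for stage, _members in NOD_PATHWAY[stage_index(protein) + 1:]]


if __name__ == "__main__":
    # Example query: which stages act after the DMI1 ion channel?
    print(downstream_stages("DMI1"))
```

A flat ordered list was chosen over a graph because the text presents the cascade as a single chain; branching (for example, separate cortical and epidermal responses) would need a richer structure.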
Besides the early nodulins described above, other proteins also take part in the initial stages of the rhizobial symbiosis, notably annexin, CCS52A (Cell cycle switch protein) and the early nodulins ENOD40 and ENOD8.

Annexin participates in the calcium-dependent organization of the symbiosome membrane during colonization of plant tissues. In situ localization of the promoter activity of the gene encoding this protein showed induction in the nodule primordium, confirming its role during the initiation or establishment of the endosymbiotic structures of the symbiosome membrane (Manthey et al., 2004). Niebel et al. (1998) showed that expression of the M. truncatula annexin gene (MtAnn1) requires activation by Nod factors and is higher in the pre-infection zone than in the zone containing infection threads, suggesting that it is more involved in preparing for infection or in nodule organogenesis than in the infection process itself. That study also showed that MtAnn1 is related to the cytoskeletal changes that occur during symbiosis, allowing cortical cell activation and root hair deformation.

The annexin family includes proteins identified in many eukaryotic organisms and consists of calcium-dependent phospholipid-binding proteins (Raynal and Pollard, 1994; Kaetzel and Dedman, 1995; Moss, 1997). In plants, however, these proteins have different characteristics (Clark and Roux, 1995).
In celery, for example, annexin was identified as a calcium-dependent, vacuole-associated protein (Seals et al., 1994); in cotton it is associated with modulation of the activity of the callose synthase located in the plasma membrane, whereas in tomato and maize annexins have ATPase activity, although only the tomato annexin is able to interact with cytoskeletal actin (McClung et al., 1994; Calvert et al., 1996). These proteins generally have a short variable N-terminal region and a conserved central region made up of four repeats of about 70 amino acids; the only exception is class VI of animals, which contains eight repeats (Morgan and Fernandez, 1997).

Differentiation of the nodule primordium begins with the arrest of cell division and the subsequent onset of several endocycles, in which chromatin is duplicated without cell division, leading to a gradual increase in cell volume that is essential for bacterial multiplication and bacteroid establishment (Favery et al., 2002). Moreover, the amplification of genome size by endocycles ensures a larger number of copies of the genes involved in symbiotic processes (Foucher and Kondorosi, 2000).

Endoreduplication is a common strategy in the development of plant organs and tissues (Kondorosi et al., 2000; Larkins et al., 2001) and is characterized by repetition of the S phase of the cell cycle. One way to induce polyploidy is to inactivate the cyclin/CDK (cyclin-dependent kinase) complexes before the transition to M phase (Fang et al., 1998). Mitosis, in turn, can be inhibited by early activation of the anaphase-promoting complex (APC), which is responsible for the ubiquitin-dependent proteolysis of mitotic cyclins (Tarayre et al., 2004).
Two isoforms of the gene encoding APC activators, CCS52a and CCS52b, have been identified in M. truncatula and A. thaliana (Cebolla et al., 1999; Tarayre et al., 2004). Whereas CCS52a appears to be orthologous to the Cdh1 proteins of animals and fungi, CCS52b is found only in plants (Tarayre et al., 2004). In M. truncatula, CCS52a is responsible for arresting the mitotic cycle and for regulating endoreduplication during symbiotic cell differentiation in the final stages of nodule maturation (Cebolla et al., 1999; Vinardell et al., 2003).

Like CCS52a, ENOD40 also takes part in the formation of the nodule primordium and is induced by Nod factors. This gene, characterized by the absence of a long open reading frame (ORF), encodes a peptide of nine to thirteen amino acids (Kouchi et al., 1999; Vleghels et al., 2003); its transcripts have been detected not only in root nodules but also in non-symbiotic meristematic tissues such as lateral roots (Papadopoulou et al., 1996; Fang and Hirsch, 1998), young leaves (Asad et al., 1994) and embryonic tissues (Flemetakis et al., 2000). Besides nodulation, the nodulin encoded by this gene is associated with different processes, since it is expressed in other tissues and has homologs in non-legume plants (Kouchi et al., 1999; Cebolla et al., 1999; Foucher and Kondorosi, 2000). In contrast to CCS52a, however, ENOD40 acts in the transport of compounds, such as carbohydrates, to the cortical cells, allowing proper organization of the nodule primordium (Charon et al., 1999; Kouchi et al., 1999).

RNA interference studies showed that ENOD40, although required for activating the cortical cell divisions that lead to formation of the nodule primordium (Sousa et al., 2001), does not take part in rhizobial infection (Kumagai et al., 2006).
Furthermore, the persistence of its expression in the mature nodule suggests a possible additional role in nodule function (Kouchi and Hata, 1993; Yang et al., 1993). Homologs of this gene are also well conserved in non-legume plants, having been described in tobacco (Nicotiana tabacum; Matvienko et al., 1996) and rice (Kouchi et al., 1999), which indicates a more general role in the plant kingdom.

Another early nodulin involved in nodule organogenesis is encoded by ENOD8, which belongs to a tandemly duplicated gene family of which three genes have been identified in M. truncatula (Dickstein et al., 2002). This gene encodes a protein with acetyltransferase activity that is associated with the symbiosome membrane in root nodules (Pringle and Dickstein, 2004; Catalano et al., 2004) and belongs to the GDSL family of lipolytic proteins found in the membranes of plants and bacteria.

Many studies have sought to better understand the processes involved in Nod factor signaling and in the formation of root nodules. Such data may help improve and increase the efficiency of biological nitrogen fixation in crops of agronomic interest. In addition, the identification of nodulin expression in plant tissues and organs that do not carry out nitrogen-fixing symbiosis, and the discovery of several early nodulins in non-legume plants, may make it feasible to transfer genes involved in nodulation to other plants, thereby increasing their efficiency in nitrogen uptake.

1.2.7. Main Late Nodulins

The expression of late nodulins, which take part in the metabolic exchange between the plant and the microsymbiont, coincides with the onset of nitrogen fixation, which is driven by the nitrogenase expressed in the rhizobium (Schröder et al., 1997).
This class of nodulins includes membrane transport proteins (Kaiser et al., 2003; Jeong et al., 2004) and proteins specifically associated with the symbiosome membrane (SM) (Wienkoop and Saalbach, 2003; Catalano et al., 2004), which allow the symbiotic process to be established and maintained through the flux of the various molecules and ions required by the bacteroid or the plant (Roberts and Tyerman, 2002).

The main SM protein, a member of the MIP (Major intrinsic protein) family of channel proteins, is encoded by the NOD26 gene (Dean et al., 1999). Members of this family are characterized by six transmembrane α-helix domains that form a helix-loop-helix structure (Jung et al., 1994; Agre et al., 1995). The MIP family is particularly diverse in higher plants, and more than 30 genes can be found in arabidopsis. These genes are divided into four subfamilies: tonoplast intrinsic proteins, plasma membrane intrinsic proteins, nodulin-like intrinsic proteins and small basic intrinsic proteins (Johanson et al., 2001).

The aquaporin NOD26 is found exclusively in the symbiosome membrane, where it accounts for approximately 10% of the total protein (Weaver and Roberts, 1992). Besides being highly permeable to water, which is needed to maintain osmotic balance in the nodules, this aquaporin also allows the flux of glycerol, formamide, malate and other electrolytes that aid osmoregulation (Rivers et al., 1997). NOD26 is phosphorylated by a calcium-dependent protein kinase of the CDPK family only at the serine located at position 262 of the carboxy-terminal domain. This phosphorylation regulates the rate of malate transport across the symbiosome membrane (Weaver and Roberts, 1992).
This protein is homologous to several channel-type intrinsic proteins found in Escherichia coli (Sweet et al., 1990), fungi (Van Aelst et al., 1991), Drosophila (Rao et al., 1990) and mammals (Kent and Shiels, 1990), suggesting that the high amino acid conservation among these different organisms derives from a common ancestor (Baker and Saler, 1990).

Another transporter located in the SM is encoded by the late nodulin DMT1 (Divalent metal transporter), which transports ions such as zinc, copper, manganese and, above all, iron to the bacteroid. In bacteroids, iron takes part in the formation of numerous proteins involved in nitrogen fixation, including nitrogenase and the cytochromes used in the electron transport chain (Kaiser et al., 2003). The DMT1 protein belongs to the Nramp (Natural resistance-associated macrophage protein) family of membrane proteins; it is induced in nodules at the onset of nitrogen fixation and its expression increases in the mature nodule, suggesting that iron is required by enzymes that act during nodule development and function (Kaiser et al., 2003). DMT1 transcripts have been found in several cell types, and the protein's structure is highly conserved, with homologs in other plants, insects, microorganisms and vertebrates (Mims and Prchal, 2005).

In the plant, leghemoglobins, abundant nodulins that function as oxygen carriers for the symbiosome, are composed of the heme group (which is rich in iron) and of globin, which is synthesized by the plant in response to bacterial infection (Verma and Long, 1983; Appleby, 1984).
The leghemoglobin molecule is formed before nitrogen fixation begins and contributes to the efficiency of this process, since it provides an adequate oxygen flux for rhizobial respiration and for the correct functioning of the bacterial nitrogenase complex (Appleby, 1984). Leghemoglobins are encoded in plants by a small gene family (Laursen et al., 1994), and related genes have been cloned in Parasponia andersonii (Ulmaceae) and in plants that do not carry out BNF, including monocots. For this reason, Hardison (1996) suggests that the nodule-exclusive, heme-containing oxygen carrier is a specialized product derived from the divergence of an ancestral hemoglobin present before the separation of the main kingdoms.

Although required for the proper functioning of root nodules, leghemoglobins are not needed for plant growth and development in the presence of an external nitrogen source, which suggests that their expression occurs exclusively during BNF (Ott et al., 2005). In addition, leghemoglobin also takes part, in root nodules, in the regulation of another late nodulin, sucrose synthase.

The active form of sucrose synthase is a tetramer composed of identical monomers; it is abundant in nodules, where it catalyzes the reversible cleavage of sucrose. Sucrose, in turn, is considered the main carbohydrate transported from the leaves to the nodules (Reibach and Streeter, 1983). Sucrose synthase activity in the nodule is modulated by free heme groups that bind to its monomers, and the concentration of these heme groups is thought to depend on the action of leghemoglobins.
During nodule senescence, the heme group is presumably released from leghemoglobins, allowing sucrose synthase to be inactivated, whereas in mature nodules the high concentration of heme groups does not inhibit this enzyme, since they are bound to leghemoglobin, which has a higher affinity for them (Colebatch et al., 2004).

Sucrose is an important metabolite for plant growth and development, playing a major role in several physiological processes, such as carbon transport, the regulation of growth and development, and signal transduction (Smeekens, 2000). In BNF it is the primary carbon source for root tissues and for the bacteroid, providing a skeleton for the development of cellulosic structures such as infection threads and supplying carbon intermediates for the assimilation of fixed nitrogen (Colebatch et al., 2004).

Sucrose synthase is encoded by a small multigene family in several species, including pea (Barratt et al., 2001), arabidopsis (Baud et al., 2004), potato (Zrenner et al., 1995) and maize (Duncan et al., 2006). Studies with mutant and transgenic plants showing reduced sucrose synthase activity demonstrated that specific isoforms are essential for the normal metabolism of different organs (Subbaiah and Sachs, 2001; Ruan et al., 2003). In L. japonicus (Horst et al., 2007), arabidopsis (Baud et al., 2004) and rice (Harada et al., 2005), sucrose synthase is encoded by a family of six genes, and there are many isoforms in legumes such as M. truncatula (Hohnjec et al., 1999) and pea (Barratt et al., 2001). The similarity between sucrose synthase isoforms belonging to the different groups across species suggests that the genes encoding these isoforms diverged a relatively long time ago, at least before the separation of monocots and dicots (Horst et al., 2007).
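For reference, the reversible cleavage of sucrose catalyzed by sucrose synthase, mentioned in the passage above, is commonly written as follows. This is the standard textbook stoichiometry for the enzyme (EC 2.4.1.13) and is not taken from the original text:

```latex
\[
\text{sucrose} + \text{UDP} \;\rightleftharpoons\; \text{UDP-glucose} + \text{D-fructose}
\]
```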
The protein encoded by the NOD70 gene has 12 putative transmembrane regions arranged in two groups separated by a large hydrophilic loop. This protein is part of the SM and is responsible for transporting nitrate, nitrite and chloride (Pao et al., 1998). It resembles members of the MFS (Major Facilitator Superfamily) of membrane transporters, which mediate the flux of various substrates such as sugars, drugs, organic and inorganic ions, Krebs cycle intermediates, amino acids and peptides, and are found in almost all organisms (Pao et al., 1998; Szczyglowski et al., 1998).

In addition, the subfamily of proteins highly similar to GmNOD70 and LjNOD70 is present in a wide variety of plant species, suggesting that plants possess a subfamily of anion/nitrate transporters related to NOD70 that have a broader role beyond symbiosis. This idea is supported by the analysis of soybean EST libraries, which revealed many sequences similar to GmN70 in organs other than the nodule (Vincill et al., 2005).

NOD35 encodes an oxidoreductase, the nodule-specific uricase (uricase II, EC 1.7.3.3), a homotetramer located in the peroxisomes of uninfected cells. This enzyme plays an essential role in the biosynthesis of ureides, the main form in which fixed nitrogen is transported from the nodules to the shoots in tropical legumes (Tajima and Kouchi, 1996). Although it is a late nodulin, NOD35 is expressed in uninfected cells and therefore has regulatory mechanisms completely different from those of other nodulins (Mauro and Verma, 1988). Tajima et al. (1991) detected a significant amount of transcripts of this gene before the onset of nitrogen fixation, which then increased gradually during the process, suggesting that the induction of this gene is controlled in two steps.

In general, atmospheric N2 is reduced to ammonia by the nitrogenase enzyme complex in the rhizobium, which requires an environment with a low oxygen concentration, obtained through the presence of leghemoglobin, and the availability of a strong biochemical reductant, ferredoxin, supplied by the host; such conditions are found in the functional nodules of legume root systems. Subsequently, ammonia, the end product of this process, is released into the cytoplasm and assimilated by the glutamine synthetase/glutamate synthase (GS/GOGAT) cycle. However, when nitrate is available in the environment, the legume does not establish the symbiotic relationship. The absorbed nitrate is reduced to ammonia by the enzymes nitrate reductase and nitrite reductase, and is then assimilated by the GS/GOGAT system. From these first nitrogenous compounds, glutamine and glutamate, all other organic nitrogen compounds are produced through the action of transaminases. In this way, the root system is the main source of nitrogen for the sinks, the sites of high N demand (Camargos, 2002).

Ammonia is assimilated by the nodule plant enzymes GS (EC 6.3.1.2) and GOGAT (EC 1.4.1.14). The efficiency with which these enzymes assimilate fixed nitrogen plays an important role in plant productivity, since this pathway keeps ammonia at low concentrations in the plant cytoplasm. GS is located in the chloroplasts and cytoplasm of leaves and in the cytoplasm of root cells (Oaks and Hirel, 1985), while GOGAT is located in leaf chloroplasts (Miflin and Lea, 1980) and in root plastids (Emes and Fowler, 1979).
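The reactions referred to in this passage can be summarized with their commonly cited textbook stoichiometries. The equations below are added for reference and are not taken from the original text; the GOGAT reaction is written for the NADH-dependent form, which is an assumption based on the EC number 1.4.1.14 cited here.

```latex
\begin{align*}
\text{Nitrogenase:}\quad & \mathrm{N_2} + 8\,\mathrm{H^+} + 8\,e^- + 16\,\mathrm{ATP}
  \rightarrow 2\,\mathrm{NH_3} + \mathrm{H_2} + 16\,\mathrm{ADP} + 16\,\mathrm{P_i} \\
\text{GS:}\quad & \text{glutamate} + \mathrm{NH_3} + \mathrm{ATP}
  \rightarrow \text{glutamine} + \mathrm{ADP} + \mathrm{P_i} \\
\text{GOGAT:}\quad & \text{glutamine} + \text{2-oxoglutarate} + \mathrm{NADH} + \mathrm{H^+}
  \rightarrow 2\,\text{glutamate} + \mathrm{NAD^+} \\
\text{Net (GS + GOGAT):}\quad & \mathrm{NH_3} + \text{2-oxoglutarate} + \mathrm{ATP} + \mathrm{NADH} + \mathrm{H^+}
  \rightarrow \text{glutamate} + \mathrm{ADP} + \mathrm{P_i} + \mathrm{NAD^+}
\end{align*}
```

The net line follows from adding the GS and GOGAT reactions: the glutamine formed by GS is consumed by GOGAT, and one of the two glutamate molecules produced by GOGAT replaces the one used by GS.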
According to Gonnet and Diaz (2000), GS catalyzes the ATP-dependent amination of glutamate, forming glutamine, and GOGAT in turn catalyzes the reductive transfer of the amide nitrogen of glutamine to the α-keto position of 2-oxoglutarate, yielding two molecules of glutamate that serve as substrate for the biosynthesis of various nitrogenous metabolites, such as the precursors of proteins and nucleic acids (Schuller et al., 1986).

An effective symbiosis requires the coordinated expression of plant and bacterial genes. The expression of the GS and GOGAT genes in nodules is influenced by the stage of nodule development and by the presence of the ammonia produced by nitrogenase (Vance et al., 1988). However, how the ammonia produced by nitrogenase regulates GS remains unknown (Suganuma et al., 1999).

Since glutamine is synthesized through the GS reaction and is the substrate for GOGAT, GS is assumed to play a central role in nitrogen metabolism. The evidence suggests that GS may be subject to several types of control, including repression and activation in response to different amino acids and phytohormones (Chanda et al., 1998).

The genotypic variability measured through GOGAT activity in alfalfa (Jessen et al., 1988) and GS activity in common bean (Hungría et al., 1991) may provide markers that support breeding programs. In addition, the action of other late nodulins that increase the effectiveness of BNF has also been studied with a view to improving legume productivity.

1.3. Cowpea

1.3.1. Economic Importance

Legumes are the second most economically important family of cultivated plants, accounting for approximately 27% of world agricultural production (Graham and Vance, 2003).
Worldwide, legumes contribute about 30% of the protein consumed by humans and animals, serving as a primary source of protein and vitamins, and they are able to accumulate secondary metabolites, such as isoflavonoids, that are beneficial to human health. Besides their importance as a nutritional source, these plants have the unique ability to carry out biological nitrogen fixation in association with rhizobia and mycorrhizae, which are excellent natural fertilizers (Dixon and Sumner, 2003).

Among legumes, cowpea stands out for its hardiness, versatility and adaptability to drought and to acidic and alkaline soils. In addition, this crop is highly tolerant of low fertilization because of its high rate of nitrogen fixation through symbiosis with mycorrhizae and rhizobia. Thus, not only because of its ability to fertilize the soil naturally and withstand unfavorable environmental conditions, but also because it prevents the infection and reproduction of opportunistic organisms, cowpea can be considered an excellent crop for agriculture (Fery, 1990; Ehlers and Hall, 1997).

Besides the benefits it provides to the soil, cowpea is rich in protein (23-25%), with all the essential amino acids, carbohydrates (62%), minerals and vitamins, and it has a large amount of dietary fiber and a low fat content (an average oil content of 2%) (EMBRAPA, 2008). Furthermore, this bean is the main source of nutrients in the diet of the low-income population, and its cultivation is an important means of livelihood for most small farmers in northern and northeastern Brazil.

Practically every part of the plant is used: the seeds, pods and leaves are eaten fresh as green vegetables, the grains can be eaten after cooking, and the rest of the plant can be used as feed for domestic animals.
All parts of the plant used as food are nutritious and high in protein, making it extremely important for low-income populations, where many people have no access to other protein sources (Magloire, 2005).

Its cultivation is currently concentrated in warm-climate regions, with three quarters of production found in Africa (Shoshima et al., 2005). In Brazil it is the only bean able to thrive both in the North (high humidity, heavy rainfall, and clay soil) and in the Northeast (drought, sandy and sometimes saline soil, and intense sun) (Barreto, 1999), where it accounts for about 41% of the beans consumed by the population. In 2004 the largest national producers of this crop were Ceará and Piauí, which produced approximately 212,000 tonnes per year (FNP, 2004).

Because of its economic and social importance, cowpea is an organism of great scientific interest. Owing to its superior nutritional value, versatility, adaptability, and productivity, it was chosen by the United States space agency (NASA; National Aeronautics and Space Administration) as one of the few vegetables to be studied on space stations (Ehlers and Hall, 1997).

1.3.2. Origin and Geographic Distribution

Species of the genus Vigna are distributed in tropical and subtropical regions throughout the world, but there is controversy over their center of origin and diversity. Pant et al. (1982) suggest that this legume was probably introduced in India during the Neolithic period, with Nigeria as the primary center of diversity of the wild species (Steele and Mehra, 1980; Ng and Marechal, 1985). However, Freire-Filho (1988), considering the high rate of endemism and the greater concentration of Vigna species in Africa, suggests that the genus evolved and dispersed from that continent; among the species native to Africa, V.
unguiculata predominates in some regions, and its wild forms have not been found outside the African continent. Furthermore, Padulosi and Ng (1997) point to the Transvaal region, in the Republic of South Africa, as the probable region of speciation of V. unguiculata (L.) Walp.

In Latin America, cowpea was probably brought from Europe and West Africa by European colonizers and African slaves during the 16th and 17th centuries (Simon et al., 2007). It arrived first in the Spanish colonies and soon afterwards in Brazil, possibly in the state of Bahia, spreading later to other parts of the Brazilian Northeast (Freire Filho et al., 1999).

Although it has undergone several introgression events and shows a wide range of phenotypes among its cultivars, the cowpea gene pool appears to be quite limited, especially in the cultivated species; this legume seems to have passed through a bottleneck during domestication. For this reason, its origin and distribution have not yet been fully clarified (Ehlers and Hall, 1997).

1.3.3. Cowpea Breeding

Cowpea has traits that are extremely advantageous for breeding, such as self-fertilization, a stable genome (which prevents gene “escape”), and a short life cycle (about two months) (Saccardo et al., 1992). For a long time, however, its improvement relied only on traditional crossing methods, with selection of genotypes adapted to the specific conditions of each region (Xavier et al., 2005). From 1970 to 1988, breeding research concentrated on developing cultivars for the field only. In 1989 it diversified to include the systematic improvement of local cultivars and the development of a range of cultivars aimed at larger grains and greater forage efficiency for crop rotation systems (Fatokun et al., 2002).
Currently, one of the main goals of breeding programs is the development of desirable agronomic traits, such as tolerance to abiotic stresses (drought, salinity, and heat), higher yield, and resistance to pathogens (Timko et al., 2007). The International Institute of Tropical Agriculture (IITA), located in Africa, is considered the most important cowpea research center, although significant advances have been achieved in different regions of the world, such as India, Mali, Nigeria, and Senegal and, to a lesser extent, other countries. Recently, the University of California (USA) and EMBRAPA (Brazil) have also strengthened and expanded their research in this area (Singh et al., 2002).

In the Brazilian Northeast, several studies have addressed yield, virus resistance, adaptability, and stability of cowpea genotypes using methodologies based on linear regression (Finlay and Wilkinson, 1963; Eberhart and Russell, 1966). These studies have supported the release of cowpea cultivars in several states (Freire-Filho et al., 2001; 2002). Although extremely important, they did not prioritize the high nitrogen-fixing potential of this legume, a trait capable of improving yield and maintaining soil fertility without the need for chemical fertilizers.

Despite its great diversity of phenotypes with high genetic potential and its economic and social importance in developing countries, cowpea remains, in general, an underexploited crop, and more studies are needed to make its cultivation more profitable. Compared with other legumes such as alfalfa and soybean, little effort and investment is devoted to cowpea research (Singh, 2005).
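The regression-based stability analyses cited above (Finlay and Wilkinson, 1963; Eberhart and Russell, 1966) can be illustrated with a minimal sketch. It assumes the usual formulation in which the yield of each genotype is regressed on an environmental index (the mean of all genotypes in an environment minus the grand mean). The function name and the yield values are hypothetical and do not come from the cited studies.

```python
def stability_slopes(yields):
    """Return the regression slope of each genotype on the environmental index.

    yields maps a genotype name to its yields, one value per environment,
    with environments listed in the same order for every genotype.
    """
    genotypes = list(yields)
    n_env = len(yields[genotypes[0]])
    # Mean yield of all genotypes in each environment.
    env_means = [
        sum(yields[g][j] for g in genotypes) / len(genotypes)
        for j in range(n_env)
    ]
    grand_mean = sum(env_means) / n_env
    # Environmental index: deviation of each environment from the grand mean.
    index = [m - grand_mean for m in env_means]
    ss_index = sum(i * i for i in index)
    if ss_index == 0:
        raise ValueError("environments do not differ; slopes are undefined")
    # Least-squares slope; the index sums to zero, so y needs no centering.
    return {
        g: sum(y * i for y, i in zip(yields[g], index)) / ss_index
        for g in genotypes
    }


# Hypothetical yields (t/ha) in three environments, for illustration only.
example = {"genotype_A": [1.0, 2.0, 3.0], "genotype_B": [2.0, 2.0, 2.0]}
slopes = stability_slopes(example)
```

In this formulation a slope close to 1 indicates an average response to environmental improvement, a slope above 1 indicates a genotype that responds strongly to favorable environments, and a slope below 1 indicates a genotype whose yield changes little across environments.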
Today, modern biotechnological tools can give cowpea not only competitiveness and traits that meet international market requirements (Timko, 2002), but also more information on the structure and composition of its genome and proteome. This would help in interpreting the evolution of the Phaseoloid/Millettoid clade and of the Papilionoideae in general, contributing substantially to the improvement of this crop (Simon et al., 2007; Timko et al., 2008).

1.3.4. Botanical and Genetic Aspects

Cowpea is an autogamous angiosperm crop (Teófilo et al., 2001) belonging to the class Dicotyledoneae, order Fabales, family Fabaceae (Leguminosae), subfamily Faboideae (Papilionoideae), tribe Phaseoleae, subtribe Phaseolinae, genus Vigna, and species V. unguiculata (L.) Walp. (NCBI, 2008). Within its family, cowpea has one of the smallest genomes, with approximately 620 megabase pairs (Paterson, 2006). This legume has 2n=22 chromosomes, but the number can vary; in some cultivars about 20% of the cells contain 23 mitotic chromosomes (Benko-Iseppon, 2001; Adetula, 2006). Its karyotype is also controversial: while Barone and Saccardo (1990) observed one large chromosome, one very small chromosome, and the other nine distributed in three groups of intermediate size, Pignone et al. (1990) described the karyotype as composed of five large, five medium, and one small chromosome.

Little attention has been given to gene characterization in cowpea (Timko et al., 2007), and the greatest progress in legume genomics has been made with the model species M. truncatula, L. japonicus, and Glycine max (Cronk et al., 2006; Sato et al., 2007).
For these organisms, not only genomic sequences and EST (Expressed Sequence Tag) libraries but also physical and genetic maps are available in public databases (Sato et al., 2007).

1.3.5. The HarvEST, NordEST, and CGKB Projects

The cowpea genome and transcriptome are currently being sequenced by several groups, and the data are made available in public databases such as the Cowpea Genespace/Genomics Knowledge Base (CGKB), which generated sequences by filtering methylated genomic DNA (Chen et al., 2007), and HarvEST, which has generated more than 180,000 ESTs (HarvEST, 2008).

In 2004, research groups from the Brazilian Northeast launched the NordEST project, part of the Renorbio network, which aims to sequence the first expressed genome of a legume (V. unguiculata) in Brazil. Besides being an innovative initiative, the project is distinguished by the construction of contrasting libraries for biotic and abiotic stress traits. The sequencing is part of the project “Genômica funcional, estrutural e comparativa de feijão-caupi (V. unguiculata)” (Functional, structural, and comparative genomics of cowpea), which aims to obtain about 100,000 sequences from different contrasting libraries (Table 1) in order to identify candidate genes and new sequences potentially useful for breeding the crop.

Table 1.
Brief description of the libraries generated in the NordEST project, including the library code, the total number of sequenced ESTs, the tissue, and the condition at cDNA extraction.

Library code  Total ESTs  Tissue  Description
CT00          288         Root    Control
BM90          1624        Leaf    Genotype BR14-Mulato
IM90          464         Leaf    Genotype IT85F collected 90 minutes after infection with mosaic virus
SS00          1433        Root    Salt-sensitive genotype without salt stress
SS02          2204        Root    Salt-sensitive genotype after 2 hours of stress*
SS08          3647        Root    Salt-sensitive genotype after 8 hours of stress*
ST00          2500        Root    Salt-tolerant genotype without salt stress
ST02          3646        Root    Salt-tolerant genotype after 2 hours of stress*
ST08          3142        Root    Salt-tolerant genotype after 8 hours of stress*

* Genotypes subjected to salt stress were grown with 200 mM NaCl.

1.4. Sugarcane

1.4.1. Economic Importance

Sugarcane is certainly one of the most economically important crops, being grown in tropical and subtropical regions in more than 80 countries (Vettore et al., 2003). Its importance stems from its many uses: it can be used in natura as forage for animal feed, or as raw material for the manufacture of rapadura, molasses, aguardente, sugar, and alcohol (Novaretti, 1981). The bagasse is used in the manufacture of various types of paper and of pharmaceuticals, and in the synthesis of organic compounds with many applications in the chemical and pharmaceutical industries (Pinazza and Alimandro, 2003). In addition, burning it generates thermal and electrical energy (Portal Única, 2008). From molasses, besides alcohol, yeast, syrup, citric acid, lactic acid, and monosodium glutamate are obtained. Furthermore, ethanol is used to manufacture polyethylene, styrene, ketone, acetaldehyde, polystyrene, acetic acid, ether, acetone, and a range of chemicals normally derived from petroleum (Pinazza and Alimandro, 2003).
In Brazil, sugarcane was the first crop introduced, grown initially on the northeastern coast solely for sugar production (Szmrecsányi and Moreira, 1991). From the 1970s onward, sugarcane gained greater status when the federal government encouraged the sugar and ethanol sector to help solve the emerging energy crisis, given its potential as a source of renewable energy (Barela, 2005).

Sugarcane is now grown in almost every Brazilian state. The sugar and ethanol agribusiness accounts for 2.4% of national GDP and generates 3.6 million direct and indirect jobs (Albino et al., 2006). In addition, EMBRAPA (2008) estimated that 290 million tonnes were produced in the 2007/2008 harvest, which reaffirms Brazil as the world's largest producer of sugar and alcohol. The world area occupied by sugarcane cultivation is six million hectares and is expected to increase to 9.1 million hectares over the next eight years (Albino et al., 2006).

International interest in sugarcane has grown considerably, mainly because of the recent Kyoto Protocol negotiations aimed at reducing the greenhouse effect. This gives Brazilian sugarcane production a highly positive environmental profile, since the energy use of sugarcane avoids an annual increase of more than 20% in the country's total CO2 emissions from burning fossil fuels (Macedo, 2001).

1.4.2. Origin and Geographic Distribution

The probable center of origin of sugarcane is northern India, and its domestication is estimated to have taken place around 2500 B.C., beginning in Papua New Guinea (Brandes, 1956).
Its cultivation is concentrated in tropical and subtropical areas in more than 80 countries around the world, across a wide range of latitudes from 35º N to 30º S and at altitudes from sea level up to one thousand meters (Magalhães, 1987; SUCEST, 2008).

Sugarcane was introduced to the Americas in 1494 in Santo Domingo, while in Brazil planting began in the Province of São Vicente in 1522 with hybrids derived from crosses between S. officinarum and S. barberi, brought from the island of Madeira. From the same island, in 1533, Duarte Coelho Pereira introduced sugarcane to Pernambuco (Artschwager and Brandes, 1958; Bastos, 1987). Later, the noble canes, a term coined by Dutch breeders for S. officinarum genotypes with high sugar content, came to dominate the country's economy and were used preferentially by the sugar industry not only in Brazil but around the world (Dantas, 1960).

1.4.3. Sugarcane Breeding

The success of sugarcane cultivation is tied mainly to breeding programs, which aim to develop varieties better adapted to soil and climate conditions, minimize damage from pest attacks, increase disease resistance, and improve the industrial characteristics of the varieties (Rosse et al., 2002). In 1887, Soltweld made the first sugarcane cross that yielded fertile seeds, demonstrating that sugarcane could be improved through controlled crosses. Two years later, Harrison and Bowell obtained seedlings from seeds produced by crosses. Thus arose the first breeding programs (Cesnik, 2008).
Current varieties, classified as Saccharum spp., are hybrids derived from interspecific crosses and backcrosses that show a high ploidy level and complex meiotic behavior; it is estimated that their genome contains five to ten percent of the genome of the ancestral parental species (Lu et al., 1994).

In Brazil, breeding programs began after an epidemic of gummosis, a disease caused by the pathogen Xanthomonas axonopodis pv. vasculorum, struck the country's main variety, resulting in enormous losses (Matsuoka et al., 1999). In 1910 the first two sugarcane experimental stations in Brazil were established, one in Rio de Janeiro and the other in Pernambuco. The latter began research in 1913 to obtain varieties resistant to the borer Diatraea and the mealybug Trionymus (Cesnik, 2008). According to Barbosa et al. (2000), over the last three decades breeding has made a marked contribution to the development of the Brazilian sugarcane sector, with pronounced gains in yield and quality. In that period, average sugarcane yield and the recovery of kilograms of sugar per tonne of milled cane increased by more than 30%.

1.4.4. Botanical and Genetic Aspects

Sugarcane is a monocot belonging to the family Poaceae (grasses), tribe Andropogoneae, and genus Saccharum, which comprises about 30 species (EMBRAPA, 2006; NCBI, 2008). It is a semi-perennial, allogamous crop with a cycle of five to seven years that requires a deep root system to increase its yield in soils of low fertility and low moisture retention (Demattê, 2005).

Because of its multispecific origin, sugarcane has one of the most complex genomes among cultivated plants (Ingelbrecht et al., 1999). Moreover, most current varieties are fertile and have chromosome numbers ranging from 2n=70 to 2n=130.
This variation occurs not only among organs of the same plant but also among cells of the same tissue (Roach and Daniels, 1987; Portieles et al., 2002).

Several sugarcane genomics projects are currently under way in research groups around the world. Australia and the USA, besides running projects for the mapping and application of DNA markers, have sequenced, together with other countries such as South Africa, France, and Brazil, more than 300,000 ESTs (Carson and Botha, 2000; Casu et al., 2001; Grivet and Arruda, 2001; Perrin and Wigge, 2002). The information generated by these programs has been useful in the comparative mapping of the family Poaceae, using common markers that hybridize in sugarcane, rice, maize, wheat, and oat, among others (SUCEST, 2008).

1.4.5. The SUCEST Project

In Brazil, in 1999, the ONSA network (Organization for Nucleotide Sequencing and Analysis) launched the SUCEST project (Sugarcane Expressed Sequence Tag Project), whose main objective was to identify 50,000 genes by sequencing the expressed genome of sugarcane from random clones derived from 26 cDNA libraries prepared from various organs and tissues at different developmental stages (Table 2). The database currently holds a total of 291,689 ESTs, grouped into 43,141 clusters, which can be used to identify the gene composition of sugarcane and to determine differential expression in each library (SUCEST, 2008).

Table 2. Brief description of the libraries generated in the SUCEST project, including the library codes, the total number of ESTs, and the description of the tissues and conditions of cDNA extraction (Source: SUCEST database, http://sucest.lad.ic.unicamp.br/en/).
Library code                  Total ESTs  Description
AD1                           18137       Infection of tissues of in vitro-grown plants by Gluconacetobacter diazotrophicus
AM1, AM2                      28128       Apical meristem
CL6                           11872       Calli treated for 12 h at 4 ºC and 37 ºC in the dark and in the light
FL1, FL2, FL3, FL4, FL5, FL8  83899       Flower at different developmental stages
HR1                           12000       Infection of tissues of in vitro-grown plants by Herbaspirillum diazotroficans
LB1, LB2                      18047       Lateral branch of adult plants
LR1, LR2                      18141       Leaf primordium
LV1                           6432        In vitro leaf growth
NR1, NR2                      768         All libraries
RT1, RT2, RT3                 31487       Root apex and 0.3 cm from the root apex in mature plants
RZ1, RZ2, RZ3                 24096       Root-stem transition of young plants
SB1                           16318       Stalk
SD1, SD2                      21406       Seed development
ST1, ST3                      20762       First or fourth stem internode of young plants

1.5. Bioinformatic Analysis

1.5.1. Background and Current Applications

Bioinformatics, which emerged in the early 1980s, combines knowledge from mathematics, statistics, computer science, biology, and chemistry in order to manage and analyze large amounts of biological data (Borém and Santos, 2001; Carraro and Kitajima, 2002). The field gained greater visibility with the massive production of gene and protein sequences from the Human Genome Project. The large volume of data generated demanded increasingly efficient computational resources for storage and analysis, and bioinformatics thus came to play an essential role in other genome projects (Prosdocini et al., 2002). Today bioinformatics makes it possible to handle a great diversity of biological data; the programs and algorithms developed can store, process, and analyze data, decipher structures, trace relationships between molecules and pathways, and interpret large amounts of information (Borém and Santos, 2001).
In Brazil, the Bioinformatics Laboratory at Unicamp pioneered the development and application of several computational tools for genomic research. In 2000 it was responsible for the in silico assembly of the genome of the bacterium Xylella fastidiosa, the first sequenced in the country (Simpson et al., 2000). Several other bioinformatics centers subsequently emerged in Brazil, and various national and regional genome sequencing networks were created, such as the Laboratório Nacional de Computação Científica in Petrópolis, which hosts the Bioinformatics Center of the Brazilian Genome Project. Projects on expressed genomes are also under way in the country, such as the Human Cancer Genome Project of FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo) and the Schistosoma mansoni project carried out by the Minas Gerais Genome Network (Santos and Ortega, 2003).

1.5.2. Databases, Tools, and Programs

The massive generation of nucleotide and protein sequences made it necessary to create databases able to store this large amount of data and give the different research groups access to it. Many public and private databases now exist and, besides access to the deposited data, several of them provide important information about the stored sequences and useful tools for handling them (Morais, 2003).

GenBank, hosted at NCBI (National Center for Biotechnology Information), is a US database that gives unrestricted access to nucleotide and protein sequences from a wide variety of organisms. It currently holds 96,400,790 sequences, totaling 97,381,682,336 bases (NCBI, 2008).
GenBank is also part of the International Nucleotide Sequence Database Collaboration, which comprises the Japanese (DDBJ, DNA Database of Japan), European (EMBL, European Molecular Biology Laboratory), and US (GenBank) databases. This network allows the databases to exchange data continuously so that they are updated periodically (NCBI, 2008). Also noteworthy are the PDB (Protein Data Bank), the PIR (Protein Information Resource), and KEGG (Kyoto Encyclopedia of Genes and Genomes), which also maintain a constant exchange of data (Tateno et al., 2002; Prosdocini et al., 2002). NCBI also provides other databases, such as UniGene, which groups all sequences derived from transcriptomes, and RefSeq, which gathers only the most representative sequences of a transcript. Besides the databases, NCBI provides information on taxonomy, complete genomes, gene maps, and protein structures, as well as PubMed, a literature search tool (Benson et al., 2000).

Alongside the creation of databases, several tools and programs were developed to analyze the sequences continuously generated by sequencing projects. The main tools include BLAST (Basic Local Alignment Search Tool), used to search for sequences by base or amino acid similarity (Altschul et al., 1990), and ORF-finder, which can translate nucleotide sequences in all six reading frames (NCBI, 2008).
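The six-frame translation mentioned for ORF-finder can be sketched in a few lines. This is a minimal illustration and not the NCBI implementation: it assumes the standard genetic code and DNA written with A, C, G, and T only (any other codon is reported as X), and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; '*' marks a stop."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(seq):
    """Translate a DNA sequence in the three forward and three reverse frames."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def longest_orf(frames):
    """Return (frame, peptide) for the longest stretch that starts with M.

    A stretch may run to the end of the sequence without a stop codon,
    which is common for partial transcripts such as ESTs.
    """
    best = ("", "")
    for name, protein in frames.items():
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best[1]):
                best = (name, segment[start:])
    return best


frames = six_frame_translation("ATGGCCTAA")
```

For the toy sequence above, frame +1 reads ATG GCC TAA, so `longest_orf(frames)` picks that frame. Real ORF prediction usually adds a minimum length threshold and may accept alternative start codons, which this sketch leaves out.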
While BLAST performs local similarity analyses, the CLUSTAL program performs multiple alignments, from either nucleotide or amino acid sequences, that take global similarity into account. The program also allows the construction of cladograms and phenograms for phylogenetic and phenetic inference, which can be viewed, for example, in the TreeView program (Page, 1996; Thompson et al., 1997).

MEGA (Molecular Evolutionary Genetics Analysis) allows the analysis of evolutionarily informative characters. It can also compute genetic distance matrices and analyze the composition of nucleotide and protein sequences. The program also provides algorithms such as UPGMA (Unweighted Pair Group Method with Arithmetic Mean) (Sneath and Sokal, 1973), NJ (Neighbor-Joining) (Saitou and Nei, 1987), and maximum parsimony (Eck and Dayhoff, 1966; Fitch, 1971), allowing phylogenetic and phenetic inferences through the construction of dendrograms (Sudhir et al., 2004).

Another program widely used by bioinformaticians is CLUSTER, which allows the analysis of genomic sequences and of data generated by microarray, SAGE (Serial Analysis of Gene Expression), and EST experiments, among others. It includes tools for self-organizing maps, K-means clustering, and hierarchical clustering, which allow the in silico expression profile of genes to be studied (Eisen et al., 1998).

2. References

Adetula OA (2006). Comparative study of the karyotypes of two Vigna sub species. Afric. J. Biotechnol. 5:563-565.
Agre P, Brown D and Nielsen S (1995). Aquaporin water channels: unanswered questions and unresolved controversies. Curr. Opin. Cell Biol. 7:472–483.
Albino JC, Creste S and Figueira A (2006). Mapeamento genético da cana-de-açúcar. Biotec Ciên. Desenvol. 36:82-91.
Altschul SF, Gish W, Miller W, Myers EW, et al. (1990). Basic local alignment search tool. J. Mol. Biol. 215:403-410.
Ané JM, Kiss GB, Riely BK, Penmetsa RV, et al. (2004). Medicago truncatula DMI1 required for bacterial and fungal symbioses in legumes. Sci. 303:1364-1367.
Appleby CA (1984). Leghemoglobin and Rhizobium respiration. Ann. Rev. Plant Physiol. 35:521–554.
Araújo JMD, Silva ACD and Azevedo JL (2000). Isolation of endophytic actinomycetes from roots and leaves of maize (Zea mays L.). Braz. Arch. Biol. Tech. 43:447-451.
Artschwager E and Brandes EW (1958). Sugarcane (Saccharum officinarum L.) Origin, classification and characteristics of representative clones. In: Agriculture Handbook nº122. Department of Agriculture, Washington, pp 307-308.
Asad S, Fang Y, Wycoff KL and Hirsch AM (1994). Isolation and characterization of cDNA and genomic clones of MsENOD40: Transcripts are detected in meristematic cells of alfalfa. Protoplasma 183:10–23.
Azevedo JL and Araújo WL (2007). Diversity and applications of endophytic fungi isolated from tropical plants. In: Ganguli BN, Deshmukh SK (Ed.) Fungi: multifaceted microbes. Boca Raton. CRC Press 6:189-207.
Baker ME and Saler MH (1990). A common ancestor for bovine lens fiber major intrinsic protein, soybean nodulin-26 protein, and E. coli glycerol facilitator. Cell 60:185-186.
Barbosa GVS, Souza AJR, Rocha AMC, Ribeiro CAG, et al. (2000). Novas variedades RB de cana-de-açúcar para Alagoas. Maceió: UFAL; Programa de Melhoramento Genético de Cana-de-Açúcar, 16p. (Boletim Técnico Programa de Melhoramento Genético de Cana-de-Açúcar, 1).
Barela JF (2005). Seletividade de herbicidas para a cultura da cana-de-açúcar (Saccharum spp.) afetada pela interação com nematicidas aplicados no plantio. 82p.
Dissertação (Mestrado na área de Fitotecnia) – Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Piracicaba.
Barone A and Saccardo F (1990). Pachytene morphology of cowpea chromosomes. In: Cowpea genetic resources. Edited by Ng NQ and Monti LM. IITA, Ibadan, Nigeria. pp 137–143.
Barratt DHP, Barber L, Kruger NJ, Smith AM, et al. (2001). Multiple, distinct isoforms of sucrose synthase in pea. Plant Physiol. 127:655–664.
Barreto PD (1999). Recursos genéticos e programa de melhoramento de feijão-de-corda no Ceará: Avanços e perspectivas. In: Queirós MA de, Goedert CO and Ramos SRR (eds) Recursos genéticos e melhoramento de plantas para o Nordeste Brasileiro. EMBRAPA, CPATSA, Petrolina.
Bastos E (1987). Cana-de-açúcar, o verde mar de energia. São Paulo: Editora Ícone. pp 130. (Coleção Brasil Agrícola).
Baud S, Vaultier MN and Rochat C (2004). Structure and expression profile of the sucrose synthase multigene family in Arabidopsis. J. Exp. Bot. 55:397–409.
Benko-Iseppon AM (2001). Estudos moleculares no caupi e em espécies relacionadas: Avanços e perspectivas. Embrapa Documentos 56:327-332.
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, et al. (2000). GenBank. Nucleic Acids Res. 28:15-18.
Birch PRJ and Kamoun S (2000). Studying interaction transcriptomes: coordinated analyses of gene expression during plant–microorganism interactions. Trends in plant sci. 77-82.
Bolle C (2004). The role of GRAS proteins in plant signal transduction and development. Planta 218:683–692.
Borém A and Santos FR (2001). Biotecnologia simplificada. Editora Suprema. Viçosa, MG.
Borisov AY, Madsen LH, Tsyganov VE, Umehara Y, et al. (2003). The Sym35 gene required for root nodule development in pea is an ortholog of NIN from Lotus japonicus. Plant Physiol. 131:1009–1017.
Brandes EW (1956). Origin, dispersal and use in breeding of the Melanesian garden sugarcanes and their derivatives Saccharum officinarum L. Proc. Cong. Int. Soc. Sug. Technol. 9:709-750.
Brundrett M (2002).
Co-evolution of roots and mycorrhizas of land plants. New Phytol. 154:275–304.
Calegari A (2000). Coberturas verdes em sistemas intensivos de produção. In: Workshop Nitrogênio na sustentabilidade de sistemas intensivos de produção agropecuária. Dourados. Anais: Embrapa Agropecuária Oeste; Embrapa Agrobiologia pp 141-153.
Calvert C, Gant SJ and Bowles DJ (1996). Tomato annexins p34 and p35 bind to F-actin and display nucleotide phosphodiesterase activity inhibited by phospholipid binding. Plant Cell 8:333-342.
Camargos LS (2002). Análise das alterações no metabolismo de nitrogênio em Canavalia ensiformis (L.) em resposta a variações na concentração de nitrato fornecida. Dissertação de mestrado, Piracicaba, São Paulo – Brasil.
Capoen W, Goormachtig S, De Rycke R, Schroeyers K, et al. (2005). SrSymRK, a plant receptor essential for symbiosome formation. Proc. Natl. Acad. Sci. U.S.A. 102:10369-10374.
Carson DL and Botha FC (2000). Preliminary analysis of expressed sequence tags for sugarcane. Crop Sci. 40:1769-1779.
Carraro DM and Kitajima JP (2002). Sequenciamento e bioinformática de genomas bacterianos. Biotecnol. Ciên. Desenvol. 28:16-20.
Casu R, Dimmock C, Thomas M, Bower N, et al. (2001). Genetic and expression profiling in sugarcane. Proc. Int. Soc. Sugarcane Technol. 24:626-627.
Catalano CM, Lane WS and Sherrier DJ (2004). Biochemical characterization of symbiosome membrane proteins from Medicago truncatula root nodules. Electrophoresis 25:519-531.
Catoira R, Galera C, de Billy F, Penmetsa RV, et al. (2000). Four genes of Medicago truncatula controlling components of a nod factor transduction pathway. Plant Cell 12:1647–1665.
Cebolla A, Vinardell JM, Kiss E, Olàh B, et al. (1999). The mitotic inhibitor CCS52 is required for endoreduplication and ploidy-dependent cell enlargement in plants. Euro. Mol. Biol. J. 18:4476-4484.
Chanda SV, Sood CR, Reddy VS and Singh YD (1998). Influence of plant growth regulators on some enzymes of nitrogen assimilation in mustard seedlings.
J. Plant Nutrition 21:1765-1777.
Charon C, Sousa C, Crespi M and Kondorosi A (1999). Alteration of ENOD40 expression profiles modifies Medicago truncatula root nodule development induced by Sinorhizobium meliloti. Plant Cell 11:1953-1965.
Chen WX, Tan ZY, Gao JL, Li Y, et al. (1997). Rhizobium hainanense sp. nov., isolated from tropical legumes. Intern. J. Syst. Bacteriol. 47:870-873.
Chen WX, Yan GH and Li JL (1988). Numerical taxonomic study of fast-growing soybean rhizobia and a proposal that Rhizobium fredii be assigned to Sinorhizobium gen. nov. Intern. J. Syst. Bacteriol. 38:392-397.
Chen C, Gao M, Liu J and Zhu H (2007). Fungal symbiosis in rice requires an ortholog of a legume common symbiosis gene encoding a Ca2+/calmodulin-dependent protein kinase. Plant Physiol. 145:1619–1628.
Clark GB and Roux SJ (1995). Annexins of plant cells. Plant Physiol. 109:1133-1139.
Cronk Q, Ojeda I and Pennington RT (2006). Legume comparative genomics: progress in phylogenetics and phylogenomics. Cur. Opin. Plant Biol. 9:99-103.
Dal Santo P, Logan MA, Chisholm AD and Jorgensen EM (1999). The inositol trisphosphate receptor regulates a 50-second behavioral rhythm in C. elegans. Cell 98:757-767.
Dangl JL and Jones JDG (2001). Plant pathogens and integrated defence responses to infection. Nature 411:826–833.
De Koninck P and Schulman H (1998). Sensitivity of CaM kinase II to the frequency of Ca2+ oscillations. Sci. 279:227-230.
Dantas B (1960). Contribuição para a história da “gomose” da cana-de-açúcar, em Pernambuco e no Brasil. In: Boletim Técnico nº11 Instituto Agronômico do Nordeste. Recife, pp 3-17.
Dean RM, Rivers RL, Zeidel ML and Roberts DM (1999). Biochem. 38:347-353.
Demattê JL (2005). I. Recuperação de manutenção da fertilidade do solo. Piracicaba, POTAFOS, Informações Agronômicas, Piracicaba, SP, 111:1-24. (Encarte Técnico).
Dickstein R, Hu XJ, Yang J, Ba L, et al. (2002). Differential expression of tandemly duplicated Enod8 genes in Medicago. Plant Sci. 163:333-343.
Dixon RA e Sumner LW (2003). Legume natural products: understanding and manipulating complex pathways for human and animal health. Plant Physiol. 131: 878–885. Duncan KA, Hardin SC e Huber SC (2006). The three maize sucrose synthase isoforms differ in distribution, localization, and phosphorylation. Plant Cell Physiol. 47: 959–971. Eberhart SA e Russel WA (1966). Stability parameters for comparing varieties. Crop Sci. 6:36-40. Eck RV e Dayhoff MO (1966). Atlas of protein sequence and structure. Silver Spring, Md: Nat. Biomed. Res. Found. 161-169. Ehlers JD e Hall AE (1997). Cowpea (Vigna unguiculata Walp L.). Field Crops Res. 53:187-204. Eisen MB, Spellman PT, Brown PO e Botstein D (1998). Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. U.S.A. 95:14863-14868. 55 Emes MJ e Fowler MW (1979). The intracellular location of the enzymes of nitrate assimilation in the apices of seedling pea roots. Planta 144:249-253. Endre G, Kereszt A, Devei Z, Mihacea S, et al. (2002). A receptor kinase gene regulating symbiotic nodule development. Nature 417:962-966. Fang Y e Hirsh AM (1998). Studying early gene ENOD40 expression and induction by nodulation factor and cytokinin in transgenic alfalfa. Plant Physiol. 116:53-68. Fatokun CA, Tarawali SA, Singh BB, Kormawa PM, et al. (editors) (2002). Challenges and opportunities for enhancing sustainable cowpea production. Proceedings of the World Cowpea Conference III held at the International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria, pp 4–8 Setembro. Favery B, Complainville A, Vinardell JM, Lecomte P, et al. (2002). The endosymbiosisinduced genes ENOD40 and CCS52a are involved in endoparasitic-nematode interactions in Medicago truncatula. Mol. Plant-Microbe Interac. 15:1008–1013. Fernandes MF, Fernandes RPM e Hungria M (2003). Caracterização genética de rizóbios nativos dos tabuleiros costeiros eficientes em culturas do guandu e caupí. Pesq. Agropec. Bras. 8:911-920. 
Fery RL (1990). The cowpea: production, utilization and research in the United States. Hort. Rev. 12:197-222. Finlay KW e Wilkinson GN (1963). The analysis of adaptation in plant breeding programme. Austr. J. Agric. Res. 14:742-754, 1963. Fitch WM (1971). Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20:406-416. Flemetakis E, Kavroulakis N, Quaedvlieg NEM, Spaink HP, et al. (2000). Lotus japonicus contains two distinct Enod40 genes that are expressed in symbiotic, nonsymbiotic, and embryonic tissues. Mol. Plant-Microbe Inter. 13: 987–994. FNP Consultoria e comércio (2004). Agrianual 2004: anuário da agricultura brasileira. São Paulo. pp 546. Foucher F e Kondorosi E (2000). Cell cycle regulation in course of nodule organogenesis in Medicago. Plant Mol. Biol. 43:773-786. Franssen HJ, Nap JP e Bisseling T (1992). Nodulins in root nodule development. In: Biological Nitrogen Fixation (Stacey G, Burris RH and Evans HJ, eds) New York, London: Chapman & Hall pp 598-624. Freire-Filho FR (1988). Cowpea taxonomy and introduction to Brazil. In: Watt EE and Araújo JPP de (eds) Cowpea research in Brazil IITA, EMBRAPA, Brasília, pp 3-10. Freire-Filho FR, Ribeiro VQ, Rocha MM e Lopes ACA (2002). Adaptabilidade e estabilidade da produtividade de grãos de linhagens de caupi de porte enramador. Rev. Ceres. Viçosa. 49:383-393. 56 Freire-Filho FR, Ribeiro VQ, Rocha MM e Lopes ACA (2001). Adaptabilidade e estabilidade de rendimento de grãos de genótipos de caupi de porte semi-ereto. Rev. Cient. R. 6:31-39. Freire-Filho FR, Ribeiro VQ, Barreto PD e Santos CAF (1999). Melhoramento genético do caupi (Vigna unguiculata (L.) Walp.) na região nordeste. In: Queirós MA, Goedert CO e Ramos SRR (eds). Recursos genéticos e melhoramento de plantas para o Nordeste Brasileiro, EMBRAPA. Gonnet S e Diaz P (2000). Glutamine synthetase and glutamate synthase activities in relation to nitrogen fixation in Lotus spp. Rev. Bras. Fisiol. Veg. 12:195–202. 
Graham PH e Vance CP (2003). Legumes: importance and constraints to greater use. Plant Physiol. 131: 872–877. Gresshoff PM (1993). Molecular genetic analysis of nodulation genes in soybean. Plant Breeding Rev. 11:275-318. Gresshoff PM (2003). Post-genomic insights into plant nodulation symbioses. Gen. Biol. 4:201. Grivet L e Arruda P (2001). Sugarcane genomics: depiciting the complex genome of an important tropical crop. Curr. Opin. Plant Biol. 5:122-127. Harada T, Satoh S, Yoshioka T e Ishizawa K (2005). Expression of sucrose synthase genes involved in enhanced elongation of pondweed (Potamogeton distinctus) turions under anoxia. Ann. Bot. 96:683–692. Hardison RC (1996). A brief history of hemoglobins: plant, animal, protist, and bacteria. Proc. Natl. Acad. Sci. U.S.A. 93: 5675–5679. Hirsch AM, Lum MR e Downie JA (2001). What makes the Rhizobia-legume symbiosis so special? Plant Physiol. 127:1484–1492. Hodge A, Campbell CD e Fitter AH (2001). An arbuscular mycorrhizal fungus accelerates decomposition and acquires nitrogen directly from organic material. Nature 413:297–299. Hohnjec N, Becker JD, Puhler A, Perlick AM, et al. (1999). Genomic organization and expression properties of the MtSucS1 gene, which encodes a nodule-enhanced sucrose synthase in the model legume Medicago truncatula. Mol. Gen. Genet. 261: 514–522. Horst I, Welham T, Kelly S e Kaneko T (2007). TILLING Mutants of Lotus japonicus reveal that nitrogen assimilation and fixation can occur in the absence of noduleenhanced Sucrose Synthase. Plant Physiol. 144:806–820. Hungría M, Barradas C e Wallsgrove R (1991). Nitrogen fixation, assimilation and transport during the initial growth stage of Phaseolus vulgaris L. J. Exp. B. 42:839844. Imaizumi-Anraku H, Takeda N, Charpentier M, Perry J, et al. (2005). Plastid proteins crucial for symbiotic fungal and bacterial entry into plant roots. Nature 433:527-531. 57 Ingelbrecht IL, Irvine JE e Mirkov AC (1999). 
Posttranscriptional and silencing in transgenic sugarcane: Dissection of homology-dependent virus resistance in monocot that has a complex polyploid genome. Plant Physiol. 199:1187-1197. Jessen D, Barnes DK e Vance CP (1988). Bidirectional selection in alfalfa for activity of nodule nitrogen and carbon assimilating enzymes. Crop Sci. 28:18-22. Jeong J, Suh S, Guan C, Tsay YF, et al. (2004). A nodule-specific dicarboxylate transporter from alder is a member of the peptide transporter family. Plant Physiol. 134:969– 978. Johanson U, Karlsson M, Johansson I, Gustavsson S, et al. (2001). The complete set of genes encoding major intrinsic proteins in arabidopsis provides a framework for a new nomenclature for major intrinsic proteins in plants. Plant Physiol. 126:13581369. Jordan DC (1984). Transfer of Rhizobium japonicum, Bucchanan 1980 to Bradyrhizobium gen. nov., a genus of slow-growing, root nodule bacteria from leguminous plants. Intern. J. Syst. Bacteriol. 32:136-139. Jung JS, Preston GM, Smith BL, Guggino WB, et al. (1994). Molecular structure of the water channel through aquaporin CHIP. The hourglass model. J. Biol. Chem. 269:14648–14654. Kaetzel MA e Dedman JS (1995). Annexins: novel Ca2+-dependent regulators of membrane function. News Physiol. Sci. 10:171-176. Kaiser BN, Moreau S, Castelli J, Thomson R, et al. (2003) The soybean NRAMP homologue, GmDMT1, is a symbiotic divalent metal transporter capable of ferrous iron transport. The Plant J. 35:295–304. Kalo P, Gleason C, Edwards A, Marsh J, et al. (2005). Nodulation signaling in legumes requires NSP2, a member of the GRAS family of transcriptional regulators. Sci. 308:1786–1789. Kanamori N, Madsen LH, Radutoiu S, Frantescu M, et al. (2006). A nucleoporin is required for induction of Ca2+ spiking in legume nodule development and essential for rhizobial and fungal symbiosis. Proc. Natl. Acad. Sci. U.S.A. 103: 359–364. Kent NA e Shiels A (1990). 
Nucleotide and derived amino acid sequence of the major intrinsic protein of rat eye lens. Nucleic Acid Res. 18:4256. Khan MS, Zaidi A e Wani PA (2007). Role of phosphate-solubilizing microorganisms in sustainable agriculture - a review. Agro. Sust. Develop. 1:29-43. Kistner C e Parniske M (2002). Evolution of signal transduction in intracellular symbiosis. Trends Plant Sci. 7:511-518. Kondorosi E, Roudier F e Gendreau E (2000). Plant cell-size control: growing by ploidy? Curr. Opin. Plant Biol. 3:488–492. 58 Kouchi H e Hata S (1993). Isolation and characterization of novel nodulin cDNAs representing genes expressed at early stages of soybean nodule development. Mol. Gen. Genet. 238:106–119. Kouchi H, Takane K, So RB, Ladha JK, et al. (1999). Rice ENOD40: Isolation and expression analysis in rice and transgenic soybean root nodules. Plant J. 18:121-129. Kumagai H, Kinoshita E, Ridge RW e Kouchi H (2006). RNAi Knock-Down of ENOD40s leads to significant suppression of nodule formation in Lotus japonicus. Plant Cell Physiol. 47:1102–1111. Kuykendall LD, Saxena B, Devine TE e Udell SE (1992). Genetic diversity in Bradyrhizobium japonicum Jordan 1982 and a proposal for Bradyrhizobium elkanii sp. nov. Can. J. Microbiol. 38:501-505. Kuklinsky-Sobral J, Araujo WL, Mendes R, Geraldi IO, et al. (2004). Isolation and characterization of soybean-associated bacteria and their potential for plant growth promotion. Environm. Microbiol. 12:1244-1251. Lajudie P, Willems A, Pot B, Dewettinck D, et al. (1994). Polyphasic taxonomy of rhizobia: emendation of the genus Sinorhizobium and description of Sinorhizobium meliloti comb. nov., Sinorhizobium saheli sp. nov., and Sinorhizobium teranga sp. nov. Intern. J. Syst. Bacteriol. 44:715-733. Larkins BA, Dilkes BP, Dante RA, Coelho CM, et al. (2001). Investigating the hows and whys of DNA endoreduplication. J. Exp. Bot. 52:183-192. Laursen NB, Larsen K, Knudsen JY, Hoffmann HJ, et al. (1994). 
A protein binding ATrich sequence in the soybean leghemoglobin c3 promoter is a general cis element that requires proximal DNA elements to stimulate transcription. Plant Cell 6: 659– 668. Lévy J, Bres C, Geurts R, Chalhoub B, et al. (2004). A putative Ca2+ and calmodulindependent protein kinase required for bacterial and fungal symbioses. Sci. 303:1361-1364. Limpens E, Franken C, Smit P, Willemse J, et al. (2003). LysM domain receptor kinases regulating rhizobial Nod factor-induced infection. Sci. 302:630–633. Long SR (1996). Rhizobium symbiosis: Nod factors in perspective. Plant Cell 8:1885-1898. Long SR (2001). Genes and signals in the Rhizobium–legume symbiosis. Plant Physiol. 125:69–72. Lu YH, D’Hont A, Paulet F, Grivet L, et al. (1994). Molecular diversity and genome structure in modern sugarcane varieties. Euphytic. 78:217-226. Macedo IDC (2001). Agroindústria da cana-de-açúcar: participação na redução da taxa de carbono na atmosfera no Brasil. Informativo CTC (Centro de Tecnologia Copersucar), Piracicaba, nº 67, pp 1-4. 59 Madhaiyan M, Poonguzhali S, Senthilkumar M, Seshadri S, et al. (2004). Growth promotion and induction of systemic resistance in rice cultivar Co-47 (Oryza sativa L.) by Methylobacterium spp. Bot. Bull. Acad. Sin. 4:315-324. Madsen EB, Madsen LH, Radutoiu S, Olbryt M, et al. (2003). A receptor kinase gene of the LysM type is involved in legume perception of rhizobial signals. Nature 425:637– 640. Magalhães ACN (1987). Ecofisiologia da cana-de-açúcar: aspectos do metabolismo do carbono na planta. In: Castro PRC, Ferreira SO, Yamada T (Ed.). Ecofisiologia da produção agrícola. Piracicaba Potafós pp 113-118. Magloire N (2005). The genetic, morphological and physiological evaluation of african cowpea genotypes. University of the free state, bloemfontein. Marsh JF, Rakocevic A, Mitra RM e Brocard L (2007). 
Medicago truncatula NIN is essential for rhizobial-Independent nodule organogenesis induced by autoactive calcium/calmodulin-dependent protein kinase. Plant physiol. 144:324–335. Manthey K, Krajinski F, Hohnjec N, Firnhaber C, et al. (2004). Transcriptome profiling in root nodules and arbuscular mycorrhiza identifies a collection of novel genes induced during Medicago truncatula root endosymbioses. Mol. Plant–Microbe Interac. 17:1063–1077. Matsuoka S, Garcia AAF e Arizono H (1999). Melhoramento da cana-de-açúcar. In: Aluízio Borém. (ed) Melhoramento de Espécies Cultivadas. 1st edition. Editora UFV,Viçosa, pp 205-251. Mauro VP e Verma DPS (1988). Transcriptional activation in nuclei from uninfected soybean of a set of genes involved in symbiosis with Rhizobium. Mol. PlantMicrobe Interact. 1:46-51. McClung AD, Carrol AD e Battey NH. (1994). Identification and characterization of ATPase activity associated with maize (Zea mays) annexins. Biochem. J. 303:709712. Miflin BJ e Lea PJ (1980). Ammonia assimilation. In: Miflin BJ (Ed.). The biochemistry of plants: amino acids and derivatives. New York: Academic Press, pp 169-202. Mims MP e Prchal JT (2005). Divalent metal transporter 1. Hematol. 10:339–345. Misaghi IJ e Donndelinger CR (1990). Endophytic bacteria in symptom-free cotton plants. Phytopathol. 9:808-811. Mitra RM, Gleason CA, Edards A, Hadfield J, et al. (2004). A Ca2+/calmodulin-dependent protein kinase required for symbiotic nodule development: gene identification by transcript-based cloning. Proc. Natl. Acad. Sci. U.S.A., impresso. Morais DAL (2003). Análise boinformática de genes de resitência à patógenos no genoma expresso da cana-de-açúcar. Dissertação (Mestrado), Universidade Federal de Pernambuco, Pernambuco, Recife, pp 114. 60 Moreira FMS e Siqueira JO (2002). Microbiologia e bioquímica do solo. Lavras: Universidade Federal de Lavras. pp 625. Morgan SO e Fernandez MP (1997). 
Distinct annexin subfamilies in plants and protists diverged prior to animal annexins and from a common ancestor. J. Mol. Evol. 44:178-188. Moss SE (1997). Annexins. Trends Cell Biol. 7:87-89. Mullin BC, Swensen SM e Goetting-Minesky P (1990). in Nitrogen Fixation Achievements and Objectives, eds. Gresshoff PM, Roth LE, Stacey G e Newton WE. Chapman & Hall, New York pp 781-787. Mylona P, Pawlowski K e Bisseling T (1995). Symbiotic nitrogen fixation. Plant Cell 7:869-885. Neto PASP, Azevedo JL e Araujo WL (2003). Microrganismos endofíticos: interação com plantas e potencial biotecnológico. Biotecnol. Ciên. Desenvol. 29:62-76. Ng NQ e Maréchal R (1985). Cowpea taxonomy, origin germ plasm. In: Sinch SR, Rachie KO, eds. Cowpea research, production end utilization. Cheichecter, Johm Wiley. pp 11-21. Niebel FC, Lescure N, Cullimore JV e Gamas P (1998). The Medicago truncatula MtAnn1 gene encoding an annexin is induced by Nod factors and during the symbiotic interaction with Rhizobium meliloti. Mol. Plant-Microbe Interac. 11:504–513. Nogueira EM, Vinagre F, Masuda HP, Vargas C, et al. (2001). Expression of sugarcane genes induced by inoculation with Gluconacetobacter diazotrophicus and Herbaspirillum rubrisubalbicans. Gen. Mol. Biol. 24:199-206. Novaretti WRT (1981). Efeitos de diferentes níveis de populações iniciais de Meloidogyne javanica em duas variedades de cana-de-açúcar (Saccharum spp.) cultivadas no Estado de São Paulo. Dissertação de Mestrado em Entomologia – Escola Superior de Agricultura “Luiz de Queiroz”, Universidade de São Paulo, Piracicaba. Oldroyd GED e Downie AL (2004). Calcium, kinases and nodulation signalling in legumes. Mol. Cell Biol. 5:566-576. Oldroyd GED e Downie JA (2006). Nuclear calcium changes at the core of symbiosis signalling. Curr. Opin. Plant Biol. 9:351–357. Oldroyd GED e Long SR (2003). Identification and characterization of nodulationsignaling pathway 2, a gene of Medicago truncatula involved in Nod factor signaling. Plant Physiol. 
131:1027–1032. Oldroyd GED, Harrison MJ e Udvardi M (2005). Peace talks and trade deals. Keys to longterm harmony in legume-microbe symbioses. Plant Physiol. 137:1205. Ott T, van Dongen JT, Günther C, et al. (2005). Symbiotic leghemoglobins are crucial for nitrogen fixation in legume root nodules but not for general plant growth and development. Curr. Biol. 15:531–535. 61 Padulosi S, Ng QN e Perrino P (1997). Origin, taxonomy and morphology of Vigna unguiculata (L.) Walp. In: Singh BB and Raj M (eds) Advances in Cowpea Research. Page RD (1996). Treeview program, version 161. Comput. Appl. Biosci. 12:357-358. Pant KC, Chandel KPS e Joshi BS (1982). Analysis of diversity in indian cowpea genetic resources. SABRO J. 14: 103-111. Pao SS, Paulsen IT e Saier MH Jr (1998). Major facilitator superfamily. Microbiol. Mol. Biol. Rev. 62:1–34. Papadopoulou K, Roussis A e Katinakis P (1996). Phaseolus ENOD40 is involved in symbiotic and non-symbiotic organogenetic processes: expression during nodule and lateral root development. Plant Mol. Biol. 30:403–417. Parniske M (2000). Intracellular accommodation of microbes by plants: a common developmental program for symbiosis and disease? Curr. Opin. Plant Biol. 3:320– 328. Parniske M e Downie JA (2003). Plant biology: locks, keys and symbioses. Nature 425: 569–570. Paterson AH (2006). Leafing through the genomes of our major crop plants: strategies for capturing unique information. Nature Rev. Gen. 7:174-184. Patil S, Takezawa D e Poovaiah BW (1995). Chimeric plant calcium/calmodulin-dependent protein kinase gene with a neural visinin-like calcium-binding domain. Proc. Natl. Acad. Sci. U.S.A. 92:4897–4901. Perret X, Staehelin C e Broughton W (2000). Molecular basis of symbiotic promiscuity. Microbiol. Mol. Biol. Rev. 64:180–201. Perrin RM e Wigge PA (2002). Genome studies and molecular genetics/Plant biotechnology web alert. Curr. Opin. Plant Biol. 5:89-90. Pignone D, Cifarelli S e Perrino P (1990). 
Chromosome identification in Vigna unguiculata (L.) Walp. In Cowpea genetic ressources. Edited by Ng NQ e Monti LM. IITA Ibadan, Nigeria. pp 144–150. Pinazza LA e Alimandro R (2003). Cana-de-açúcar: Alimento bom e doce. Agroanalysis 23:9-31. Portieles R, Rodriguez R, Hernández I, Canales E, et al. (2002). Determinación del número cromosómico de um grupo de clones silvestres de origen desconocido y clones de fundacióndel complejo Saccharum. Cult. Trop. 23:69-72. Pringle D e Dickstein R (2004). Purification of ENOD8 proteins from Medicago sativa root nodules and their characterization as esterases. Plant Physiol. Bioch. 42:73–79. Prosdocini F, Cerqueira GC, Binneck E, Silva AF, et al. (2002). Bioinformática: manual do usuário. Biotecnol. Cien. Desenvol. 29:12-25. 62 Radutoiu S, Madsen LH, Madsen EB e Felle HH (2003). Plant recognition of symbiotic bacteria requires two LysM receptor-like kinases. Nature 425:585–592. Rao YL, Jan Y e Jan YN (1990). Similarity of the product of the Drosophila neurogenic gene big brain to transmembrane channel proteins. Nature 345:163-167. Rausch C, Daram P, Brunner S, Jansa J, et al. (2001). A phosphate transporter expressed in arbuscule-containing cells of potato. Nature 414:462–466. Raven PH, Evert RF e Eichhor SE (2001). Biologia Vegetal. Sexta ed. Ed. Guanabara Koongan S.A. Raynal P e Pollard HB (1994). Annexins: The problem of assessing the biological role for a gene family of multifunctional calcium- and phospholipid-binding proteins. Biochem. Biophys. Acta 1197:63-93. Reibach PH e Streeter JG (1984). Evaluation of active versus passive uptake of metabolites by Rhizobium japonicum bacteroids. J. Bacteriol. 159:47–52. Reinhold-Hurek B e Hurek T (1998). Life in grasses: diazotrophic endophytes. Trends Microbiol. 6:139-144. Rivers RL, Dean RM, Chandyi G, Halli JE, et al. (1997). Functional analysis of Nodulin 26, an aquaporin in soybean root nodule symbiosomes. J. Biol. Chem. 272:6256–16261. Roach BT e Daniels J (1987). 
A review of the origin and improvement of sugarcane. In: COPERSUCAR International Sugarcane Workshop. COPERSUCAR, Brasil, pp 131. Roberts DM e Tyerman SD (2002). Voltage-dependent cation channels permeable to NH4+, K+, and Ca2+ in the symbiosome membrane of the model legume Lotus japonicus. Plant Physiol. 128:370–378. Rosenblueth M, Martinez L, Silva J e Martinez-Romero E (2004). Klebsiella variicola, a novel species with clinical and plant-associated isolates. Syst. App. Microbiol. 1:2735. Rosse LN, Vencovsky R e Ferreira DF (2002). Comparação de métodos de regressão para avaliar a estabilidade fenotípica em cana-de-açúcar. Pesq. agropec. bras. 37:25-32. Ruan YL, Llewellyn DJ e Furbank RT (2003). Suppression of sucrose synthase gene expression represses cotton fiber cell initiation, elongation, and seed development. Plant Cell 15: 952–964. Rubini MR, Silva-Ribeiro RT, Pomella AWV, Maki CS, et al. (2005). Diversity of endophytic fungal community of cacao (Theobroma cacao L.) and biological control of Crinipellis perniciosa, causal agent of Witches' Broom Disease. Int. J. Biol. Scien. 1:24-33. Saccardo F, Del GiLidice A e Galasso I (1992). Cytogenetics of cowpea. In: eds. G. Thottappilly LM, Monti DR, Mohan R e Moore AW. Biotechnology: Enhancing Research on Tropical Crops im Africa. CTA/IITA co-publication, IITA, Ibadan, Nigeria, pp 89-98. 63 Saitou N e Nei M (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425. Sandhiya GS, Sugitha TCK, Balachandar D e Kumar K (2005). Endophytic colonization and in planta nitrogen fixation by a diazotrophic Serratia sp. in rice. Ind. J. Exp. Biol. 9:802-807. Sato S, Nakamura Y, Asamizu E, Isobe S, et al. (2007). Genome sequencing and genome resources in model legumes. Plant Physiol. 144:588-593. Schardl CL, Leuchtmann A e Spiering MJ (2004). Symbioses of grasses with seedborne fungal endophytes. Ann. Rev. Plant Biol. 55:315-340. 
Schauser L, Roussis A, Stiller J e Stougaard J (1999). A plant regulator controlling development of symbiotic root nodules. Nature 402: 191–195. Schauser L, Wieloch W e Stougaard J (2005). Evolution of NIN-Like proteins in Arabidopsis, rice and Lotus japonicus. J. Mol. Evol. 60: 229–237. Scheres B, Van De Wiel C, Zalensky A, Horvath B, et al. (1990). The ENOD12 gene product is involved in the infection process during the pea–Rhizobium interaction. Cell 60:281–94. Schröder G, Frühling M, Pühler A e Perlick AM (1997). The temporal and spatial transcription pattern in root nodules of Vicia faba nodulin genes encoding glycinerich proteins. Plant Mol. Biol. 33:113–123. Schuller KA, Day DA e Gibson AH (1986). Enzymes of ammonia assimilation and ureide biosynthesis in soybean nodules: effect of nitrate. Plant Physiol. 80:646-650. Seals DF, Parrish ML e Randall SK (1994). A 42-kilodalton annexin-like protein is associated with plant vacuoles. Plant Physiol. 106:1403-1412. Sevilla M, Burris RH, Gunapala N e Kennedy C (2001). Comparison of benefit to sugarcane plant growth an 15N2 incorporation following inoculation of sterile plants with Acetobacter diazotrophicus wild-type and Nif-mutant strains. Mol.Plant– Microbe Interac. 14:358–366. Shiu S-H e Bleecker A (2001). Plant receptor-like kinase gene family: diversity, function and signaling. Science’s STKE 113:22. Shoshima AHR, Tavano OL e Neves VA (2005). Digestibilidade in vitro das proteínas do Caupi (Vigna unguiculata L. Walp) Var. “Br-14 Mulato”: Efeito dos fatores antinutricionais. Braz. J. Food Technol 8:299-304. Simon L, Bousquet J, Levesque RC e Lalonde M (1993). Origin and diversification of endomycorrhizal fungi and coincidence with vascular land plants. Nature 363:67-69. Simon MV, Benko-Iseppon AM, Resende LV, Winter P, et al (2007). Genetic diversity and phylogenetic relationships in Vigna Savi germplasm revealed by DNA amplification fingerprinting. Genome 50:538-547. 
Singh BB, Ehlers JD, Sharma B e Freire-Filho FR (2002). Recent progress in cowpea breeding in Fatokun CA, Tarawali SA, Singh BB, Kormawa PM e Tamò M 64 (editors). Challenges and opportunities for enhancing sustainable cowpea production. Proceedings of the World Cowpea Conference III held at the International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria, pp 4–8. Simpson AJ, Reinach FC, Arruda P, Abreu FA, et al. (2000). The genome sequence of the plant pathogen Xylella fastidiosa. Nature 406:151-157. Singh BB (2005). Cowpea Vigna unguiculata (L.) Walp. In Genetic Resources, Chromosome Engineering and Crop Improvement. Vol 1. Ed. by: Singh RJ, Jauhar PP. Boca Raton: CRC Press pp 117-162. Smeekens S (2000). Sugar-induced signal transduction in plants. Annu. Rev. Plant Physiol. Plant Mol. Biol. 51:49–81. Smit P, Raedts J, Portyanko V, Debelle F, et al. (2005). NSP1 of the GRAS protein family is essential for rhizobial Nod factor-induced transcription. Sci. 308:1789–1791. Sneath PHA e Sokal RR (1973). Numerical taxonomy: The principles and practice of numerical Classification. Freeman, San Francisco, CA, pp 573. Sousa C, Johansson C, Charon C, Manyani H, et al. (2001). Translational and structural requirements of the early nodulin gene enod40, a short-open reading framecontaining RNA, for elicitation of a cell-specific growth response in the alfalfa root cortex. Mol. Cell Biol. 21:354–366. Spaink HP, Kondorosi A e Hooykaas PJJ (1998). The Rhizobiaceae: Molecular biology of model plant-associated bacteria. Kluwer Academic Publishers, Dordrecht, The Netherlands Spaink HP (2000). Root nodulation and infection factors produced by rhizobial bacteria. Annu. Rev. Microbiol. 54:257-288. Sprent JI e Sprent P (1990). Nitrogen fixing organisms. Pure and applied aspects. Chapman & Hall, London, United Kingdom. Steele WM e Mehra KL (1980). Structure, evolution and adaptation to farming system and inveronment in Vigna. In: Summerfield DR, Bunting AH eds. 
Advances in legume science. Royol Bot. Gar. pp 459-468. Stougaard J (2000). Regulators and regulation of legume root nodule development. Plant Physiol. 124: 531–540. Stracke S, Kistner C, Yoshida S, Mulder L, et al. (2002). A plant receptor-like kinase required for both bacterial and fungal symbiosis. Nature 417:959-962. Sturz A e Kimpinski J (2004). Endoroot bacteria derived from marigolds (Tagetes spp.) can decrease soil population densities of root-lesion nematodes in the potato root zone. Plant Soil 262:241-249. Subbaiah CC e Sachs MM (2001). Altered patterns of sucrose synthase phosphorylation and localization precede callose induction and root tip death in anoxic maize seedlings. Plant Physiol. 125:585–594. 65 Suganuma N, Watanabe M, Yamada T, Izuhara T, et al. (1999). Involvement of ammonia in maintenance of cytosolic glutamine synthetase activity in Pisum sativum nodules. Plant Cell Physiol. 40:1053-1060. Sudhir K, Koichiro T e Masatoshi N (2004). MEGA3: Integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinf. 5:150-163. Sweet G, Ganor C, Voeglele R, Wittekindt NA, et al (1990). Glycerol facilitator of Escherichia coli: cloning of glpF and identification of the glpF product. J. Bacteriol. 172:424-430. Szczyglowski K, Kapranov P, Hamburger D e de Bruijn FJ (1998). The Lotus japonicus LjNOD70 nodulin gene encodes a protein with similarities to transporters. Plant Mol. Biol. 37:651–661. Szczyglowski K e Amyot L (2003). Symbiosis, inventiveness by recruitment? Plant Physiol. 131:935–940. Szmrecsányi T e Moreira EP (1991). O desenvolvimento da agroindústria canavieira do Brasil desde a Segunda Guerra Mundial. Est. Avan. 5:57-79. Tajima S, Ito H, Tanaka K, Nanakado T, et al. (1991). Soybean cotyledons contain a uricase that crossreacts with antibodies raised against the nodule uricase (nod-35). Plant Cell Physiol. 32:1307-1311. Tajima S e Kouchi H (1996). 
Metabolism and compartmentation of carbon and nitrogen in legume nodules. Plant-Microbe Interac. pp 27-60. Takezawa D, Ramachandiran S, Paranjape V e Poovaiah BW (1996). Dual regulation of a chimeric plant serine/threonine kinase by calcium and calcium/calmodulin. J. Biol. Chem. 271:8126-8132. Tarayre S, Vinardell JM, Cebolla A, Kondorosi A, et al. (2004). Two classes of the Cdh1type activators of the anaphase-promoting complex in plants: Novel functional domains and distinct regulation. Plant Cell 16:422–434. Tateno Y, Imanishi T, Miyazaki S, Fukami-Kobayashi K, et al. (2002). DNA databank of Japan (DDBJ) for genome scale research in life science. Nuc. Acids Res. 25:48764882. Theissen G, Becker A, Rosa AD, Kanno A, et al. (2000). A short history of MADS-box genes in plants. Plant Mol. Biol. 42: 115–149. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin J, et al. (1997) The Clustal_X windows interface: flexible strategies formultiple sequencealignment aided by quality analysis tools. Nuc. Acids Res. 25:4876-4882. Timko MP (2002). Molecular cloning in cowpea: perspectives on the status of genome characterization and gene isolation for crop improvement. In: Fatokun CA, Tarawali SA, Singh BB, Kormawa PM and Tamò M (eds) Challenges and opportunities for enhancing sustainable cowpea production. Proceedings of the World Cowpea Conference III held at the International Institute of Tropical Agriculture (IITA) Nigeria, pp 197-212. 66 Timko MP, Rushton PJ, Laudeman TW, Bokowiec MT, et al. (2008). Sequencing and analysis of the gene-rich space of cowpea. BMC Genomics 9:103. Timko MP, Ehlers JD e Roberts PA (2007). Cowpea. In Genome Mapping and Molecular Breeding in Plants, Pulses, Sugar and Tuber Crops. Eds: Kole C. Berlin: SpringerVerlag 3:49-68. Tsavkelova EA, Cherdyntseva TA, Botina SG e Netrusov AI (2007). Bacteria associated with orchid roots and microbial production of auxin. Microbiol. Res. 1:69-76. Van Aelst L, Hohmann S, Zimmermann FK, Jans AWH, et al. (1991). 
A yeast homologue of the bovine lens fibre MIP gene family complements the growth defect of a Saccharomyces cerevisiae mutant on fermentable sugars but not its defect in glucose-induced RAS-mediated cAMP signalling. Eur. Mol. Biol. Organ. J. 10:2095-2104. Vance CP, Egli MA e Griffith SM (1988). Plant regulated aspects of nodulation and N2 fixation. Plant Cell Env. 11:413-427. Vargas C, de Pádua VLM, Nogueira EM, Vinagre F, et al. (2003). Signaling pathways mediating the association between sugarcane and endophytic diazotrophic bacteria: a genomic approach. Symbiosis 35:159–180. Verma DP e Long S (1983). The molecular biology of Rhizobium-legume symbiosis. Int. Rev. Cytol. 14:212–245. Vettore AL, da Silva FR, Kemper EL, Souza GM, et al. (2003). Analysis and functional annotation of an expressed sequence tag collection for tropical crop sugarcane. Gen. Res. 13:2725-2735. Vezzani FM (2001). Qualidade do sistema solo na produção agrícola. Porto Alegre, Universidade Federal do Rio Grande do Sul. pp 184. (Tese de Doutorado). Vinardell JM, Fedorova E, Cebolla A, Kevei Z, et al. (2003). Endoreduplication mediated by the anaphase-Promoting complex activator CCS52A is required for symbiotic cell differentiation in Medicago truncatula nodules. Plant Cell 15:2093–2105. Vincill ED, Szczyglowski K e Roberts DM (2005). GmN70 and LjN70. Anion transporters of the symbiosome membrane of nodules with a transport preference for nitrate. Plant Physiol. 137:1435–1444. Vleghels I, Hontelez J, Ribeiro A, Fransz P, et al. (2003). Expression of ENOD40 during tomato plant development. Planta 218:42–49. Weaver CD e Roberts DM (1992). Determination of the site of phosphorylation of nodulin 26 by the calcium-dependent protein kinase from soybean nodules. Bioch. 31:8954– 8959. Wienkoop S e Saalbach G (2003). Proteome analysis. Novel proteins identified at the peribacteroid membrane from Lotus japonicus root nodules. Plant Physiol. 131:1080–1090. 
67 Xavier GR, Martins LMV, Rumjanek NG e Freire-Filho FR (2005). Variabilidade genética em acessos de caupí analisada por meio de marcadores RAPD. Pesq. agropec. bras. 40:353-359. Yang WC, Katinakis P, Hendriks P e Smolders A (1993). Characterization of Gm-ENOD40, a gene showing novel patterns of cell-specific expression during soybean nodule development. Plant J. 3: 573–585. Zahran HH (1999). Rhizobium-legume symbiosis and nitrogen fixation under severe conditions and in arid climate. Microbiol. and Mol. Biol. Rev. 4:968–989. Zhu H, Choi HK, Cook DR e Shoemaker RC (2005). Bridging model and crop legumes through comparative genomics. Plant Physiol. 137:1189-1196. Zilli JE, Valicheski RR, Rumjanek NG e Simões-Araújo JL (2006). Eficiência simbiótica de estirpes de Bradyrhizobium isoladas de solo do Cerrado em caupi. Pesq. agropec. bras. 41:811-818. Zrenneran R, Salanoubat M, Willmitzer L e Sonnewald U (1995). Evidence of the crucial role of sucrose synthase for sink strength using transgenic potato plants (Solanum tuberosum L.). Plant J. 7: 97–107. Zucchero JC, Caspi M e Dunn K (2001). ngl9: a third MADS box gene expressed in alfalfa root nodules. Mol. Plant Microbe Intec. 14:1463–146. REFERENCIAS ELETRÔNICAS Centro Nacional de Referência em Biomassa http://www.cenbio.org.br/pt/index.html (Setembro 16, 2008). (CENBIO), Cesnik R. Melhoramento da cana-de-açúcar: marco sucroalcooleiro no Brasil. Rev. Elet. Jorn. Cient., http://comciencia.br/comciencia/?section=8&edicao=23&id=256. (Novembro 18, 2008). Empresa Brasileira de Pesquisa Agropecuária (EMBRAPA), http://www.embrapa.br (Fevereiro 10, 2008). Empresa Brasileira de Pesquisa Agropecuária http://www.cana.cnpm.embrapa.br/agroeco.html (Março 25, 2008). (EMBRAPA), Harvest, http://www.harvest-web.org/ (Outubro 2, 2008). Nucleotide Center for Biotechnology Information (NCBI), http://www.ncbi.nlm.nih.gov/dbEST/dbEST_summary.html, (Novembro 10, 2008). 
National Center for Biotechnology Information (NCBI), http://www.ncbi.nlm.nih.gov/Taxonomy/ (October 13, 2008). Portal Única, http://www.unica.com.br (November 16, 2008). Sugarcane Expressed Sequence Tag Project (SUCEST), http://www.sucest.lad.dcc.unicamp.br/en/ (December, 2008).

Chapter 2

Scientific Article

Analysis of Genes Associated with Symbiotic Nitrogen Fixation in the Cowpea (Vigna unguiculata) Transcriptome

Article to be submitted to the journal Genetic and Molecular Research

Gabriela Souto Vieira-Mello; Petra Barros dos Santos; Nina da Mota Soares-Cavalcanti; Ana Carolina Wanderley-Nogueira; Tercílio Calsa-Júnior; Ederson Akio Kido and Ana Maria Benko-Iseppon

Universidade Federal de Pernambuco, Center for Biological Sciences (CCB), Departamento de Genética, Laboratório de Genética e Biotecnologia Vegetal e Laboratório de Genética Molecular, Recife, PE, Brazil.

Short running title: Nitrogen Fixation Genes in Cowpea Transcriptome.

Key words: data mining, early nodulins, late nodulins, expressed sequence tags, salinity stress.

Corresponding Author: Ana Maria Benko-Iseppon, UFPE, CCB, Departamento de Genética, Laboratório de Genética e Biotecnologia Vegetal, Av. Prof. Moraes Rego, s/nº; 50732-970, Recife, PE, Brazil. E-mail: ana.benko.iseppon@pq.cnpq.br

ABSTRACT: Legumes have a special ability to establish endosymbiosis with soil rhizobia, forming new organs, called nodules, where nitrogen fixation occurs. These processes, including nodule development and establishment, are associated with the spatially and temporally regulated expression of nodule-enhanced transcripts, the nodulins, which are classified as early or late according to their temporal expression and the role they play in nitrogen fixation.
This work aimed to identify candidate sequences for early (Annexin, DMI3, NSP1, NORK, CCS52A, NIN, ENOD40 and ENOD8) and late (NOD26, NOD70, Glutamine synthase, Leghemoglobin, NOD35, Sucrose synthase and DMT1) nodulins in the collection of cowpea ESTs obtained under diverse conditions and available in the NordEST and HarvEST databases, using bioinformatic tools. The 263 candidate sequences found (139 for early nodulins and 124 for late nodulins) showed similarity with the respective genes in other legumes. The hierarchical clustering analysis revealed higher expression of early nodulin transcripts in libraries from leaves of the IT85F genotype collected 90 minutes after mosaic virus infection (IM90) and from roots of the salinity-tolerant genotype after 8 hours of stress (ST08). In the case of the late nodulins, the libraries of salinity-sensitive plants submitted to salt stress (after eight and two hours, SS08 and SS02, respectively) presented the most representative expression. Multiple alignments showed relative conservation of the nodulins among different organisms. The dendrograms revealed a consistent branch including most dicot taxa separated from monocots. In the Annexin dendrogram the legumes were placed as an outgroup, while in the ENOD8, Glutamine synthase and Sucrose synthase dendrograms the Fabaceae family was separated from other dicots, suggesting that these proteins underwent divergent evolution during the evolution of the Magnoliophyta. The present work aimed to bring valuable information for future in vitro and in vivo assays, as well as for the development of molecular markers for cowpea breeding and genetic mapping, allowing a better understanding of the diversity and evolution of the genes involved in nitrogen fixation.

INTRODUCTION

Biological nitrogen fixation reduces N2 to ammonium and is the largest source of available nitrogen for life on earth (Newton, 2000).
Much of this ammonium comes from symbiotic nitrogen fixation (SNF) by rhizobia within legume root nodules. The Leguminosae is one of the most successful families of land plants, mainly because of SNF, which enables legumes to colonize soils that contain little or no available nitrogen. This feature, together with the nutritious and protein-rich seeds that they produce, has made legumes an essential part of traditional and modern agriculture (Colebatch et al., 2004).

Leguminous plants are able to grow under nitrogen-limiting conditions because of their ability to establish endosymbiosis with soil bacteria, collectively called rhizobia. During this interaction new organs, called nodules, are formed on the plant root, allowing the fixation of atmospheric nitrogen to supply the plant with ammonium. In return, the microsymbionts obtain photosynthates and an environment with the low oxygen concentration required for nitrogenase activity (Spaink, 2000). These recognition events allow the invasion of the host as well as the formation of a primary nodule; these two processes occur in parallel and eventually merge when infection threads release bacteria into the cytoplasm of the newly formed primordial cells (Parniske and Downie, 2003). In this process, the bacteria become enclosed by a plant-derived membrane, the symbiosome membrane (SM), and bacteroid differentiation precedes the metabolic phase of symbiosis (Van de Velde et al., 2006).

Among cultivated legumes, cowpea (Vigna unguiculata (L.) Walp) stands out as an important crop for dryland areas because of its ability to grow under adverse soil and climatic conditions (Martins et al., 2003), playing an important role in cropping systems in sub-Saharan Africa, Asia, Central and South America (Singh et al., 1997), especially because cowpea nodules are very resistant to high temperatures (Simões-Araújo et al., 2002).
Despite such qualities, little information is available about the genetic background of this crop regarding nitrogen fixation under normal or stress conditions.

With the identification of the majority of the bacterial genes required for SNF, important progress has been made in elucidating the genetic mechanisms used by the plants in this interaction (Long, 2001). In the past few years, different expression profiling strategies were pursued to identify symbiotically induced genes co-activated during early and late stages of nodulation (Küster et al., 2007). Nodule development is associated with the spatially and temporally regulated expression of a number of nodule-enhanced transcripts (referred to as nodulins) that aid in the establishment of the symbiosis (Stougaard, 2000), including a number that encode membrane transport proteins (Kaiser et al., 2003; Jeong et al., 2004) and proteins associated with the symbiosome membrane (Catalano et al., 2004).

Plant nodulins are classified into two main classes according to the timing of their expression and the role that they play in SNF. The early nodulins are generally expressed during the early stages of nodulation and seem to be involved in the infection process and/or nodule organogenesis, while the late nodulins are expressed in mature nodules, acting in nitrogen fixation itself (Niebel et al., 1998).

In general, strategies like EST sequencing, construction and analysis of cDNA libraries, in silico profiling of symbiosis-related gene expression through mining of comprehensive EST collections, and experimental approaches have been carried out mainly in two model legumes: M. truncatula (Barker et al., 1990) and Lotus japonicus (Handberg and Stougaard, 1992). In these models and additionally in soybean (Glycine max; Lee et al., 2004), such approaches have enabled comprehensive analysis of gene expression profiles during the nodulation process (Colebatch et al., 2004).
However, cowpea still lacks comprehensive studies of this kind, with few nodulins evaluated to date. The present work aimed to evaluate nodulin genes transcriptionally activated in the cowpea transcriptome under diverse experimental conditions, including in silico expression profiling, gene structure and evolution, as compared with available information from other plants deposited in public databases.

MATERIAL AND METHODS

Protein sequences derived from full-length cDNA sequences from legumes were used as seed sequences (Table 1). They were obtained in FASTA format from the NCBI database (National Center for Biotechnology Information; http://www.ncbi.nlm.nih.gov) and included the most representative genes from the nodulin family (Annexin, DMI3, NIN, NSP1, NORK, CCS52A, ENOD8, ENOD40, NOD26, DMT1, NOD70, GS, Leghemoglobin, NOD35 and Sucrose synthase).

For each seed sequence a tBLASTn search (Altschul et al., 1997) was performed against the cowpea databases (NordEST, http://www.gentrop.ufpe.br/vigna, and HarvEST, http://www.harvest-web.org/, together including 202,066 ESTs), with an e-value cut-off of 1e-10. The clusters obtained from both databases were assembled to avoid redundancies in the BLAST results. The contigs were built using the EGAssembler program available at http://egassembler.hgc.jp/.

Reverse alignments of selected clusters were made against the NCBI database using the BLASTx tool (http://www.ncbi.nlm.nih.gov/BLAST/) in order to confirm the similarity between the cowpea sequences and the sequences available at GenBank. After that, the clusters were translated using the ORFfinder program at the NCBI homepage. The presence and integrity of the conserved domains (CDs) were analyzed using the RPS-BLAST tool (Altschul et al., 1997).

For each gene, the cowpea sequences that presented CDs were selected together with the sequences found using the BLINK tool (NCBI), in order to generate multiple alignments with the CLUSTALx program.
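The e-value filtering applied to the tBLASTn results can be illustrated with a short sketch. It is not the pipeline used in this work: it assumes hits exported in the 12-column tabular layout produced by BLAST+ (`-outfmt 6`), reads the cut-off as 1e-10, and uses invented seed and contig names with made-up scores purely for demonstration.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-10  # assumed reading of the cut-off stated in the methods

# Standard column order of BLAST+ tabular output ("-outfmt 6").
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

# Hypothetical hits for illustration only; these are not results from the study.
EXAMPLE_HITS = """\
seedA\tcontig_001\t82.5\t310\t54\t0\t1\t310\t12\t941\t5e-150\t520
seedA\tcontig_002\t41.0\t120\t70\t1\t5\t124\t30\t389\t3e-31\t130
seedA\tcontig_003\t30.2\t60\t41\t1\t10\t69\t100\t279\t2e-04\t40.1
seedB\tcontig_001\t35.0\t80\t52\t0\t3\t82\t400\t639\t8e-11\t66.2
"""


def filter_hits(handle, cutoff=E_VALUE_CUTOFF):
    """Group hits by seed sequence, keeping only those with e-value <= cutoff."""
    kept = {}
    reader = csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        if evalue <= cutoff:
            kept.setdefault(row["qseqid"], []).append((row["sseqid"], evalue))
    # Most significant hit (smallest e-value) first for each seed.
    for hits in kept.values():
        hits.sort(key=lambda hit: hit[1])
    return kept


candidates = filter_hits(io.StringIO(EXAMPLE_HITS))
for seed, hits in candidates.items():
    print(seed, hits)
```

In this toy input the third hit (e-value 2e-04) is discarded, while the others pass the threshold. Removal of redundant clusters and contig assembly, which the methods perform with EGAssembler, are outside the scope of this sketch.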
These analyses allowed a structural comparison of conserved and divergent sites among sequences from cowpea and other organisms. The BLINK tool permitted the inclusion of sequences from organisms additional to those obtained by BLASTx, but only those with significant alignments and CDs with adequate structural composition.

The phylogenetic analysis was performed using the MEGA (Molecular Evolutionary Genetic Analysis) program, Version 4 for Windows (Kumar et al., 2004), using the Maximum Parsimony method with a bootstrap of 2,000 replications and pairwise deletion for the treatment of gaps in the alignments, generating a consensus tree with a cut-off of 50 (50% of the most parsimonious trees).

Only sequences from the NordEST database were used to build the expression profile of cowpea nodulins, since most libraries from the HarvEST project (60%) were constructed from a mixture of V. unguiculata tissues and are therefore less informative about the spatial and temporal expression of nodulin transcripts. The Hierarchical Clustering Analysis and Reordered Data Matrices were carried out with the CLUSTER and TREEVIEW programs (Eisen et al., 1998) on normalized data and allowed the study of cluster expression patterns. Dendrograms including both axes (using the weighted pair-group for each gene class and library) were generated by the TreeView program (Eisen et al., 1998). In the graphics (Figure 5), yellow represents no expression and red all degrees of expression.

A preliminary analysis of nodulin distribution patterns in cowpea libraries was carried out by direct correlation of the read frequency of each cluster in the various NordEST cDNA libraries using normalized data.

RESULTS

1. Cowpea Orthologs

1.1 Early Nodulins

After trimming redundant clusters, the search for early nodulins in the cowpea database revealed 139 candidates, with e-values ranging from 1e-159 to 5e-10 (Tables 2 and 3).
The annexin results showed high similarity (5e-150 to 3e-31) with four clusters: one presented the four annexin CDs complete, two presented incomplete domains, and in one sequence no CD was found. All clusters found by tBLASTn had best matches with their respective protein after BLASTx analysis at GenBank, showing similarity with M. truncatula, except the third match, which was similar to the annexin protein from Fragaria x ananassa.

Regarding DMI3, the cowpea transcriptome presented eight candidates, of which four presented the complete S_TKc CD, three presented two complete EFh CDs and two presented one complete EFh CD; however, none of them showed the three domains together (S_TKc + EFh + EFh). Furthermore, the reverse alignments revealed that most candidates had best matches within the Fabaceae family.

The tBLASTn results for the nodulin NIN showed eight candidates, including three with the full PB1_NLP domain and one with a partial domain, while in four clusters no domain could be found. Two presented similarity with Oryza sativa, while the other six showed higher similarity with legume species.

The searches for NSP1 candidates revealed seven clusters; however, these candidates presented a low degree of similarity (≤ 6e-22). Three clusters had the full GRAS conserved domain, while in the remaining four this domain was incomplete. On the other hand, these sequences matched homologous proteins from Phaseolus vulgaris, Castanea sativa and Solanum lycopersicum with high similarity (e-values from 0.0 to 2e-63).

With respect to the NORK gene, the cowpea transcriptome revealed 61 candidates, with e-values ranging from 1e-159 to 5e-24, of which 20 and 40 presented the PKc_Tyr domain complete and incomplete, respectively, while in one no domain was found.
In general, the NORK orthologs had higher similarity with dicotyledonous plants, mainly of the Fabaceae family; however, four sequences matched the monocotyledonous O. sativa.

The CCS52A nodulin presented nine cowpea sequences, with a best e-value of 3e-114; the full WD40 domain was present in two clusters and a partial domain in four clusters, while in three no domain was found. The BLASTx results revealed higher identity of all sequences with legume proteins, except one that aligned to an A. thaliana sequence.

The searches for ENOD8 candidates revealed 38 sequences (e-values 3e-113 to 1e-10), in which the SGNH_plant_lipase_like domain seemed complete in 21 and incomplete in 16 clusters. Interestingly, BLASTx results showed similarity mainly with A. thaliana and O. sativa, but no match was found with organisms from the Fabaceae family.

In relation to ENOD40 candidates, three clusters with e-values between 2e-98 and 4e-24 were observed. All candidates had the complete RRM domain and matched the M. truncatula ENOD40 protein.

1.2 Late Nodulins

Overall, the evaluation of late nodulins revealed 124 candidates (with e-values ranging from 0.0 to 8e-11), of which 55.6% showed the searched domains (Tables 2 and 3).

All eight DMT1 candidates presented a partial Nramp CD, with e-values from 3e-131 to 5e-25. In addition, 50% of the candidate sequences showed similarity with soybean (Glycine max), while the others matched A. thaliana and S. lycopersicum.

The search for GS orthologs revealed seven candidates: three presented both domains (Gln-synt_N and Gln-synt_C) complete, one presented both domains incomplete, two presented only an incomplete Gln-synt_C CD, and a single candidate had no domain. After reverse alignments all clusters showed similarity to their respective protein from legumes, mainly P. vulgaris and Vigna aconitifolia.
In relation to the Leghemoglobin gene, 49 clusters were found, with e-values ranging from 6e-47 to 4e-17. The full globin CD was found in 40 candidates and was incomplete in eight, while one presented no domain. After BLASTx, all clusters matched legumes, with V. unguiculata as the most similar organism.

Regarding the NOD26 analysis, tBLASTn pointed to 25 candidates, with e-values between 2e-145 and 2e-10. Of these candidates, 15 presented the complete MIP domain, in seven this CD was incomplete and in three no conserved domain was found. After reverse alignments all sequences were similar to NOD26 proteins from the Fabaceae family.

The data mining results for NOD70 revealed 15 sequences with significant homology (e-values 3e-91 to 9e-12): four and six presented the Nodulin-like domain complete and incomplete, respectively, and five had no domain. After BLASTx analysis the sequences showed similarity with their respective protein, most commonly from the Brassicaceae and Rutaceae families, with A. thaliana and Poncirus trifoliata representing 73.3% of the obtained sequences, while just two were similar to G. max.

The cowpea NOD35 candidates totaled only two sequences, both with a high degree of similarity. One presented the full Uricase domain and was similar to the NOD35 protein of G. max, while the other showed this domain incomplete and was similar to the respective protein of P. vulgaris.

Finally, 14 candidates for the SucSin gene could be identified, with e-values between 0.0 and 8e-11, of which 71.42% showed similarity with sequences from Fabaceae members. The two sequences with the complete Sucrose_synth domain were similar to the legumes Vigna radiata and Pisum sativum, while 12 sequences showed similarity with other legumes, such as Vicia faba and G. max, and with non-legume plants, such as Beta vulgaris, A. thaliana and Citrus unshiu; in seven of them the domain was incomplete and in five no domain was found.

2. Dendrograms

The multiple alignments generated during this work used proteins of the early nodulins Annexin and ENOD8 and of the late nodulins Glutamine synthase and Sucrose synthase. A high degree of conservation was observed among the nodulin sequences from diverse organisms. In the resulting dendrograms it was possible to observe the grouping of different organisms (fungi, protists, plants and animals) in separate clades according to their kingdom classification. All analyzed dendrograms placed sequences from monocots and dicots in different clades, with a clear segregation between the Fabaceae family and plants from other groups.

The dendrograms generated for early nodulins are shown in Figure 1. In the ENOD8 dendrogram (Figure 1A) it was possible to distinguish two groups: the first comprising protozoans (I), as outgroup, and the other comprising plant species (II). Group II showed two subclades, grouping monocots (IIa) and dicots (IIb) in different branches. In the dicot group it was possible to see a clear separation (dashed line) between legumes and non-legumes of the Plantaginaceae and Apiaceae families (Figure 1A).

The annexin dendrogram placed Fungi as an outgroup (Branch I), and grouped animals and plants in two clades (II and III, respectively) according to their higher taxonomic classification (plant, animal and fungi kingdoms). In group III the Fabaceae family (IIIb) was separated from the other dicots (dashed line), which were placed together with the monocots Zea mays and O. sativa, but in separate subclades (dotted line) (Figure 1B).

Regarding the dendrograms for late nodulins (Figure 2), both analyzed sequence groups (sucrose synthase and glutamine synthase) showed organisms grouped in accordance with their taxonomic classification. Thus, it was possible to note a clear separation at the
Thus, it was possible to note a clear separation at the 81 Magnoliopsida class with monocots and dicots placed in distinct branches, whereas within this last class the Fabaceae family appeared separated from other dicots (dashed line). The sucrose synthase dendrogram (Figure 2A) presented the dicot clade subdivided into two subclades, one comprising the Asterid subclass (Cichorium intybus and members of Solanacea family) and the other comprising the Rosid subclass (A. thaliana, Alnus glutinosa, Citrus unshiu and members of the Fabaceae family). 82 A B III Arabidopsis thaliana II I Figure 1: Dendrograms generated after Maximum Parsimony analysis showing relationships among conserved domains in early nodulins (A) ENOD8 and (B) Annexin sequences including Vigna unguiculata orthologs. Numbers in the base of branches indicate bootstrap values. 83 A B Figure 2: Dendrograms generated after Maximum Parsimony analysis showing relationships considering conserved domains of late nodulins (A) Sucrose synthase and (B) Glutamine synthase sequences with 3. unguiculata Distributionorthologs. of ESTs in the NordEST Libraries Numbers in the base of clades indicate bootstrap values. Vigna 84 The distribution of the 581 reads in the nine libraries was analyzed, allowing the identification of 73 early and 508 late nodulins. Moreover, all libraries of the NordEST database presented at least one transcript of each nodulin class, with exception of the IM90 library (Leaves of IT85F genotype collected with 90 minutes after mosaic viruses inoculation), where no transcripts from the late nodulins could be detected. 
After direct counting of the early nodulin transcripts, a higher prevalence of transcripts could be observed in the ST08 library (roots of salinity-tolerant plants after 8 hours of salt stress), followed by the SS08 library (roots of salinity-sensitive plants after 8 hours of salt stress); together they represented 54% of the early nodulins, while the control library (CT00, no stress) had the lowest representation (1%) (Figure 3A).

Regarding the distribution of late nodulins, the ST02 library (roots of salinity-tolerant plants after 2 hours of salt stress) showed the highest abundance of transcripts (38%), whilst five libraries (ST00, roots of salinity-tolerant plants without salt stress; ST08; SS00, roots of salinity-sensitive plants without salt stress; BM90, leaves of the BR14-Mulato genotype; and CT00, negative control) presented few reads, together totaling just 13% (Figure 3B).

Comparing the total number of reads found in the NordEST libraries, several differences between the two nodulin categories could be observed, especially with respect to the SS02 (roots of salinity-sensitive plants after 2 hours of stress), SS08 and ST02 libraries, which showed a difference of more than 85% in read content between the two nodulin types (Figure 4).

Figure 3: General distribution of transcripts found in the NordEST libraries. (A) Prevalence of early nodulin genes. (B) Prevalence of late nodulin genes.
Abbreviations for libraries: CT00 (Negative Control); BM90 (Leaves of BR14-Mulato genotype); IM90 (Leaves of IT85F genotype collected 90 minutes after mosaic virus infection); SS00 (Root of salinity-sensitive plant without salt stress); SS02 (Root of salinity-sensitive plant after 2 hours of stress); SS08 (Root of salinity-sensitive plant after 8 hours of stress); ST00 (Root of salinity-tolerant plant without salt stress); ST02 (Root of salinity-tolerant plant after 2 hours of stress); ST08 (Root of salinity-tolerant plant after 8 hours of stress).

Figure 4: Comparative prevalence of early and late nodulin genes in the cowpea NordEST libraries. Numbers outside columns refer to the total number of reads found in each library. Abbreviations for libraries as in Figure 3.

4. Expression Pattern

The hierarchical clustering analysis, made after data normalization, allowed an evaluation of expression intensity and of co-expression or co-regulation across the different NordEST libraries and protein families. In this analysis two early nodulins (NIN and ENOD40) and one late nodulin (DMT1) were not evaluated since they were absent from the database. Interestingly, considering the expression of each nodulin gene separately, none of them presented transcripts in all libraries of the NordEST project; for example, among the early nodulins, only the annexin candidates presented transcripts in the control (CT00) library.
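As an illustration of the kind of computation behind this analysis, the sketch below normalizes read counts by library size and finds the pair of genes with the most similar profiles, which is the first merge an agglomerative clustering would make. It is only a sketch under stated assumptions: the normalization scheme (reads per 10,000 ESTs), the distance (1 minus the Pearson correlation), the gene names, and all counts and library sizes are hypothetical and are not taken from this work or from the CLUSTER program settings.

```python
from math import sqrt

# Hypothetical toy data: library sizes and read counts are invented.
LIBRARIES = ["CT00", "SS02", "SS08", "ST08"]
LIBRARY_SIZES = {"CT00": 2000, "SS02": 4000, "SS08": 5000, "ST08": 1000}
READS = {
    "geneA": {"CT00": 2, "SS02": 8, "SS08": 20, "ST08": 1},
    "geneB": {"CT00": 1, "SS02": 4, "SS08": 10, "ST08": 0},
    "geneC": {"CT00": 4, "SS02": 0, "SS08": 0, "ST08": 3},
}


def normalize(reads, sizes, scale=10_000):
    """Convert raw counts to reads per `scale` ESTs in each library."""
    return {
        gene: [counts[lib] * scale / sizes[lib] for lib in LIBRARIES]
        for gene, counts in reads.items()
    }


def pearson(x, y):
    """Pearson correlation between two equally long, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def closest_pair(profiles):
    """Return (distance, gene1, gene2) for the most similar pair of genes."""
    genes = sorted(profiles)
    pairs = [
        (1 - pearson(profiles[g], profiles[h]), g, h)
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    ]
    return min(pairs)


profiles = normalize(READS, LIBRARY_SIZES)
distance, gene1, gene2 = closest_pair(profiles)
print(gene1, gene2, round(distance, 3))
```

Normalizing before comparison matters because libraries differ in size: a raw count of 20 reads in a large library may reflect lower relative abundance than 3 reads in a small one.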
Regarding the late nodulins studied, they were completely absent from the IM90 library (leaves of the IT85F genotype collected 90 minutes after mosaic virus infection).

In relation to the early nodulins, the highest expression was detected in the IM90 and ST08 libraries, followed by the BM90 and SS08 libraries, while the ST00 library showed the lowest expression, presenting transcripts only for the NORK and ENOD8 genes, with 12 and four reads, respectively. Furthermore, in the grey dendrogram showing spatial co-expression among libraries, it was possible to observe a stronger relation among the ST08/BM90, SS08/SS02 and ST00/IM90 libraries. Regarding the co-expression of early nodulins (pink dendrogram), the analysis revealed the clustering of the ENOD8/NORK + DMI3 genes (Figure 5A).

For late nodulins an almost complete absence of expression was observed in the BM90 and SS00 libraries, with the exception of NOD26, which presented 19 and 21 transcripts in BM90 and SS00, respectively, and NOD70, with seven reads in SS00, while expression was clearly prevalent in the SS08, SS02, ST02 and CT00 libraries. It is interesting to note that the Leghemoglobin gene showed higher expression in these libraries and co-expression with the NOD35/GS group.

The NOD26 gene presented reads distributed in all NordEST libraries except IM90; the highest transcription was observed in the SS08 and SS02 libraries. Also, this gene presented co-expression with the sucrose synthase gene (Figure 5B).

Figure 5: Expression pattern of cowpea transcripts for the nodulin genes studied here. (A) Graphic representation of the early nodulins CCS52a, Annexin, NSP1, DMI3, ENOD8 and NORK clusters. (B) Graphic representation of the late nodulins NOD70, SS, NOD26, NOD35, GS and Lgb. Darker red quadrants indicate higher expression in the corresponding tissue/library, lighter red/orange lower expression, and yellow represents no expression.
Black dendrograms reflect the relationships among libraries and pink dendrograms the relationships among nodulins. Abbreviations: GS, Glutamine Synthase; SS, Sucrose synthase; Lgb, Leghemoglobin; CT00 (Negative Control); BM90 (Leaves of BR14-Mulato genotype); IM90 (Leaves of IT85F genotype collected 90 minutes after mosaic virus infection); SS00 (Root of salinity-sensitive plant without salt stress); SS02 (Root of salinity-sensitive plant after 2 hours of stress); SS08 (Root of salinity-sensitive plant after 8 hours of stress); ST00 (Root of salinity-tolerant plant without salt stress); ST02 (Root of salinity-tolerant plant after 2 hours of stress); ST08 (Root of salinity-tolerant plant after 8 hours of stress).

DISCUSSION

1. Cowpea orthologs

Several nodulin orthologs have been described in non-legumes, mainly arabidopsis and rice, both of which have whole-genome sequences available (Miyao et al., 2007; Zhu et al., 2006). In addition, a number of legume genes that are required for nodulation are also essential for the symbiotic associations with arbuscular mycorrhizal (AM) fungi, which are established in more than 80% of flowering plants. These two associations share several common features, such as genetically controlled microbial infection by the host plant, transcriptional activation of a common set of host genes and formation of an intracellular plant-microbe interface where nutrient exchange occurs (Oldroyd and Downie, 2004).

Furthermore, it has been hypothesized that the nitrogen-fixing root nodule symbiosis evolved from part of the existing mechanisms for the AM symbiosis (the more ancient association), considering that legumes have recruited preexisting genes to achieve functional nodule organogenesis (Heckmann et al., 2006; Smit et al., 2005), with the non-legume orthologs of these common components maintaining biological functions equivalent to those of their legume counterparts (Chen et al., 2007).
Thus, as expected, some of the early nodulins analyzed here presented similarity to non-legume sequences. In the results for the DMI3, NIN and NORK early nodulins, cowpea sequences matched rice orthologs that are required for signaling during the AM symbiosis (Chen et al., 2007; Dangl and Jones, 2001; Godfroy et al., 2006; Schauser et al., 2005). Notably, arabidopsis lacks orthologs for some early nodulins that play a significant role in symbiosis signaling, like NORK and DMI3 (Zhu et al., 2006). Such gene deletions in arabidopsis (and likely in the lineage leading to the Brassicaceae family) explain the inability of some Brassicaceae species to form symbiotic associations with mycorrhizal fungi and with rhizobia (Stacey et al., 2006). However, orthologs of other early nodulins have been described in the arabidopsis genome, such as CCS52A and ENOD8, acting in this species as a cell cycle controller and a lipase, respectively (Brick et al., 1995; Cebolla et al., 1999; Tarayre et al., 2004). The lack of cowpea ENOD8 clusters matching legume sequences is probably due to the low number of Fabaceae sequences of this nodulin deposited at NCBI.

In addition, the analysis of the NOD70, DMT1 and Sucrose synthase late nodulins also revealed similarities between selected cowpea clusters and arabidopsis sequences. Transcripts of the DMT1 transmembrane protein have been found in different cell types, bearing a highly conserved structure and homology to sequences from other plants, including non-legumes (Mims and Prchal, 2005).

Regarding the NOD70 gene, the cowpea sequences generated here showed a low degree of similarity with legumes, probably due to the low number of sequences deposited in GenBank. This is in accordance with Vincill et al.
(2005), who evaluated a subfamily of membrane transport proteins with a high degree of similarity to GmN70 in a variety of plant species; similarly, in our results, the NOD70 nodulin matched non-legumes, mainly arabidopsis nodulins.

Sucrose is an important metabolite for plant growth and development, acting in several physiological processes beyond the supply of carbon to the bacteroids, such as growth regulation, signal transduction and gene expression. Consistent with these functions, sucrose synthase has been found in a variety of plant species (Smeekens, 2000; Sturm et al., 1999). This enzyme is encoded by a small multigene family present in several plants, such as potato (Zrenner et al., 1995), corn (Duncan et al., 2006), arabidopsis (Baud et al., 2004) and rice (Harada et al., 2005), with several isoforms found in legumes like M. truncatula (Hohnjec et al., 1999) and pea (Barratt et al., 2001).

Regarding the annexin early nodulin, the low number of clusters found in the cowpea database was, as expected, consistent with the role of this gene, which is implicated in the preparation for infection or nodule organogenesis rather than in the infection process itself in legumes (Niebel et al., 1998). The low number of candidates can be explained by the fact that the cowpea sequencing projects used not only young but also mature plants, in which nodules are already formed and annexin activity is consequently lower.

Besides their role in symbiosis, annexins from non-legumes are associated with different cellular processes. In arabidopsis it has been proposed that annexins are part of the oxidative stress response, while in strawberry, studies using annexin cDNA sequences revealed enhanced expression during fruit maturation (Wilkinson et al., 1995). Niebel et al. (1998) showed that the M. truncatula sequence (used in this work as seed sequence) is similar to strawberry annexin.
In our work, the cowpea cluster aligned with the strawberry and alfalfa sequences, probably sharing the same functions in the symbiosis process.

NSP1 belongs to the GRAS family, a group of plant-specific proteins bearing the highly conserved GRAS domain that play roles in various developmental processes such as signal transduction, meristem maintenance and development. The fact that putative orthologs exist in a variety of plants, such as A. thaliana, Populus trichocarpa (Smit et al., 2005), potato and lettuce, indicates a function for this gene more ancient than the symbiosis (Bolle, 2004). Consistent with these roles, Heckmann et al. (2006) reported that the Nicotiana benthamiana (Solanaceae family) NSP1-like gene can function in the Nod factor-signaling pathway; however, its ability to activate downstream gene expression is unlikely to be direct in terms of transcriptional activation of nodulation-specific promoters. Instead, it seems more probable that it acts to activate some aspect of the gene induction pathway that is common to several genes. These data indicate that the domains necessary for perception and activation in the NSP1 protein have been conserved among legumes and non-legumes (Heckmann et al., 2006). In addition, the Castanea sativa SCARECROW-LIKE protein, which contains the GRAS domain, plays a role during the earliest stages of adventitious root formation (Sanchez et al., 2007). Together, these facts can explain the similarity found between the cowpea clusters and non-leguminous plants.

The reverse alignment results for the early nodulin ENOD40 and the late nodulins Glutamine synthase, NOD26, NOD35 and Leghemoglobin revealed high similarity with sequences from the Fabaceae family, as expected; the evolutionary proximity of these organisms is strong evidence that the cowpea genome presents an abundant and diverse set of genes involved in nitrogen fixation.
In general, the analyses revealed that the cowpea nodulins share high similarity with nodulins from other legumes and showed that a significant proportion of nodule-specific functions are performed by recruiting genes common to non-legume plants (Fedorova et al., 2002).

2. Dendrograms

A significant proportion of nodule-specific functions are performed by recruiting preexisting genes common to non-legume plants (Heckmann et al., 2006), and it is now known that many of the genes for nodulation were acquired following duplication of genes with related functions. However, it is not known whether all legumes carry these extra genes and, if so, why they are not expressed in non-nodulating forms (Sprent, 2007). In addition, non-legume orthologs of these common components likely maintain biological functions equivalent to those of their legume counterparts (Chen et al., 2007). It is thus suggested that some factor enabling the formation of nitrogen-fixing symbioses arose in legumes, giving this group properties different from those of other angiosperms.

The multiple alignments of cowpea and nodulin sequences from other species showed a high degree of conservation among sequences. As expected, the generated dendrograms reflected the evolutionary history of plants. Most legumes remained as separate subclades within the dicots, supporting the hypothesis of Doyle and Luckow (2003) that nodulation-specific genes arose within the legume family among its earliest lineages.

ENOD8, which belongs to the GDSL family, is a hydrolytic protein with esterase and lipase activities (Upton and Buckley, 1995) found in plants and bacteria; however, this protein is not completely conserved in all organisms (Pringle and Dickstein, 2004). In our dendrogram (Figure 1A), bacteria (clade I) and plants (clade II) were clearly separated into monophyletic groups.
Within the plant kingdom, exclusive synapomorphies characterized the monocot (IIa) and dicot (IIb) classes. Within the dicots the legumes (dashed line) appeared as a monophyletic group, as expected, since these proteins are strongly involved in the nitrogen fixation process, being associated with the symbiosome membrane in root nodules (Catalano et al., 2004). Numerous ENOD8 orthologs have been found in plants that are not able to fix nitrogen; all of them are probably induced by exogenous signals or regulated in a tissue-specific manner. In Daucus carota cell cultures, a GDSL gene with 55% identity to ENOD8 encoded a secreted glycoprotein (van Engelen et al., 1995) that was induced by the plant pathogen Sclerotinia sclerotiorum (Bertinetti and Ugalde, 1996). In addition, transcripts of ENOD-like genes were identified in roots, stems and flowers of flowering plants, suggesting that these genes might have roles in the development of different organs, mainly in the regulation of plant development, morphogenesis, secondary metabolite synthesis and defense responses (Ling et al., 2006).

Annexins show different properties and diverse intracellular localizations, including association with plasma or organelle membranes, the cytoplasm and nuclei. They play different roles in several organisms (Clark and Roux, 1995; Raynal and Pollard, 1994; Moss, 1997), including plants (Clark and Roux, 1995). Members of the annexin family are composed of a variable N-terminal region and a highly conserved C-terminal core, with the exception of the animal class VI, which contains eight repeats of the annexin domain (Morgan and Fernandez, 1997). Plant annexins nevertheless share biological activities and functions with their animal counterparts, such as the ability to bind F-actin (Hu et al., 2000), to stimulate Ca2+-dependent exocytosis (Carroll et al., 1998) or to function as GTPases (Shin and Brown, 1999).
The expression of both animal and plant annexins can be regulated during the cell cycle, suggesting a potential role during cell division (Hawkins et al., 2000). Moreover, in animals annexins have been implicated in the transduction pathways of mitogenic signals, in membrane trafficking processes such as exocytosis, in interactions with cytoskeletal elements, and in the formation of voltage-dependent, ion-selective calcium channels (Raynal and Pollard, 1994). The annexin dendrogram obtained here is in accordance with this divergent evolution, showing animal annexins (branch II) as a monophyletic group and as a sister group of plants, probably because these two kingdoms share ancestral (symplesiomorphic) characters.

Previous studies of annexin evolution indicated a single clade (ANXC) composed of fungal and mycetozoan annexins (Morgan and Fernandez, 1997). Braun et al. (1998) found that the N. crassa annexin homologue is most closely related to the annexin homologue of Dictyostelium discoideum, suggesting a phylogenetic link between cellular slime molds and true fungi, which is also in accordance with our findings.

According to Moss (1997) and Morgan and Fernandez (1997), plant annexins make up a monophyletic cluster whose members generally lack amino-terminal domains and functional calcium-binding sites in their second and third repeats. In the present results, the non-legume families (Brassicaceae and Malvaceae) formed a paraphyletic (merophyletic) group. In addition, annexins from monocots and dicots show specific features, which resulted in the separation of these classes into smaller clades, as expected. In non-legume plants annexins have also been reported to be associated with different cellular processes.
Annexins purified from plant species such as tomato, maize, cotton and celery showed different characteristics (Clark and Roux, 1995); for example, a cotton annexin was associated with the modulation of callose synthase activity in the plasma membrane (Andrawis et al., 1993), while maize annexins showed ATPase activity (McClung et al., 1994). Moreover, a role in the oxidative stress response has been proposed, since an A. thaliana annexin-encoding cDNA was able to complement an Escherichia coli mutant unable to grow in the presence of high concentrations of H2O2 (Gidrol et al., 1996). This functional divergence could explain the diverging positions of legume and non-legume annexins observed here. With respect to the legumes, the cowpea sequence behaved as a sister group (IIIb) in the phylogeny, which was expected since annexins from the Fabaceae family (M. truncatula and V. unguiculata) have characteristics distinct from those of other plants, almost all associated with nodule formation (Manthey et al., 2004). The function of alfalfa annexin, for example, may be related to the changes occurring in the cellular cytoskeleton during the nodulation process (Niebel et al., 1998).

The sucrose synthase dendrogram (Figure 2A) presented plants (II) and cyanobacteria (I) as distinct monophyletic groups, reflecting the different functions of sucrose in these two groups of organisms. While in plants sucrose is an important metabolite for growth and development and a primary carbon source for the bacteroids (in the case of legumes) (Smeekens, 2000), in cyanobacteria sucrose is often synthesized in response to salt or osmotic stress and is thought to help maintain osmotic balance and stabilize protein and membrane structure and function (Hagemann and Marin, 1999).
However, Lunn (2002) suggested that sucrose is synthesized by the same route, via sucrose synthase, in both groups of organisms, and hypothesized that plants inherited sucrose metabolism from a unicellular cyanobacterial endosymbiont, implying horizontal gene transfer and parallel evolution, which is reflected in the suitability of sucrose for a transport function.

Sucrose synthase is encoded by a small multigene family represented by different isoforms among plant species; for example, in L. japonicus the enzyme is encoded by at least six genes (Horst et al., 2007), the same number reported for Arabidopsis (Baud et al., 2004) and rice (Harada et al., 2005; Huang et al., 1996). Three sucrose synthase isoforms of M. truncatula are closely related to the three pea sucrose synthase isoforms, and a similar expression pattern is observed in both organisms (Barratt et al., 2001; Hohnjec et al., 1999). In addition, a previously reported phylogenetic analysis of plant sucrose synthase genes clearly shows that they can be classified into at least three major branches: one monocot group and two dicot groups (Sturm et al., 1999).

In our dendrogram a clear segregation of monocots and dicots can be identified, all sharing ancestral (symplesiomorphic) characteristics. In addition, legumes and non-legumes are clearly separated into two subclades. The results reflect the evolutionary process this protein has undergone, suggesting that these isoforms diverged over a relatively long period, beginning at least before the divergence between monocotyledonous and dicotyledonous plants (Horst et al., 2007). In monocots some isoforms of sucrose synthase constitute a critical link in biosynthesis in the developing endosperm (Komatsu et al., 2001), and the sucrose synthase gene is expressed in many tissues, including seedling roots and shoots, endosperm and embryo (Chourey et al., 1998).
Regarding the legumes in the Magnoliopsida clade, Horst et al. (2007) described differing results on the relative contribution of sucrose synthase to carbon metabolism in the nodule, suggesting that there may be species-specific differences in sucrose metabolism among legume nodules. This may explain why legumes, although monophyletic, showed distinct characteristics between temperate (V. faba, P. sativum and M. truncatula) and tropical species (G. max, V. unguiculata and P. vulgaris): the products transported in the nitrogen fixation process differ between these two groups. In temperate legumes, the principal transported product is Asn, and the activity of Asn synthetase is enhanced in the nodules of these plants (Atkins et al., 1984). In legumes of tropical origin, the major transported solutes are the ureides allantoin and allantoic acid, since in the nodules of these species the activity of the ‘de novo’ purine pathway and of the enzymes of purine oxidation is exceptionally high. These characteristics, among others, are described as differences between tropical and temperate legumes and can explain the separation of these species, which was also confirmed by the present results.

The multiple alignments of glutamine synthetase proteins from different organisms (Figure 2B) showed a high degree of conservation. In the generated dendrogram, distinct monophyletic groups were evident in branches I, II, III and IV, comprising Archaea, Metazoa, Fungi and Plants, respectively. It is interesting to note that the Archaea were placed as the outgroup, sharing ancestral (symplesiomorphic) features with the eukaryotes. Regarding the eukaryotes, similar patterns were found by Saccone et al. (1995).
Moreover, the obtained results were expected, since the GSI form of glutamine synthetase has been found only in prokaryotes, whereas the GSII form is found in all eukaryotes and in bacteria belonging to the Rhizobiaceae (Shatters et al., 1989), Frankiaceae (Rochefort and Benson, 1990) and Streptomycetaceae (Kumada et al., 1990), suggesting that these two gene forms share a very old common ancestor (Turner and Young, 2000). Among the eukaryotes, metazoa, fungi and plants were grouped together in the same subclade, a result similar to that of Saccone et al. (1995), who constructed a molecular phylogeny based on GS enzymes.

In higher plants GS is an octameric enzyme that occurs in diverse isoenzymatic forms, with subunits encoded by members of a small multigene family (Temple et al., 1998). These GS isoforms are located in the cytosol and chloroplast, assimilating ammonia produced by different physiological processes in distinct organs (Ortega et al., 1999). In leaves, chloroplastic GS assimilates primary ammonia reduced from nitrate and also reassimilates ammonia released during photorespiration (Lam et al., 1996). In roots, GS assimilates ammonia derived from the soil (taken up directly or reduced from NO3-) or, in the case of legumes, fixed by bacteroids (Lea and Ireland, 1999), whilst in the cotyledons it reassimilates ammonia released by the breakdown of nitrogenous reserves during germination (Swarup et al., 1990). These multiple isoenzymes have been shown to be differentially expressed in both a developmental and an organ-specific manner (McGrath and Coruzzi, 1991; Peterman and Goodman, 1991), which explains the plant branch (IV) in the generated dendrogram; this branch followed the traditional phylogeny, placing monocots and dicots in different subclades (IVa and IVb).
The GS gene family has been particularly well characterized in leguminous plants, in which cytosolic GS plays a crucial role in the assimilation of ammonium released by nitrogen-fixing bacteria within the infected cells of the nodule (Stanford et al., 1993). Indeed, in several legume species the expression of one or more cytosolic GS genes has been shown to be induced during nodule development (Cullimore and Bennett, 1992). Moreover, the separation of tropical and temperate legume species in the dendrogram was expected, since nodule GS regulation includes additional tissue-specific and developmental controls (Morey et al., 2002).

3. Expression pattern

The rapidly expanding field of genomics provides vast opportunities for evaluating the coordinated functioning and expression of thousands of genes (Lockhart and Winzeler, 2000). In addition, differentially expressed sequence tags (ESTs) have been isolated and characterized from effective root nodules of M. truncatula, with a number of the identified genes showing enhanced expression in the plant-rhizobium symbiosis (Györgyey et al., 2000).

As originally defined, nodulin genes are those expressed exclusively in nodules (Legocki and Verma, 1980). However, over the last several years this definition has been revised, since a number of nodulin genes have been detected in other plant organs, although with limited expression (Kapranov et al., 1997; Mathesius et al., 2001). Usually, they are members of protein families that play a role in nodule functioning but are also active in other physiological processes (Nogueira et al., 2001).

Moreover, several environmental conditions limit the growth and activity of N2-fixing plants. In the Rhizobium-legume symbiosis, the process of N2 fixation is strongly related to the physiological state of the host plant (Zahran, 1999).
Therefore, effective nitrogen fixation is limited by factors such as salinity, unfavorable soil pH, nutrient deficiency, mineral toxicity, temperature extremes, inadequate photosynthesis and plant diseases, all of which limit the vigor of the host legume (Brockwell et al., 1995; Peoples et al., 1998). Together, these factors change the activation or deactivation of some genes, modifying the expression pattern in certain tissues and organs.

The effects of salt on nitrogen fixation in legumes have been examined in several studies, revealing a reduction of N2-fixing activity under salt stress that is usually attributed to reduced nodule respiration, reduced production of cytosolic proteins (specifically leghemoglobin) and reduced activity of the ammonium assimilation enzymes glutamine synthetase and glutamate synthase (Cordovilla et al., 1995). In addition, increasing salt concentrations may have a detrimental effect on soil microbial populations, affecting fixation rates and effectiveness (Thies et al., 1991). Additional modifications in the plant host during salt stress were studied by Zahran and Sprent (1986), who reported that soybean root hairs showed little curling or deformation when inoculated with Bradyrhizobium japonicum in the presence of 170 mM NaCl, with complete suppression of nodulation at 210 mM NaCl. They also observed a reduction in bacterial colonization and root hair curling in V. faba, where the proportion of root hairs containing infection threads was reduced by 30%.

According to Tejera et al. (2004), to cope with the adverse effects of salinity stress, common bean (P. vulgaris) plants increase the root-to-shoot ratio while decreasing plant dry biomass and nodule number. However, as observed in most cultivated crops, the salinity response of legumes varies greatly and depends on factors such as climatic conditions, soil properties, growth stage and species. For example, V. faba, P. vulgaris and G.
max are more tolerant to salinity than other legumes, such as P. sativum (Cordovilla et al., 1995).

Our results revealed that some nodulins showed low expression in libraries subjected to salt stress, as expected, since nitrogen fixation is affected by salinity. The annexin, glutamine synthetase and NOD35 genes showed higher expression in the control library (CT00; roots grown hydroponically in the absence of salt stress), probably due to the basal expression of nodulins under non-stressed conditions, while transcripts of the DMI3 and NOD70 nodulins were abundant in the library from the salinity-sensitive genotype without salt stress (SS00); both libraries were derived from roots, the tissue directly involved in nodulation. Surprisingly, the early nodulins NSP1 and CCS52A showed high expression in the SS08 and ST02 libraries, respectively.

The observed abundance of the early nodulins ENOD8 and NORK in libraries derived from leaves of both the BR14-Mulato (BM90) and IT85F (IM90) genotypes, collected 90 minutes after mosaic virus infection, was expected, since many nodulin genes participate in developmental processes and are also expressed in diverse tissues of other plants (Bauer et al., 1996; Papadopoulou et al., 1996). In A. thaliana, for example, the ENOD8 homolog showed higher expression in anther tissues (Peng and Dickstein, 1994). On the other hand, the high expression of cowpea NORK candidates in libraries constructed from infected leaves can be explained by the fact that this gene encodes a transmembrane protein with structural analogy to receptor kinases involved in molecular signaling or disease resistance (De Mita et al., 2007; Jones and Jones, 1994). However, currently available genetic maps of legumes are limited by the lack of markers tightly linked to nitrogen fixation, since almost all described markers focus on resistance to parasites (Bukar et al., 2004; Ouédraogo et al., 2002).

The creation of a large-scale EST database of M.
truncatula offered the possibility of prospecting, in silico, genes whose expression is specific to or greatly enhanced by symbiosis, allowing the identification of genes proposed to be up- or down-regulated in the root nodule. Using this approach, it was possible to identify some important nodulins in the cowpea transcriptome and to observe their expression pattern under stress conditions. The identified sequences represent valuable resources for the development of markers for molecular breeding and of gene-specific markers for nodulation in cowpea and other related legumes, enriching the genetic, physiological and metabolic data related to nitrogen fixation.

REFERENCES

Altschul SF, Madden TL, Schaffer AA, Zhang J, et al. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25(17):3389–3402.
Andrawis A, Solomon M and Delmer DP (1993). Cotton fiber annexins: a potential role in the regulation of callose synthase. Plant J. 3:763–772.
Atkins CA, Pate JS and Shelp BJ (1984). Effects of short-term N2 deficiency on N metabolism in legume nodules. Plant Physiol. 76:705–710.
Barker DG, Bianchi S, Blondon F, Dattee Y, et al. (1990). Medicago truncatula, a model plant for studying the molecular genetics of the Rhizobium-legume symbiosis. Plant Mol. Biol. Rep. 8:40–49.
Barratt DHP, Barber L, Kruger NJ, Smith AM, et al. (2001). Multiple, distinct isoforms of sucrose synthase in pea. Plant Physiol. 127:655–664.
Baud S, Vaultier MN and Rochat C (2004). Structure and expression profile of the sucrose synthase multigene family in Arabidopsis. J. Exp. Bot. 55:397–409.
Bauer P, Ratet P, Crespi MD, Schultze M and Kondorosi A (1996). Nod factors and cytokinins induce similar cortical cell division, amyloplast deposition and MsENOD12A expression patterns in alfalfa roots. Plant J. 10:91–105.
Bertinetti C and Ugalde RA (1996). Studies on the response of carrot cells to a Sclerotinia sclerotiorum elicitor: induction of the expression of an extracellular glycoprotein mRNA. Mol. Plant-Microbe Interact. 9:658–663.
Bolle C (2004). The role of GRAS proteins in plant signal transduction and development. Planta 218:683–692.
Braun EL, Kang S, Nelson MA and Natvig DO (1998). Identification of the first fungal annexin: analysis of annexin gene duplications and implications for eukaryotic evolution. J. Mol. Evol. 47:531–543.
Brick DJ, Brumlik MJ, Buckley JT, Cao JX, et al. (1995). A new family of lipolytic plant enzymes with members in rice, Arabidopsis and maize. FEBS Lett. 377:475–480.
Brockwell J, Bottomley PJ and Thies JE (1995). Manipulation of rhizobia microflora for improving legume productivity and soil fertility: a critical assessment. Plant Soil 174:143–180.
Bukar O, Kong L, Singh BB, Murdock L, et al. (2004). AFLP and AFLP-derived SCAR markers associated with Striga gesnerioides resistance in cowpea. Crop Sci. 44(4):1259–1264.
Carroll AD, Moyen C, Van Kesteren P, Tooke F, et al. (1998). Ca2+, annexins, and GTP modulate exocytosis from maize root cap protoplasts. Plant Cell 10:1267–1276.
Catalano CM, Lane WS and Sherrier DJ (2004). Biochemical characterization of symbiosome membrane proteins from Medicago truncatula root nodules. Electrophoresis 25:519–531.
Cebolla A, Vinardell JM, Kiss E, Olah B, et al. (1999). The mitotic inhibitor ccs52 is required for endoreduplication and ploidy-dependent cell enlargement in plants. EMBO J. 18:4476–4484.
Chen X, Laudeman TW, Rushton PJ, Spraggins TA, et al. (2007). CGKB: an annotation knowledge base for cowpea (Vigna unguiculata L.) methylation filtered genomic genespace sequences. BMC Bioinformatics 8:129.
Clark GB and Roux SJ (1995). Annexins of plant cells. Plant Physiol. 109:1133–1139.
Colebatch G, Desbrosses G, Ott T, Krusell L, et al. (2004). Global changes in transcription orchestrate metabolic differentiation during symbiotic nitrogen fixation in Lotus japonicus. Plant J. 39:487–512.
Cordovilla MP, Ligero F and Lluch C (1995). Influence of host genotypes on growth, symbiotic performance and nitrogen assimilation in Faba bean (Vicia faba L.) under salt stress. Plant Soil 172:289–297.
Cullimore JV and Bennett MJ (1992). Nitrogen assimilation in the legume root nodule: current status of the molecular biology of the plant enzymes. Can. J. Microbiol. 38:461–466.
Dangl JL and Jones JDG (2001). Plant pathogens and integrated defence responses to infection. Nature 411:826–833.
De Mita S, Ronfort J, McKhann HI and Poncet C (2007). Investigation of the demographic and selective forces shaping the nucleotide diversity of genes involved in Nod factor signaling in Medicago truncatula. Genetics 177:2123–2133.
Doyle JJ and Luckow MA (2003). The rest of the iceberg. Legume diversity and evolution in a phylogenetic context. Plant Physiol. 131:900–910.
Duncan KA, Hardin SC and Huber SC (2006). The three maize sucrose synthase isoforms differ in distribution, localization, and phosphorylation. Plant Cell Physiol. 47:959–971.
Eisen MB, Spellman PT, Brown PO and Botstein D (1998). Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. U.S.A. 95:14863–14868.
van Engelen F, de Jong A, Meijer E, Kuil C, et al. (1995). Purification, immunological characterization and cDNA cloning of a 47 kDa glycoprotein secreted by carrot suspension cells. Plant Mol. Biol. 27:901–910.
Fedorova M, van de Mortel J, Matsumoto PA, Cho J, et al. (2002). Genome-wide identification of nodule-specific transcripts in the model legume Medicago truncatula. Plant Physiol. 130:519–537.
Gidrol X, Sabelli PA, Fern YS and Kush AK (1996). Annexin-like protein from Arabidopsis thaliana rescues delta oxyR mutant of Escherichia coli from H2O2 stress. Proc. Natl. Acad. Sci. U.S.A. 93:11268–11273.
Godfroy O, Debelle F, Timmers T and Rosenberg C (2006). A rice calcium- and calmodulin-dependent protein kinase restores nodulation to a legume mutant. Mol. Plant–Microbe. Interac. 19:495–501. Györgyey J, Vaubert D, Jiménez-Zurdo JI, Charon C, et al. (2000). Analysis of Medicago truncatula nodule expressed sequence tags. Mol. Plant-Microbe Interact. 13:62–71. Hagemann M and Marin K (1999). Salt-induced sucrose accumulation is mediated by sucrose-phosphate-synthase in cyanobacteria. J. Plant Physiol. 155:424–430. Handberg K and Stougaard J (1992). Lotus japonicus, an autogamous, diploid legume species for classical and molecular genetics. Plant J. 2:487–496. Harada T, Satoh S, Yoshioka T and Ishizawa K (2005). Expression of sucrose synthase genes involved in enhanced elongation of pondweed (Potamogeton distinctus) turions under anoxia. Ann. Bot. (Lond) 96:683–692. Hawkins TE, Merrifield CJ and Moss SE (2000). Calcium signalling and annexins. Cell Biochem. Biophys. 33:275–296. Heckmann AB, Lombardo F, Miwa H, Perry JA, et al. (2006). Lotus japonicus nodulation requires two GRAS domain regulators, one of which is functionally conserved in a non-legume. Plant Physiol. 142:1739–1750. Hohnjec N, Becker JD, Puhler A, Perlick AM, et al. (1999). Genomic organization and expression properties of the MtSucS1 gene, which encodes a nodule-enhanced sucrose synthase in the model legume Medicago truncatula. Mol. Gen. Genet. 261:514–522. Horst I, Welham T, Kelly S, Kaneko T, et al. (2007). TILLING Mutants of Lotus japonicus reveal that nitrogen assimilation and fixation can occur in the absence of noduleenhanced sucrose synthase. Plant Physiol. 144:806–820. Hu S, Brady SR, Kovar DR, Staiger CJ, et al. (2000). Identification of plant actin-binding proteins by F-actin affinity chromatography. Plant J. 24:127–137. 105 Huang JW, Chen JT, Yu WP, Shyur LF, et al. (1996). Complete structures of three rice sucrose synthase isogenes and differential regulation of their expressions. Biosci. 
Biotechnol. Biochem. 60:233–239. Jeong J, Suh S, Guan C, Tsay YF, et al. (2004). A nodule-specific dicarboxylate transporter from alder is a member of the peptide transporter family. Plant Physiol. 134:969–978. Jones DA and Jones JDG (1994). The role of leucine-rich repeat proteins in plant defences. Adv. Bot. Res. 24:89–167. Kaiser BN, Moreau S, Castelli J, Thomson R, et al. (2003). The soybean NRAMP homologue, GmDMT1, is a symbiotic divalent metal transporter capable of ferrous iron transport. The Plant J. 35:295–304. Kapranov P, Bruijn de FJ and Szczyglowski K (1997). Novel, highly expressed late nodulin gene (LjNOD16) from Lotus japonicus. Plant Physiol. 113:1081–1090. Kumada Y, Takano E, Nagaoka K and Thompson CJ (1990). Streptomyces hygroscopicus has two glutamine synthetase genes. J. Bacteriol. 172:5343-5351. Kumar S, Tamura K and Nei M (2004). MEGA 3: Integrated Software for Molecular Evolutionary Genetics Analysis and Sequence Alignment. Brief. Bioinf. 5:150-163. Küster H, Becker A, Firnhaber C, Hohnjec N, et al. (2007). Development of bioinformatic tools to support EST-sequencing, in silico- and microarray-based transcriptome profiling in mycorrhizal symbioses. Phytochem. 68:19-32. Lam HM, Coschigano KT, Oliveira IC, Melo-Oliveira R, et al. (1996). The molecular genetics of nitrogen assimilation into amino acids in higher plants. Annu. Rev. Plant Physiol. Plant Mol. Biol. 47:569–593. Lea PJ and Ireland RJ (1999). Nitrogen metabolism in higher plants. Plant. In BK Singh, ed, Plant Amino Acids, Bioch. Biotech. Marcel Dekker, New York, pp 1–47. Lee H, Hur CG, Oh CJ, Kim HB, et al. (2004). Analysis of the root nodule-enhanced transcriptome in soybean. Mol. Cells 18:53-62. Legocki RP and Verma DP (1980). Identification of “nodule-specific” host proteins (nodulins) involved in the development of rhizobium-legume symbiosis. Cell 20:153163. Ling H, Zhao J, Zuo K, Qiu C, et al. (2006). 
Isolation and expression analysis of a GDSLlike lipase gene from Brassica napus L. J. Biochem. Mol. Biol. 39:297–303. Lockhart DJ and Winzeler EA (2000). Genomics, gene expression and DNA arrays. Nature 405:827–836. 106 Long SR (2001). Genes and signals in the Rhizobium-legume symbiosis. Plant Physiol. 125:69–72. Lunn JE (2002). Evolution of sucrose synthesis. Plant Physiol. 128:1490-1500. Manthey K, Krajinski F, Hohnjec N, Firnhaber C, et al. (2004). Transcriptome profiling in root nodules and arbuscular mycorrhiza identifies a collection of novel genes induced during Medicago truncatula root endosymbioses. MPMI 17(10):1063–1077. Martins LMV, Xavier GR, Rangel FW, Ribeiro JRA, et al. (2003). Contribution of biological nitrogen fixation to cowpea: a strategy for improving grain yield in the Semi-Arid Region of Brazil. Biol. Fer. Soils 38:333-339. Mathesius U, Keijzers G, Natera SHA, Weinman JJ, et al. (2001). Establishment of a root proteome reference map for the model legume Medicago truncatula using the expressed sequence tag database for peptide mass fingerprinting. Proteomics 1:1424– 1440. McClung AD, Carroll AD and Battey NH (1994). Identification and characterization of ATPase activity associated with maize (Zea mays) annexins. Biochem. J. 30:709–712. McGrath RB and Coruzzi GM (1991). A gene network controlling glutamine and asparagine biosynthesis in plants. Plant J. 1: 275-280. Mims MP and Prchal J T (2005). Divalent metal transporter 1. Hematol. 10:339–345. Miyao A, Iwasaki Y, Kitano H, Itoh J, et al. (2007). A large-scale collection of phenotypic data describing an insertional mutant population to facilitate functional analysis of rice genes. Plant Mol. Biol. 63:625–635. Morey KJ, Ortega JL and Sengupta-Gopalan C (2002). Cytosolic glutamine synthetase in soybean is encoded by a multigene family, and the members are regulated in an organspecific and developmental manner. Plant Physiol. 128:182–193. Morgan SO and Fernandez MP (1997). 
Distinct annexin subfamilies in plants and protests diverged prior to animal annexins and from a common ancestor. J. Mol. Evol. 44:178188. Moss SE (1997). Annexins. Trends Cell Biol. 7:87-89. Newton WE (2000). Nitrogen fixation: from molecules to crop productivity. Dordrecht: Kluwer 3–8. Niebel FC, Lescure N, Cullimore JV and Gamas P (1998). The Medicago truncatula MtAnn1 gene encoding an annexin is induced by nod factors and during the symbiotic interaction with Rhizobium meliloti. Mol. Plant-Microbe Interac. 11:504–513. 107 Niebel FC, Timmersy ACJ, Chabaud M, Defaux-Petras A, et al. (2002). The Nod factorelicited annexin MtAnn1 is preferentially localised at the nuclear periphery in symbiotically activated root tissues of Medicago truncatula. The Plant J. 32:343–352. Nogueira EM, Vinagre F, Masuda HP, Vargas C, et al. (2001). Expression of sugarcane genes induced by inoculation with Gluconacetobacter diazotrophicus and Herbaspirillum rubrisubalbicans. Genet. Mol. Biol. 24:199-206. Oldroyd GED and Downie AL (2004). Calcium, kinases and nodulation signalling in legumes. Mol. Cell Biol. 5:566-576. Ortega JL, Roche D and Sengupta-Gopalan C (1999). Oxidative turnover of soybean root glutamine synthetase: in vitro and in vivo studies. Plant Physiol. 119:1483–1495. Ouédraogo JT, Tignegre JB, Timko MP and Belzile FJ (2002). AFLP markers linked to resistance against Striga gesnerioides race 1 in cowpea (Vigna unguiculata). Genome 45(5):787-793. Papadopoulou K, Roussis A and Katinakis P (1996). Phaseolus ENOD40 is involved in symbiotic and non-symbiotic organogenetic processes: expression during nodule and lateral root development. Plant Mol. Biol. 30:403–417. Parniske M and Downie JA (2003). Plant biology: locks, keys and symbioses. Nature 425:569–570. Peng T and Dickstein R (1994). Regulation of plant nodule-specific genes expressed in alfalfa nodules arrested at an early stage of development. Plant Sci. 101:65–73. Peoples MBRR, Gault GJ, Scammell BS, Dear J, et al. 
(1998). Effect of pasture management on the contributions of fixed N to the N economy of ley-farming systems. Aust. J. Agric. Res. 49:459–474.
Peterman TK and Goodman HM (1991). The glutamine synthetase gene family of Arabidopsis thaliana: light-regulation and differential expression in leaves, roots and seeds. Mol. Gen. Genet. 230:145–154.
Pringle D and Dickstein R (2004). Purification of ENOD8 proteins from Medicago sativa root nodules and their characterization as esterases. Plant Physiol. Biochem. 42:73–79.
Raynal P and Pollard HB (1994). Annexins: the problem of assessing the biological role for a gene family of multifunctional Ca2+- and phospholipid-binding proteins. Biochim. Biophys. Acta 1197:63–93.
Rochefort DA and Benson DR (1990). Molecular cloning, sequencing and expression of the glutamine synthetase II (glnII) gene from the actinomycete root nodule symbiont Frankia sp. strain CpI1. J. Bacteriol. 172:5335–5342.
Saccone C, Gissi C, Lanave C and Pesole G (1995). Molecular classification of living organisms. J. Mol. Evol. 40:273–279.
Sánchez C, Vielba JM, Ferro E, Covelo G, et al. (2007). Two SCARECROW-LIKE genes are induced in response to exogenous auxin in rooting-competent cuttings of distantly related forest species. Tree Physiol. 27:1459–1470.
Schauser L, Wieloch W and Stougaard J (2005). Evolution of NIN-Like proteins in Arabidopsis, rice and Lotus japonicus. J. Mol. Evol. 60:229–237.
Shatters RG, Somerville JE and Kahn ML (1989). Regulation of glutamine synthetase II activity in Rhizobium meliloti. J. Bacteriol. 171:5087–5094.
Shin H and Brown RM (1999). GTPase activity and biochemical characterisation of a recombinant cotton fiber annexin. Plant Physiol. 119:925–934.
Simões-Araújo JL, Rodrigues RL, Gerhardt LBA, Mondego JMC, et al. (2002). Identification of differentially expressed genes by cDNA-AFLP technique during heat stress in cowpea nodules. FEBS Lett. 515:44–50.
Singh BB, Mohan Raj DR, Dashiell KE and Jackai LEN (1997).
Advances in cowpea research. IITA-JIRCAS, Ibadan, Nigeria.
Smeekens S (2000). Sugar-induced signal transduction in plants. Annu. Rev. Plant Physiol. Plant Mol. Biol. 51:49–81.
Smit P, Raedts J, Portyanko V, Debellé F, et al. (2005). NSP1 of the GRAS protein family is essential for rhizobial Nod factor-induced transcription. Science 308:1789–1791.
Spaink HP (2000). Root nodulation and infection factors produced by rhizobial bacteria. Annu. Rev. Microbiol. 54:257–288.
Sprent JI (2007). Evolving ideas of legume evolution and diversity: a taxonomic perspective on the occurrence of nodulation. New Phytol. 174:11–25.
Stacey G, Libault M, Brechenmacher L, Wan J, et al. (2006). Genetics and functional genomics of legume nodulation. Curr. Opin. Plant Biol. 9:110–121.
Stanford AC, Larsen K, Barker DC and Cullimore JV (1993). Differential expression within the glutamine synthetase gene family of the model legume Medicago truncatula. Plant Physiol. 103:73–81.
Stougaard J (2000). Regulators and regulation of legume root nodule development. Plant Physiol. 124:531–540.
Sturm A, Lienhard S, Schatt S and Hardegger M (1999). Tissue-specific expression of two genes for sucrose synthase in carrot (Daucus carota L.). Plant Mol. Biol. 39:349–360.
Swarup R, Bennett MJ and Cullimore JV (1990). Expression of glutamine-synthetase genes in cotyledons of germinating Phaseolus vulgaris L. Planta 183:51–56.
Tarayre S, Vinardell JM, Cebolla A, Kondorosi A, et al. (2004). Two classes of the Cdh1-type activators of the anaphase-promoting complex in plants: Novel functional domains and distinct regulation. Plant Cell 16:422–434.
Tejera NA, Campos R, Sanjuan J and Lluch C (2004). Nitrogenase and antioxidant enzyme activities in Phaseolus vulgaris nodules formed by Rhizobium tropici isogenic strains with varying tolerance to salt stress. J. Plant Physiol. 161:329–338.
Temple SJ, Vance CP and Gantt JS (1998). Glutamate synthase and nitrogen assimilation. Trends Plant Sci. 3:51–56.
Thies JE, Singleton PW and Bohlool BB (1991). Modelling symbiotic performance of introduced rhizobia in the field by the use of indices of indigenous population size and nitrogen status of the soil. Appl. Environ. Microbiol. 57:29–37.
Turner SL and Young JPW (2000). The glutamine synthetases of rhizobia: phylogenetics and evolutionary implications. Mol. Biol. Evol. 17:309–319.
Upton C and Buckley JT (1995). A new family of lipolytic enzymes. Trends Biochem. Sci. 20:178–179.
Van de Velde W, Guerra JCP, De Keyser A, De Rycke R, et al. (2006). Aging in legume symbiosis. A molecular view on nodule senescence in Medicago truncatula. Plant Physiol. 141:711–720.
Vincill ED, Szczyglowski K and Roberts DM (2005). GmN70 and LjN70. Anion transporters of the symbiosome membrane of nodules with a transport preference for nitrate. Plant Physiol. 137:1435–1444.
Wilkinson JQ, Lanahan MB, Conner TW and Klee HJ (1995). Identification of mRNAs with enhanced expression in ripening strawberry fruit using polymerase chain reaction differential display. Plant Mol. Biol. 27:1097–1108.
Zahran HH (1999). Rhizobium-Legume symbiosis and nitrogen fixation under severe conditions and in an arid climate. Microbiol. Mol. Biol. Rev. 63:968–989.
Zahran HH and Sprent JI (1986). Effects of sodium chloride and polyethylene glycol on root hair infection and nodulation of Vicia faba L. plants by Rhizobium leguminosarum. Planta 167:303–309.
Zhu H, Riely BK, Burns NJ and Ane JM (2006). Tracing non-legume orthologs of legume genes required for nodulation and arbuscular mycorrhizal symbioses. Genetics 172:2491–2499.
Zrenner R, Salanoubat M, Willmitzer L and Sonnewald U (1995). Evidence of the crucial role of sucrose synthase for sink strength using transgenic potato plants (Solanum tuberosum L.). Plant J. 7:97–107.

Table 1. Type and features of nodulin genes used as query against the cowpea databases.
The genes are grouped in two nodulin types, early nodulins (orange background) and late nodulins (green background), with the respective gene name, accession number at NCBI, protein size in amino acids (aa), organism and conserved domains. Each conserved domain entry lists name, size (aa) and begin–end positions.

Gene name | Accession number | Size (aa) | Organism | Conserved Domain 1 | Conserved Domain 2 | Conserved Domain 3 | Conserved Domain 4
Annexin | CAA75308 | 313 | Medicago truncatula | Annexin, 66, 14–79 | Annexin, 66, 86–150 | Annexin, 66, 172–232 | Annexin, 66, 243–308
DMI3 | Q6RET7 | 523 | Medicago truncatula | S_TKc, 256, 11–306 | EFh, 63, 370–460 | EFh, 63, 441–508 | -
NIN | CAB61243 | 878 | Lotus japonicus | RWP-RK, 52, 571–621 | PB1_NLP, 82, 781–862 | - | -
NSP1 | ABK35066 | 542 | Lotus japonicus | GRAS, 371, 154–532 | - | - | -
NORK | CAD10811 | 925 | Medicago truncatula | PKc_Tyr, 258, 602–868 | - | - | -
CCS52A | AAY58271 | 487 | Lotus japonicus | WD40, 289, 186–462 | - | - | -
ENOD8 | AAL68832 | 381 | Medicago truncatula | SGNH_plant_lipase_like, 315, 35–365 | - | - | -
ENOD40 | CAD48198 | 261 | Medicago truncatula | RRM, 74, 152–220 | - | - | -
DMT1 | AAO39834 | 516 | Glycine max | Nramp, 360, 77–439 | - | - | -
GS | Q43785 | 356 | Medicago sativa | Gln-synt_N, 82, 18–97 | Gln-synt_C, 259, 103–354 | - | -
Lgb | CAA38024 | 162 | Medicago sativa | Globin, 140, 20–157 | - | - | -
NOD26 | AAT35231 | 310 | Medicago truncatula | MIP, 228, 80–268 | - | - | -
NOD70 | AAW51884 | 598 | Glycine max | Nodulin-like, 248, 27–253 | - | - | -
SucSin | P13708 | 805 | Glycine max | Sucrose_synth, 550, 7–554 | - | - | -
NOD35 | BAA19672 | 309 | Glycine max | Uricase, 286, 15–304 | - | - | -

Table 2. Main cowpea clusters significantly similar to known nodulins.
tBLASTn results including the best match of each nodulin type: (I) Features and evaluation results with gene name, e-value, cluster size in nucleotides (n), ORF (Open Reading Frame) size in amino acids (aa), frame (Fr) and number (#) of hits. (II) Data about the BLASTx best alignment: NCBI gi number, plant species, e-value and frame.

(I) Cluster Features and Evaluation: Gene name, Cluster Nr., Size (n), ORF (aa), E-value. (II) BLASTx Information: NCBI gi Nr., E-value, Score, aa Positives (%), Frame, Plant Species.

Gene name | Cluster Nr. | Size (n) | ORF (aa) | E-value | NCBI gi Nr. | E-value | Score | aa Positives (%) | Frame | Plant Species
Annexin | Contig2698 | 1153 | 313 | 5e-150 | 3176098 | 4e-147 | 525 | 92 | +2 | Medicago truncatula
DMI3 | UP12_145693 | 1464 | 403 | 3e-60 | 91992434 | 0.0 | 701 | 92 | +3 | Medicago truncatula
NIN | UP12_25530 | 653 | 189 | 6e-57 | 33468530 | 3e-86 | 321 | 83 | +2 | Lotus japonicus
NSP1 | UP12_1465 | 1586 | 349 | 4e-22 | 89474462 | 1e-165 | 587 | 79 | +1 | Solanum lycopersicum
NORK | Contig1184 | 1154 | 284 | 1e-159 | 56412259 | 1e-171 | 546 | 92 | +3 | Sesbania rostrata
CCS52A | UP12_8627 | 1046 | 71 | 3e-114 | 66932877 | 2e-106 | 389 | 94 | +3 | Lotus japonicus
ENOD8 | UP12_14302 | 1279 | 378 | 3e-113 | 33147016 | 4e-124 | 519 | 81 | +3 | Oryza sativa
ENOD40 | UP12_6121 | 1276 | 253 | 2e-98 | 23304837 | 2e-80 | 303 | 80 | +2 | Medicago truncatula
DMT1 | UP12_13157 | 1492 | 312 | 3e-131 | 31322147 | 4e-129 | 521 | 92 | +2 | Glycine max
GS | UP12_2868 | 1344 | 356 | 0.0 | 121345 | 0.0 | 654 | 99 | +2 | Phaseolus vulgaris
Lgb | VUPISS02004C04 | 886 | 145 | 6e-47 | 20138590 | 2e-64 | 249 | 100 | +1 | Vigna unguiculata
NOD26 | UP12_17225 | 1157 | 301 | 2e-145 | 47531135 | 4e-98 | 362 | 90 | +1 | Medicago truncatula
NOD70 | UP12_10450 | 2342 | 591 | 3e-91 | 57545995 | 4e-39 | 167 | 60 | +2 | Glycine max
SucSin | UP12_10000 | 1602 | 805 | 0.0 | 267057 | 2e-160 | 1590 | 99 | +1 | Vigna radiata
NOD35 | UP12_2046 | 1257 | 308 | 5e-169 | 6175091 | 5e-164 | 581 | 97 | +3 | Phaseolus vulgaris

Table 3. Conserved domains description of the best hits in the cowpea database for each nodulin type, including cluster number (Nr), gene name, size in amino acids (aa), ORF (Open Reading Frame) size in amino acids (aa), alignment of protein (Ptn), Conserved Domain (CD) present, integrity (Int) and number (#) of hits with Complete Domain (Com Dom).
Each conserved domain entry lists: name, size (aa), alignment Ptn/CD (start–end), Int (C, complete; I, incomplete), # Com Dom.

Gene name | Cluster Nr. | ORF (aa) | Conserved Domain 1 | Conserved Domain 2 | Conserved Domain 3 | Conserved Domain 4
Annexin | Contig2698 | 313 | Annexin, 66, 14/1–79/66, C, 3 | Annexin, 66, 243/1–308/66, C, 2 | Annexin, 66, 170/3–232/65, C, 2 | Annexin, 66, 86/1–150/66, C, 1
DMI3 | UP12_145693 | 403 | S_TKc, 256, 1/46–214/256, C, 4 | EFh, 63, 262/2–321/61, C, 5 | EFh, 63, 337/5–394/63, C, 3 | -
NIN | UP12_10901 | 223 | PB1_NLP, 82, 130/2–212/83, C, 3 | - | - | -
NSP1 | UP12_9726 | 366 | GRAS, 302, 1/32–250/288, C, 3 | - | - | -
NORK | UP12_9214 | 452 | PKc_Tyr, 258, 115/3–380/262, C, 20 | - | - | -
CCS52A | UP12_6633 | 389 | WD40, 289, 96/4–380/285, C, 2 | - | - | -
ENOD8 | UP12_14302 | 378 | SGNH_plant_lipase_like, 315, 23/2–358/315, C, 21 | - | - | -
ENOD40 | UP12_6121 | 253 | RRM, 73, 155/1–224/66, C, 3 | - | - | -
DMT1 | UP12_13157 | 312 | Nramp, 360, 1/127–236/360, I, 0 | - | - | -
GS | Contig1 | 356 | Gln-synt_N, 82, 21/5–97/82, C, 3 | Gln-synt_C, 259, 103/1–354/258, C, 3 | - | -
Lgb | VUPISS02004C04 | 145 | Globin, 140, 5/1–141/140, C, 40 | - | - | -
NOD26 | UP12_17225 | 301 | MIP, 228, 74/1–262/205, C, 15 | - | - | -
NOD70 | UP12_12841 | 224 | Nodulin-like, 248, 17/1–223/201, C, 4 | - | - | -
SucSin | Contig2 | 550 | Sucrose_synth, 398, 7/1–554/550, C, 2 | - | - | -
NOD35 | Contig1 | 177 | Uricase, 286, 14/1–174/162, I, 1 | - | - | -

Chapter 3
Scientific Article
Expression of Nodulin Genes in the Sugarcane Transcriptome Revealed by Computational Analysis
Article to be submitted to the journal Genetics and Molecular Research.
Identification and Expression of Nodulins in Sugarcane Transcriptome Revealed by In Silico Analysis
Gabriela Souto Vieira-Mello; Petra Barros dos Santos; Nina da Mota Soares Cavalcanti and Ana Maria Benko-Iseppon
Universidade Federal de Pernambuco, Centro de Ciências Biológicas, Departamento de Genética, Laboratório de Genética e Biotecnologia Vegetal, Recife, PE, Brazil.
Short running title: Symbiotic Nitrogen Fixation Genes in Sugarcane Transcriptome.
Key words: data mining, sugarcane, early nodulins, late nodulins, expression pattern
Corresponding Author: Ana Maria Benko-Iseppon, UFPE, CCB, Departamento de Genética, Laboratório de Genética e Biotecnologia Vegetal, Av. Prof. Moraes Rego, s/nº; 50732-970, Recife, PE, Brazil. E-mail: ana.benko.iseppon@pq.cnpq.br

ABSTRACT
Nodulin genes have been defined as plant genes that are exclusively induced during nodule formation in legume plants. Many studies, however, have revealed a number of nodulins in non-legumes, including some monocots, suggesting that these genes play additional roles in plants besides nodulation. Sugarcane (Saccharum spp.) establishes a beneficial association with endophytic nitrogen-fixing bacteria, and some genes involved in this plant-bacteria association show homology with known legume nodulins. In order to gain insight into the role played by nodulins in sugarcane, we investigated the presence and expression profile of nodulin genes in the sugarcane transcriptome. In the present work, 13 genes encoding legume nodulins were selected and used to search for orthologs in the sugarcane database (SUCEST) using in silico procedures. Ortholog sequences were identified, translated and analyzed for conserved domains (CDs) using the BLAST, ORFfinder and RPS-BLAST tools, respectively. To evaluate the expression profile we used the CLUSTER program, considering the tissue of origin and the treatment of each library for the available transcripts.
We identified 195 candidate contigs in the SUCEST database presenting significant alignments with known legume nodulins. In silico evaluation revealed higher expression in FL (flowers), RT (roots) and NR (normalized mix of tissues), confirming the multifunctional character of sugarcane nodulins beyond the interaction with endophytic bacteria. The multiple alignments showed high similarity between the sugarcane candidates and the respective proteins from other plants, mainly monocots, revealing that the gene structure was relatively conserved among species, probably reflecting very ancient genetic processes.

INTRODUCTION
Sugarcane is one of the most important sources of sugar and alcohol in the world and is cultivated in tropical and subtropical areas in more than 80 countries around the globe, especially in Brazil, where it is used mainly for sugar and ethanol production. This crop occupies an area of around one million ha, contributing 25% of the world's production (UDOP, 2008).
Several Brazilian sugarcane varieties have the ability to grow with low nitrogen fertilizer inputs. Historically, this crop has been selected in Brazil for high yields with low inputs of inorganic nitrogen fertilizer and, unwittingly, for higher contributions of Biological Nitrogen Fixation (BNF) (Nogueira et al., 2001).
This important Brazilian crop establishes associations with endophytic diazotrophic bacteria, including Gluconacetobacter diazotrophicus, Herbaspirillum seropedicae and Herbaspirillum rubrisubalbicans, showing unique features when compared with other nitrogen-fixing associations. The bacteria colonize the intercellular spaces and vascular tissues of most organs of the infected host, promoting plant growth without causing visible anatomical changes or disease symptoms (Baldani et al., 1997; Reinhold-Hurek and Hurek, 1998), possibly due to an effective nitrogen supply (Sevilla et al., 2001).
Moreover, it was suggested that the plant supplies sucrose, among other photosynthates, favoring endophytic growth (Fuentes-Ramírez et al., 1999).
It is still unclear which mechanisms are involved in the establishment of this particular type of interaction and what kind of molecules mediate signaling between plant and bacteria. In addition, little is known about the role of the plant in this association (Nogueira et al., 2001). However, the fact that distinct sugarcane genotypes have different rates of BNF suggests that plant genetic factors might control the process of bacterial recognition, colonization and/or nitrogen fixation (Urquiaga et al., 1992).
Nodulins have been defined as plant genes that are exclusively induced during nodule formation in legume plants (van Kammen, 1984). Many studies, however, have revealed a number of nodulin-related sequences in non-legumes, e.g. those of leghemoglobin (Trevaskis et al., 1997), uricase II (Takane et al., 1997), ENOD93 (Reddy et al., 1998) and ENOD40 (Kouchi et al., 1999). Moreover, some non-leguminous plants, including rice, were found to be able to perceive the lipochitooligosaccharide nodulation signal molecules (Nod factors) produced by rhizobia (Reddy et al., 1998). These findings suggest that nodule formation processes are conserved, at least partially, in non-legumes. The inherent nodulation potential of non-legumes can probably be attributed to the existence of nodulin genes in these plants (Reddy et al., 1999). In addition, some sugarcane genes involved in plant-bacteria signaling during the association and in nitrogen metabolism are probably activated by the endophytic bacteria in the early steps of plant colonization, allowing sugarcane to assimilate and process the nitrogen fixed by the bacteria (Vargas et al., 2003). These genes also seem to act as nodule activators, since they show homology with some legume nodulins (Nogueira et al., 2001).
The investigation of plant gene expression during plant-bacteria associations in the sugarcane transcriptome could be a strategy to unravel the plant molecular mechanisms involved in this particular type of association between plants and diazotrophic bacteria. Large-scale sequencing of cDNA libraries by the expressed sequence tag (EST) approach has proven to be a powerful tool to discover new genes and to generate gene expression profiles from different cells and tissues growing under distinct developmental and physiological conditions (Ohlrogge and Benning, 2000). In this context, the present work aimed to perform an in silico identification and characterization of early (Annexin, DMI3, NORK, CCS52A, NIN, ENOD40 and ENOD8) and late (NOD26, NOD70, Glutamine synthetase, Leghemoglobin, Sucrose synthase and DMT1) nodulins in the sugarcane transcriptome (SUCEST project), using known legume nodulin sequences as templates and including an evaluation of the expression profiles of nodulin-related sequences in this organism.

MATERIALS AND METHODS
The sugarcane ESTs used in the present work are available in the SUCEST database (www.biotec.icb.ufmg.br/sucest). Information regarding the 31 libraries of the SUCEST project is given in Table 1 (for further details see Grivet and Arruda, 2001; Vettore et al., 2001).
The identification of sugarcane nodulins was performed by searching the SUCEST database with the local tBLASTn tool, using 13 sequences of known legume nodulins selected from the literature and available at the NCBI databank (early nodulins: Annexin, DMI3, NORK, CCS52A, NIN, ENOD40 and ENOD8; late nodulins: NOD26, NOD70, Glutamine synthetase, Leghemoglobin, Sucrose synthase and DMT1) (Table 2). After this search, the sugarcane sequences that matched nodulin genes with an e-value cut-off of e-10 were used for a homology screening in GenBank (NCBI) using the BLASTx tool (Altschul et al., 1997).
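The e-value screening step can be illustrated with a short sketch. It assumes that the tBLASTn hits were exported as tab-separated lines with the e-value in the eleventh column (the layout of BLAST+ tabular output); the original work may have used a different export. The cluster names and all column values other than the e-values 4e-152 and 3e-11 are hypothetical. Note that a threshold written as e-10 has to be given as 1e-10 in code.

```python
# Minimal sketch: keep only BLAST hits at or below an e-value cut-off.
# Assumption: tab-separated hit lines with the e-value in column 11,
# as in BLAST+ tabular output; identifiers below are hypothetical.

E_VALUE_CUTOFF = 1e-10  # the "e-10" threshold quoted in the text


def filter_hits(lines, cutoff=E_VALUE_CUTOFF):
    """Return (query, subject, e-value) tuples for hits with e-value <= cutoff."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= cutoff:
            kept.append((fields[0], fields[1], evalue))
    return kept


# Hypothetical hit lines (query, subject, ..., e-value, bit score).
example_hits = [
    "Annexin\tcluster_A\t85.0\t313\t47\t0\t1\t313\t40\t978\t4e-152\t530",
    "CCS52A\tcluster_B\t41.2\t120\t70\t1\t200\t319\t5\t364\t3e-11\t66.0",
    "NIN\tcluster_C\t30.1\t90\t62\t2\t600\t689\t10\t279\t2e-05\t45.0",
]

for query, subject, evalue in filter_hits(example_hits):
    print(f"{query}\t{subject}\t{evalue:g}")
```

In this sketch the third hit is discarded because its e-value is above the cut-off; only hits that pass would go on to the BLASTx screening.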
The cluster frame of the tBLASTn alignment was used to predict the Open Reading Frame (ORF) of each selected cluster. Sugarcane clusters were translated using the ORF Finder tool at NCBI (http://www.ncbi.nlm.nih.gov/projects/gorf) and screened for conserved motifs with the aid of the RPS-BLAST CD-search tool (Altschul et al., 1990).
The distribution pattern of nodulins across the sugarcane libraries was assessed by direct correlation with the read frequency of each cluster in the SUCEST cDNA libraries, while the prevalence of sugarcane clusters was determined by directly counting the reads that composed each cluster, followed by data normalization (considering the total number of reads sequenced in each library) and calculation of the relative frequency (reads per library). To generate an overall picture of the nodulin expression pattern in sugarcane, a hierarchical clustering approach (Eisen et al., 1998) was applied to the normalized data and a graphic representation was constructed with the aid of the CLUSTER program. Dendrograms for both axes (using the weighted pair-group method for each gene class and library) were generated with the TreeView program (Eisen et al., 1998). In the resulting graphics, yellow means no expression and brown represents all degrees of expression.

Table 1. Description of the SUCEST libraries, including library code, number of ESTs per library and description of the tissues and conditions from which the ESTs were obtained. Abbreviations: EST, Expressed Sequence Tag; #, number.
Library Code | # ESTs | Brief Description
AD1 | 18137 | Tissues of plants cultivated in vitro and infected with Gluconacetobacter diazotrophicus
AM1, AM2 | 28128 | Apical meristem of young plants
CL3, CL4, CL6 | 11872 | Calli treated for 12 h at 4 ºC and 37 ºC in the dark or light
FL1, FL2, FL3, FL4, FL5, FL8 | 83899 | Flowers at different developmental stages
HR1 | 12000 | Tissues of plants cultivated in vitro and infected with Herbaspirillum rubrisubalbicans
LB1, LB2 | 18047 | Lateral buds from mature plants
LR1, LR2 | 18141 | Young leaf
LV1 | 6432 | Leaves from plants grown in vitro
NR1, NR2 | 768 | All normalized tissues
RT1, RT2, RT3 | 31487 | 0.3 cm-length roots from mature plants and root apex
RZ1, RZ2, RZ3 | 24096 | Root to shoot zone of young plants
SB1 | 16318 | Stalk bark from mature sugarcane plants
SD1, SD2 | 21406 | Developing seeds
ST1, ST3 | 20762 | First and fourth internodes of young plants

RESULTS
1. Sugarcane Orthologs
Using 13 well-known nodulin genes as templates (Table 2), we identified 195 candidate sequences in the SUCEST database (which includes 311,493 ESTs), comprising 129 clusters (1,524 reads) for early nodulins and 66 clusters (1,646 reads) for late nodulins, with e-values ranging from 0.0 to e-10. Overall, most of the analyzed sugarcane clusters showed similarity with monocots, mainly members of the Poaceae family such as Oryza sativa (Table 3).
1.1. Early Nodulins
For annexin, a high degree of similarity was found after tBLASTn, with a best e-value of 4e-152. All nine clusters presented best matches with the respective protein after BLASTx analysis at GenBank, two of them with the complete annexin domain. All candidate sequences matched monocot plants (Zea mays and Oryza sativa), with the exception of two clusters, which presented similarity with annexins from Arabidopsis thaliana (Brassicaceae) and Cicer arietinum (Fabaceae).
After tBLASTn with the DMI3 seed sequence, 25 clusters were identified in the sugarcane database with e-values varying from 8e-65 to e-10, seven with a complete S_TKc domain and 18 with an incomplete domain. After reverse alignments (BLASTx), 19 sugarcane sequences exhibited best similarity with monocots, including Zea mays (four clusters), Triticum aestivum (one cluster) and Oryza sativa (14 clusters). The other six sequences were similar to members of dicot families (Cucurbitaceae, Rosaceae, Fabaceae and Brassicaceae).
Concerning the CCS52A candidates, 12 clusters were selected in the SUCEST database, with e-values ranging from 2e-130 to 3e-11. After reverse alignment, 83.3% presented similarity with the respective protein from O. sativa, while the rest were similar to Lotus japonicus and A. thaliana. The WD40 conserved domain was complete in three and incomplete in four sequences, while the rest of the clusters lacked the searched domain.
The search for orthologs of NIN genes revealed five clusters with a high degree of similarity. In two of these the searched RWP-RK domain was complete, in one it was incomplete and in two no domain was identified. In the BLASTx analysis the clusters showed similarity to the respective protein from O. sativa and L. japonicus, with e-values ranging from 7e-147 to 8e-23.
The NORK analysis revealed the highest number of similar clusters (46), with a best e-value of 2e-101. In the BLASTx results, 93.5% of all selected clusters showed similarity with monocots, mainly O. sativa, while 6.5% were similar to A. thaliana. Regarding the integrity of the PKc_Tyr conserved domain, it was complete in 28 clusters and incomplete in 18.
The 28 putative ENOD8 clusters obtained using tBLASTn against the SUCEST database showed high similarity with O. sativa proteins after BLASTx, with more than 75% of the selected clusters similar to this monocot.
Twelve clusters presented the complete SGNH_plant_lipase_like domain. A similar result was observed in the reverse alignment for ENOD40: four clusters were selected in the SUCEST database, three showing high similarity with the respective protein from O. sativa, with the RRM domain complete in three clusters and incomplete in one.
1.2. Late Nodulins
After trimming redundant clusters, the search for late nodulins in the sugarcane database also revealed a high degree of similarity with monocot ortholog proteins, notably for DMT1 and NOD70, in which all clusters found were similar to the respective protein as observed after BLASTx. With respect to DMT1, all eleven selected clusters showed similarity with proteins from O. sativa; however, just one presented the complete domain, while seven bore incomplete domains and three clusters lacked the searched domain. Regarding the NOD70 results, only two clusters were similar to proteins from the dicots Glycine max (Fabaceae family) and Poncirus trifoliata (Rutaceae family), while the other 13 were similar to O. sativa proteins; three had a complete domain, five an incomplete domain and seven no domain.
Considering the Glutamine synthetase (GS) results, all selected clusters exhibited best matches with sugarcane proteins after BLASTx analysis at GenBank. Two GS candidates displayed the complete Gln-synt_N conserved domain (CD), five an incomplete domain and one had no domain. On the other hand, only two clusters were found in the Leghemoglobin (Lgb) tBLASTn results, one of which presented the complete Globin conserved domain. In the BLASTx results these two candidates presented high similarity with a hemoglobin protein from Z. mays.
Sucrose synthase (SS) and NOD26 candidates revealed similarity mainly with other monocots. In the case of SS, five of the 13 selected sequences were similar to the respective protein from the legumes G.
max (4) and Pisum sativum (1), and the remaining nine showed best matches with S. officinarum (2), O. sativa (2), Sorghum bicolor (1), Z. mays (2) and Bambusa oldhamii (1). Considering the searched CD, 70% of the selected clusters presented an incomplete Sucrose_synth domain, while 15% presented the complete domain; the same percentage was found for clusters with no CD. In relation to the NOD26 candidates, 16 clusters were obtained after tBLASTn, with the reverse alignments confirming that all sequences were similar to monocotyledonous plants (Z. mays, S. bicolor, O. sativa and S. officinarum), with e-values ranging from 1e-131 to 3e-85. Regarding the integrity of the searched conserved domain (MIP), it was complete in eight clusters, incomplete in two and absent in six.

2. Distribution of ESTs in SUCEST Libraries
Considering the distribution of the 3,170 nodulin transcripts in the 13 analyzed tissues, a higher prevalence could in general be observed in flower (FL = 22%), root (RT = 13.5%) and stem-root transition (RZ = 10.5%; Figure 1A) tissues. Regarding the distribution of transcripts among nodulin classes, it is interesting to note that all 29 analyzed libraries from the SUCEST database comprised at least one read, while the AD library displayed no difference in the number of reads between the two classes (early and late); both had 122 reads in total.
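Because the SUCEST libraries differ widely in size, raw read counts favor the largest libraries. The normalization mentioned in Materials and Methods can be sketched as follows. The library sizes come from Table 1 and the early nodulin read counts from the text of this section; the scale of reads per 10,000 ESTs and the function name are assumptions made here for illustration, not necessarily the authors' exact procedure.

```python
# Minimal sketch of library-size normalization for in silico expression.
# Assumption: frequencies are expressed per 10,000 ESTs (scale chosen here).

LIBRARY_SIZE = {"FL": 83899, "RT": 31487, "AD": 18137}   # ESTs per library (Table 1)
EARLY_NODULIN_READS = {"FL": 381, "RT": 139, "AD": 122}  # reads quoted in the text


def reads_per_10k(read_counts, library_sizes, scale=10_000):
    """Divide each read count by its library size and rescale."""
    return {lib: scale * n / library_sizes[lib] for lib, n in read_counts.items()}


normalized = reads_per_10k(EARLY_NODULIN_READS, LIBRARY_SIZE)
for lib, value in sorted(normalized.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lib}: {EARLY_NODULIN_READS[lib]} reads, {value:.1f} per 10,000 ESTs")
```

With these figures the AD library ranks above FL after normalization, although FL contributes the most raw reads, which is why normalized rather than raw counts are the appropriate input for comparing libraries.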
By contrast, the RT library exhibited the largest difference, with the late nodulin class presenting 292 reads and the early nodulin class 139 reads (Figure 1A).
Considering the distribution of reads among nodulins, SS reads were clearly the most abundant in the SUCEST libraries, with 879 reads (representing 27.7% of nodulin transcripts), followed by NORK with 612 reads (19.3% of the total). The lowest number of reads was observed for Lgb, with eight reads, representing only 0.25% of all transcripts found (Figure 1B).
In the individual analysis of the early nodulins (Figure 2A), a higher prevalence was detected in the FL library, with 381 reads (25%), followed by the RZ library, with 196 reads (13%); however, expression was also abundant in other libraries, such as RT and AM, each representing 9% of the total reads (139 and 142 reads, respectively). On the other hand, the late nodulin graphic (Figure 2B) revealed that the FL and RT libraries had the highest numbers of reads, together representing 37% of all late nodulin transcripts, while NR and LV had the lowest representation (2%), with 38 and 30 reads, respectively.

Figure 1. (A) Comparative prevalence of early and late nodulin genes in the SUCEST libraries. Numbers on the vertical axis refer to the total number of reads. (B) Prevalence of reads per nodulin category. Numbers outside the columns refer to the absolute number of reads found and, below, the percentage of reads that compose each gene category. Abbreviations for libraries: AD: tissues infected by Gluconacetobacter diazotrophicus; AM: Apical meristem; CL: Callus; FL: Flower; HR: tissues infected with Herbaspirillum rubrisubalbicans; LB: Lateral Bud; LR: Leaf Roll; LV: Leaves; NR: All tissues normalized; RT: Root; RZ: Stem-Root transition; SB: Stalk Bark; SD: Seeds; ST: Stem.

Figure 2.
Prevalence of sugarcane nodulins in the SUCEST libraries. (A) Occurrence of early nodulin reads. (B) Occurrence of late nodulin reads. Numbers refer to the percentage of reads in each library for each nodulin class. Library codes: AD: tissues infected by Gluconacetobacter diazotrophicus; AM: Apical meristem; CL: Callus; FL: Flower; HR: tissues infected with Herbaspirillum rubrisubalbicans; LB: Lateral Bud; LR: Leaf Roll; LV: Leaves; NR: All tissues normalized; RT: Root; RZ: Stem-Root transition; SB: Stalk Bark; SD: Seeds; ST: Stem.

3. Expression pattern analysis

All transcripts from the two nodulin classes were used to perform a hierarchical clustering analysis, allowing an evaluation of expression intensity (indicated by the different colors) and of co-expression among libraries (black upper dendrogram) or candidates (pink lateral dendrogram). Considering both graphs (Figure 3), it was evident that early nodulins were more represented in SUCEST libraries than late nodulins. The in silico expression approach also revealed higher expression in flower tissues at different developmental stages, more specifically in the FL2 library, for early nodulins, and in the normalized pool of all tissues, mainly the NR2 library, for late nodulins. Regarding co-expression among SUCEST libraries in the early nodulin graph (black upper dendrogram), some libraries showed a stronger relation, such as FL2/RZ2, RT1+LR2/RT2 and ST3/LB1 (Figure 3A), while for the late nodulins the libraries that showed co-expression were SD2/SD1/LR2+FL5 and FL2/CL3 (Figure 3B). It is interesting to highlight that the early nodulins were best represented by reads from stem-root transition (RZ1 and RZ3) and flower (FL2) tissues, totaling 19.5%, while the late nodulins were prevalent in the NR2 library (35.3%).
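The hierarchical clustering used to group libraries and candidates by expression profile can be illustrated with a minimal sketch. It assumes average-linkage clustering on a Pearson-correlation distance, in the spirit of Eisen et al. (1998), which is listed in the references; the exact settings used for Figure 3 are not stated in this section. The gene names and read counts below are hypothetical placeholders, not SUCEST data.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 1.0  # flat profile: treated as uncorrelated (an assumption)
    return 1.0 - cov / (sd_x * sd_y)


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (left, right, distance)."""
    names = sorted(profiles)
    dist = {
        frozenset((a, b)): pearson_distance(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def linkage(pair):
        # Average of all pairwise distances between members of two clusters.
        left, right = pair
        total = sum(dist[frozenset((a, b))] for a in left for b in right)
        return total / (len(left) * len(right))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=linkage)
        merges.append((left, right, linkage((left, right))))
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
    return merges


# Hypothetical read counts per gene across four libraries (illustration only).
toy_counts = {
    "gene_a": [10, 20, 30, 40],
    "gene_b": [1, 2, 3, 4],
    "gene_c": [40, 30, 20, 10],
}

merges = average_linkage(toy_counts)
for left, right, distance in merges:
    print(left, right, round(distance, 3))
```

Because the distance is based on correlation, two genes with proportional profiles (here gene_a and gene_b) are joined first regardless of their absolute read counts, which is why a weakly expressed candidate can still group with a highly expressed one in such a dendrogram. Clustering libraries instead of genes only requires transposing the table.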
Considering the spatial co-expression among transcripts of the nodulin classes (pink lateral dendrogram), the most representative presence in all tissues was found for NORK among the early nodulins and for sucrose synthase (Suc Sin) among the late nodulins. In addition, the transcripts of the early nodulins ENOD8, DMI3, Annexin and CCS52A were related, showing co-expression among the SUCEST libraries. Concerning the co-expression of late nodulins, the analysis revealed two main groups: Suc Sin/GS and DMT1/NOD26.

Figure 3. Differential display of standard sugarcane transcripts representing selected nodulin genes. Graph A represents the expression of early nodulins (NIN, ENOD40, NORK, CCS52A, ENOD8, DMI3 and Annexin) and graph B represents the late nodulins (Suc Sin, GS, Lgb, NOD35, NOD70, DMT1 and NOD26). Yellow means no expression and brown means all levels of expression. Library codes: AD1: tissues infected by Gluconacetobacter diazotrophicus; AM1: Apical meristem from mature plants; AM2: Apical meristem from immature plants; CL3, CL4 and CL6: Pool of calli treated for 12 h at 4 °C and 37 °C in the dark or light; FL1, FL3, FL4, FL5 and FL8: Flowers harvested at different developmental stages; HR1: tissues infected with Herbaspirillum rubrisubalbicans; LB1 and LB2: Lateral bud from mature plants; LR1: Leaf roll from immature plants, large insert; LR2: Leaf roll from immature plants, small insert; LV1: Etiolated leaves from plantlets grown in vitro; NR1 and NR2: all tissues normalized; RT1, RT2 and RT3: 0.3 cm length roots from mature plants and root apex; RZ1, RZ2 and RZ3: Root to shoot zone transition of young plants, zones 1, 2 and 3; SB1: Stalk bark from mature plants; SD1 and SD2: Seeds in different stages of development; ST1: Stem, first internodes; ST3: Stem, fourth internodes.

DISCUSSION

1.
Sugarcane Orthologs

1.1 Early Nodulins

Annexins form a multigene and multifunctional family of amphipathic proteins with a broad taxonomic distribution covering prokaryotes, fungi, protists, plants and higher vertebrates (Gerke and Moss, 2002; Morgan et al., 2004). Within Magnoliophyta this protein is conserved in both dicotyledons and monocotyledons (Smallwood et al., 1992). Concerning their functions, in legumes annexins are upregulated by Nod factors and play a role in nodulation signaling (Niebel et al., 1998). Besides their role in symbiosis, annexins from non-legumes are associated with different cellular processes. For example, in maize annexins are considered multifunctional proteins capable of peroxidase activity, elevation of cytosolic calcium and direct formation of a passive Ca2+- and K+-permeable conductance (Laohavisit et al., 2009). Annexins have also been documented in plant nuclei, where they may participate in DNA replication (Clark et al., 1998). Another study described a wheat annexin that accumulates in the plasma membrane in response to cold treatment and may act as a Ca2+ channel (Breton et al., 2000). In the case of sugarcane Annexin orthologs, the best alignments followed taxonomic proximity, since these sequences showed high similarity with the same gene from members of the Poaceae family (O. sativa and Z. mays). In addition, two sequences presented the complete conserved domain, indicating the existence and conservation of Annexin genes in sugarcane.

DMI3 is a plant-specific protein that belongs to the CCaMK group of serine-threonine protein kinases in well-characterized plant genomes, present from the moss Physcomitrella patens to higher plants, including dicots and monocots (Messinese et al., 2007). We found many DMI3 orthologs in the sugarcane transcriptome bearing high similarity with previously sequenced genes. Most sugarcane CCaMKs presented high similarity with rice sequences.
Regarding this resemblance, it is interesting to note that some authors suggested that legume DMI3 also bore high similarity to rice and lily sequences (Sathyanarayanan and Poovaiah, 2002; Yang and Poovaiah, 2003). Little is known about the biological role of CCaMKs in plants. The preferential expression of the lily and rice CCaMKs in developing anthers and root tips (Poovaiah et al., 1999; Wang and Poovaiah, 1999) has led to the suggestion that they could play a role in mitosis and meiosis (Yang and Poovaiah, 2003). Other authors suggested that a CCaMK is required by mycorrhized plants to interpret a complex calcium signature elicited in response to fungal signals (Hrabak et al., 2003; Yang and Poovaiah, 2003). This could also be the case in sugarcane, which, besides interacting with endophytic bacteria, is able to establish mycorrhizal associations (Reis et al., 1999).

The CCS52A protein is an APC activator involved in mitotic cyclin degradation and in the regulation of endoreduplication, playing a role in cell enlargement during root nodule organogenesis in legumes (Vinardell et al., 2003). However, this gene is not exclusively associated with the nodulation process and appears to be a ubiquitous regulator of the transition from cell cycle to differentiation in plant cells (Foucher and Kondorosi, 2000). The phylogeny of CCS52A follows the classic taxonomic relationships, and orthologs of this protein have been found in various other plant species, such as L. japonicus, M. truncatula, arabidopsis, tobacco, tomato, potato, soybean, wheat and rice, indicating a strong conservation of the CCS52A proteins in the plant kingdom (Cebolla et al., 1999; González-Sama et al., 2006; Vinardell et al., 2003). Despite the scarce information regarding the presence of CCS52A in sugarcane, our findings indicate that this gene is present in this organism, since we found clusters with complete domains and best hits with a high degree of similarity to rice.
The initiation of nodule development has been shown to depend on the nodule inception protein (NIN) (Borisov et al., 2003). In addition, NIN represses the spatial expression pattern of nodulation factors, which may control nodule number (Marsh et al., 2007). Interestingly, the NIN gene family is found widely among higher plants and algae, including many species that are not able to promote gaseous nitrogen fixation (Castaings et al., 2009). The most prominent feature of the NIN protein is a 60-amino-acid sequence that is strongly conserved across a variety of proteins in different plant species. In non-legumes this highly conserved region (named RWPQP) has been predicted to correspond to a DNA-binding region that also acts in dimerization (Schauser et al., 2005). Riechman et al. (2000) found no close relatives of the legume NIN proteins in rice or arabidopsis; instead, these non-legumes presented NIN-like proteins (NLPs), which are the closest relatives of legume NINs. In addition, the NLPs are multidomain proteins with a high degree of conservation; the phylogenetic tree inferred from the NLP alignment suggested that there were at least three copies of this gene in the common ancestor of mono- and eudicotyledons (Schauser et al., 2005). Our results agree with these findings, since the five identified sugarcane clusters showed high similarity with rice NLPs, indicating the presence of NIN proteins in sugarcane.

The most abundant nodulin class in sugarcane was NORK, with 46 clusters. The extracellular domain of the NORK protein includes a unique 400-amino-acid sequence and three LRR (Leucine Rich Repeat) domains, followed by a transmembrane domain and a typical serine/threonine protein kinase domain.
LRR domains mediate protein interactions and are thought to be involved in ligand recognition by LRR-RLKs (Leucine Rich Repeat Receptor-like kinases), which are required for perception of a liposaccharide nodulation signal in legumes (Endre et al., 2002; Shiu and Bleecker, 2001). Proteins with similarity to the unique NORK extracellular domain are found in monocots and dicots, suggesting that this region may have a biological role that is not limited to nodulation (Endre et al., 2002). The RLKs comprise the largest gene family of receptors in plants, with more than 600 homologs in arabidopsis and 1100 in rice (Shiu et al., 2004). In both organisms these RLKs might have roles in plant development and in signal transduction during interactions with endophytic organisms and pathogens (Morillo and Tax, 2006). In addition, Vinagre et al. (2006) identified in sugarcane an LRR-RLK whose expression is regulated in response to interactions with beneficial bacteria. Together, these facts support our findings in the sugarcane transcriptome and explain the high number of clusters found.

ENOD8 is a member of the GDSL family of lipolytic enzymes present in plants and bacteria, which share a putative active-site serine sequence context that is not perfectly conserved in all members of the GDSL gene family (Györgyey et al., 2000). In plants, GDSL lipase candidates from species such as arabidopsis, Rauvolfia serpentina, Medicago sativa, Hevea brasiliensis and Alopecurus myosuroides have been isolated, cloned and characterized, revealing that they are conserved among these species (Arif et al., 2004; Cummins and Edwards, 2004; Oh et al., 2005; Pringle and Dickstein, 2004; Ruppert et al., 2005). Nogueira et al. (2001) found in infected SUCEST libraries that sugarcane ENOD8 shows similarity to a myrosinase-associated protein (MyAP) related to plant defense responses.
In our results, sequences of the ENOD8 protein were also found in non-infected tissues, suggesting that this protein plays a role in other functions besides the interaction with endophytic organisms; however, in monocots these functions remain unknown. In dicots, such as arabidopsis and Brassica napus, ENOD8 sequences are specifically expressed in anthers and encode hydrolytic proteins, some of which have esterase and lipase activity (Cook and Dénarié, 2000; Peng and Dickstein, 1994). In addition, the similarity with rice and arabidopsis sequences found in our alignments can be explained by the fact that few sequences from other non-legumes are available in the NCBI database.

In legumes, ENOD40 is a critical gene responsible for the cortical cell divisions leading to the initiation of nodule development in the rhizobial association (Charon et al., 1999), while in the interaction with arbuscular mycorrhiza it plays a role in fungal growth in the root cortex and increases the frequency of arbuscule formation (Sinvany et al., 2002). However, ENOD40 is not exclusively associated with plant-host interactions, and its possible functions in non-legumes fall into three groups: transport, organogenesis and regulation of phytohormone status. Thus, it has been suggested that ENOD40 may also have a regulatory role during different stages of plant development, but its precise function is still poorly understood (Ruttink, 2003). ENOD40 genes can be identified by the presence of regions that are highly conserved among distantly related plant species (Compaan et al., 2003). In accordance with this fact, ENOD40 from O. sativa encodes peptides that are homologous to proteins encoded by the corresponding genes in legumes, even though their expression is not associated with symbiotic interactions (Reddy et al., 1999).
The occurrence of ENOD40 sequences in monocots and in different clades within the core eudicots shows that ENOD40 is an ancient gene that has been maintained in these plants during divergent evolution (Ruttink, 2003). This gene was also functionally characterized in Z. mays (Yang et al., 1993); in addition, previous studies have identified isoforms in the sugarcane genome using Southern analysis (Reddy et al., 1999), supporting the present evidence from the sugarcane transcriptome. Additionally, the low number of clusters found can be explained by the suggestion of Compaan et al. (2003) that this gene category is expressed at low levels in most non-legume plant species.

1.2 Late Nodulins

DMT1 belongs to the NRAMP (Natural Resistance-Associated Macrophage Protein) family of metal transporters. Several members of the NRAMP family have been characterized and shown to be involved in metal uptake and transport in organisms such as fungi, animals, plants and bacteria (Cellier and Gros, 2004). In monocots, such as rice, expression studies have revealed that several NRAMP genes are upregulated upon Fe deficiency, suggesting a role in Fe homeostasis (Thomine et al., 2000; Bereczky et al., 2003). In our study the majority of sugarcane orthologs exhibited best alignments with NRAMP proteins of O. sativa, which may suggest that these genes are also related to Fe homeostasis in sugarcane.

Nodule development is associated with the spatially and temporally regulated expression of a number of genes that encode membrane transport proteins (Jeong et al., 2004). Muñoz et al. (1996) showed that NOD70 from monocots has homology with a sulfate transporter and a possible role in nutrient supply during plant-microbe symbiogenesis. However, Szczyglowski et al. (1998) showed that NOD70 genes from legumes encode a polytopic membrane protein with sequence and topology similar to members of the major facilitator superfamily (MFS) of membrane transporters. Moreover, Vincill et al.
(2005) showed that NOD70 from legumes encodes a symbiosome membrane protein that possesses an anion transport activity with selectivity for nitrate, nitrite and chloride. Nitrate is an important nitrogen source for plants. In addition to its role as a nutrient, nitrate was shown to act as a signal molecule that, independently of its assimilation, controls numerous aspects of plant development and metabolism (Wang et al., 2003). In our results almost all orthologous sequences showed similarity with a nitrate and chloride transporter from other monocot plants, especially Z. mays, which displays a homologue of NOD70 described as a nitrate and chloride transporter (Vincill et al., 2005). This supports the importance of this gene in sugarcane as well, since many alignments with features similar to those of Z. mays could be identified.

With respect to sugarcane GS orthologs, the best alignments showed the complete searched domain, with most candidates presenting similarity with sugarcane sequences after reverse alignments, in agreement with the fact that these proteins were already characterized in this organism (Nogueira et al., 2005). Our findings are supported by evidence from other GS isoforms that have also been found in some non-legumes from temperate climates, such as Z. mays and H. vulgare, besides the dicot Lycopersicon esculentum (Becker et al., 1992; Miflin, 1974; Sakakibara et al., 1992). Some functional studies regarding the role of GS in monocots revealed that some GS isoforms are important for normal growth (Hirel et al., 2005) and for grain filling in rice and maize (Shrawat and Good, 2008).

The high similarity of the Leghemoglobin candidate clusters was in accordance with the classic taxonomic relationships, since the significant alignments occurred with Z. mays.
One of the candidates showed the complete Globin domain, and the existence of this gene in sugarcane is clearly evident, since nonsymbiotic hemoglobin genes have also been found in monocotyledonous plants such as barley, wheat and rice (Andersson et al., 1996; Taylor et al., 1994), and symbiotic hemoglobin is present in the nitrogen-fixing nodules of both legume and non-legume plants (Andersson et al., 1997). The function of these plant hemoglobins in nonsymbiotic tissues is not clear; they may be associated with the transport of oxygen or, as suggested by Appleby et al. (1990), they may act as oxygen sensors in the signal transduction pathway for activation of anaerobic genes.

Sucrose synthase catalyzes the reversible conversion of sucrose and UDP to UDP-glucose and fructose and is the central enzyme of carbohydrate metabolism in all plant species. This enzyme is implicated in a wide variety of processes, including nitrogen fixation (Gordon et al., 1999), starch synthesis (Chourey et al., 1998), cellulose biosynthesis (Amor et al., 1995), phloem transport (Nolte and Koch, 1993) and the ability of storage organs to act as carbon sinks (Zrenner et al., 1995). In monocots, sucrose synthases play different roles. For example, in rice this enzyme is induced transcriptionally and translationally in seedlings under oxygen deficiency, and its activity under submerged conditions is significantly higher than under aerobic conditions. In Potamogeton distinctus the transcription of sucrose synthase is increased in elongating turions under oxygen deficiency (Harada et al., 2005), while in maize the enzyme has been implicated in various roles in the synthesis and degradation of sucrose, as well as in the flow of carbon from one organ to another (Shaw et al., 1994). In addition, parenchyma cells of sugarcane stems accumulate sucrose up to 20% of their fresh mass or 60% of their dry mass in mature internodes (Moore and Maretzki, 1996).
The involvement of sucrose synthase in sugarcane metabolism has already been demonstrated (Schäfer et al., 2004), and the large number of sequences found with high homology to this gene confirms the existence of sucrose synthase (SUS) in the sugarcane genome, as expected.

The NOD26-like major intrinsic protein (MIP) is a subfamily of aquaporins, a category of sequences involved in a number of processes concerning water and solute relations in plants. In some legumes, such as soybean, NOD26 acts specifically in the peribacteroid membrane of N2-fixing symbiotic root nodules (Fortin et al., 1987). Homologues of NOD26 have been identified in plant species that do not develop any N2-fixing symbiosis, but their subcellular localization is still unclear (Weig et al., 1997). The separation into different functional groups probably occurred before the monocot-dicot divergence, and it has been suggested that the ancestral gene of these groups encoded a protein with a specific biological role. The persistence of these groups in dicots and monocots is also an indication of the crucial importance of MIPs in Angiospermae (Chaumont et al., 2000). Regarding monocots, maize cells contain several NOD26 homologues in their plasma membrane that have different functions or are differentially regulated (Lopez et al., 2003). Liu et al. (1994) characterized two rice cDNAs homologous to genes encoding the MIP family and observed that their expression was enhanced by water stress, salt stress and exogenous abscisic acid. Together, these facts explain the large number of NOD26 homologous sequences found in the sugarcane transcriptome.

2. Expression pattern

Nodulins were initially defined as plant genes that are exclusively induced during nodule formation and specifically expressed in nodules (Muñoz et al., 1996).
However, functional evaluations showed that many of these genes are in fact expressed in nonsymbiotic tissues and/or under nonsymbiotic conditions (Charon et al., 1999), and that they have a number of homologues in non-legume plants, such as arabidopsis and rice, that are unable to form nodules (Chen et al., 2007; Miyao et al., 2007). Thus, it is hypothesized that nodulin genes have arisen through the recruitment of pre-existing, non-symbiotic genes that might have roles in other physiological processes common to all plants, such as the control of growth and development (Andersson et al., 1996; Mylona et al., 1995). The presence of nodulin transcripts in non-infected tissues in SUCEST libraries is consistent with this hypothesis. Evaluations recognized that legume genes required for nodulation are also essential for the symbiotic associations with arbuscular mycorrhizal (AM) fungi, which are established in more than 80% of flowering plants, including monocots such as rice (Kistner et al., 2005; Oldroyd and Downie, 2004). In sugarcane, several genes possibly involved in nitrogen metabolism and plant-bacteria signaling during the endophytic diazotrophic association seem to act as nodule-enhanced genes (Nogueira et al., 2001). Therefore, the higher expression of the early nodulins found in our study, in comparison to the late nodulins, was expected, since this type of nodulin acts mainly in the plant-host signaling pathways.

Regarding the expression pattern of early nodulins, the predominance of nodulin transcripts in flower libraries was also expected, since this is the largest SUCEST library, with a higher number of developmental stages and sequences than the other tissues.
Regarding the NORK results, significant expression could be detected in most SUCEST libraries. This is in agreement with the fact that genes encoding RLK isoforms, besides their roles in interactions between organisms, are closely related to plant developmental processes, being present in tissues undergoing growth and differentiation, such as seeds, plantlets in different stages of development, flowers, leaves and root-to-shoot transition regions, which points to the crucial importance of these proteins for plants (Morillo and Tax, 2006). In addition, in rice these isoforms are involved in hormone reception, growth-factor recognition, the recognition of fungal elicitors and the development of shoot and floral apical meristems (Takayama and Sakagami, 2002), which explains the presence of sugarcane NORK reads in many meristematic tissues.

In legumes the expression of ENOD40 is induced within hours of Rhizobium inoculation and appears to be critical for proper nodule development; however, transcripts are also localized in the stem, lateral roots and other tissues of these plants (Charon et al., 1999). Our results have shown that sugarcane ENOD40 presented an expression pattern similar to that previously found in rice, in which expression was detected in the developing vascular bundles of the stem (Kawahara and Chonan, 1968). Additionally, in maize transcripts were detected in roots, leaves and leaf veins, with the highest expression in the stem (Varkonyi-Gasic and White, 2002).

The occurrence of annexin transcripts in almost all SUCEST libraries is in accordance with Proust et al. (1996), who revealed by Northern-blot analysis that plant annexins have a fairly widespread expression. Concerning monocot annexins, Smallwood et al. (1992) showed that the transcripts were found in root tissues, stems and young expanding leaves of barley, while Carroll et al.
(1998) reported that the maize annexin was expressed in root cap cells and differentiating vascular tissues in roots, both similar to the annexin expression in sugarcane found in our analysis.

Besides the nodulins described above, many early nodulins, such as DMI3, ENOD8 and CCS52A, presented an expression related to organ differentiation in monocots (Foucher and Kondorosi, 2000; Messinese et al., 2007; Peng and Dickstein, 1994). Based on the distribution and prevalence of these early nodulins in the sugarcane transcriptome, we can suggest that these genes also play a role in organ development in this monocot.

In general, the late nodulins are preferentially expressed in mature nodules, acting directly in nitrogen fixation (Niebel et al., 1998), which can explain the low number of transcripts found in the present approach in the transcriptome of sugarcane, a plant that is unable to form nodules. Regarding the expression pattern of these genes, the highest numbers of transcripts were found for sucrose synthase, glutamine synthetase and NOD26, nodulins required for the development of several tissues. Studying rice, Hirose et al. (2008) detected sucrose synthase transcripts in a wide range of tissues and at different developmental stages, indicating that they are involved in diverse growth processes. We found a similar expression pattern in the sugarcane transcriptome, with large amounts of transcripts in the NR2 library (normalized mix of tissues). This gene is also crucial for enhanced growth during tissue development, explaining the high number of transcripts found in the early stages of flower development (FL2) and in root-to-shoot transition tissues of young plants (RZ). In addition, this enzyme showed high activity in other tissues, such as callus cells and mature sugarcane internodes (Botha and Black, 2000), in accordance with our findings in the CL and ST libraries.
The expression patterns of glutamine synthetase and NOD26 were quite similar, with both presenting reads in almost all SUCEST libraries, which was expected since these genes play an important role in all plant groups. The highest expression of glutamine synthetase in sugarcane was in root tissues. Similar results were observed in other monocots, such as Z. mays and H. vulgare, with significant expression of glutamine synthetase isoforms in roots (Becker et al., 1992; Sakakibara et al., 1992). NOD26 is an aquaporin that plays a crucial role in water and solute relations in plants. In agreement with this function, our analysis of the sugarcane transcriptome revealed the prevalence of NOD26 transcripts in tissues with a high water flux and requirement, such as seeds (SD), roots (RT), flowers (FL) and internodes (ST). In other monocots, such as rice, some isoforms of this gene have also been localized in organs and tissues with these characteristics, e.g. vascular tissues, flowers and roots (Fraysse et al., 2005; Senadheera et al., 2009).

In general, the fact that different nodulins are expressed in most SUCEST libraries supports the assumption that these genes are not expressed only under nodulation conditions, revealing their importance for all angiosperms.

CONCLUDING REMARKS

With the aid of bioinformatic tools it was possible to identify all 13 nodulin gene categories out of 195 sugarcane contigs, also allowing inferences regarding their expression pattern. For all nodulin classes, candidates bearing the respective conserved domains could be found in sugarcane, most of them putatively involved in tissue development and growth, besides plant-host interactions. Considering the small number of nodulins described in monocots, the identified sequences represent valuable resources for functional evaluations, including expression assays, and may lead to significant benefits for sugarcane production.
ACKNOWLEDGMENTS

We thank CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) and FACEPE (Fundação de Amparo à Pesquisa do Estado de Pernambuco) for the concession of fellowships. We also thank FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo) and SUCEST for access to the Sugarcane EST data bank.

REFERENCES

Altschul SF, Madden TL, Schaffer AA, Zhang J, et al. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nuc. Ac. Res. 25 (17): 3389-3402.
Amor Y, Haigler CH, Johnson S, Wainscott M, et al. (1995). A membrane-associated form of sucrose synthase and its potential role in synthesis of cellulose and callose in plants. Proc. Natl. Acad. Sci. U.S.A. 92:9353–9357.
Andersson CA, Llewellyn DJ, Peacock WJ and Dennis ES (1997). Cell-specific expression of the promoters of two nonlegume haemoglobin genes in transgenic legume, Lotus corniculatus. Plant Physiol. 113:45–57.
Andersson CA, Jensen EO, Llewellyn DJ, Dennis ES, et al. (1996). A new hemoglobin gene from soybean: a role for hemoglobin in all plants. Proc. Natl. Acad. Sci. U.S.A. 93:5682–5687.
Appleby CA, Dennis ES and Peacock WJ (1990). A primaeval origin for plant and animal haemoglobins? Aust. Syst. Bot. 3:81-89.
Arif SA, Hamilton RG, Yusof F, Chew NP, et al. (2004). Isolation and characterization of the early nodule-specific protein homologue (Hev b 13), an allergenic lipolytic esterase from Hevea brasiliensis latex. J. Biol. Chem. 279:23933-23941.
Baldani JI, Caruso L, Baldani VLD, Goi SR, et al. (1997). Recent advances in BNF with non-legume plants. Soil Biol. Biochem. 29:911–922.
Becker D, Kemper E, Schell J and Masterson R (1992). New plant binary vectors with selectable markers located proximal to the left T-DNA border. Plant Mol. Biol. 20:1195-1197.
Bereczky Z, Wang HY, Schubert V, Ganal M, et al. (2003). Differential regulation of nramp and iron metal transporter genes in wild type and iron uptake mutants of tomato. J. Biol. Chem.
278:24697-24704.
Borisov AY, Madsen LH, Tsyganov VE, Umehara Y, et al. (2003). The Sym35 Gene Required for Root Nodule Development in Pea Is an Ortholog of Nin from Lotus japonicus. Plant Physiol. 131:1009–1017.
Botha FC and Black KG (2000). Sucrose phosphate synthase and sucrose synthase activity during maturation of internodal tissue in sugarcane. Aust. J. Plant Physiol. 27:81-85.
Breton G, Vasquez-Tello A, Danyluk J and Sarhan F (2000). Two novel intrinsic annexins accumulate in wheat membranes in response to low temperature. Plant Cell Physiol. 41:177–184.
Carroll AD, Moyen C, Van Kesteren P, Tooke F, et al. (1998). Ca2+, annexins, and GTP modulate exocytosis from maize root cap protoplasts. Plant Cell 10:1267–1276.
Castaings L, Camargo A, Pocholle D, Gaudon V, et al. (2009). The nodule inception-like protein 7 modulates nitrate sensing and metabolism in Arabidopsis. The Plant J. 57:426–435.
Cebolla A, Vinardell JM, Kiss E, Olah B, et al. (1999). The mitotic inhibitor ccs52 is required for endoreduplication and ploidy-dependent cell enlargement in plants. Euro. Mol. Biol. J. 18:4476–4484.
Cellier M and Gros P (eds) (2004). The Nramp Family. New York: Eurekah.com and Kluwer Academic/Plenum.
Charon C, Sousa C, Crespi M and Kondorosi A (1999). Alteration of enod40 expression modifies Medicago truncatula root nodule development induced by Sinorhizobium meliloti. Plant Cell 11:1953–1966.
Chaumont F, Barrieu F, Jung R and Chrispeels MJ (2000). Plasma membrane intrinsic proteins from maize cluster in two sequence subgroups with differential aquaporin activity. Plant Physiol. 122:1025–1034.
Chen X, Laudeman TW, Rushton PJ, Spraggins TA, et al. (2007). CGKB: an annotation knowledge base for cowpea (Vigna unguiculata L.) methylation filtered genomic genespace sequences. BMC Bioinfo. 8:129.
Chourey PS, Taliercio EW, Carlson SJ and Ruan YL (1998).
Genetic evidence that the two isozymes of sucrose synthase present in developing maize endosperm are critical, one for cell wall integrity and the other for starch biosynthesis. Mol. Gen. Genet. 259:88–96.
Clark GB, Dauwalder M and Roux SJ (1998). Immunological and biochemical evidence for nuclear localization of annexin in peas. Plant Physiol. Biochem. 36:621–627.
Compaan B, Ruttink T, Albrecht C and Meeley R (2003). Identification and characterization of a Zea mays line carrying a transposon-tagged ENOD40. Bioch. et Bioph. Acta 1629:84–91.
Cook D and Dénarié J (2000). Progress in the genomics of Medicago truncatula and the promise of its application to grain legume crops. Grain Legumes 28:12-13.
Cummins I and Edwards R (2004). Purification and cloning of an esterase from the weed black-grass (Alopecurus myosuroides), which bioactivates aryloxyphenoxypropionate herbicides. Plant J. 39:894-904.
Eisen MB, Spellman PT, Brown PO and Botstein D (1998). Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. U.S.A. 95:14863-14868.
Endre G, Kereszt A, Kevei Z, Mihacea S, et al. (2002). A receptor kinase gene regulating symbiotic nodule development. Nature 417:962–966.
Fortin MG, Morrison NA and Verma DP (1987). Nodulin-26, a peribacteroid membrane nodulin is expressed independently of the development of the peribacteroid compartment. Nucleic Acids Res. 15:813–824.
Foucher F and Kondorosi E (2000). Cell cycle regulation in course of nodule organogenesis in Medicago. Plant Mol. Biol. 43:773-786.
Fraysse LC, Wells B, McCann MC and Kjellbom P (2005). Specific plasma membrane aquaporins of the PIP1 subfamily are expressed in sieve elements and guard cells. Biol Cell. 97:519–534.
Fuentes-Ramírez LE, Caballero-Mellado J, Sepúlveda J and Martínez-Romero E (1999). Colonization of sugarcane by Acetobacter diazotrophicus is inhibited by high N-fertilization. Fed. Eur. Microbiol. Soc. Microbiol. Ecol. 29:117-128.
Gerke V and Moss SE (2002).
Annexins: From structure to function. Physiol. Rev. 82:331–371. González-Sama A, de la Peña TC, Kevei Z, Mergaert P, et al. (2006). Nuclear DNA Endoreduplication and Expression of the Mitotic Inhibitor Ccs52 Associated to Determinate and Lupinoid Nodule Organogenesis. Mol. Plant-Microbe Interac. 19:173–180. Gordon AJ, Minchin FR, James CL and Komina O (1999). Sucrose synthase in legume nodules is essential for nitrogen fixation. Plant Physiol. 120:867–878. Grivet L and Arruda P (2001). Sugarcane genomics: depiciting the complex genome of an important tropical crop. Curr. Opin. Plant. Biol. 5:122-127. Györgyey J, Vaubert D, Jiménez-Zurdo JI, Charon C, et al. (2000). Analysis of Medicago truncatula nodule expressed tags. Mol. Plant Microbe Interac. 13:62-71. Harada T, Satoh S, Yoshioka T and Ishizawa K (2005). Expression of sucrose synthase genes involved in enhanced elongation of pondweed (Potamogeton distinctus) turions under anoxia. Ann. Bot. (Lond) 96:683–692. 146 Hirel B, Andrieu B, Valadier MH, Renard S, et al. (2005). Physiology of maize II: Identification of physiological markers representative of the nitrogen status of maize (Zea mays) leaves during grain filling. Physiol. Plantarum 124:178-188. Hirose T, Scofield GN and Terao T (2008). An expression analysis profile for the entire sucrose synthase gene family in rice. Plant Sci. 174:534–543. Hrabak EM, Chan CWM, Gribskov M, Harper JF, et al. (2003). The Arabidopsis CDPK-SnRK of protein kinases. Plant Physiol. 132:666–680. van Kammen A (1984). Suggested nomenclature for plant genes involved in nodulation and symbiosis. Plant Mol. Biol. Report 2:43–45. Kawahara H and Chonan N (1968). Studies on morphogenesis in rice plants. 5. Histological observation on the maturing process of vascular bundles in culm. Japan. Jour. Crop Sci. 37:399– 410. Kistner C, Winzer T, Pitzschke A, Mulder L, et al. (2005). 
Seven Lotus japonicus genes required for transcriptional reprogramming of the root during fungal and bacterial symbiosis. Plant Cell 17:2217–2229. Kouchi H, Takane K, So RB, Ladha JK, et al. (1999). Rice ENOD40: isolation and expression analysis in rice and transgenic soybean root nodules. Plant J. 18:121–129. Jeong J, Suh S, Guan C, Tsay YF, et al. (2004). A nodule-specific dicarboxylate transporter from alder is a member of the peptide transporter family. Plant Physiol. 134:969–978. Laohavisit A, Mortimer JC, Demidchik V, Coxon KM, et al. (2009). Zea mays Annexins Modulate Cytosolic Free Ca2+ and Generate a Ca2+-Permeable Conductance. The Plant Cell 21:479–493. Liu Q, Umeda M and Uchimiya H (1994). Isolation and expression analysis of two rice genes encoding the major intrinsic protein. Plant Mol. Biol. 26:2003-2006. 147 Lopez F, Bousser A, Sissoeff I, Gaspar M, et al. (2003). Diurnal regulation of water transport and aquaporin gene expression in maize roots: contribution of PIP2 proteins. Plant Cell Physiol. 44:1384–1395. Marsh JF, Rakocevic A, Mitra RM, Brocard L, et al. (2007). Medicago truncatula NIN is essential for rhizobial-Independent nodule organogenesis induced by autoactive calcium/calmodulindependent Protein kinase. Plant Physiol. 144:324–335. Messinese E, Mun J-H, Yeun LH, Jayaraman D, et al. (2007). A novel nuclear protein interacts with the symbiotic DMI3 calcium- and calmodulin-dependent protein kinase of Medicago truncatula. Mol. Plant-Microbe Interac. 20:912–921. Miflin BJ (1974). The location of nitrite reductase and other enzymes related to amino acid biosynthesis in the plastids of root and leaves. Plant Physiol. 54:550-555. Moore PH and Maretzki A (1996). Sugarcane. - In: Zamski E and Schaffer AA (ed.): Photoassimilate Distribution in Plants and Crops: Source-Sink Relationships. pp 643-669. Marcel Dekker, New York - Basel - Hong Kong. Morgan RO, Martin-Almedina S, Iglesias JM, Gonzalez-Florez MI, et al. (2004). 
Evolutionary perspective on annexin calcium-binding domains. Biochim. Biophys. Acta 1742:133–140. Morillo SA and Tax FE (2006). Functional analysis of receptor-like kinases in monocots and dicots. Curr. Op. Plant Biol. 9:460–469. Muñoz JA, Palomares AJ and Rated P (1996). Plant genes induced in the Rhizobium-legume symbiosis. W. J. Microbiol. Biotech.12:189-202. Miyao A, Iwasaki Y, Kitano H, Itoh J, et al. (2007). A large-scale collection of phenotypic data describing an insertional mutant population to facilitate functional analysis of rice genes. Plant Mol. Biol. 63:625–635. Mylona P, Pawlowski K and Bisseling T (1995). Symbiotic Nitrogen Fixation. Plant Cell 7:869-885. 148 Niebel FC, Lescure N, Cullimore JV and Gamas P (1998). The Medicago truncatula MtAnn1 gene encoding an annexin is induced by nod factors and during the symbiotic interaction with Rhizobium meliloti. Mol. Plant-Microbe Interac. 11(6):504–513. Nogueira EM, Vinagre F, Masuda HP, Vargas C, et al. (2001). Expression of sugarcane genes induced by inoculation with Gluconacetobacter diazotrophicus and Herbaspirillum rubrisubalbicans. Genet. Mol. Biol. 24:199-206. Nogueira EM, Olivares FL, Japiassu JC, Vila C, et al. (2005). Characterization of glutamine synthetase genes in sugarcane genotypes with different rates of biological nitrogen fixation. Plant Sci. 169:819-832. Nolte KD and Koch KE (1993). Companion-cell specific localization of sucrose synthase in zones of phloem loading and unloading. Plant Physiol. 101:899–905. Oh IS, Park AR, Bae MS, Kwon SJ, et al. (2005). Secretome analysis reveals an Arabidopsis lipase involved in defense against Alternaria brassicicola. Plant Cell 17:2832-2847. Ohlrogge J and Benning C (2000). Unraveling plant metabolism by EST analysis. Curr. Opin. Plant Biol. 3:224-228. Oldroyd GED and Downie AL (2004). Calcium, kinases and nodulation signalling in legumes. Mol. Cell Biol. 5:566-576. Peng T and Dickstein R (1994). 
Regulation of plant nodule-specific genes expressed in alfalfa nodules arrested at an early stage of development. Plant Sci. 101:65–73. Poovaiah BW, Xia M, Liu Z, Wang W, et al. (1999). Developmental regulation of the gene for chimeric calcium/calmodulin-dependent protein kinase in anthers. Planta 209:161–171. Pringle D and Dickstein R (2004). Purification of ENOD8 proteins from Medicago sativa root nodules and their characterization as esterases. Plant Physiol. Biochem. 42:73-79. Proust J, Houlne G, Schantz M-L and Schantz R (1996). Characterization and gene expression of an annexin during fruit development in Capsicum annum. FEBS Lett. 383:208–212. 149 Reis VM, de Paula MA and Döbereiner J (1999). Ocorrência de micorrizas arbusculares e da bactéria Diazotrófica Acetobacter diazotrophicus em cana-de-açúcar. Pesq. Agropec. Bras. 34:1933-1941. Ruppert M, Woll J, Giritch A, Genady E, et al. (2005). Functional expression of an ajmaline pathway-specific esterase from Rauvolfia in a novel plant-virus expression system. Planta 222:888-898. Reddy PM, Aggarwal RK, Ramos MC, Ladha JK, et al. (1999). Widespread occurrence of the homologs of the early nodulin (ENOD) genes in Oryza species, related grasses. Biochem. Biophys. Res. Commun. 258:148–154. Reddy PM, Kouchi H and Ladha JK (1998). Isolation, analysis and expression of homologues of the soybean early nodulin gene GmENOD93 (GmN93) from rice. Biochim. Biophys. Acta 1443:386–392. Reinhold-Hurek B and Hurek T (1998). Life in grasses: diazotrophic endophytes. Trends in Microbiol. 6:139–144. Riechman JL, Heard J, Martin G, Reuber L, et al. (2000). Arabidopsis transcription factors: Genomewide comparative analysis among eukaryotes. Sci. 290:2105–2110. Ruttink T (2003). ENOD40 affects phytohormone cross-talk. PhD Thesis, Wageningen University, ISBN:9058089797. Sakakibara H, Kawabata S, Hase T and Sugiyama T (1992). 
Differential effects of nitrate and light on the expression of glutamine synthetases and ferredoxin-dependent glutamate synthase in maize. Plant Cell Physiol. 33:1193–1198. Sathyanarayanan PV and Poovaiah BW (2002). Autophosphorylation-dependent inactivation of plant chimeric calcium/calmodulin-dependent protein kinase. Eur. J. Biochem. 269:2457-2463. Schauser L, Wieloch W and Stougaard J (2005). Evolution of NIN-like proteins in Arabidopsis, rice, and Lotus japonicus. J. Mol. Evol. 60:229–237. 150 Schäfer WE, Rohwer JM and Botha FC (2004). Protein-level expression and localization of sucrose synthase in the sugarcane culm. Physiol. Planta. 121:187-195. Senadheera P, Singh RK and Maathuis F (2009). Differentially expressed membrane transporters in rice roots may contribute to cultivar dependent salt tolerance. J. Exp. Bot. 1-11. Sevilla M, Burris RH, Gunapala N and Kennedy C (2001). Comparison of benefit to sugarcane plant growth a 15 N2 incorporation following inoculation of sterile plants with Gluconacetobacter diazotrophicus wild-type and Nif-mutant strains. Mol. Plant-Microbe Interact. 14:358-366. Shaw JF, Chang RC, Chuang KH, Yen YT, et al. (1994). Nucleotide sequence of a novel arylesterase gene from Vibrio mimicus and characterization of the enzyme expressed in Escherichia coli. Bioch. J. 298:675–680. Shrawat AK and Good AG (2008). Genetic Engineering Approaches to Improving Nitrogen Use Eficiency. ISB News Report. Shiu SH and Bleecker AB (2001). Receptor-like kinases from Arabidopsis form a monophyletic gene family related to animal receptor kinases. Proc. Natl. Acad. Sci. U.S.A. 98:10763–10768. Shiu SH, Karlowski WM, Pan R, Tzeng YH, et al. (2004). Comparative analysis of the receptor-like kinase family in Arabidopsis and rice. Plant Cell 16:1220–1234. Sinvany G, Kapulnik Y, Wininger S, Badani H, et al. (2002). The early nodulin enod40 is induced by, and also promotes arbuscular mycorrhizal root colonization. Physiol. Mol. Plant Pathol. 60:103– 109. 
Smallwood MF, Gurr SJ, McPherson MJ, Roberts K, et al. (1992). The pattern of plant annexin gene expression. Biochem. J. 281:501-505. Szczyglowski K, Shaw RS, Wopereis J, Copeland S, et al. (1998). Nodule organogenesis and symbiotic mutants of the model legume Lotus japonicus. Mol. Plant-Microbe Interact. 11:684– 697. 151 Takane K, Tanaka K, Tajima S, Okazaki K and Kouchi H (1997). Expression of a gene for uricase II (nodulin-35) in cotyledons of soybean plants. Plant Cell Physiol. 38:149-154. Takayama S and Sakagami Y (2002). Peptide signalling in plants. Curr. Opin. Plant. Biol. 5:382387. Taylor ER, Nie XZ, MacGregor AW and Hill RD (1994). A cereal haemoglobin gene is expressed in seed and root tissues under anaerobic conditions. Plant Mol. Biol. 24:853-862. Thomine S, Wang R, Ward JM, Crawford NM, et al. (2000). Cadmium and iron transport by members of a plant metal transporter family in Arabidopsis with homology to Nramp genes. Proc. Natl. Acad. Sci. U.S.A. 97:4991–6. Trevaskis B, Watts RA, Andersson CR, Llewellyn DJ, et al. (1997). Two hemoglobin genes in Arabidopsis thaliana: the evolutionary origins of leghemoglobins. Proc. Natl. Acad. Sci. U.S.A. 94: 12230–12234. União dos Produtores de Bioenergia (UDOP), http://www.udop.com.br (November 18, 2008). Urquiaga S, Cruz HS and Boddey RM (1992). Contribution of nitrogen fixation to sugarcane: nitrogen-15 and nitrogen balance estimates. Soil Sc. Soc. Am. J. 56:105-114. Vargas C, de Pádua VLM, Nogueira EM, Vinagre F, et al. (2003). Signaling pathways mediating the association between sugarcane and endophytic diazotrophic bacteria: a genomic approach. Symbiosis 35:159–180. Varkonyi Gasic E and White DWR (2002). The white clover ENOD40 gene family. Expression patterns of two types of genes indicate a role in vascular function. Plant Physiol. 129:11071118. Vettore AL, Da Silva FR, Kemper EL and Arruda P (2001). The libraries that made SUCEST. Genet. Mol. Biol. 24:1–7. 
152 Vinagre F, Vargas C, Schwarcz K, Cavalcante J, et al. (2006). SHR5: a novel plant receptor kinase involved in plant-N2-fixing endophytic bacteria association. J. Exp. Bot. 57:559-569. Vinardell JM, Fedorova E, Cebolla A, Kevei Z, et al. (2003). Endoreduplication mediated by the anaphase-promoting complex activator CCS52A is required for symbiotic cell differentiation in Medicago truncatula nodules. Plant Cell 15:2093–2105. Vincill ED, Szczyglowski K and Roberts DM (2005). GmN70 and LjN70. Anion transporters of the symbiosome membrane of nodules with a transport preference for nitrate. Plant Physiol. 137:1435–1444. Wang RC, Okamoto M, Xing X and Crawford NM (2003). Microarray analysis of the nitrate response in Arabidopsis roots and shoots reveals over 1000 rapidly responding genes and new linkages to glucose, trehalose-6-phosphate, iron, and sulfate metabolism. Plant Physiol. 132:556567. Wang W and Poovaiah BW (1999). Interaction of plant chimeric calcium/calmodulin-dependent protein kinase with a homolog of eukaryotic elongation factor-1alpha. J. Biol. Chem. 274:1200112008. Weig A, Deswarte C and Chrispeels MJ (1997). The major intrinsic protein family of Arabidopsis has 23 members that form three distinct groups with functional aquaporins in each group. Plant Physiol. 114:1347-1357. Yang T and Poovaiah BW (2003). Calcium/calmodulin-mediated signal network in plants. Trends Plant Sci. 8:505–512. Yang WC, Katinakis P, Hendriks P, Smolders A, et al. (1993). Characterization of GmENOD40, a gene showing novel patterns of cell-specific expression during soybean nodule development. Plant J. 3:573-585. 153 Zrenner R, Salanoubat M, Willmitzer L and Sonnewald U (1995). Evidence of the crutial role of sucrose synthase for sink strength using trangenic potato plants (Solanum tuberosum L.). Plant J. 7:97–107. 154 Table 2. Type and features of nodulins genes used as query against the Sugarcane databases. 
The genes are grouped into two nodulin types, early nodulins (written in orange) and late nodulins (written in green), with the respective gene name, NCBI accession number, protein size in amino acids (aa), organism and conserved domains. Each conserved domain is given as name, size in aa (begin-end).

Gene name  Accession number  Size (aa)  Organism             Conserved domains 1 to 4
Annexin    CAA75308          313        Medicago truncatula  Annexin, 66 (14-79); Annexin, 66 (86-150); Annexin, 66 (172-232); Annexin, 66 (243-308)
DMI3       Q6RET7            523        Medicago truncatula  S_TKc, 256 (11-306); EFh, 63 (370-460); EFh, 63 (441-508)
NIN        CAB61243          878        Lotus japonicus      RWP-RK, 52 (571-621); PB1_NLP, 82 (781-862)
NORK       CAD10811          925        Medicago truncatula  PKc_Tyr, 258 (602-868)
CCS52A     AAY58271          487        Lotus japonicus      WD40, 289 (186-462)
ENOD8      AAL68832          381        Medicago truncatula  SGNH_plant_lipase_like, 315 (35-365)
ENOD40     CAD48198          261        Medicago truncatula  RRM, 74 (152-220)
DMT1       AAO39834          516        Glycine max          Nramp, 360 (77-439)
GS         Q43785            356        Medicago sativa      Gln-synt_N, 82 (18-97); Gln-synt_C, 259 (103-354)
Lgb        CAA38024          162        Medicago sativa      Globin, 140 (20-157)
NOD26      AAT35231          310        Medicago truncatula  MIP, 228 (80-268)
NOD70      AAW51884          598        Glycine max          Nodulin-like, 248 (27-253)
SucSin     P13708            805        Glycine max          Sucrose_synth, 550 (7-554)

Table 3. Main sugarcane clusters similar to nodulin genes. tBLASTn results and sequence evaluation of sugarcane nodulin genes, including the best match for each gene: (I) Features and evaluation results, with sugarcane cluster number, cluster size in nucleotides (n), ORF (Open Reading Frame) size in amino acids (aa), e-value, score, frame and number (#) of matched clusters.
(II) Data about the best BLASTx alignment: NCBI GI number and plant species.

                                    (I) Cluster Features and Evaluation                                       (II) BLASTx Information
Gene name (expected domain)         Sugarcane Cluster Nr.  Size (n)  ORF (aa)  e-value    Score  Frame  # Clusters  NCBI GI Nr.  Plant Species          e-value
Annexin (Annexin)                   SCSBST3098G08.g        1166      314       4.00e-83   541    -1     9           162459661    Zea mays               4.00e-152
DMI3 (S_TKc)                        SCJLLR1011H04.g        2424      515       8.00e-65   244    1      25          1899175      Cucurbita pepo         0.0
NIN (PB1_NLP)                       SCQGRT1041A07.g        1254      315       4.00e-38   524    -3     5           56783862     Oryza sativa           7.00e-147
NORK (PKc_Tyr)                      SCVPLB1015A04.g        2973      976       1.00e-101  365    1      46          77548313     Oryza sativa           0.0
CCS52A (WD40)                       SCCCLR1080G07.g        1338      231       2.00e-130  468    -1     12          25446692     Oryza sativa           6.00e-130
ENOD8 (SGNH_plant_lipase_like)      SCVPLB1020A04.g        1229      317       1.00e-60   229    -3     28          51969146     Arabidopsis thaliana   3.00e-76
ENOD40 (RRM)                        SCBGLR1047F09.g        1017      203       2.00e-59   178    3      4           42408101     Oryza sativa           2.00e-59
DMT1 (Nramp)                        SCBFRZ2017F03.g        2116      263       0.0        385    1      11          108706772    Oryza sativa           0.0
Glutamine Synthetase (Gln-synt_N)   SCJFLR1013F02.g        1743      356       0.0        746    3      9           56681313     Saccharum officinarum  0.0
Leghemoglobin (Globin)              SCMCRZ3064B09.g        955       171       6.00e-32   132    3      2           125503242    Oryza sativa           3.00e-75
NOD26 (MIP)                         SCEPRZ1008D05.g        2134      317       1.00e-76   282    1      16          162458955    Zea mays               1.00e-131
NOD70 (Nodulin-like)                SCRFLB1055B10.g        1637      486       1.00e-129  457    2      15          28209525     Oryza sativa           5.00e-169
Sucrose Synthase (Sucrose_synth)    SCCCRZ1002G07.g        3056      816       0.0        1277   -1     13          3915873      Glycine max            0.0

GENERAL CONCLUSIONS

• The cowpea (NordEST/HarvEST) and sugarcane (SUCEST) transcriptomes contained representatives of all the nodulins studied.
• The presence and structure of nodulins in the cowpea transcriptome suggest that, during the establishment of biological nitrogen fixation, this legume uses mechanisms similar to those found in model legumes such as Lotus japonicus and Medicago truncatula.
• In sugarcane, the large number of sequences orthologous to nodulins indicates that these genes act in metabolic pathways other than nodulation, as they do in other non-legume plants. This suggests that the establishment of the nodular symbiosis recruited ancestral genes that were probably unrelated to the symbiotic pathway.
• The candidate cowpea nodulins were most similar to sequences from legumes, whereas the sugarcane candidates were most similar to genes from other monocots, in agreement with the evolutionary proximity of the organisms.
• Sequences from organisms of the same family tend to cluster together in the dendrograms, suggesting that nodulins are subject to selective pressures associated with divergent evolution.
• In cowpea, most transcripts were found in libraries from infected leaves and from roots under salt stress, whereas in sugarcane most reads came from flower and root libraries. This indicates that in angiosperms these genes are involved in processes other than biological nitrogen fixation.
• The sequences identified in this work are a valuable resource for developing molecular markers for the breeding of the studied species. They also provide a means of elucidating how these genes act in pathways other than nitrogen fixation, which will expand knowledge of symbiosis in economically important non-legume plants.

ANNEX

Instructions for authors