Protein domain analysis

We wrote a review about the importance of considering domains for the study of protein evolution and function [1].

Functional characterization

We discovered remote sequence homology between a domain present in a number of human proteins involved in cancer and apoptosis and a bacterial domain that seems to be related to redox processes [2]. This finding could help to explain some disease processes in terms of redox processes.

We characterized the remote sequence homology between a hypothetical protein, whose structure was solved as part of a structural genomics project, and the human Shwachman-Bodian-Diamond syndrome (SBDS) protein. These and other computational and experimental tests led us to propose that SBDS forms part of a large and ancient family with an RNA metabolic function [3].

This review discusses the role of proteins of the BCL-2 family in the regulation of mitochondrial apoptosis [4]. Based on sequence analysis and threading, we propose that Mtch2 (one of the members of the family), which was first predicted to have three transmembrane (TM) domains, has a typical mitochondrial carrier structure (six TM domains connected by hydrophilic loops).

Protein SUMOylation is a post-translational modification and one of many mechanisms used to regulate mitochondrial morphological changes. To study the role of SUMO proteases in mitochondrial dynamics, we screened for those that could cleave SUMO1 [5]. Computational database analysis identified a family of SUMO proteases, of which SENP2 and SENP5 had features suggesting that they are targeted to mitochondria. Accordingly, overexpression of SENP5 rescued SUMO1-induced mitochondrial fragmentation, whereas its silencing produced mitochondrial fragmentation and other metabolic effects, indicating the importance of SENP5 as a regulator of mitochondrial morphology and metabolism.

Genetic variation of the human FTO gene was associated with obesity by two separate groups in 2007. These initial reports indicated that its protein product lacked homology to sequences of known function. We used computational analysis to study this gene and its orthologs in other species [6]. We identified the FTO family as containing an N-terminal domain with homology to the non-heme dioxygenase (Fe(II)- and 2-oxoglutarate-dependent dioxygenases) superfamily (including the Hypoxia Inducible Factor), and characterized features of the human FTO protein such as the amino acids involved in cofactor (Fe) and co-substrate (2-oxoglutarate) binding, and a potential bipartite nuclear localization signal that is conserved in part of the family.
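As an illustration of how a candidate bipartite nuclear localization signal can be spotted in a sequence, the sketch below scans for the classical consensus: two basic residues, a spacer of about 10 to 12 residues, and a five-residue window with at least three basic residues. This is not the method used in [6]; the function name, the spacer range and the thresholds are assumptions chosen for the example, and real predictions would need dedicated tools and conservation analysis.

```python
BASIC = set("KR")  # lysine and arginine


def find_bipartite_nls(seq, spacer_range=(10, 12), window=5, min_basic=3):
    """Return (start, end, motif) for each candidate bipartite NLS.

    Hypothetical helper: looks for two consecutive basic residues, a spacer
    of spacer_range residues, then a window with >= min_basic basic residues.
    Coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 1):
        # First basic cluster: two consecutive K/R residues.
        if seq[i] not in BASIC or seq[i + 1] not in BASIC:
            continue
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            w_start = i + 2 + spacer
            w_end = w_start + window
            if w_end > len(seq):
                break  # longer spacers would also run past the end
            n_basic = sum(c in BASIC for c in seq[w_start:w_end])
            if n_basic >= min_basic:
                hits.append((i, w_end, seq[i:w_end]))
                break  # keep only the shortest spacer per start position
    return hits


# Example: a fragment around the well-known nucleoplasmin bipartite NLS.
for start, end, motif in find_bipartite_nls("AVKRPAATKKAGQAKKKKLD"):
    print(start, end, motif)
```

A simple pattern scan like this yields many false positives in basic-rich proteins, which is why conservation across orthologs, as described above, is a useful additional criterion.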

To search for proteins regulating mitochondrial morphology, we did a computational screen for protein sequences predicted to be mitochondrial and to contain RING domains. We characterized one of them, which we named MAPL (mitochondrial-anchored protein ligase) and which is located in the mitochondrial outer membrane [7]. Imaging of MAPL-YFP led to the first observation of mitochondria-derived vesicles, which fuse with peroxisomes. The protein contains a domain that we named BAM (Besides a Membrane) [8], which has a complicated taxonomic distribution: it is scattered in archaea and bacteria, and present in most eukarya (but not in fungi). We deduce that this domain originated in the eukaryotic lineage and has been horizontally transferred multiple times to and between prokaryotic lineages. It presumably has a function that gives prokaryotes a selective advantage without being essential.
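The screen described at the start of the paragraph above combines two independent predictions per protein: subcellular localization and domain content. A minimal sketch of that intersection step follows; the record type, field names and example accessions are hypothetical, and in practice the annotations would come from localization predictors and domain databases rather than being typed in by hand.

```python
from dataclasses import dataclass, field


@dataclass
class ProteinAnnotation:
    """Hypothetical record holding precomputed predictions for one protein."""
    accession: str
    predicted_location: str
    domains: set = field(default_factory=set)


def screen_candidates(proteins, location="mitochondrion", domain="RING"):
    """Return accessions predicted at `location` that contain `domain`."""
    return [
        p.accession
        for p in proteins
        if p.predicted_location == location and domain in p.domains
    ]


# Made-up example records, for illustration only.
records = [
    ProteinAnnotation("PROT_A", "mitochondrion", {"RING", "TM"}),
    ProteinAnnotation("PROT_B", "nucleus", {"RING"}),
    ProteinAnnotation("PROT_C", "mitochondrion", {"kinase"}),
]
print(screen_candidates(records))
```

Only candidates passing both filters are carried forward to experimental characterization.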

The plant-specific protein KCBP (kinesin-like calmodulin-binding protein) is involved in cell division. Its function is related to its unique domain architecture. The N-terminal MyTH4 domain, followed by a FERM domain, is similar to that of human myosin. This is followed by a coiled-coil region and a C-terminal kinesin motor domain. We found that the MyTH4-FERM part of KCBP links the motor domain to the site of cell division [9].

Evolutionary aspects

We studied the family of Lsg1-like proteins, which are found in microsporidia, yeast, plants, insects, and vertebrates [10]. They contain a central GTPase domain belonging to a catalytically undefined GTPase family: MMR/HSR1. The family has a pattern of expansion from yeast to vertebrates that correlates with the formation of cellular compartments. We characterized the GTPase function of the human protein, showing that it shuttles between nucleus, cytosol and endoplasmic reticulum. These proteins could potentially be involved in rRNA maturation. By searching for homologs of these proteins in complete genomes, we identified ten subfamilies of this family, YRG (YlqF-related GTPases), which are associated with six subcellular compartments (nuclear bodies, nucleolus, nucleus, cytosol, mitochondria, and chloroplast) and can be found in archaeal, bacterial and eukaryotic proteomes [11]. The YRG family helps to recapitulate the evolution of compartmentalization.

Searching for homologs of the zebrafish proteins actinodin 1 and 2 (And1 and And2), we identified a family of four genes (sharing an N-terminal domain and small scattered repeats) with orthologs in teleost fish but not in tetrapods. The existence of a homolog in the elephant shark, which diverged from the lineage leading to both teleost fish and tetrapods, suggests the loss of these genes along the tetrapod lineage. Knock-down of the genes and1 and and2 impairs fin formation and regeneration, and hints at a relation between the evolutionary loss of these genes and the emergence of limbs with fingers in tetrapods [12].

In a review, we put the physiological and molecular data related to the renin-angiotensin-aldosterone system (RAAS) in evolutionary perspective [13]. We describe that, while most RAAS-related genes appeared around 400 million years ago, which agrees with the physiological evidence, some existed before organisms were using renin, suggesting that these genes have other, more ancestral functions and that the RAAS was built on top of existing machinery. One of the members of the RAAS, Mas, was extensively duplicated in tetrapods, probably accounting for the evolution of the perception of itch and pain in sensory neurons. We discussed this family (the Mas-related G protein-coupled receptors, Mrgprs) in another review [14].

In a review, we describe the evolution of the RNA interference (RNAi)-like pathways in nematodes (including Caenorhabditis elegans), which have evolved rapidly and are greatly expanded in comparison to those of humans or of model organisms such as Drosophila [15]. In particular, the Argonaute family includes an entire nematode-specific subfamily (WAGOs) with 13 proteins that interact with 22G-RNAs. The other two subfamilies are Ago and Piwi. While the proteins in this family have a MID, a PAZ and a PIWI domain, the nematode ERGO-1 protein of the Ago subfamily has a very divergent (or different) MID domain unique to nematodes. This is one example of the distinct expansions and evolution of this family in nematodes. Together with other findings in arthropods and mollusks, this hints at an ancestral state in which metazoans used RNAi-like pathways in somatic tissues, a use that was eventually restricted to the germline in the lineage leading to mammals.

Structural aspects

The PAZ domain is present in three protein families: Dicer, Argonaute and Piwi. In Piwi, the PAZ domain recognizes the 3' end of a specific type of small RNA named Piwi-interacting RNA (piRNA), which is always modified with a 2'-O-methyl group. This paper [16] presents the structure of the PAZ domain of murine Piwi in complex with a piRNA and discusses the evolution of the PAZ domain to interact specifically with the 3' ends of different RNA types.

Human LRP2 is a receptor with functions both in the patterning of the embryonic brain and in the adult kidney. This work demonstrates that the zebrafish ortholog is involved in kidney function but not in brain development [17]. LRP2B, an LRP2 homolog specific to fish, does not seem to be related to either development or kidney function.

We evaluated a number of novel mutations of the human cardiac alpha-myosin (MYH6), which result in sarcomeric disease and congenital heart defects [18]. Using protein structure information, we could conclude that some of these mutations are located at the interface of MYH6 with actin, which explains their importance.



[1] Ponting, C.P., J. Schultz, R.R. Copley, M.A. Andrade and P. Bork. 2000. Evolution of domain families. Contribution to "Analysis of amino acid sequences" in Advances in Protein Chemistry. 54, 185-244.

[2] Sánchez-Pulido, L., A.M. Rojas, A. Valencia, C. Martinez-A and M.A. Andrade. 2004. ACRATA: A novel electron transfer domain associated to apoptosis and cancer. BMC Cancer. 4, 98.

[3] Savchenko, A., N. Krogan, J.R. Cort, E. Evdokimova, J.M. Lew, A.A. Yee, L. Sánchez-Pulido, M.A. Andrade, A. Bochkarev, J.D. Watson, M.A. Kennedy, J. Greenblatt, T. Hughes, C.H. Arrowsmith, J.M. Rommens and A.M. Edwards. 2005. The Shwachman-Bodian-Diamond syndrome protein family is involved in RNA metabolism. Journal of Biological Chemistry. 280, 19213-19220.

[4] Schwarz, M., M.A. Andrade-Navarro and A. Gross. 2007. Mitochondrial carriers and pores: Key regulators of the mitochondrial apoptotic program? Apoptosis. 12, 869-876.

[5] Zunino, R., P. Rippstein, M. Andrade-Navarro and H.M. McBride. 2007. The SUMO protease SENP5 is required to maintain mitochondrial morphology and function. Journal of Cell Science. 120, 1178-1188.

[6] Sanchez-Pulido, L. and M.A. Andrade-Navarro. 2007. The FTO (fat mass and obesity associated) gene codes for a novel member of the non-heme dioxygenase superfamily. BMC Biochemistry. 8, 23.

[7] Neuspiel, M., A.C. Schauss, E. Braschi, R. Zunino, P. Rippstein, R.A. Rachubinski, M.A. Andrade-Navarro and H.M. McBride. 2008. Cargo-selected transport from the mitochondria to peroxisomes is mediated by vesicular carriers. Current Biology. 18, 102-108.

[8] Andrade-Navarro, M.A., L. Sanchez-Pulido and H.M. McBride. 2009. Mitochondrial vesicles: an ancient process providing new links to peroxisomes. Current Opinion in Cell Biology. 21, 560-567.

[9] Buschmann, H., J. Dols, S. Kopischke, E.J. Peña, M.A. Andrade-Navarro, S. Zachgo, D.B. Szymanski, M. Heinlein, J.H. Doonan and C.W. Lloyd. 2015. Arabidopsis KCBP interacts with AIR9 but stays in the cortical division zone throughout mitosis via its MyTH4-FERM domain. Journal of Cell Science. 128, 2033-2046.

[10] Reynaud, E.G., M.A. Andrade, F. Bonneau, T.B. Ly, M. Knop, K. Scheffzek and R. Pepperkok. 2005. Human Lsg1 defines a family of essential GTPases that correlates with the evolution of compartmentalization. BMC Biology. 3, 21.

[11] Mier, P., A.J. Pérez-Pulido, E.G. Reynaud and M.A. Andrade-Navarro. 2017. Reading the evolution of compartmentalization in the ribosome assembly toolbox: the YRG protein family. PLoS One. 12, e0169750.

[12] Zhang, J., P. Wagh, D. Guay, L. Sanchez-Pulido, B.K. Padhi, V. Korzh, M.A. Andrade-Navarro and M. Akimenko. 2010. Loss of fish actinotrichia proteins and the fin-to-limb transition. Nature. 466, 234-237.

[13] Fournier, D., F.C. Luft, M. Bader, D. Ganten and M.A. Andrade-Navarro. 2012. Emergence and evolution of the renin-angiotensin-aldosterone system. Journal of Molecular Medicine. 90, 495-508.

[14] Bader, M., N. Alenina, M.A. Andrade-Navarro and R.A. Santos. 2014. Mas and its related G protein-coupled receptors, Mrgprs. Pharmacological Reviews. 66, 1080-1105.

[15] Almeida, M.V., M.A. Andrade-Navarro and R.F. Ketting. 2019. Function and evolution of nematode RNAi pathways. Non-coding RNA. 5, 8.

[16] Simon, B., J.P. Kirkpatrick, S. Eckhardt, M. Reuter, E.A. Rocha, M.A. Andrade-Navarro, P. Sehr, R.S. Pillai and T. Carlomagno. 2011. Recognition of 2'-O-methylated 3'-end of piRNA by the PAZ domain of a piwi protein. Structure. 19, 172-180.

[17] Kur, E., A. Christa, K.N. Veth, C.R. Gajera, M.A. Andrade-Navarro, J. Zhang, J.R. Willer, R.G. Gregg, S. Abdelilah-Seyfried, S. Bachmann, B.A. Link, A. Hammes and T.E. Willnow. 2011. Loss of Lrp2 in zebrafish disrupts pronephric tubular clearance but not forebrain development. Developmental Dynamics. 240, 1567-1577.

[18] Posch, M.G., S. Waldmuller, M. Muller, T. Scheffold, D. Fournier, M.A. Andrade-Navarro, B. De Geeter, S. Giullaumont, C. Dauphin, D. Zousseff, K.R.L. Schmitt, A. Perrot, F. Berger, R. Hetzer, P. Bouvagnet and C. Özcelik. 2011. Cardiac alpha-myosin (MYH6) is the predominant sarcomeric disease gene for familial atrial septal defects. PLoS One. 6, e28872.