The story behind the previous cover images

whiteboard2

This is a draft of a flow chart for selecting the proteins used to train a Support Vector Machine. Choosing appropriate examples for training and benchmarking is a very important initial step in developing prediction methods in computational biology. In this case, we were working on a method to detect protein subcellular location based on amino acid composition and exposure. We therefore planned to start with all human protein sequences associated with genes in the NCBI Entrez database, then to take one sequence per human gene for which we could obtain functional (GO) information, including subcellular location, and finally to keep those whose structure is deposited in the PDB, provided that the solved structure had more than 150 amino acids. You can read the rest of the story in the related publication:

Mer, A.S. and M.A. Andrade-Navarro. 2013. A novel approach for protein subcellular location prediction using amino acid exposure. BMC Bioinformatics. 14:342. [NYCE] PubMed: 24283794
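
The selection steps from the flow chart can be sketched in a few lines of Python. This is only an illustration of the filtering logic described above, not the code used for the publication. The `ProteinRecord` fields, the function name, the identifiers in the example, and the rule of keeping the first annotated sequence seen for each gene are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class ProteinRecord:
    """Hypothetical record; field names are assumptions for this sketch."""
    protein_id: str
    gene_id: str
    location: Optional[str]   # GO subcellular location, None if not annotated
    pdb_id: Optional[str]     # None if no structure is deposited in the PDB
    solved_length: int = 0    # number of amino acids in the solved structure


def select_training_proteins(
    records: Iterable[ProteinRecord], min_solved_length: int = 150
) -> List[ProteinRecord]:
    """Apply the flow-chart filters in order and return the surviving records."""
    # Step 1: keep one sequence per gene, restricted to sequences with a
    # GO subcellular location. Assumption: the first one encountered wins.
    one_per_gene: Dict[str, ProteinRecord] = {}
    for rec in records:
        if rec.location is None:
            continue
        one_per_gene.setdefault(rec.gene_id, rec)

    # Step 2: keep those with a deposited structure of more than
    # min_solved_length amino acids.
    return [
        rec
        for rec in one_per_gene.values()
        if rec.pdb_id is not None and rec.solved_length > min_solved_length
    ]


# Made-up records for illustration only; none of these are real entries.
example = [
    ProteinRecord("P1", "G1", "nucleus", "xxx1", 200),
    ProteinRecord("P2", "G1", "nucleus", "xxx2", 300),    # second sequence of gene G1
    ProteinRecord("P3", "G2", None, "xxx3", 400),         # no GO location
    ProteinRecord("P4", "G3", "cytoplasm", None, 0),      # no structure
    ProteinRecord("P5", "G4", "membrane", "xxx4", 150),   # not more than 150
    ProteinRecord("P6", "G5", "mitochondrion", "xxx5", 151),
]
selected = select_training_proteins(example)
print([rec.protein_id for rec in selected])
```

Because the per-gene choice is made before the structure filter, a gene whose chosen sequence has no suitable structure is dropped even if another of its sequences has one. That follows the order of the steps as described here; a real pipeline might order them differently.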

 

whiteboard1

The diagram displays simplified representations of protein domains with tandem repeats that were detected using a neural network. Sequence similarity analyses of these domains suggest a complex story of duplications and rearrangements of protein fragments. Such duplications and rearrangements are mechanisms for increasing protein variability and function. Read more about this in the related publication:

Palidwor, G.A., S. Shcherbinin, M.R. Huska, T. Rasko, U. Stelzl, A. Arumughan, R. Foulle, P. Porras, L. Sanchez-Pulido, E.E. Wanker and M.A. Andrade-Navarro. 2009. Detection of alpha-rod repeats using a neural network and application to huntingtin. PLoS Comput. Biol. 5:e1000304. [ARD] PubMed: 19282972