Prediction of protein secondary structure from CD spectra

We applied a Kohonen self-organising map to classify circular dichroism (CD) spectra of proteins of known three-dimensional structure. After optimisation of the network parameters [1], the resulting maps clearly ordered the spectra according to the secondary structure content of the proteins (helix, sheet and random conformations).
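As an illustration of how a Kohonen map orders spectra, the following is a minimal sketch in Python with NumPy. It is not the original K2D code: the grid size, learning-rate schedule, neighbourhood width and the function names `train_som` and `best_matching_unit` are assumptions chosen for the example, and the input curves are synthetic toy shapes, not measured CD spectra.

```python
import numpy as np

def train_som(spectra, grid=(4, 4), epochs=200, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small Kohonen map; returns weights of shape (rows*cols, n_points)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid position of every neuron, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise each neuron with a randomly chosen training spectrum.
    weights = spectra[rng.integers(0, len(spectra), size=rows * cols)].astype(float)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood shrinks, floor at 0.5
        for x in spectra[rng.permutation(len(spectra))]:
            # Best-matching unit: the neuron whose weights are closest to x.
            bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
            # Gaussian neighbourhood centred on the BMU in grid space.
            d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            # Pull the BMU and its neighbours towards the presented spectrum.
            weights += lr * h[:, None] * (x - weights)
    return weights

def best_matching_unit(weights, spectrum):
    """Index of the neuron closest to the given spectrum."""
    return int(np.argmin(np.sum((weights - spectrum) ** 2, axis=1)))

# Toy data (assumption): two families of synthetic curves sampled at 1 nm steps.
wavelengths = np.arange(200, 241)
noise_rng = np.random.default_rng(1)
shape_a = -(np.exp(-((wavelengths - 208) / 6.0) ** 2)
            + np.exp(-((wavelengths - 222) / 6.0) ** 2))
shape_b = -0.5 * np.exp(-((wavelengths - 217) / 8.0) ** 2)
toy = np.vstack([
    shape_a + 0.05 * noise_rng.normal(size=(20, wavelengths.size)),
    shape_b + 0.05 * noise_rng.normal(size=(20, wavelengths.size)),
])

weights = train_som(toy)
bmu_a = best_matching_unit(weights, shape_a)
bmu_b = best_matching_unit(weights, shape_b)
print("neuron for family A:", bmu_a, "| neuron for family B:", bmu_b)
```

With two well-separated families of curves, the two noise-free shapes should end up on different neurons, which is the kind of ordering the maps are described as producing for real spectra.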

The maps were then evaluated to assign percentages of secondary structure to each neuron of the network [2]. Once evaluated, a map can be used to predict percentages of secondary structure from the spectra of proteins of unknown three-dimensional structure. Different evaluation methods were tested to find the most accurate one. The method also gives an indication of the accuracy of each individual prediction. The program and information about its use were available at the k2d site between 2000 and 2014.
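The evaluation and prediction steps can be sketched as follows. This is only one plausible evaluation scheme, assumed for illustration: each neuron receives the mean secondary structure fractions of the training spectra it wins, and a neuron that wins none copies the values of the spectrally nearest labelled neuron. The text above does not say which of the tested schemes was adopted. Using the distance to the winning neuron as a rough reliability indicator is likewise an assumption, and the codebook, spectra and fractions are made-up toy numbers.

```python
import numpy as np

def bmu_index(weights, spectrum):
    """Index of the neuron whose weight vector is closest to the spectrum."""
    return int(np.argmin(np.linalg.norm(weights - spectrum, axis=1)))

def evaluate_map(weights, spectra, fractions):
    """Assign (helix, sheet, random) fractions to every neuron of a trained map."""
    sums = np.zeros((len(weights), fractions.shape[1]))
    counts = np.zeros(len(weights))
    # Accumulate the known fractions on the neuron that wins each training spectrum.
    for x, f in zip(spectra, fractions):
        k = bmu_index(weights, x)
        sums[k] += f
        counts[k] += 1
    labelled = np.flatnonzero(counts > 0)
    node_fractions = np.zeros_like(sums)
    node_fractions[labelled] = sums[labelled] / counts[labelled][:, None]
    # Neurons that won nothing copy the values of the closest labelled neuron.
    for k in np.flatnonzero(counts == 0):
        dists = np.linalg.norm(weights[labelled] - weights[k], axis=1)
        node_fractions[k] = node_fractions[labelled[np.argmin(dists)]]
    return node_fractions

def predict(weights, node_fractions, spectrum):
    """Return the fractions of the winning neuron and the distance to it."""
    k = bmu_index(weights, spectrum)
    distance = float(np.linalg.norm(weights[k] - spectrum))
    return node_fractions[k], distance

# Toy codebook (assumption): 3 neurons, 4 spectral points each.
codebook = np.array([[0.0, 0.0, 0.0, 0.0],
                     [0.8, 0.8, 0.8, 0.8],
                     [2.0, 2.0, 2.0, 2.0]])
# Toy training spectra with made-up (helix, sheet, random) fractions.
train_spectra = np.array([[0.1, 0.0, 0.0, 0.0],
                          [0.0, 0.1, 0.0, 0.0],
                          [2.0, 2.0, 2.0, 1.9]])
train_fractions = np.array([[0.6, 0.1, 0.3],
                            [0.4, 0.3, 0.3],
                            [0.1, 0.5, 0.4]])

node_fractions = evaluate_map(codebook, train_spectra, train_fractions)
predicted, distance = predict(codebook, node_fractions, np.array([1.9, 2.0, 2.0, 2.0]))
print("predicted fractions:", predicted, "| distance to winning neuron:", round(distance, 3))
```

In this sketch a large distance to the winning neuron means the query spectrum resembles nothing on the map, so the prediction deserves less trust.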

In 2008 we updated the method (K2D2; [3]) to use an alternative CD wavelength range and, most importantly, to include additional protein spectra published in the meantime. In 2011 we developed an alternative method (K2D3; [4]) that derives theoretical CD spectra from known protein structures to generate a large training dataset. This method improves performance in the 200-240 nm range, especially for the prediction of beta-strand content.

References

[1] Merelo, J.J., M.A. Andrade, A. Prieto and F. Morán. 1994. Proteinotopic Feature Maps. Neurocomputing. 6, 443-454.

[2] Andrade, M.A., P. Chacón, J.J. Merelo and F. Morán. 1993. Evaluation of secondary structure of proteins from UV circular dichroism using an unsupervised learning neural network. Protein Engineering. 6, 383-390.

[3] Perez-Iratxeta, C. and M.A. Andrade-Navarro. 2008. K2D2: Estimation of protein secondary structure from CD spectra. BMC Structural Biology. 8, 25.

[4] Louis-Jeune, C., M.A. Andrade-Navarro and C. Perez-Iratxeta. 2011. Prediction of protein secondary structure from circular dichroism using theoretically derived spectra. Proteins. 80, 374-381.