Delineation of Protein Structure Classes from Multivariate Analysis of Protein Raman Optical Activity Data
Received 2 June 2006; revised 15 August 2006; accepted 15 August 2006. Edited by P. Wright. Available online 22 August 2006.
Fujiang Zhu1, George E. Tranter2, Neil W. Isaacs1, Lutz Hecht1 and Laurence D. Barron1
Journal of Molecular Biology
Volume 363, Issue 1, 13 October 2006
ScienceDirect
Copyright © 2006 Elsevier Ltd. All rights reserved.
1WestCHEM, Department of Chemistry, University of Glasgow, Glasgow G12 8QQ, UK
2Biological Chemistry, Division of Biomedical Sciences, Imperial College, London SW7 2AZ, UK
Abstract
Vibrational Raman optical activity (ROA), measured as a small difference in the intensity of Raman scattering from chiral molecules in right and left-circularly polarized incident light, or as the intensity of a small circularly polarized component in the scattered light, is a powerful probe of the aqueous solution structure of proteins. On account of the large number of structure-sensitive bands in protein ROA spectra, multivariate analysis techniques such as non-linear mapping (NLM) are especially favourable for determining structural relationships between different proteins. Here NLM is used to map a dataset of 80 polypeptide, protein and virus ROA spectra, considered as points in a multidimensional space with axes representing the digitized wavenumbers, into readily visualizable two and three-dimensional spaces in which points close to or distant from each other, respectively, represent similar or dissimilar structures. Discrete clusters are observed which correspond to the seven structure classes all α, mainly α, αβ, mainly β, all β, mainly disordered/irregular and all disordered/irregular. The average standardised ROA spectra of the proteins falling within each structure class have distinct features characteristic of each class. A distinct cluster containing the wheat protein A-gliadin and the plant viruses potato virus X, narcissus mosaic virus, papaya mosaic virus and tobacco rattle virus, all of which appear in the mainly α cluster in the two-dimensional representation, becomes clearly separated in the direction of increasing disorder in the three-dimensional representation. This suggests that the corresponding five proteins, none of which to date has yielded high-resolution X-ray structures, consist mainly of α-helix and disordered structure with little or no β-sheet.
This combination of structural elements may have functional significance, such as facilitating disorder-to-order transitions (and vice versa) and suppressing aggregation, in these proteins and also in sequences within other proteins. The use of ROA to identify proteins containing significant amounts of disordered structure will, inter alia, be valuable in structural genomics/proteomics since disordered regions often inhibit crystallization.
Keywords: Raman optical activity; multivariate analysis; non-linear mapping; natively unfolded proteins; disordered structure
Abbreviations: Agli, A-gliadin; NMV, narcissus mosaic virus; PMV, papaya mosaic virus; PPII, poly(L-proline) II helix; PVX, potato virus X; ROA, Raman optical activity; TMV, tobacco mosaic virus; TRV, tobacco rattle virus; UVCD, ultraviolet circular dichroism; VCD, vibrational circular dichroism; NLM, non-linear mapping; SCP, scattered circular polarization; ICP, incident circular polarization
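The abstract describes NLM as mapping spectra, treated as points in a multidimensional space, into two or three dimensions so that the distances between points are preserved as well as possible. The abstract does not say which NLM algorithm was used. The sketch below assumes a Sammon-style stress minimisation, which is one common form of non-linear mapping. The function names, the principal-component starting configuration, the adaptive step-size rule and the synthetic input are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sammon_stress(d_high, d_low, eps=1e-12):
    """Sammon stress: weighted mismatch between original and mapped distances."""
    upper = np.triu_indices_from(d_high, k=1)
    dh, dl = d_high[upper], d_low[upper]
    return float(((dh - dl) ** 2 / (dh + eps)).sum() / dh.sum())


def nonlinear_map(spectra, n_dims=2, n_iter=300, step=1.0, eps=1e-12):
    """Map rows of `spectra` (one digitized spectrum per row) into `n_dims` dimensions.

    Hypothetical helper: gradient descent on the Sammon stress, accepting a
    step only when it lowers the stress. Assumes at least two distinct rows.
    """
    x = np.asarray(spectra, dtype=float)
    n = x.shape[0]
    d_high = pairwise_distances(x)
    scale = d_high[np.triu_indices(n, k=1)].sum()

    # Assumed starting configuration: projection onto leading principal components.
    centred = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    y = u[:, :n_dims] * s[:n_dims]
    stress = sammon_stress(d_high, pairwise_distances(y))

    # Guard against division by zero (diagonal entries and coincident points).
    guard = np.eye(n) + eps
    for _ in range(n_iter):
        d_low = pairwise_distances(y)
        w = (d_high - d_low) / ((d_high + guard) * (d_low + guard))
        # Gradient of the stress with respect to each mapped point.
        grad = (-2.0 / scale) * (w.sum(axis=1)[:, None] * y - w @ y)
        trial = y - step * grad
        trial_stress = sammon_stress(d_high, pairwise_distances(trial))
        if trial_stress < stress:
            y, stress = trial, trial_stress
            step *= 1.2  # cautiously lengthen the step after a success
        else:
            step *= 0.5  # shorten the step and retry from the same point
    return y, stress


# Synthetic stand-in for digitized spectra (not real ROA data):
# two groups of 10 "spectra", each sampled at 50 wavenumbers.
rng = np.random.default_rng(0)
group_a = 1.0 + rng.normal(0.0, 0.1, size=(10, 50))
group_b = -1.0 + rng.normal(0.0, 0.1, size=(10, 50))
coords, stress = nonlinear_map(np.vstack([group_a, group_b]), n_dims=2)
print(coords.shape, round(stress, 4))
```

Each row of `coords` is one spectrum's position in the reduced space, so spectra with similar band patterns should land near one another. Passing `n_dims=3` gives the three-dimensional analogue. Because a step is accepted only when it reduces the stress, the returned stress is never worse than that of the starting projection.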