Our paper describes the first provably-efficient algorithm for determining protein structures de novo, solely from experimental data. We show how the global nature of a certain kind of NMR data provides quantifiable complexity-theoretic benefits, allowing us to classify our algorithm as running in polynomial time. While our algorithm uses NMR data as input, it is the first polynomial-time algorithm to compute high-resolution structures de novo using any experimentally-recorded data, from either NMR spectroscopy or X-Ray crystallography. Improved algorithms for protein structure determination are needed, because currently, the process is expensive and time-consuming. For example, an area of intense research in NMR methodology is automated assignment of nuclear Overhauser effect (NOE) restraints, in which structure determination sits in a tight inner-loop (cycle) of assignment/refinement. These algorithms are very time-consuming, and typically require a large cluster. Thus, algorithms for protein structure determination that are known to run in polynomial time and provide guarantees on solution accuracy are likely to have great impact in the long-term. Methods stemming from a technique called "distance geometry embedding" do come with provable guarantees, but the NP-hardness of these problem formulations implies that in the worst case these techniques cannot run in polynomial time. We are able to avoid the NP-hardness by (a) some mild assumptions about the protein being studied, (b) the use of residual dipolar couplings (RDCs) instead of a dense network of NOEs, and (c) novel algorithms and proofs that exploit the biophysical geometry of (a) and (b), drawing on a variety of computer science, computational geometry, and computational algebra techniques. In our algorithm, RDC data, which gives global restraints on the orientation of internuclear bond vectors, is used in conjunction with very sparse NOE data to obtain a polynomial-time algorithm for protein structure determination. An implementation of our algorithm has been applied to 6 different real biological NMR data sets recorded for 3 proteins. Our algorithm is combinatorially precise, polynomial-time, and uses much less NMR data to produce results that are as good or better than previous approaches in terms of accuracy of the computed structure as well as running time. In practice approaches such as restrained molecular dynamics and simulated annealing, which lack both combinatorial precision and guarantees on running time and solution quality, are commonly used. Our results show that by using a different "slice" of the data, an algorithm that is polynomial time and that has guarantees about solution quality can be obtained. We believe that our techniques can be extended and generalized for other structure-determination problems such as computing side-chain conformations and the structure of nucleic acids from experimental data.
Geometry of kinked protein helices from NMR data.
Geometry of kinked protein helices from NMR data.
Geometry of kinked protein helices from NMR data.
J Magn Reson. 2011 Feb 16;
Authors: Murray DT, Lu Y, Cross TA, Quine JR
Mathematical questions related to determining the structure of a protein from NMR orientational restraints are discussed. The protein segment is a kinked alpha helix modeled as a regular alpha helix in which two adjacent torsion angles have been varied from their ideal values. Varying these torsion angles breaks the helix into two regular helical segments joined at a kink. The...
nmrlearner
Journal club
0
03-23-2011 05:41 PM
Binding site identification and structure determination of protein-ligand complexes by NMR a semiautomated approach.
Binding site identification and structure determination of protein-ligand complexes by NMR a semiautomated approach.
Binding site identification and structure determination of protein-ligand complexes by NMR a semiautomated approach.
Methods Enzymol. 2011;493:241-75
Authors: Ziarek JJ, Peterson FC, Lytle BL, Volkman BF
Over the last 15years, the role of NMR spectroscopy in the lead identification and optimization stages of pharmaceutical drug discovery has steadily increased. NMR occupies a unique niche in the biophysical analysis of drug-like...
nmrlearner
Journal club
0
03-05-2011 01:02 PM
Geometry of kinked protein helices from NMR data
Geometry of kinked protein helices from NMR data
Publication year: 2011
Source: Journal of Magnetic Resonance, In Press, Accepted Manuscript, Available online 16 February 2011</br>
Dylan T., Murray , Yuanting, Lu , T.A., Cross , J.R., Quine</br>
Mathematical questions related to determining the structure of a protein from NMR orientational restraints are discussed. The protein segment is a kinked alpha helix modeled as a regular alpha helix in which two adjacent torsion angles have been varied from their ideal values. Varying these torsion angles breaks the helix into two regular...
nmrlearner
Journal club
0
02-17-2011 09:41 AM
[NMR paper] NMR data collection and analysis protocol for high-throughput protein structure determination.
NMR data collection and analysis protocol for high-throughput protein structure determination.
Related Articles NMR data collection and analysis protocol for high-throughput protein structure determination.
Proc Natl Acad Sci U S A. 2005 Jul 26;102(30):10487-92
Authors: Liu G, Shen Y, Atreya HS, Parish D, Shao Y, Sukumaran DK, Xiao R, Yee A, Lemak A, Bhattacharya A, Acton TA, Arrowsmith CH, Montelione GT, Szyperski T
A standardized protocol enabling rapid NMR data collection for high-quality protein structure determination is presented that...
nmrlearner
Journal club
0
12-01-2010 06:56 PM
[NMR paper] Protein structure elucidation from minimal NMR data: the CLOUDS approach.
Protein structure elucidation from minimal NMR data: the CLOUDS approach.
Related Articles Protein structure elucidation from minimal NMR data: the CLOUDS approach.
Methods Enzymol. 2005;394:261-95
Authors: Grishaev A, Llinás M
In this chapter we review automated methods of protein NMR data analysis and expand on the assignment-independent CLOUDS approach. As presented, given a set of reliable NOEs it is feasible to derive a spatial H-atom distribution that provides a low-resolution image of the protein structure. In order to generate such a...
nmrlearner
Journal club
0
11-24-2010 11:14 PM
[NMR paper] De novo protein structure determination using sparse NMR data.
De novo protein structure determination using sparse NMR data.
Related Articles De novo protein structure determination using sparse NMR data.
J Biomol NMR. 2000 Dec;18(4):311-8
Authors: Bowers PM, Strauss CE, Baker D
We describe a method for generating moderate to high-resolution protein structures using limited NMR data combined with the ab initio protein structure prediction method Rosetta. Peptide fragments are selected from proteins of known structure based on sequence similarity and consistency with chemical shift and NOE data. Models...
nmrlearner
Journal club
0
11-19-2010 08:29 PM
[NMR paper] An approach to global fold determination using limited NMR data from larger proteins
An approach to global fold determination using limited NMR data from larger proteins selectively protonated at specific residue types.
An approach to global fold determination using limited NMR data from larger proteins selectively protonated at specific residue types.
J Biomol NMR. 1996 Oct;8(3):360-8
Authors: Smith BO, Ito Y, Raine A, Teichmann S, Ben-Tovim L, Nietlispach D, Broadhurst RW, Terada T, Kelly M, Oschkinat H, Shibata T, Yokoyama S, Laue ED
A combination of calculation and experiment is used to demonstrate that the global fold of...