CNS scripts for water refinement of NMR-derived protein structures are available on the
RECOORD website.
Reference:
RECOORD: A recalculated coordinate database of 500+ proteins from the PDB using restraints from the BioMagResBank
Aart J. Nederveen 1, Jurgen F. Doreleijers 2, Wim Vranken 3, Zachary Miller 4, Chris A.E.M. Spronk 5, Sander B. Nabuurs 5, Peter Güntert 6, Miron Livny 4, John L. Markley 2, Michael Nilges 7, Eldon L. Ulrich 2, Robert Kaptein 1, Alexandre M.J.J. Bonvin 1 *
1 Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, The Netherlands
2 Center for Eukaryotic Structural Genomics, University of Wisconsin-Madison, Madison, Wisconsin, USA
3 Macromolecular Structure Database, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
4 Department of Computer Sciences, University of Wisconsin-Madison, Madison, Wisconsin, USA
5 Center for Molecular and Biomolecular Informatics, Radboud University, Nijmegen, The Netherlands
6 Tatsuo Miyazawa Memorial Program, RIKEN Genomic Sciences Center, 1-7-22 Suehiro-cho, Tsurumi, Yokohama, Japan
7 Unité de Bioinformatique Structurale, Institut Pasteur, Paris, France
email: Alexandre M.J.J. Bonvin (a.m.j.j.bonvin@chem.uu.nl)
*Correspondence to Alexandre M.J.J. Bonvin, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, The Netherlands
Abstract
State-of-the-art methods based on CNS and CYANA were used to recalculate the nuclear magnetic resonance (NMR) solution structures of 500+ proteins for which coordinates and NMR restraints are available from the Protein Data Bank. Curated restraints were obtained from the BioMagResBank FRED database. Although the original NMR structures were determined by various methods, they all were recalculated by CNS and CYANA and refined subsequently by restrained molecular dynamics (CNS) in a hydrated environment. We present an extensive analysis of the results, in terms of various quality indicators generated by PROCHECK and WHAT_CHECK. On average, the quality indicators for packing and Ramachandran appearance moved one standard deviation closer to the mean of the reference database. The structural quality of the recalculated structures is discussed in relation to various parameters, including number of restraints per residue, NOE completeness and positional root mean square deviation (RMSD). Correlations between pairs of these quality indicators were generally low; for example, there is a weak correlation between the number of restraints per residue and the Ramachandran appearance according to WHAT_CHECK (r = 0.31). The set of recalculated coordinates constitutes a unified database of protein structures in which potential user- and software-dependent biases have been kept as small as possible. The database can be used by the structural biology community for further development of calculation protocols, validation tools, structure-based statistical approaches and modeling. The RECOORD database of recalculated structures is publicly available from
http://www.ebi.ac.uk/msd/recoord.
Proteins 2005. © 2005 Wiley-Liss, Inc.