The CONNJUR Project is developing an open source integration environment for biomolecular NMR data analysis. CONNJUR software is a workflow generator, based on legacy NMR analysis applications, that is being developed as Open Source Software -- it is perpetually free for anyone to use, modify and distribute.
CONNJUR is developed by a community of NMR spectroscopists and scientific programmers who aim to create and maintain high-quality NMR analysis tools that are free of charge, open source, and built to the highest standards. The workflow for modern biomolecular NMR spectroscopy consists of three phases: spectral reconstruction, the process of converting time domain data into the frequency domain; spectral analysis, which includes peak identification and resonance assignment; and biophysical characterization, which includes all subsequent data analysis in which the spectroscopic data are used to draw biophysical inferences (such as structure determination). Despite the simplicity of this overview, the actual processing workflow for biomolecular NMR is intricate, requiring the use of dozens of software tools. Each of these tools has its own data format, which in turn requires format converters (usually Perl scripts) to provide the interface between tools. The remaining data management is left in the hands of the spectroscopist, who is solely responsible for ensuring data integrity between the various phases and sub-phases of data analysis.
NMR Data Analysis Workflow Without Integration: Data analysis proceeds through several steps using various NMR processing tools. Key data must be shuttled from one tool to another, requiring format conversion for most steps. Additional information is retrieved from external databases. At all processing stages, critical data is often stored in paper copy, reducing the efficiency of archival retrieval and increasing the error rate in NMR analysis.
CONNJUR: An Open Source Software Solution
The complicated nature of NMR data management can be alleviated through the use of a common data store, preferably a relational database whose management system can guarantee relational integrity between the various pieces of data derived at different steps along the workflow. Effective use of a relational database depends on a data model, and at least three research groups are developing one [1-3]. However, once such a data model and database exist, the unavoidable issue becomes how to use them. One solution is to redevelop all of the existing software tools so that they store and retrieve data in the database [1]. Another solution is to provide a workbench environment with interfaces that allow existing tools to integrate with the relational database backend. The latter solution is desirable in that it implicitly supports legacy applications, removing the requirement that the user learn the operation of additional software tools. It also provides a static framework upon which dynamic tools can be developed, implemented, and optimized.
NMR Data Analysis Using CONNJUR's Integration Environment: Data analysis still proceeds through several steps using the various NMR processing tools. However, in contrast with the above workflow, intermediate storage maximizes the efficiency of archival retrieval, decreases the error-rate, and leverages the database's built-in functionality for format conversion for data exchange.
This website describes such an integrated workbench environment, called CONNJUR, as an open source initiative for the biomolecular NMR community. CONNJUR employs a standard three-tier architecture, composed of a relational database back-end, an application layer for wrapping existing NMR processing tools, and a front-end user interface. CONNJUR is coded in Java to support the contributions of a wide range of developers and capitalizes on other open source software for its development (e.g. using MySQL as the relational database). The purpose of CONNJUR is to provide a workbench environment from which most NMR data processing can be coordinated. Integrating NMR software tools with a common relational database will ensure data integrity, provide user guidance, and improve efficiency compared with using the available software tools independently.
CONNJUR Architecture: Patterned on the standard three-tier architecture, CONNJUR is built of three main layers: (1) an interface for user interaction; (2) a middle layer, which wraps the third-party software tools and their business logic and provides communication with the third layer; and (3) a relational database management system. This separation of layers allows alterations to one portion (for instance, changing the database application) without extensive code revisions to the other layers. The boxes represent 'actors', modular computational units that transparently invoke third-party software to accomplish discrete conceptual tasks. The loose coupling between actors allows NMR spectroscopists to code and implement their own actors with little (or no) knowledge of the application code as a whole. This will, in turn, facilitate the rapid development of additional functionality for CONNJUR.
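The loose coupling between actors can be illustrated with a minimal Java sketch. Everything named here is hypothetical and chosen only for illustration: the `Actor` interface, the two example actors, the `Workbench` runner, and the in-memory `Map` that stands in for the relational database. None of it is taken from the CONNJUR code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical contract: an actor performs one discrete conceptual task,
// reading its inputs from and writing its results to a shared data store.
interface Actor {
    String name();
    void run(Map<String, Object> store);
}

// Stand-in for an actor that would wrap a third-party reconstruction tool.
class ReconstructionActor implements Actor {
    public String name() { return "reconstruct"; }

    public void run(Map<String, Object> store) {
        double[] fid = (double[]) store.get("fid");
        // A real actor would invoke the external tool here; copying the
        // array is only a placeholder for that call.
        store.put("spectrum", fid.clone());
    }
}

// Stand-in for a peak picker: records the indices of simple local maxima.
class PeakPickingActor implements Actor {
    public String name() { return "pick-peaks"; }

    public void run(Map<String, Object> store) {
        double[] spectrum = (double[]) store.get("spectrum");
        List<Integer> peaks = new ArrayList<>();
        for (int i = 1; i < spectrum.length - 1; i++) {
            if (spectrum[i] > spectrum[i - 1] && spectrum[i] > spectrum[i + 1]) {
                peaks.add(i);
            }
        }
        store.put("peaks", peaks);
    }
}

// The workbench depends only on the Actor interface, never on concrete classes.
class Workbench {
    static Map<String, Object> runAll(List<Actor> actors, Map<String, Object> store) {
        for (Actor actor : actors) {
            actor.run(store);
        }
        return store;
    }
}

class ActorSketch {
    public static void main(String[] args) {
        Map<String, Object> store = new HashMap<>();
        store.put("fid", new double[] {0.0, 2.0, 1.0, 3.0, 0.5});
        List<Actor> actors = Arrays.asList(new ReconstructionActor(), new PeakPickingActor());
        Workbench.runAll(actors, store);
        System.out.println("peak indices: " + store.get("peaks"));
    }
}
```

Because the runner sees only the interface, a spectroscopist could add a new actor by writing one class, without reading the rest of the application code.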
Enhanced Communication
When using GUI-driven software for spectral reconstruction, the only mechanism for sharing processing strategies is to describe the procedure stepwise, including all of the logic behind the order of subprocesses and their configuration. This is equivalent to hands-on training. Script-based software allows a more sophisticated level of communication, as researchers can share their processing scripts [7], in which the author's logic is hardcoded. Unfortunately, although the logic for any one script is built in, extensive documentation is required to adjust the script for different types of spectra. A script generator (http://sbtools.uchc.edu) can further encapsulate the logic of spectral reconstruction, but it suffers from the compromise between flexibility and usability: a script generator with too many options ceases to be useful.
CONNJUR provides an ideal environment for communication. Business logic for spectral reconstruction can be coded into the actors at any level of detail desired, such that the processing workflow is constructed independently of spectral details such as filename, number of points, and order of dimensions. As the processing workflows are modeled and stored inside the CONNJUR relational database, supporting their import and export as XML should be a straightforward future development.
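As a rough illustration of a workflow that is defined independently of spectral details and can be exported as XML, consider the Java sketch below. The class names (`SpectrumInfo`, `ProcessingWorkflow`), the operation names, and the XML element and attribute names are all assumptions made for this example; they do not reflect an actual CONNJUR schema or export format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-spectrum details, supplied only when a workflow is applied.
class SpectrumInfo {
    final String filename;
    final int points;

    SpectrumInfo(String filename, int points) {
        this.filename = filename;
        this.points = points;
    }
}

// A processing workflow described without reference to any particular spectrum.
class ProcessingWorkflow {
    private final String name;
    private final List<String[]> steps = new ArrayList<>(); // {operation, parameter}

    ProcessingWorkflow(String name) {
        this.name = name;
    }

    ProcessingWorkflow add(String operation, String parameter) {
        steps.add(new String[] {operation, parameter});
        return this;
    }

    // Bind the generic steps to one concrete spectrum.
    List<String> bind(SpectrumInfo spectrum) {
        List<String> commands = new ArrayList<>();
        for (String[] step : steps) {
            commands.add(step[0] + "(" + step[1] + ") " + spectrum.filename
                    + "[" + spectrum.points + "]");
        }
        return commands;
    }

    // Export the workflow itself; no spectral details appear in the output.
    String toXml() {
        StringBuilder xml = new StringBuilder();
        xml.append("<workflow name=\"").append(escape(name)).append("\">\n");
        for (String[] step : steps) {
            xml.append("  <step operation=\"").append(escape(step[0]))
               .append("\" parameter=\"").append(escape(step[1])).append("\"/>\n");
        }
        return xml.append("</workflow>\n").toString();
    }

    // Escape the characters that are unsafe inside an XML attribute value.
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                   .replace(">", "&gt;").replace("\"", "&quot;");
    }
}

class WorkflowSketch {
    public static void main(String[] args) {
        ProcessingWorkflow workflow = new ProcessingWorkflow("basic-reconstruction")
                .add("apodize", "sine-bell")
                .add("fourier-transform", "complex");
        System.out.print(workflow.toXml());
        for (String command : workflow.bind(new SpectrumInfo("example.fid", 512))) {
            System.out.println(command);
        }
    }
}
```

The same `ProcessingWorkflow` object can be bound to any number of spectra, and the exported XML carries only the shared logic, which is the part a researcher would want to communicate.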
Free and Open Source
CONNJUR will be distributed as Free and Open Source Software (FOSS). This licensing mechanism has many concrete benefits to the end user:
Price: FREE to download, install and use
Extensibility: No restrictions or royalties to modify or resell
Transparency: Source code readily inspected and verified
Active Development: FOSS encourages the scientific community to aid in the development of CONNJUR
Perpetual: As FOSS, CONNJUR is not dependent on the original developers. If we are unable to continue the project, any member of the NMR community is free to continue developing and distributing the software.
How can a software application be considered free if the supporting hardware and software are prohibitively expensive? In the spirit of Open Source, CONNJUR is designed for interoperability with other free and/or Open Source projects as demonstrated with the prototype:
Operating System: Linux
Framework Programming Language: Java
Database: MySQL
Integrated Development Environment: Eclipse
References
1. Vranken et al. (2004). The CCPN data model for NMR spectroscopy: Development of a software pipeline. Proteins: Structure, Function, and Bioinformatics, 59, 687-696.
2. Fox-Erlich et al. (2004). Delineation and analysis of the conceptual data model implied by the "IUPAC Recommendations for Biochemical Nomenclature". Protein Science, 13, 2559-2563.
3. Baran et al. (2006). SPINS: A laboratory information management system for organizing and archiving intermediate and final results from NMR protein structure determinations. Proteins: Structure, Function, and Bioinformatics, 62, 843-851.
4. Ellis et al. (2006). Development of an Integrated Framework for Protein Structure Determinations: A Logical Data Model for NMR Data Analysis. Proceedings of the Third International Conference on Information Technology, Las Vegas, Nevada, USA.
5. Delaglio et al. (1995). NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR, 6, 277-293.