(Note: This entry was written by Dr. Silvia Mari of R4R, who helped us design and implement this module.)
Background: spectroscopy and chemometrics
“For many years, there was the prevailing view that if one needed fancy data analyses, then the experiment was not planned correctly, but now it is recognized that most systems are multivariate in nature and univariate approaches are unlikely to result in optimum solutions.”
Hopke, P. K. (2003). The evolution of chemometrics. Analytica Chimica Acta, 500(1-2), 365–377
Whether we apply analytical chemistry for quality control or attempt a more “systems biology” approach in our R&D, we need advanced methods to design experiments, calibrate instruments, and analyze the resulting data. And the “emergence of chemometrics thinking came from the realization that traditional univariate statistics is not sufficient to describe and model chemical experiments”
Geladi, P. (2003). Chemometrics in spectroscopy. Part 1. Classical chemometrics. Spectrochimica Acta Part B, 58, 767–782
With this in mind, Mnova 9 now offers its users a module called PCA, which can be found under the main menu “Advanced”. It is the result of our first efforts to include chemometric tools in Mnova, and it is meant to give spectroscopists the possibility to work interactively on both the stacked spectra and their corresponding statistical plots.
Since the mid ‘70s, when the first paper with chemometrics in its title appeared (1975) [1], chemometrics has grown and is now considered an established research area in the chemical sciences. It has expanded widely from its beginnings into a variety of other areas, including multivariate calibration, pattern recognition, and mixture resolution, and today there are several applications of interest to NMR spectroscopists [2-5].
PCA module
Principal Component Analysis (PCA) is a procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables (named principal components) [6].
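To make the definition concrete, here is a minimal sketch of PCA computed through a singular value decomposition with NumPy. It is only an illustration of the general technique, not the code used inside Mnova; the function name pca_svd and the toy data are made up for this example.

```python
import numpy as np

def pca_svd(X, n_components=2):
    """PCA of a data matrix X (rows = spectra, columns = variables) via SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                       # mean-center every variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # one point per spectrum
    loadings = Vt[:n_components].T                    # one weight per variable
    explained = s ** 2 / np.sum(s ** 2)               # variance fraction per PC
    return scores, loadings, explained[:n_components]

# Toy data (hypothetical): 6 "spectra", 4 strongly correlated variables
rng = np.random.default_rng(0)
t = rng.normal(size=(6, 1))
X = t @ np.array([[1.0, 2.0, -1.0, 0.5]]) + 0.01 * rng.normal(size=(6, 4))

scores, loadings, explained = pca_svd(X)
print(scores.shape, loadings.shape)   # (6, 2) (4, 2)
```

The scores are what a score plot displays (one point per spectrum) and the loadings are what a loading plot displays (one point per variable). Because the transformation is orthogonal, the score columns are uncorrelated with each other.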
The PCA module under the Advanced menu works in two consecutive steps: (1) matrix generation and (2) principal component analysis. The overall workflow can be represented with the following illustration, where the general steps available in Mnova are highlighted in blue whilst the specific functionalities of this new PCA module are highlighted in yellow.
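As a rough idea of what the matrix generation step can look like, the sketch below integrates each spectrum over equal-width bins so that every spectrum becomes one row of the data matrix. Equal-width binning, the default bin width and the function name bin_spectra are assumptions made for this illustration; they are not taken from the Mnova implementation.

```python
import numpy as np

def bin_spectra(ppm, spectra, bin_width=0.04):
    """Sum each spectrum over equal-width ppm bins.

    ppm     : 1-D array of chemical shifts shared by all spectra
    spectra : 2-D array, one row per spectrum
    Returns (edges, matrix) with matrix.shape == (n_spectra, n_bins).
    """
    ppm = np.asarray(ppm, dtype=float)
    spectra = np.asarray(spectra, dtype=float)
    n_bins = max(1, int(round((ppm.max() - ppm.min()) / bin_width)))
    edges = np.linspace(ppm.min(), ppm.max(), n_bins + 1)
    # Bin index of every data point; the clip puts the last point in the last bin
    idx = np.clip(np.digitize(ppm, edges) - 1, 0, n_bins - 1)
    matrix = np.zeros((spectra.shape[0], n_bins))
    for b in range(n_bins):
        matrix[:, b] = spectra[:, idx == b].sum(axis=1)
    return edges, matrix

# Hypothetical example: 3 flat spectra of 1001 points between 0 and 10 ppm
ppm_axis = np.linspace(0.0, 10.0, 1001)
flat = np.ones((3, 1001))
edges, matrix = bin_spectra(ppm_axis, flat, bin_width=1.0)
print(matrix.shape)   # (3, 10)
```

Each column of the resulting matrix is one variable of the analysis, and the bin edges record which spectral region it came from.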
To help the spectroscopist refine and optimize the data matrix to be used for advanced analysis, PCA in Mnova makes it very easy to detect and remove outlier spectra and to reveal problems in spectral alignment, phase or baseline. Once the user has properly corrected the regions of interest, the PCA module allows the analysis to be re-run, either replacing the previous analysis or creating a new one for comparison.
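In Mnova the outliers are spotted interactively in the plots. For readers who want a numerical counterpart, one common option (an assumption here, not a description of what the module computes) is a Hotelling-style T² distance of each spectrum in the score space. The function name and the toy data below are hypothetical.

```python
import numpy as np

def score_distances(X, n_components=2):
    """Hotelling-style T^2 distance of each spectrum in the PCA score space."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    variances = s[:n_components] ** 2 / (X.shape[0] - 1)   # variance of each PC
    return np.sum(scores ** 2 / variances, axis=1)

# Made-up data: 20 similar "spectra", one of them with a gross artifact
rng = np.random.default_rng(1)
X = 1.0 + 0.1 * rng.normal(size=(20, 5))
X[7, 2] += 50.0

d = score_distances(X)
suspect = int(np.argmax(d))   # index of the spectrum farthest from the rest
```

Where to put the threshold between "unusual" and "outlier" is left to the user here; in practice the decision is best confirmed by looking at the spectrum itself.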
Interaction with the stacked spectra
The main effort applied during design and development can be summarized in one word: SYNCHRONIZATION. PCA plots, PCA tables and the stacked plot are always synchronized, so selecting a point in the score plot also selects the corresponding spectrum in the stacked plot.
In the same way, selecting a point in the loading plot (hence a variable of the matrix) generates a shaded region in the stacked plot according to the bin position and size.
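The link between a loading and a spectral region is simple bookkeeping: variable j of the matrix corresponds to the bin between edge j and edge j + 1. The sketch below, with hypothetical names and made-up numbers, lists the regions of the variables with the largest absolute loadings, which are the ones a user would typically want to highlight.

```python
import numpy as np

def top_loading_regions(loadings_pc, edges, k=3):
    """Return (variable index, bin start, bin end) for the k largest |loadings|."""
    loadings_pc = np.asarray(loadings_pc, dtype=float)
    order = np.argsort(np.abs(loadings_pc))[::-1][:k]   # largest magnitude first
    return [(int(j), float(edges[j]), float(edges[j + 1])) for j in order]

# Hypothetical example: 4 bins of 1 ppm each and the loadings of one component
edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
loadings_pc1 = np.array([0.1, -0.9, 0.3, 0.05])

regions = top_loading_regions(loadings_pc1, edges, k=2)
print(regions)   # [(1, 1.0, 2.0), (2, 2.0, 3.0)]
```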
Colors and graphics
When dealing with large datasets, color coding plays a very important, even essential, role. Even though PCA is an unsupervised method and does not use class definitions in its algorithm, the kind of pattern expected is generally known.
The driving concept here is that colors are assigned on the basis of class membership. As in the previous section, colors are always synchronized across PCA tables, PCA plots and the stacked spectra.
Moreover, in the loading plot the user can select more than one bin (see the flag option in the loading plot table, or select multiple table entries using the Shift or Ctrl key). A bin region is displayed as a colored box superimposed on the stacked plot, and the user can associate different colors with different bin regions.
Data filtering and scaling
The results of the analysis depend on the types of filtering and scaling the user applies to the matrix, which must therefore be specified. Both factors can be shown to greatly affect the outcome of the data analysis and thus the ranking of the most important variables. The PCA module includes several options for data cleaning and scaling.
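To give an idea of how much the scaling changes the matrix, here is a small sketch of three widely used pretreatments: mean centering, autoscaling (unit variance) and Pareto scaling, following the usual definitions found in the metabolomics literature. Whether these match the exact options offered in the module is an assumption, and the function name scale_matrix is hypothetical.

```python
import numpy as np

def scale_matrix(X, method="auto"):
    """Column-wise pretreatment of a data matrix (rows = spectra)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)               # mean centering is common to all three
    if method == "center":
        return Xc
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                   # leave constant columns unscaled
    if method == "auto":
        return Xc / std                   # every variable gets unit variance
    if method == "pareto":
        return Xc / np.sqrt(std)          # milder: divide by sqrt of the std
    raise ValueError(f"unknown scaling method: {method!r}")

# Made-up matrix: the second variable is 10 times more intense than the first
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

centered = scale_matrix(X, "center")
auto = scale_matrix(X, "auto")
pareto = scale_matrix(X, "pareto")
```

After autoscaling both variables weigh the same, whereas Pareto scaling reduces the dominance of the intense variable without removing it completely. This is why the ranking of the most important variables can change with the chosen method.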
There is no general rule for selecting the type of scaling. On this subject we recommend the paper by van den Berg et al. [7], which describes extensively how these transformations can improve the information content of the data matrix. Finally, bear in mind that visual inspection and assessment is ultimately one of the most important steps in chemometrics.
Conclusion
We have introduced in Mnova 9 a chemometric module called PCA (Principal Component Analysis). PCA has been shown to be very effective in compressing large volumes of noisy, correlated data into a subspace of much lower dimension than the original data set. The data pretreatment method is crucial to the outcome of the data analysis. The resulting low-dimensional representation of the data set has been shown to be of great utility for analyzing or monitoring the system under study, as well as for selecting variables for control or markers of the expected pattern.
The possibility to interact with PCA plots and spectra at the same time, together with the user-friendly interface provided by Mnova, will also be of great advantage to spectroscopists who are not familiar with multivariate analysis but would like to learn more and test it.
As has always been the case for the Mnova community, the future of this first step in chemometrics will be driven by user requirements. For that reason we look forward to receiving feedback, criticism, suggestions, comments and lots of requests for future development. So, play with it and have fun looking at your own datasets from a different perspective!
References
[1] B.R. Kowalski, Chemometrics: views and propositions, J. Chem. Inf. Comp. Sci. 15 (1975) 201–203
[2] Chemometrics in bioreactor monitoring. Lourenço, N. D., Lopes, J. A., Almeida, C. F., Sarraguça, M. C., & Pinheiro, H. M. (2012). Bioreactor monitoring with spectroscopy and chemometrics: a review. Analytical and Bioanalytical Chemistry, 404(4), 1211–1237. doi:10.1007/s00216-012-6073-9
[3] Metabonomics and chemometrics in food science and nutrition. Kuang, H., Li, Z., Peng, C., Liu, L., Xu, L., Zhu, Y., Wang, L., et al. (2012). Metabonomics approaches and the potential application in food safety evaluation. Critical Reviews in Food Science and Nutrition, 52(9), 761–774. doi:10.1080/10408398.2010.508345
[4] Pharmaco-metabonomic phenotyping and chemometrics. Robertson, D. G., Reily, M. D., & Baker, J. D. (2007). Metabonomics in Pharmaceutical Discovery and Development, 526–539.
[5] Metabonomics and chemometrics in drug safety and toxicology. Griffin, J. (2004). The potential of metabonomics in drug safety and toxicology. Drug Discovery Today Technologies, 1(3), 285–293. doi:10.1016/j.ddtec.2004.10.011
[6] Wold, S., Esbensen, K., & Geladi, P. (1987). Principal component analysis. Chemometrics and Intelligent Laboratory Systems, 2(1–3), 37–52.
[7] Van den Berg, R. A., Hoefsloot, H. C. J., Westerhuis, J. A., Smilde, A. K., & van der Werf, M. J. (2006). Centering, scaling, and transformations: improving the biological information content of metabolomics data. BMC genomics, 7(1), 142. doi:10.1186/1471-2164-7-142