BioNMR
NMR aggregator & online community since 2003
Posted 07-31-2014, 02:00 PM by nmrlearner

PCA and NMR: Practical aspects

As of version 9.0, PCA of NMR data sets can be performed directly from within the Mnova user interface, without resorting to third-party applications. The basic PCA functionality has already been covered in this blog (see Chemometrics under Mnova 9 – PCA); in this entry we discuss in more detail some practical aspects, in particular the different binning, filtering and scaling options.


What follows has been kindly written by Silvia Mari (project leader of the PCA module) and Isaac Iglesias, who programmed this module in Mnova.


Introduction

Generating a matrix from an array of NMR spectra is the core step in chemometric analysis. This procedure involves several options that the user must choose. In this entry we focus on the practical aspects of matrix preparation from NMR data. Broadly speaking, there are three main issues:
  • Choice of binning method: Sum vs Peak
  • Filtering or not filtering?
  • Choice of Scaling strategy
Choice of binning method: Sum vs Peak


When dealing with high-resolution NMR spectra, it is generally impractical to work with every data point of each spectrum, since spectra typically contain on the order of 32K points or more. The most common strategy for reducing the number of variables is to divide each spectrum into a defined number of regions, the so-called bins. Several binning strategies are available today, from regular binning, where bins have a fixed width, to more sophisticated strategies such as Gaussian or dynamic adaptive binning [2]. Even with these methods, in particularly crowded spectra small shifts of peaks close to bin boundaries can cause dramatic quantitative changes in adjacent bins. Peak deconvolution can help to solve this problem. Generally speaking, a deconvolved peak is a mathematical entity characterized by a chemical shift (frequency), an intensity and a half-height line width. The integral of a peak follows automatically from the intensity and line width once a peak shape (e.g. Lorentzian) is assumed. For this reason, binning a spectrum of deconvolved peaks almost completely removes the problem of bin boundaries, as illustrated in figure 1.







Figure 1 – Binning real peaks versus binning deconvolved peaks
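
The boundary effect described above can be sketched numerically. The following is a minimal illustration, not Mnova's implementation: it assumes a Lorentzian line shape, a hypothetical ppm axis and a single 0.03 ppm bin, and it assumes that "Peak" binning assigns the whole analytical peak area to the bin that contains the peak centre. All function names are invented for this example.

```python
import numpy as np

def lorentzian(x, center, height, fwhm):
    """Lorentzian line with the given peak height and full width at half height."""
    hw = fwhm / 2.0
    return height * hw**2 / ((x - center) ** 2 + hw**2)

def lorentzian_area(height, fwhm):
    """Analytical integral of a Lorentzian: (pi / 2) * height * fwhm."""
    return np.pi / 2.0 * height * fwhm

def sum_bin(x, y, lo, hi):
    """'Sum' style value: integrate the sampled spectrum between lo and hi."""
    mask = (x >= lo) & (x < hi)
    dx = x[1] - x[0]
    return float(y[mask].sum() * dx)

def peak_bin(center, height, fwhm, lo, hi):
    """'Peak' style value (assumption): the full peak area goes to the bin
    that contains the peak centre."""
    return lorentzian_area(height, fwhm) if lo <= center < hi else 0.0

x = np.linspace(0.0, 2.0, 20001)   # hypothetical ppm axis
lo, hi = 0.99, 1.02                # one 0.03 ppm bin
for center in (1.005, 1.017):      # the peak drifts toward the bin boundary
    y = lorentzian(x, center, height=1.0, fwhm=0.004)
    print(center, sum_bin(x, y, lo, hi), peak_bin(center, 1.0, 0.004, lo, hi))
```

As the peak approaches the boundary, the integrated value of the bin drops because part of the line leaks into the neighbouring bin, whereas the value derived from the deconvolved peak stays constant as long as the centre remains inside the bin.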

When dealing with an array of NMR spectra, regular binning with b bins over a stack of s spectra generates a b × s matrix (see figure 2). It is not possible to generate a similar matrix directly from deconvolved peaks (peak lists), since the number and position of peaks vary from spectrum to spectrum.







Figure 2 – Matrix generation from regular binning or peak list.
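
Matrix generation by regular binning can be sketched as follows. This is only an illustration under stated assumptions: the spectra share one evenly spaced, ascending ppm axis, the function name `sum_binning` is hypothetical, and the matrix is stored with spectra as rows and bins as columns (the transpose of the b × s layout mentioned above), so that each column is one variable.

```python
import numpy as np

def sum_binning(ppm, spectra, bin_width):
    """Integrate every spectrum over fixed-width bins.

    ppm     : (n_points,) ascending, evenly spaced chemical shift axis
    spectra : (s, n_points) stacked intensities
    returns : bin edges (b + 1,) and an (s, b) data matrix
    """
    span = ppm[-1] - ppm[0]
    # round() guards against floating point noise such as 19.999999999
    n_bins = int(np.ceil(round(span / bin_width, 9)))
    edges = ppm[0] + bin_width * np.arange(n_bins + 1)
    # bin index of every data point; the last point is clipped into the last bin
    idx = np.clip(((ppm - ppm[0]) / bin_width).astype(int), 0, n_bins - 1)
    dx = ppm[1] - ppm[0]
    matrix = np.zeros((spectra.shape[0], n_bins))
    for row, spectrum in enumerate(spectra):
        matrix[row] = np.bincount(idx, weights=spectrum, minlength=n_bins) * dx
    return edges, matrix

# Hypothetical data: 3 flat spectra of 1001 points between 0 and 10 ppm
ppm = np.linspace(0.0, 10.0, 1001)
spectra = np.ones((3, 1001))
edges, matrix = sum_binning(ppm, spectra, bin_width=0.5)
print(matrix.shape)
```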


To overcome this problem there are two main strategies: (1) provide algorithms for peak alignment over the spectra series, as well as strategies for dealing with missing peaks, in order to end up with the same number of peaks and the same peak positions for all the spectra; (2) perform binning over the peak table.


In the PCA module available in Mnova, we adopt the second solution. In the binning options, the user can choose between regular binning (Sum) and binning over deconvolved peaks (Peak). An example of the improved classification is shown qualitatively in figure 3, which compares score plots obtained by binning with the Sum method (panel A) and with the Peak method (panel B).






Figure 3 – Score plots obtained using the same bin width of 0.03 ppm; in both cases data were normalized by the sum and Pareto scaled. In panel A bins were obtained directly by integration of the real spectra; in panel B bins were obtained by binning the corresponding peak lists obtained after global spectral deconvolution.
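
Binning over the peak table can be sketched in a similar way. The details of how Mnova distributes peak areas are not described here, so this example makes a simplifying assumption: each deconvolved peak contributes its whole area to the bin that contains its chemical shift. The peak lists, the function names and the normalization helper are hypothetical.

```python
import numpy as np

def peak_binning(peak_lists, lo, hi, bin_width):
    """Build an (s, b) matrix from peak lists of different lengths.

    peak_lists : one list per spectrum of (shift_ppm, area) pairs
    lo, hi     : limits of the binned region in ppm
    """
    n_bins = int(np.ceil(round((hi - lo) / bin_width, 9)))
    matrix = np.zeros((len(peak_lists), n_bins))
    for row, peaks in enumerate(peak_lists):
        for shift, area in peaks:
            if lo <= shift < hi:
                col = min(int((shift - lo) / bin_width), n_bins - 1)
                matrix[row, col] += area   # assumption: whole area to one bin
    return matrix

def normalize_by_sum(matrix):
    """Divide every row (spectrum) by its total so that rows sum to 1."""
    totals = matrix.sum(axis=1, keepdims=True)
    return matrix / np.where(totals == 0, 1.0, totals)

# Two hypothetical spectra with a different number of peaks
peak_lists = [
    [(1.00, 2.0), (3.20, 1.0)],
    [(1.01, 2.0), (3.20, 1.0), (7.50, 0.5)],
]
binned = peak_binning(peak_lists, lo=0.0, hi=10.0, bin_width=0.5)
normalized = normalize_by_sum(binned)
print(binned.shape)
```

Even though the two peak lists differ in length and the first peak is slightly shifted, both spectra end up with the same number of variables.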

Filtering or not filtering?

When the bin width is reduced to approach the spectral resolution, and the number of variables therefore increases, it is generally necessary to filter out those variables that do not show significant changes. There are established filtering strategies that are commonly applied to genomics data and that can also be used successfully for NMR-based data [1]. In the PCA module we have implemented five filtering options, namely:
  1. Standard Deviation
  2. Median Absolute Deviation
  3. Interquartile Range
  4. Mean Value
  5. Median Value


In the first three cases a fixed fraction (default 10%) of the bins is discarded (e.g. if the matrix is composed of 100 bins, 10 bins are discarded), and the selection is based on the chosen filter method. In the case of Mean Value or Median Value, the user is asked to input a cut-off value for the mean or the median, and only bins whose value is lower than this cut-off are discarded. The following figure illustrates the difference in clustering capability with and without filtering. Finally, it is worth noting that NMR data very often contain regions that should be discarded; these can be included in the so-called blind regions, which are not taken into account in the principal component calculation.









Figure 4 - Score plots obtained using the same bin width of 0.01 ppm; in both cases data were normalized by the sum and Pareto scaled. In panel A no filter was applied; in panel B a filtering strategy based on Mean Value was applied, with a cut-off value of 100.
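
The five filtering options can be illustrated with a short sketch. It is not the Mnova code: the function names are hypothetical, the matrix is assumed to hold spectra as rows and bins as columns, and for the first three filters it is assumed that the discarded fraction consists of the bins with the lowest dispersion, which is the usual convention for this kind of filter.

```python
import numpy as np

def dispersion_filter(matrix, method="sd", fraction=0.10):
    """Discard the given fraction of bins (columns) with the lowest dispersion.

    method: "sd" (standard deviation), "mad" (median absolute deviation)
            or "iqr" (interquartile range).
    """
    if method == "sd":
        score = matrix.std(axis=0, ddof=1)
    elif method == "mad":
        med = np.median(matrix, axis=0)
        score = np.median(np.abs(matrix - med), axis=0)
    elif method == "iqr":
        q75, q25 = np.percentile(matrix, [75, 25], axis=0)
        score = q75 - q25
    else:
        raise ValueError(f"unknown method: {method}")
    n_drop = int(round(fraction * matrix.shape[1]))
    order = np.argsort(score, kind="stable")   # lowest dispersion first
    keep = np.sort(order[n_drop:])             # keep the original bin order
    return matrix[:, keep], keep

def level_filter(matrix, cutoff, method="mean"):
    """Discard bins whose mean (or median) over all spectra is below cutoff."""
    if method == "mean":
        level = matrix.mean(axis=0)
    else:
        level = np.median(matrix, axis=0)
    keep = np.flatnonzero(level >= cutoff)
    return matrix[:, keep], keep

# Hypothetical matrix: 20 spectra, 100 bins with increasing variability
rng = np.random.default_rng(0)
data = rng.normal(size=(20, 100)) * np.linspace(0.1, 10.0, 100)
filtered, kept = dispersion_filter(data, method="sd", fraction=0.10)
print(filtered.shape)
```

With 100 bins and the default fraction of 10%, 90 bins remain, in line with the example given in the text.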


Choice of Scaling strategy

Scaling is an operation performed on the variables (columns) of the matrix. The choice of scaling strategy depends on the one hand on the biological information we wish to extract, and on the other hand on the data analysis method chosen (in our case PCA). As a first step, so-called Centering is generally applied in every analysis. With Centering, all bin values fluctuate around zero instead of around the mean of each bin; Centering therefore adjusts for differences in the offset between high- and low-abundance compounds. Several scaling methods are available in the literature [3], and centering is generally applied in combination with them. Scaling strategies can be divided into two subclasses: methods that use a measure of data dispersion (such as the standard deviation) as the scaling factor, and methods that use a size measure (such as the mean). For the first group Mnova includes Auto, Pareto and Vast scaling; for the second group Range and Level scaling are available. Generally speaking, the first group is normally preferred for PCA. Figure 5 compares the score plots of a PCA that used Pareto scaling (panel A) with one that used Level scaling (panel B).




Figure 5 - Score plots obtained using same bin width of 0.05 ppm and normalization by the sum. In panel A Pareto scaling was applied; in panel B Level scaling was applied.
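
The scaling methods named above can be written compactly using the definitions given in reference [3]. The sketch below follows those published formulas and is not taken from Mnova, whose implementation may differ in detail; the function names are hypothetical, and the PCA step is a plain singular value decomposition of the scaled matrix. Bins with zero standard deviation or zero mean would cause a division by zero and should be filtered out beforehand.

```python
import numpy as np

def scale(matrix, method="pareto"):
    """Centre each bin (column) and divide by a method-specific factor."""
    mean = matrix.mean(axis=0)
    sd = matrix.std(axis=0, ddof=1)
    centered = matrix - mean
    if method == "center":
        return centered
    if method == "auto":       # unit variance for every bin
        return centered / sd
    if method == "pareto":     # square root of the standard deviation
        return centered / np.sqrt(sd)
    if method == "vast":       # autoscaling weighted by mean / sd
        return centered / sd * (mean / sd)
    if method == "range":      # difference between maximum and minimum
        return centered / (matrix.max(axis=0) - matrix.min(axis=0))
    if method == "level":      # mean of the bin
        return centered / mean
    raise ValueError(f"unknown method: {method}")

def pca_scores(scaled, n_components=2):
    """Scores of the first principal components via SVD of the scaled matrix."""
    u, s, _ = np.linalg.svd(scaled, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Hypothetical matrix: 3 spectra, 2 bins that differ tenfold in abundance
data = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
auto = scale(data, "auto")
level = scale(data, "level")
scores = pca_scores(scale(data, "pareto"))
print(scores.shape)
```

In this toy matrix the second bin is simply ten times the first, so Auto and Level scaling both make the two columns identical, while Pareto scaling keeps part of the original difference in magnitude.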





References

[1] Amber J Hackstadt, Filtering for increased power for microarray data analysis. BMC Bioinformatics 2009, 10:11

[2] Paul E. Anderson, Metabolomics, Volume 7, Issue 2, pp 179-190 (2010)


[3] Robert A van den Berg, Centering, scaling, and transformations: improving the biological information content of metabolomics data. BMC Genomics 2006, 7:142










Source: NMR-analysis blog


