NMR wisdom:Structure Calculation Using CS-DP ROSETTA from NESG Wiki

Revision as of 08-21-2010, 01:27 PM markber (contribs)			Current Revision 08-21-2010, 09:34 PM
[B]Structure Calculation Using CS-DP ROSETTA[/B] This is a copy of instructions for CS-DP Rosetta from NESG Wiki that was created on 8/20/19. Please go to the original post at [URL="http://www.bionmr.com/forum/redirector.php?url=http%3A%2F%2Fwww.nmr2.buffalo.edu%2Fnesg.wiki%2FStructure_Calculation_Using_CS-DP_ROSETTA"]http://www.nmr2.buffalo.edu/nesg.wiki/Structure_Calculation_Using_CS-DP_ROSETTA[/URL] for a more up-to-date version. 
[B]Introduction[/B]
The CS-DP-Rosetta approach [1] combines model generation using [URL="http://www.bionmr.com/forum/redirector.php?url=http%3A%2F%2Fwww.nmr2.buffalo.edu%2Fnesg.wiki%2FStructure_Calculation_Using_CS-Rosetta"]CS-Rosetta[/URL] with model filtering by agreement with NOESY data, via the DP-score from the [URL="http://www.bionmr.com/forum/redirector.php?url=http%3A%2F%2Fwww.nmr2.buffalo.edu%2Fnesg.wiki%2FRPF_Analysis"]RPF program[/URL], to generate high-accuracy protein structures. This hybrid approach uses both local backbone chemical shift data (CS-Rosetta) and unassigned NOESY data (DP-filtering) to direct Rosetta trajectories toward the native structure, producing more accurate models than CS-Rosetta alone. Given a raw (or refined) NOESY peak list and chemical shift information (backbone and extensive sidechain), the DP-score is used as a filter to guide the trajectory of CS-Rosetta decoy generation, significantly reducing the search space. Because the NOESY peak list data are not directly included in the structure calculation, CS-DP-Rosetta is much more robust to the quality of these peak lists than methods that attempt to assign each NOESY peak to one or more specific interproton interactions.

[B]Protocols[/B]

[B]Prerequisites[/B]
[LIST]
[*]Complete <sup>1</sup>H, <sup>13</sup>C, and <sup>15</sup>N resonance assignments using either conventional triple resonance or GFT approaches.
[LIST]
[*]Backbone assignment: using either [URL="http://www.bionmr.com/forum/redirector.php?url=http%3A%2F%2Fwww.nmr2.buffalo.edu%2Fnesg.wiki%2FAutoAssign"]AutoAssign[/URL] or [URL="http://www.bionmr.com/forum/redirector.php?url=http%3A%2F%2Fwww.nmr2.buffalo.edu%2Fnesg.wiki%2FThe_PINE_Server"]PINE[/URL].
[*]Side chain assignment: manual
[/LIST]
[*]3D <sup>13</sup>C- and <sup>15</sup>N-edited NOESY spectra
[/LIST]

[B]Peak picking procedures[/B]
Generating raw NOESY peak lists by automatic peak picking:
[LIST]
[*]The <sup>13</sup>C- and <sup>15</sup>N-edited 3D NOESY raw peak lists are prepared systematically in the program SPARKY by automatic peak picking of the 3D spectrum, using the 2D <sup>1</sup>H,<sup>13</sup>C- and <sup>1</sup>H,<sup>15</sup>N-HSQC as root spectra and 0.02 and 0.2 ppm as the peak-picking tolerances in the indirect <sup>1</sup>H and heavy-atom dimensions, respectively.
[/LIST]
Generating refined NOESY peak lists:
[LIST]
[*]Refined NOESY peak lists are obtained by expert manual editing, which involves artifact removal, picking of overlapped peaks, and/or picking starting from a script-based 3D-HcCH intra/sequential-residue NOE list that was manually extended to the long-range unpicked resonances.
[/LIST]

[B]CS-DP-ROSETTA procedure[/B]
[B]Input for CS-DP-Rosetta:[/B]
[LIST]
[*]Protein sequence file (FASTA format)
[*]Chemical shift file (BMRB 2.1 format)
[*]Unassigned raw or refined NOESY peak list files (Xeasy or Sparky format)
[/LIST]

[B]Step 1: Generating ~50,000 CS-Rosetta decoys[/B]
I. Generate 3-mer and 9-mer fragment libraries from the sequence and chemical shift information, using the Rosetta Fragments server or the Rosetta software suite.
II. Use the CS-Rosetta protocol to generate ~50,000 decoys, and keep the 1,000 decoys with the lowest Rosetta energy.

[B]Step 2: Filter decoys by a linear combination of Rosetta energy and DP-score[/B]
I. Prepare the sequence file and chemical shift file in BMRB format; the peak list files can be in Sparky, Xeasy or any other table format. Use the AutoStructure 2.2.1 GUI to make a control file for the RPF calculation. By default, the tolerance for <sup>13</sup>C and <sup>15</sup>N is set to 0.5, and the tolerance for the <sup>1</sup>H dimensions is set to 0.05.
Make sure the sequence file, the chemical shift file and all the peak list files are in the project directory.
(1) Run $Autostructure_install_path/bin/asgui.
(2) Select File->New Control File; a “Control File Display” window will appear.
(3) In the “General Section” tab, enter the “Protein Name”, select the “Sequence File” and “Chemical Shift File”, and set “Iterative Analysis Cycles” to 1. Then select the “PeakList Section” tab and add the peak list files one by one. Save the control file after all the peak lists have been added.
II. Calculate the DP-score for the 1,000 decoys with the lowest Rosetta energy using the RPF module of AutoStructure 2.2.1. Command:
$Autostructure_install_path/bin/autostructure -c control_file -o rpf_output_path -q path_of_query_structure -s
Extract the DP-score from the rpf_output_path/.ovw file.
III. Calculate the target function for each decoy: t<sub>i</sub> = (CS-Rosetta all-atom energy)<sub>i</sub> + 1000 × (1 - DP-score)<sub>i</sub>
IV. Rank the 1,000 decoys by t<sub>i</sub>, and keep the 20 decoys with the lowest t<sub>i</sub> for model rebuilding and refinement.

[B]Step 3: Model rebuilding and refinement for the 20 lowest-t<sub>i</sub> decoys[/B]
I. Identify flexible regions, i.e. those with the largest C-alpha deviations among the 20 lowest-t<sub>i</sub> decoys.
II. Stochastically rebuild the flexible regions by fragment insertion and CCD loop closure.
III. Perform all-atom refinement of the whole structure using the physically realistic Rosetta force field.
IV. Save the 10 models with the lowest t<sub>i</sub> as the final CS-DP-Rosetta models.

[B]Rosetta command line arguments[/B]
[B]For CS-DP-Rosetta:[/B]
I. First stage CS-Rosetta:
-in::file::frag3 aat000_03_05.200_v1_3.gz -in::file::frag9 aat000_09_05.200_v1_3.gz -abinitio::rg_reweight 0.5 -abinitio::rsd_wt_helix 0.5 -abinitio::rsd_wt_loop 0.5 -abinitio::use_filters false -abinitio::increase_cycles 10 -in::file::fasta t000_.fasta.gz -in::file::psipred_ss2 t000_.psipred_ss2.gz -abinitio::fastrelax -score::weights score13_env_hb -silent_gz
II. Second stage rebuild-and-refine:
aa t000 _ -relax -looprlx -nstruct 10 -fa_input -use_sspair -farlx -ex1 -ex2 -random_loop -termini -short_range_hb_weight 0.50 -long_range_hb_weight 1.0 -farlx_cycle_ratio 0.4 -idl_no_chain_break -loop_skip_rate 0.0 -loop_file t000.loop_file.gz -vary_omega -output_silent_gz -output_chi_silent -l s.list

[B]References[/B]
[URL="http://www.bionmr.com/forum/redirector.php?url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fpubmed%2F20000319%3Fitool%3DEntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_RVDocSum%26ordinalpos%3D1"]1. Raman, S., Huang, Y. J., Mao, B., Rossi, P., Aramini, J. M., Liu, G., Montelione, G. T., and Baker, D. (2010) Accurate automated protein NMR structure determination using unassigned NOESY data. [I]J. Am. Chem. Soc.[/I] [I]132[/I], 202-207.[/URL]
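
As an illustration of the Step 2 ranking, the target function and the selection of the lowest-t<sub>i</sub> decoys might be scripted roughly as follows. This is only a sketch and not part of the NESG protocol or the AutoStructure/Rosetta distributions: the Decoy record, the function names and the example energies and DP-scores are all hypothetical, and it assumes the Rosetta all-atom energy and the DP-score of each decoy have already been extracted by hand.

```python
from typing import List, NamedTuple


class Decoy(NamedTuple):
    """Hypothetical record for one decoy; the field names are assumptions."""
    name: str
    rosetta_energy: float  # CS-Rosetta all-atom energy
    dp_score: float        # DP-score reported by RPF


# Weight on the (1 - DP-score) term, taken from the target function in Step 2.
DP_WEIGHT = 1000.0


def target_function(decoy: Decoy) -> float:
    """t_i = (CS-Rosetta all-atom energy)_i + 1000 * (1 - DP-score)_i."""
    return decoy.rosetta_energy + DP_WEIGHT * (1.0 - decoy.dp_score)


def select_lowest(decoys: List[Decoy], keep: int = 20) -> List[Decoy]:
    """Rank decoys by t_i (lowest first) and keep the first `keep` of them."""
    return sorted(decoys, key=target_function)[:keep]


# Made-up values for illustration only.
decoys = [
    Decoy("S_0001", -150.0, 0.70),
    Decoy("S_0002", -170.0, 0.55),
    Decoy("S_0003", -140.0, 0.80),
]

selected = select_lowest(decoys, keep=2)
for d in selected:
    print(d.name, round(target_function(d), 1))
```

Note that in this made-up example the decoy with the lowest Rosetta energy (S_0002) is not selected, because its low DP-score adds a large penalty; that is the effect the linear combination is meant to have.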