Protein structure decoys are artificial structural conformations of proteins, which
are often used to guide the design, testing, and training of protein folding force fields.
Although structure decoys can be generated by various approaches, including trivial random
walks, only those with protein-like structural features and elements are useful for
protein force field design.
On this page, we list several sets of decoys generated in practical folding
simulations by I-TASSER and QUARK in both benchmarking and CASP experiments.
Like the existing decoy sets in the literature and on websites, however, these decoys
have several defects:
(1) the decoy sets are designed only for pre-existing protein targets;
(2) the distribution is often dominated by a few clusters;
(3) the structures often have flaws in their secondary structure
and radius of gyration distributions, which can be easily recognized by trivial potentials.
To address these issues, a dedicated method, 3DRobot, was recently developed for creating
diverse and well-packed structure decoys for any protein (see below, or Deng et al.,
3DRobot: Automated Generation of Diverse and Well-packed Protein Structure Decoys.
Bioinformatics, 32: 378-87, 2016).
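As an illustration of defect (3), the radius of gyration of a decoy can be computed directly from its C-alpha coordinates and compared with that of the native structure. The sketch below is only an assumption of how such a check might look; it is not part of I-TASSER, QUARK, or 3DRobot. The function names `parse_ca_coords` and `radius_of_gyration` are hypothetical, the calculation is unweighted (all atoms count equally), and the parser assumes standard fixed-column PDB ATOM records.

```python
import math


def parse_ca_coords(pdb_text):
    """Collect (x, y, z) tuples for C-alpha atoms from PDB-format text."""
    coords = []
    for line in pdb_text.splitlines():
        # Fixed-column ATOM record: atom name in columns 13-16,
        # x, y, z in columns 31-38, 39-46 and 47-54.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            coords.append((x, y, z))
    return coords


def radius_of_gyration(coords):
    """Unweighted radius of gyration: RMS distance of atoms from their centroid."""
    if not coords:
        raise ValueError("no coordinates given")
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    sq_sum = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
                 for x, y, z in coords)
    return math.sqrt(sq_sum / n)
```

A decoy whose radius of gyration is far larger than that of the native structure is likely to be loosely packed, which is the kind of flaw a trivial potential can detect.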
3DRobot Decoy sets
-
The 3DRobot Decoy Set was generated by
3DRobot,
a program dedicated to the automated generation of diverse and
well-packed protein structure decoys.
The decoy set contains 200 non-homologous proteins, each with 300 decoys that
are evenly distributed in the RMSD space from 0 to 12 Angstroms.
You can also create decoys for your own protein targets through
the on-line 3DRobot Server.
Please refer to the following paper for details of this decoy set:
- Haiyou Deng, Ya Jia, Yang Zhang.
3DRobot: Automated Generation of Diverse and Well-packed Protein Structure
Decoys.
Bioinformatics, 32: 378-87, 2016.
(Download the PDF file and
Support Information).
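The even RMSD coverage described for the 3DRobot set can be checked by binning the decoy RMSD values. The following is a minimal sketch under stated assumptions: the RMSD of each decoy to the native structure has already been computed by some other tool, and the function name `rmsd_histogram`, the 1 Angstrom bin width, and the handling of out-of-range values are all hypothetical choices rather than anything specified by 3DRobot.

```python
def rmsd_histogram(rmsds, max_rmsd=12.0, bin_width=1.0):
    """Count decoys per RMSD bin over the range [0, max_rmsd]."""
    n_bins = int(round(max_rmsd / bin_width))
    counts = [0] * n_bins
    for r in rmsds:
        # Values outside the range are skipped rather than clamped.
        if r < 0 or r > max_rmsd:
            continue
        # A value exactly equal to max_rmsd falls in the last bin.
        idx = min(int(r / bin_width), n_bins - 1)
        counts[idx] += 1
    return counts
```

For an evenly distributed set of 300 decoys, each of the 12 bins would hold roughly 25 decoys; a few heavily populated bins would instead indicate a distribution dominated by a small number of clusters.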
I-TASSER Decoy sets
-
The I-TASSER Decoy Set I
was taken from the trajectories of the
I-TASSER simulations, which include 12,500-32,000 raw
decoys for each protein target.
The backbone structure is built by I-TASSER ab initio simulation
and the side-chain atoms are added using Pulchra.
This decoy set was used to test the
SPICKER clustering program for identifying the best near-native structures.
Please refer to the following paper for details of the I-TASSER Decoy Set I:
- S Wu, J Skolnick, Y Zhang:
Ab initio modeling of small proteins by iterative TASSER simulations.
BMC Biology 2007, 5: 17.
(Download the PDF file)
-
The I-TASSER Decoy Set II
is a structurally non-redundant set of protein structures
which were selected from the I-TASSER Decoy Set I.
Each protein target contains 300-500 decoys.
Each decoy structure was first generated by I-TASSER Monte Carlo simulations
and then refined by GROMACS 4.0 MD simulation using the OPLS-AA force field
with the purpose of removing steric clashes and improving the hydrogen-bonding network.
This decoy set was used to test the ability of atomic potentials, including
RW and RWplus potentials,
for near-native structure identification and RMSD-energy correlation.
Please refer to the following paper for details of Decoy Set II:
- J Zhang and Y Zhang,
A Distance-Dependent Atomic Potential Derived from Random-Walk
Ideal Chain Reference State for Protein Fold Selection and
Structure Prediction.
PLoS One, vol 5, e15386 (2010).
(Download the PDF file)
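The two tests mentioned for Decoy Set II, near-native structure identification and RMSD-energy correlation, can be sketched as follows. This is an illustrative assumption only, not the evaluation code used in the paper: it assumes that each decoy already has an RMSD to the native structure and an energy from the potential under test, and the function names `pearson_correlation` and `lowest_energy_rmsd` are hypothetical.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def lowest_energy_rmsd(rmsds, energies):
    """RMSD of the decoy that the potential ranks best (lowest energy)."""
    if len(rmsds) != len(energies) or not rmsds:
        raise ValueError("need two non-empty sequences of equal length")
    best = min(range(len(energies)), key=lambda i: energies[i])
    return rmsds[best]
```

Under these assumptions, a potential that discriminates well gives a positive RMSD-energy correlation (energy rises as decoys move away from the native structure) and a small RMSD for its lowest-energy decoy.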
QUARK Decoy sets
-
The QUARK Decoy Set was generated by
QUARK-based
ab initio folding simulations
without using template structures.
It contains 145 non-homologous proteins, each with 5,000 decoys randomly selected
from the simulation trajectories.
The QUARK simulations were performed on the backbone conformations;
the side-chain atoms were then added and the models further refined by the
ModRefiner program.
Please refer to the following papers for details of this decoy set:
- Dong Xu, Yang Zhang,
Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field. Proteins, 2012, 80, 1715-1735
(download the PDF file and Support Information).
- Dong Xu and Yang Zhang.
Improving the Physical Realism and Structural Accuracy of Protein Models by a Two-step Atomic-level Energy Minimization.
Biophysical Journal, vol 101, 2525-2534 (2011).
(download the PDF file).
CASP Decoy sets
-
The CASP8 Decoy Set contains the top 100 structural decoys generated
by I-TASSER in CASP8, for all 121 protein domains that were finally assessed
by the assessors. The decoys were ranked based on the structure density of the
SPICKER clusters and 'model[1-5].pdb' are the structure models that were
submitted to CASP8 by Zhang-Server.
Reference:
-
Yang Zhang, I-TASSER: Fully automated protein structure prediction in CASP8.
Proteins: Structure, Function, and Bioinformatics, 77 (Suppl 9): 100-113
(2009).
(Download the PDF file and
Support Information).
-
The CASP9 Decoy Set
is a decoy set generated by I-TASSER and QUARK for
85 single-domain targets in the CASP9 experiment.
Each target contains around 200-780 decoys, from which all
the Zhang-Server and QUARK server models have been selected.
Reference:
-
Dong Xu, Jian Zhang, Ambrish Roy, Yang Zhang.
Automated protein structure modeling in CASP9
by I-TASSER pipeline combined with QUARK-based ab initio folding and FG-MD-based structure refinement.
Proteins: Structure, Function, and Bioinformatics, 2011,
79 (Suppl 10) 147-160
(download the PDF file).
-
The CASP10 Decoy Set
is a decoy set generated by I-TASSER and QUARK for
43 single-domain targets in the CASP10 experiment.
Each target contains around 200-825 decoys, from which all
the Zhang-Server and QUARK server models have been selected.
Reference:
-
Yang Zhang, Interplay of I-TASSER and QUARK for template-based and ab initio
protein structure prediction in CASP10.
Proteins: Structure, Function, and Bioinformatics, 2014,
82 (Suppl 2), 175-187
(download the PDF file and Support Information).
-
The CASP11 Decoy Set
is a decoy set generated by I-TASSER and QUARK for
64 single-domain targets in the CASP11 experiment.
Each target contains around 400-1500 decoys, from which all
the Zhang-Server and QUARK server models have been selected.
Reference:
-
Wenxuan Zhang, Jianyi Yang, Baoji He, Sara Elizabeth Walker, Hongjiu Zhang, Brandon Govindarajoo,
Jouko Virtanen, Zhidong Xue, Hong-Bin Shen, Yang Zhang
Integration of QUARK and I-TASSER for ab initio protein structure prediction in CASP11.
Proteins: Structure, Function, and Bioinformatics, 84 (Suppl 1): 76-86 (2016).
(download the PDF file
and Support Information).