Benchmark datasets for: Roy, Ambrish, and Yang Zhang. "Recognizing protein-ligand binding sites by global structural alignment and local geometry refinement." Structure 20.6 (2012): 987-997.
- Training Set: 348 representative protein-ligand complexes collected from the PDB for training. Also includes CASP7 and CASP8 targets.
- Testing Set: 501 proteins used for testing COFACTOR.
- Results: COFACTOR results on the testing set.
The above datasets were collected from the following published literature:
- Dessailly, Benoit H., et al. "LigASite—a database of biologically relevant binding sites in proteins with known apo-structures." Nucleic acids research 36.suppl 1 (2008): D667-D673.
- Perola, Emanuele, et al. "A detailed comparison of current docking and scoring methods on systems of pharmaceutical relevance." Proteins: Structure, Function, and Bioinformatics 56.2 (2004): 235-249.
- Hartshorn, Michael J., et al. "Diverse, high-quality test set for the validation of protein-ligand docking performance." Journal of medicinal chemistry 50.4 (2007): 726-741.
Benchmark datasets for: Roy, Ambrish, Jianyi Yang, and Yang Zhang. "COFACTOR: an accurate comparative algorithm for structure-based protein function annotation." Nucleic acids research (2012): gks372.
- EC set: 318 representative PDB entries with Enzyme Commission (EC) number annotation.
- GO set: 337 representative PDB entries with Gene Ontology (GO) annotation.
Benchmark datasets for: Zhang, Chengxin, Peter L. Freddolino, and Yang Zhang. "COFACTOR: improved protein function prediction by combining structure, sequence, and protein-protein interaction information." Nucleic acids research (2017): gkx366.
- GO set: 1244 representative UniProt proteins with experimental Gene Ontology (GO) annotation and structure models predicted by I-TASSER.
- EC set: 318 representative PDB proteins with Enzyme Commission (EC) number annotation and structure models predicted by I-TASSER.
- LBS set: 500 representative PDB proteins with experimental structures for ligand binding and structure models predicted by I-TASSER.
- Figures: Source code for the figures published in the COFACTOR paper.
- Naïve: The background probabilities of GO terms used by the "Naïve" baseline predictor.
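
To illustrate how a background-probability baseline of this kind typically works, here is a minimal Python sketch. It assumes that the background probability of a GO term is the fraction of annotated proteins carrying that term, and that the predictor returns the same ranked list for every query. The function names, protein IDs, and annotations below are hypothetical, and the actual format of the Naïve file is not described here.

```python
from collections import Counter


def background_probabilities(annotations):
    """Estimate the background probability of each GO term.

    `annotations` maps a protein ID to its set of GO terms (hypothetical
    input format). The probability of a term is assumed to be the fraction
    of annotated proteins that carry it.
    """
    n = len(annotations)
    if n == 0:
        return {}
    counts = Counter(
        term for terms in annotations.values() for term in set(terms)
    )
    return {term: count / n for term, count in counts.items()}


def naive_predict(background, cutoff=0.0):
    """Return (term, score) pairs sorted by decreasing score.

    The output does not depend on the query protein: every query receives
    the same terms, scored by their background probability.
    """
    ranked = sorted(background.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(term, p) for term, p in ranked if p >= cutoff]


# Toy annotation set (made-up proteins and term assignments).
training = {
    "P1": {"GO:0003824", "GO:0005515"},
    "P2": {"GO:0005515"},
    "P3": {"GO:0005515", "GO:0016787"},
    "P4": {"GO:0003824"},
}

background = background_probabilities(training)
predictions = naive_predict(background, cutoff=0.5)
for term, score in predictions:
    print(f"{term}\t{score:.2f}")
```

Because the scores ignore the query entirely, a baseline like this mainly serves as a floor: a method that uses structure or sequence information should outperform it.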