Overview
LOMETS (LOcal MEta-Threading-Server) is a meta-server method for protein structure prediction [1,2]. It generates protein structure predictions by ranking and selecting models from multiple state-of-the-art threading programs. Starting from a query sequence, deep multiple sequence alignments (MSAs) are generated by iterative sequence homology searches through multiple sequence databases. Eleven threading programs, all locally installed on our cluster, are then used to identify structural templates from the PDB library. The top templates are ranked and selected by a score that combines the alignment Z-score, program-specific confidence scores and the sequence identity to the query. The functional annotations (including gene ontology terms and enzyme commission numbers) are generated by searching the template structures through the BioLiP function library [3]. A flowchart of the LOMETS pipeline is depicted in Figure 1; references for the individual threading methods are listed at the bottom of the page.
Users can use LOMETS' output to generate biological insights for their protein of interest. For example, the functional annotations given by LOMETS can indicate which type of enzyme the target protein belongs to (EC number) and/or which functions it is likely to have, such as protein binding and ATP binding (GO terms), so that users can narrow the scope of their experiments based on this information.
For those users who want to quickly predict 3D models for a query sequence, detect its homologous templates and/or determine the functional annotations (GO terms and EC numbers) for the detected templates, we recommend they use LOMETS. Because the LOMETS server does not attempt to refine the threading models, the response time is fast.
LOMETS is a meta-server method designed for protein structure prediction. It has two major advantages over other protein structure prediction servers. First, LOMETS can give users results quickly. Second, the quality of the structural models predicted by LOMETS is relatively high, even though the models are slightly worse than those of I-TASSER.
Both LOMETS and I-TASSER are servers designed for protein structure prediction, but they differ in three respects. First, they differ in response time. Starting from a query sequence, the I-TASSER server first retrieves template proteins using LOMETS, and then performs structural assembly and refinement simulations. Although these simulations improve accuracy, they are time consuming. For those users who want a quicker response or who do not need refined models, we recommend they use only LOMETS. Because the LOMETS server does not attempt to refine the threading models, its response time is shorter than that of the I-TASSER server.
Second, since I-TASSER models are often structures combined from multiple templates, it is difficult or impossible to track the source of the original templates used to build the composite models. However, since LOMETS models are mostly derived from individual templates, the correspondence between final models and the starting templates is more transparent. Partly because this transparency is useful, LOMETS provides a longer list of template alignments (11 × 10 = 110 templates), while I-TASSER lists only the top ten templates that are most influential in the final model construction.
Finally, both the LOMETS and I-TASSER servers give functional annotation information. The functional annotations given by I-TASSER are predicted for the query protein using our in-house COFACTOR server. LOMETS, on the other hand, shows the functional annotations directly associated with the original homologous templates. Since the query protein and the templates should be homologous, these annotations still give users a general sense of the query's function.
In summary, if users want to have a quicker response and pay more attention to the insights derived from the original homologous templates, we recommend they use LOMETS. However, if users want to construct high-quality model predictions for the 3D structure of a query protein, especially if the query protein may not have closely homologous templates, we recommend they use I-TASSER.
For a given target, 220 templates are generated by the 11 component servers: each server generates 20 templates, sorted by the Z-score of its own threading algorithm. The top 10 templates are then selected from the 220 templates based on the following scoring function:
score(i,j) = [Z(i,j) / Z0(i)] * conf(i) + seqid(i,j)
where Z(i,j) is the Z-score of the j-th template for the i-th server, Z0(i) is the Z-score cutoff for defining good/bad templates for the i-th server, conf(i) is the confidence of the i-th server, defined as the average TM-score to native of all predictions calculated from a large-scale benchmark test, and seqid(i,j) is the sequence identity to the query for the j-th template of the i-th server. The parameters are listed in the following table:
i   Server(i)    Z0(i)  conf(i)  Reference
--  -----------  -----  -------  ---------
1   CEthreader    5.6   0.617    [4]
2   HHpred       83.0   0.589    [5]
3   SparksX       6.9   0.587    [6]
4   FFAS3D       33.0   0.574    [7]
5   Neff-MUSTER   8.7   0.570
6   MUSTER        6.1   0.569    [8]
7   HHsearch     10.0   0.567    [5]
8   SP3           7.0   0.566    [9]
9   PPAS          7.6   0.562    [2]
10  PROSPECT2     3.2   0.558    [10]
11  PRC          21.0   0.536    [11]
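The scoring function and parameter table above can be sketched in a few lines of Python. This is only an illustration, not the server's actual code: the function names `template_score` and `select_top`, the tuple layout, and the template identifiers in the usage example are hypothetical, and the sketch assumes that seqid is given as a fraction between 0 and 1, which the text above does not state.

```python
# Z0(i) and conf(i) copied from the parameter table above.
PARAMS = {
    "CEthreader": (5.6, 0.617),
    "HHpred": (83.0, 0.589),
    "SparksX": (6.9, 0.587),
    "FFAS3D": (33.0, 0.574),
    "Neff-MUSTER": (8.7, 0.570),
    "MUSTER": (6.1, 0.569),
    "HHsearch": (10.0, 0.567),
    "SP3": (7.0, 0.566),
    "PPAS": (7.6, 0.562),
    "PROSPECT2": (3.2, 0.558),
    "PRC": (21.0, 0.536),
}


def template_score(server, z_score, seqid):
    """score(i,j) = Z(i,j) / Z0(i) * conf(i) + seqid(i,j).

    Assumption: seqid is a fraction in [0, 1].
    """
    z0, conf = PARAMS[server]
    return z_score / z0 * conf + seqid


def select_top(templates, n=10):
    """Rank (server, template_id, z_score, seqid) tuples by score, best first."""
    ranked = sorted(
        templates,
        key=lambda t: template_score(t[0], t[2], t[3]),
        reverse=True,
    )
    return ranked[:n]


if __name__ == "__main__":
    # Hypothetical templates, for illustration only.
    candidates = [
        ("HHpred", "templateB", 83.0, 0.35),
        ("CEthreader", "templateA", 11.2, 0.20),
        ("PRC", "templateC", 10.5, 0.10),
    ]
    for server, template_id, z, seqid in select_top(candidates, n=2):
        print(server, template_id, round(template_score(server, z, seqid), 3))
```

Dividing each Z-score by its server-specific cutoff Z0(i) puts the 11 programs on a comparable scale before the confidence weight and the sequence identity are applied, which is why a raw Z-score of 83.0 from HHpred and 5.6 from CEthreader contribute the same normalized value of 1.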
Server running time statistics
The running time depends on the protein size; a smaller protein typically takes less time than a larger one. Furthermore, if too many sequences have accumulated in the queue, a job may take longer. Figure 3 shows the actual response time versus protein size for the 1,433 jobs recently processed by the LOMETS2 server. The red line is fitted to the targets with the shortest response times, which should correspond to the actual running time of the LOMETS2 programs when the job queue is clear.
User Inputs
The user needs to paste the FASTA-formatted amino acid sequence of the query protein into the input box, or upload it as a file using the browse button.
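Before submitting, a sequence can be checked locally. The following Python sketch parses a single FASTA record and verifies the residue letters. It is an assumption-laden illustration: the server's own validation rules are not described on this page, so the restriction to one record and to the 20 standard amino acid letters, as well as the function name `parse_single_fasta` and the example sequence, are hypothetical choices.

```python
# Assumption: only the 20 standard one-letter amino acid codes are accepted.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_single_fasta(text):
    """Return (header, sequence) for a FASTA string holding one record."""
    header = ""
    seen_header = False
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A second header, or a header after sequence lines, means
            # the input holds more than one record.
            if seen_header or chunks:
                raise ValueError("expected exactly one FASTA record")
            header = line[1:].strip()
            seen_header = True
        else:
            chunks.append(line.upper())
    sequence = "".join(chunks)
    if not sequence:
        raise ValueError("no sequence found")
    unknown = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if unknown:
        raise ValueError("non-standard residue letters: " + "".join(unknown))
    return header, sequence


if __name__ == "__main__":
    # Hypothetical query, for illustration only.
    example = ">query1 example\nMKTAYIAK\nQRQISFVK\n"
    print(parse_single_fasta(example))
```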
Figure 4. User inputs.
Advanced Options
Exclude templates: LOMETS derives models from known PDB structures (templates). If "remove templates sharing >30% sequence identity with target" is selected, template structures that share more than 30% sequence identity with the target sequence are excluded, so no models are derived from them. In general, excluding homologous templates makes structure prediction harder, so this option is intended only for benchmarking purposes.
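The effect of this option can be illustrated with a small Python sketch that drops templates above the 30% cutoff. How the server computes sequence identity is not stated on this page; normalizing by the number of aligned (non-gap) positions is an assumption made here, and the function names and alignment strings are hypothetical.

```python
def sequence_identity(aligned_query, aligned_template):
    """Fraction of identical residues over aligned (non-gap) columns.

    Assumption: both strings come from one pairwise alignment, use '-'
    for gaps, and identity is normalized by the aligned column count.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (q, t)
        for q, t in zip(aligned_query, aligned_template)
        if q != "-" and t != "-"
    ]
    if not pairs:
        return 0.0
    identical = sum(1 for q, t in pairs if q == t)
    return identical / len(pairs)


def exclude_homologs(alignments, cutoff=0.30):
    """Keep template IDs whose identity to the query does not exceed cutoff."""
    return [
        template_id
        for template_id, query_aln, template_aln in alignments
        if sequence_identity(query_aln, template_aln) <= cutoff
    ]


if __name__ == "__main__":
    # Hypothetical alignments, for illustration only.
    alignments = [
        ("close_homolog", "ACDE-F", "ACDKGF"),
        ("distant_template", "ACDEFGHIKL", "AYYYYYYYYY"),
    ]
    print(exclude_homologs(alignments))
```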
Content in output page
Illustration of output
Figure 5. LOMETS summary output.
Figure 6. Models.
Figure 7. Individual threading results and functional annotations.
[1] W Zheng, C Zhang, Q Wuyun, R Pearce, Y Li, Y Zhang. LOMETS2: Improved meta-threading server for fold-recognition and structure-based function annotation for distant-homology proteins. Nucleic Acids Research, submitted (2019).
[2] Wu S, Zhang Y. LOMETS: A local meta-threading-server for protein structure prediction. Nucleic Acids Research. 35, 3375-3382 (2007).
[3] Yang J, Roy A, Zhang Y. BioLiP: a semi-manually curated database for biologically relevant ligand-protein interactions. Nucleic Acids Research, 41: D1096-D1103 (2013).
[4] Zheng, W., Wuyun, Q., Zhang, Y. (2018) CEthreader: Detecting distant-homology proteins using contact map guided threading, in preparation. Critical Assessment of Techniques for Protein Structure Prediction (CASP) 13 abstract book. 208-209.
[5] Soding, J. (2005) Protein homology detection by HMM-HMM comparison. Bioinformatics (Oxford, England), 21, 951-960.
[6] Zhou, H. and Zhou, Y. (2005) Fold recognition by combining sequence profiles derived from evolution and from depth-dependent structural alignment of fragments. Proteins, 58, 321-328.
[7] Xu D, Jaroszewski L, Li Z, Godzik A. (2014) FFAS-3D: improving fold recognition by including optimized structural features and template re-ranking. Bioinformatics. 30(5): 660-7.
[8] Wu S, Zhang Y. MUSTER: Improving protein sequence profile-profile alignments by using multiple sources of structure information. Proteins, 72, 547-556 (2008).
[9] Zhou, H. and Zhou, Y. (2004) Single-body residue-level knowledge-based energy score combined with sequence-profile and secondary structure information for fold recognition. Proteins, 55, 1005-1013.
[10] Xu, Y. and Xu, D. (2000) Protein threading using PROSPECT: design and evaluation. Proteins, 40, 343-354.
[11] Madera, M. (2008) Profile Comparer: a program for scoring and aligning profile hidden Markov models. Bioinformatics. 24(22):2630-2631.