DeepFoldRNA is a deep learning method for de novo RNA tertiary structure prediction. Starting from a query sequence, it first collects an alignment of homologous sequences from several sequence databases. Spatial restraints (distance maps and inter-residue orientations) are then predicted by deep self-attention-based neural networks and converted into negative log-likelihood potentials. Finally, full-length structure models are generated by L-BFGS folding simulations that minimize the potential with respect to the backbone pseudo-torsion angles.
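The conversion of a predicted restraint into a negative log-likelihood potential can be illustrated with a minimal sketch. It assumes the network outputs a probability for each distance bin of a residue pair; the function name `nll_potential`, the `eps` floor, and the four-bin distribution are hypothetical and are not taken from the DeepFoldRNA code. The sketch covers only the conversion step, not the L-BFGS minimization.

```python
import math

def nll_potential(bin_probs, eps=1e-8):
    """Turn per-bin probabilities into per-bin energies, E = -log(p).

    eps is an assumed small floor that keeps log() finite when p is 0.
    """
    return [-math.log(p + eps) for p in bin_probs]

# Hypothetical predicted distribution over four distance bins for one residue pair.
probs = [0.05, 0.70, 0.20, 0.05]
energies = nll_potential(probs)

# The most probable bin receives the lowest energy.
best_bin = min(range(len(energies)), key=energies.__getitem__)
print(best_bin, [round(e, 3) for e in energies])
```

Because the energy decreases monotonically as the probability increases, minimizing the summed potential steers the conformation toward the distances and orientations the network considers most likely.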
Figure 1. Flowchart of DeepFoldRNA, which consists of two modules. (A) Module-I: Restraint Construction. Starting from a query nucleic acid sequence, multiple RNA sequence databases are searched to create a multiple sequence alignment (MSA) for the target RNA. The MSA is used to derive the predicted secondary structure, which in turn is used to initialize the pair embedding that captures the spatial relationship between each pair of nucleotides. The raw MSA is also embedded into the network to initialize the MSA representation, which captures the information contained in the alignment of homologous sequences. The MSA and pair embeddings are then processed by the MSA Transformer layers, which use multiple self-attention mechanisms to extract information from both embeddings; communication between the two is encouraged to ensure consistency. Next, the sequence embedding is extracted from the row of the final MSA embedding that corresponds to the query sequence and is further processed with self-attention by the Sequence Transformer layers. Finally, the predicted distance and orientation maps are generated by a linear projection of the final pair embedding, while the pseudo-torsion angles are predicted by a linear projection of the sequence embedding. (B) Module-II: 3D Structure Assembly. The predicted restraints are converted into a negative log-likelihood potential, and L-BFGS folding simulations minimize this potential in torsion-angle space to produce the final model.
Figure 2. Definition of the geometric restraints predicted by DeepFoldRNA. (A) inter-residue distances; (B) inter-residue torsion angles; and (C) backbone pseudo-torsion angles.
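Every torsion-type restraint in Figure 2 is a dihedral angle defined by four points in space. The sketch below computes such an angle with the standard atan2 formula; it is a generic illustration, not the DeepFoldRNA implementation. The helper names and the example coordinates are hypothetical, and which atoms define each angle in DeepFoldRNA is not specified in this text.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 axis."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: three fixed points plus a varying fourth point.
p0, p1, p2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
cis = dihedral_degrees(p0, p1, p2, (1.0, 0.0, 1.0))      # eclipsed arrangement
right = dihedral_degrees(p0, p1, p2, (0.0, 1.0, 1.0))    # quarter turn
trans = dihedral_degrees(p0, p1, p2, (-1.0, 0.0, 1.0))   # opposite sides
```

Using atan2 rather than an arccosine keeps the sign of the angle and stays numerically stable near 0 and 180 degrees.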
The DeepFoldRNA server takes as input a nucleic acid sequence in FASTA format and an email address. After a job is submitted, one email notifies the user that the job is running, and another informs the user when the job has finished.
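Before submitting, it can help to check that the input is a well-formed single-record FASTA entry. The following is a minimal sketch under stated assumptions: the function name `read_single_fasta` is hypothetical, and the restriction to the letters A, C, G, and U is an assumption for illustration; the characters the server actually accepts are not documented here.

```python
ASSUMED_ALPHABET = set("ACGU")  # assumption: plain RNA letters only

def read_single_fasta(text):
    """Parse one FASTA record and return (header, sequence)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA input must begin with a '>' header line")
    if any(ln.startswith(">") for ln in lines[1:]):
        raise ValueError("expected exactly one sequence record")
    header = lines[0][1:].strip()
    sequence = "".join(lines[1:]).upper()
    if not sequence:
        raise ValueError("sequence is empty")
    invalid = sorted(set(sequence) - ASSUMED_ALPHABET)
    if invalid:
        raise ValueError("unexpected characters: " + "".join(invalid))
    return header, sequence

# Hypothetical query; sequence lines may be wrapped and in mixed case.
header, sequence = read_single_fasta(">query_rna\nGGGA\naucc\n")
print(header, sequence, len(sequence))
```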
The output of the DeepFoldRNA server consists of the following sections: