The BioLiP database is constructed from known protein structures in the PDB. The overall flowchart of the database construction is shown below (left panel) and comprises three major steps.
Step 1: for each entry in the PDB, the 3D structure is downloaded. For each protein chain (called receptor), the following information (if any) is collected: catalytic site residues mapped from the Catalytic Site Atlas; annotated Enzyme Commission (EC) numbers in the COMPND records; Gene Ontology (GO) terms and UniProt ID from the SIFTS project; and the PubMed abstract of the primary literature citation in the JRNL records.
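As a rough illustration of one part of Step 1, the sketch below pulls EC numbers out of the COMPND records of a PDB-format file and maps them to chain IDs. It is not the BioLiP pipeline itself: the function name parse_compnd_ec and the sample records are made up for this example, and it assumes the standard fixed-column COMPND layout (specification text in columns 11-80, tokens separated by semicolons).

```python
def parse_compnd_ec(pdb_text):
    """Map each chain ID to the EC numbers listed for its molecule in COMPND records."""
    # Columns 11-80 of every COMPND line hold the specification text.
    spec = " ".join(
        line[10:80].strip()
        for line in pdb_text.splitlines()
        if line.startswith("COMPND")
    )
    molecules = []
    for token in spec.split(";"):
        key, sep, value = token.partition(":")
        if not sep:
            continue  # empty or malformed token
        key = key.strip().upper()
        values = [v.strip() for v in value.split(",") if v.strip()]
        if key == "MOL_ID":
            # A new MOL_ID starts a new molecule block.
            molecules.append({"chains": [], "ec": []})
        elif molecules and key == "CHAIN":
            molecules[-1]["chains"] = values
        elif molecules and key == "EC":
            molecules[-1]["ec"] = values
    chain_to_ec = {}
    for molecule in molecules:
        for chain in molecule["chains"]:
            chain_to_ec[chain] = list(molecule["ec"])
    return chain_to_ec


# Illustrative records only; they are not copied from a specific PDB entry.
SAMPLE_COMPND = "\n".join([
    "COMPND    MOL_ID: 1;",
    "COMPND   2 MOLECULE: TRYPSIN;",
    "COMPND   3 CHAIN: A;",
    "COMPND   4 EC: 3.4.21.4;",
    "COMPND   5 MOL_ID: 2;",
    "COMPND   6 MOLECULE: INHIBITOR PEPTIDE;",
    "COMPND   7 CHAIN: I;",
])

print(parse_compnd_ec(SAMPLE_COMPND))
```

Chains without an EC token get an empty list, which matches the "(if any)" wording above. Molecule names that themselves contain a colon or semicolon would need more careful parsing than this sketch provides.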
Step 2: ligands, which are defined as small molecules, are extracted from the PDB file. Three kinds of ligands are collected in the BioLiP database: molecules from the HETATM records (excluding water and modified residues); small DNA/RNA; and peptides with fewer than 30 residues. The binding affinity (if any) for each ligand is taken from the original literature and from the Binding MOAD, PDBbind-CN, and BindingDB databases.
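The selection rules in Step 2 can be sketched roughly as follows. This is an assumption-laden illustration, not BioLiP's actual code: the names collect_ligand_candidates, fake_record, WATER_NAMES, and MAX_PEPTIDE_LENGTH are hypothetical, modified residues are identified here via MODRES records, and chains with fewer than 30 residues are simply flagged as short chains without distinguishing peptides from DNA/RNA.

```python
from collections import defaultdict

WATER_NAMES = {"HOH", "WAT", "DOD"}  # assumed set of water residue names
MAX_PEPTIDE_LENGTH = 30              # peptides must have fewer residues than this


def collect_ligand_candidates(pdb_lines):
    """Return (small_molecules, short_chains) from PDB-format lines.

    Residues are keyed as (chain ID, residue name, residue number, insertion code).
    """
    modified = set()                      # residues declared in MODRES records
    polymer_residues = defaultdict(set)   # chain ID -> residues seen in ATOM records
    hetero, seen = [], set()              # HETATM residues in order of first appearance
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "MODRES":
            modified.add((line[16:17], line[12:15].strip(),
                          line[18:22].strip(), line[22:23].strip()))
        elif record in ("ATOM", "HETATM"):
            key = (line[21:22], line[17:20].strip(),
                   line[22:26].strip(), line[26:27].strip())
            if record == "ATOM":
                polymer_residues[key[0]].add(key)
            elif key not in seen:
                seen.add(key)
                hetero.append(key)
    # Keep HETATM residues that are neither water nor modified residues.
    small_molecules = [k for k in hetero
                       if k[1] not in WATER_NAMES and k not in modified]
    # Flag polymer chains shorter than the peptide cutoff.
    short_chains = sorted(chain for chain, residues in polymer_residues.items()
                          if len(residues) < MAX_PEPTIDE_LENGTH)
    return small_molecules, short_chains


def fake_record(record, res_name, chain, res_seq):
    """Build a minimal fixed-column ATOM/HETATM line (illustration only)."""
    return f"{record:<6}{1:>5}  X   {res_name:>3} {chain}{res_seq:>4}"


# Made-up structure: a 40-residue receptor (chain A), a 5-residue peptide (chain B),
# one modified residue (MSE), one ATP, and one water. "XXXX" is a placeholder ID.
lines = ["MODRES XXXX MSE A   41  MET"]
lines += [fake_record("ATOM", "ALA", "A", i) for i in range(1, 41)]
lines += [fake_record("HETATM", "MSE", "A", 41)]
lines += [fake_record("ATOM", "GLY", "B", i) for i in range(1, 6)]
lines += [fake_record("HETATM", "ATP", "A", 201),
          fake_record("HETATM", "HOH", "A", 301)]

small_molecules, short_chains = collect_ligand_candidates(lines)
print(small_molecules)
print(short_chains)
```

In this toy input only ATP survives the HETATM filter, and chain B is the only chain under the length cutoff. A real implementation would also need to group multi-residue ligands and separate nucleic acid chains from peptides, which the sketch does not attempt.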
Step 3: each ligand is passed through a composite procedure, combining automated and manual checks, to assess its biological relevance. This procedure is illustrated in the right panel of the figure below.
For more details, please refer to the reference at the bottom of this page.