Glossary

A) Ligand-Based Target Prediction

It is based on the hypothesis that similar ligands have similar biological activity. A target protein can be linked to the molecule in question by comparing the query molecule with the set of known active ligands of that target, in terms of structural similarity or chemistry.
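
As a rough illustration of the idea, the sketch below assigns a query molecule to the target whose known active ligand is most similar to it. It is a minimal sketch, not a reference implementation: each molecule is assumed to be represented as a set of "on" feature indices, the Tanimoto coefficient is just one possible similarity measure, and the target names and fingerprints are made up for the example.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient between two sets of 'on' fingerprint features."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def predict_target(query: set, known_actives: dict) -> tuple:
    """Return (target, score) for the target owning the most similar known ligand."""
    best_target, best_score = None, -1.0
    for target, ligands in known_actives.items():
        for ligand_fp in ligands:
            score = tanimoto(query, ligand_fp)
            if score > best_score:
                best_target, best_score = target, score
    return best_target, best_score


# Hypothetical data: feature indices are arbitrary placeholders.
known_actives = {
    "target_A": [{1, 2, 3, 4}, {1, 2, 5}],
    "target_B": [{7, 8, 9}, {6, 7, 8, 10}],
}
query = {1, 2, 3, 6}

target, score = predict_target(query, known_actives)
print(target, score)
```

In practice the ranked list of targets, rather than only the top hit, is usually of interest.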

B) Structure-Based Target Prediction

It takes into account the structure of the protein. Typically, the molecule in question is docked into the active site of each protein in a set of candidate targets, and a target is assigned to the molecule based on the docking score and the interaction pattern.
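
The selection step can be sketched as a simple ranking of docking scores. This is only an illustration under stated assumptions: the scores are invented, the protein names are placeholders, and lower (more negative) scores are assumed to indicate better binding, which is a common convention but not universal across docking programs. The interaction pattern is not modelled here.

```python
def rank_targets(docking_scores: dict) -> list:
    """Rank targets best-first, assuming lower (more negative) scores are better."""
    return sorted(docking_scores.items(), key=lambda item: item[1])


# Hypothetical docking scores for one query molecule against three proteins.
docking_scores = {"protein_A": -6.2, "protein_B": -9.1, "protein_C": -7.4}

ranking = rank_targets(docking_scores)
best_target = ranking[0][0]
print(best_target)
```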

C) Descriptor

A numerical value representing some property of a molecule, e.g., molecular weight.
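
For instance, molecular weight can be computed from the atom counts of a molecule. The sketch below assumes the molecule is given as a dictionary of element counts and uses rounded standard atomic masses for a few elements only; the function name is chosen for illustration.

```python
# Rounded standard atomic masses in g/mol (only a few elements, for illustration).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molecular_weight(atom_counts: dict) -> float:
    """Sum atomic masses weighted by how many atoms of each element are present."""
    return sum(ATOMIC_MASS[element] * count for element, count in atom_counts.items())


ethanol = {"C": 2, "H": 6, "O": 1}  # C2H6O
mw = molecular_weight(ethanol)
print(round(mw, 3))  # approximately 46.07 g/mol
```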

D) Fingerprint or Molecular Representation

A set of numbers, or a vector, describing the structure or some properties of a molecule. The vector can be binary, where 0 indicates the absence and 1 the presence of a certain feature in the molecule. A fingerprint can also be constructed by collecting a set of numerical descriptors (e.g., MQN, the Molecular Quantum Numbers).
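
A binary fingerprint can be sketched as follows. The feature list is a made-up example and much shorter than in any real fingerprint; it is assumed that the features present in the molecule have already been detected by some other means.

```python
# Hypothetical feature dictionary; real fingerprints use far more features.
FEATURES = ["aromatic_ring", "hydroxyl", "amine", "carboxylic_acid", "halogen"]


def binary_fingerprint(present_features: set) -> list:
    """Return a 0/1 vector with one position per entry in FEATURES."""
    return [1 if feature in present_features else 0 for feature in FEATURES]


# A molecule with an aromatic ring and a hydroxyl group (e.g., a phenol).
fp = binary_fingerprint({"aromatic_ring", "hydroxyl"})
print(fp)  # [1, 1, 0, 0, 0]
```

A descriptor-based fingerprint has the same shape, except that each position holds a count or other numerical value instead of 0 or 1.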

E) Similarity Measures

Mathematical functions that quantify how similar two molecules are, usually computed from their fingerprints or descriptors.
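
One widely used similarity measure for binary fingerprints is the Tanimoto coefficient: the number of features present in both molecules divided by the number of features present in at least one of them. The sketch below assumes both fingerprints are 0/1 lists of equal length; the example vectors are arbitrary.

```python
def tanimoto(fp_a: list, fp_b: list) -> float:
    """Tanimoto coefficient between two equal-length binary fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    return both / either if either else 0.0


fp_a = [1, 1, 0, 1, 0]
fp_b = [1, 0, 0, 1, 1]
similarity = tanimoto(fp_a, fp_b)
print(similarity)  # 0.5: two shared features out of four present in either
```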

F) Distances

Mathematical functions that quantify how far apart two molecules are in descriptor or fingerprint space. Distance is inversely related to similarity: the smaller the distance, the more similar the molecules.
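
As an example, the city-block (Manhattan) distance sums the absolute differences between two descriptor vectors. The sketch below also shows one possible way to turn a distance into a similarity, 1 / (1 + distance); this conversion is an assumption made for illustration, and other conversions exist. The descriptor values are invented.

```python
def city_block_distance(vec_a: list, vec_b: list) -> float:
    """Sum of absolute differences between two equal-length descriptor vectors."""
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(vec_a, vec_b))


def distance_to_similarity(distance: float) -> float:
    """Map a distance in [0, inf) to a similarity in (0, 1]; 0 distance gives 1."""
    return 1.0 / (1.0 + distance)


mol_a = [3, 1, 4]  # hypothetical count descriptors
mol_b = [1, 1, 6]
distance = city_block_distance(mol_a, mol_b)
print(distance, distance_to_similarity(distance))  # 4 0.2
```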

G) ChEMBL database

It is a publicly available database (https://www.ebi.ac.uk/chembl/) that collects information on biologically active molecules reported in the literature. At the time of writing, it contains 1.4 million distinct compounds and more than 10,000 target proteins.