Compounds were selected for inclusion in the database according to the definitions provided in the "Glossary and Key Sources box" of the paper (ref. to be added soon) and the key inclusion and exclusion criteria listed in the table below. A first list of compounds was obtained from the DrugBank website, which contains only FDA-approved drugs, using the keyword "peptides" and the filter "approved". This search returned 1682 compounds, of which only 89 matched the prescribed criteria and were included in the database. An additional 16 peptides approved in other markets (e.g. Europe or Japan) were found on the websites of the EMA, the Pharmaceuticals and Medical Devices Agency, and Drug Central.
In general, peptides were included in the database if they met the criteria initially established and listed in the table, giving a final repository of 105 compounds. To provide each peptide with a complete profile covering terminal half-life, protein binding, therapeutic indications, and routes of administration, specific searches were carried out using the generic name of the individual peptide, first in DrugBank, then in Google Scholar, the National Center for Advancing Translational Sciences web page, Drugs.com, and pharmaceutical companies' websites.
- Lower length limit: two amino acids linked together by an amide bond.
- Upper length limit: less than 50 amino acids and molar mass less than 5000 g/mol.*
- Peptides conjugated to other molecules are included, as long as they meet the other inclusion criteria (especially the molar mass limit, so antibody-drug conjugates are excluded).
- Only non-insulin peptide drugs are included.
- Only peptides for human use are included.
- Theragnostic and diagnostic peptides are included.
- Peptides are included if approved in at least one of the main pharmaceutical market areas (North America, Western Europe, and Japan) or in one of the key countries in each region (United States, Germany, France, UK, Italy, Spain, Greece, the Netherlands, and Japan).
- In the case of a mixture of components, we considered the main component (e.g., Gramicidin D, > 80%) as the only one present, for ease of calculation and classification.
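The criteria above can be read as a simple filter. The following is a minimal Python sketch under stated assumptions: the `Candidate` record, its field names, and the region labels are hypothetical and are not part of the database, and the example records are invented for illustration only. The footnote marked by the asterisk on the upper length limit is not reproduced here, so the sketch applies the limits literally.

```python
from dataclasses import dataclass, field

# Hypothetical region labels: the criteria name these market areas,
# but the coding scheme used here is an assumption.
MAIN_MARKETS = {"North America", "Western Europe", "Japan"}


@dataclass
class Candidate:
    """Hypothetical record for one candidate compound."""
    name: str
    n_amino_acids: int
    molar_mass: float            # g/mol
    is_insulin: bool = False
    human_use: bool = True
    approved_in: set = field(default_factory=set)


def meets_criteria(c):
    """Apply the length, mass, insulin, human-use, and approval criteria."""
    return (
        2 <= c.n_amino_acids < 50        # at least a dipeptide, fewer than 50 residues
        and c.molar_mass < 5000          # molar mass below 5000 g/mol
        and not c.is_insulin             # insulins are excluded
        and c.human_use                  # human use only
        and bool(c.approved_in & MAIN_MARKETS)  # approved in at least one main market
    )


# Invented example records, not taken from the database.
examples = [
    Candidate("hypothetical-A", 13, 1620.0, approved_in={"Japan"}),
    Candidate("hypothetical-B", 51, 5800.0, approved_in={"North America"}),
    Candidate("hypothetical-C", 9, 1000.0),
]
for c in examples:
    print(c.name, meets_criteria(c))
```

Approval is reduced to a set of region labels for brevity; mapping the key countries listed in the criteria onto their regions would be a separate step.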
SMILES codes were collected from PubChem (see each peptide's profile for the SMILES code used) and used to calculate the peptide molar mass values (g/mol) on ChemAxon's Chemicalize platform.
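As a rough cross-check of a molar mass value, the mass can also be computed from a molecular formula using standard atomic weights. The sketch below is not the Chemicalize calculation used for the database, and it starts from a molecular formula rather than a SMILES code. The function name `molar_mass` and the small atomic-weight table are assumptions made for illustration.

```python
import re

# Abridged standard atomic weights (g/mol) for elements common in peptides.
ATOMIC_MASS = {
    "C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
    "S": 32.06, "P": 30.974, "Cl": 35.45,
}


def molar_mass(formula):
    """Sum atomic weights for a flat formula such as 'C4H8N2O3'.

    Parentheses, charges, and isotopes are not handled; an element
    missing from ATOMIC_MASS raises KeyError.
    """
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


# Glycylglycine, the simplest dipeptide (two amino acids, one amide bond).
mass = molar_mass("C4H8N2O3")
print(round(mass, 2))   # about 132.12 g/mol
print(mass < 5000)      # below the upper molar mass limit
```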
The authors individually analysed the peptides' chemical structures and classified their constitutional members. In agreement with the definitions given, each peptide was divided into natural and non-natural amino acids by fragmenting the backbone in a manner consistent with amide bond retrosynthetic analysis. Non-amino acidic moieties, previously identified as modifications, were then detached from the amino acid to which they are linked. The figure below shows an example of this procedure applied to the peptide daptomycin, along with the resulting building blocks.
Daptomycin and its constitutional members. Kyn (kynurenine), Orn (ornithine), 3-Me-Glu (3-methyl-glutamic acid).
Due to the structural complexity of glycopeptide antibiotics (dalbavancin, telavancin, oritavancin, and teicoplanin), there was no obvious way to rationally divide and classify the building blocks as described above. Hence, the sequences of these peptides are not shown (N.A.). Similarly, the high complexity of multicyclic peptides did not allow a single defined cycle to be identified and, therefore, the members of each cycle were not counted.
Each constitutional member was classified as polar, acidic, basic, non-polar aliphatic, or aromatic based on its structural characteristics. For the natural amino acid residues, the designations polar, acidic, basic, non-polar aliphatic, or aromatic were derived from literature precedent and generally reflect the nature of the side chain. Since non-natural amino acid members form an amide backbone in the same way as natural amino acids, they were classified following the same principle. In contrast, members of the modifications group do not conform to the typical structure of an amino acid, so their classification was based not only on their entire structure but also on the way they are conjugated to the peptide. In the example of daptomycin, decanoic acid was classified as an aliphatic modification regardless of its carboxylic acid moiety, since this acidic functional group is used to bind to the peptide N-terminus and its contribution to the final polarity is not relevant. The complete list of non-natural amino acids and modifications, together with their polarity classifications, can be found in the dedicated section of the website.
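The side-chain classification of natural amino acids can be sketched as a lookup table plus a counter. The grouping below follows one common textbook convention and is an assumption: the authors' own assignments come from literature precedent and may differ for borderline residues such as Gly, Cys, Tyr, or His. Non-natural amino acids and modifications are deliberately left as `"unclassified"` here, because their classifications are given on the website and are not reproduced in this sketch. The names `SIDE_CHAIN_CLASS` and `composition` are hypothetical.

```python
from collections import Counter

# Assumed textbook grouping of the 20 natural amino acids (three-letter codes).
_GROUPS = {
    "non-polar aliphatic": ["Gly", "Ala", "Val", "Leu", "Ile", "Pro", "Met"],
    "aromatic": ["Phe", "Tyr", "Trp"],
    "polar": ["Ser", "Thr", "Cys", "Asn", "Gln"],
    "acidic": ["Asp", "Glu"],
    "basic": ["Lys", "Arg", "His"],
}

# Invert the grouping into a residue -> class lookup.
SIDE_CHAIN_CLASS = {
    residue: cls for cls, members in _GROUPS.items() for residue in members
}


def composition(residues):
    """Count the constitutional members of a peptide by side-chain class."""
    return Counter(SIDE_CHAIN_CLASS.get(r, "unclassified") for r in residues)


# Arbitrary residue list for illustration, not a real peptide sequence.
print(composition(["Gly", "Asp", "Lys", "Trp", "Orn"]))
```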
IV | intravenous |
IM | intramuscular |
SC | subcutaneous |
IP | intrapleural |
IC | intracavitary |
IA | intraarterial |
SL | sublingual |
US | United States of America |
EU | Europe |
UK | United Kingdom |
JP | Japan |
FDA | Food and Drug Administration |
EMA | European Medicines Agency |
PMDA | Pharmaceuticals and Medical Devices Agency |
AIFA | Agenzia Italiana del Farmaco (Italian Medicines Agency) |
MHRA | Medicines and Healthcare products Regulatory Agency |
CADTH | Canadian Agency for Drugs and Technologies in Health |