Data Sources
Galen integrates 28 biomedical databases into a unified knowledge graph. Each database contributes a specific type of evidence, ranging from drug bioactivity to gene essentiality to clinical genomics.
ChEMBL
Drug activity: Bioactivity measurements for 2.4M+ compounds
cBioPortal
Clinical genomics: Somatic mutations across 498 cancer studies
DepMap
Gene essentiality: CRISPR knockout screens across 1,000+ cell lines
GDSC
Drug sensitivity: Drug response across 1,000+ cell lines
GTEx
Gene expression: Normal tissue gene expression across 54 tissues
STRING
Protein interactions: Protein-protein interaction networks
OncoKB
Clinical annotation: MSK precision oncology gene annotations
DrugComb
Drug combinations: Drug combination synergy measurements
COSMIC
Somatic mutations: Catalogue of somatic mutations in cancer
ClinVar
Variant classification: Clinical significance of genetic variants
PharmGKB
Pharmacogenomics: Drug-gene-disease relationships
IntOGen
Driver genes: Cancer driver gene predictions across tumor types
MSigDB
Gene sets: Molecular Signatures Database gene set collections
DrugBank
Drug encyclopedia: Comprehensive drug target and interaction data
CTD
Chemical-gene-disease: Comparative Toxicogenomics Database
BioGRID
Genetic interactions: Genetic and physical interactions including synthetic lethality
DisGeNET
Gene-disease: Gene-disease association database
TCGA
Cancer genomics: The Cancer Genome Atlas multi-omic data
UniProt
Protein function: Universal protein function annotations
Reactome
Pathways: Curated biological pathway database
GO
Gene ontology: Gene Ontology functional annotations
PubMed
Literature: Biomedical literature mining (35M+ abstracts)
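
To illustrate how a catalog like the one above can be organized by evidence type, here is a minimal Python sketch. It is not Galen's actual implementation or API: the DataSource class, the SOURCES list, and the index_by_evidence and sources_for helpers are hypothetical names introduced for illustration, and only a handful of the listed databases are included.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """One catalog entry (hypothetical structure, not Galen's schema)."""
    name: str
    evidence_type: str
    description: str


# A small subset of the sources listed above, copied from the catalog text.
SOURCES = [
    DataSource("ChEMBL", "Drug activity",
               "Bioactivity measurements for 2.4M+ compounds"),
    DataSource("DepMap", "Gene essentiality",
               "CRISPR knockout screens across 1,000+ cell lines"),
    DataSource("GDSC", "Drug sensitivity",
               "Drug response across 1,000+ cell lines"),
    DataSource("STRING", "Protein interactions",
               "Protein-protein interaction networks"),
    DataSource("COSMIC", "Somatic mutations",
               "Catalogue of somatic mutations in cancer"),
]


def index_by_evidence(sources):
    """Group source names by lower-cased evidence type."""
    index = defaultdict(list)
    for source in sources:
        index[source.evidence_type.lower()].append(source.name)
    return dict(index)


def sources_for(evidence_type, sources=SOURCES):
    """Return the names of sources that provide the given evidence type."""
    return index_by_evidence(sources).get(evidence_type.lower(), [])
```

Calling sources_for("Gene essentiality") on this subset returns the DepMap entry, and an evidence type that no source provides returns an empty list. Keying the index on evidence type rather than database name reflects the way the catalog itself is framed: each source is described by the kind of evidence it contributes.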