Galen API

The Cancer Knowledge Graph

At the core of Galen is a knowledge graph containing 280,000+ entities and 4.75M+ relationships, built from 28 biomedical databases and continuously updated by an autonomous research AI.

Entity types

Every node in the knowledge graph is a typed entity. The major entity types include:

gene: Protein-coding genes (EGFR, KRAS, TP53, BRCA1...)
drug: Approved drugs and investigational compounds
disease: Cancer types and subtypes (NSCLC, AML, TNBC...)
pathway: Biological pathways (MAPK, PI3K/AKT, WNT...)
mutation: Specific variants (EGFR L858R, BRAF V600E...)
protein: Protein products and complexes
cell_line: Cancer cell lines from DepMap and GDSC
clinical_trial: Active and completed trials
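
To make the idea of typed nodes concrete, here is a minimal sketch in Python. It is not Galen's actual schema or client code: the EntityType and Entity names and the two fields on Entity are assumptions made for illustration. Only the eight type strings come from the list above.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(str, Enum):
    """The entity types named on this page (the graph may define others)."""

    GENE = "gene"
    DRUG = "drug"
    DISEASE = "disease"
    PATHWAY = "pathway"
    MUTATION = "mutation"
    PROTEIN = "protein"
    CELL_LINE = "cell_line"
    CLINICAL_TRIAL = "clinical_trial"


@dataclass(frozen=True)
class Entity:
    # Hypothetical fields; the real node schema may carry more metadata.
    name: str
    type: EntityType


# A few nodes built from examples mentioned in the type descriptions.
nodes = [
    Entity("EGFR", EntityType.GENE),
    Entity("EGFR L858R", EntityType.MUTATION),
    Entity("NSCLC", EntityType.DISEASE),
]

# Group node names by their type string.
by_type = {}
for node in nodes:
    by_type.setdefault(node.type.value, []).append(node.name)

print(by_type)
```

Using an enum rather than free-form strings means a misspelled type such as "cellline" fails immediately instead of silently creating a new category.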

Relationships

Relationships connect entities with typed, directed edges. Each relationship carries provenance (which database it came from), a confidence score, and a Pearl Causal Hierarchy layer annotation (L1, L2, or L3).

Common relationship types include targets, inhibits, activates, associated_with, mutated_in, and co_occurs_with; this list is not exhaustive.
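
One way to picture a typed, directed edge with its three annotations is the sketch below. The Relationship class and its field names are hypothetical, the 0 to 1 confidence range is an assumption (this page does not state the scale), and the confidence and layer values in the example are placeholders rather than real graph data.

```python
from dataclasses import dataclass
from typing import Tuple

# Layer labels as named on this page.
PCH_LAYERS = {"L1", "L2", "L3"}


@dataclass(frozen=True)
class Relationship:
    # Hypothetical field names; direction runs from source to target.
    source: str
    relation: str
    target: str
    provenance: Tuple[str, ...]  # contributing data sources, e.g. "chembl_36"
    confidence: float            # assumed to lie in [0, 1]
    pch_layer: str               # "L1", "L2", or "L3"

    def __post_init__(self):
        if self.pch_layer not in PCH_LAYERS:
            raise ValueError(f"unknown PCH layer: {self.pch_layer!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


# Placeholder values for illustration only.
edge = Relationship(
    source="gefitinib",
    relation="inhibits",
    target="EGFR",
    provenance=("chembl_36",),
    confidence=0.9,
    pch_layer="L1",
)

# Edges are directed: swapping source and target gives a different edge.
reversed_edge = Relationship("EGFR", "inhibits", "gefitinib", ("chembl_36",), 0.9, "L1")
print(edge == reversed_edge)
```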

Provenance

Every relationship in the knowledge graph carries provenance metadata indicating which data source(s) contributed the evidence. Provenance values like chembl_36, depmap_crispr, or cbioportal_local tell you exactly where the evidence originated, enabling reproducibility and cross-validation.
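
As a rough illustration of how provenance supports cross-validation, the sketch below groups edges by their (source, relation, target) triple and keeps those backed by at least two distinct sources. The row layout, the function names, and the example_source tag are all hypothetical, and the rows are made up for the example rather than taken from the graph.

```python
from collections import defaultdict

# Illustrative rows only: (source, relation, target, provenance tag).
rows = [
    ("gefitinib", "targets", "EGFR", "chembl_36"),
    ("EGFR", "mutated_in", "NSCLC", "cbioportal_local"),
    ("EGFR", "mutated_in", "NSCLC", "example_source"),  # hypothetical tag
]


def sources_by_edge(rows):
    """Map each (source, relation, target) triple to its set of provenance tags."""
    support = defaultdict(set)
    for source, relation, target, provenance in rows:
        support[(source, relation, target)].add(provenance)
    return dict(support)


def cross_validated(rows, min_sources=2):
    """Return triples supported by at least `min_sources` distinct sources."""
    support = sources_by_edge(rows)
    return sorted(edge for edge, tags in support.items() if len(tags) >= min_sources)


support = sources_by_edge(rows)
confirmed = cross_validated(rows)
print(confirmed)
```

Counting distinct tags rather than rows keeps a single database that reports the same edge twice from looking like independent confirmation.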

Next: Pearl Causal Hierarchy

Every relationship has an L1/L2/L3 annotation. Learn what these mean and why they matter.
