Interpretable deep learning for structure-based molecular discovery.

We build interpretable models for structure-based drug discovery, connecting deep learning predictions to mechanistic evidence, with calibrated uncertainty and explicit attribution so teams can distinguish model limitations from biological signal.

Protein-ligand complex: structure-based binding site visualization

Bioforager

Enrichment & prioritization workflows

Multi-stage computational funneling from large enumerated libraries to nominated chemotypes; each gate emits scores, rationales, and uncertainty so medicinal chemistry can trace promotions and deprioritizations before synthesis.
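
As a rough illustration of what a gate that "emits scores, rationales, and uncertainty" could look like, the sketch below passes compounds through ordered gates and keeps an audit record for every promotion and deprioritization. The names `Gate`, `GateDecision`, and `run_funnel`, the promotion rule (score minus uncertainty must clear a threshold), and all numbers are assumptions made for this example, not Bioforager's actual interface or logic.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class GateDecision:
    """One traceable record: what a gate decided about a compound and why."""
    compound_id: str
    gate: str
    score: float
    uncertainty: float
    promoted: bool
    rationale: str


@dataclass
class Gate:
    """A hypothetical funnel stage: a scorer plus a promotion threshold."""
    name: str
    scorer: Callable[[str], Tuple[float, float]]  # returns (score, uncertainty)
    threshold: float

    def evaluate(self, compound_id: str) -> GateDecision:
        score, sigma = self.scorer(compound_id)
        # Assumed conservative rule: the lower bound must clear the threshold.
        lower = score - sigma
        promoted = lower >= self.threshold
        verdict = "promoted" if promoted else "deprioritized"
        rationale = (
            f"{verdict}: score {score:.2f} minus uncertainty {sigma:.2f} "
            f"gives {lower:.2f} against threshold {self.threshold:.2f}"
        )
        return GateDecision(compound_id, self.name, score, sigma, promoted, rationale)


def run_funnel(compounds, gates):
    """Pass compounds through gates in order; return survivors and the audit log."""
    log = []
    survivors = list(compounds)
    for gate in gates:
        decisions = [gate.evaluate(c) for c in survivors]
        log.extend(decisions)
        survivors = [d.compound_id for d in decisions if d.promoted]
    return survivors, log


# Toy scorers with made-up (score, uncertainty) pairs, for illustration only.
toy_docking = {"cpd-1": (0.9, 0.1), "cpd-2": (0.6, 0.3), "cpd-3": (0.8, 0.05)}
toy_affinity = {"cpd-1": (0.7, 0.1), "cpd-3": (0.5, 0.2)}

gates = [
    Gate("docking", lambda c: toy_docking[c], threshold=0.5),
    Gate("affinity", lambda c: toy_affinity[c], threshold=0.5),
]
survivors, log = run_funnel(["cpd-1", "cpd-2", "cpd-3"], gates)

for decision in log:
    print(decision.compound_id, decision.gate, decision.rationale)
```

Because every decision is logged rather than only the survivors, a chemist can look up why a compound such as the toy `cpd-2` was dropped at the first gate.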

Bioforager: homology map across related binding sites

Genesys model family

Multimodal foundation models for biomolecular systems

Joint representation of proteins, peptides, small molecules, and analogs under a single architectural prior, with data and compute efficiency suitable for iteration, and inspection hooks aligned to structural and dynamical validation.

Architecture

Unified encoding of protein, peptide, and small-molecule chemotypes including close analogs
Multi-Unimodal (MUM) training for incremental extension across modalities without full retraining
Latent space organised by protein and domain class

Sample & compute efficiency

Fewer labelled examples than typical benchmark regimes
Lower training and inference budgets at matched task fidelity
Emphasis on out-of-distribution robustness for novel scaffolds

Inspection & attribution

Intermediate representations exposed for audit
Attention and layer-wise probes; coupling to conformational ensembles where applicable
Confidence and failure-mode decomposition for downstream QA
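
One simple check a downstream QA step might run on reported confidence is empirical interval coverage: if a model states Gaussian uncertainties, roughly 90% of measured values should fall inside its 90% intervals. The sketch below is a generic calibration check under that Gaussian assumption. The function name `interval_coverage` and the toy numbers are hypothetical and not taken from the Genesys models.

```python
from statistics import NormalDist


def interval_coverage(means, sigmas, observed, nominal=0.9):
    """Fraction of observations inside the central `nominal` predictive interval.

    Assumes each prediction is Gaussian with the given mean and sigma.
    Coverage well below `nominal` suggests overconfidence; well above
    suggests the stated uncertainties are too wide.
    """
    # Half-width of the central interval in units of sigma (about 1.645 for 90%).
    z = NormalDist().inv_cdf(0.5 + nominal / 2.0)
    hits = sum(abs(y - m) <= z * s for m, s, y in zip(means, sigmas, observed))
    return hits / len(observed)


# Made-up predictions and measurements, for illustration only.
means = [0.0, 0.0, 0.0, 0.0]
sigmas = [1.0, 1.0, 1.0, 1.0]
observed = [0.5, -1.0, 1.5, 3.0]

coverage = interval_coverage(means, sigmas, observed, nominal=0.9)
print(f"empirical coverage at nominal 0.90: {coverage:.2f}")
```

With so few points the estimate is noisy; in practice the check would be run over a held-out set and at several nominal levels.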

Circe

Model introspection for discovery teams

Agentic workflows that interrogate discriminative and generative components: where uncertainty concentrates, which features drive a prediction, and whether evidence is consistent across poses, homologs, and auxiliary readouts, prior to IND-enabling spend.

Circe structure-based output visualization

Scientific & operational differentiation

Design choices aimed at reducing translational risk, analogous to platform companies that pair large-scale inference with measurable feedback loops, but with interpretability as a first-class output.

Epistemic vs ontic error. Decompose apparent failures into model misspecification, dataset shift, and scoring artefacts versus biophysical mechanisms (off-target engagement, kinetics, ADME liabilities).
Ensemble and perturbed complexes. Stress-test docked or predicted poses under conformational and side-chain variation; rank compounds that are stable across perturbations.
Homology-aware selectivity. Explicit mapping of predicted affinity and confidence across related binding sites to anticipate polypharmacology and anti-target risk.
Traceable funnel decisions. Each enrichment stage exports rationales tied to inputs and model state, reducing undiagnosed false negatives and false positives in the medchem loop.
Retrospective and prospective validation. Benchmark retrospectively against clinical and preclinical outcomes where labels exist; run prospective studies with partners on novel targets and chemotypes.
Documentation for governance. Structured narratives suitable for internal QA and external scrutiny when predictions and outcomes diverge.
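
To make the "ensemble and perturbed complexes" idea concrete, the sketch below ranks compounds by their mean score across perturbed complexes minus a penalty on the spread, so that a compound whose score depends on one favourable pose falls behind one that scores consistently. The penalised-mean criterion, the function name `stability_rank`, and the scores are assumptions for illustration; the source does not specify the ranking rule actually used.

```python
import statistics


def stability_rank(scores_by_compound, penalty=1.0):
    """Rank compounds by mean score penalised by spread across perturbations.

    `scores_by_compound` maps a compound id to a list of scores (higher is
    better), one per perturbed complex. Returns the ids ordered from most
    to least robust, plus the robust score for each id.
    """
    robust = {}
    for cid, scores in scores_by_compound.items():
        mean = statistics.fmean(scores)
        spread = statistics.pstdev(scores)  # population std. dev. over the ensemble
        robust[cid] = mean - penalty * spread
    ranking = sorted(robust, key=robust.get, reverse=True)
    return ranking, robust


# Made-up scores over four perturbed complexes, for illustration only.
toy_scores = {
    "cpd-A": [8.0, 8.0, 8.0, 8.0],    # stable across perturbations
    "cpd-B": [10.0, 6.0, 10.0, 6.0],  # same mean as cpd-A, but pose-sensitive
    "cpd-C": [7.0, 7.0, 7.0, 7.0],    # lower mean, stable
}

ranking, robust = stability_rank(toy_scores)
print(ranking)
```

Here `cpd-B` has the same mean as `cpd-A` but ranks last because its score swings with the perturbation; the `penalty` weight controls how strongly that instability is punished.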

Computational platform coverage

Custom model and workflow integration for sponsor teams, from target validation through lead optimisation, in line with structure-enabled discovery programs.

Ramachandran ensemble: conformational sampling for target assessment

Target & structure assessment

Tractability and druggability framing for nominated targets
Mutation and isoform-aware pocket definition
Force-field parameterised conformational ensemble sampling
QM Hamiltonian-based algorithms to probe MD-relevant conformational change

Protein complex four-view poses in solvent for hit identification

Hit identification & optimisation

Affinity and pose prediction with uncertainty propagation
Exploration of analog series under SAR constraints
Physics-grounded interaction fingerprint scoring
Prioritisation for synthesis under interpretable criteria
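
A minimal example of uncertainty propagation for affinity: converting a predicted dissociation constant and its standard error into a binding free energy with a first-order (delta-method) error bar, using the standard relation ΔG = RT ln(Kd) with Kd expressed relative to a 1 M standard state. This is one textbook instance of propagation, not a description of how the platform's models do it; the function name and the example values are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_molar, kd_sigma_molar, temperature_k=298.15):
    """Return (delta_G, sigma_delta_G) in kcal/mol from Kd and its std. error.

    delta_G = RT * ln(Kd / 1 M). First-order propagation gives
    sigma_delta_G = RT * sigma_Kd / Kd, which is only reasonable
    when sigma_Kd is not large compared with Kd.
    """
    rt = R_KCAL * temperature_k
    delta_g = rt * math.log(kd_molar)
    delta_g_sigma = rt * kd_sigma_molar / kd_molar
    return delta_g, delta_g_sigma


# Hypothetical prediction: Kd = 1 nM with a 50% relative standard error.
dg, dg_sigma = delta_g_from_kd(1e-9, 0.5e-9)
print(f"delta G = {dg:.2f} +/- {dg_sigma:.2f} kcal/mol")
```

Because the error in ΔG depends only on the relative error in Kd, a 50% uncertainty in Kd maps to the same ΔG uncertainty at any affinity, which is one reason ranking is often done on the free-energy or log scale.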

Lead nomination

Structure-aware shortlisting with explainable filters
Binding free energy decomposition across candidate series
Local structural perturbation analysis (R-group, linker)
Explicit handling of false-positive structural hypotheses

Publication: Multi-Unimodal Model (MUM) architecture, ISBI. IEEE Xplore.