A new scientific method.

Discovery Engine finds patterns in data that humans and agents miss.

Try Discovery Engine

How it works

Superhuman exploratory data analysis.

01

Upload

Drop in a tabular dataset and select your target variable. That's it – we do the rest.

02

Analyse

Discovery Engine fits neural networks to your data, then applies interpretability methods to extract the patterns they learned. All findings are validated on hold-out data, and contextualised with existing literature.

03

Discover

You get a ranked list of statistically significant patterns, with p-values, effect sizes, evidence, and context.
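
As a rough illustration of what checking a pattern on hold-out data can involve, the sketch below tests one hypothetical subgroup pattern on a synthetic held-out split and reports an effect size (Cohen's d) and a permutation p-value. The data, the function names, and the choice of test are assumptions made for illustration; this is not a description of Discovery Engine's internal method.

```python
import random
import statistics


def cohens_d(a, b):
    """Standardised mean difference between two samples, using the pooled SD."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_var ** 0.5


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        left, right = pooled[: len(a)], pooled[len(a):]
        if abs(statistics.fmean(left) - statistics.fmean(right)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Synthetic hold-out rows (hypothetical): the outcome is higher when x > 0.5.
rng = random.Random(42)
holdout = []
for _ in range(200):
    x = rng.random()
    outcome = rng.gauss(1.0 if x > 0.5 else 0.0, 1.0)
    holdout.append((x, outcome))

# Candidate pattern found elsewhere: "x > 0.5 is associated with a higher outcome".
inside = [y for x, y in holdout if x > 0.5]
outside = [y for x, y in holdout if x <= 0.5]

effect = cohens_d(inside, outside)
p_value = permutation_p_value(inside, outside)
print(f"effect size d={effect:.2f}, p={p_value:.4f}")
```

The point of the hold-out split is that the rows used to test the pattern played no part in finding it, so a small p-value here is not an artefact of the search itself.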


Publications

bioRxiv · 2025

Growth Cost and Transport Efficiency Tradeoffs Define Root System Optimization Across Varying Developmental Stages and Environments in Arabidopsis

Faizi, Mehta, Maida, Humphreys, Berrigan, McKee Reid, McCorkell, Tagade, Rumbelow, Showalter, Brent, Coroenne, Rigaud, Chandrasekhar, Navlakha, Martin, Pradal, Lee, Busch, Platre

bioRxiv · 2025

Automated Discovery of Patterns in T-Cell Receptor Physicochemical Signatures

Shams, Bishop, Mckee-Reid, Rumbelow

arXiv · 2025

Explaining Surface Layer Theory Departures in Marine Flux Profiles with Data-Driven Discovery

Foxabbott, Mckee-Reid, Cusick, McCorkell, Patel, Rumbelow, Rumbelow, Shams, Tagade, Hawbecker, Haupt

arXiv · 2025

Open Problems in Mechanistic Interpretability

Sharkey, Chughtai, Batson, Lindsey, Wu, Bushnaq, Goldowsky-Dill, Heimersheim, Ortega, Bloom, Biderman, Garriga-Alonso, Conmy, Nanda, Rumbelow, Wattenberg, Schoots, Miller, Michaud, Casper, Tegmark, Saunders, Bau, Todd, Geiger, Geva, Hoogland, Murfet, McGrath

AI 4 X Conference · 2025

Towards Data-Driven Scientific Discovery

Tagade, Mckee-Reid, McCorkell, Cusick, Sosa, Platre, Rumbelow, Shams

medRxiv · 2026

The Decline in Influenza Antibody Titers and Modifiers of Vaccine Immunity from over Ten Years of Serological Data

Fenoy, Plant, Xie, Ye, Tagade, Rumbelow, Einav


Pricing

Free for public data. Flexible for everything else.

Public analyses are free. For private data and deeper analysis, choose a plan that suits you.

Explorer

$0

/month

For open science.

10 credits/mo

  • Unlimited public analyses (data and reports published)
  • 10 credits/month for private analyses (no roll over)
  • Additional credits available to purchase
  • Standard processing queue

Get Started Free

Researcher

$49

/month

For individual researchers with proprietary data.

50 credits/mo (which roll over)

  • Unlimited public analyses (data and reports published)
  • 50 credits/month for private analysis (which roll over)
  • Additional credits available to purchase
  • Deep analysis for more comprehensive pattern search
  • Priority processing queue
  • Email support

Start Researcher

Most popular

Team

$199

/month

For research teams with proprietary data.

200 credits/mo (which roll over)

  • Unlimited public analyses (data and reports published)
  • 200 credits/month for private analysis (which roll over)
  • Additional credits available to purchase
  • Deep analysis for more comprehensive pattern search
  • Highest priority processing
  • Priority email support
  • Up to 5 seats

Start Team

Enterprise

Custom

For discovery at scale, dedicated compute, and custom integrations.

Unlimited credits

  • Everything in Team, plus:
  • Dedicated compute
  • Unlimited seats
  • Dedicated support

Talk to Us

API

Built for agents and developers.

Faster and cheaper than prompting for data analysis — and finds patterns that your agent would miss. Run Discovery Engine via API, Python SDK, or MCP. Skills included.

Python SDK

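import asyncio

from discovery import Engine


async def main() -> None:
    # Authenticate with your API key (elided here).
    engine = Engine(api_key="disco_...")
    # Upload the dataset and name the column to explain.
    result = await engine.discover(
        file="data.csv",
        target_column="outcome",
    )

    # Print only the patterns flagged as novel.
    for p in result.patterns:
        if p.novelty_type == "novel":
            print(p.description)


asyncio.run(main())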

Get started

Your data has more to tell you.

Upload a dataset and get ranked, validated discoveries in minutes. Free for public analyses — no credit card required.

Try Discovery Engine

Why not just use an LLM?

Language models inherit our assumptions.

Discovery Engine is systematic and data-first.

Like humans, LLMs only find patterns they can hypothesise in the first place – and the literature that informs those hypotheses is full of biases, errors, and irreproducible findings. As a result, most of the space of possible discoveries remains unexplored. By contrast, Discovery Engine searches for patterns systematically, without prior assumptions – and so surfaces insights that would otherwise remain hidden.

Language is lossy.

Language is a lossy abstraction over data, and valuable nuance is lost in aggregation. Scientific papers are an incomplete representation of the underlying observations. Discovery Engine finds patterns directly in the data, disregarding scientific narrative and the pressure to publish. It finds raw patterns in the numbers, not the story in the paper.

The pattern discovery API that agents call.

Discovery Engine gives AI agents a capability they can't replicate with prompting and pandas: validated, novel pattern discovery — interactions, thresholds, and subgroup effects — without requiring prior hypotheses. One API call, structured results, citations included.


FAQ

Common questions.

What's the difference between standard and deep analysis?

Standard analysis finds most patterns — and is powerful enough for novel discoveries. Deep analysis (available on paid plans) runs a more exhaustive process, finding more patterns and often surfacing further novel relationships.

What's the difference between public and private?

Public datasets and their results are visible to all users — great for open science and academic work. Private datasets and reports are only visible to you and your team, ideal for proprietary or pre-publication data.

What's a credit?

Credits are used for private analyses. Cost scales with dataset size — a typical 10K-row dataset uses 1–3 credits, while larger datasets use more. Public analyses do not require credits.

Can I buy more credits?

Yes. All users can purchase additional credits for private analyses at $1 per credit. Purchased credits never expire.
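
To make the arithmetic concrete, the sketch below estimates a monthly bill from the plan prices and credit allowances listed in the Pricing section plus the $1 top-up price. The `PLANS` table and `monthly_cost` function are hypothetical helper names, not part of any Discovery Engine SDK, and the sketch ignores credit roll-over and the Enterprise plan.

```python
# Prices (USD/month) and credit allowances as listed in the Pricing section.
PLANS = {
    "Explorer": {"price": 0, "credits": 10},
    "Researcher": {"price": 49, "credits": 50},
    "Team": {"price": 199, "credits": 200},
}
EXTRA_CREDIT_PRICE = 1  # USD per purchased credit


def monthly_cost(plan, credits_needed):
    """Subscription price plus any top-up credits beyond the plan allowance."""
    info = PLANS[plan]
    extra = max(0, credits_needed - info["credits"])
    return info["price"] + extra * EXTRA_CREDIT_PRICE


# Compare plans for a month that needs 80 credits of private analysis.
for plan in PLANS:
    print(plan, monthly_cost(plan, 80))
```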

What kind of data is supported?

We currently support tabular data up to 1GB, in CSV, TSV, Excel (.xlsx), JSON, Parquet, ARFF, and Feather formats, with timeseries and image support coming soon. For larger datasets or other modalities, please contact us.
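
A quick local check before uploading can catch an unsupported file early. The sketch below is a hypothetical helper, not part of the SDK: the extension set is inferred from the formats named above (only `.xlsx` is spelled out in the text), and it assumes "1GB" means 10^9 bytes.

```python
from pathlib import Path

# Extensions assumed from the listed formats; only .xlsx appears verbatim in the text.
SUPPORTED_EXTENSIONS = {".csv", ".tsv", ".xlsx", ".json", ".parquet", ".arff", ".feather"}
MAX_BYTES = 10**9  # assumption: "1GB" read as 10^9 bytes


def check_upload(path, size_bytes=None):
    """Return a list of problems; an empty list means the file looks uploadable."""
    p = Path(path)
    problems = []
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    # Read the size from disk unless the caller supplies it.
    size = p.stat().st_size if size_bytes is None else size_bytes
    if size > MAX_BYTES:
        problems.append(f"file is {size} bytes, above the {MAX_BYTES}-byte limit")
    return problems


print(check_upload("data.csv", size_bytes=5_000_000))
print(check_upload("scan.png", size_bytes=2 * 10**9))
```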

How long does an analysis take?

Most analyses complete in minutes to hours, depending on dataset size. Public analyses and free plans have lower priority in the queue, which may result in long wait times to begin processing when the engine is busy. Our paid plans offer priority processing with no wait time.

Can AI agents use Discovery Engine?

Yes. Discovery Engine is available as a Python SDK, MCP server, and REST API. Agents can sign up, manage billing, and run analyses entirely programmatically. The SDK returns structured results that agents can reason over directly, plus a shareable report URL for the human.


Talk to us

Have a dataset in mind? Let's find what's hiding in it.

Whether you're exploring public data or running enterprise-scale discovery, we'd love to hear from you.


Contact

Get in touch with our team.