organism-core preview

Open-source preview · Python · Apache 2.0

AI agents that earn trust before they get it.

organism-core researches its own success criteria before every action, validates them after, and grants autonomy based on a measurable track record.

The code on GitHub is an early, open preview — the 1.0 framework will be a commercial release.

01

Pre-action DoD research

Before any tool fires, the framework derives a Definition of Done from six prioritized sources with separate provenance tracking — not a vague prompt, but auditable criteria.
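As a rough illustration of what "prioritized sources with separate provenance tracking" could look like, here is a minimal sketch. It is not organism-core's actual API: the `Criterion` class, the `derive_dod` function, and the priority order are all assumptions made for this example. The source names follow the six sources listed in the quick-start comments further down the page.

```python
from dataclasses import dataclass

# Assumed priority order (highest first). The real ordering is not documented here.
SOURCE_PRIORITY = [
    "entity_profile",
    "lessons",
    "related_entities",
    "vector_search",
    "domain_patterns",
    "user_input",
]


@dataclass(frozen=True)
class Criterion:
    key: str          # what is being constrained, e.g. "tone"
    requirement: str  # the checkable requirement
    source: str       # provenance: which source produced this criterion


def derive_dod(candidates: list[Criterion]) -> dict[str, Criterion]:
    """Keep one criterion per key, preferring the higher-priority source."""
    rank = {name: i for i, name in enumerate(SOURCE_PRIORITY)}
    dod: dict[str, Criterion] = {}
    for candidate in candidates:
        current = dod.get(candidate.key)
        if current is None or rank[candidate.source] < rank[current.source]:
            dod[candidate.key] = candidate
    return dod


candidates = [
    Criterion("tone", "formal", "domain_patterns"),
    Criterion("tone", "friendly but concise", "entity_profile"),
    Criterion("length", "under 150 words", "lessons"),
]
dod = derive_dod(candidates)
for key, criterion in dod.items():
    print(f"{key}: {criterion.requirement} (from {criterion.source})")
```

Because every criterion carries its `source`, each line of the resulting Definition of Done can be traced back to where it came from, which is the property that makes the criteria auditable.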

02

Cross-domain genericity

CI tests enforce identical pipeline structure across three demo domains. What works for code review works for email triage works for research — without per-domain glue.
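One way such a CI check could be written is sketched below. The stage names and the `PIPELINES` mapping are hypothetical and only illustrate the idea; the project's real tests may look quite different.

```python
# Hypothetical stage lists for the three demo domains named above.
PIPELINES = {
    "code_review": ["research_dod", "execute", "validate", "update_autonomy"],
    "email_triage": ["research_dod", "execute", "validate", "update_autonomy"],
    "research": ["research_dod", "execute", "validate", "update_autonomy"],
}


def domains_with_divergent_structure(pipelines: dict[str, list[str]]) -> list[str]:
    """Return the domains whose stage sequence differs from the first domain's."""
    reference = next(iter(pipelines.values()))
    return [name for name, stages in pipelines.items() if stages != reference]


def test_pipelines_share_structure() -> None:
    # A test runner such as pytest would collect this function in CI.
    assert domains_with_divergent_structure(PIPELINES) == []


mismatched = domains_with_divergent_structure(PIPELINES)
print(mismatched)
```

If a domain ever grows its own glue stage, the function returns that domain's name and the test fails, so the divergence is caught before it is merged.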

03

Score-driven autonomy

Tools start supervised and automatically earn higher autonomy stages from a rolling window of validation scores. Quality regressions revoke autonomy the same way.
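The promotion and revocation logic can be pictured with a small sketch. Everything here is an assumption for illustration: the stage names, the window size, the thresholds, and the `AutonomyTracker` class are not taken from organism-core.

```python
from collections import deque

# Hypothetical stage names, lowest autonomy first.
STAGES = ["supervised", "assisted", "autonomous"]


class AutonomyTracker:
    """Promote or demote a tool based on the mean of its most recent scores."""

    def __init__(self, window: int = 5, promote_at: float = 0.9, demote_at: float = 0.7):
        self.scores: deque[float] = deque(maxlen=window)
        self.promote_at = promote_at
        self.demote_at = demote_at
        self.stage_index = 0  # every tool starts supervised

    def record(self, score: float) -> str:
        self.scores.append(score)
        # Only decide once the window is full, so one lucky run cannot promote a tool.
        if len(self.scores) == self.scores.maxlen:
            mean = sum(self.scores) / len(self.scores)
            if mean >= self.promote_at and self.stage_index < len(STAGES) - 1:
                self.stage_index += 1
                self.scores.clear()  # require a fresh track record at the new stage
            elif mean < self.demote_at and self.stage_index > 0:
                self.stage_index -= 1
                self.scores.clear()
        return STAGES[self.stage_index]


tracker = AutonomyTracker(window=3)
for score in [0.95, 0.92, 0.97]:
    stage_after_good_runs = tracker.record(score)
for score in [0.5, 0.6, 0.4]:
    stage_after_regression = tracker.record(score)
print(stage_after_good_runs, stage_after_regression)
```

Clearing the window after each stage change is a design choice in this sketch: it forces the tool to build a new record at its current stage instead of reusing old scores.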

Five lines, a real quality gate.

Decorate a function, ship it, and let the framework prove it earns autonomy.

from organism_core import Tool, run

@Tool.register("draft_email", autonomy="staged")
def draft_email(to: str, topic: str) -> str:
    # `llm` is a placeholder for your own LLM client; it is not defined in this snippet.
    return llm.generate(f"Email to {to} about {topic}")

# Before execution, organism-core derives a Definition of Done
# from six sources (entity profile, lessons, related entities,
# vector search, domain patterns, user input).
# After execution, it scores the result against that DoD.
# Tools earn autonomy stages from a rolling track record.

result = run("draft_email", to="alice@example.com", topic="Q3 review")
print(result.score, result.stage, result.autonomy_next)
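To make the "score the result against that DoD" step more concrete, here is a minimal sketch of post-action scoring. It assumes a Definition of Done can be expressed as named predicates over the tool's output; `score_against_dod` and the example checks are hypothetical and are not part of the organism_core package.

```python
from typing import Callable

# Each hypothetical criterion is a name plus a predicate over the tool's output.
Check = Callable[[str], bool]


def score_against_dod(output: str, dod: dict[str, Check]) -> tuple[float, list[str]]:
    """Return the fraction of criteria met and the names of those that failed."""
    if not dod:
        # Nothing to validate against; this sketch treats that as a zero score.
        return 0.0, []
    failed = [name for name, check in dod.items() if not check(output)]
    return (len(dod) - len(failed)) / len(dod), failed


dod = {
    "mentions_topic": lambda text: "Q3 review" in text,
    "has_greeting": lambda text: text.startswith("Hi"),
    "under_50_words": lambda text: len(text.split()) < 50,
}

draft = "Hi Alice, here are the agenda items for the Q3 review."
score, failed = score_against_dod(draft, dod)
print(score, failed)
```

Returning the failed criteria alongside the score keeps the result explainable: a reviewer can see which part of the Definition of Done was missed, not just a number.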

Different from LangGraph & CrewAI.

Feature | organism-core | LangGraph | CrewAI
Pre-action DoD research | | |
Post-action validation against derived criteria | | manual | manual
Auto-earned autonomy stages | | |
Cross-domain genericity enforced in CI | | |
Open-source preview, self-hostable | | |

The field is converging on the same bet

In May 2026, Anthropic shipped Outcomes for Claude: you define what "done" looks like up front, a separate grader checks the result against those criteria, and the agent iterates until they are met. That independently validates the bet organism-core was built around: explicit, gradeable success criteria, scored after execution. organism-core goes one step further. Instead of asking you to hand-write a rubric, it researches that Definition of Done itself, across six sources, in an open, model-agnostic framework.

organism-core is an independent project — not affiliated with, sponsored by, or endorsed by Anthropic.

Be among the first to put it into production.

We're onboarding testers with production workloads. Early access includes setup support, a direct feedback channel, and influence on the public roadmap.