Legal AI is only useful if it is defensible.
Document extraction, clause identification, contract review, legal reasoning: every output needs to be something a lawyer can sign off on. Spreadsheet-based QA does not scale. Generic LLM-as-judge misses the specifics. Composo deploys a quality layer calibrated to your legal domain in 2 to 4 weeks.
A failure report on your legal AI in under a week.
The problem
Legal AI teams are still evaluating in spreadsheets.
A senior legal-AI product lead described it to us plainly: "We actually still mainly work in spreadsheets. Each tab is a different document that we test with. For every data point, we track the source of truth and compare outputs."
"The eval piece is where we've made very little progress. If we keep growing at this rate, we will have to keep adding more people. And there are only so many lawyers in the world who really want to do this kind of tech-focused work."
That is the gap Composo closes. Automated evaluation that preserves the expert judgement of a lawyer, without requiring lawyers to review every output forever.
Use cases
Where Composo deploys in legal tech.
Document extraction
Governing law, jurisdiction, change-of-control, indemnity, termination clauses. Composo evaluates extraction accuracy against your specific definition of "complete" and "correct".
Contract review
Playbook-based markup, clause comparison, risk flagging. Catches missed risk triggers, misclassified clauses, and comments that would fail a senior reviewer.
Due diligence
Data-room summarisation, red-flag identification, precedent matching. Catches omissions that would surface as problems in a DD report.
Legal research and memos
Memo drafting, case-law synthesis, argument generation. Catches citation fabrication, misapplied precedent, and reasoning gaps.
Compliance and regulatory
Regulatory change review, policy mapping, disclosure drafting. Catches policy-inconsistent reasoning and outdated-standard references.
E-discovery triage
Privilege review, relevance triage, production-set generation. Catches over-inclusive and under-inclusive classifications.
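For readers who want to see what "over-inclusive" and "under-inclusive" mean in measurable terms, here is a minimal sketch in Python. It is an illustration only, not Composo's implementation: the function name `triage_metrics`, the `TriageMetrics` type, and the document IDs are all hypothetical. It assumes you have a set of documents the system flagged as privileged and a set a reviewer confirmed as privileged.

```python
from dataclasses import dataclass


@dataclass
class TriageMetrics:
    precision: float      # share of flagged documents that were truly privileged
    recall: float         # share of truly privileged documents that were flagged
    false_positives: int  # over-inclusive: flagged but not privileged
    false_negatives: int  # under-inclusive: privileged but not flagged


def triage_metrics(predicted: set[str], expected: set[str]) -> TriageMetrics:
    """Compare flagged document IDs against reviewer-confirmed IDs (hypothetical helper)."""
    true_positives = len(predicted & expected)
    false_positives = len(predicted - expected)
    false_negatives = len(expected - predicted)
    # Treat an empty denominator as a perfect score: nothing to get wrong.
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(expected) if expected else 1.0
    return TriageMetrics(precision, recall, false_positives, false_negatives)


# Hypothetical example data.
predicted = {"doc-1", "doc-2", "doc-3", "doc-4"}  # flagged by the system
expected = {"doc-2", "doc-3", "doc-5"}            # confirmed by a reviewer
metrics = triage_metrics(predicted, expected)
print(metrics)
```

In this made-up example, two documents are flagged wrongly (over-inclusion) and one privileged document is missed (under-inclusion). Which error matters more depends on the matter: a missed privileged document is usually the costlier mistake.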
Why us
Built by people who have shipped legal AI.
Michael Karotsieris, one of Composo's founding engineers, spent two and a half years at Eigen Technologies, one of the original document-AI and legal-AI startups. That shapes how Composo approaches legal-domain evaluation: precision matters, domain expertise cannot be shortcut, and the output has to be defensible to a senior reviewer.
Case study
Legal tech startup ships MVP in 4 weeks.
A US-based legal-tech startup replaced 15 hours a week of manual evaluation with Composo, cutting evaluation cost by 96% and shipping their MVP four weeks after deployment.
Read the case study →
Frequently asked questions
What kinds of legal AI does Composo evaluate?
Document extraction, contract review, clause classification, legal research and memo drafting, due diligence summarisation, e-discovery triage, and legal-reasoning agents. Anywhere a legal professional or client relies on LLM output as part of a workflow that has to be defensible.
Why not just use spreadsheet-based evaluation like most legal-AI teams do today?
Spreadsheets do not scale and do not compound. Teams that track document-level ground truth in one tab per document and score outputs manually hit a ceiling at a few hundred examples. Composo automates the evaluation loop while preserving the domain-expert judgement that makes legal QA credible.
How does Composo handle the precision legal extraction needs?
Composo calibrates to your specific evaluation criteria during a 2 to 4 week deployment. For extraction tasks that means defining what counts as a correct governing-law clause, a complete indemnity identification, or an accurate change-of-control trigger, and letting Composo learn those standards from your expert reviewers.
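To make "defining what counts as correct" concrete, here is a minimal Python sketch of the simplest possible check: comparing extracted fields against a reviewer's ground truth. This is not how Composo works internally; it is a baseline illustration under stated assumptions. The field names, the `normalise` and `compare_extraction` helpers, and the sample values are all hypothetical, and "correct" is assumed here to mean a match after lowercasing and collapsing whitespace.

```python
def normalise(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not count as errors."""
    return " ".join(value.lower().split())


def compare_extraction(extracted: dict[str, str], ground_truth: dict[str, str]) -> dict[str, str]:
    """Label each ground-truth field as 'correct', 'incorrect', or 'missing' (hypothetical helper)."""
    results = {}
    for field, expected in ground_truth.items():
        actual = extracted.get(field)
        if actual is None:
            results[field] = "missing"      # completeness failure
        elif normalise(actual) == normalise(expected):
            results[field] = "correct"
        else:
            results[field] = "incorrect"    # correctness failure
    return results


# Hypothetical example data.
ground_truth = {
    "governing_law": "England and Wales",
    "termination_notice": "30 days",
    "change_of_control": "Consent required",
}
extracted = {
    "governing_law": "england and  wales",
    "termination_notice": "60 days",
}
report = compare_extraction(extracted, ground_truth)
print(report)
```

A string match like this breaks down quickly on real clauses, where two differently worded extractions can both be acceptable to a senior reviewer. That gap is why the standards need to come from expert reviewers rather than from a fixed rule.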
Is the team credible on legal AI?
One of Composo's founding engineers, Michael Karotsieris, spent two and a half years as a software engineer at Eigen Technologies, one of the original document-AI and legal-AI startups. That experience shapes how Composo approaches legal-domain evaluation.
How does Composo fit alongside tools like Harvey, Legora, or Centari?
Composo is complementary. Legal AI platforms build the product; Composo evaluates the outputs those platforms produce. A legal-tech team can deploy Composo to continuously evaluate their AI, surface failure patterns, and give their customers defensible quality metrics.
See the failure patterns in your legal AI.
Delivered in under a week. No spreadsheets required.