Legal Tech AI

Legal AI is only useful if it is defensible.

Document extraction, clause identification, contract review, legal reasoning. Every output needs to be something a lawyer can sign off on. Spreadsheet-based QA does not scale. Generic LLM-as-judge misses the specifics. Composo deploys a quality layer calibrated to your legal domain in 2 to 4 weeks.

Book a Diagnostic

A failure report on your legal AI in under a week.

The problem

Legal AI teams are still evaluating in spreadsheets.

A senior legal-AI product lead described it to us plainly: "We actually still mainly work in spreadsheets. Each tab is a different document that we test with. For every data point, we track the source of truth and compare outputs."

"The eval piece is where we've made very little progress. If we keep growing at this rate, we will have to keep adding more people. And there are only so many lawyers in the world who really want to do this kind of tech-focused work."

That is the gap Composo closes: automated evaluation that preserves the expert judgement of a lawyer, without requiring lawyers to review every output forever.

Use cases

Where Composo deploys in legal tech.

Document extraction

Governing law, jurisdiction, change-of-control, indemnity, termination clauses. Composo evaluates extraction accuracy against your specific definition of "complete" and "correct".

Contract review

Playbook-based markup, clause comparison, risk flagging. Catches missed risk triggers, misclassified clauses, and comments that would fail a senior reviewer.

Due diligence

Data-room summarisation, red-flag identification, precedent matching. Catches omissions that would surface as problems in a DD report.

Legal research and memos

Memo drafting, case-law synthesis, argument generation. Catches citation fabrication, misapplied precedent, and reasoning gaps.

Compliance and regulatory

Regulatory change review, policy mapping, disclosure drafting. Catches policy-inconsistent reasoning and outdated-standard references.

E-discovery triage

Privilege review, relevance triage, production-set generation. Catches over-inclusive and under-inclusive classifications.
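To make "over-inclusive" and "under-inclusive" concrete, the sketch below scores one privilege-review run against reviewer labels using precision and recall. It is a generic illustration, not Composo's implementation: the document IDs, the `TriageCounts` type, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriageCounts:
    """Outcome counts for one triage run (hypothetical helper type)."""
    true_positives: int   # flagged by the model and by the reviewer
    false_positives: int  # flagged by the model only (over-inclusive)
    false_negatives: int  # flagged by the reviewer only (under-inclusive)


def count_outcomes(predicted: set[str], expected: set[str]) -> TriageCounts:
    """Compare model-flagged document IDs with reviewer-flagged IDs."""
    return TriageCounts(
        true_positives=len(predicted & expected),
        false_positives=len(predicted - expected),
        false_negatives=len(expected - predicted),
    )


def precision(counts: TriageCounts) -> float:
    """Share of model-flagged documents the reviewer also flagged."""
    flagged = counts.true_positives + counts.false_positives
    return counts.true_positives / flagged if flagged else 0.0


def recall(counts: TriageCounts) -> float:
    """Share of reviewer-flagged documents the model also flagged."""
    relevant = counts.true_positives + counts.false_negatives
    return counts.true_positives / relevant if relevant else 0.0


# Illustrative data only: IDs a reviewer marked privileged vs. IDs the model marked.
reviewer_privileged = {"DOC-001", "DOC-002", "DOC-003", "DOC-004"}
model_privileged = {"DOC-002", "DOC-003", "DOC-004", "DOC-005", "DOC-006"}

counts = count_outcomes(model_privileged, reviewer_privileged)
print(f"precision={precision(counts):.2f} recall={recall(counts):.2f}")
# prints "precision=0.60 recall=0.75"
```

Low precision means the model is over-inclusive (it flags documents a reviewer would release). Low recall means it is under-inclusive (it misses documents a reviewer would withhold).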

Why us

Built by people who have shipped legal AI.

Michael Karotsieris, one of Composo's founding engineers, spent two and a half years at Eigen Technologies, one of the original document-AI and legal-AI startups. That shapes how Composo approaches legal-domain evaluation: precision matters, domain expertise cannot be shortcut, and the output has to be defensible to a senior reviewer.

Case study

Legal tech startup ships MVP in 4 weeks.

A US-based legal-tech startup replaced 15 hours a week of manual evaluation with Composo, cutting evaluation cost by 96% and shipping its MVP four weeks after deployment.

Read the case study →

Frequently asked questions

What kinds of legal AI does Composo evaluate?

Document extraction, contract review, clause classification, legal research and memo drafting, due diligence summarisation, e-discovery triage, and legal-reasoning agents. Anywhere a legal professional or client relies on LLM output as part of a workflow that has to be defensible.

Why not just use spreadsheet-based evaluation like most legal-AI teams do today?

Spreadsheets do not scale and do not compound. Teams that track ground truth in one tab per document and score outputs manually hit a ceiling at a few hundred examples. Composo automates the evaluation loop while preserving the domain-expert judgement that makes legal QA credible.

How does Composo handle the precision that legal extraction needs?

Composo calibrates to your specific evaluation criteria during a 2 to 4 week deployment. For extraction tasks that means defining what counts as a correct governing-law clause, a complete indemnity identification, or an accurate change-of-control trigger, and letting Composo learn those standards from your expert reviewers.

Is the team credible on legal AI?

One of Composo's founding engineers, Michael Karotsieris, spent two and a half years as a software engineer at Eigen Technologies, one of the original document-AI and legal-AI startups. That experience shapes how Composo approaches legal-domain evaluation.

How does Composo fit alongside tools like Harvey, Legora, or Centari?

Composo is complementary. Legal AI platforms build the product; Composo evaluates the outputs those platforms produce. A legal-tech team can deploy Composo to continuously evaluate their AI, surface failure patterns, and give their customers defensible quality metrics.