The GenAI Testing solution for real people

LLM app testing & analytics platform. Designed to be loved by all teams. Powerful enough for any application. Simple enough for any user.

Try for Free
Book a Demo

It's tough to build high-performing GenAI apps you can trust

Teams currently spend huge amounts of time iterating across prompts, models & architectures. And the only way they can evaluate performance is with manual & subjective ‘testing by vibes’.

Composo makes GenAI testing easy

With Composo, teams can rapidly achieve high performance, guarantee accuracy & minimise the cost of their GenAI applications.

Build around your workflow

No additional steps, complexity, or changes to how you already work

Always see the big picture

Full end-to-end testing of your entire workflow.

No steep learning curve

Link to your application's codebase and start getting insights straight away

Everything in one place

Effortlessly evaluate and optimize GenAI applications in a simple yet powerful platform, featuring a playground and a powerful testing & evaluation suite.


Get started straight away with our LLM model playground & AI prompt writer. It works immediately out of the box, with no setup required.

Testing & Evaluation Suite

Conduct powerful, automated testing & iteration of a linked application. Just a few lines of code to set up, then use Composo with no technical ability required.

Built for performance, designed for everyone

Composo is the most powerful GenAI tool you don't have to be a developer to use.

AI prompt writer

Just type in a prompt, choose a target model to optimise for & press play. Works with GPT-4, Claude, DALL-E, Midjourney and more.

LLM model playground

Chat with your own app, or directly with all major open & closed source LLMs (e.g. OpenAI, Anthropic, Gemini & Llama) in a simple-to-use playground.

Rigorous automated evaluation

Evaluate performance across a range of metrics, e.g. ground truth pairs, vector similarity, validity of code & the research-backed Composo AI critic.
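
As a rough illustration of the first three metric types, here is a minimal Python sketch using only the standard library. It is not Composo's implementation: the function names are hypothetical, and the word-count cosine similarity is only a stand-in for the embedding-based vector similarity a real system would likely use.

```python
import ast
import math
from collections import Counter


def exact_match(output: str, expected: str) -> bool:
    """Ground truth pair check: compare after trimming and lowercasing."""
    return output.strip().lower() == expected.strip().lower()


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between word-count vectors (assumed stand-in for embeddings)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    # An empty text has no direction, so report zero similarity.
    return dot / norm if norm else 0.0


def is_valid_python(code: str) -> bool:
    """Code validity check: does the generated output parse as Python?"""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False
```

Each function scores one output on its own, so the results can be reported side by side rather than folded into a single opaque number.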

Preset & custom tests

Use Composo’s built-in tests for harmful output, resistance to prompt injection & more. Or build your own.

No limit to app complexity

Our codebase integration means you can test agents, RAG or anything else, fully end to end. No need to be constrained by low-code app builders.

Rapid set-up

AI prompt writing & the LLM playground work immediately. Linking your own app takes just 5 minutes & a few lines of code.

A smooth, yet powerful workflow

Bring your whole team

Composo is human-friendly and tailored for collaboration across your teams. Everyone can work together in a simple UI. Designed to be loved by developers & non-technical teams alike.

Now your whole team can participate


Our Team


Sebastian Fox


Ex-McKinsey & QuantumBlack
Oxford University


Luke Markham


Ex-Graphcore ML Engineer
Oxford University


Armin Sommer

Founding Engineer

Software founder at 18
Computer Science ETH Zurich

Ready to try Composo?

No additional steps, complexity, or changes to how you already work. Start using Composo today and build the future of AI.


We’re here to answer your questions and make your life simpler.

How does Composo work?

Composo makes it easy to find the best prompts, models, temperatures & architectures for an app, without so much manual iteration & evaluation.

For example, you can use Composo to automate finding the best models, prompts, hyperparameters & architectures (e.g. retrieval systems, prompt chains & agents) for an application, or to pressure test for harmful output & safety.

You can start using the Composo playground immediately, or link an application to use the powerful testing suite.

Once an app is linked, you can use Composo to automatically evaluate how different app setups (e.g. models, prompts, RAG settings) impact the quality of generated outputs.
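
To make the idea of comparing setups concrete, here is a minimal Python sketch, not Composo's API. It scores every combination of model and temperature against a small test set and ranks the results. The names `evaluate_setups`, `fake_app`, `model-a` and `model-b`, and the test cases themselves, are all hypothetical; `fake_app` stands in for a call into your own application.

```python
from itertools import product
from statistics import mean
from typing import Callable

# Hypothetical test set of (input, expected output) pairs.
TEST_CASES = [("2+2", "4"), ("capital of France", "Paris")]


def evaluate_setups(
    run_app: Callable[[str, str, float], str],
    models: list[str],
    temperatures: list[float],
) -> list[tuple[float, str, float]]:
    """Score every (model, temperature) combination and return them best first."""
    results = []
    for model, temp in product(models, temperatures):
        # Fraction of test cases where the app's answer matches the expected one.
        score = mean(
            1.0 if run_app(model, question, temp).strip() == expected else 0.0
            for question, expected in TEST_CASES
        )
        results.append((score, model, temp))
    return sorted(results, key=lambda r: r[0], reverse=True)


def fake_app(model: str, question: str, temperature: float) -> str:
    """Stand-in for a real application call; replace with your own app."""
    answers = {"2+2": "4", "capital of France": "Paris"}
    # Pretend only "model-a" at low temperature answers everything correctly.
    if model == "model-a" and temperature < 0.5:
        return answers[question]
    return answers[question] if question == "2+2" else "unknown"


ranked = evaluate_setups(fake_app, ["model-a", "model-b"], [0.0, 1.0])
```

Here `ranked` holds one `(score, model, temperature)` tuple per setup, so the best-performing configuration is the first entry.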

Can automated testing really work?


It may take an AGI to evaluate an application the way a human expert might (e.g. against a single broad criterion like ‘Is this a good AI doctor?’). However, by breaking this criterion down into the specific components a human evaluator would look for, it is possible to achieve highly robust & accurate automated evaluation. For example, in an AI doctor application the criteria might be: Does it follow local guidelines? Is it accurate & relevant to the questions asked? Is the output format correct? Is this valid code?

This is an active area of research, with many proof points emerging from leading teams at Meta, Google DeepMind & OpenAI.

Read more detailed explanations on this in our documentation.
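
As a rough sketch of the decomposition described above, the Python below runs several narrow checks on one output and reports each result separately. It is an illustration under stated assumptions, not Composo's method: the check names are hypothetical, the app is assumed to return JSON, and the guideline check is a toy keyword match.

```python
import json
from typing import Callable

# Each check answers one narrow question about an output.
Check = Callable[[str], bool]


def is_valid_json(output: str) -> bool:
    """Is the output format correct? (assumption: the app should return JSON)"""
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False


def mentions_referral(output: str) -> bool:
    """Toy stand-in for a guideline check; real checks would be far richer."""
    return "see a doctor" in output.lower()


def evaluate(output: str, checks: dict[str, Check]) -> dict[str, bool]:
    """Run every named check and report each result separately."""
    return {name: check(output) for name, check in checks.items()}


# Hypothetical set of criteria for one application.
CHECKS = {"valid_format": is_valid_json, "follows_guideline": mentions_referral}
```

Calling `evaluate(output, CHECKS)` returns one pass or fail per criterion, which shows exactly which component of 'good' an output missed.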

How easy is it to set up Composo?

Incredibly easy. You can get started in our playground immediately for free. Linking an application is then extremely quick & easy (just a few lines of code). We'd be happy to talk you through this, so just get in touch!

Do I have to code to use Composo?

You can get started with our playground immediately, without any code or setup at all. However, most of the power of Composo comes from linking an application codebase. This involves adding a few lines of code, but is extremely quick to do. Once this initial setup is complete, anyone can harness the full power of Composo to conduct automated testing on their application without needing code.

Why is this different to other products out there?

Composo is built to be loved by developers, subject matter experts & product managers alike.

For developers, Composo isn’t a low-code builder. Instead, it integrates with your codebase so that you can test actual applications written in code. Because of this, there is no limit to the functionality your application can have. Composo can test any application end to end.

For non-technical subject matter experts & product managers, Composo is really easy to use. After the initial setup, it requires no code at all.

How secure is my data?

We know our customers take data security extremely seriously, and we do too. We have a range of additional enterprise-grade security options (e.g. dedicated instances, segregated databases & on-premise deployment) for those with the most rigorous requirements. Just contact us for our bespoke enterprise offerings.

My app is complex, with lots of agents. Will Composo still work?

Composo links directly to your application's codebase, so it is flexible enough to test even the most complex applications. It's up to you what you feel is most important to test, but we recommend testing the full end-to-end system as well as isolating individual components.

Still have questions?

Get in touch. We’ll be happy to learn about what you’re building.

Contact Us

Start using Composo today

With rapid codebase integration and flexible pricing, we make it easy to take your GenAI app testing to the next level.
