Quick start
Try Empirical in 3 steps
Empirical bundles together a CLI and a web app. The CLI handles running tests and the web app visualizes results.
Everything runs locally, driven by a JSON configuration file, empiricalrc.json.
Required: Node.js 20+ installed on your system.
Start with a basic example
In this example, we will ask an LLM to parse user messages to extract entities and
give us structured JSON output. For example, “I’m Alice from Maryland” should
become {"name": "Alice", "location": "Maryland"}.
Our test will succeed if the model outputs valid JSON.
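Conceptually, the check is just an attempt to parse the model's output. The helper below is an illustrative sketch of that idea (the function name is ours, not Empirical's scorer API):

```javascript
// Sketch of a "valid JSON" check like the one this test applies.
// Empirical ships its own built-in scorers; this is only for intuition.
function isValidJson(output) {
  try {
    JSON.parse(output);
    return true;
  } catch {
    return false;
  }
}

console.log(isValidJson('{"name": "Alice", "location": "Maryland"}')); // true
console.log(isValidJson("{name: 'Alice', location: 'Maryland'}"));     // false: unquoted keys are not strict JSON
```

Note that the second string fails even though it looks JSON-like: strict JSON requires double-quoted keys and strings.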
Set up Empirical
Use the CLI to create a sample configuration file, empiricalrc.json.
Read the file to see the configured models and dataset samples that we will test against. The default configuration uses models from OpenAI.
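For orientation, the file pairs a set of model runs with a dataset of samples. The fragment below is a rough sketch only; the exact field names and prompt are assumptions, so treat the file generated on your machine as the source of truth.

```json
{
  "runs": [
    {
      "type": "model",
      "provider": "openai",
      "model": "gpt-3.5-turbo",
      "prompt": "Extract the name and location from this message as JSON: {{user_message}}"
    }
  ],
  "dataset": {
    "samples": [
      { "inputs": { "user_message": "I'm Alice from Maryland" } }
    ]
  }
}
```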
Run the test
Run the test samples against the models with the run command.
This step requires the OPENAI_API_KEY environment variable to authenticate with OpenAI. This execution will cost $0.0026, based on the selected models.
See results
Use the ui command to open the reporter web app in your browser and see side-by-side results.
[Bonus] Fix GPT-4 Turbo
GPT-4 Turbo tends to fail our JSON syntax check because it returns output in
Markdown syntax (wrapped in ```json fences). We can fix this behavior by enabling
JSON mode.
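With OpenAI's API, JSON mode is enabled by setting the response_format parameter to a json_object type. Assuming Empirical passes model parameters through to the provider (check your generated config for the exact shape; the field names below are a sketch), the GPT-4 Turbo run entry would look roughly like:

```json
{
  "type": "model",
  "provider": "openai",
  "model": "gpt-4-turbo-preview",
  "parameters": {
    "response_format": { "type": "json_object" }
  }
}
```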
Re-running the test with npx @empiricalrun/cli run will give us better results
for GPT-4 Turbo.
Make it yours
Edit the empiricalrc.json file to make Empirical work for your use case.
- Configure which models to use
- Configure your test dataset
- Configure scoring functions to grade output quality
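As a starting point, dataset samples and scorers are declared alongside the runs in the same file. The fragment below is an illustrative sketch; the scorer name and placement are assumptions, so consult Empirical's documentation for the built-in scorer types and the authoritative schema.

```json
{
  "dataset": {
    "samples": [
      { "inputs": { "user_message": "I'm Bob, writing from Ohio" } }
    ]
  },
  "scorers": [
    { "type": "is-json" }
  ]
}
```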