Evaluation examples
Examples of evaluations in Laminar
Basic correctness evaluation.
In this example our executor function calls an LLM to get the capital of a country. We then evaluate the correctness of the prediction by comparing it to the target capital. The evaluator function returns 1 if the prediction is correct and 0 otherwise.
1. Define an executor function
An executor function calls OpenAI to get the capital of a country. The prompt also asks to only name the city and nothing else. In a real scenario, you will likely want to use structured output to get the city name only.
from openai import AsyncOpenAI
import os
openai_client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
async def get_capital(data):
country = data["country"]
response = await openai_client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{
"role": "user",
"content": f"What is the capital of {country}? Just name the "
"city and nothing else",
},
],
)
return response.choices[0].message.content.strip()
2. Define an evaluator function
def evaluator(output, target):
return 1 if output == target["capital"] else 0
3. Define data and run the evaluation
from lmnr import evaluate
data = [
{"data": {"country": "Germany"}, "target": {"capital": "Berlin"}},
{"data": {"country": "Canada"}, "target": {"capital": "Ottawa"}},
{"data": {"country": "Tanzania"}, "target": {"capital": "Dodoma"}},
]
evaluate(
data=data,
executor=get_capital,
evaluators={'check_capital_correctness': evaluator},
project_api_key=os.environ["LMNR_PROJECT_API_KEY"],
)
Configuring the evaluation to report results to locally self-hosted Laminar
In this example, we configure the evaluation to report results to a locally self-hosted Laminar instance.
Evaluations send data to Laminar over both HTTP and gRPC. HTTP is used to create an evaluation and report the datapoints, stats, and trace ids. OpenTelemetry traces themselves are sent over gRPC.
Assuming you have configured Laminar to run on ports 8000 and 8001 on your localhost
, you will need to
pass these values to the evaluate
function.
from lmnr import evaluate
evaluate(
data=data,
executor=get_capital,
evaluators={'check_capital_correctness': evaluator_A},
project_api_key=os.environ["LMNR_PROJECT_API_KEY"],
base_url="http://localhost",
http_port=8000,
grpc_port=8001,
)
Run this file either by executing it, or by running it with lmnr eval
CLI.