Configuring Laminar evaluations
This page describes how to configure evaluations in Laminar and showcases some common use cases.
Configuring evaluations to report results to locally self-hosted Laminar
In this example, we configure the evaluation to report results to a locally self-hosted Laminar instance.
Evaluations send data to Laminar over both HTTP and gRPC. HTTP is used to create an evaluation and report the datapoints, stats, and trace ids. OpenTelemetry traces themselves are sent over gRPC.
Assuming you have configured Laminar to run on ports 8000 and 8001 on your localhost, you will need to pass these values to the evaluate function.
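Below is a minimal TypeScript sketch of such a call. It assumes the evaluate export from the @lmnr-ai/lmnr package; the data, executor, and evaluator are placeholders you would replace with your own.

```typescript
import { evaluate } from '@lmnr-ai/lmnr';

evaluate({
  data: [{ data: { name: 'ottawa' }, target: { name: 'Ottawa' } }],
  // Placeholder executor: replace with your model or pipeline call.
  executor: async (data: { name: string }) =>
    data.name.charAt(0).toUpperCase() + data.name.slice(1),
  evaluators: {
    // Placeholder evaluator: 1 on an exact match, 0 otherwise.
    exactMatch: async (output: string, target: { name: string }) =>
      output === target.name ? 1 : 0,
  },
  config: {
    baseUrl: 'http://localhost', // no port in the base URL
    httpPort: 8000,              // evaluation results, stats, trace ids
    grpcPort: 8001,              // OpenTelemetry traces
  },
});
```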
Run this file either by executing it directly, or by running it with the npx lmnr eval CLI (TypeScript/JavaScript) or the lmnr eval CLI (Python).
Configuring evaluations
See the evaluate reference for the full API.
Evaluations in Laminar are configured using the evaluate function. The function takes the following arguments:
- data: Either (1) a list of dictionaries, where each dictionary contains the data, target, and metadata for a single evaluation; or (2) an instance of LaminarDataset – read more in the dedicated page.
- executor: An optionally async function that takes a single argument, the evaluation data, and returns the output.
- evaluators: A dictionary of async functions that take the output and target as arguments and return a score.
- name (optional): Evaluation name, so it is easier to identify the evaluation in the UI. If not provided, a random name is assigned in the backend.
- groupName / group_name (optional): An optional string that groups evaluations together. Only evaluations with the same group name can be visually compared.
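Putting these arguments together, a minimal TypeScript call might look like the sketch below; the data, names, and stubbed executor are illustrative only.

```typescript
import { evaluate } from '@lmnr-ai/lmnr';

evaluate({
  // Inline data: each entry holds the input and the expected target.
  data: [{ data: { question: 'What is 2 + 2?' }, target: { answer: '4' } }],
  // Executor: may be async; receives the `data` part of each entry.
  executor: async (data: { question: string }) => {
    // Call your model or pipeline here; a stub answer is used for illustration.
    return '4';
  },
  // Evaluators: receive the output and the target, and return a score.
  evaluators: {
    correctness: async (output: string, target: { answer: string }) =>
      output === target.answer ? 1 : 0,
  },
  name: 'math-questions',        // optional, shown in the UI
  groupName: 'regression-tests', // optional, groups comparable evaluations
});
```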
Additional optional configuration parameters are passed as a config object in JavaScript/TypeScript and directly to evaluate in Python.
In JavaScript/TypeScript, the config object accepts:
- projectApiKey: The API key of the project where the evaluation results will be stored. Required, unless you set the LMNR_PROJECT_API_KEY environment variable.
- concurrencyLimit: The number of evaluations to run in parallel. Default is 5.
- baseUrl: The base URL of the Laminar instance. Do NOT include port here. Default is https://api.lmnr.ai.
- httpPort: The port of the Laminar instance for HTTP. Used to send evaluation results and metadata. Default is 443. For local self-hosted Laminar, use 8000.
- grpcPort: The port of the Laminar instance for gRPC. Used to send traces via OTel gRPC exporter. Default is 8443. For local self-hosted Laminar, use 8001.
- instrumentModules: An object with modules to instrument. Read more in the instrumentation guide.
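As a sketch, these options might be combined as follows in TypeScript. The values are illustrative, and instrumentModules is left as a placeholder; see the instrumentation guide for the modules relevant to your app.

```typescript
import { evaluate } from '@lmnr-ai/lmnr';

evaluate({
  data: [{ data: { text: 'hello' }, target: { label: 'greeting' } }],
  executor: async (data: { text: string }) => data.text,
  evaluators: {
    nonEmpty: async (output: string) => (output.length > 0 ? 1 : 0),
  },
  config: {
    projectApiKey: process.env.LMNR_PROJECT_API_KEY, // or rely on the env var alone
    concurrencyLimit: 10,            // run up to 10 datapoints in parallel
    baseUrl: 'https://api.lmnr.ai',  // no port in the base URL
    httpPort: 443,
    grpcPort: 8443,
    // instrumentModules: { ... }    // modules to instrument, per the guide
  },
});
```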
In Python, these are passed directly to evaluate:
- project_api_key: The API key of the project where the evaluation results will be stored. Required, unless you set the LMNR_PROJECT_API_KEY environment variable.
- batch_size: The number of evaluations to run in parallel. Default is 5.
- base_url: The base URL of the Laminar instance. Do NOT include port here. Default is https://api.lmnr.ai.
- http_port: The port of the Laminar instance for HTTP. Used to send evaluation results and metadata. Default is 443. For local self-hosted Laminar, use 8000.
- grpc_port: The port of the Laminar instance for gRPC. Used to send traces via OTel gRPC exporter. Default is 8443. For local self-hosted Laminar, use 8001.
- instrument_modules: A set of modules to instrument. Read more in the instrumentation guide.
eval CLI reference
The lmnr eval subcommand is used to run evaluations.
Formatting evaluation files
When you run an evaluation file from the CLI, the evaluate functions must be called at the top level of the file. You can have one or more evaluate calls in the file.
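For example, a TypeScript evaluation file might be laid out as below; the file name and contents are illustrative. Note that evaluate is called at the top level, not inside a main() function.

```typescript
// my-flow.eval.ts — matched by the *.eval.{ts,js} naming pattern
import { evaluate } from '@lmnr-ai/lmnr';

// First top-level evaluate call.
evaluate({
  data: [{ data: { input: 'foo' }, target: { expected: 'FOO' } }],
  executor: async (data: { input: string }) => data.input.toUpperCase(),
  evaluators: {
    matchesTarget: async (output: string, target: { expected: string }) =>
      output === target.expected ? 1 : 0,
  },
});

// A second top-level evaluate call in the same file is also allowed.
evaluate({
  data: [{ data: { input: 'BAR' }, target: { expected: 'bar' } }],
  executor: async (data: { input: string }) => data.input.toLowerCase(),
  evaluators: {
    matchesTarget: async (output: string, target: { expected: string }) =>
      output === target.expected ? 1 : 0,
  },
});
```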
In Python, you cannot await the calls to evaluate when running an evaluation file from the CLI.
Options
The first positional argument is the path to the evaluation file.
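For example, with hypothetical file names:

```sh
npx lmnr eval my-flow.eval.ts   # TypeScript/JavaScript
lmnr eval eval_my_flow.py       # Python
```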
If a file is not provided, lmnr eval will run all files in the evals directory that match the naming pattern. For TypeScript/JavaScript, the pattern is *.eval.{ts,js}. For Python, the pattern is eval_*.py or *_eval.py.
Params
--fail-on-error: if set, the CLI will fail on non-critical errors, for example, if evaluate is not called in the file. Default is false.
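For example, with a hypothetical file name:

```sh
npx lmnr eval my-flow.eval.ts --fail-on-error
```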