The lmnr datasets command is used to manage datasets in Laminar.

Usage

Creating a new dataset and iterating on it

1. Prepare input files

Prepare input files for the dataset. Supported formats are .json, .jsonl, and .csv. Every datapoint must have at least a data field. Save the file as data.json (or data.jsonl or data.csv).
  • For JSON, the file must contain one array of datapoints.
  • For JSONL, the file must contain one datapoint per line.
  • For CSV, the file must contain a header row and one datapoint per row.
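As an illustration of the three layouts, the following sketch writes the same two datapoints as data.json, data.jsonl, and data.csv. Only the data field is required by the text above; the target field, the question/answer keys, and the choice to store nested values as JSON strings in CSV cells are assumptions made for the example, not documented behavior of the CLI.

```python
import csv
import json
from pathlib import Path

# Hypothetical datapoints. Only "data" is required by the documentation;
# "target" is an illustrative extra field.
datapoints = [
    {"data": {"question": "What is the capital of France?"}, "target": {"answer": "Paris"}},
    {"data": {"question": "What is 2 + 2?"}, "target": {"answer": "4"}},
]


def validate(points):
    """Check the one documented requirement: every datapoint has a data field."""
    for index, point in enumerate(points):
        if "data" not in point:
            raise ValueError(f"datapoint {index} is missing the 'data' field")


def write_json(points, path):
    # JSON: one array holding all datapoints.
    Path(path).write_text(json.dumps(points, indent=2) + "\n", encoding="utf-8")


def write_jsonl(points, path):
    # JSONL: one datapoint per line.
    lines = [json.dumps(point) for point in points]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def write_csv(points, path):
    # CSV: a header row, then one datapoint per row.
    # Assumption: nested values are stored as JSON strings in each cell.
    fieldnames = sorted({key for point in points for key in point})
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for point in points:
            writer.writerow({key: json.dumps(value) for key, value in point.items()})


validate(datapoints)
write_json(datapoints, "data.json")
write_jsonl(datapoints, "data.jsonl")
write_csv(datapoints, "data.csv")
```

Only one of the three files is needed; pick the format that matches where your data already lives.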

2. Set the project API key

export LMNR_PROJECT_API_KEY=<your-project-api-key>
Alternatively, set it in a .env file in the directory where you run the CLI.
printf '\nLMNR_PROJECT_API_KEY=%s\n' '<your-project-api-key>' >> .env
You can also pass the --project-api-key flag to the datasets command itself, before the subcommand, e.g.
JavaScript/TypeScript:
npx lmnr datasets --project-api-key "<your-project-api-key>" list

3. Create a new dataset

Create a new dataset from the input file. This command creates a new dataset named my-cli-dataset and saves the datapoints to the file my-cli-dataset.json. The datapoints are saved to a new file in order to:
  • Store datasets in the Laminar format. In particular, the datapoint id is crucial for versioning.
  • Avoid overwriting existing files.
JavaScript/TypeScript:
npx lmnr datasets create my-cli-dataset data.json -o my-cli-dataset.json

4. Work on the dataset locally

Make any required changes to the dataset by editing the file my-cli-dataset.json. Do not edit the id field of the datapoints.
Deleting a datapoint locally does not affect the dataset in Laminar, because the push operation only pushes new datapoints (versions) to the dataset.
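For scripted edits, a small guard can catch accidental id changes before the file is saved. This is a minimal sketch, assuming the local file is a JSON array of objects that each carry id and data fields; the helper names, the question key, and the sample id are hypothetical.

```python
import copy
import json
import tempfile
from pathlib import Path


def edit_dataset(path, transform):
    """Apply transform to every datapoint and refuse to save if any id changed."""
    file = Path(path)
    original = json.loads(file.read_text(encoding="utf-8"))
    # Work on deep copies so the originals stay available for the id check.
    edited = [transform(copy.deepcopy(point)) for point in original]

    for before, after in zip(original, edited):
        if before.get("id") != after.get("id"):
            raise ValueError(f"id changed for datapoint {before.get('id')!r}")

    file.write_text(json.dumps(edited, indent=2) + "\n", encoding="utf-8")
    return edited


def strip_question(point):
    # Hypothetical edit: trim whitespace in a "question" key inside data.
    point["data"]["question"] = point["data"]["question"].strip()
    return point


if __name__ == "__main__":
    # Demo on a throwaway file; the id value is a made-up placeholder.
    sample = [{"id": "placeholder-id-1", "data": {"question": "  What is 2 + 2?  "}}]
    path = Path(tempfile.mkdtemp()) / "my-cli-dataset.json"
    path.write_text(json.dumps(sample), encoding="utf-8")
    print(edit_dataset(path, strip_question))
```

Because the check runs before anything is written, a transform that touches an id leaves the file on disk unchanged.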

5. Push the changes to Laminar

JavaScript/TypeScript:
npx lmnr datasets push -n my-cli-dataset my-cli-dataset.json
This pushes the local changes to the dataset in Laminar.

6. Pull the changes from Laminar

If you need to update the local file with the latest changes from Laminar, pull them.
JavaScript/TypeScript:
npx lmnr datasets pull -n my-cli-dataset my-cli-dataset.json
This pulls the changes from the dataset in Laminar into the local file my-cli-dataset.json, overwriting the file's current contents.

Working on an existing dataset

1. Set the project API key

export LMNR_PROJECT_API_KEY=<your-project-api-key>
Alternatively, set it in a .env file in the directory where you run the CLI.
printf '\nLMNR_PROJECT_API_KEY=%s\n' '<your-project-api-key>' >> .env
You can also pass the --project-api-key flag to the datasets command itself, before the subcommand, e.g.
JavaScript/TypeScript:
npx lmnr datasets --project-api-key "<your-project-api-key>" list

2. Select the dataset to work on

List all datasets and select the one you want to work on.
JavaScript/TypeScript:
npx lmnr datasets list

3. Pull the data from Laminar

Pull the data from Laminar to a local file.
JavaScript/TypeScript:
npx lmnr datasets pull -n my-dataset my-dataset.json
This pulls the datapoints from the dataset in Laminar into the local file my-dataset.json. If my-dataset.json already exists, its contents are overwritten.

4. Work on the dataset locally

Make any required changes to the dataset by editing the file my-dataset.json. Do not edit the id field of the datapoints.
Deleting a datapoint locally does not affect the dataset in Laminar, because the push operation only pushes new datapoints (versions) to the dataset.

5. Push the changes to Laminar

JavaScript/TypeScript:
npx lmnr datasets push -n my-dataset my-dataset.json
This will push the changes to the dataset in Laminar.

Setting the CLI to call a local Laminar instance

The datasets command accepts the following optional global arguments:
  • --base-url: The base URL of the Laminar instance. Do not include the port here. Default is https://api.lmnr.ai.
  • --port: The HTTP port of the Laminar instance. Default is 443. For a local self-hosted Laminar, use 8000.
  • --project-api-key: The API key of the project. If not provided, it is read from the LMNR_PROJECT_API_KEY environment variable.
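To show how these arguments fit together, the sketch below assembles a datasets command line aimed at a self-hosted instance. The global flags are placed before the subcommand, mirroring the --project-api-key example earlier on this page. The http://localhost base URL and the datasets_command helper are assumptions for illustration; substitute the address of your own instance.

```python
import shlex


def datasets_command(subcommand_args, base_url=None, port=None):
    """Build an `npx lmnr datasets` command line with optional global flags."""
    command = ["npx", "lmnr", "datasets"]
    # Global options go before the subcommand.
    if base_url is not None:
        command += ["--base-url", base_url]
    if port is not None:
        command += ["--port", str(port)]
    return command + list(subcommand_args)


# Assumption: a self-hosted instance reachable at http://localhost.
local = datasets_command(["list"], base_url="http://localhost", port=8000)
print(shlex.join(local))
# prints: npx lmnr datasets --base-url http://localhost --port 8000 list
```

The resulting list can be handed to subprocess.run(local, check=True); the API key is then picked up from LMNR_PROJECT_API_KEY as described above.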

Reference

JavaScript/TypeScript:
npx lmnr datasets [command]

General options

These are useful if you want to call a local Laminar instance.
  --project-api-key <key>  Project API key. If not provided, reads from LMNR_PROJECT_API_KEY env variable
  --base-url <url>         Base URL for the Laminar API. Defaults to https://api.lmnr.ai or LMNR_BASE_URL env variable
  --port <port>            Port for the Laminar API. Defaults to 443

Commands

List all datasets

List all datasets.
JavaScript/TypeScript:
npx lmnr datasets list

Create a new dataset

Create a dataset from input files.
JavaScript/TypeScript:
npx lmnr datasets create [options] <name> <paths...>
Arguments:
  name                      Name of the dataset to create
  paths                     Paths to files or directories containing data to push

Options:
  -o, --output-file <file>  Path to save the pulled data
  --output-format <format>  Output format (json, csv, jsonl). Inferred from file extension if not provided
  -r, --recursive           Recursively read files in directories (default: false)
  --batch-size <size>       Batch size for pushing/pulling data (default: 100)

Push datapoints to a dataset

Push datapoints to an existing dataset from a file or files.
JavaScript/TypeScript:
npx lmnr datasets push [options] <paths...>
Arguments:
  paths                Paths to files or directories containing data to push

Options:
  -n, --name <name>    Name of the dataset (either name or id must be provided)
  --id <id>            ID of the dataset (either name or id must be provided)
  -r, --recursive      Recursively read files in directories (default: false)
  --batch-size <size>  Batch size for pushing data (default: 100)

Pull datapoints from a dataset

Pull datapoints from a dataset to a file or print them to the console.
JavaScript/TypeScript:
npx lmnr datasets pull [options] [output-path]
Arguments:
  output-path               Path to save the data. If not provided, prints to console

Options:
  -n, --name <name>         Name of the dataset (either name or id must be provided)
  --id <id>                 ID of the dataset (either name or id must be provided)
  --output-format <format>  Output format (json, csv, jsonl). Inferred from file extension if not provided
  --batch-size <size>       Batch size for pulling data (default: 100)
  --limit <limit>           Limit number of datapoints to pull
  --offset <offset>         Offset for pagination (default: 0)
  -h, --help                display help for command
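The --limit and --offset options can be combined to pull a large dataset in pages. The sketch below only computes the command lines; it assumes you already know the total number of datapoints, and the pull_page_commands helper and the per-page output file names are hypothetical.

```python
def pull_page_commands(name, total, page_size):
    """Return one `pull` command per page of at most page_size datapoints."""
    commands = []
    for page, offset in enumerate(range(0, total, page_size)):
        # The last page may be shorter than page_size.
        limit = min(page_size, total - offset)
        commands.append([
            "npx", "lmnr", "datasets", "pull",
            "-n", name,
            "--limit", str(limit),
            "--offset", str(offset),
            f"{name}-page-{page}.jsonl",  # hypothetical output file name
        ])
    return commands


for command in pull_page_commands("my-dataset", total=250, page_size=100):
    print(" ".join(command))
```

With 250 datapoints and a page size of 100, this yields three commands with offsets 0, 100, and 200, the last one limited to 50 datapoints.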