Datasets
Introduction to Laminar datasets
Concept
A dataset is a collection of datapoints. It can be used for the following purposes:
- Storing data for future fine-tuning or prompt-tuning.
- Providing inputs and expected outputs for Evaluations.
Format
Every datapoint has two fixed JSON objects, data and target, each with arbitrary keys. target is only used in evaluations.
- data: the actual datapoint data.
- target: data additionally sent to the evaluator function.
- metadata: arbitrary key-value metadata about the datapoint.
For every key inside data and target, the value can be any JSON value.
Example
This is an example of a valid datapoint.
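As an illustration of the format described above, here is a minimal sketch in plain Python. The keys inside data, target, and metadata (question, context, answer, source) are hypothetical, and the helper is_valid_datapoint is not part of any Laminar SDK. It only checks the shape stated on this page, under the assumption that data is required and that target and metadata must be JSON objects whenever they are present.

```python
import json


def is_valid_datapoint(datapoint: dict) -> bool:
    """Check the datapoint shape described on this page (illustrative helper)."""
    # `data` must be a JSON object (a dict in Python).
    if not isinstance(datapoint.get("data"), dict):
        return False
    # Assumption: `target` and `metadata` must be JSON objects when present.
    for field in ("target", "metadata"):
        if field in datapoint and not isinstance(datapoint[field], dict):
            return False
    # Every value must be representable as JSON.
    try:
        json.dumps(datapoint)
    except (TypeError, ValueError):
        return False
    return True


# Hypothetical datapoint: the key names are made up for illustration.
datapoint = {
    "data": {
        "question": "What is the capital of France?",
        "context": ["France is a country in Europe."],
    },
    "target": {"answer": "Paris"},
    "metadata": {"source": "manual"},
}

print(is_valid_datapoint(datapoint))
print(json.dumps(datapoint, indent=2))
```

Values may be nested arbitrarily (strings, numbers, booleans, null, arrays, objects), as long as the result is still valid JSON.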
Use case: Evaluations
Datasets can be used for evaluations to specify inputs and expected outputs.
Make sure the dataset keys match the input and output node names of the pipelines. See the Evaluations page for details.
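A quick way to catch mismatched keys before running an evaluation is to compare them locally. The sketch below is only an illustration: the helper missing_keys and the node names are hypothetical, and it assumes that keys in data correspond to input nodes while keys in target correspond to output nodes.

```python
def missing_keys(datapoint: dict, input_nodes: list, output_nodes: list) -> dict:
    """Return the node names that have no matching key in the datapoint."""
    data_keys = set(datapoint.get("data", {}))
    target_keys = set(datapoint.get("target", {}))
    return {
        # Input node names with no matching key in `data`.
        "data": sorted(set(input_nodes) - data_keys),
        # Output node names with no matching key in `target`.
        "target": sorted(set(output_nodes) - target_keys),
    }


# Hypothetical datapoint and node names, for illustration only.
example = {
    "data": {"question": "What is the capital of France?"},
    "target": {"answer": "Paris"},
}

report = missing_keys(example, ["question", "context"], ["answer"])
print(report)
```

An empty list under both data and target means every node name has a matching key in the datapoint.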
Editing
Datasets are editable. To edit a datapoint, click on it and modify its data as JSON. Changes are saved automatically.