LLM node

LLM node executes a prompt and optional chat messages against a selected model (e.g. gpt-3.5-turbo).

The node follows the standard practice of sending messages to an LLM’s chat completion (or similar) API, where the first message is the (optional) "system" message and the rest of the messages alternate between messages with "user" and "assistant" roles.

Message with "system" role will come from the “Prompt” on the node.

For sending "user" and "assistant" messages, you can enable the “Chat messages” and connect a node, which has output of “ChatMessageList” type. “ChatMessageList” is expected to contain a messages with "user" and "assistant" roles. (Read more)

LLM node with Chat messages

Prompt

The prompt can be templated, by adding dynamic input variables inside the double-curly braces.

Any node can be connected to the template variable’s handle, if its output is of string type.

For example, prompt can be "Generate random variable" (without any dynamic inputs) or "Write a poem about {{subject}}, use the following words in a poem: {{words}}" (with subject and words as dynamic inputs).

Supported models

OpenAI

  • gpt-3.5-turbo
  • gpt-3.5-turbo-16k (on deprecation path by OpenAI)
  • gpt-4-turbo-preview
  • gpt-4o

Anthropic

  • claude-3-haiku (20240307)
  • claude-3-sonnet (20240229)
  • claude-3-opus (20240229)
  • claude-3-5-sonnet (20240620)

Groq

  • groq-llama3-8b-8192
  • groq-llana3-70b-8192
  • groq-mixtral-8x7b-32768
  • groq:gemma-7b-it

Mistral

  • mistral-small
  • mistral-tiny

Amazon Bedrock

  • bedrock-anthropic.claude-v2
  • bedrock-anthropic.claude-v2:1
  • bedrock-anthropic.claude-3-sonnet-20240229-v1:0
  • bedrock-anthropic.claude-3-5-sonnet-20240620-v1:0
  • bedrock-anthropic.claude-3-haiku-20240307-v1:0
  • bedrock-anthropic.claude-3-opus-20240229-v1:0
  • bedrock-anthropic.claude-instant-v1

Model params

A JSON input for all additional model params supported by the provider. This could include things like top_p and temperature. Currently, tools must also be specified here (not supported for Bedrock).

Chat message input and output

If you want to specify history of chat messages as input, turn on the “Chat messages input” toggle. If you want the result to be a chat message list, turn on the “Chat messages output” toggle.

Structured output

For models that do not inherently support JSON structured output format, we offer the functionality on top of it.

In essence, we prompt the model in a special way to give it the definition of the answer schema. We also parse the output to, for example, remove any words before or after the JSON we are interested in. If the model fails to respond correctly, we add an error message to the chat messages and let the model try again. You can control the number of retry attempts.

To start using the structured output, you must specify the schema in the BAML format and a target for it.

The target must be a enum or a class. Enums are useful to make sure the model answers with just one word, which is helpful in conditional routing.

Classes are useful to specify the JSON. You can nest classes and add enums inside classes.

Read more about BAML syntax here: https://docs.boundaryml.com/docs/syntax/class

Stream

This switch is used to select which LLM nodes’ content to stream during endpoint call. When call to endpoint is made, set “stream” to true in request params, so that LLM nodes which have “stream” enabled are streamed using server-sent events.

Note that for now only selected models such as OpenAI, OpenAI Azure, Anthropic, and Bedrock Anthropic support streaming.

Note that when running pipelines in pipeline builder, LLM node will always be streamed.

Tool calling

To specify tools to the model, use model params input.

If you specify tools, and the model decides to use them, we replace the model’s output with the tool calls array. (This is currently only supported for OpenAI) To make use of tool calls, use Tool call node