
Prompts / Completions

You can think of a prompt as an instruction for a Large Language Model (LLM). The better the prompt, the better the completion.

You can create a new prompt by clicking on the icon next to the "Prompts" label in the navigation section of the sidebar.


Model

At the top of the prompt page you can find two dropdowns to select an inference API provider and an LLM from their portfolio to complete your prompt.

Model selection

Token price

The inference price for input and output tokens is set by each provider and displayed to the right of the model selection in units of cents per 1,000 tokens. There is no markup; inference costs are the same as if you used the provider's API directly.

The total completion price is always calculated as

completion cost = (n_input / 1000) × price_input + (n_output / 1000) × price_output
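For example, a hypothetical completion with 1,200 input tokens and 300 output tokens, at made-up prices of 0.05 and 0.15 cents per 1,000 tokens, would cost 0.06 + 0.045 = 0.105 cents. A minimal sketch of the same calculation in Python:

```python
def completion_cost(n_input: int, n_output: int, price_input: float, price_output: float) -> float:
    """Completion cost in cents, with prices given in cents per 1,000 tokens."""
    return (n_input / 1000) * price_input + (n_output / 1000) * price_output

# Hypothetical example: 1,200 input tokens at 0.05 cents/1k, 300 output tokens at 0.15 cents/1k
print(completion_cost(1200, 300, 0.05, 0.15))  # 0.105 (cents)
```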

Model settings

Below the model selection you can tune the model settings. To edit a parameter, just click on it to open its details. The popover lets you adjust the value of the parameter and gives you useful information about the min/max range, the default value, and an explanation of what the parameter does.

If a specific model does not support a setting, it will be grayed out and omitted when submitting the prompt to the API.

Token limit

The maximum number of tokens the model should process/generate in the completion. Unfortunately, different providers treat the token limit differently. Sometimes the token limit refers to input plus output tokens, e.g. OpenAI, and sometimes it refers only to the output tokens, e.g. Anthropic.

The model can never exceed the token limit. So if you experience completion cutoffs, the token limit is most likely the culprit.
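For reference, here is roughly where the setting ends up in a raw API request. This is a minimal sketch using the OpenAI Python SDK; the model name and limit are illustrative and not tied to Promptmetheus:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Summarize the plot of Hamlet."}],
    max_tokens=256,  # the "Token limit" setting; too small a value cuts the completion off
)
print(response.choices[0].message.content)
```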

Temperature

The sampling temperature to use. Higher values like 0.9 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

Top p

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

It is generally recommended to alter either the temperature or the top p parameter, but not both at the same time.
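To illustrate the "one or the other" advice, here is a sketch of two requests against the OpenAI API, one tuning temperature and one tuning top p (the values are arbitrary):

```python
from openai import OpenAI

client = OpenAI()

# Variant A: leave top_p at its default and lower the temperature
focused = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Suggest a name for a coffee shop."}],
    temperature=0.2,
)

# Variant B: leave temperature at its default and restrict the nucleus
nucleus = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Suggest a name for a coffee shop."}],
    top_p=0.1,  # only tokens in the top 10% probability mass are considered
)
```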

Frequency penalty

Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

Presence penalty

Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

Seed

If specified, the model will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result. However, determinism is not guaranteed!
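A small sketch of how the seed is typically used to check reproducibility (OpenAI Python SDK; again, determinism remains best-effort):

```python
from openai import OpenAI

client = OpenAI()

def complete(seed: int) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Suggest a codename for a side project."}],
        seed=seed,
    )
    return response.choices[0].message.content

# With the same seed and parameters, the two completions *should* match,
# but the API does not guarantee it.
print(complete(42) == complete(42))
```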

JSON mode

If set to true, the model is forced to output JSON-formatted responses. Note that your prompt has to contain the word "json", otherwise the API will throw an error.
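A sketch of the equivalent raw request with the OpenAI Python SDK; note the word "json" in the prompt, which the API requires when JSON mode is enabled:

```python
import json
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": "List three primary colors as a json object with a 'colors' array.",
    }],
    response_format={"type": "json_object"},  # JSON mode
)

data = json.loads(response.choices[0].message.content)  # parses as valid JSON
print(data)
```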

Stop sequences

A comma-separated list of up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
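Presumably the field is split on commas before it reaches the API; a sketch of the equivalent raw request (the stop sequences are illustrative):

```python
from openai import OpenAI

client = OpenAI()

stop_field = "###,END"                    # the comma-separated setting
stop_sequences = stop_field.split(",")    # -> ["###", "END"], at most 4 sequences

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a short list, then print ### on its own line."}],
    stop=stop_sequences,
)
# Generation stops at the first stop sequence, which is not included in the returned text.
print(response.choices[0].message.content)
```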

Messages

At the core of Promptmetheus's design is composability. Every prompt consists of a system message and/or a user message.

The system message can be used to provide instructions to the model, while the user message is the actual prompt that the LLM is presented with to complete.

What makes Promptmetheus different from other prompt engineering tools is that you can compose each of those messages via blocks (a.k.a. sections, fragments, or whatever you want to call them).

When you execute a prompt, Promptmetheus compiles all blocks in a message into a single text string before sending it to the inference API.
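Conceptually, the compilation step works roughly like the sketch below. The block contents and the exact joining logic are assumptions for illustration, not Promptmetheus internals:

```python
# Each message is a list of blocks; compilation joins them into one string.
system_blocks = [
    "You are a helpful travel assistant.",
    "Answer in at most three sentences.",
]
user_blocks = [
    "Plan a weekend trip to Lisbon.",
    "The traveler prefers museums over beaches.",
]

messages = [
    {"role": "system", "content": "\n\n".join(system_blocks)},
    {"role": "user", "content": "\n\n".join(user_blocks)},
]
# `messages` is the payload that gets sent to the inference API.
```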

Blocks

Blocks are just sections of a message. They allow you to better structure your prompts and test different variants. They can also generate valuable performance insights to efficiently optimize the prompt.

To add a block you can just hover over the "Add block" label at the bottom of each message or over the icon between blocks to insert a block at the desired position (see block types below).

Blocks can also be re-ordered by drag-and-dropping the block by its icon.

To rename a block, you can double click on its identifier (which is "Block#" plus the last 6 digits of its unique ID) or select the respective action from the block menu.

There are no hard rules on how to break up your prompt, just experiment and find out what works for you.

Block types

There are different types of blocks you can use to compose your prompt.

Text blocks

Text blocks are simple prose sections and probably what you want to use for most of your prompt composition. Text blocks are specific to each prompt and cannot be reused in other prompts.

Data blocks

Data blocks inject a dataset into your prompt, where each item in the dataset will be available as a variant (see below). Since data blocks can be re-used across multiple prompts in a project, their content is not editable from the prompt page. To edit a dataset item, you have to open the dataset.

For more info on datasets and how they work, please take a look at the Datasets section.

Variants

Each block can have multiple variants (or alternatives) and you can seamlessly switch between them to test and experiment with different versions of the section. Variants are not concatenated like blocks, only the selected variant will be part of the prompt. To add a new variant, just click on the icon next to the existing variant tabs or right-click on a variant and select "duplicate" to create a copy of it.

Versions

In the making...

Block actions

When you hover over a block, you'll see several quick actions and an action dropdown menu at the top right of the block.

Block actions

Deactivate block

With the toggle icon you can deactivate/re-activate blocks to test how the prompt performs without them. Deactivated blocks are grayed out and are not included in the preview or the compiled prompt.

Highlight variant

The highlight action is a convenient way to quickly identify all completions that use the selected variant of the current block. Just click on the icon to activate/deactivate the feature. All completions that do not use the selected variant will be grayed out.

Copy to clipboard

With the copy action you can copy the content of a block in plain text to the clipboard. See the copy/pasting content section below for more details.

Execute for all variants

The "execute for all variants" action is a shortcut to loop through all variants of the respective block and simultaneously execute the prompt once for each of them as selected choice.

Code highlighting

Promptmetheus can highlight code in your prompts with Shiki. Simply type ``` followed by a language indicator to initiate a code section. Code blocks have no special properties other than the visual formatting.

Here's the list of language indicators that are currently supported:

c, cpp, csharp, css, dart, go, html, java, json, jsx, kotlin, python, rust, svelte, swift, ts, tsx, vue
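For example, typing the following into a text block renders the snippet with Python syntax highlighting:

```python
def greet(name: str) -> str:
    return f"Hello, {name}!"
```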

Copy/pasting content

Promptmetheus uses the TipTap rich-text editor to provide the best possible user experience and to enable features like formatting, variables, affixation, etc.

This comes with a small drawback when copy/pasting content between Promptmetheus and a plain text editor:

  • Pasting external content into Promptmetheus should just work as expected.
  • When copying block content there are two scenarios:
    1. To copy content from one block to another within Promptmetheus, you have to select the content and use ctrl+c/v or right-click copy/paste to correctly preserve linebreaks, variables, etc.
    2. However, using method 1 will lead to excess linebreaks when pasting the content into a plain text editor. This is due to how the operating system converts formatted HTML to plain text. To get around this, you can use the block copy action to correctly convert the content to plain text and avoid the extra linebreaks. Note that using the copy action and then pasting the content back into Promptmetheus will erase linebreaks, variables, etc.

Variables

Variables allow you to define terms that you want to use in different places in your prompts or project but for which the values might change frequently.

Examples of good use cases for variables are titles, names, locations, identifiers, etc.

For now, variables can only be used inside text blocks, not data blocks.

Variables preview

Local variables

Local variables are limited in scope to the current prompt only.

You can add local variables to message blocks by just typing {{, followed by the variable name (it will autocomplete), and then either space or enter. You can only use alphanumeric characters and underscores in variable names, no spaces.

Note that the variable "name" is not the value that will be substituted into the prompt.
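Under the hood, variable substitution presumably works like simple template replacement; a sketch with made-up names and values:

```python
block_text = "Write a short welcome email for {{customer_name}} joining {{company}}."
variables = {"customer_name": "Ada", "company": "Acme Corp"}

compiled = block_text
for name, value in variables.items():
    compiled = compiled.replace("{{" + name + "}}", value)

print(compiled)
# Write a short welcome email for Ada joining Acme Corp.
```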

Global variables

Global variables can be defined in the project settings and share common values across all prompts in the project.

You can use global variables just like local variables. Promptmetheus will always check if a global variable exists for a given name and if not, it will create a local one.

For more information on global variables, take a look at the Projects section.

Preview

On the top right of the left panel on the prompt page you can find a preview button that lets you see what your compiled prompt will look like when it is sent to the API provider. The preview is in plain text and should have variable values substituted in.

Prompt fingerprint

Directly below the dialog title you'll see a random-looking alphanumeric sequence marked with an icon. This is the prompt's unique fingerprint. The fingerprint is a SHA-256 hash of the plain text version of the compiled prompt.

For most users the fingerprint will likely not be relevant, but in some cases it can be very useful to uniquely identify a prompt and its exact version. The same prompt will always generate the same fingerprint. If the fingerprint is different, the prompt is also different.
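If you ever need to reproduce a fingerprint outside of Promptmetheus, it should look roughly like the sketch below (assuming the hash is taken over the UTF-8 bytes of the compiled prompt and rendered as a hex digest):

```python
import hashlib

compiled_prompt = "You are a helpful travel assistant.\n\nPlan a weekend trip to Lisbon."
fingerprint = hashlib.sha256(compiled_prompt.encode("utf-8")).hexdigest()
print(fingerprint)  # identical prompt text always yields the identical fingerprint
```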

You can also find fingerprints for the prompts that generated each completion in the respective completion prompt dialogs.

Completions

When you "execute" a prompt, Promptmetheus will compile your prompt into a plain text string and send it to the API of the inference provider of your choice, where it will be completed by the selected LLM and returned to Promptmetheus for you to inspect.

Completion example

Ratings

In the making...

Fragments

In the making...

Model parameters

At the bottom left of each completion you can find the identifier of the model that was used for the completion, together with the values of the selected model parameters and indicators for Seed, JSON Mode, and Stop Sequences.

Inference metrics

At the bottom right you can find a selection of relevant metrics for the completion.

  • Inference Time in seconds
  • Inference Speed in tokens per second (tps; see the example below the list)
  • Token Count (input/output)
  • Inference Cost (in cents)
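For example, if the speed metric is computed as generated tokens divided by inference time (an assumption), a completion with 300 output tokens generated in 2.5 seconds comes out to 120 tps; the cost follows the formula from the Token price section above.

```python
output_tokens = 300
inference_time = 2.5  # seconds

# Assuming speed is measured as generated tokens per second
speed_tps = output_tokens / inference_time
print(speed_tps)  # 120.0 tps
```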

Associated prompt

In the making...

Completion prompt example

Search, filter, display mode, and sweep

You can search completions and/or filter them by rating with the respective actions at the top right of the screen.

Additionally, there are 3 display modes available that you can just toggle through:

  1. text only
  2. text plus ratings, actions, model parameters, and inference metrics
  3. same as 2, plus prompt fragments

The last action in the list is the sweep button. It allows you to clear all completions from the current prompt. Note that there is currently no "undo" button and completions that are cleared cannot be restored (that's why you will have to confirm this action).