Every request to Pulze can be configured through our Policies. Policies are a way to customize the behaviour of your request, but they do not affect model selection or labelling.

As of now, you can only set policies on a per-request basis. We plan to add them to the Model Settings page in the future.

Policies

Max cost (for the whole request)

key: max_cost
value: float

The maximum cost of the whole request, in your Org’s currency unit (currently only USD is supported). If you have very complex queries or large custom data sets, make sure to set this to a high value. Use with caution.

This feature is not yet available!
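
For illustration only, since this policy is not live yet: it would be set like any other policy key. A hypothetical sketch of the value you would pass in the Pulze-Policies header (see the Example section below for how that header is sent):

{
  "max_cost": 0.05
}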

Maximum number of times to retry a particular model

key: max_same_model_retries
value: int >= 0

The maximum number of times to retry a request to any particular model.

If set to 0, the system won’t retry the request to that model. In general, up to N+1 LLM calls will be made to the same model (the original, plus up to N retries).

To be used in combination with max_switch_model_retries

The maximum number of requests (worst-case) will be (max_same_model_retries+1) * (max_switch_model_retries+1)

For requests that require multiple LLM calls, the settings are applied to each of the intermediate requests! This can result in a high number of tokens used.
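
For example, to allow a single retry against the same model before giving up on it, the policy could look like this (a sketch; see the Example section below for how the Pulze-Policies header is sent):

{
  "max_same_model_retries": 1
}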

Maximum number of different models to try

key: max_switch_model_retries
value: int >= 0

The maximum number of different models to retry for any particular request.

If set to 0, the system won’t use any other models. In general, Pulze will try the request with up to N+1 different LLM models (the best model, plus one for each “retry”).

To be used in combination with max_same_model_retries

The maximum number of requests (worst-case) will be (max_same_model_retries+1) * (max_switch_model_retries+1)

For requests that require multiple LLM calls, the settings are applied to each of the intermediate requests! This can result in a high number of tokens used.
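
As a concrete, hypothetical illustration: with max_same_model_retries set to 2 and max_switch_model_retries set to 1, the worst case is (2 + 1) * (1 + 1) = 6 LLM calls for a single request: up to 3 attempts against the best model, and up to 3 more against one fallback model.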

Privacy level

key: privacy_level
value: 1, 2, 3

The level of privacy you want for this particular request, and all of its sub-requests.

  1. (default) Store the prompt, response, and all the metadata associated with it (labels, weights, costs…)
  2. Store all the metadata, but the prompt and the LLM’s response will not be logged in any way. The log is still visible, labelled, and searchable.
  3. (stealth mode) The log is not stored[*], not visible, not searchable, and no prompt, response, or labels are stored.
[*] Internally we must log the datetime and costs incurred, which we require for billing.
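
For example, to keep the metadata but drop the prompt and response from the logs, the policy could look like this (a sketch):

{
  "privacy_level": 2
}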

Prompt

key: prompt_id
value: <PROMPT_ID>

In some cases, you may want to wrap your prompt in a set of predefined instructions, which we call Prompts. Enter the Prompt ID to use it in the request. You can find this ID by clicking on an existing Prompt. Alternatively, you can set a predefined Prompt by visiting the App’s settings tab.
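
For example (the ID below is a placeholder, not a real Prompt ID):

{
  "prompt_id": "<PROMPT_ID>"
}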

Example
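
Below is a minimal sketch of a request that sets several policies through the Pulze-Policies header, using Python and the requests library. The endpoint URL, model name, and policy values are assumptions for illustration; check the API reference for the exact request shape.

import json
import os

import requests

# Policies for this request only (values are illustrative)
policies = {
    "max_same_model_retries": 1,
    "max_switch_model_retries": 2,
    "privacy_level": 2,
}

response = requests.post(
    "https://api.pulze.ai/v1/chat/completions",  # assumed OpenAI-compatible endpoint
    headers={
        "Authorization": "Bearer " + os.environ["PULZE_API_KEY"],
        "Content-Type": "application/json",
        # Policies are passed as a JSON-encoded header value
        "Pulze-Policies": json.dumps(policies),
    },
    json={
        "model": "pulze",  # illustrative model name
        "messages": [{"role": "user", "content": "Summarize this quarter's results."}],
    },
)

print(response.json())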

The above example will modify the policies for that request only.

You can set any, some, or all of the policies on every request.

If you send the Pulze-Labels header, the policies will be stored as part of the Labels in this format:

{
  "policies_max_retries": "8",
  "policies_max_cost": "0.05",
  ...
  "other-pulze-labels": "below",
  "foo": "bar",
  "my_obj": "{\"key\": \"value\"}"
}

This will only happen if the Pulze-Labels header is defined.

Send {} if you don't want to send any particular Labels but still want the Pulze-Weights and Pulze-Policies to be auto-saved. Omitting the Pulze-Labels header means the other header information will not be stored as part of it; essentially, omitting the Pulze-Labels header is like saying “I don't want labels”.
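
For example, to have the policies (and weights) auto-saved as Labels without adding any labels of your own, the headers could look like this (a sketch with placeholder values):

headers = {
    "Authorization": "Bearer YOUR_PULZE_API_KEY",  # placeholder key
    "Pulze-Labels": "{}",  # empty labels: policies and weights still get saved
    "Pulze-Policies": '{"max_switch_model_retries": 2}',
}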

Note that the privacy_level policy is not stored at all.