Every request to Pulze is routed to a single model: the one that best matches your criteria.

These criteria combine several factors (including which models are enabled and how creative you want the response to be), but the choice is also strongly steered by the interaction between three key forces: quality, cost, and latency, which can be pre-configured in the App’s settings.

  • quality (higher is better)
  • cost (higher is cheaper)
  • latency (higher is faster)
Their absolute values are not important, only their relative proportions.

As a quick example, quality = 0.7 and cost = 0.3 (equivalent to quality = 7 and cost = 3) will favour the quality of the response while still giving some weight to keeping the model inexpensive.
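To illustrate why only the proportions matter, here is a minimal sketch that rescales a set of weights so they sum to 1. The `normalize_weights` helper is hypothetical and not part of Pulze; it only shows that the two settings above describe the same preference.

```python
def normalize_weights(weights: dict[str, float]) -> dict[str, float]:
    """Rescale non-negative weights so they sum to 1 (hypothetical helper)."""
    if any(value < 0 for value in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: value / total for name, value in weights.items()}


# Both calls describe the same 70/30 split between quality and cost.
small = normalize_weights({"quality": 0.7, "cost": 0.3})
large = normalize_weights({"quality": 7, "cost": 3})
```

After rescaling, `small` and `large` hold the same proportions, which is why the scale you choose for the weights does not matter.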

You can set the weights on a per-app basis in the Model Settings page.

Example
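A minimal sketch of overriding the weights for a single request. The Pulze-Weights header name appears on this page, but its value format is an assumption here (a JSON object with `quality`, `cost`, and `latency` keys). The `build_headers` helper, the bearer-token authorization, and the placeholder API key are likewise hypothetical, so check the API reference before relying on any of them.

```python
import json


def build_headers(api_key: str, weights: dict[str, float]) -> dict[str, str]:
    """Build request headers carrying per-request weights (hypothetical helper)."""
    return {
        # Assumption: bearer-token authentication.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Assumption: the header value is a JSON-encoded object of weights.
        "Pulze-Weights": json.dumps(weights),
    }


headers = build_headers(
    "YOUR_API_KEY",
    {"quality": 0.8, "cost": 0.2, "latency": 0},
)
```

These headers would then be passed to whichever HTTP client you use to send the request.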

Weights sent with a request (via the Pulze-Weights header) apply to that request only, which might result in a different model being chosen.

If you send the Pulze-Labels header, the weights are stored as part of the Labels in this format:

{
  "weights_cost": "0.2",
  "weights_quality": "0.8",
  "weights_latency": "0",
  ...
  "other-pulze-labels": "below",
  "foo": "bar",
  "my_obj": "{\"key\": \"value\"}"
}

This only happens if the Pulze-Labels header is present.

Send {} if you don't want to send any particular Labels but still want the Pulze-Weights and Pulze-Policies to be auto-saved. If you omit the Pulze-Labels header, the other header information is not stored; essentially, omitting it is like saying "I don't want labels".
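The storage rule above can be sketched as follows. This illustrates the described behavior and is not Pulze's implementation: `stored_labels` is a hypothetical function, the string conversion of values simply mirrors the sample format shown earlier, and policies are left out for brevity.

```python
from typing import Optional


def stored_labels(
    labels: Optional[dict[str, str]],
    weights: dict[str, float],
) -> Optional[dict[str, str]]:
    """Return the labels that would be stored, or None if no labels were sent."""
    if labels is None:
        # No Pulze-Labels header: nothing is stored, weights included.
        return None
    merged = {f"weights_{name}": str(value) for name, value in weights.items()}
    # Assumption: user-supplied labels win on key collisions; the source does not say.
    merged.update(labels)
    return merged


weights = {"cost": 0.2, "quality": 0.8, "latency": 0}
with_empty = stored_labels({}, weights)               # header sent as {}
with_labels = stored_labels({"foo": "bar"}, weights)  # header with custom labels
without_header = stored_labels(None, weights)         # header omitted
```

In this sketch, sending {} still records the three weights, while omitting the header records nothing.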