POST /v1/chat/completions

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
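
As a minimal sketch in Python using the requests library: the base URL https://api.pulze.ai and the PULZE_API_KEY environment variable name are assumptions, and the message format assumes OpenAI-style role/content objects.

```python
import os

import requests

# Assumption: the endpoint lives at https://api.pulze.ai/v1/chat/completions.
PULZE_API_URL = "https://api.pulze.ai/v1/chat/completions"

headers = {
    # <token> is your auth token; the environment variable name is illustrative.
    "Authorization": f"Bearer {os.environ['PULZE_API_KEY']}",
    "Content-Type": "application/json",
}

payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
}

response = requests.post(PULZE_API_URL, headers=headers, json=payload)
response.raise_for_status()
print(response.json())
```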

Body

application/json
prompt

The prompt text to send (used for a /completions request)

messages
object[]

The conversation to send, with or without prior history (used for a /chat/completions request)
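
For multi-turn requests, the messages array can carry the prior history. A sketch, assuming OpenAI-style message objects with "role" and "content" keys:

```python
# Illustrative only: assumes OpenAI-style message objects.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What does Pulze do?"},
    {"role": "assistant", "content": "Pulze routes each request to a suitable model."},
    {"role": "user", "content": "Which models can it route to?"},  # the new turn
]
```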

model
string
default: pulze

Specify the model you'd like Pulze to use (optional). Can be the full model name, or a subset of the name for multi-matching. See https://docs.pulze.ai/overview/models for the available models.

Defaults to our dynamic routing, i.e. the best model for this request.
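
A sketch of how the model field could be set; the concrete model names are hypothetical (see https://docs.pulze.ai/overview/models for the real list), and the payload is sent with the same requests.post call as in the Authorizations example.

```python
payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
    # Omit "model" (or keep the default "pulze") to let dynamic routing pick the best model.
    # "model": "openai/gpt-4",   # a full model name (hypothetical value)
    "model": "gpt",              # a partial name for multi-matching (hypothetical value)
}
```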

suffix
string
default:

COMING SOON

max_tokens
integer

The maximum number of tokens that the response can contain.

temperature
number
default: 1

Optionally specify the temperature for this request only. Leave empty to allow Pulze to guess it for you.

top_p
number
default: 1

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-top_p.
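
A sketch combining the sampling controls above; the values are illustrative, and per the linked OpenAI reference you would normally tune temperature or top_p, not both.

```python
payload = {
    "messages": [{"role": "user", "content": "Write a haiku about model routing."}],
    "max_tokens": 64,    # cap the length of the response
    "temperature": 0.7,  # lower values make the output more deterministic
    "top_p": 1,          # nucleus sampling: consider tokens within this probability mass
}
```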

tools
object[]

tool_choice
default: none

Available options:
none,
auto
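
The tool schema is not spelled out above; the sketch below assumes OpenAI-compatible function-tool definitions, and the get_weather tool is purely hypothetical.

```python
payload = {
    "messages": [{"role": "user", "content": "What's the weather in Zurich?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # or "none" to disable tool calls
}
```
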
n
integer
default: 1

How many completions to generate for each prompt.

stream
boolean
default: false

COMING SOON. Specify whether you want the response to be streamed or returned as a standard HTTP response.

logprobs
integer

COMING SOON. Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-logprobs.

stop
default:

Stop responding when this sequence of characters is generated. Leave empty to allow the model to decide.
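
A sketch of stop sequences; whether a single string, a list of strings, or both are accepted here is an assumption based on OpenAI-compatible APIs.

```python
payload = {
    "messages": [{"role": "user", "content": "List three routing strategies."}],
    "stop": ["\n\n", "END"],  # stop generating once either sequence appears
}
```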

presence_penalty
number
default: 0

Increase the model's likelihood to talk about new topics. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-presence_penalty.

frequency_penalty
number
default: 0

Decrease the model's likelihood to repeat tokens/words. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-frequency_penalty.
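
A sketch of the two penalties; the values are illustrative, and both default to 0.

```python
payload = {
    "messages": [{"role": "user", "content": "Brainstorm blog topics about LLM routing."}],
    "presence_penalty": 0.5,   # nudges the model toward new topics
    "frequency_penalty": 0.5,  # nudges the model away from repeating tokens/words
}
```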

best_of
integer
default: 1

The number of responses to generate; out of those, the best n are returned.
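
A sketch combining n and best_of, assuming OpenAI-like semantics where best_of candidates are generated and the best n of them are returned.

```python
payload = {
    "messages": [{"role": "user", "content": "Suggest a name for a routing library."}],
    "n": 2,        # return two completions
    "best_of": 4,  # generate four candidates, keep the best two
}
```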

logit_bias
object

COMING SOON. Modify the likelihood of specified tokens appearing in the completion. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-logit_bias.

For a detailed explanation of how to use it, see https://help.openai.com/en/articles/5247780-using-logit-bias-to-define-token-probability
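
Once available, this would presumably follow the OpenAI convention linked above, mapping token IDs to a bias value; the token IDs and bias values below are hypothetical.

```python
payload = {
    "messages": [{"role": "user", "content": "Answer with yes or no."}],
    "logit_bias": {"1904": 10, "3763": -100},  # hypothetical token IDs and biases
}
```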

Response

200 - application/json

The response returned to the user by the Chat Completions endpoint

choices
object[]
required
created
integer
default: 0

Creation timestamp

metadata
object

Metadata of the response

id
string

This ID is generated by the database when we save the request

usage
object

Tokens used

model
string
required

The fully qualified model name used by PulzeEngine

object
enum<string>
required

The type of response object

Available options:
text_completion,
chat.completion
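
Continuing the request sketch from the Authorizations section, the fields above could be read like this; the nesting of the generated text inside each choice is an assumption based on OpenAI-style chat completions.

```python
data = response.json()

print(data["model"])         # fully qualified model name used by PulzeEngine
print(data["created"])       # creation timestamp
print(data["object"])        # "chat.completion" or "text_completion"
print(data.get("metadata"))  # response metadata (id, usage, ...; exact nesting assumed)

for choice in data["choices"]:
    # Assumption: each choice follows the OpenAI chat layout with a nested message.
    print(choice["message"]["content"])
```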