POST /v1/chat/completions

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
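
As a minimal sketch in Python using the requests library: the base URL https://api.pulze.ai and the PULZE_API_KEY environment variable name are assumptions, and the message format assumes OpenAI-style role/content objects.

```python
import os

import requests

# Assumption: the endpoint lives at https://api.pulze.ai/v1/chat/completions.
PULZE_API_URL = "https://api.pulze.ai/v1/chat/completions"

headers = {
    # <token> is your auth token; the environment variable name is illustrative.
    "Authorization": f"Bearer {os.environ['PULZE_API_KEY']}",
    "Content-Type": "application/json",
}

payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
}

response = requests.post(PULZE_API_URL, headers=headers, json=payload)
response.raise_for_status()
print(response.json())
```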

Body

application/json
prompt

The prompt text to send (used for a /completions request)

messages
object[]

The conversation to send, with or without prior history (used for a /chat/completions request)
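
For multi-turn requests, the messages array can carry the prior history. A sketch, assuming OpenAI-style message objects with "role" and "content" keys:

```python
# Illustrative only: assumes OpenAI-style message objects.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What does Pulze do?"},
    {"role": "assistant", "content": "Pulze routes each request to a suitable model."},
    {"role": "user", "content": "Which models can it route to?"},  # the new turn
]
```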

model
string
default: pulze

Specify the model you'd like Pulze to use (optional). Can be the full model name, or a subset of the name for multi-matching. See https://docs.pulze.ai/overview/models for the available models.

Defaults to our dynamic routing, i.e. the best model for this request.
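
A sketch of how the model field could be set; the concrete model names are hypothetical (see https://docs.pulze.ai/overview/models for the real list), and the payload is sent with the same requests.post call as in the Authorizations example.

```python
payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
    # Omit "model" (or keep the default "pulze") to let dynamic routing pick the best model.
    # "model": "openai/gpt-4",   # a full model name (hypothetical value)
    "model": "gpt",              # a partial name for multi-matching (hypothetical value)
}
```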

suffix
string
default:

COMING SOON

max_tokens
integer

The maximum number of tokens that the response can contain.

temperature
number
default: 1

Optionally specify the temperature for this request only. Leave empty to allow Pulze to guess it for you.

top_p
number
default: 1

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-top_p.
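
A sketch combining the sampling controls above; the values are illustrative, and per the linked OpenAI reference you would normally tune temperature or top_p, not both.

```python
payload = {
    "messages": [{"role": "user", "content": "Write a haiku about model routing."}],
    "max_tokens": 64,    # cap the length of the response
    "temperature": 0.7,  # lower values make the output more deterministic
    "top_p": 1,          # nucleus sampling: consider tokens within this probability mass
}
```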

tools
object[]

tool_choice
default: none

Available options:
none,
auto
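
The tool schema is not spelled out above; the sketch below assumes OpenAI-compatible function-tool definitions, and the get_weather tool is purely hypothetical.

```python
payload = {
    "messages": [{"role": "user", "content": "What's the weather in Zurich?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # or "none" to disable tool calls
}
```
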
n
integer
default: 1

How many completions to generate for each prompt.

stream
boolean
default: false

COMING SOON. Specify whether you want the response to be streamed or returned as a standard HTTP response.

logprobs
integer

COMING SOON. Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-logprobs.

stop
default:

Stop responding when this sequence of characters is generated. Leave empty to allow the model to decide.
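
A sketch of stop sequences; whether a single string, a list of strings, or both are accepted here is an assumption based on OpenAI-compatible APIs.

```python
payload = {
    "messages": [{"role": "user", "content": "List three routing strategies."}],
    "stop": ["\n\n", "END"],  # stop generating once either sequence appears
}
```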

presence_penalty
number
default: 0

Increase the model's likelihood to talk about new topics. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-presence_penalty.

frequency_penalty
number
default: 0

Decrease the model's likelihood to repeat tokens/words. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-frequency_penalty.
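
A sketch of the two penalties; the values are illustrative, and both default to 0.

```python
payload = {
    "messages": [{"role": "user", "content": "Brainstorm blog topics about LLM routing."}],
    "presence_penalty": 0.5,   # nudges the model toward new topics
    "frequency_penalty": 0.5,  # nudges the model away from repeating tokens/words
}
```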

best_of
integer
default: 1

The number of responses to generate; out of those, the best n are returned.
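
A sketch combining n and best_of, assuming OpenAI-like semantics where best_of candidates are generated and the best n of them are returned.

```python
payload = {
    "messages": [{"role": "user", "content": "Suggest a name for a routing library."}],
    "n": 2,        # return two completions
    "best_of": 4,  # generate four candidates, keep the best two
}
```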

logit_bias
object

COMING SOON. Modify the likelihood of specified tokens appearing in the completion. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-logit_bias.

For a detailed explanation of how to use it, see https://help.openai.com/en/articles/5247780-using-logit-bias-to-define-token-probability
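
Once available, this would presumably follow the OpenAI convention linked above, mapping token IDs to a bias value; the token IDs and bias values below are hypothetical.

```python
payload = {
    "messages": [{"role": "user", "content": "Answer with yes or no."}],
    "logit_bias": {"1904": 10, "3763": -100},  # hypothetical token IDs and biases
}
```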

Response

200 - application/json

The response returned to the user by the Chat Completions endpoint

choices
object[]
required
created
integer
default: 0

Creation timestamp

metadata
object

Metadata of the response

id
string

This ID is generated by the database when we save the request

usage
object

Tokens used

model
string
required

The fully qualified model name used by PulzeEngine

object
enum<string>
required

The type of response object

Available options:
text_completion,
chat.completion
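
Continuing the request sketch from the Authorizations section, the fields above could be read like this; the nesting of the generated text inside each choice is an assumption based on OpenAI-style chat completions.

```python
data = response.json()

print(data["model"])         # fully qualified model name used by PulzeEngine
print(data["created"])       # creation timestamp
print(data["object"])        # "chat.completion" or "text_completion"
print(data.get("metadata"))  # response metadata (id, usage, ...; exact nesting assumed)

for choice in data["choices"]:
    # Assumption: each choice follows the OpenAI chat layout with a nested message.
    print(choice["message"]["content"])
```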