POST /v1/chat/completions

Authorizations

Authorization
string
header, required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
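
As a quick illustration, here is a minimal authenticated call sketched in Python with the requests library. The https://api.pulze.ai host is an assumption inferred from this page's endpoint path, and the token is a placeholder.

```python
import requests

# Assumed base URL; substitute your actual Pulze API host if it differs.
API_URL = "https://api.pulze.ai/v1/chat/completions"
API_KEY = "YOUR_PULZE_API_KEY"  # placeholder auth token

response = requests.post(
    API_URL,
    headers={
        # Bearer authentication header as described above
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "messages": [{"role": "user", "content": "Hello, Pulze!"}],
    },
)
response.raise_for_status()
print(response.json())
```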

Body

application/json
prompt

The prompt text sent (for a /completions request)

messages
object[]

The conversation sent, with or without history (for a /chat/completions request)
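
A sketch of what the messages array can look like when history is included. The role names (system, user, assistant) follow the usual OpenAI-style convention and are an assumption here.

```python
# Conversation history sent as the "messages" array.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is nucleus sampling?"},
    {"role": "assistant", "content": "Nucleus sampling restricts sampling to the top_p probability mass."},
    {"role": "user", "content": "How does it differ from temperature?"},
]
```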

model
string
default: pulze

Specify the model you'd like Pulze to use (optional). Can be the full model name, or a subset of it for multi-matching. See https://docs.pulze.ai/overview/models.

Defaults to our dynamic routing, i.e. best model for this request.
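
Three ways the model field can be set, sketched below; the model identifiers are illustrative only, not guaranteed names from the models list.

```python
messages = [{"role": "user", "content": "Summarize nucleus sampling in one sentence."}]

# Full model name (identifier is illustrative; see the models list linked above):
payload_exact = {"model": "openai/gpt-4", "messages": messages}

# Subset of the name for multi-matching (illustrative):
payload_subset = {"model": "gpt-4", "messages": messages}

# Omit "model" (or keep the default "pulze") to let dynamic routing pick the best model.
payload_dynamic = {"messages": messages}
```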

suffix
string
default:

COMING SOON

max_tokens
integer

The maximum number of tokens that the response can contain.

temperature
number
default: 1

Optionally specify the temperature for this request only. Leave empty to allow Pulze to guess it for you.

Required range: 0 <= x <= 1
top_p
number
default: 1

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-top_p.

tools
object[]

tool_choice
default: none

Available options:
none,
auto

n
integer
default: 1

How many completions to generate for each prompt.

Required range: x >= 1
stream
boolean
default: false

** COMING SOON ** Specify whether you want the response to be streamed or returned as a standard HTTP response

logprobs
integer

COMING SOON. Include the log probabilities of the logprobs most likely tokens, as well as the chosen tokens. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-logprobs.

Required range: 0 < x < 5
stop
default:

Stop responding when this sequence of characters is generated. Leave empty to allow the model to decide.

presence_penalty
number
default: 0

Increases the model's likelihood to talk about new topics. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-presence_penalty.

Required range: -2 < x < 2
frequency_penalty
number
default: 0

Decreases the model's likelihood to repeat the same tokens/words. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-frequency_penalty.

Required range: -2 < x < 2
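
A payload sketch combining stop with the two penalties; the values are arbitrary, and passing stop as a list of sequences is an assumption.

```python
payload = {
    "messages": [{"role": "user", "content": "List three model routing strategies."}],
    "stop": ["\n\n"],          # stop generating once this sequence appears
    "presence_penalty": 0.5,   # nudge the model toward new topics
    "frequency_penalty": 0.3,  # discourage repeating the same tokens/words
}
```
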
best_of
integer
default: 1

The number of responses to generate server-side. Out of those, the best n are returned.
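
A sketch of n together with best_of: more candidates are generated server-side and only the best n come back in choices. The values here are arbitrary.

```python
payload = {
    "messages": [{"role": "user", "content": "Suggest a project name."}],
    "n": 3,        # return three completions in the "choices" array
    "best_of": 5,  # generate five candidates server-side and return the best three
}
```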

logit_bias
object

COMING SOON. Modify the likelihood of specified tokens appearing in the completion. See https://platform.openai.com/docs/api-reference/completions/create#completions/create-logit_bias.

See https://help.openai.com/en/articles/5247780-using-logit-bias-to-define-token-probability for a detailed explanation of how to use it.
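
Since logit_bias is marked COMING SOON, the sketch below only illustrates the OpenAI-style format from the links above (a map from token IDs to a bias between -100 and 100). The token IDs are placeholders.

```python
# OpenAI-style logit_bias: token ID (as a string key) -> bias from -100 to 100.
# Token IDs here are placeholders; real IDs depend on the model's tokenizer.
payload = {
    "messages": [{"role": "user", "content": "Name a color."}],
    "logit_bias": {
        "2599": -100,  # effectively ban this token
        "766": 5,      # mildly favor this token
    },
}
```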

Response

200 - application/json

The response returned to the user by the Chat Completions endpoint

choices
object[]
required
model
string
required

The fully qualified model name used by PulzeEngine

object
enum<string>
required

The type of response object

Available options:
text_completion,
chat.completion
created
integer
default: 0

Creation timestamp

metadata
object

Metadata of the response

id
string

This ID is generated by the database when the request is saved

usage
object

Tokens used
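
A sketch of reading the documented response fields from a decoded JSON body. Only the top-level keys come from the schema above; the inner shape of each choice (an OpenAI-style message object) and the sample values are assumptions.

```python
# Sample response body shaped like the schema above (values are illustrative).
data = {
    "choices": [{"index": 0, "message": {"role": "assistant", "content": "Hello!"}}],
    "model": "openai/gpt-4",
    "object": "chat.completion",
    "created": 1700000000,
    "metadata": {},
    "usage": {"prompt_tokens": 12, "completion_tokens": 3, "total_tokens": 15},
}

print(data["model"])    # fully qualified model name used by PulzeEngine
print(data["object"])   # "chat.completion" for this endpoint
print(data["usage"])    # tokens used
for choice in data["choices"]:  # one entry per requested completion (n)
    print(choice["message"]["content"])
```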