Chat Completions
Perform a Chat Completion request.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
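For example, the authorization header can be built like this (a minimal sketch; the token value is a placeholder, not a real key):

```python
# Placeholder token -- replace with your actual Pulze auth token.
token = "pk-example-token"

headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json",
}
```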
Body
The conversation to send, with or without history, for a /chat/completions request
The number of responses to generate. Of those, the best n are returned.
x > 1
https://platform.openai.com/docs/api-reference/completions/create#completions/create-frequency_penalty Positive values decrease the model's likelihood to repeat the same tokens/words
-2 < x < 2
COMING SOON https://platform.openai.com/docs/api-reference/completions/create#completions/create-logit_bias Modify the likelihood of specified tokens appearing in the completion.
See here for a detailed explanation on how to use: https://help.openai.com/en/articles/5247780-using-logit-bias-to-define-token-probability
COMING SOON https://platform.openai.com/docs/api-reference/completions/create#completions/create-logprobs Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens.
0 < x < 5
The maximum number of tokens that the response can contain.
https://docs.pulze.ai/overview/models Specify the model you'd like Pulze to use (optional). Can be the full model name, or a subset of it for multi-matching.
Defaults to our dynamic routing, i.e. the best model for this request.
How many completions to generate for each prompt. Defaults to 1.
x > 1
The list of plugins to enable for the request
https://platform.openai.com/docs/api-reference/completions/create#completions/create-presence_penalty Positive values increase the model's likelihood to talk about new topics
-2 < x < 2
https://platform.openai.com/docs/api-reference/chat/create#chat-create-response_format An object specifying the format that the model must output. Must be one of "text" or "json_object". Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. To help ensure you don't forget, the API will throw an error if the string "JSON" does not appear somewhere in the context.
Stop responding when this sequence of characters is generated. Leave empty to allow the model to decide.
Specify if you want the response to be streamed or to be returned as a standard HTTP request. Currently we only support streaming for OpenAI-compatible models.
Optionally specify the temperature for this request only. Leave empty to allow Pulze to guess it for you.
0 < x < 1
none, auto
https://octo.ai/docs/text-gen-solution/rest-api#input-parameters A value between 0.0 and 1.0 that controls the probability of the model generating a particular token.
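Putting the body parameters above together, a request payload might look like the following sketch. The field names assume the OpenAI-compatible schema, and the model name and values are illustrative, not recommendations:

```python
# Sketch of a /chat/completions request body using the parameters above.
payload = {
    "model": "openai/gpt-4",      # optional; omit to use dynamic routing
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the plot of Hamlet."},
    ],
    "n": 1,                       # completions per prompt
    "temperature": 0.7,           # omit to let Pulze choose
    "frequency_penalty": 0.5,     # discourage repeated tokens
    "presence_penalty": 0.5,      # encourage new topics
    "max_tokens": 256,            # cap on response length
    "stream": False,              # set True for streamed responses
}
```

This payload would then be sent as the JSON body of the request, alongside the Bearer authorization header described above.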
Response
The response returned to the user by the Chat Completions endpoint
The fully qualified model name used by PulzeEngine
The type of response object
text_completion, chat.completion
Creation timestamp, in milliseconds (note: milliseconds, not seconds)
This ID is generated by the database when the request is saved
Metadata of the response
Tokens used
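Based on the fields above, a response might look roughly like the following sketch. The completion payload itself is omitted here, the values are placeholders, and the exact shape of metadata and the token-usage object (shown with OpenAI-style field names) is an assumption:

```json
{
  "id": "c0ffee-1234",
  "object": "chat.completion",
  "created": 1700000000000,
  "model": "openai/gpt-4",
  "metadata": {},
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 34,
    "total_tokens": 46
  }
}
```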