Completions - Pulze.ai

POST

completions

curl --request POST \
  --url https://api.pulze.ai/v1/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "<string>",
  "max_tokens": 123,
  "temperature": 0.5,
  "top_p": 123,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {
          "type": "object",
          "properties": {},
          "required": [
            "<string>"
          ]
        }
      }
    }
  ],
  "tool_choice": "none",
  "n": 2,
  "stream": true,
  "logprobs": 2,
  "stop": "<string>",
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "best_of": 2,
  "logit_bias": {},
  "response_format": {
    "type": "text"
  },
  "prompt": "<string>"
}'

{
  "choices": [
    {
      "index": 123,
      "finish_reason": "<string>",
      "text": "<string>",
      "logprobs": 123
    }
  ],
  "created": 0,
  "metadata": {
    "app_id": "<string>",
    "model": {
      "model": "<string>",
      "provider": "<string>",
      "owner": "<string>",
      "namespace": "<string>",
      "at": "<string>"
    },
    "costs": {
      "total_tokens": 123,
      "prompt_tokens": 123,
      "completion_tokens": 123
    },
    "cost_savings": {
      "total_tokens": 123,
      "prompt_tokens": 123,
      "completion_tokens": 123
    },
    "generated_artifacts": {
      "items": [
        {}
      ]
    },
    "search_results": {
      "items": [
        {}
      ]
    },
    "latency": 123,
    "labels": {},
    "error": "<string>",
    "scores": {
      "best_models": [],
      "candidates": [
        {}
      ]
    },
    "score": 123,
    "temperature": 123,
    "max_tokens": 0,
    "status_code": 123,
    "retries": 0,
    "extra": {},
    "warning": "<string>"
  },
  "id": "<string>",
  "usage": {
    "total_tokens": 123,
    "prompt_tokens": 123,
    "completion_tokens": 123
  },
  "model": "<string>",
  "object": "text_completion"
}

This endpoint supports:

Weights

Labels

Policies

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

prompt

required

The prompt text sent (for a /completions request)

Minimum length: 1

model

string | null

default:pulze

https://docs.pulze.ai/overview/models Specify the model you'd like Pulze to use. (optional). Can be the full model name, or a subset for multi-matching.

Defaults to our dynamic routing, i.e. best model for this request.

max_tokens

integer | null

The maximum number of tokens that the response can contain.

temperature

number | null

default:1

Optionally specify the temperature for this request only. Leave empty to allow Pulze to guess it for you.

Required range: 0 <= x <= 1

top_p

number | null

default:1

https://octo.ai/docs/text-gen-solution/rest-api#input-parameters A value between 0.0 and 1.0 that controls the probability of the model generating a particular token.

tools

object[] | null

tool_choice

default:none

Available options:

none,

auto

integer | null

How many completions to generate for each prompt. @default 1

Required range: x >= 1

stream

boolean | null

default:false

Specify if you want the response to be streamed or to be returned as a standard HTTP request. Currently we only support streaming for OpenAI-compatible models.

logprobs

integer | null

COMING SOON https://platform.openai.com/docs/api-reference/completions/create#completions/create-logprobs Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens.

Required range: 0 <= x <= 5

stop

default:

Stop responding when this sequence of characters is generated. Leave empty to allow the model to decide.

presence_penalty

number | null

https://platform.openai.com/docs/api-reference/completions/create#completions/create-presence_penalty Increase the model's likelihood to talk about new topics

Required range: -2 <= x <= 2

frequency_penalty

number | null

https://platform.openai.com/docs/api-reference/completions/create#completions/create-frequency_penalty Increase the model's likelihood to not repeat tokens/words

Required range: -2 <= x <= 2

best_of

integer | null

The number of responses to generate. Out of those, it will return the best n.

Required range: x >= 1

logit_bias

object | null

COMING SOON https://platform.openai.com/docs/api-reference/completions/create#completions/create-logit_bias Modify the likelihood of specified tokens appearing in the completion.

See here for a detailed explanation on how to use: https://help.openai.com/en/articles/5247780-using-logit-bias-to-define-token-probability

response_format

object | null

https://platform.openai.com/docs/api-reference/chat/create#chat-create-response_format An object specifying the format that the model must output. Must be one of "text" or "json_object". Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. To help ensure you don't forget, the API will throw an error if the string "JSON" does not appear somewhere in the context.

Response

200

application/json

Successful Response

The response returned to the user by the (text) Completions endpoint

choices

object[]

required

model

string

required

The fully qualified model name used by PulzeEngine

object

enum<string>

required

The type of response object

Available options:

text_completion,

chat.completion

created

integer

default:0

Creation timestamp -- in milliseconds (!)

metadata

object

Metadata of the response

metadata.app_id

string

The ID of the app this request belongs to

metadata.model

object | null

The model used in the request

metadata.costs

object

Cost (in $) of the request

metadata.cost_savings

object

Price difference -- compared with GPT-4

metadata.generated_artifacts

object

Generated artifacts

metadata.search_results

object

Search results

metadata.latency

number | null

The time it took for the Provider to return a response

metadata.labels

object | null

Custom labels (metadata) sent along in the request

metadata.error

string | null

If an error occurs, it will be stored here

metadata.scores

object | null

A ranking of the best models for a given request

metadata.score

number | null

default:0

The score for the currently used LLM

metadata.temperature

Temperature used for the request

metadata.max_tokens

integer | null

Maximum number of tokens that can be used in the request+response.Leave empty to make it automatic, and set to -1 to use the maximum context size (model-dependent)

Required range: x >= -1

metadata.status_code

integer | null

Status code of the response

metadata.retries

integer

default:0

The number of retries needed to get the answer. null or 0 means no retries were required

metadata.extra

object | null

Extra data

metadata.warning

string | null

Show a warning -- deprecation messages, etc.

string

This ID gets generated by the database when we save the request

usage

object

Tokens used

Was this page helpful?

Raise issue

curl --request POST \
  --url https://api.pulze.ai/v1/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "<string>",
  "max_tokens": 123,
  "temperature": 0.5,
  "top_p": 123,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {
          "type": "object",
          "properties": {},
          "required": [
            "<string>"
          ]
        }
      }
    }
  ],
  "tool_choice": "none",
  "n": 2,
  "stream": true,
  "logprobs": 2,
  "stop": "<string>",
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "best_of": 2,
  "logit_bias": {},
  "response_format": {
    "type": "text"
  },
  "prompt": "<string>"
}'

{
  "choices": [
    {
      "index": 123,
      "finish_reason": "<string>",
      "text": "<string>",
      "logprobs": 123
    }
  ],
  "created": 0,
  "metadata": {
    "app_id": "<string>",
    "model": {
      "model": "<string>",
      "provider": "<string>",
      "owner": "<string>",
      "namespace": "<string>",
      "at": "<string>"
    },
    "costs": {
      "total_tokens": 123,
      "prompt_tokens": 123,
      "completion_tokens": 123
    },
    "cost_savings": {
      "total_tokens": 123,
      "prompt_tokens": 123,
      "completion_tokens": 123
    },
    "generated_artifacts": {
      "items": [
        {}
      ]
    },
    "search_results": {
      "items": [
        {}
      ]
    },
    "latency": 123,
    "labels": {},
    "error": "<string>",
    "scores": {
      "best_models": [],
      "candidates": [
        {}
      ]
    },
    "score": 123,
    "temperature": 123,
    "max_tokens": 0,
    "status_code": 123,
    "retries": 0,
    "extra": {},
    "warning": "<string>"
  },
  "id": "<string>",
  "usage": {
    "total_tokens": 123,
    "prompt_tokens": 123,
    "completion_tokens": 123
  },
  "model": "<string>",
  "object": "text_completion"
}

TUTORIALS

Weights

Labels

Policies

Authorizations

Body

Response