baseten_chat_completions

Tool

Create a chat completion using OpenAI-compatible API. **Supported Models:** - `deepseek-ai/DeepSeek-V3-0324` - DeepSeek V3 0324 (164k context) 🧠 - `deepseek-ai/DeepSeek-V3.1` - DeepSeek V3.1 (164k context) 🧠 - `zai-org/GLM-4.6` - GLM 4.6 (200k context) 🧠 - `zai-org/GLM-4.7` - GLM 4.7 (200k context) 🧠 - `moonshotai/Kimi-K2-Instruct-0905` - Kimi K2 0905 (128k context) - `moonshotai/Kimi-K2-Thinking` - Kimi K2 Thinking (262k context) 🧠 always-on - `moonshotai/Kimi-K2.5` - Kimi K2.5 (262k context) - `openai/gpt-oss-120b` - OpenAI GPT OSS 120B (128k context) 🧠 = Reasoning model. Use `reasoning_effort` param (low/medium/high) to control thinking depth. Response includes `reasoning_content` field with chain-of-thought. Supports streaming, tool calling, structured outputs.

Pricing

Per call

$0.01

Model

flat

Pay only for what you use. No subscriptions.

Inputs

top_logprobs

number

reasoning_effort

string

logit_bias

object

seed

number

bad

string

skip_special_tokens

boolean

documents

string

presence_penalty

number

echo

boolean

top_p_min

number

early_stopping

boolean

tools

string

logprobs

boolean

top_p

number

frequency_penalty

number

response_format

object

truncate_prompt_tokens

number

best_of

number

stream

boolean

top_k

number

disaggregated_params

object

temperature

number

tool_choice

string

model *

string

ignore_eos

boolean

chat_template

string

max_tokens

number

add_generation_prompt

boolean

number

min_tokens

number

min_p

number

spaces_between_special_tokens

boolean

chat_template_args

object

stop

string

parallel_tool_calls

boolean

include_stop_str_in_output

boolean

messages *

string

bad_token_ids

string

stream_options

object

user

string

repetition_penalty

number

length_penalty

number

stop_token_ids

string

add_special_tokens

boolean

Baseten Model APIs

Tool