xpay hub logo
hub
  • Tools
  • Explore
  • Docs
  • xpay.sh
Baseten Model APIs
Baseten Model APIs
Tool

baseten_chat_completions

Create a chat completion using OpenAI-compatible API. **Supported Models:** - `deepseek-ai/DeepSeek-V3-0324` - DeepSeek V3 0324 (164k context) 🧠 - `deepseek-ai/DeepSeek-V3.1` - DeepSeek V3.1 (164k context) 🧠 - `zai-org/GLM-4.6` - GLM 4.6 (200k context) 🧠 - `zai-org/GLM-4.7` - GLM 4.7 (200k context) 🧠 - `moonshotai/Kimi-K2-Instruct-0905` - Kimi K2 0905 (128k context) - `moonshotai/Kimi-K2-Thinking` - Kimi K2 Thinking (262k context) 🧠 always-on - `moonshotai/Kimi-K2.5` - Kimi K2.5 (262k context) - `openai/gpt-oss-120b` - OpenAI GPT OSS 120B (128k context) 🧠 = Reasoning model. Use `reasoning_effort` param (low/medium/high) to control thinking depth. Response includes `reasoning_content` field with chain-of-thought. Supports streaming, tool calling, structured outputs.


Pricing

Per call

$0.01

Model

flat


Pay only for what you use. No subscriptions.
Inputs

top_logprobs

number

reasoning_effort

string

logit_bias

object

seed

number

bad

string

skip_special_tokens

boolean

documents

string

presence_penalty

number

echo

boolean

top_p_min

number

early_stopping

boolean

tools

string

logprobs

boolean

top_p

number

frequency_penalty

number

response_format

object

truncate_prompt_tokens

number

best_of

number

stream

boolean

top_k

number

disaggregated_params

object

temperature

number

tool_choice

string

model *

string

ignore_eos

boolean

chat_template

string

max_tokens

number

add_generation_prompt

boolean

n

number

min_tokens

number

min_p

number

spaces_between_special_tokens

boolean

chat_template_args

object

stop

string

parallel_tool_calls

boolean

include_stop_str_in_output

boolean

messages *

string

bad_token_ids

string

stream_options

object

user

string

repetition_penalty

number

length_penalty

number

stop_token_ids

string

add_special_tokens

boolean
Try It
API
MCP Config
Input Parameters
top_logprobs
Top logprobs to return (0-20)
reasoning_effort
Reasoning depth for supported models (low/medium/high). Default: medium. Supported on: DeepSeek V3.1, DeepSeek V3 0324, GLM 4.7, GLM 4.6, Kimi K2 Thinking
logit_bias
Token ID to bias map (-100 to 100)
seed
Random seed
bad
Words to avoid
documents
Documents for RAG
presence_penalty
Penalize by presence
top_p_min
Min dynamic top_p
tools
Functions model can call
top_p
Nucleus sampling 0-1
frequency_penalty
Penalize tokens by frequency (default: 0)
response_format
Response format type
truncate_prompt_tokens
Truncate prompt to N tokens
best_of
Candidates to generate (only 1)
top_k
Top-K sampling
disaggregated_params
Advanced distributed inference params
temperature
Sampling temperature 0-4
tool_choice
Tool calling mode
model *
Model slug (e.g., deepseek-ai/DeepSeek-V3.1)
chat_template
Custom Jinja template
max_tokens
Max tokens (default: 4096)
n
Number of completions (only 1)
min_tokens
Minimum tokens before stopping
min_p
Min probability threshold
chat_template_args
Chat template arguments
stop
Stop sequences
messages *
Conversation messages with role and content
bad_token_ids
Token IDs to avoid
stream_options
Stream options
user
End-user identifier
repetition_penalty
Repetition penalty
length_penalty
Length penalty for beam search
stop_token_ids
Token IDs that stop generation
Cost per run
Execution cost
$0.01
Deducted from your xPay allowance
xpay hub logo
hub

Marketplace for AI Capabilities. Run agents, tools & prompts with pay-per-use micropayments.

Product
ExploreCollectionsBundles
Resources
Documentationxpay.shGitHub

© 2026 Agentically Inc. All rights reserved.Microtransactions happen via Stablecoins