Z.ai API
Tool

zai_chat_completion

Create a chat completion model that generates AI replies for given conversation messages. It supports multimodal inputs (text, images, audio, video, file), offers configurable parameters (like temperature, max tokens, tool use), and supports both streaming and non-streaming output modes.


Pricing

Per call: $0.02
Pricing model: flat

Pay only for what you use. No subscriptions.
Inputs

  max_tokens (integer)
  do_sample (boolean)
  thinking (object)
  tools (array)
  tool_stream (boolean)
  top_p (number)
  response_format (object)
  stop (array)
  stream (boolean)
  user_id (string)
  temperature (number)
  messages * (array)
  tool_choice (string)
  model * (string)
  request_id (string)

(* required)
Input Parameters
max_tokens
The maximum number of tokens for model output. The GLM-4.6 series supports up to 128K output tokens, the GLM-4.5 series up to 96K, the GLM-4.5V series up to 16K, and GLM-4-32B-0414-128K up to 16K.
thinking
Supported only by GLM-4.5 series and later models. Controls whether the model enables chain-of-thought reasoning.
tools
A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
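The exact tool schema is not shown on this page; the sketch below assumes the common OpenAI-style function-calling shape, with a hypothetical `get_weather` function as the example.

```python
# Hypothetical entry for the `tools` array, assuming the OpenAI-style
# function-calling schema (not confirmed by this page).
get_weather_tool = {
    "type": "function",  # only functions are supported as tools
    "function": {
        "name": "get_weather",  # hypothetical function name
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

tools = [get_weather_tool]  # up to 128 function entries are supported
```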
top_p
An alternative to temperature-based sampling (nucleus sampling); value range: `[0.01, 1.0]`. The default is `0.95` for the GLM-4.6 and GLM-4.5 series and `0.9` for GLM-4-32B-0414-128K.
response_format
Specifies the response format of the model. Defaults to text. Two formats are supported: `{ "type": "text" }` (plain text mode, returns natural language text) and `{ "type": "json_object" }` (JSON mode, returns valid JSON data). When using JSON mode, it is recommended to explicitly request JSON output in the prompt.
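A minimal sketch of requesting JSON mode, following the recommendation above to also ask for JSON explicitly in the prompt (the model code used here is an assumption):

```python
# Request body fragment enabling JSON mode via response_format.
payload = {
    "model": "glm-4.6",  # assumed model code; see the `model` parameter below
    "response_format": {"type": "json_object"},  # switch from default text mode
    "messages": [
        # The prompt explicitly requests JSON, as recommended for JSON mode.
        {"role": "user", "content": "List three primary colors as a JSON array."}
    ],
}
```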
stop
Stop word list. Generation stops when the model encounters any specified string. Currently, only one stop word is supported, in the format ["stop_word1"].
user_id
Unique ID for the end user, 6–128 characters. Avoid using sensitive information.
temperature
Sampling temperature, controls the randomness of the output; must be a number in the range `[0.0, 1.0]`. The default is `1.0` for the GLM-4.6 series, `0.6` for the GLM-4.5 series, and `0.75` for GLM-4-32B-0414-128K.
messages *
The current conversation message list, used as the model's prompt input and provided as a JSON array, e.g. `{"role": "user", "content": "Hello"}`. Possible message types include system, user, assistant, and tool messages. Note: the input must not consist of system messages or assistant messages only.
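For illustration, a message list mixing the roles described above (contents are invented placeholders); note that it satisfies the constraint of not consisting of system or assistant messages only:

```python
# A conversation history passed as the `messages` array.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi! How can I help you today?"},
    {"role": "user", "content": "Summarize our chat in one sentence."},
]
```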
tool_choice
Controls how the model selects which function to call. Applicable only when the tool type is function. The default and only supported value is `auto`.
model *
The model code to be called. GLM-4.6 is the latest flagship model series, foundational models designed specifically for agent applications.
request_id
Passed by the caller and must be unique; used to distinguish individual requests. If not provided, the platform generates one by default.
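Putting the parameters above together, a sketch of a complete request body. Endpoint and authentication are handled by the xpay hub tool wrapper and are omitted; the model code and default values shown are taken from the parameter descriptions, and only `model` and `messages` are required.

```python
import uuid

# Assembled request body for zai_chat_completion (a sketch, not the
# definitive wire format).
payload = {
    "model": "glm-4.6",                                  # required; assumed model code
    "messages": [{"role": "user", "content": "Hello"}],  # required
    "temperature": 1.0,        # GLM-4.6 series default
    "top_p": 0.95,             # GLM-4.6 series default
    "max_tokens": 1024,        # well under the 128K GLM-4.6 output limit
    "stream": False,           # non-streaming output mode
    "request_id": str(uuid.uuid4()),  # unique per request; optional
}
```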
Cost per run: $0.02, deducted from your xPay allowance.
© 2026 Agentically Inc. All rights reserved. Microtransactions happen via stablecoins.