GLM-4V Series
The GLM-4V series contains 3 models for different application scenarios.
- GLM-4V-Plus: Excellent multimodal comprehension; it can process up to 5 images simultaneously and supports video comprehension, making it suitable for complex multimedia analysis scenarios.
- GLM-4V: Focused on image content understanding; it can parse up to 5 images simultaneously and is suitable for scenarios requiring deep image understanding.
- GLM-4V-Flash (Free): Focused on efficient single-image understanding, suitable for fast image parsing scenarios such as real-time image analysis or batch image processing.
Synchronous call

Interface request

| Item | Description |
| --- | --- |
| Transmission method | HTTPS |
| Request address | https://open.bigmodel.cn/api/paas/v4/chat/completions |
| Invocation method | Synchronous call that waits for the model to finish executing and returns the final result, or SSE call |
| Character encoding | UTF-8 |
| Request format | JSON |
| Response format | JSON or standard stream event |
| HTTP request method | POST |
| Development language | Any language capable of issuing HTTP requests |
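Since any language that can issue HTTP requests can call the endpoint, the request can also be assembled without the official SDK. The sketch below builds (but does not send) the POST request with Python's standard library; the Bearer-token `Authorization` header shown here is an assumption of this example, so check the platform's authentication documentation before relying on it.

```python
# Minimal sketch: construct the HTTPS POST request by hand (not sent here).
# The Authorization scheme is assumed, not taken from this document.
import json
import urllib.request

API_URL = "https://open.bigmodel.cn/api/paas/v4/chat/completions"

def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a urllib Request for the chat/completions endpoint."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")  # UTF-8 JSON body
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )

req = build_request("YOUR API KEY", {"model": "glm-4v-flash", "messages": []})
print(req.full_url, req.get_method())
```

Sending it is then a matter of `urllib.request.urlopen(req)` (or any HTTP client of your choice).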
Request parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | String | Yes | Code of the model to call: glm-4v-plus, glm-4v, or glm-4v-flash (free). |
| messages | List | Yes | The list of current dialog messages passed to the model as the prompt, in the form of a JSON array. For example, a video-comprehension message: { "role": "user", "content": [ { "type": "video_url", "video_url": { "url": "https://xxx/xx.mp4" } }, { "type": "text", "text": "Please describe this video carefully" } ] } An image-comprehension message: { "role": "user", "content": [ { "type": "image_url", "image_url": { "url": "https://xxx/xx.jpg" } }, { "type": "text", "text": "Explain the phenomenon in the picture" } ] } Possible message types are User message and Assistant message; see the Messages format section below. |
| request_id | String | No | A unique identifier for the request, supplied by the caller; the caller must ensure uniqueness. If omitted, the platform generates one by default. |
| do_sample | Boolean | No | When true, the sampling strategy is enabled; when false, the sampling parameters temperature and top_p have no effect. |
| stream | Boolean | No | Set to false or omit this parameter for synchronous calls. If set to true, the model returns the generated content chunk by chunk as a standard event stream and ends the stream with a data: [DONE] message. |
| temperature | Float | No | Sampling temperature, which controls the randomness of the output; must be positive. Range: [0.0, 1.0]; default 0.8. Larger values make the output more random and creative; smaller values make it more stable and deterministic. Adjust top_p or temperature for your application scenario, but do not adjust both at the same time. |
| top_p | Float | No | Nucleus sampling, an alternative to sampling with temperature. Range: [0.0, 1.0]; default 0.6. The model considers only the tokens within the top_p probability mass; for example, 0.1 means the decoder considers only tokens in the top 10% of the probability distribution. Adjust top_p or temperature for your application scenario, but do not adjust both at the same time. |
| max_tokens | Integer | No | Maximum number of model output tokens; default 1024, with a maximum of 1024 tokens. |
| user_id | String | No | Unique ID of the end user, used to help the platform intervene in end-user violations such as the generation of illegal or undesirable content, or other abusive behavior. Length: minimum 6 characters, maximum 128 characters. |
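The parameter rules above can be checked client-side before making a call. The sketch below is an illustrative payload builder, not part of the SDK; `make_payload` and its checks are this example's own names, mirroring the documented ranges.

```python
# Sketch: assemble a request payload and sanity-check it against the
# documented parameter rules (illustrative only, not an official validator).
def make_payload(model: str, messages: list, **opts) -> dict:
    """Build a chat/completions payload, enforcing the documented ranges."""
    if "temperature" in opts and "top_p" in opts:
        raise ValueError("adjust top_p or temperature, not both")
    t = opts.get("temperature")
    if t is not None and not 0.0 <= t <= 1.0:
        raise ValueError("temperature must be in [0.0, 1.0]")
    p = opts.get("top_p")
    if p is not None and not 0.0 <= p <= 1.0:
        raise ValueError("top_p must be in [0.0, 1.0]")
    if opts.get("max_tokens", 1024) > 1024:
        raise ValueError("max_tokens may not exceed 1024")
    return {"model": model, "messages": messages, **opts}

payload = make_payload("glm-4v-plus", [], temperature=0.8, max_tokens=512)
```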
Messages format

The model accepts two message types, User message and Assistant message, whose formats differ as follows:

User message

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| role | String | Yes | The role of the message; must be user. |
| content | List | Yes | Message content. |
| type | String | Yes | Content type: text, image_url, or video_url. Video and image types cannot be supplied at the same time. |
| text | String | Yes | The text, supplied when type is text. |
| image_url | Object | Yes | Supplied when type is image_url. |
| url | String | Yes | Image URL or Base64 encoding. Each image must be smaller than 5 MB and at most 6000×6000 pixels; jpg, png, and jpeg formats are supported. Note: GLM-4V-Flash does not support Base64 encoding. |
| video_url | Object | Yes | Supplied when type is video_url. Only glm-4v-plus supports video input, and the video_url item must be the first element of the content list. |
| url | String | Yes | Video URL. Videos up to 20 MB and no longer than 30 s are supported; the supported video format is mp4. |
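The constraints above (image and video parts cannot be mixed, and a video_url part must come first in the content list) can be captured in a small helper. `user_message` below is a hypothetical convenience function of this example, not an SDK API.

```python
# Sketch: compose a User message that respects the documented constraints.
# user_message is a hypothetical helper, not part of the zhipuai SDK.
def user_message(text: str, image_url=None, video_url=None) -> dict:
    """Build a user message with optional image or video content."""
    if image_url and video_url:
        raise ValueError("video and image types cannot be used together")
    content = []
    if video_url:  # the video_url item must come first in the content list
        content.append({"type": "video_url", "video_url": {"url": video_url}})
    if image_url:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    content.append({"type": "text", "text": text})
    return {"role": "user", "content": content}

msg = user_message("Please describe this video carefully",
                   video_url="https://example.com/clip.mp4")
```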
Assistant message

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| role | String | Yes | The role of the message; must be assistant. |
| content | String | Yes | Message content. |
Response parameters

| Parameter | Type | Description |
| --- | --- | --- |
| id | String | Task ID. |
| created | Long | Request creation time, a Unix timestamp in seconds. |
| model | String | Model name. |
| choices | List | Model output for the current dialog. |
| index | Integer | Result index. |
| finish_reason | String | Reason the model stopped generating. stop: natural end of generation or a stop word was triggered; length: the token length limit was reached; sensitive: the content was intercepted by the content-safety system; network_error: model inference failed. |
| message | Object | Text message returned by the model. |
| role | String | Role of the current dialog turn; currently always assistant (the model). |
| content | List | Content of the current dialog turn. |
| usage | Object | Token usage statistics for this call, returned at the end of the session. |
| prompt_tokens | Integer | Number of input tokens. |
| completion_tokens | Integer | Number of tokens output by the model. |
| total_tokens | Integer | Total number of tokens. |
| content_filter | List | Content-safety information. |
| role | String | Stage at which the safety check took effect: role = assistant for model generation, role = user for user input, role = history for historical context. |
| level | Integer | Severity level 0-3, where 0 is the most severe and 3 the least severe. |
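The fields above can be read out of a parsed response as follows. `extract_answer` and the sample dict are illustrative only, assuming a response shaped like the documented parameters.

```python
# Sketch: pull the generated text out of a parsed response dict and
# react to finish_reason. The sample response below is fabricated.
def extract_answer(resp: dict) -> str:
    """Return the first choice's content, failing on safety interception."""
    choice = resp["choices"][0]
    if choice["finish_reason"] == "sensitive":
        raise RuntimeError("content intercepted by the safety system")
    return choice["message"]["content"]

sample = {
    "choices": [
        {
            "finish_reason": "stop",
            "message": {"role": "assistant",
                        "content": "A blue sea under a clear sky."},
        }
    ],
    "usage": {"prompt_tokens": 1037, "completion_tokens": 37,
              "total_tokens": 1074},
}
answer = extract_answer(sample)
```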
Example of a request

Upload video URL

```python
# Video comprehension example: upload a video URL
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="YOUR API KEY")  # Fill in your own API key
response = client.chat.completions.create(
    model="glm-4v-plus",  # Fill in the name of the model to call
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "video_url",
                    "video_url": {
                        "url": "https://sfile.chatglm.cn/testpath/video/xxxxx.mp4"
                    }
                },
                {
                    "type": "text",
                    "text": "Please describe this video carefully"
                }
            ]
        }
    ]
)
print(response.choices[0].message)
```
Upload video Base64

```python
import base64
from zhipuai import ZhipuAI

video_path = "/Users/YourComputer/xxxx.mp4"
with open(video_path, 'rb') as video_file:
    video_base = base64.b64encode(video_file.read()).decode('utf-8')

client = ZhipuAI(api_key="YOUR API KEY")  # Fill in your own API key
response = client.chat.completions.create(
    model="glm-4v-plus",  # Fill in the name of the model to call
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "video_url",
                    "video_url": {
                        "url": video_base
                    }
                },
                {
                    "type": "text",
                    "text": "Please describe this video carefully"
                }
            ]
        }
    ]
)
print(response.choices[0].message)
```
Upload image URL

```python
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="")  # Fill in your own API key
response = client.chat.completions.create(
    model="glm-4v",  # Fill in the name of the model to call
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in the picture"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://img1.baidu.com/it/u=1369931113,3388870256&fm=253&app=138&size=w931&n=0&f=JPEG&fmt=auto?sec=1703696400&t=f3028c7a1dca43a080aeb8239f09cc2f"
                    }
                }
            ]
        }
    ]
)
print(response.choices[0].message)
```
Upload image Base64

```python
import base64
from zhipuai import ZhipuAI

img_path = "/Users/YourComputer/xxxx.jpeg"
with open(img_path, 'rb') as img_file:
    img_base = base64.b64encode(img_file.read()).decode('utf-8')

client = ZhipuAI(api_key="YOUR API KEY")  # Fill in your own API key
response = client.chat.completions.create(
    model="glm-4v-plus",  # Fill in the name of the model to call
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": img_base
                    }
                },
                {
                    "type": "text",
                    "text": "Please describe this image"
                }
            ]
        }
    ]
)
print(response.choices[0].message)
```
Multi-round image dialog

```python
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="YOUR API KEY")  # Fill in your own API key
response = client.chat.completions.create(
    model="glm-4v-plus",  # Fill in the name of the model to call
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://cdn.bigmodel.cn/xxxxxx.png"
                    }
                },
                {
                    "type": "text",
                    "text": "What's in the picture"
                }
            ]
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "This is a painting depicting a natural landscape, showing a peaceful lake with water in varying shades of blue-green hues. A blooming cherry blossom tree grows near the lake, its pinkish-white blossoms looking vibrant in the sunlight. Under the tree there is a man dressed in blue sitting on a boat, seemingly enjoying the good time. In the background are rolling hills covered in green vegetation, while the tops of the hills are surrounded by clouds, adding a sense of mystery. A small pavilion can also be seen in the distance, which sits on a rocky outcrop near the lake, harmonizing with the surrounding natural environment.\n\nThe whole picture gives a feeling of tranquility and harmony, as if one is in a paradise."
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://cdn.bigmodel.cn/xxxxxxx.jpeg"
                    }
                },
                {
                    "type": "text",
                    "text": "How is this image different from the above image?"
                }
            ]
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "The first picture and the second picture are very different in terms of picture content, composition and color, forming a sharp contrast.\n\n1. The first picture is a landscape painting, mainly depicting natural scenery such as cherry blossom trees, lakes and mountains; it is rich in color and full of spring. The second picture is a landscape photo depicting the Great Wall in winter, with the Great Wall meandering through the snow-covered mountains; the overall color tone is cold, giving a feeling of coldness and serenity.\n\n2. Compositionally, the first picture adopts a symmetrical composition, with the cherry blossom tree as the center and the scenery on both sides echoing each other to form a balanced and harmonious picture, while the second picture adopts a diagonal composition, with the Great Wall serving as a visual guide line to lead the viewer's eyes across the whole picture, increasing its depth and sense of motion.\n\nOverall, these two pictures show a clear contrast in content, color, composition and atmosphere: one is a spring landscape full of vitality and color, the other a serene and mysterious winter photo of the Great Wall."
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://cdn.bigmodel.cn/xxxxxxx.jpeg"
                    }
                },
                {
                    "type": "text",
                    "text": "What is the difference between this image and the previous one?"
                }
            ]
        }
    ]
)
print(response.choices[0].message)
```
Response example

```json
{
    "created": 1703487403,
    "id": "8239375684858666781",
    "model": "glm-4v-plus",
    "request_id": "8239375684858666781",
    "choices": [
        {
            "finish_reason": "stop",
            "message": {
                "content": "The picture has a blue sea and blue sky with white clouds in the sky. In the bottom right corner of the picture there is an island or rock with dark green trees growing on it.",
                "role": "assistant"
            }
        }
    ],
    "usage": {
        "prompt_tokens": 1037,
        "completion_tokens": 37,
        "total_tokens": 1074
    }
}
```
Streaming output

Response parameters

| Parameter | Type | Description |
| --- | --- | --- |
| id | String | Task ID. |
| created | Long | Request creation time, a Unix timestamp in seconds. |
| choices | List | Model output for the current dialog. |
| index | Integer | Result index. |
| finish_reason | String | Reason the model stopped generating. stop: natural end of generation or a stop word was triggered; length: the token length limit was reached; sensitive: the content was intercepted by the content-safety system; network_error: model inference failed. |
| delta | Object | Incremental text message returned by the model. |
| role | String | Role of the current dialog turn; currently always assistant (the model). |
| content | String | Content of the current dialog turn. |
| usage | Object | Token usage statistics for this call. |
| prompt_tokens | Integer | Number of input tokens. |
| completion_tokens | Integer | Number of tokens output by the model. |
| total_tokens | Integer | Total number of tokens. |
| content_filter | List | Content-safety information. |
| role | String | Stage at which the safety check took effect: role = assistant for model generation, role = user for user input, role = history for historical context. |
| level | Integer | Severity level 0-3, where 0 is the most severe and 3 the least severe. |
Example of a request

```python
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="")  # Fill in your own API key
response = client.chat.completions.create(
    model="glm-4v-plus",  # Fill in the name of the model to call
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "sfile.chatglm.cn/testpath/xxxx.jpg"
                    }
                },
                {
                    "type": "text",
                    "text": "What's in the picture"
                }
            ]
        }
    ],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta)
```
Response example

```
data: {"id":"8305986882425703351","created":1705476637,"model":"glm-4v-plus","choices":[{"index":0,"delta":{"role":"assistant","content":"next"}}]}
data: {"id":"8305986882425703351","created":1705476637,"model":"glm-4v-plus","choices":[{"index":0,"delta":{"role":"assistant","content":"corner"}}]}
data: {"id":"8305986882425703351","created":1705476637,"model":"glm-4v-plus","choices":[{"index":0,"delta":{"role":"assistant","content":"There is one"}}]}
...
data: {"id":"8305986882425703351","created":1705476637,"model":"glm-4v-plus","choices":[{"index":0,"delta":{"role":"assistant","content":"trees"}}]}
data: {"id":"8305986882425703351","created":1705476637,"model":"glm-4v-plus","choices":[{"index":0,"delta":{"role":"assistant","content":"."}}]}
data: {"id":"8305986882425703351","created":1705476637,"model":"glm-4v-plus","choices":[{"index":0,"finish_reason":"stop","delta":{"role":"assistant","content":""}}],"usage":{"prompt_tokens":1037,"completion_tokens":37,"total_tokens":1074}}
```
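When consuming the raw event stream without the SDK, each event is a `data:` line carrying a JSON chunk, and the stream ends with `data: [DONE]`. A minimal parsing sketch, assuming the stream has already been split into text lines:

```python
# Sketch: decode a raw SSE stream of "data: ..." lines into JSON chunks,
# stopping at the terminal "data: [DONE]" marker.
import json

def iter_chunks(lines):
    """Yield each parsed JSON chunk from an iterable of SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end of stream
        yield json.loads(payload)

# Fabricated sample lines, for illustration only
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hi"}}]}',
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"!"}}]}',
    "data: [DONE]",
]
text = "".join(c["choices"][0]["delta"]["content"] for c in iter_chunks(sample))
```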