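This article uses Dify v0.12.1 and shows how to call the tencent/HunyuanVideo interface of SiliconFlow from an HTTP node in a Dify workflow in order to generate video from text. Both Dify and the HTTP service are deployed on the Sealos Cloud platform.
HunyuanVideo is an open-source video generation foundation model released by Tencent. With more than 13 billion parameters, it is currently the largest open-source video generation model. The model adopts a unified architecture for image and video generation and integrates key techniques such as data curation, joint image-video model training, and efficient infrastructure. It uses a multimodal large language model as its text encoder, performs spatio-temporal compression with a 3D VAE, and provides a prompt rewriting feature. According to professional human evaluation, HunyuanVideo outperforms existing state-of-the-art models in text alignment, motion quality, and visual quality.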
I. HunyuanVideo Interface
1. Create a text-to-video request
Submit a prompt to generate a video. The interface returns the requestId of the current request, and the caller then polls the status interface to obtain the video link. The generated result is valid for 10 minutes, so retrieve the video link promptly. The request looks like this:
import requests

url = "https://api.siliconflow.cn/v1/video/submit"

payload = {
    "model": "tencent/HunyuanVideo",
    "prompt": "",  # text description of the video to generate
    "seed": 123
}
headers = {
    "Authorization": "Bearer ",  # append your API key after "Bearer "
    "Content-Type": "application/json"
}

# Submit the generation job; the response body contains the requestId.
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
2. Get the generated video link
Query the status of the request to get the generated video, as shown below:
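import requests

url = "https://api.siliconflow.cn/v1/video/status"

payload = {"requestId": ""}  # the requestId returned by the submit call
headers = {
    "Authorization": "Bearer ",  # assumed to be required here as well; append your API key
    "Content-Type": "application/json"
}

# Ask for the current status of the generation job.
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)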
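The two calls above are meant to be combined into a polling loop: submit once, then query the status until the video is ready. Below is a minimal sketch of that loop. The helper name wait_for_video is hypothetical, and the status values "Succeed" and "Failed" as well as the results/videos/url layout of the response are assumptions about the status interface that should be checked against the SiliconFlow API reference. The HTTP call itself is passed in as fetch_status, so the loop can be exercised without network access.

```python
import time


def wait_for_video(request_id, fetch_status, interval=5.0, timeout=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll the status interface until the video is ready, then return its URL.

    fetch_status(request_id) must return the decoded JSON body of one status
    call. The field names and status values used below are ASSUMPTIONS about
    the response format, not something stated in this article.
    """
    deadline = clock() + timeout
    while True:
        body = fetch_status(request_id)
        status = body.get("status")
        if status == "Succeed":  # assumed success value
            return body["results"]["videos"][0]["url"]  # assumed layout
        if status == "Failed":  # assumed failure value
            raise RuntimeError(f"video generation failed: {body.get('reason')}")
        # Any other status is treated as "still running".
        if clock() >= deadline:
            raise TimeoutError(f"request {request_id} not finished after {timeout}s")
        sleep(interval)
```

With the requests library, fetch_status could be a small function that posts {"requestId": request_id} to the status URL with the headers shown above and returns response.json(). The timeout guards against a job that never leaves the queue.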
II. HunyuanVideo Interface Encapsulation
Text-to-video generation takes a relatively long time, so providers usually expose two interfaces: one returns a requestId for the request, and the other reports the generation status for that requestId and returns the video URL only once generation is complete. For this reason, the HTTP node in a Dify workflow cannot call the official SiliconFlow interface directly; the HunyuanVideo interface has to be wrapped first. The wrapper itself is simple: start a Flask service and implement the business logic in it.
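As an illustration of what such a wrapper could look like, here is a minimal sketch and not the service used in this article. It exposes a single blocking endpoint, so that one HTTP call from Dify covers both submitting and polling. The names handle_generate and create_app, the /generate path, the video_url field, and the generate callable (a function that takes a prompt, performs the submit and status calls from section I, and returns the video URL) are all hypothetical.

```python
def handle_generate(body, generate):
    """Validate the JSON body and run the blocking submit-and-poll call.

    Returns a (payload, http_status) pair, which keeps this logic usable
    and testable without Flask.
    """
    if not isinstance(body, dict):
        return {"error": "a JSON object is required"}, 400
    prompt = body.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        return {"error": "prompt is required"}, 400
    try:
        url = generate(prompt.strip())
    except TimeoutError as exc:  # generation did not finish in time
        return {"error": str(exc)}, 504
    except RuntimeError as exc:  # upstream reported a failure
        return {"error": str(exc)}, 502
    return {"video_url": url}, 200


def create_app(generate):
    """Wire handle_generate into a Flask application."""
    # Imported here so handle_generate works even where Flask is not installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/generate", methods=["POST"])  # hypothetical path
    def generate_route():
        payload, status = handle_generate(request.get_json(silent=True), generate)
        return jsonify(payload), status

    return app
```

The service could then be started with create_app(my_generate).run(host="0.0.0.0", port=5000), where my_generate and the port are placeholders. Because the request blocks until the video is ready, the caller's timeout has to be longer than the generation time.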
Generate requirements.txt with the following command:
pip freeze > requirements.txt
Build the image with the following command:
docker build -t 1000sprites/hunyuanvideo:v1 .
Note: the image tag must include your Docker Hub username (1000sprites here; replace it with your own). Otherwise, pushing the image fails with the error "denied: requested access to the resource is denied".
The build produces the image 1000sprites/hunyuanvideo:v1.
Click Push to Hub to upload the image to the Docker Hub repository.
If the uploaded repository is private on Docker Hub, set it to public so that the image can be pulled during deployment.
Deploying the HunyuanVideo Service on Sealos
In Sealos, click "Application Management".
Configure the application as needed. Take particular care with the image name, because the image is pulled from this address.
Return to "Application Management" to check the application. When its STATUS changes from Pending to Running, the application has started successfully and can be reached directly at its external network address. If you run into problems, check the Pod logs.
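Once the application is Running, a quick request against the external address confirms that the wrapper responds before it is wired into Dify. The following is a small sketch using only the Python standard library. It assumes the wrapper accepts a POST with a JSON prompt field; the /generate path, the function names, and the base URL passed in are placeholders rather than values from this deployment.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt):
    """Build a POST request that carries the prompt as a JSON body."""
    data = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/generate",  # hypothetical path
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_service(base_url, prompt, timeout=900):
    """Send the request and return the decoded JSON reply."""
    request = build_generate_request(base_url, prompt)
    # A generous timeout, because video generation can take minutes.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling call_service with the external address shown in Sealos and a short prompt should return the JSON produced by the wrapper. The same method, URL, and body are what the HTTP node in the Dify workflow would need to send.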
III. Dify Video Generation Workflow