Building a video generation workflow in Dify with the Tencent HunyuanVideo model interface

AI hands-on tutorials | updated 8 months ago | AI Sharing Circle

This article uses Dify v0.12.1. It shows how to call SiliconFlow's tencent/HunyuanVideo interface from an HTTP node in a Dify workflow in order to generate a video from text. Both Dify and the HTTP service are deployed on the Sealos Cloud platform.

HunyuanVideo is an open-source video generation foundation model released by Tencent. With more than 13 billion parameters, it was the largest open-source video generation model at the time of writing. The model adopts a unified architecture for image and video generation and integrates key technologies such as data curation, joint image-video model training, and efficient infrastructure. It uses a multimodal large language model as the text encoder, performs spatio-temporal compression with a 3D VAE, and provides prompt rewriting. According to professional human evaluation results, HunyuanVideo outperforms existing state-of-the-art models in text alignment, motion quality, and visual quality.

I. HunyuanVideo Interface

1. Create a text-to-video request

Submit a prompt to start video generation. The interface returns a requestId for the current request; to obtain the actual video link, poll the status interface with that requestId. The generated result is valid for 10 minutes, so retrieve the video link promptly. As shown below:

import requests

# Submit endpoint: returns a requestId for the generation job
url = "https://api.siliconflow.cn/v1/video/submit"
payload = {
    "model": "tencent/HunyuanVideo",
    "prompt": "<string>",  # text description of the video
    "seed": 123,
}
headers = {
    "Authorization": "Bearer <token>",  # your SiliconFlow API key
    "Content-Type": "application/json",
}
response = requests.post(url, json=payload, headers=headers)
print(response.text)

2. Get the video generation link

Query the status of a request by its requestId and retrieve the generated video link, as shown below:

import requests

# Status endpoint: reports progress for a previously submitted request
url = "https://api.siliconflow.cn/v1/video/status"
payload = {"requestId": "<string>"}  # the requestId returned by the submit call
headers = {
    "Authorization": "Bearer <token>",  # your SiliconFlow API key
    "Content-Type": "application/json",
}
response = requests.post(url, json=payload, headers=headers)
print(response.text)
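
The polling step described in this section can be sketched as a small helper. This is a minimal sketch rather than SiliconFlow's reference client: the function name poll_until_done, the status values "Succeed" and "Failed", and the results.videos[0].url layout are assumptions that the article does not state, so check them against the official API documentation. The interval and max_wait defaults are arbitrary.

```python
import time


def poll_until_done(check_status, request_id, interval=5.0, max_wait=600.0, sleep=time.sleep):
    """Call check_status(request_id) repeatedly until the video is ready.

    check_status is any callable that performs the status request shown in
    section 2 and returns the parsed JSON body as a dict.
    """
    waited = 0.0
    while True:
        body = check_status(request_id)
        state = body.get("status")
        # ASSUMPTION: the status strings and the results.videos[0].url layout
        # are not given in the article; confirm them in the SiliconFlow docs.
        if state == "Succeed":
            return body["results"]["videos"][0]["url"]
        if state == "Failed":
            raise RuntimeError(f"video generation failed: {body.get('reason')}")
        if waited >= max_wait:
            raise TimeoutError(f"request {request_id} not finished after {max_wait} s")
        sleep(interval)
        waited += interval
```

Here check_status would wrap the status call from section 2, for example a function that posts {"requestId": request_id} to the status URL and returns response.json(). Passing the HTTP call in as a parameter keeps the loop itself easy to test without network access.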

II. HunyuanVideo interface encapsulation

Text-to-video generation takes a relatively long time, so providers usually expose two interfaces: one returns a requestId for the request, and the other reports the generation status for that requestId and returns the video URL only once generation is complete. For this reason, a single HTTP node in a Dify workflow cannot obtain the video URL by calling SiliconFlow's official interface directly; the HunyuanVideo interface has to be wrapped first. The wrapper is simple: start a Flask service that implements this submit-then-poll logic.
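
The article does not list the wrapper's source, so the following is only a minimal sketch of what such a Flask service could look like, not the author's implementation. The route name /generate, the environment variable SILICONFLOW_API_KEY, the polling constants, the status values "Succeed" and "Failed", and the results.videos[0].url layout are all assumptions made for illustration.

```python
import os
import time

import requests
from flask import Flask, jsonify, request

API_BASE = "https://api.siliconflow.cn/v1/video"
API_TOKEN = os.environ.get("SILICONFLOW_API_KEY", "")  # hypothetical variable name
POLL_INTERVAL = 5   # seconds between status checks (arbitrary choice)
MAX_POLLS = 120     # give up after roughly 10 minutes (arbitrary choice)

app = Flask(__name__)


def call_siliconflow(path, payload):
    """POST a JSON payload to a SiliconFlow video endpoint and return the parsed body."""
    headers = {"Authorization": f"Bearer {API_TOKEN}", "Content-Type": "application/json"}
    response = requests.post(f"{API_BASE}/{path}", json=payload, headers=headers, timeout=30)
    response.raise_for_status()
    return response.json()


@app.route("/generate", methods=["POST"])  # hypothetical route name
def generate():
    body = request.get_json(silent=True) or {}
    prompt = body.get("prompt")
    if not prompt:
        return jsonify(error="prompt is required"), 400

    # Step 1: submit the prompt and keep the requestId.
    submitted = call_siliconflow("submit", {"model": "tencent/HunyuanVideo", "prompt": prompt})
    request_id = submitted["requestId"]

    # Step 2: poll the status endpoint until the video is ready or we give up.
    for _ in range(MAX_POLLS):
        status = call_siliconflow("status", {"requestId": request_id})
        # ASSUMPTION: status values and result layout are not from the article.
        if status.get("status") == "Succeed":
            return jsonify(video_url=status["results"]["videos"][0]["url"])
        if status.get("status") == "Failed":
            return jsonify(error=status.get("reason") or "generation failed"), 502
        time.sleep(POLL_INTERVAL)
    return jsonify(error="timed out waiting for the video"), 504


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

With this design the caller sends one request and blocks until the video URL is available, which is what lets a single HTTP node consume the service. The trade-off is that the request stays open for the whole generation time, so it has to finish within whatever timeout the calling node allows.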

Generate requirements.txt with the following command:

pip freeze > requirements.txt

Build the Docker image with the following command:

docker build -t 1000sprites/hunyuanvideo:v1 .

Special note: the image tag must begin with your own Docker Hub username (replace 1000sprites with yours). Otherwise, pushing the image fails with the error: denied: requested access to the resource is denied.

The build produces the image 1000sprites/hunyuanvideo:v1, as shown below:

Click Push to Hub to upload the image to the Docker Hub repository, as shown below:

If the uploaded repository is private on Docker Hub (this depends on your account's default privacy setting), set it to public so that the image can be pulled, as shown below:

III. Deploying the HunyuanVideo service on Sealos

Click "Application Management" as shown below:

Configure the application as needed. Take particular care to spell the image name correctly, because the image is pulled from that address, as shown below:

Click "Application Management" to view the application. When its STATUS changes from Pending to Running, the application has started successfully, and you can access it directly through its external network address. If you encounter problems, check the Pod logs.
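
Once the application is Running, a quick smoke test against the external address can confirm that the wrapper responds before it is wired into a Dify HTTP node. The sketch below uses only the Python standard library. The /generate route and the {"prompt": ...} request body are hypothetical; substitute whatever route and fields your own wrapper exposes, and use the external address shown in Sealos as base_url.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt):
    """Build the POST request that a Dify HTTP node (or any client) would send."""
    data = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",  # hypothetical route
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_video(base_url, prompt, timeout=900):
    """Send the request and return the parsed JSON reply from the service."""
    req = build_generate_request(base_url, prompt)
    # Generation is slow, so allow a generous timeout (arbitrary value).
    with urllib.request.urlopen(req, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

The same method, URL, header, and JSON body are what would be entered in the HTTP node's configuration in the Dify workflow.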