FastGPT is an LLM-based knowledge base Q&A system developed by the Circle Cloud team, providing out-of-the-box capabilities such as data processing and model invocation. FastGPT also supports visual workflow orchestration through Flow to build complex Q&A scenarios. FastGPT has earned 19.4k stars on GitHub.
SiliconCloud, from SiliconFlow, is a large-model cloud service platform with its own acceleration engine. SiliconCloud lets users test and use open source models quickly and at low cost. In practice, their models are fast, stable, and varied, covering dozens of models for language, embedding, reranking, TTS, STT, image generation, video generation, and more, which can satisfy all the model needs in FastGPT.
This article is a tutorial written by the FastGPT team that presents a solution for deploying FastGPT locally using SiliconCloud models exclusively.
1 Obtaining the SiliconCloud Platform API Key
- Open the SiliconCloud website and register/sign in for an account.
- After completing registration, open the API Key page, create a new API Key, and click the key to copy it for later use. (A quick sanity check of the key is sketched below.)
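Before wiring the key into FastGPT, it is worth verifying it directly against the API. A minimal sketch, assuming the OpenAI-compatible chat completions endpoint at the base URL used later in this tutorial (replace `sk-xxxxxxxx` with your actual key):

```bash
# Sanity-check the SiliconCloud API key. The endpoint is assumed from the
# OpenAI-compatible base URL configured below; the model name is from this tutorial.
curl https://api.siliconflow.cn/v1/chat/completions \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```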
2 Modifying FastGPT Environment Variables
```bash
OPENAI_BASE_URL=https://api.siliconflow.cn/v1
# Fill in the API Key provided by the SiliconCloud console
CHAT_API_KEY=sk-xxxxxxxx
```
FastGPT development and deployment documentation: https://doc.fastgpt.cn
3 Modifying the FastGPT Configuration File
SiliconCloud models are selected for the FastGPT configuration. Here Qwen2.5-72B-Instruct is configured as the pure language model and Qwen2-VL-72B-Instruct as the vision model; bge-m3 is chosen as the vector model; bge-reranker-v2-m3 as the rerank model; fish-speech-1.5 as the speech synthesis model; and SenseVoiceSmall as the speech input model.
Note: the ReRank model requires the API Key to be configured once more, in its own requestAuth field.
{ "llmModels": [ { "provider": "Other", // Model provider, mainly used for categorized display, currently has built-in providers including https://github.com/labring/FastGPT/blob/main/packages/global/core/ai/provider.ts, can pr Provide a new provider, or just fill in Other. "model": "Qwen/Qwen2.5-72B-Instruct", // model name (corresponds to the model name of the channel in OneAPI) "name": "Qwen2.5-72B-Instruct", // model alias "maxContext": 32000, // Maximum context "maxResponse": 4000, // maxResponse "quoteMaxToken": 30000, // Maximum quote content "maxTemperature": 1, // maxTemperature "charsPointsPrice": 0, // n points/1k token (commercial version) "censor": false, // Whether to enable sensitive censoring (commercial version) "vision": false, // if image input is supported "datasetProcess": true, // Whether to set to text comprehension model (QA), make sure at least one of them is true, otherwise the knowledge base will report an error. "usedInClassify": true, // whether to use it for question classification (make sure at least one of them is true) "usedInExtractFields": true, // whether to use it for content extraction (make sure at least one is true) "usedInToolCall": true, // if or not used for tool calls (make sure at least one is true) "usedInQueryExtension": true, // if used for question optimization (make sure at least one is true) "toolChoice": true, // whether tool choice is supported (used for categorization, content extraction, tool calls.) "functionCall": false, // Whether or not to support function calls (used for categorization, content extraction, and tool calls). Will prioritize toolChoice, if false, then use functionCall, if still false, then use prompter mode) "customCQPrompt": "", // custom text categorization prompter (does not support tool and function call model) "customExtractPrompt": "", // custom content extraction prompter "defaultSystemChatPrompt": "", // Default system chat prompt for dialogs "defaultConfig": {}, // Some default configurations (e.g. GLM4's top_p) to be carried when requesting an API "fieldMap": {} // field mapping (o1 models need to map max_tokens to max_completion_tokens) }, {} { "provider": "Other", "model": "QUALIFICATIONS": "QUALIFICATIONS", { "model": "Qwen/Qwen2-VL-72B-Instruct", { "name": "Qwen2-VL-72B-Instruct", { "model": "Qwen2-VL-72B-Instruct", } "name": "Qwen2-VL-72B-Instruct", "name": "Qwen2-VL-72B-Instruct", "maxContext": 32 "quoteMaxToken": 30000, "maxTemperature": 1,000,000, "maxTemperature": 1,000,000 "maxTemperature": 1, "charsPointsPrice". "charsPointsPrice": 0, "censor": false, false "censor": false, "vision": true, "datasetProcess". "usedInExtractFields": false, "usedInToolCall": false, "usedInToolCall": false "usedInQueryExtension": false, "toolChoice": false, "usedInExtractFields": false, "usedInToolCall": false, "usedInQueryExtension": false, "functionCall": false, "customCQPrompt". "customCQPrompt": "", "customExtractPrompt": false "defaultSystemChatPrompt": "", "functionCall": false, "customCQPrompt": "", "customExtractPrompt": "", "defaultConfig": {} } ], "vectorModels": [", "defaultConfig": {} "vectorModels": [ { "provider": "Other", "model": "Pro/BAAI/bge-m3", "vectorModels": [ { "model": "Pro/BAAI/bge-m3", "name": "Pro/BAAI/bge-m3", "name": "Pro/BAAI/bge-m3", "charsPointsPrice": 0, "defaultToken": 512 "maxToken": 5000, "weight": 100 "weight": 100 } ], "reRankModels". 
"reRankModels": [ { "model": "BAAI/bge-reranker-v2-m3", // The model here needs to correspond to the siliconflow The model name of the "name": "BAAI/bge-reranker-v2-m3", "requestUrl": "https://api.siliconflow.cn/v1/rerank", "requestAuth": "key" on siliconflow "requestAuth": "key requested on siliconflow" } ], "audioSpeechModels", "audioSpeechModels "audioSpeechModels": [ { "model": "fishaudio/fish-speech-1.5", "name": "fish-speech-1.5", [ { "name": "fish-speech-1.5", "voices": [ { "label": "fish-alex", "value": "fish-audio/fish-speech", [ { "value": "fishaudio/fish-speech-1.5:alex", "bufferId": "bufferId". "bufferId": "fish-alex" }, { "value": "fish-alex". { "label": "fish-anna", "value": "fishaudio/fish-speech-1.5:alex", { "value": "fishaudio/fish-speech-1.5:anna", { "bufferId": "fish-anna" }, { { "label": "fish-bella", "value": "fishaudio/fish-speech-1.5:anna", { "value": "fishaudio/fish-speech-1.5:bella", { "bufferId": "bufferId". "bufferId": "fish-bella" }, } { "label": "fish-benjamin", "value": "fish-audio/fish-speech-1.5:bella", { "value": "fishaudio/fish-speech-1.5:benjamin", { "bufferId": "fish-benjamin" }, { { "label": "fish-charles", "value": "fish-audio", "value": "fish-charles", { "value": "fishaudio/fish-speech-1.5:charles", { "bufferId": "bufferId". "bufferId": "fish-charles" }, { { "label": "fish-claire", "value": "fish-charles", { "value": "fishaudio/fish-speech-1.5:claire", { "bufferId": "bufferId". "bufferId": "fish-claire" }, { { "label": "fish-david", "value": "fish-audio/fish-speech-1.5:claire", { "value": "fishaudio/fish-speech-1.5:david", { "bufferId": "fish-david" }, { "label": "fish-diana", "value": "fish-audio/fish-speech-1.5:david", { "value": "fishaudio/fish-speech-1.5:diana", { "bufferId": "bufferId". "bufferId": "fish-diana" } ] } ] "whisperModel": { "model": "FunAudioLLM/SenseVoiceSmall", "name": "SenseVoiceSmall", "charsPointsPrice": 0 } }
4 Restart FastGPT
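FastGPT must be restarted for the new config.json and environment variables to take effect. For a docker-compose based deployment (the standard path in the FastGPT docs), this might look like the following; adjust to however you deployed:

```bash
# Restart the FastGPT containers so the new config.json and env vars take effect.
# Assumes a docker-compose based deployment as in the FastGPT docs.
docker-compose down
docker-compose up -d
```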
5 Testing the Experience
Testing Chat and Image Recognition
Just create a new simple application, select the corresponding model, enable image upload, and test it. For reference, the equivalent raw API call is sketched below.
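Under the hood this goes through the OpenAI-compatible chat API. A direct sketch of a vision request against the configured VL model, assuming the standard OpenAI image_url message format (the image URL is a placeholder):

```bash
# Direct vision request to the configured VL model.
# The image_url message format is assumed from the OpenAI chat convention.
curl https://api.siliconflow.cn/v1/chat/completions \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2-VL-72B-Instruct",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this picture?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}}
      ]
    }]
  }'
```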
As you can see, the 72B model responds very quickly. If this were run locally without a few 4090s, setting aside the effort of configuring the environment, I'm afraid a single output would take 30 seconds.
Testing Knowledge Base Import and Knowledge Base Q&A
Create a new knowledge base (since only one vector model is configured, the vector model selection will not be displayed on the page).
To import a local file, just select the file and click Next all the way through. 79 indexes took about 20 seconds to build. (The vectorization behind the import is a plain embeddings call; a direct check is sketched below.)
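A quick direct check of the configured bge-m3 vector model, assuming SiliconCloud's OpenAI-compatible embeddings endpoint:

```bash
# Direct embedding request to the configured vector model.
# Assumes the OpenAI-compatible /v1/embeddings endpoint on SiliconCloud.
curl https://api.siliconflow.cn/v1/embeddings \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Pro/BAAI/bge-m3",
    "input": "FastGPT is a knowledge base Q&A system."
  }'
```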
Now let's test knowledge base Q&A. First go back to the application we just created, select the knowledge base, adjust the parameters, and start the conversation.
Once the dialog is complete, click the citations at the bottom to view citation details, as well as the specific retrieval and reranking scores.
Testing Voice Playback
Continuing in the same app, find Voice Playback in the configuration on the left, click it, select a voice model in the pop-up window, and try it out.
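Behind the scenes this is a speech synthesis request. A direct sketch against the configured model, assuming the OpenAI-style /v1/audio/speech endpoint; the voice value matches the "value" fields in the config above:

```bash
# Direct TTS request for the configured speech model.
# The endpoint shape is assumed from the OpenAI-compatible /v1/audio/speech convention.
curl https://api.siliconflow.cn/v1/audio/speech \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fishaudio/fish-speech-1.5",
    "voice": "fishaudio/fish-speech-1.5:alex",
    "input": "Hello, this is a voice playback test."
  }' \
  --output speech.mp3
```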
Testing Voice Input
Still in the same app, find Voice Input in the configuration on the left, click it, and enable voice input in the pop-up window.
When you turn it on, a microphone icon will be added to the dialog input box, and you can click it for voice input.
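Voice input is transcribed by the whisperModel configured above. A direct transcription sketch, assuming the OpenAI-compatible multipart /v1/audio/transcriptions endpoint (recording.wav is a placeholder file):

```bash
# Direct speech-to-text request for the configured whisperModel.
# Assumes the OpenAI-compatible multipart /v1/audio/transcriptions endpoint.
curl https://api.siliconflow.cn/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-xxxxxxxx" \
  -F model="FunAudioLLM/SenseVoiceSmall" \
  -F file="@recording.wav"
```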
Summary
If you want to quickly try out open source models or get started with FastGPT quickly, and don't want to apply for API keys from a bunch of different providers, SiliconCloud's models are a good choice for a fast start.