
Top 5 AI Inference Platforms That Use Full-Blooded DeepSeek-R1 for Free

Due to heavy traffic and a cyber attack, the DeepSeek website and app have been intermittently down for several days, and the API has been unreliable.


We have previously shared a method for deploying DeepSeek-R1 locally (see DeepSeek-R1 Local Deployment), but the average user's hardware struggles to run even the 70B distilled model, let alone the full 671B model.

Fortunately, all the major platforms now host DeepSeek-R1, so you can use them as a drop-in substitute.

 

I. NVIDIA NIM Microservices

NVIDIA Build: many AI models integrated, free to try
Website: https://build.nvidia.com/deepseek-ai/deepseek-r1

NVIDIA has deployed the full 671B-parameter DeepSeek-R1 model. The web version is straightforward: open the page and the chat window appears:

The corresponding code is also shown in a panel on the right:

A quick test:

Below the chat box you can also adjust a few parameters (the defaults are fine in most cases):

The approximate meaning and function of these options are listed below:

Temperature:
Higher values make sampling more random, which can produce more creative responses

Top P (nucleus sampling):
Higher values keep a larger slice of the probability mass, yielding more diverse output

Frequency Penalty:
Higher values penalize frequently repeated tokens more strongly, reducing verbosity and repetition

Presence Penalty:
Higher values push the model to introduce tokens it has not used yet

Max Tokens (maximum generation length):
Higher values allow longer responses

Stop:
Generation halts when any of the given characters or sequences appear, preventing overly long or rambling output
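These options map one-to-one onto fields of the OpenAI-style chat-completions request body that NIM (and most of the platforms below) accept. A minimal sketch; the field names follow the OpenAI convention, and the model id is the one NIM shows for DeepSeek-R1:

```python
# How the chat-box options map onto an OpenAI-style request body.
payload = {
    "model": "deepseek-ai/deepseek-r1",   # model id as listed on NIM
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "temperature": 0.6,         # higher -> more random, more "creative"
    "top_p": 0.7,               # nucleus sampling: keep top 70% of mass
    "frequency_penalty": 0.0,   # >0 discourages frequently repeated tokens
    "presence_penalty": 0.0,    # >0 pushes the model toward new tokens
    "max_tokens": 4096,         # hard cap on response length
    "stop": None,               # optional stop sequences, e.g. ["\n\n"]
}
print(payload["model"])
```

Leaving the penalties at 0 and only lowering temperature is usually enough for factual Q&A; raise temperature and top_p for brainstorming.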

Currently, with a growing crowd of freeloaders (the queue counter shows how many people are waiting), NIM lags from time to time.

Is NVIDIA short of GPUs too?

NIM microservices also support API calls to DeepSeek-R1, but you need to sign up for an account with an email address:


The registration process is relatively simple, using only email verification:


After registering, click "Build with this NIM" at the top right of the chat interface to generate an API key. Registration currently grants 1,000 credits (1,000 interactions); once they are used up, you can register again with a new email address.
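With a key in hand, DeepSeek-R1 can be called over NIM's OpenAI-compatible endpoint. A standard-library sketch; the endpoint URL and model id follow NVIDIA's published sample code for this model, so adjust them if NVIDIA changes its docs:

```python
import json
import os
import urllib.request

# OpenAI-compatible endpoint that NVIDIA documents for hosted NIM models.
API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build a POST request for a single-turn chat completion."""
    body = json.dumps({
        "model": "deepseek-ai/deepseek-r1",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
        "max_tokens": 1024,
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    key = os.environ.get("NVIDIA_API_KEY")  # the key from "Build with this NIM"
    if key:  # only send when a key is configured
        with urllib.request.urlopen(build_request("Hello", key)) as resp:
            print(json.load(resp)["choices"][0]["message"]["content"])
```

The official `openai` Python client also works here by pointing `base_url` at `https://integrate.api.nvidia.com/v1`.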

The NIM microservices platform also provides access to many other models:


 

II. Microsoft Azure

Web site:
https://ai.azure.com

Microsoft Azure allows you to create a chatbot and interact with the model through a chat playground.


Signing up for Azure is a fair amount of trouble. First, create a Microsoft account (just log in if you already have one):

Creating an account also requires email verification:


Then prove you're human by answering ten bizarre puzzle questions in a row:

Even then you're not done; you still need to create a subscription:

Verify your details, including phone number and bank card information:

Next, select "No technical support":


Now you can start the cloud deployment. DeepSeek-R1 features prominently in the "Model Catalog":

After clicking on it, click on "Deploy" on the next page:


Next, you need to select "Create New Project":


Then leave everything at the defaults and click "Next":

Next, click "Create":


Creation then starts on this page and takes a while to complete:

When you're done, you'll come to this page where you can click "Deploy" to go to the next step:


You can also check the "Pricing and Terms" above to see that it is free to use:


Click "Deployment" to continue to this page, where you can click "Open in Playground":

Then the conversation can begin:


Azure also has NIM-like parameter tuning available:


As a platform, Azure offers many other models to deploy:

Models you have already deployed can be reached quickly later via "Playground" or "Models + Endpoints" in the left menu:
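The deployed endpoint can also be called outside the Playground. A hedged sketch, assuming the serverless deployment exposes an OpenAI-style /chat/completions route with an api-key header; ENDPOINT and API_KEY are placeholders you would copy from your deployment's "Models + Endpoints" page:

```python
import json
import urllib.request

# Placeholders: copy the real values from your deployment's
# "Models + Endpoints" page; the URL shape below is an assumption.
ENDPOINT = "https://<your-deployment>.<region>.models.ai.azure.com"
API_KEY = "<your-api-key>"

def build_request(prompt: str) -> urllib.request.Request:
    """Build a POST request against the deployed chat endpoint."""
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{ENDPOINT}/chat/completions",
        data=body,
        headers={"api-key": API_KEY, "Content-Type": "application/json"},
    )
```

Microsoft's `azure-ai-inference` Python package wraps the same endpoint if you prefer an SDK over raw HTTP.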

III. Amazon AWS

Web site:
https://aws.amazon.com/cn/blogs/aws/deepseek-r1-models-now-available-on-aws

On AWS, DeepSeek-R1 is likewise prominently featured.

AWS's registration process is almost as troublesome as Azure's: both require a payment method plus phone and voice verification, so we won't repeat the details here.

The exact deployment process is much the same as Microsoft Azure:


IV. Cerebras

Cerebras: the world's fastest AI inference, high-performance computing platform available today
Website: https://cerebras.ai

Unlike the larger platforms above, Cerebras serves the 70B distilled model, which it claims is "57 times faster than GPU solutions."

After registering with an email address and logging in, the drop-down menu at the top lets you select DeepSeek-R1:

Real-world speed is indeed high, though not as dramatic as claimed:

V. Groq

Groq: an AI inference acceleration provider with a fast, free large-model API
Website: https://groq.com/groqcloud-makes-deepseek-r1-distill-llama-70b-available


After registering with an email address and logging in, the model is likewise selectable:

It's also fast, though the 70B distilled model feels a bit less capable here than on Cerebras:

Note that the chat interface can be accessed directly while logged in:
https://console.groq.com/playground?model=deepseek-r1-distill-llama-70b
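Beyond the playground, Groq exposes an OpenAI-compatible API for the same model. A standard-library sketch; the base URL and model id match Groq's public documentation, and GROQ_API_KEY is assumed to hold a key generated in the console:

```python
import json
import os
import urllib.request

# Groq's OpenAI-compatible chat-completions endpoint and the
# DeepSeek distill model id from its console.
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"
MODEL = "deepseek-r1-distill-llama-70b"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build a POST request for a single-turn chat completion on Groq."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        GROQ_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    key = os.environ.get("GROQ_API_KEY")  # key created in the Groq console
    if key:  # only send when a key is configured
        with urllib.request.urlopen(build_request("Hello", key)) as resp:
            print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the endpoint is OpenAI-compatible, swapping between NIM and Groq mostly means changing the base URL, key, and model id.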

 

A fuller list of platforms serving DeepSeek V3 and R1:

AMD

AMD Instinct™ GPUs Power DeepSeek-V3: Revolutionizing AI Development with SGLang

NVIDIA

DeepSeek-R1 NVIDIA model card

Microsoft Azure

Running DeepSeek-R1 on a single NDv5 MI300X VM

Baseten

https://www.baseten.co/library/deepseek-v3/

Novita AI

Novita AI uses SGLang to run DeepSeek-V3 for OpenRouter

ByteDance Volcengine

The full-size DeepSeek model lands on Volcengine!

DataCrunch

Deploy DeepSeek-R1 671B on 8x NVIDIA H200 with SGLang

Hyperbolic

https://x.com/zjasper666/status/1872657228676895185

Vultr

How to Deploy the DeepSeek V3 Large Language Model (LLM) Using SGLang

RunPod

What's New for Serverless LLM Usage in RunPod in 2025?

May not be reproduced without permission: Chief AI Sharing Circle, "Top 5 AI Inference Platforms That Use Full-Blooded DeepSeek-R1 for Free"
