Big news: the world's first decentrally trained 10B model has finished training and will be open-sourced within a week!

AI News, 11 months ago, published by AI Sharing Circle


The world's first decentrally trained 10B-parameter model is here!
The Prime Intellect team announced that it has completed a landmark effort: a decentralized training network spanning the US, Europe, and Asia has successfully trained a large model with 10B parameters. This marks a significant step forward for AI training.
As the training dashboard shows, the project, called INTELLECT-1, has completed training on 1 trillion (1T) tokens.
Both the loss and perplexity curves trend steadily downward, and throughput (tokens processed per second) remains stable, which suggests the training run was healthy from start to finish.
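The loss and perplexity curves on such a dashboard are closely related: perplexity is conventionally the exponential of the mean per-token cross-entropy loss, so the two fall together. A minimal sketch of that relationship follows; the loss values are hypothetical placeholders, not readings from the INTELLECT-1 dashboard.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Perplexity is exp of the mean per-token cross-entropy (measured in nats)."""
    return math.exp(mean_cross_entropy_nats)


# Hypothetical loss values for illustration only (not INTELLECT-1 data).
example_losses = [4.0, 3.0, 2.5]
example_perplexities = [perplexity(loss) for loss in example_losses]
```

Because the exponential is monotonic, a falling loss curve always implies a falling perplexity curve, which is why the two plots have the same shape.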
The project would not have succeeded without the support of its many partners.
A number of organizations, including Hugging Face, SemiAnalysis, Arcee.ai, Hyperbolic Labs, Olas, Akash, Schelling AI, and others, contributed valuable compute resources to this training run. This unprecedented form of cooperation demonstrates a new kind of collaboration in AI.
As the project's leaderboard shows, contributors from around the globe provided a staggering amount of compute time. The top contributor logged 8,230 hours, with participants spread across San Mateo, Dallas, Helsinki, and Stockholm. This global model of compute collaboration means AI training no longer has to be confined to the data centers of a handful of tech giants.

On a technical level, the project's innovations are equally impressive.
The team adopted the DiLoCo distributed training technique to handle cross-region training, where links between sites are slow and workers can only synchronize occasionally. To cope with the other challenges of a distributed environment, the research team also implemented a fault-tolerant training mechanism and asynchronous distributed checkpointing.
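To give a feel for the idea behind DiLoCo, here is a toy sketch under stated assumptions, not Prime Intellect's implementation. Each worker takes many local steps without communicating, then the workers average their "pseudo-gradients" (global parameters minus local parameters) and apply one outer step with Nesterov momentum. The quadratic losses, the plain SGD inner optimizer (the DiLoCo paper uses AdamW for the inner steps), and all names and hyperparameters below are illustrative assumptions.

```python
from typing import Callable, List, Tuple

Vector = List[float]
GradFn = Callable[[Vector], Vector]


def run_inner_steps(start: Vector, grad_fn: GradFn, steps: int, lr: float) -> Vector:
    """Local training on one worker: plain SGD with no communication."""
    params = list(start)
    for _ in range(steps):
        grads = grad_fn(params)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


def diloco_round(
    global_params: Vector,
    momentum: Vector,
    worker_grad_fns: List[GradFn],
    inner_steps: int,
    inner_lr: float,
    outer_lr: float,
    beta: float,
) -> Tuple[Vector, Vector]:
    """One communication round: local steps, then a single outer update."""
    # Every worker starts from the same global parameters.
    local_params = [
        run_inner_steps(global_params, fn, inner_steps, inner_lr)
        for fn in worker_grad_fns
    ]
    num_workers = len(local_params)
    # Pseudo-gradient: average of (global - local). This is the only
    # quantity that has to cross the slow inter-region links.
    pseudo_grad = [
        sum(global_params[i] - local[i] for local in local_params) / num_workers
        for i in range(len(global_params))
    ]
    # Outer step: SGD with Nesterov momentum on the pseudo-gradient.
    new_momentum = [beta * m + g for m, g in zip(momentum, pseudo_grad)]
    new_global = [
        p - outer_lr * (g + beta * m)
        for p, g, m in zip(global_params, pseudo_grad, new_momentum)
    ]
    return new_global, new_momentum


def make_quadratic_grad(target: Vector) -> GradFn:
    """Gradient of 0.5 * ||params - target||^2, standing in for one worker's data."""
    return lambda params: [p - t for p, t in zip(params, target)]


def run_toy_diloco(rounds: int = 50) -> Vector:
    # Two hypothetical workers whose data pull toward different targets.
    workers = [make_quadratic_grad([1.0, 2.0]), make_quadratic_grad([3.0, 4.0])]
    params, momentum = [0.0, 0.0], [0.0, 0.0]
    for _ in range(rounds):
        params, momentum = diloco_round(
            params, momentum, workers,
            inner_steps=10, inner_lr=0.1, outer_lr=0.7, beta=0.9,
        )
    return params


final_params = run_toy_diloco()
```

In this toy setup the global parameters settle near the average of the two workers' targets, even though the workers only exchange information once every 10 local steps. That low communication frequency is what makes the approach attractive across continents.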
For memory optimization, the team upgraded to FSDP2, the newer version of PyTorch's Fully Sharded Data Parallel, which resolved the memory allocation issues present in FSDP1.
Meanwhile, applying tensor parallelism significantly improved training efficiency.
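Tensor parallelism splits a single layer's weight matrix across devices so that each device computes only part of the output. The article does not say which layers INTELLECT-1 sharded or how, so the following is only a generic sketch of the column-parallel case, using plain Python lists in place of real device tensors; all function names are illustrative.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Dense matrix product, used as the single-device reference."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def split_columns(weight: Matrix, num_shards: int) -> List[Matrix]:
    """Split a weight matrix into equal column blocks, one per device."""
    num_cols = len(weight[0])
    assert num_cols % num_shards == 0, "columns must divide evenly across shards"
    width = num_cols // num_shards
    return [
        [row[s * width:(s + 1) * width] for row in weight]
        for s in range(num_shards)
    ]


def column_parallel_matmul(x: Matrix, weight: Matrix, num_shards: int) -> Matrix:
    """Each shard multiplies the full input by its own column block."""
    shards = split_columns(weight, num_shards)
    # In a real system each product would run on a separate device.
    partial_outputs = [matmul(x, shard) for shard in shards]
    # Concatenating the column blocks plays the role of an all-gather.
    return [
        [value for part in partial_outputs for value in part[i]]
        for i in range(len(x))
    ]


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]
sharded_result = column_parallel_matmul(x, w, num_shards=2)
reference_result = matmul(x, w)
```

The sharded result matches the single-device product, while each simulated device only ever holds half of the weight matrix.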
Behind these technical innovations is a strong research team working quietly in the background. The project leaders especially thanked Tristan Rice and Junjie Wang for their contributions to fault-tolerant training, and Chien-Chin Huang and Iris Zhang for their work on asynchronous distributed checkpointing. Yifu Wang is also credited for his advice on tensor parallelism.
Even more exciting, the team has announced that it will publish a full open-source release within a week, including the base model, checkpoints, the post-trained model, and the training dataset. Researchers and developers around the world will soon be able to build on this model.
Some developers could not wait to start experimenting. One demonstrated an attempt at running model inference across two 4090 graphics cards, one on the West Coast of the United States and one in Europe. Although the network connection between the two locations was not ideal, the experiment showed how flexible and adaptable the model is.
The success of this project is not just a technical breakthrough but also an important milestone in the democratization of AI.
It shows that global collaboration can break through the limits of traditional AI training and bring more organizations and individuals into the wave of AI development.


Article copyright belongs to AI Sharing Circle. Please do not reproduce without permission.
