Replicating DeepSeek-R1: 8K Math Examples Help Small Models Achieve Reasoning Breakthroughs through Reinforcement Learning
GitHub: https://github.com/hkust-nlp/simpleRL-reason

This blog post presents a replication of DeepSeek-R1-Zero and DeepSeek-R1 training...