TinyZero: A Low-Cost Replication of DeepSeek-R1-Zero's Epiphany Effect
General Introduction
TinyZero is a reinforcement learning project built on veRL that aims to replicate the behavior of DeepSeek-R1-Zero on countdown and multiplication tasks. Surprisingly, the project costs only $30 to run (using 2xH2...
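To make the countdown task concrete, here is a minimal sketch of a rule-based reward check of the kind such a reinforcement learning setup could use. It assumes the usual formulation of the task: combine a given list of integers with +, -, * and /, using each number exactly once, to reach a target value. The function name `countdown_reward`, the helper `_evaluate`, and the binary 1.0/0.0 scoring are illustrative assumptions, not TinyZero's actual reward code.

```python
import ast
import operator
from collections import Counter

# Binary operators permitted in a countdown expression (assumed task rules).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Evaluate an arithmetic AST node; return (value, integer literals used)."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value, [node.value]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left, left_nums = _evaluate(node.left)
        right, right_nums = _evaluate(node.right)
        return _OPS[type(node.op)](left, right), left_nums + right_nums
    # Anything else (names, calls, unary operators, ...) is rejected.
    raise ValueError("unsupported expression")


def countdown_reward(expression, numbers, target):
    """Hypothetical reward: 1.0 if the expression uses each given number
    exactly once and evaluates to the target, otherwise 0.0."""
    try:
        tree = ast.parse(expression, mode="eval")
        value, used = _evaluate(tree.body)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0
    if Counter(used) != Counter(numbers):
        return 0.0
    return 1.0 if abs(value - target) < 1e-6 else 0.0
```

For example, `countdown_reward("(25 - 5) * 4", [4, 5, 25], 80)` scores the answer as correct, while an expression that skips one of the numbers or calls a function scores 0.0. Parsing with `ast` instead of calling `eval` keeps arbitrary model output from being executed.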