Skywork-SWE-32B - Kunlun Wanwei's Open-Source Autonomous Code Agent Base Model

Latest AI Resources | Published 9 months ago | AI Sharing Circle

What is Skywork-SWE-32B?

Skywork-SWE-32B is an open-source, 32B-parameter base model for autonomous software engineering (SWE) code agents, released by Kunlun Wanwei. It targets software engineering tasks, with a focus on repository-level code repair, and is designed for complex scenarios that involve multi-turn interaction and long-context processing. The team built more than 10,000 verifiable task instances from GitHub repositories, creating the largest verifiable repository-level code repair dataset available. The model reaches a pass@1 accuracy of 38.0% on the SWE-bench Verified benchmark, a new best result for models of the same parameter scale. With test-time scaling, accuracy rises to 47.0%, which significantly outperforms existing open-source models of up to 32B parameters and approaches or even surpasses some closed-source models.
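
For readers unfamiliar with the metric, pass@1 is the fraction of benchmark tasks resolved when each task gets a single attempt. The short sketch below turns the reported percentages into task counts. It assumes that SWE-bench Verified contains 500 task instances (a figure taken from the benchmark, not from this article), and `pass_at_1` is a hypothetical helper name used only for illustration.

```python
def pass_at_1(resolved: list[bool]) -> float:
    """Fraction of tasks resolved when each task gets exactly one attempt."""
    if not resolved:
        raise ValueError("need at least one task result")
    return sum(resolved) / len(resolved)


# Assumption: SWE-bench Verified contains 500 task instances.
TOTAL_TASKS = 500

# Convert the reported accuracies into approximate resolved-task counts.
base_resolved = round(0.380 * TOTAL_TASKS)    # 38.0% pass@1
scaled_resolved = round(0.470 * TOTAL_TASKS)  # 47.0% with test-time scaling

# Rebuild a per-task outcome list and recompute the metric as a sanity check.
outcomes = [True] * base_resolved + [False] * (TOTAL_TASKS - base_resolved)
print(base_resolved, scaled_resolved, pass_at_1(outcomes))
```

Under that assumption, the gap between the two reported figures corresponds to a few dozen additional resolved tasks.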

Main Features of Skywork-SWE-32B

Repository-level code repair: Locates code problems (such as bugs) in GitHub repositories, generates a fix, and verifies that the fix works, closing the loop from understanding the problem to resolving it.
Multi-turn interaction: Supports more than 50 rounds of interaction, simulating the repeated debug-and-fix cycles of real development and solving problems step by step.
Long-context processing: Handles inputs of more than 32k tokens, enough for complex code files and dependencies that span multiple files.
Automated verification: Builds a dedicated runtime environment and uses unit tests to confirm that the generated fix actually works when the code runs.
Data-driven performance gains: Trained on a large (more than 10,000 instances), high-quality, verifiable dataset. Model performance keeps improving as the amount of data grows, which confirms that the data scaling law applies to software engineering tasks.
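
The repair, interaction, and verification features above combine into a propose-test-retry loop. The following is a minimal sketch of such a loop, not the model's actual agent framework: `propose_patch` and `run_tests` are hypothetical callables standing in for a model call and a unit-test runner, and the toy stand-ins at the bottom exist only so the example runs without a model or a repository.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

MAX_ROUNDS = 50  # the article cites support for more than 50 interaction rounds


@dataclass
class RepairResult:
    patch: Optional[str]  # the accepted patch, or None if no attempt passed
    rounds_used: int
    feedback_log: list[str] = field(default_factory=list)


def repair_loop(
    issue: str,
    propose_patch: Callable[[str, list[str]], str],  # hypothetical model call
    run_tests: Callable[[str], tuple[bool, str]],    # hypothetical test runner
    max_rounds: int = MAX_ROUNDS,
) -> RepairResult:
    """Propose a patch, run the tests, and feed failures back until they pass."""
    feedback: list[str] = []
    for round_no in range(1, max_rounds + 1):
        patch = propose_patch(issue, feedback)
        passed, output = run_tests(patch)
        if passed:
            return RepairResult(patch, round_no, feedback)
        feedback.append(output)  # the next round sees why this attempt failed
    return RepairResult(None, max_rounds, feedback)


# Toy stand-ins: the "model" corrects itself once it has seen a failure.
def fake_model(issue: str, feedback: list[str]) -> str:
    return "return a + b" if feedback else "return a - b"


def fake_tests(patch: str) -> tuple[bool, str]:
    ok = patch == "return a + b"
    message = "all tests passed" if ok else "test_add failed: expected 3, got -1"
    return ok, message


result = repair_loop("add() returns the wrong sum", fake_model, fake_tests)
print(result.patch, result.rounds_used)
```

The round cap matters in practice: without it, a task the model cannot solve would consume tokens indefinitely, so the loop returns `None` once the budget is spent.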

Project Links for Skywork-SWE-32B

HuggingFace model page: https://huggingface.co/Skywork/Skywork-SWE-32B
Technical report: https://huggingface.co/Skywork/Skywork-SWE-32B/resolve/main/assets/Report.pdf

Technical Advantages of Skywork-SWE-32B

Large-scale, high-quality dataset
- Data size and diversity: Skywork-SWE-32B is trained on more than 10,000 verifiable task instances drawn from 2,531 different GitHub repositories, the largest verifiable SWE dataset available. This scale gives the model a rich set of training samples from which to learn diverse code repair patterns.
- Automated data collection and validation: A three-phase automated pipeline (data collection and pre-screening, execution-based validation, and agent trajectory generation) ensures that the data is high quality and verifiable. Each task instance comes with a dedicated Docker runtime image that supports automated unit-test validation, so a generated fix can be confirmed to work in a real runtime environment.
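
The execution-based validation phase can be pictured as a filter over candidate task instances. The article does not state the exact acceptance rule, so the sketch below assumes a common one for SWE-bench-style data: keep a task only if the reference fix turns at least one failing test green and breaks no test that passed before. All names (`TaskInstance`, `is_verifiable`) are hypothetical, and test outcomes are supplied as plain dictionaries rather than collected from a Docker run.

```python
from dataclasses import dataclass


@dataclass
class TaskInstance:
    repo: str
    instance_id: str
    tests_before: dict[str, bool]  # test name -> passed, before the reference fix
    tests_after: dict[str, bool]   # test name -> passed, after the reference fix


def is_verifiable(task: TaskInstance) -> bool:
    """Assumed rule: the fix repairs at least one failing test and regresses none."""
    fixed = [
        name for name, passed in task.tests_before.items()
        if not passed and task.tests_after.get(name, False)
    ]
    regressed = [
        name for name, passed in task.tests_before.items()
        if passed and not task.tests_after.get(name, False)
    ]
    return bool(fixed) and not regressed


# Illustrative candidates with made-up repository names and test results.
candidates = [
    TaskInstance("org/a", "a-1", {"t1": False, "t2": True}, {"t1": True, "t2": True}),
    TaskInstance("org/b", "b-1", {"t1": False}, {"t1": False}),  # fix changes nothing
    TaskInstance("org/c", "c-1", {"t1": False, "t2": True}, {"t1": True, "t2": False}),  # regression
]
kept = [task.instance_id for task in candidates if is_verifiable(task)]
print(kept)
```

A rule of this kind is what makes an instance "verifiable": the same tests that admit it to the dataset can later judge a model-generated fix.
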
Strong model performance
- High accuracy: On the SWE-bench Verified benchmark, Skywork-SWE-32B achieves a pass@1 accuracy of 38.0%, a new best result for models of the same parameter scale. With test-time scaling (TTS), accuracy rises to 47.0%, which significantly outperforms existing open-source models of up to 32B parameters and approaches or even surpasses some closed-source models.
- Data scaling law: Systematic experiments show that model performance keeps improving as the training set grows, which confirms that the data scaling law applies to software engineering tasks and supports further expansion of the dataset.
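
The article does not describe how test-time scaling is implemented. One common form is best-of-N selection: sample several candidate fixes for the same issue, score each with a verifier, and submit the highest-scoring one. The sketch below assumes that form. `sample_patch` and `score_patch` are hypothetical callables, the default of 8 candidates is arbitrary, and the toy patches and scores are made up for illustration.

```python
from typing import Callable


def best_of_n(
    issue: str,
    sample_patch: Callable[[str, int], str],  # hypothetical model rollout
    score_patch: Callable[[str], float],      # hypothetical verifier score
    n: int = 8,
) -> tuple[str, float]:
    """Sample n candidate patches and return the one the verifier scores highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [sample_patch(issue, attempt) for attempt in range(n)]
    best_patch = max(candidates, key=score_patch)
    return best_patch, score_patch(best_patch)


# Toy stand-ins: three possible patches with made-up verifier scores.
TOY_PATCHES = ["return a - b", "return a * b", "return a + b"]
TOY_SCORES = {"return a - b": 0.2, "return a * b": 0.5, "return a + b": 1.0}


def toy_sampler(issue: str, attempt: int) -> str:
    return TOY_PATCHES[attempt % len(TOY_PATCHES)]


def toy_verifier(patch: str) -> float:
    return TOY_SCORES[patch]


single, single_score = best_of_n("add() is wrong", toy_sampler, toy_verifier, n=1)
best, best_score = best_of_n("add() is wrong", toy_sampler, toy_verifier, n=8)
print(single, single_score)
print(best, best_score)
```

With one sample the toy run is stuck with the first patch; with more samples the verifier can pick a better one. That trade of extra inference compute for accuracy is the general idea behind test-time scaling.
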

Who Should Use Skywork-SWE-32B

Software developers: Developers can use Skywork-SWE-32B to locate and fix problems in their code quickly, cutting the time and effort spent on manual debugging.
Software test engineers: Test engineers can use Skywork-SWE-32B to automate unit-test execution, verify that generated fixes work, and improve testing efficiency.
Project managers: Project managers can reduce technical debt by automating code fixes and optimizations, improving the speed and quality of project delivery.
Academic researchers: Researchers can use Skywork-SWE-32B as an experimental platform to explore how large language models apply to software engineering tasks and to test ideas such as the data scaling law.
Technical managers and architects: Technical managers and architects can draw on Skywork-SWE-32B's performance data and technical strengths to make better-informed technical decisions.