AI Personal Learning
and practical guidance
Total 21 articles

Tags: large model fine-tuning

Second Me: Locally trained AI doppelgangers with personal memories and habits

Second Me is an open source project developed by the Mindverse team that lets you create an AI "digital doppelganger" on your own computer. It learns how you speak and what your habits are from your words and memories, and becomes a smart assistant that understands you. Its best feature is that all the numbers...

MM-EUREKA: A Multimodal Reinforcement Learning Tool for Exploring Visual Reasoning

General introduction: MM-EUREKA is an open source project developed by Shanghai Artificial Intelligence Laboratory, Shanghai Jiao Tong University, and other parties. It uses rule-based reinforcement learning to extend text reasoning capabilities to multimodal scenarios, helping models process both image and text information. The core goal of this tool is to improve...

OpenManus-RL: Fine-tuning Large Models to Enhance Agent Reasoning and Decision Making

General introduction: OpenManus-RL is an open source project jointly developed by UIUC-Ulab and the OpenManus team of the MetaGPT community, hosted on GitHub. The project uses reinforcement learning (RL) techniques to enhance the reasoning and decision-making capabilities of large language model (LLM) agents, based on DeepSeek-R1, QwQ-32B ...

TPO-LLM-WebUI: An AI framework that optimizes a model's output in real time as you enter questions

General introduction: TPO-LLM-WebUI is a project open-sourced by Airmomo on GitHub that enables real-time optimization of large language models (LLMs) through an intuitive web interface. It uses the TPO (Test-Time Preference Optimization) framework, doing away with the tedious process of traditional fine-tuning ...

Open-Reasoner-Zero: Open Source Large-Scale Reasoning Reinforcement Learning Training Platform

General introduction: Open-Reasoner-Zero is an open source project focused on reinforcement learning (RL) research, developed by the Open-Reasoner-Zero team and hosted on GitHub. It aims to accelerate artificial intelligence research by providing an efficient, scalable, and easy-to-use training framework, especially to the pass...

Chinese dataset distilled from the full-size DeepSeek-R1, supporting Chinese R1-distillation SFT

General introduction: The Chinese DeepSeek-R1 distillation dataset is an open source Chinese dataset of 110K samples, designed to support machine learning and natural language processing research. It is released by Cong Liu's NLP team. The dataset contains not only mathematical data but also a large amount of general-purpose data, such as logical reasoning...

ColossalAI: Providing Efficient Large-Scale AI Model Training Solutions

General introduction: ColossalAI is an open source platform developed by HPC-AI Technologies that provides an efficient, cost-effective solution for large-scale AI model training and inference. By supporting multiple parallelization strategies, heterogeneous memory management, and mixed-precision training, ColossalAI is able to significantly reduce model training and inference...

Unsloth: an open source tool for efficiently fine-tuning and training large language models

General introduction: Unsloth is an open source project that provides efficient tools for fine-tuning and training large language models (LLMs). It supports a wide range of well-known models, including Llama, Mistral, Phi, and Gemma. Unsloth's main features are the ability to significantly reduce memory usage and speed up training...

NVIDIA Garak: Open-source tool to detect LLM vulnerabilities and secure generative AI

General introduction: NVIDIA Garak is an open source tool for detecting vulnerabilities in large language models (LLMs). It checks a model for weaknesses such as hallucination, data leakage, prompt injection, misinformation generation, and harmful content generation through static, dynamic, and adaptive probing. Garak resembles ...

LLaMA Factory: Efficient fine-tuning of more than a hundred open-source large models, easy model customization

General introduction: LLaMA-Factory is a unified and efficient fine-tuning framework that supports flexible customization and efficient training of more than 100 large language models (LLMs). Through the built-in LLaMA Board web interface, users can fine-tune their models without writing code. The framework integrates a variety of advanced training...
