AI Personal Learning and Practical Guidance

AI knowledge


How EQ-Bench Assesses Emotional Intelligence and Creativity in Large Language Models

As the capabilities of large language models (LLMs) evolve at a rapid pace, traditional benchmarks such as MMLU are showing their limits in distinguishing top models. Relying on knowledge quizzes or standardized tests alone, it is difficult to comprehensively measure the nuanced capabilities that matter most in real-world interactions, such as emotional intelligence, creative...


Reasoning with Large Language Models: Balancing "Underthinking" and "Overthinking"

Large language models (LLMs) are developing rapidly, and reasoning ability has become a key indicator of their intelligence. In particular, models with long-reasoning capabilities, such as OpenAI's o1, DeepSeek-R1, QwQ-32B, and Kimi K1.5, simulate the human deep-thinking process to solve compound...


Breaking the Tool Calling Bottleneck: The CoTools Framework Enables Large Language Models to Efficiently Utilize a Massive Number of Tools

In recent years, Large Language Models (LLMs) have made impressive progress in artificial intelligence, and their powerful language comprehension and generation capabilities have led to wide application across many domains. However, LLMs still face many challenges when handling complex tasks that require invoking external tools. For example, ...

uv common commands

The Python ecosystem has never been short of package and environment management tools: from the classic pip and virtualenv, to pip-tools and conda, to the modern Poetry and PDM. Each tool has its area of specialization, but together they often leave a developer's toolchain fragmented and complex. Now, from A...


Why Are Multi-Agent Collaborative Systems More Prone to Error?

In recent years, multi-agent systems (MAS) have attracted much attention in artificial intelligence. These systems attempt to solve complex, multi-step tasks through the collaboration of multiple Large Language Model (LLM) agents. However, despite the high expectations for MAS, their performance in real-world applications has not been ...


Making AI Stop and Think: How Anthropic's "Think" Tool Enhances Claude Reasoning

Recently, Anthropic introduced a new tool called "think", designed to enhance the Claude model's ability to solve complex problems. This article discusses the design concept, performance, and best practices of the "think" tool, and analyzes its implications for the future development of AI systems...


DeepRetrieval: Efficient Information Retrieval Query Generation Driven by Reinforcement Learning

Abstract Information retrieval systems are critical for efficient access to large document collections. Recent approaches utilize Large Language Models (LLMs) to improve retrieval performance through query augmentation, but typically rely on expensive supervised learning or distillation techniques that require significant computational resources and manually labeled data. In ...


New from OpenAI: How Large Language Models Monitor Their Own Misbehavior

Large reasoning models exploit vulnerabilities when given the opportunity. Research has shown that these exploits can be detected by using large language models (LLMs) to monitor their chains of thought (CoT). Punishing models for "bad thoughts" does not prevent most misbehavior; instead, it leads them to hide their intentions. ...


Optimal Text Segment Selection and URL Reranking in DeepSearch/DeepResearch

If you have read Jina's previous classic article "Design and Implementation of DeepSearch/DeepResearch", you may want to dig into some details that can significantly improve answer quality. This time, we focus on two of them. Extracting optimal text segments from long web pages: how to utilize late-chun...


Gemma 3 Technical Report (Chinese Version)

Gemma 3 Key Information Summary I. Key Metrics: Model size: four versions, from 1 billion to 27 billion parameters (1B, 4B, 12B, 27B). Architecture: Transformer-based decoder-only architecture inherited from Gemma 2, with several improvements. Multimodal capabilities: support for text and image...


IDProtector: Protecting Portrait Photos from Abuse of AI Generation Technology

1. Background and Issues. With the rapid development of Artificial Intelligence (AI), and in particular the advance of diffusion models, AI can now generate highly realistic portrait images. For example, technologies like InstantID need only one photo to generate multiple new images with the same identity features. Although this kind of technology...


Are Long-Text Embedding Models Blind Beyond 4K Tokens?

NoLiMA, released in February 2025, is a benchmark for assessing long-text comprehension in large language models (LLMs). Unlike traditional Needle-in-a-Haystack (NIAH) tests, which rely on keyword matching, NoLiMA uses carefully designed questions and key information that force...


LangChain vs. LangGraph: Official Guidance on How to Choose

The field of generative AI is evolving rapidly, with new frameworks and technologies constantly emerging, so readers should be aware that the content of this article may become dated. This article takes an in-depth look at two dominant frameworks for building LLM applications, LangChain and LangGraph, and analyzes their strengths and weaknesses,...


Synergies and Differences Between MCP Server, Function Call, and Agent

Understanding the three key concepts of MCP Server, Function Call, and Agent is essential in the burgeoning field of Artificial Intelligence (AI), especially Large Language Models (LLMs). They are cornerstones of an AI system, each playing a unique and interrelated role. A deeper understanding of them...
