Using AutoGLM as a Starting Point: Analyzing the Product Forms of AI That Performs Desktop Operation Tasks

AI News · 4 months ago · Published by AI Sharing Circle


Today Zhipu released "AutoGLM Contemplation", which many people say is another Manus. I think it can indeed be categorized as a Manus-like product, or as a Deep Research product. But such a simple categorization creates many cognitive errors for developers and users alike. I think many people have this problem; at least I do.

Considering the desktop automation applications Zhipu has released so far (such as the AutoGLM-Web plugin), up through "AutoGLM Contemplation", you can now see a nearly complete picture of Zhipu's product line.

So today's discussion centers on "AutoGLM Contemplation" and uses it to break down the capability branches of AI products that perform desktop operation tasks.

Zhipu's official introduction is pragmatic:
AutoGLM Contemplation is an autonomous agent (AI Agent) that can explore open-ended questions and act on the results. It can simulate the human thinking process, from data retrieval and analysis through to report generation.

For users, what "AutoGLM Contemplation" really is comes down to what the developer says it is. The developer can focus users on one feature and guide them through it, but ultimately cannot define the product on the user's behalf.

For developers, describing "AutoGLM Contemplation" as Manus, Deep Research, Zhipu Bull, AI search, or Browser-Use is inaccurate in every case. You have to break down its functions and discuss the boundaries of its capabilities before the discussion is meaningful. Simply equating AutoGLM Contemplation with Manus has obvious flaws: for example, Manus can do computational tasks and "AutoGLM Contemplation" cannot.

Start by understanding the basic features of AutoGLM Contemplation.

Those who have used the Qingyan browser plugin will find the two very similar; they are now unified under the "AutoGLM" product line. It is recommended that you start with the plugin before using the "AutoGLM Contemplation" client. The two do not have feature parity, and the plugin is (currently) more powerful than the client.

However, the client can currently access sites outside the whitelist, whereas the plugin currently limits the range of information sources.

Therefore, using the client is the better way to understand the potential of AutoGLM Contemplation.

1. Download the client (the plugin must also be installed)

Download: https://autoglm-research.zhipuai.cn/#get_started

2. Initiate the first task (follow along and observe the process)

Find all free "AI Translator" tools from https://aisharenet.com/, and only collect AI Translator tools with clients.

Tip: This is not a good task description, because the website provides neither an in-site search function nor a clear entry point to AI translation tools. A better task description would be: start paging through https://aisharenet.com/tag/aifanyi/ and, from the list information, find all free AI translation tools that have a client.
3. Observe the process of task execution (the steps below were captured from pages the tool visited automatically during execution)

First, find the search box, type in "AI Translation" and execute the search.

Go to the Bing search interface (the site's search box redirects to Bing search) and start visiting the links...

When visiting the second link, a categorized directory of AI translation tools is found.

Browse the categorized list of AI translation tools link by link, with automatic page turning.

Visit the second page and start the summarization task.

Output the full research report.

4. The walkthrough above does not cover one important step: login. If you are interested, launch a task of your own that triggers the login interaction and observe the process. (Log out of Xiaohongshu first.)

Collect knowledge from Xiaohongshu about generating videos with DeepSeek

Positioning

A deep research tool for knowledge. Working backward from the results, the tool's prompts appear to be designed around writing a research paper and are not suited to other types of tasks.

Core capabilities

Generate a plan of tasks to be performed
Launch the browser
In-browser viewing (text only), clicking, typing
Task judgment nodes (partial): web browsing completed, observe the page and determine the next task, determine whether login is required, end of information acquisition
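
The capabilities listed above can be pictured as a small loop: follow a plan, view a page, pass through a judgment node, and either continue, ask for login, or stop. The sketch below is only an illustration under stated assumptions. `FakeBrowser`, `judge`, and `run_task` are hypothetical names, the "browser" is a dictionary of plain-text pages, and the judge is a toy rule rather than a model call. It is not how AutoGLM is actually implemented.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Judgment(Enum):
    PAGE_DONE = auto()       # browsing of this page is complete, move on
    LOGIN_REQUIRED = auto()  # hand control to the user to log in
    ENOUGH_INFO = auto()     # information acquisition can end


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser: each page is plain text."""
    pages: dict
    login_walled: set = field(default_factory=set)
    logged_in: bool = False

    def view(self, url: str) -> str:
        if url in self.login_walled and not self.logged_in:
            return "LOGIN"
        return self.pages[url]


def judge(text: str, collected: int, wanted: int) -> Judgment:
    """Toy rule-based judge; a real agent would ask a model here."""
    if text == "LOGIN":
        return Judgment.LOGIN_REQUIRED
    if collected + 1 >= wanted:
        return Judgment.ENOUGH_INFO
    return Judgment.PAGE_DONE


def run_task(plan: list, browser: FakeBrowser, wanted: int) -> list:
    """Visit the planned URLs in order until enough pages are collected."""
    notes = []
    for url in plan:
        text = browser.view(url)
        verdict = judge(text, len(notes), wanted)
        if verdict is Judgment.LOGIN_REQUIRED:
            browser.logged_in = True  # simulates the user logging in
            text = browser.view(url)
            verdict = judge(text, len(notes), wanted)
        notes.append(text)
        if verdict is Judgment.ENOUGH_INFO:
            break
    return notes


browser = FakeBrowser(
    pages={"a": "page A", "b": "page B", "c": "page C"},
    login_walled={"b"},
)
report = run_task(["a", "b", "c"], browser, wanted=2)
print(report)
```

In this toy run the second page sits behind a login wall, so the loop pauses for the simulated login, reads the page, and then stops because two pages were requested.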

The automation is built around visual interaction with the browser, but only for collecting information and writing research reports. It does not appear to be releasing all of its capabilities yet; given the addition of a client in particular, it should be able to integrate more capabilities in the future.

One-sentence summary: AutoGLM Contemplation vs. Zhipu Bull

The former operates the browser visually to automate information gathering, and generates "input" only for searching and visiting pages.

The latter operates the desktop visually and is not limited to automating information gathering; it is free to manipulate the desktop to accomplish tasks.

One-sentence summary: AutoGLM Contemplation vs. the Qingyan browser plugin

The former operates the browser visually and, as a PC client, can later interact with more interfaces.

The latter has the same ability to operate the browser visually and, as a browser plugin, can natively interact with the information on the visited page.

Back to AI performing desktop manipulation tasks

Let's start with a question:

Browser-Use already has the core capabilities of AutoGLM Contemplation, and STORM is more powerful at writing in-depth research reports, so why use AutoGLM Contemplation?

The answer is summarized below:

AutoGLM Contemplation is a consumer-facing, productized tool designed around a complete process of gathering information and writing research reports.

There is no need to configure a complex local environment; it uses cloud computing power in coordination with local interactions.

STORM collects from fixed information sources and cannot access non-open information, whereas AutoGLM Contemplation uses browser automation to collect non-open information.

By now you may vaguely sense some differences between these tools. The question is actually simple; let's sort it out, starting with a summary of desktop task automation tools.

Two types of solutions for desktop task automation, plus a hybrid

1. Traditional: set fixed anchor points and execute according to a process. Examples: Microsoft Power Automate (PA), Yingdao RPA.

2. Purely visual interaction: use Browser-Use to help a large model judge the page and generate interactions. Example: AutoGLM Contemplation.

3. Hybrid: Yingdao RPA can also build on a fixed workflow, with some nodes (especially content extraction steps) using purely visual interaction. A more typical case is Microsoft's automated customer service orchestration tool: after AI was introduced, customer service still works within a fixed SOP, but in a more human way.
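
To make the difference between the three approaches concrete, here is a minimal sketch under stated assumptions: a page is modeled as a dictionary from anchor name to visible text, `fixed_step` reads a pre-recorded anchor, and `visual_step` delegates to `toy_model`, a hypothetical keyword matcher standing in for a vision model. None of these names come from any real RPA product.

```python
from typing import Callable, Dict, List

Page = Dict[str, str]  # hypothetical page model: anchor name -> visible text


def fixed_step(anchor: str) -> Callable[[Page], str]:
    """Type 1: read the value at a fixed, pre-recorded anchor."""
    def run(page: Page) -> str:
        return page[anchor]  # raises KeyError if the layout has changed
    return run


def toy_model(description: str, page: Page) -> str:
    """Stand-in for a vision model: keyword match over the visible text."""
    for text in page.values():
        if description.lower() in text.lower():
            return text
    return ""


def visual_step(description: str) -> Callable[[Page], str]:
    """Type 2: let a model locate the value from a description."""
    def run(page: Page) -> str:
        return toy_model(description, page)
    return run


def run_workflow(steps: List[Callable[[Page], str]], page: Page) -> List[str]:
    """Type 3 (hybrid): a fixed workflow whose nodes may be fixed or visual."""
    return [step(page) for step in steps]


workflow = [fixed_step("#title"), visual_step("price")]

page_v1 = {"#title": "Order 1001", "div.x9f": "Price: 25 USD"}
page_v2 = {"#title": "Order 1002", "div.q7": "Price: 30 USD"}  # anchor renamed

print(run_workflow(workflow, page_v1))
print(run_workflow(workflow, page_v2))
```

The content extraction node keeps working when the anchor name changes between `page_v1` and `page_v2`, while a fixed step bound to `div.x9f` would fail on the second page. That trade-off is the reason a hybrid keeps stable steps fixed and hands only the volatile ones to the model.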

Moving on to focus on purely visual interaction solutions, let's give them a name: desktop task automation agents.

What can desktop task automation agents do?

General capabilities:

Desktop visual recognition, desktop functional operation

Extensibility:

Single-agent or multi-agent task execution. Multiple agents are generally used for task planning, branch tasks, task coordination, and information summarization, respectively.

Execute desktop operations by referring to a fixed "tool" or fixed "workflow" for a specific task, for example calculation, programming, or searching for quality information sources. Manus is so powerful because it integrates programming tools to accomplish some of its branch tasks.

Extend (access) local and remote data sources.
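
The division of labor described above (planning, branch tasks, coordination, summarization, plus a fixed tool for calculation) can be sketched as four small functions. Everything here is a hypothetical illustration: the `calc:` prefix, the function names, and the string-based "workers" are assumptions made for the example, not a description of Manus or AutoGLM internals.

```python
def planner(goal: str) -> list:
    """Planning role: split a goal into branch tasks (here, on ';')."""
    return [part.strip() for part in goal.split(";") if part.strip()]


def calc_tool(expression: str) -> str:
    """Fixed 'tool' for calculation tasks: sums integers such as '2+3+4'."""
    return str(sum(int(number) for number in expression.split("+")))


def search_worker(task: str) -> str:
    """Branch-task role: a placeholder for browsing and note taking."""
    return f"notes on {task}"


def coordinator(task: str) -> str:
    """Coordination role: route each branch task to a tool or a worker."""
    prefix = "calc:"
    if task.startswith(prefix):
        return calc_tool(task[len(prefix):])
    return search_worker(task)


def summarizer(results: list) -> str:
    """Summarization role: merge the branch results into one report line."""
    return " | ".join(results)


def run(goal: str) -> str:
    return summarizer([coordinator(task) for task in planner(goal)])


print(run("AI translators; calc: 2+3+4"))
```

Routing the arithmetic to `calc_tool` instead of asking the browsing worker to work it out mirrors the point about integrated tools: a fixed tool handles the branch task it is suited for, and the agent handles the rest.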

Limitations:

Desktop task automation agents do not necessarily need to operate the desktop purely visually. If a branch task involves searching a site such as "Knowledge", connecting directly to that site's search results may work better, while operating the desktop would be less efficient. Reasonable extension capabilities therefore help desktop agents realize their value.
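
One way to picture this limitation is a router that prefers a direct connector when one exists and falls back to visual operation otherwise. The sketch below assumes a hypothetical connector registry keyed by hostname; `example.com`, `example.org`, and both handler functions are placeholders, not real integrations.

```python
from urllib.parse import urlparse


def direct_search(query: str) -> str:
    """Hypothetical direct connector, e.g. a site's own search interface."""
    return f"direct results for {query!r}"


def visual_search(query: str) -> str:
    """Hypothetical fallback: drive the page visually (slower, more general)."""
    return f"visually collected results for {query!r}"


# Assumed registry: hostnames for which a direct connector is available.
CONNECTORS = {"example.com": direct_search}


def pick_route(url: str) -> str:
    """Return 'direct' when a connector exists for the host, else 'visual'."""
    host = urlparse(url).hostname or ""
    return "direct" if host in CONNECTORS else "visual"


def search(url: str, query: str) -> str:
    host = urlparse(url).hostname or ""
    handler = CONNECTORS.get(host, visual_search)
    return handler(query)


print(pick_route("https://example.com/search"), search("https://example.com/search", "AI"))
print(pick_route("https://example.org/list"), search("https://example.org/list", "AI"))
```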

What desktop task automation agents are good for

AutoGLM Contemplation is limited to searching for non-open knowledge, which suits knowledge search scenarios well. But the greater value lies in automating operations that are repetitive and whose interface contains dynamic information. The key is convergence: have the AI carry out the task well once, then save the execution process so that it can be run in a loop afterward.
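
The idea of saving an execution process and looping it later can be sketched as record and replay. This is a minimal illustration under stated assumptions: the `Action` fields, the JSON trace format, and the `{placeholder}` convention for dynamic values are all hypothetical, and the replay only builds a log of what would be done rather than driving a real desktop.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Action:
    kind: str       # "open", "click" or "type"
    target: str     # URL or a description of the element
    text: str = ""  # text to type, if any


def save_trace(actions: list) -> str:
    """Serialize a successful run so that it can be replayed later."""
    return json.dumps([asdict(action) for action in actions])


def replay(trace: str, inputs: dict) -> list:
    """Replay a saved trace, filling {placeholders} with this run's inputs."""
    log = []
    for raw in json.loads(trace):
        action = Action(**raw)
        log.append((
            action.kind,
            action.target.format(**inputs),
            action.text.format(**inputs),
        ))
    return log


recorded = [
    Action("open", "https://example.com/orders/{day}"),
    Action("type", "search box", "{keyword}"),
    Action("click", "search button"),
]
trace = save_trace(recorded)

for day in ["2024-01-01", "2024-01-02"]:
    print(replay(trace, {"day": day, "keyword": "refund"}))
```

Keeping the dynamic parts as placeholders is what lets one saved run serve many repetitions, which matches the repetitive, dynamic-information scenario described above.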

In summary: look up information, perform repetitive work.

Capability portfolio of desktop operation task products

The teardown above provides enough information to summarize the current forms of similar products.

In the end, each product is nothing more than a combination of the following capabilities: local or cloud execution, a designed scope of workflow-based and non-workflow-based task execution, and finally the types of executable task presented to the user.

All similar tools I can think of can be summarized in the following chart.


Copyright of this article belongs to AI Sharing Circle. Please do not reproduce without permission.
