
Zhu Xiaohu: Big-Model Entrepreneurship Is "Pseudo-Demand", Commercialization Is the True Faith

Early last year, Zhu Xiaohu was interviewed by Zhang Xiaojun.

At the time, Dark Side of the Moon had just closed a $1 billion round of funding and its user growth numbers were off the charts. OpenAI had just released Sora, and with the technical report and demo videos out, the entire AI community was once again in a frenzy.


The most popular topics at the time were how many months China lagged behind the US, whether open source could catch up with closed source, and which big-model companies Ali had invested in...

Zhu Xiaohu, meanwhile, had already begun to interrogate big-model business models, project the endgame for big-model startups, point to China's unique data advantage, and repeatedly warn startups to generate their own cash flow instead of burning money... At that point, his dispute with Dark Side of the Moon was still below the surface.

The original article has been reposted and discussed over and over again. Looking back ten months later, most of Zhu Xiaohu's predictions have come true, though of course there have been surprises, and that's what makes time so moving.

 

I believe in AGI, but I believe in apps, and I believe in immediate commercialization.

Tencent News Periscope: When did you decide that you would not invest in any of the Chinese big-model companies on the field?

Zhu Xiaohu: We knew at a glance that this was definitely not going to work.

Tencent News Periscope: You could tell at a glance, back then?

Zhu Xiaohu: When they came out to raise money. We said from the beginning that I'm just not a fan of big models.

Tencent News Periscope: Which ones did you look at early on?

Zhu Xiaohu: I don't even want to name them, you know? It makes no sense. These companies have no scenarios and no data, so tell me, what value do they have? And their valuations are that expensive right out of the gate.

How many investors in the "Four Little Dragons of AI" (SenseTime, Megvii, CloudWalk, Yitu) actually made money, right? Turning to the big models, the outcome may not even be as good as the "Four Little Dragons". The "Four Little Dragons" at least had a golden period in their early years, and their revenues grew quite fast at first. What revenue have the big models brought in?

The four computer vision companies, Megvii (2011), Yitu (2012), SenseTime (2014), and CloudWalk (2015), rode the twin tailwinds of the deep learning boom and domestic security construction to become the most dazzling group of unicorns in China's AI field between 2016 and 2018.

However, when we talk about the "Four Little Dragons of AI" now, it is more with a sigh: competing on the same track with too little differentiation, and landing in scenarios with too little commercialization, AI technology startups have an extraordinarily hard road.

What's more, after fighting through development, financing, listing and other major hurdles, what awaited them was a deeply rooted traditional giant: Hikvision.

In Zhu Xiaohu's view, this history is repeating itself at an accelerated pace in the era of big models, only with a new batch of startups, whose rivals are Byte and Ali, players with far more resources and determination.

 

The point is, let me ask you a very concrete question right now: do you want to invest in doing GPT-4-level research? Doing GPT-4-level research costs at least 40 to 50 million dollars.

Tencent News Periscope: (Research) toward GPT-5?

Zhu Xiaohu: No, toward GPT-4! GPT-5 costs several hundred million dollars!

In July 2023, SemiAnalysis published a lengthy article describing GPT-4's architecture, parameter count, and other details, and estimated that a single GPT-4 training run cost up to $63 million.

In December 2024, the WSJ reported that GPT-5 had gone through at least two rounds of training, each taking several months, with the compute cost of a single round approaching $500 million.

In addition, in April 2024 Stanford HAI released the Artificial Intelligence Index Report 2024, which estimated OpenAI's GPT-4 training cost at $78 million and Google's Gemini Ultra training cost at $191 million.

 

The point is, what if you spend 40 to 50 million dollars to build it, and then someone else open-sources the same thing? Haven't you spent all that for nothing? This is a very real problem. Of all the companies in China today, how many truly dare to spend the money to develop GPT-4?

At the GPT-3.5 level everyone today is more or less the same, but GPT-4 requires real research; it is not that simple, and it is no longer purely an engineering problem. So if you spend the money while someone else just waits and open-sources it, haven't you spent it all for nothing? The big players will certainly do it themselves. As a startup, do you dare?

The overall level of domestic large models is now between GPT-3.5 and GPT-4, and the updating pace of most of the general-purpose large models has slowed down significantly.

 

Tencent News Periscope: They all say they will do it.

Zhu Xiaohu: Those who really dare to throw money at it must be feeling very unsure deep down.

 

Tencent News Periscope: Were you under a lot of pressure when you decided not to invest in big-model companies in the first half of 2023? Another fund that didn't invest in big-model companies said it was quite stressful. After all, most of the first-tier dollar funds had entered the market.

Zhu Xiaohu: Not much. Why would there be pressure? Chinese VCs have never made money from consensus bets.

Investors in 2023: can't get an allocation, stressed. Investors in 2024: don't know how to exit, stressed.

 

Tencent News Periscope: Some practitioners have said that if you admitted in the first half of 2023 that you weren't bullish on big models, you would be seen as having no faith.

Zhu Xiaohu: What do you mean, no faith? Hahahahahahaha.

Tencent News Periscope: No belief in AGI (artificial general intelligence).

Zhu Xiaohu: No, I believe in AGI, but I believe in applications, and I believe in immediate commercialization.

In early 2025, Sam Altman published a lengthy post on his personal blog announcing that OpenAI had mastered the way AGI was built and had begun to move toward superintelligence.

But as recently as March 2024, Sam Altman was interviewed by Lex Fridman and said that no one, including Ilya Sutskever, had built a true AGI yet.

He predicts that by 2030 (or earlier), humans may be able to build powerful systems with specific capabilities that approach or reach the level of AGI in some respects. Realizing AGI is complex and challenging.

So what happened to make Sam Altman's expectations for AGI shorten rapidly from 2030 to 2025? Or what exactly are those who believe in AGI believing in?

 

 

 

"Immediately cashable! Cash in a minute!"

Tencent News Periscope: What AIGC companies have you invested in over the past year?

Zhu Xiaohu: Quite a few. Not all of them were new investments in the past year; some were existing portfolio companies whose transition to AIGC went remarkably well, so we made follow-on investments.

There's one company doing AI video interviews that did really well in 2023. That surprised me! The job market was cold last year, but its AI video interview business more than doubled from 2022. Haha. I don't know how many people actually get hired, but the interviews still have to be done. Campus recruiting interviews are very expensive, and with AI you can cut that cost.

There are many scenarios like this. Take WeChat private-domain marketing: you can now use AI to replace the humans. Train on LLaMA for two or three months and you reach at least the level of the top 30% of human salespeople, which immediately lets you cut 50% of the sales staff. In scenarios like this, China is far ahead of the United States.

The company is Near Yu Intelligence. Founded in 2017, it is an HR tech company built around AI + RPA + BI technology; the founder and CEO is Fang Xiaolei.

In 2019, the company received an angel round from GSR Ventures, followed by investments from Tech Data, InnoAngel, Dark Horse Fund, and others. In 2024, it completed a Series A round led by Wisdom Hope Capital with follow-on from GSR Ventures.

 

Do you know FancyTech? Let me show you... AIGC video ads, pretty cool. Their product is very effective and instantly recognizable. When we invested in 2022, revenue was just over $10 million; last year it was over $50 million, up five or six times, and profitable throughout.

Do you think this could be done in the US? Pika (a globally popular AI video generation company) simply can't do it today!

Founded in 2020, FancyTech provides efficient and stable AIGC solutions based on Deep Video, a self-developed video industry model, for clients in the consumer industry, especially in the luxury, fashion, and FMCG sectors.

In 2022, the company completed Pre-A and Series A rounds of financing; in 2023, it completed a Series B round of nearly 100 million yuan, led by DCM with follow-on from existing shareholders GSR Ventures, Huashan Capital, and others.

In July 2024, an episode of the 42 Chapters podcast featured FancyTech founder William. Unexpectedly, former employees and interns of the company started a group discussion in the comments section, pointing out various internal management problems at FancyTech, which sparked a public outcry and led to the episode being taken down.

A month later, Zhu Xiaohu responded in an interview: the reason we are optimistic about these companies is precisely that 100% AI cannot do the job on its own; human work is still needed to deliver the last part of the result, and that is what a startup can defend.

This response is consistent with his investment logic. But! The most widely discussed concern, FancyTech's internal management chaos, was not addressed at all 🤪

 

Tencent News Periscope: If the bottom layer is big-model capability, what barriers can the application companies on top build?

Zhu Xiaohu: Data. The US has no product short-video data. Look at Amazon and Shopify in the US: they are still photo-based. All US e-commerce is photo-based, while over the past three years China has switched entirely to short video.

Tencent News Periscope: But once other Chinese companies see that it works, can't they copy it right away?

Zhu Xiaohu: It's hard to copy, it's a year ahead of everyone else.

In many vertical areas you have to accumulate data and keep optimizing. 60% to 70% of their customers authorize them to monitor campaign performance, so they know which videos work on Taobao, which work on Xiaohongshu, and which work on Douyin, and there is a closed data feedback loop. That is not easy to catch up with from behind.

The second thing is sales management skills. Most big-model founders don't know how to manage sales. If you don't know how to commercialize or manage sales, what do you do?

In 2023, mainstream voices had very pessimistic expectations for domestic big models, one common view being that "China's data quality lags behind".

There are also different voices.

For example: considering the data accumulated by 2C applications and the application scenarios in vertical industries, China has the advantage. China has a large number of successful 2C companies such as Byte, Pinduoduo, and Meituan, whereas US investment after 2013 focused on 2B SaaS companies, so the number of successful consumer companies and the scale of their data are far more limited.

The first time I heard this idea, back in December 2023, was in a podcast conversation between 42 Chapters and Jack Mok. The facts confirm the foresight of these top investors' judgment.

 

Tencent News Periscope: You didn't invest in any 2C projects?

Zhu Xiaohu: There are 2C projects, but it is still a bit early. 2B can commercialize immediately and basically does not need to burn money. A company we invested in said yesterday: for AIGC, if ten people can't find PMF, throwing a hundred people at it won't find it either. It has nothing to do with headcount or spend.

You don't get there by throwing money at it; with AIGC, money can't buy the result. The key is to find PMF! The key is to find PMF. Once you find PMF, you don't need to spend tens of millions of dollars on a big model; the cost is not high, and taking LLaMA and training it for two or three months is enough. The companies we invest in don't need many cards; at the low end, just one card. Fancy may have a dozen or so cards, and now that revenue is high it has gone to over a hundred.
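To make the "take LLaMA and train it for two or three months on one card" idea concrete, here is a minimal single-GPU fine-tuning sketch using Hugging Face Transformers with a LoRA adapter from the peft library. It is illustrative only: the model id, training data, and hyperparameters are placeholder assumptions, not the actual setup of any company mentioned in this article.

```python
# Minimal single-GPU LoRA fine-tuning sketch (illustrative only).
# Assumes: an open causal LM on Hugging Face (model id is a placeholder)
# and a small list of domain texts standing in for real vertical-scenario data.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder open-model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach a small LoRA adapter so only a few million parameters are trained,
# which is what makes a one-card setup plausible.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

texts = ["example sales dialogue 1 ...", "example sales dialogue 2 ..."]  # placeholder data
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-4
)

model.train()
for epoch in range(3):
    for text in texts:
        batch = tokenizer(text, return_tensors="pt", truncation=True,
                          max_length=512).to(model.device)
        out = model(**batch, labels=batch["input_ids"])  # standard causal-LM loss
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

model.save_pretrained("lora-vertical-adapter")  # saves only the adapter weights
```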

Zhu Xiaohu's investment logic has always been: find products where spending a few hundred thousand, one million, or two million dollars is enough to verify whether real user demand exists.

He then took that investment logic and brought it full circle into the AI era.

 

Tencent News Periscope: Sora came out of nowhere; will it deal a dimensionality-reduction blow to these projects?

Zhu Xiaohu: It will actually help. In the video we generate, the big model cannot do 100% of the work; part is manual, part is AI. Today's big models, especially those built on the Transformer architecture, inevitably produce hallucinations and bias, so they have to be integrated into a workflow where humans modify and fine-tune the output.

You must build things that AI cannot do 100% on its own; that is China's opportunity. Anything the big models will soon be able to do 100% will be overturned, and whatever you are doing there now is wasted effort.

Many developers and product teams should soberly recognize this.

After all, over the past two years, context length has grown from the early 4K to the 32K of GPT-4 and beyond, code generation has leapt from simple completion to semi-automatic programming, and image generation has gone from unstable subjects to precise control...

Often I wake up to find a newly released big model already being greeted with excited cries of "wocao" in every group chat.

If a team misjudges the direction of technological progress and tinkers inside the range that the big models' future evolution will cover, all of its effort is doomed to come to nothing.

 

Tencent News Periscope: So can we say that the first wave of AIGC opportunities in China is exploding in enterprise services?

Zhu Xiaohu: In the short term it is definitely 2B. When the iPhone, the first brick cell phones, and the PC came out, they were all adopted by businesses first. They could improve productivity immediately, with visible results, so enterprises were willing to spend money.

2C will have to wait for its "iPhone 3" moment. Just like when mobile-internet 2C apps exploded after multi-touch arrived, and that was unexpected, right? Angry Birds and Fruit Ninja became global hits only after multi-touch came out.

At what point 2C big-model applications will explode, I don't know. Today, the "personal assistant" is a figment of the engineer's imagination. Let me ask you: how many people actually need a personal assistant? It's a typical pseudo-need!

The only thing we agree on about the future of Super App is that we don't know what it is.

One way to look at it is that ChatGPT itself is a Super App: in just 5 days after its release, the number of users exceeded 1 million; two months after its release, the number of users exceeded 100 million; and there are currently more than 250 million weekly active users, making it one of the fastest-growing and most influential apps of all time. Its domestic counterparts are Doubao, Kimi, and so on.

However, the prediction from overseas unicorns is that the chatbot will remain the most inclusive front-end interaction with the broadest audience, but the chatbot race ended in 2024, and in 2025 the models will have to compete on entirely new product forms.

Another way to look at it is that everyone needs a near-omnipotent AI assistant. Luo Yonghao's recently launched J1 Assistant has already taken shape in terms of functionality and interaction. But as mentioned above, Zhu Xiaohu thinks it's a typical pseudo-need.

 

Tencent News Periscope: For these 2B companies, what effect does AI training need to achieve before they see exponential improvement?

Zhu Xiaohu: Very simple: sign the deal at the first meeting with the customer. Meeting once and signing is the assessment indicator; that is PMF.

Why was enterprise services so hard in the past? The sales cycle was long, six months, so sales grew slowly. Now you create an order-of-magnitude increase in value for the customer. Customization doesn't work; it has to be a standardized service: a POC (proof of concept) at the first meeting, a formal contract at the second.

PMF, Product-Market Fit.

The concept was first introduced by Marc Andreessen.

In 2007, he wrote in a blog post: "Product/market fit means being in a good market with a product that can satisfy that market." In layman's terms, it means finding a real point of need.

Since then, many theories have been born around PMF.

For example, Sean Ellis proposed the "40% rule": survey users on how they would feel if they could no longer use the product, and if more than 40% say they would be "very disappointed", the product has reached PMF.
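For illustration only, here is a tiny sketch of how a Sean Ellis score is typically computed from survey answers; the response counts are made up.

```python
# Illustrative Sean Ellis / "40% rule" calculation (made-up survey data).
# Survey question: "How would you feel if you could no longer use the product?"
responses = (
    ["very disappointed"] * 46
    + ["somewhat disappointed"] * 34
    + ["not disappointed"] * 20
)

very_disappointed_share = responses.count("very disappointed") / len(responses)
print(f"{very_disappointed_share:.0%} very disappointed")   # 46%
print("PMF signal:", very_disappointed_share > 0.40)        # True under the 40% rule
```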

In the last two years, a number of derivative concepts have emerged in conjunction with LLM characteristics.

For example, Baichuan Intelligence founder Wang Xiaochuan proposed TPF (Technology-Product Fit), which is the combination of technology and product.

 

Tencent News Periscope: How do you respond to Chinese enterprise-services investors who say there is no big outcome to be had on this track, let alone big money? ("Not a single chance of a billion-scale return.")

Zhu Xiaohu: American enterprise-service companies triple in the first year, triple in the second, double in the third, double in the fourth, and quickly reach $100 million ARR (annual recurring revenue). China's earlier enterprise-service software companies struggled to keep growing 50% once they reached tens of millions of yuan. After AIGC it is different: last year many companies grew three to five times or more.

Tencent News Periscope: In the past two years, the enterprise-services investors at some institutions have been forced out as a group. How do you see that phenomenon?

Zhu Xiaohu: Hahaha. That's right, and it is honestly a pity; they didn't survive to see the spring. China's enterprise services still have a chance, but there may be three to five years of winter first. With so much macro uncertainty, enterprise services is a game where the last one standing wins. Today, the only way is to use AIGC to reach explosive growth quickly without burning money.

 

Tencent News Periscope: The 2C explosion needs an "iPhone 3" moment. What moment are we at now?

Zhu Xiaohu: Roughly iPhone 1 or iPhone 2. But big models are evolving ten times faster than the mobile internet did. When every cell phone carries a big model, 2C applications may explode.

On-device (end-side) large models are models that run directly on end devices (cell phones, tablets, etc.). Because their parameter counts are relatively small (the sweet spot is around 3B), they are also called small models; they offer good privacy and security, low latency, and offline use.

Currently, the better-known end-side models include the MiniCPM series from ModelBest (Mianbi Intelligence), the ChatGLM series from Smart Spectrum (Zhipu), the Qwen series from Alibaba, the InternLM series from Shanghai AI Lab, the Phi series from Microsoft, and the Octopus series from Nexa AI.
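As a rough illustration of what "a big model on every phone" looks like from a developer's seat today, the sketch below loads a small open model of roughly this size with Hugging Face Transformers and runs one prompt locally. The model id is an assumption (any small chat model from the series above would do), and a real on-device deployment would use a quantized mobile runtime rather than full-precision PyTorch.

```python
# Minimal local-inference sketch with a ~3B "end-side" class model (illustrative).
# The model id is an assumption; substitute any small open chat model you can access.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-3B-Instruct"  # assumed small-model id from the Qwen series
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [{"role": "user", "content": "Summarize why on-device models matter."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```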

 

Tencent News Periscope: Looking back at big models in 2023, which moments would you mark as key nodes, globally and for China?

Zhu Xiaohu: (Thinks for two seconds...) The release of LLaMA was very important.

Open source changes the situation completely; at the very least it gives China a foundation for innovation at the application level. Before LLaMA, many wrapper products were built on OpenAI, which was a bit of a problem; after LLaMA, at least that problem is gone.

The CTOs at our portfolio companies are very young. Take LLaMA and train for two or three months, on as little as one card, and you can commercialize right away. Think about it: the threshold for commercialization is really, really low. Monetize immediately! Monetize right away!

Very curious what Zhu Xiaohu's judgment is on the '2024 critical node for big model development' 👀

The node I personally felt most keenly was May 2024, when DeepSeek-V2 drove big-model API prices down to just 1 RMB per million tokens.

Vendors such as Byte, Ali, Baidu, Tencent, and Smart Spectrum quickly followed, announcing price cuts or free tiers. Of course, some cuts were "real" and some were "fake", and some vendors stated outright that they would not join the price war (e.g. Zero One Everything).

As a result, DeepSeek, GLM-4-Flash, and SiliconFlow have become the most dependable API options for AI product developers in China, making it possible for more products to be born.

Indeed, as analyzed at the time, the DeepSeek API price reductions are a very positive sign before the explosion of 2C products.
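To see why the price cuts matter so directly to developers, here is a minimal sketch of calling such an API. Most of the vendors named above expose OpenAI-compatible endpoints, so the official openai Python client can simply be pointed at them; the base URL, model name, and the 1-RMB-per-million-token arithmetic are assumptions for illustration.

```python
# Illustrative call to an OpenAI-compatible chat endpoint (DeepSeek-style).
# base_url and model name are assumptions; set DEEPSEEK_API_KEY in your environment.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",  # assumed OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="deepseek-chat",  # assumed model name
    messages=[{"role": "user", "content": "Give me one sentence on API price cuts."}],
)
print(resp.choices[0].message.content)

# Back-of-the-envelope cost at roughly 1 RMB per million tokens:
tokens_used = resp.usage.total_tokens
print(f"~{tokens_used} tokens ≈ {tokens_used / 1_000_000:.6f} RMB at 1 RMB / 1M tokens")
```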

 

Tencent News Periscope: Google has just launched the open-source model Gemma. How does its performance compare with the earlier LLaMA and Mistral models, and how likely is it that OpenAI will open-source models later on?

Zhu Xiaohu: The overall feedback is that it is a bit better than LLaMA 2. Some in the industry think it was pushed out ahead of LLaMA 3, and LLaMA 3 is coming soon too. For OpenAI there is no need at the moment. Right now LLaMA, Mistral, and Google are the three competing; OpenAI open-sourcing is still a long way off.

Currently, in the LMSYS big model arena ranking, the top 3 open source big models are DeepSeek-V3, Yi-Lightning, and Qwen2.5-plus-1127, all of which are models from Chinese companies. Moreover, DeepSeek V3 has good performance and is very close to the top closed-source models.

The global landscape of open source modeling has been turned upside down.

The latest versions of the three open source models (families) mentioned above are all in the range of GPT-3.5 and GPT-4, and their progress is summarized below:

✦ LLaMa 3 was released on April 18, 2024, and since then versions 3.1 (July), 3.2 (October), and 3.3 (December) have been released.

✦ Gemma 2 was released on June 27, 2024, and included different parameter sizes such as 2.6B, 9B, and 27B. There have been no significant updates since then.

✦ Between July and October 2024, Mistral AI released the Mistral Large 2, Pixtral, and Ministral series of models. Sources say Mistral AI has stopped pre-training.

 

If the others catch up to GPT-4, OpenAI may well open-source a small model. For many vertical applications we found Mistral to be better than LLaMA 2. In any case, whenever a model is open-sourced we try it out and see which works better.

In December 2024, OpenAI announced on its website that it would move away from its "nonprofit-controlled for-profit company" structure and create a new for-profit Public Benefit Corporation (PBC) registered in Delaware, with the main operations moving to the newly formed PBC.

In fact, the last OpenAI open source model was GPT-2 (2019).

Since then, for models such as GPT-3, DALL-E, GPT-4, Sora, and o1, OpenAI has stopped open-sourcing model code and weights and releases only technical reports, which themselves have shrunk from detailed technical breakdowns to brief overviews with fewer details disclosed.

Can we wait for OpenAI to open source new models?

 

 

"This is classic FOMO."

Tencent News Periscope: I once asked what a year of fierce competition in Chinese big models in 2023 had produced. One investor told me: it produced a handful of big-model startups.

Zhu Xiaohu: Let's look again in another year and see how many of them are still around.

The answer for Zhu Xiaohu in early 2025: all of them are still around, but each has had a harder time.

✦ Dark Side of the Moon is said to be the first team in China to have stopped pre-training; its overseas 2C apps Ohai (virtual companion) and Noisee (music video generation) have also been shut down, and it is now focused on doing Kimi well. Kimi ranks No. 2 in China by monthly active users; No. 1 is the latecomer that pulled ahead, Doubao.

✦ MiniMax is currently focusing on Talkie, its overseas 2C companion-chat app, which has overtaken Character.AI to become No. 1; but its domestic version, Hoshino, has been overtaken by Byte's Catbox, and its 2B market is also being taken by Byte.

✦ Zero One Everything has publicly stated that it will no longer pursue training super-large models; most of its training and AI infra teams have joined a joint lab as Ali employees, and the company will focus on overseas 2C applications and domestic 2B business. It is the first Chinese big-model unicorn to publicly make a major adjustment to its direction.

✦ Baichuan Intelligence has confirmed its pivot to the healthcare vertical; some time ago it also released a finance-focused large model.

✦ Smart Spectrum focuses on 2B and 2G, with AutoGLM as a key direction for subsequent development.

✦ Step Star is a bit unusual: besides training models, its 2C apps mostly take the form of collaborations with external development teams, producing a small hit every now and then, such as the familiar Book of Stomachs, Lyrics Riot Machine, and Woodland Healing Room.

At a forum in June 2024, Zhu Xiaohu extended this point: almost all first-tier big-model companies have already allied themselves with giants, while second-tier ones can only sell themselves; one can assert that in five years there may be no independent big-model companies left, only AI application companies and cloud service companies.

 

Tencent News Periscope: Have you met with every one of them?

Zhu Xiaohu: I saw the concept and knew there was no chance, no possibility, so I didn't even take the meetings. I'm very familiar with all of them. Wang Huiwen (co-founder of Meituan, founder of Light Years Beyond) I know so well, yet I wasn't willing to go and talk with him about this.

Tencent News Periscope: In his last war you were on the opposing side: he was in the Meituan camp, and you were with Ele.me (as an investor).

Zhu Xiaohu: No, no, we have a very good personal relationship.

Tencent News Periscope: How did you react when Wang Huiwen raised the banner and said he wanted to do big models?

Zhu Xiaohu: Everyone calm down, let the bullet fly for a while. Fly for half a year and you will know whether it works or not.

Tencent News Periscope: What role did Wang Huiwen play in the big-model wars, entering with $50 million and then suddenly bowing out?

Zhu Xiaohu: That is the romanticism of the technical man. Wang Huiwen didn't think it through; he is actually good at commercialization. If he had started with applications, the result would surely have been much better than it is now. At the time, everyone's FOMO was running high.

In June 2023, Meituan acquired Light Years Beyond. In November 2024, media reports said Wang Huiwen had returned to Meituan to lead GN06, an independent team exploring AI applications.

The GN06 team has already launched several AI apps, such as Dodoboo (a children's drawing app for overseas markets), Pretty Fish Companion (a children's AI voice-interaction app developed with the Little Genius watch), Miaobrush (an image generation and editing app based on ComfyUI), Wow (a virtual social community), and more.

According to 01Founder, Wang Huiwen has set two main principles for GN06: first, explore independent, boundary-free innovation; second, seek breakthroughs in the global market.

 

Tencent News Periscope: Many big-model companies are now following the "dual-wheel drive" that Wang Huiwen proposed.

Zhu Xiaohu: How do you drive two wheels? Can you get both wheels spinning? Which big-model company has gotten both wheels spinning? Even Baidu, honestly, wouldn't dare claim its two wheels are spinning. Baidu at least has plenty of scenarios, and Wenxin Yiyan started early, with at least a million DAU, yet even it wouldn't dare say the dual-wheel drive has spun up this year.

Unsurprisingly, the original goal of "model and application as one" has fallen through.

Progress on general-purpose models from MiniMax, Zero One Everything, and Baidu has slowed; Smart Spectrum, DeepSeek, and Qwen lack breakout 2C applications (a chatbot doesn't count).

But one unexpected player broke through: Byte.

In the second half of 2024, Byte advanced at a furious pace: on one hand it hired aggressively and its model capabilities iterated rapidly; on the other, the "product factory" awoke, saturating every popular product direction with Doubao, Jimeng, Cici, Gauth, Hypic, Coze...

Left foot on right foot, spiral to the sky.

The foreign counterpart to this - Google.

In the second half of 2024, the tech giant finally redeemed itself with the release of the Gemini 2.0 series of models, all of which are quite capable, and apps such as NotebookLM, Illuminate, Learn About, ImageFX, Whisk, and many others have come out of the woodwork.

With a thick health bar, it can really take a beating.

 

Tencent News Periscope: What do you think of some funds' bets on this wave of big-model companies?

Zhu Xiaohu: Some of it is classic FOMO, Fear of Missing Out.

Tencent News Periscope: You're not afraid of missing out?

Zhu Xiaohu: We don't care. What are we missing? When the mobile internet first arrived, everyone wanted to build a localized Chinese OS, and where are those companies now? We invested in the Dianxin ("Dim Sum") OS, the first company incubated by Kai-Fu Lee's Innovation Works, and in the end it was acquired by Baidu. Today's big models will likewise have to wait for the big players to take them over, and how generous will the big players be today? In the mobile-internet era there was no antitrust yet, so we still made a little money.

Tencent News Periscope: Shouldn't a normal investor think: let me first find a way into the game?

Zhu Xiaohu: If getting in were cheap, everyone would give it a try. At such expensive valuations, what would you go in for? I simply don't want to take the meetings.

The situation now is different from the "Four Little Dragons" era. The "Four Little Dragons" grew up during a capital bubble, with round after round after round of financing. Now, who can still raise the next round? Today the domestic big models are basically going to the government for money, and government money is not easy money. Moreover, with valuations already pushed this high, how do you justify the next one?

In mid-2024 there was little news of large financings. Toward the end of the year, Baichuan, Smart Spectrum, ModelBest, Aishi Technology, and Step Star announced new rounds, and in every case the investors included state-owned capital.

 

Tencent News Periscope: Will there be a wave of mergers or acquisitions among Chinese big-model companies in 2024, like the Meituan-Dianping merger you witnessed before, or Ali's acquisition of Ele.me?

Zhu Xiaohu: Technical people never believe they are worse than anyone else, so how would such a merger even be discussed? (Laughs) Second, who is willing to acquire now? If everyone is just modifying open-source LLaMA, what do you have that I don't? I have data and scenarios. What do you have? You just have a few people.

Any M&A now would be about acquiring the team, and only the team. How much is a team worth? How much would anyone pay? It's completely different from before.

Microsoft and Inflection AI, Google and Character.AI abroad, and Alibaba Cloud and Zero One Everything at home all follow more or less the same "script": the big player takes the core technical talent, and the investors exit with principal plus interest.

A review of these three classic "takeover" cases:

✦ In March 2024, Microsoft paid Inflection AI about $650 million for a license to its big-model technology and hired away most of Inflection AI's employees, including its co-founders and key researchers.

✦ In August 2024, Google bought out Character.AI's investors at a valuation of about $2.5 billion and paid a non-exclusive licensing fee for its big-model technology; Character.AI's two co-founders and its core researchers joined the Google DeepMind team.

✦ In January 2025, Alibaba Cloud and Zero One Everything set up a joint lab, with most of Zero One Everything's training and AI infra teams joining as Ali employees. No further details have been disclosed so far.

 

Tencent News Periscope: What's the way out for these big-model companies?

Zhu Xiaohu: I don't know, and I don't even want to think about it. Today I ask every company only: how much revenue do you have? Can you avoid burning money? That is all I care about.

In early September 2024, two memes about the "Six Little Dragons of big models" began circulating in the AI community.

Starting from those, we made a series of observations, trying to match rumors such as "the overseas products flopped and were all cut", "from boom to silence", and "no more pre-training, no more C-end" against the clues in each company's moves.

Now, judging from public information, most of it has indeed checked out.

 

Tencent News Periscope: You said in an earlier interview that investors who invested in big models in the first half of the year regretted it in the second half. Did they tell you that, or are you guessing?

Zhu Xiaohu: Hard to say, but there are definitely people who regret it. The key is that the next round of money is genuinely hard to raise, even if you have enough cash on hand for now. It's really awkward. Do you do GPT-4 in the end or not? If you don't, how are you different from everyone else? If you do, and someone open-sources a GPT-4-level model, you'll regret it. You want to do vertical scenarios, but in which vertical scenario do you actually have an advantage?

 

Tencent News Periscope: But just this February, Dark Side of the Moon raised $1 billion from investors including Ali, Tonus Capital, and Xiaohongshu. Other big-model companies are also said to be doing deals this large. So there are still people in the market willing to invest, and quite a few of them.

Zhu Xiaohu: Mainly the big players, and the big players are also driven by FOMO, afraid of betting wrong. Most of Dark Side of the Moon's money came from Ali.

In 2023 and 2024, Dark Side of the Moon, MiniMax, Smart Spectrum AI, Baichuan Intelligence, and Zero One Everything, all received investment from Ali, with a cumulative amount of more than RMB 10 billion. This is known as the "Ali round".

 

Tencent News' Subterranean: Will Ali and Dark Side of the Moon form a model like Microsoft and OpenAI?

Zhu Xiaohu: That depends on Ali's investment department and internal coordination. It is not yet certain; Ali also has several internal teams working on big models, and in the end the business units will use whichever works best.

Let's go through the timeline; it is hard to see the true face of the mountain from inside it.

In February 2024, Alibaba invested $800 million in Dark Side of the Moon. Hu Xiao, the main driver of the investment, also pushed Dark Side of the Moon's big model into pilot applications across multiple Ali business scenarios and helped it enter the enterprise-services market.

In April 2024, the Tongyi Qianwen model was released, and Alibaba announced that all of its products, including Tmall, DingTalk, Amap, Taobao, Youku, and Hema, would connect to the Tongyi Qianwen big model for comprehensive transformation.

In September 2024, Hu Xiao left Ali's strategic investment department to join the Morning One Fund.

In November 2024, Recurrent AI (Cycle Intelligence) and five of its investors (GSR Ventures and others) filed for arbitration in Hong Kong against Yang Zhilin and Zhang Yutao, alleging that they launched fundraising and founded Dark Side of the Moon before obtaining consent waivers.

In December 2024, Zhu Xiaohu and Yang Zhilin exchanged words publicly, and the conflict gradually came out into the open. For more details, see the "Waves" feature story.

 

Tencent News Periscope: Will other giants invest as heavily, or simply acquire a big-model company next?

Zhu Xiaohu: The key is how much confidence each has in its internal teams. Right now, among the big players only Ali has shown willingness to acquire; it's not like before, when the big players were all willing to buy. And the money Ali would pay today is certainly completely different from back then.

Byte probably has no appetite for acquisitions; it thinks it can do it itself. Baidu certainly believes it can do it itself. Tencent is harder to say: several internal teams are working on it, but at least for now there is no strong sign of acquisition intent. And Tencent has always felt there is no rush; it follows slowly from behind, because it has the scenarios and the data. Look at Tencent in games, video, music, and literature: it followed from behind and ended up first.

2024 has now passed, and the companies have indeed followed this script: Byte is firing on all cylinders, Baidu is sleepwalking, and Tencent is taking its time.

 

Tencent News Periscope: What do you think of Meituan's acquisition of Light Years Beyond? Although that one is rather special.

Zhu Xiaohu: It is a complete relief for Lao Wang (Wang Huiwen), and the investors basically get their capital back. It is also a warning: future acquisitions, if not as successful as that one, may at best let investors recover some principal plus interest. The big companies are not that rich anymore; acquisitions can't be compared with the old days. And if the outcome is just getting some principal plus interest back, what was the point of investing?

Of course, Dark Side of the Moon would be valuable to acquire if it proves itself, if its model can catch up to the closed-source level and reach GPT-4.5 or GPT-5. But if it only reaches the open-source level, how much would you pay for the team?

As for the progress of China's big models: if you keep up with open source, you at least still have a reason to exist; if you can't keep up with open source, there is no point; only if you catch up with closed source do you have unique additional value.

 

Tencent News Periscope: What would you say to your peers who are already in the game?

Zhu Xiaohu: (Thinks for a long time) That's damn hard to say, isn't it?

(Thinks again for a long time) I think this is something that people... this is not something I should comment on, not something I should comment on, not something I should comment on.

It doesn't matter to them anyway, they have a lot of money, honestly.

 

 

 

"That's why I don't advise domestic entrepreneurs to use big domestic models."

Tencent News Periscope: How big is the gap between China and the US in this wave of big models?

Zhu Xiaohu: To be honest, in this AIGC wave the gap between China and the United States is still very large. In the US, investment in the underlying big models keeps getting bigger; OpenAI talks about connecting 100,000 GPUs together. That is impossible in China.

When you look at AI application innovation in the US, there are honestly only two paths. One is very, very thin: the underlying big model is so powerful that what sits on top is just a wrapper application. The other looks grand but simply doesn't work yet: like Pika, where the goal is huge, AIGC-generated videos and films, but that road may not become passable for years.

This observation is very accurate.

The ambition of OpenAI engineers is to build extremely powerful general-purpose big models, and then external applications are just a thin layer of shells attached to them. This has to do with OpenAI's resources and strength, as well as the application scenarios in the United States.

In this interview, Zhu Xiaohu explicitly named two companies he is not optimistic about: Pika, because its goal is so big as to be unrealistic, and Midjourney, because the demand in its scenario is too low-frequency.

Both companies seem to sense the danger of being on this death list, and both adjusted their strategies in 2024: Pika tied itself to more concrete everyday scenarios with its effects modes, and Midjourney released Patchwork, the best infinite-canvas creation product I have seen (so far).

As for the grand blueprint of "AIGC film and TV generation", that will evidently be left to leading creative platforms like Jianying (CapCut), or to film and TV giants like Disney.

 

China, by contrast, sits a bit more "in the middle": the underlying big models are not strong enough, so I can add more things on top, offer value-added services, and immediately generate cash for my customers. That kind of opportunity exists in China.

There is almost nothing like this in the US, because the underlying big models are so powerful that there is very little left for startups to do on top. In China, no one will dismiss you as a mere wrapper, because the capabilities of the big models themselves are only so-so, and you have to add value on top.

What's the point of having over 200 big models? Not much. But there is a lot of innovation at the application level. China is far ahead of the United States in data and application scenarios.

Let me add one more piece of context.

Among the 2024 "top ten great national instruments" of central state-owned enterprises, released at the start of 2025, two are related to large models: the "Nine Skies" model developed by China Mobile and the "Harnessing Electricity" model developed by China Southern Power Grid.

Other items on the list include the "Dream" ocean drilling ship (China's first of its kind), the new intelligent heavy-haul electric locomotive, the "Diffraction" quantum computing cloud platform, the world's largest offshore wind turbine, and the "Jianghai" super-large-diameter shield tunneling machine.

Feel the weight of being "the only country in the world with every industrial category in the United Nations industrial classification".

 

Tencent News Periscope: What cards are already face-up on the big-model poker table right now?

Zhu Xiaohu: Open source is now a generation behind non-open source, but in the long run, open source will definitely catch up.

Tencent News Periscope: Li Guangmi, founder of Shixiang (Pick Up Elephant), judges that open-source models cannot catch up with closed-source ones and that the gap will only widen, that big models are much like chips or SpaceX, that LLaMA's talent density is not yet enough, and that the core secrets of big models in Silicon Valley sit inside three companies: OpenAI, Anthropic, and Google.

Zhu Xiaohu: OpenAI's technology iteration curve is still fairly steep, and open source is certainly a year or even a year and a half behind closed source. But once the closed-source iteration curve slows down, open source will climb. OpenAI has only a couple of hundred engineers, while open source is used by millions or tens of millions of engineers around the world; how could it lag behind closed source forever? Look at Android: is it worse than iOS today? Definitely not.

It all comes down to whether the 100,000-card cluster delivers. Will "scale producing miracles" continue? If 100,000 cards can still brute-force a miracle, that is genuinely impressive; if 100,000 cards cannot significantly improve performance, things slow down. And as soon as the technology iteration curve slows, open source catches up immediately; no one can keep secrets forever, and there are no secrets left to keep.

This judgment was partially validated in 2024.

First, the delayed releases of GPT-5 and Claude 3.5 Opus are seen as key signals of slowing technology iteration. OpenAI's spending on model training is still enormous, but its cost-benefit ratio is being questioned, and its market share has fallen from 50% in 2023 to 34%, eroded step by step by open-source models.

Second, the "Scaling Law hitting the wall" debate ran from mid-year to year-end, and in December 2024 Ilya Sutskever stated plainly that AI training data faces a growth bottleneck and that existing data will not meet the needs of future development, meaning the era of pre-training as we know it is coming to an end.

Also, on December 26, 2024, DeepSeek-V3 was officially open-sourced, and in several benchmarks, performance has been comparable to top closed-source models such as GPT-4 and Claude-3.5-Sonnet.

In the current situation, Zhu Xiaohu's judgment is right: the technology iteration curve will always slow down one day, open source can always catch up.

 

Tencent News Periscope: Dark Side of the Moon founder Yang Zhilin's view is that the development model is different from before: in the past everyone could contribute to open source, but now open source itself is still centralized, and many open-source contributions may not have been validated with real compute. What do you think of this more technically grounded inference of his?

Zhu Xiaohu: The application layer will lean more toward open source, especially for Chinese developers; with open source, at least you don't have to worry about being copied.

With the domestic big models, honestly, you build a house on top of them and still have to worry about being copied. Model and application require completely different skills: models need scientists, people who understand the technology deeply, and the team can stay lean. Applications require knowing scenarios, marketing channels, and sales inside out, a completely different skill set from scientists.

Tencent News Periscope: The idealized vision of a big-model company is: with one hand I build the best model, and with the other I build the best application.

Zhu Xiaohu: That's why I don't recommend that domestic entrepreneurs build on domestic big models. If you build on a domestic big model and do well, they will surely copy you. They all do big models and genuinely don't understand applications, but if you do well on top of their model, it is easy for them to copy you.

The United States has a clear division of labor. The domestic big-model companies know their models lag behind the US, yet all of them want to do everything, which makes entrepreneurs even more afraid to use them. I have always told domestic entrepreneurs: never build your house on someone else's foundation.

The chatbots of the major model vendors are integrating more and more functions, and their product forms keep getting richer: web and mobile clients are standard for every vendor, and desktop clients, mini-programs, browser plug-ins and so on are arriving one after another.

Many of the small AI apps that burst onto the scene before have completely run out of room to survive.

In addition, I have another reading of the phrase "never build a house on someone else's foundation": never host an app inside someone else's content ecosystem, especially WeChat official accounts and Xiaohongshu.

Your product's room to survive shrinks drastically whenever platform policies tighten or it conflicts with the platform's own products. This is a foregone conclusion.

 

Tencent News Periscope: Won't OpenAI build applications itself?

Zhu Xiaohu: It was forced into making GPTs. Application developers really didn't find many scenarios on top of it, so it built GPTs to demonstrate some. The front-end scenarios in the US are already held by others; why does Microsoft cooperate with OpenAI? Microsoft has a pile of scenarios, and OpenAI has no advantage doing them itself, so it has to cooperate. It is now very clear in the United States that the big model will end up as part of cloud services.

 

Tencent News Periscope: What do you think of Yang Zhilin's team?

Zhu Xiaohu: We invested in his previous company. He is very capable, and big models do suit him. Doing the research is fine for him, but I don't know how he will commercialize it. Honestly, Wang Xiaochuan is the same.

They (Dark Side of the Moon) are ahead among domestic big models, but in the long run they still have to prove themselves by at least keeping up with US open source. If they can surpass open source, then the team is truly valuable.

On November 28, 2024, at the k0-math launch event, Yang Zhilin answered several commercialization-related questions, such as how he views the competition between Kimi and Doubao, Kimi's current core mission, and its paid user-acquisition strategy.

When asked, "What is Kimi's most important core task at the moment?", Yang Zhilin answered: improving retention, and it never ends.

Yang Zhilin's decisions have certainly been decisive. It's just that this is a dangerous road to walk, and it makes onlookers nervous for him.

 

Tencent News Periscope: Your views are sharper than most. Have you ever been pushed back on by peers or entrepreneurs offline?

Zhu Xiaohu: No. Nobody can answer the questions I ask. Who can answer them? I'd welcome the pushback; the point is, who can actually answer? Where is your commercialization scenario? Where is your data? They don't know. Go talk to them yourself and you'll see.

This really is worse than the "Four Little Dragons of AI". When the "Four Little Dragons" entered the market there weren't this many competitors, only five, six, seven, eight players, and the competition wasn't as fierce. They still had a golden period of two or three years to build up revenue before the price-cutting started.

Now there are 200 big models. At the beginning of 2023, a private deployment of a big model went for 10 million yuan; by June it was 5 million; by year-end, under 1 million. Deploying a private big model for a central state-owned enterprise now fetches less than one million yuan. In a single year the price was driven down to the floor. What do you do? How can a startup survive that? Entering a price war this early, it will be very hard for big-model companies to survive on their own.

This year will show if the big model itself is a good business model. How many OpenAI users will migrate to Google's Gemini because of the price difference -- $20 a month for OpenAI, $10 a month for Gemini. Half of our US team has already switched to Gemini, partly because of the price and partly because of Google's ecosystem.

Where the best models are, where the free channels are, users flock to them.

Businesses don't have moats. Users have no loyalty.

Add two observations that corroborate the point above.

✦ A while ago Claude was frequently "resource constrained" and switched users' default model from Claude 3.5 Sonnet to Claude 3.5 Haiku. But after Gemini 2.0 was released, the resources suddenly weren't constrained anymore 👀

✦ Poe introduced a $10 subscription tier in November. It looks like a cheaper paid option, but it is actually less cost-effective. Presumably the usual freeloading got so heavy that the platform had no choice 🤣

 

Samsung's AI phones already bundle Google Gemini. Now it is Apple's turn: which big model will it bundle with the new iPhone, and how much will it charge the big-model companies?

Apple's chosen partner in the U.S. is OpenAI.

In December 2024, ChatGPT was officially integrated into Apple Intelligence in iOS 18, allowing users to invoke ChatGPT through Siri.

The way the two work together is being kept under wraps, with people familiar with the matter revealing that they don't pay each other, and that Apple may get a cut of users' ChatGPT Plus subscriptions in the future.

Apple's choice of Chinese partner, on the other hand, can be described as a series of twists and turns.

In December 2024, foreign media reported that Apple had paid up to $10 billion to use Baidu's AI models and had borne the cost of retraining and fine-tuning the models, but the partnership was still not going well.

Later, word came out that Apple had started talks with Tencent and Byte, and was also engaging with Smart Spectrum (Zhipu AI).

 

"You have to be realistic."

Tencent News Periscope: In your view, what do this era and the previous one have in common, and how do they differ?

Zhu Xiaohu: I don't think Midjourney can defend its position. Why is Midjourney still hot? Because the technology iteration curve is still steep: Midjourney 5, Midjourney 6, the versions come fast. But once the technology curve slows down, it can't hold, because the 2C use case is too low-frequency and it is too easy for someone else to bolt the same thing onto an existing product. Why would the big companies leave you the chance? In the US there may still be an acquisition; in China the outcome is not necessarily as kind.

So, just like in the mobile internet, 2C applications must be high-need and high-frequency to have a chance of holding up long term. I have serious concerns about Midjourney.

It has been two years since ChatGPT was released, and after a period of upheaval the world is gradually forming a new consensus and returning to common sense.

For example, the point above, that 2C applications must be high-frequency and address real needs, is a piece of common sense from the mobile internet era.

 

Tencent News Periscope: Do you have a favorite founder profile?

Zhu Xiaohu: Very clear thinking, able to grasp a matter in ten minutes, and direct in expression.

(The questions I ask) are basically the same: where is your market opportunity? How big is it? Why you? Those are the questions; there isn't much more to say.

 

Tencent News Periscope: If the judgments about big models you've made today all turn out to be wrong, how would you feel?

Zhu Xiaohu: That's normal and entirely possible. But in my personal view, the core question is still whether AGI can emerge, whether an artificial intelligence that understands a world model can emerge. At present I don't see it happening for at least 5 to 10 years.

From a philosophical point of view, a jump in the level of intelligence first requires a jump in the level of energy. Until controlled fusion is achieved, I don't quite believe the Earth has enough compute to realize true AGI. Taking 90% of the work off humans' hands may be achievable in the next 3 to 5 years, but the final 10% may require astronomical compute and energy, which is why Sam Altman wants to raise astronomical amounts of money. The last stretch of a hundred-mile journey is half the journey.

Jürgen Schmidhuber (father of LSTM) says the singularity will come around 2040, Ray Kurzweil (inventor & futurist) gives 2045.

While Elon Musk says AGI will be here in 2026 and Dario Amodei (Anthropic CEO) agrees, Demis Hassabis (DeepMind founder / 2024 Nobel Prize in Chemistry winner) says it will take at least 10 more years and Geoffrey Hinton (deep learning pioneer / 2018 Turing Award winner) gives the latest version of the reference as 5 - 20 years.

And Sam Altman adjusted his AGI estimate from 2030 to 2025. If I understand correctly.

Is the first reaction of those who believe in AGI to Sam's tweets gleeful, skeptical, or a collapse of faith?

 

Sora proves that the US has the money and the courage to try and make mistakes, so it's good that China is just following behind. The technology iteration curve will definitely slow down.

It was the same when the PC first came out: the 286, 386, 486 (CPU generations), we all thought the United States was unbeatable, but after the 586, Lenovo bought IBM's PC business. The technology iteration curve can't stay that steep forever. NVIDIA rose 20% yesterday, which means it is getting closer to the top.

In February 2024, OpenAI released the technical report and demo cases for Sora, its text-to-video model, and said it would not be available to the public in the near term.

In April 2024, Shengshu Technology's Vidu released a teaser, kicking off the domestic text-to-video race: Byte's Jimeng, Kuaishou's Kling, PixVerse, MiniMax's Hailuo, Smart Spectrum's Qingying, and Tencent's Hunyuan released products and APIs one after another.

Among them, the product with the best results and the biggest breakout is Kling: after its June launch it exploded globally and became the most-discussed domestic AI application/model overseas (open-source big models aside).

 

Tencent News Periscope: When do you expect the technology curve to slow down?

Zhu Xiaohu: It will basically slow down after GPT-5. GPT-4.5 should come soon this year; GPT-5 is less certain, maybe next year. GPT-5 means video generation reaching the level of today's image generation, and going further than that is not easy.

Look at it now: progress in language models is nearly at its ceiling; the breakthrough is in multimodality, and after Sora releases another three or four versions you will be able to see that ceiling too. The breakthrough after that, by their own account, needs 7 trillion dollars and training runs of at least hundreds of thousands of cards; the cost is far too high.

In an October 2023 interview with the German business newspaper Handelsblatt, Bill Gates said that GPT-4 had reached the ceiling of generative AI capabilities.

There are "many good people" working at OpenAI who are convinced that GPT-5 will be significantly better than GPT-4, including OpenAI CEO Sam Altman, Gates says. But he believes that current generative AI has reached a ceiling - though he admits he could be wrong.

Link: https://the-decoder.com/bill-gates-does-not-expect-gpt-5-to-be-much-better-than-gpt-4

At the time, we didn't think much of it, imagining that OpenAI had countless treasures in its pockets, each of which would be famous around the world. Now I look back at Bill Gates' public statement and realize that we didn't pour enough cold water on it.

 

Tencent News Periscope: Back to the topic of AGI faith: if the big-model companies are destined not to see commercialization, then taking a step back, can the money they've raised at least support humanity's dream and scientific research?

Zhu Xiaohu: That needs the big players and governments to support it. Why do the Americans dare to invest? Microsoft has a market capitalization of 3 trillion dollars, Apple 2 trillion; they can afford to pour money in.

China does not need to throw money at this. America's money goes first into trial and error; once a road is proven open, following it costs an order of magnitude less. We follow behind, spend an order of magnitude less, and take much less risk; why not follow?

"This fourth-generation plane is not a fourth-generation plane. What we have is not a fourth-generation fighter in the American sense, nor a fifth-generation fighter in the Russian sense.

We are an improvement on the J-10. ......

It's not that easy. The US F-22 has been around for more than 20 years, but China has no plans to launch its fourth-generation aircraft. ......"
