How China’s New AI Model DeepSeek Is Threatening U.S. Dominance
Published 2025-01-24 17:00:42

Transcript
China's latest AI breakthrough has leapfrogged the world. I think we should take the development out of China very, very seriously. A game-changing move that does not come from OpenAI, Google, or Meta. There is a new model that has all of the Valley buzzing. But from a Chinese lab called DeepSeek.
It's opened a lot of eyes as to what is actually happening in AI in China. What took Google and OpenAI years and hundreds of millions of dollars to build, DeepSeek says took it just two months and less than $6 million. They have the best open-source model, and all the American developers are building on that. I'm Deirdre Bosa with the TechCheck Take: China's AI breakthrough.
It was a technological leap that shocked Silicon Valley. A newly unveiled free, open-source AI model that beat some of the most powerful ones on the market. But it wasn't a new launch from OpenAI or a model announcement from Anthropic. This one was built in the East, by a Chinese research lab called DeepSeek. And the details behind its development stunned top AI researchers here in the US.
First, the cost. The AI lab reportedly spent just $5.6 million to build DeepSeek V3. Compare that to OpenAI, which is spending $5 billion a year, and Google, which expects capital expenditures in 2024 to soar to over $50 billion. And then there's Microsoft, which shelled out more than $13 billion just to invest in OpenAI. But even more stunning is how DeepSeek's scrappy model was able to outperform the lavishly funded American ones.
To see the DeepSeek new model, it's super impressive in terms of both how they have really effectively done an open-source model that does this inference-time compute, and it's super compute-efficient. It beat Meta's Llama, OpenAI's GPT-4o, and Anthropic's Claude Sonnet 3.5 on accuracy in wide-ranging tests.
A subset of 500 math problems, an AI math evaluation, coding competitions, and a test of spotting and fixing bugs in code. DeepSeek quickly followed that up with a new reasoning model called R1, which just as easily outperformed OpenAI's cutting-edge o1 in some of those third-party tests. Today we released Humanity's Last Exam, which is a new evaluation, or benchmark, of AI models that we produced by getting math, physics, biology, and chemistry professors to provide the hardest questions they could possibly imagine.
DeepSeek, the leading Chinese AI lab, has a model that is actually the top-performing, or roughly on par with, the best American models. They accomplished all that despite the strict semiconductor restrictions that the US government has imposed on China, which have essentially shackled them out of computing power. Washington has drawn a hard line against China in the AI race, cutting the country off from receiving America's most powerful chips, like NVIDIA's H100 GPUs.
Those were once thought to be essential to building a competitive AI model, with startups and big tech firms alike scrambling to get their hands on any available. But DeepSeek turned that on its head, sidestepping the rules by using NVIDIA's less performant H800s to build the latest model, and showing that the chip export controls were not the chokehold D.C. intended. They were able to take whatever hardware they were trained on but use it way more efficiently.
But just who's behind DeepSeek, anyway? Despite its breakthrough, very, very little is known about its lab and its founder, Liang Wenfeng. According to Chinese media reports, DeepSeek was born out of a Chinese hedge fund called High-Flyer Quant that manages about $8 billion in assets. The mission on its developer site reads simply: unravel the mystery of AGI with curiosity, answer the essential question with long-termism.
The leading American AI startups, meanwhile, OpenAI and Anthropic, have detailed charters and constitutions that lay out their principles and their founding missions, like these sections on AI safety and responsibility. Despite several attempts to reach someone at DeepSeek, we never got a response. How did they actually assemble this talent? How did they assemble all the hardware? How did they assemble the data to do all this? We don't know. And it's never been publicized, and hopefully we can learn that.
But the mystery brings into sharp relief just how urgent and complex the AI face-off against China has become. Because it's not just DeepSeek. Other more well-known Chinese AI models have carved out positions in the race with limited resources as well. Kai-Fu Lee is one of the leading AI researchers in China, formerly leading Google's operations there. Now, his startup, 01.AI.
It's attracting attention, becoming a unicorn just eight months after founding and bringing in almost $14 million in revenue in 2024. The thing that shocks my friends in Silicon Valley is not just our performance, but that we trained the model with only $3 million. And GPT-4 was trained with $80 to $100 million. Trained with just $3 million. Alibaba's Qwen, meanwhile, cut costs by as much as 85% on its large language models in a bid to attract more developers, signaling that the race is on. China's breakthrough undermines the lead that our AI labs were once thought to have. In early 2024, former Google CEO Eric Schmidt predicted China was two to three years behind the US in AI. But now Schmidt is singing a different tune. Here he is on ABC's This Week. I used to think we were a couple of years ahead of China. China has caught up in the last six months in a way that is remarkable. The fact of the matter is that a couple of the Chinese programs, one example is called DeepSeek, looks like they've caught up. It raises major questions about just how wide OpenAI's moat really is.
Back when OpenAI released ChatGPT to the world in November of 2022, it was unprecedented and uncontested. Now the company faces not only international competition from Chinese models, but fierce domestic competition from Google's Gemini, Anthropic's Claude, and Meta's open-source Llama model. And now the game has changed. The widespread availability of powerful open-source models allows developers to skip the demanding, capital-intensive steps of building and training models themselves. Now they can build on top of existing models, making it significantly easier to jump to the frontier, that is, the front of the race, with a smaller budget and a smaller team. In the last two weeks, AI research teams have really opened their eyes and have become way more ambitious on what's possible with a lot less capital. So previously, you know, to get to the frontier, you'd have to think about hundreds of millions of dollars of investment, and perhaps a billion dollars of investment.
What DeepSeek has now done here in Silicon Valley is it's opened our eyes to what you can actually accomplish with 10, 15, 20, 30 million dollars. It also means any company like OpenAI that claims the frontier today could lose it tomorrow. That's how DeepSeek was able to catch up so quickly. It started building on the existing frontier of AI, its approach focusing on iterating on existing technology rather than reinventing the wheel. They can take a really good big model and use a process called distillation. And what distillation is, is basically you use a very large model to help your small model get smart at the thing you want it to get smart at. And that's actually very cost-efficient. It closed the gap by using available data sets, applying innovative tweaks, and leveraging existing models.
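For readers who want to see the mechanics, here is a minimal sketch of the distillation idea described above, written in PyTorch. The loss function, temperature, and weighting below are generic textbook choices for illustration, not DeepSeek's actual training recipe.

```python
# Minimal knowledge-distillation sketch (PyTorch). The teacher/student
# setup, temperature T, and mixing weight alpha are illustrative defaults,
# not any lab's actual recipe.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: push the student toward the teacher's softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random tensors (batch of 4, 10-way output).
student_logits = torch.randn(4, 10)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
print(distillation_loss(student_logits, teacher_logits, labels))
```

The cost advantage comes from the fact that the expensive teacher only has to produce outputs once, while the small student is cheap to train and to serve.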
So much so that DeepSeek's model has run into an identity crisis. It's convinced that it's ChatGPT. When you ask it directly, what model are you? DeepSeek responds: I'm an AI language model created by OpenAI, specifically based on the GPT-4 architecture. That led OpenAI CEO Sam Altman to post, in a thinly veiled shot at DeepSeek just days after the model was released: it's relatively easy to copy something that you know works; it's extremely hard to do something new, risky, and difficult when you don't know if it will work. But that's not exactly what DeepSeek did. It emulated GPT by leveraging OpenAI's existing outputs and architecture principles while quietly introducing its own enhancements, really blurring the line between itself and ChatGPT. It all puts pressure on a closed-source leader like OpenAI to justify its costlier model as more, and potentially nimbler, competitors emerge. Everybody copies everybody in this field. You can say Google did the transformer first; it's not OpenAI, and OpenAI just copied it.
Google built the first large language models; they didn't prioritize it, but OpenAI did it in a productized way. So you can say all this in many ways, it doesn't matter. So if everyone is copying one another, it raises the question: is massive spend on individual LLMs even a good investment anymore? Now, no one has as much at stake as OpenAI. The startup raised over $6 billion in its last funding round alone. But the company has yet to turn a profit, and with its core business centered on building the models, it's much more exposed than companies like Google and Amazon, who have cloud and ad businesses bankrolling their spend. For OpenAI, reasoning will be key. A model that thinks before it generates a response, going beyond pattern recognition to analyze, draw logical conclusions, and solve really complex problems. For now, the startup's o1 reasoning model is still cutting-edge. But for how long? Researchers at Berkeley showed that they could build a reasoning model for $450 just last week. So you can actually create these models that do thinking for much, much less. You don't need those huge amounts to pre-train the model, so I think the game is shifting. It means that staying on top may require as much creativity as capital. DeepSeek's breakthrough also comes at a very tricky time for the AI darling, just as OpenAI is moving to a for-profit model and facing unprecedented brain drain. Can it raise more money at ever higher valuations if the game is changing?
As Chamath Palihapitiya puts it: let me say the quiet part out loud, AI model building is a money trap. Those chip restrictions from the US government were intended to slow down the race, to keep American tech on American ground and stay ahead in the race. What we want to do is we want to keep it in this country. China is a competitor, and others are competitors. So instead, the restrictions might have been just what China needed. Necessity is the mother of invention. Because they had to go figure out workarounds, they actually ended up building something a lot more efficient. It's really remarkable the amount of progress they've made with as little capital as it's taken them to make that progress. It drove them to get creative, with huge implications. DeepSeek is an open-source model, meaning that developers have full access and they can customize its weights or fine-tune it to their liking.
It's known that once open source has caught up to or improved over closed-source software, all developers migrate to that. But key is that it's also inexpensive. The lower the cost, the more attractive it is for developers to adopt. The bottom line is, our inference cost is 10 cents per million tokens. And that's one-thirtieth of what the typical comparable model charges. And where is it going? It's that the 10 cents would lead to building apps for much lower costs. So if you wanted to build a You.com or a Perplexity or some other app, you can either pay OpenAI $4.40 per million tokens, or, if you have our model, it costs you just 10 cents. It could mean that the prevailing model in global AI may be open source, as organizations and nations come around to the idea that collaboration and decentralization can drive innovation faster and more efficiently than proprietary, closed ecosystems.
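To make that pricing gap concrete, here is a quick back-of-envelope calculation using the per-token prices quoted above; the monthly token volume is a hypothetical figure chosen only for illustration.

```python
# Back-of-envelope inference-cost comparison using the figures quoted above
# ($4.40 vs. $0.10 per million tokens). The token volume is hypothetical.
tokens = 1_000_000_000        # e.g., one billion tokens served per month
price_closed = 4.40           # $ per million tokens (quoted closed-model API price)
price_open = 0.10             # $ per million tokens (quoted open-model price)

cost_closed = tokens / 1e6 * price_closed   # $4,400
cost_open = tokens / 1e6 * price_open       # $100
print(f"closed: ${cost_closed:,.0f}  open: ${cost_open:,.0f}  "
      f"ratio: {cost_closed / cost_open:.0f}x")
```

At these prices, the same traffic costs roughly 44 times more on the closed API, which is why a cheap open-source model is so attractive to app builders.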
A cheaper, more efficient, widely adopted open-source model from China, that could lead to a major shift in dynamics. That's more dangerous, because then they get to own the mind share, the ecosystem. In other words, the adoption of a Chinese open-source model at scale could undermine US leadership while embedding China more deeply into the fabric of global tech infrastructure. There's always a point where open source can stop being open source, too, right? So the licenses are very favorable today, but they could close it. Exactly. Over time, they can always change the license. So it's important that we actually have people here in America building, and that's why it matters so much.
Another consequence of China's AI breakthrough is giving its Communist Party control of the narrative. AI models built in China are forced to adhere to a certain set of rules set by the state. They must embody core socialist values. Studies have shown that models created by Tencent and Alibaba will censor historical events like Tiananmen Square, deny human rights abuses, and filter criticism of Chinese political leaders. That contest is about whether we're going to have democratic AI, informed by democratic values, built to serve democratic purposes, or we're going to end up with autocratic AI. If developers really begin to adopt these models en masse because they're more efficient, that could have a serious ripple effect, trickle down to even consumer-facing AI applications, and influence how trustworthy those AI-generated responses from chatbots really are.
There are really only two countries right now in the world that can build this at scale, and that is the US and China. And so the consequences of, and the stakes in and around, this are just enormous. Enormous stakes, enormous consequences, and hanging in the balance, America's lead. For a topic so complex and new, we turned to an expert who's actually building in the space and is model-agnostic: Perplexity co-founder and CEO Aravind Srinivas, who you heard from throughout our piece. He sat down with me for more than 30 minutes to discuss DeepSeek and its implications, as well as Perplexity's roadmap. We think it's worth listening to that whole conversation. So here it is.
So first I want to know what the stakes are. Describe the AI race between China and the US, and what's at stake. Okay, so first of all, China has a lot of disadvantages in competing with the US. Number one is the fact that they don't get access to all the hardware that we have access to here. So they're kind of working with lower-end GPUs than us. It's almost like working with the previous generation of GPUs. And since the bigger models tend to be smarter, that naturally puts them at a disadvantage. But the flip side of this is that necessity is the mother of invention. Because they had to go figure out workarounds, they actually ended up building something a lot more efficient. It's like saying, hey, look, you guys really got to get a top-notch model, and I'm not going to give you resources, so figure something out. Unless it's mathematically possible to prove that it's impossible to do so, you can always try to come up with something more efficient. But that is likely to make them come up with a more efficient solution than America. And of course they have open-sourced it, so we can still adopt something like that here. But that kind of talent they're building to do that will become an edge for them over time, right? The leading open-source model in America is Meta's Llama family. It's really good. It's kind of like a model that you can run on your computer. But even though it got pretty close to GPT-4 and Sonnet at the time of its release, the model that was closest in quality was the giant 405B, not the 70B that you could run on your computer. And so there was still not a small, cheap, fast, efficient open-source model that rivaled the most powerful closed models from OpenAI and Anthropic. Nothing from America. Nothing from Mistral either. And then these guys come out with a crazy model that's like 10x cheaper in API pricing than GPT-4o, and 15x cheaper than Sonnet, I believe.
Really fast, 60 tokens per second. And pretty much equal or better in some benchmarks, and worse in some others, but roughly in that ballpark of 4o's quality. And they did it all with approximately just 2,048 H800 GPUs, which is actually equivalent to somewhere around 1,000 to 1,500 H100 GPUs. That's like 20 to 30x lower than the amount of GPUs that GPT-4 is usually trained on, and roughly $5 million in total compute budget. They did it with so little money and such an amazing model, gave it away for free, wrote a technical paper. It definitely makes us all question, like, okay, if we have the equivalent of DOGE for model training, this is an example of that. Right. Efficiency is what you're getting at. So a fraction of the price, a fraction of the time, dumbed-down GPUs, essentially.
What was your surprise when you understood what they had done? So my surprise was, when I actually went through the technical paper, the amount of clever solutions they came up with. First of all, they trained a mixture-of-experts model. It's not that easy to train. The main reason people find it difficult to catch up with OpenAI, especially on the MoE architecture, is that there are a lot of irregular loss spikes. The numerics are not stable, so often you've got to restart from a training checkpoint again, and
a lot of infrastructure needs to be built for that. And they came up with very clever solutions to balance that without adding additional hacks. They also figured out floating-point-8, eight-bit training, at least for some of the numerics, and they cleverly figured out which parts have to be in higher precision and which have to be in lower precision. To my knowledge, floating-point-8 training is not that well understood. Most of the training in America is still running in FP16. Maybe OpenAI and some people are trying to explore that, but it's pretty difficult to get it right.
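As a rough illustration of the mixture-of-experts routing being discussed, here is a toy top-k router in PyTorch. It shows only the generic gating idea; DeepSeek's actual architecture, its load-balancing strategy, and its FP8 numerics are far more involved and are not reproduced here.

```python
# Toy mixture-of-experts router (PyTorch): the generic top-k gating idea,
# not DeepSeek's architecture or its loss-free balancing scheme.
import torch
import torch.nn as nn

class TopKRouter(nn.Module):
    def __init__(self, d_model: int, n_experts: int, k: int = 2):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts, bias=False)
        self.k = k

    def forward(self, x: torch.Tensor):
        # x: (tokens, d_model) -> route each token to its top-k experts.
        scores = self.gate(x).softmax(dim=-1)            # (tokens, n_experts)
        weights, experts = scores.topk(self.k, dim=-1)   # (tokens, k) each
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize
        # Each token's output would be the weighted sum of its k experts.
        return weights, experts

router = TopKRouter(d_model=16, n_experts=8)
tokens = torch.randn(5, 16)   # 5 tokens, model dim 16 (toy sizes)
w, e = router(tokens)
print(e)  # which 2 of the 8 experts each token is routed to
```

The training-stability problems mentioned above come in when the gate sends too many tokens to a few experts; labs add balancing mechanisms so that all experts stay utilized.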
So because necessity is the mother of invention, because they don't have that much memory, that many GPUs, they figured out a lot of numerical stability stuff that makes their training work. And they claim in the paper that the majority of the training was stable, which means they can always rerun those training runs again, on more data or better data. And it only trained for 60 days. So that's pretty amazing. Safe to say you were surprised. So I was definitely surprised. Usually the wisdom, or, I wouldn't say myth, is that the Chinese are just good at copying, so if we stop writing research papers in America, if we stop describing the details of our infrastructure and architecture and stop open-sourcing, they're not going to be able to catch up.
But the reality is, some of the details in DeepSeek V3 are so good that I wouldn't be surprised if Meta took a look at it and incorporated some of that in Llama 4. I wouldn't necessarily say copy; it's all sharing science, engineering. But the point is, it's changing. It's not like China's a copycat. They're also innovating. We don't know exactly the data that it was trained on, right? Even though it's open source. We know some of the ways and things it was trained on, but not everything.
And there's this idea that it was trained on public ChatGPT outputs, which would mean it just was copied. But you're saying it goes beyond that. There's real innovation. They've trained on about 14.8 trillion tokens. The internet has so much ChatGPT content. If you actually go to any LinkedIn post or X post now, most of the comments are written by AI. You can just see it. People are just trying to write.
In fact, even on X there's like a Grok tweet enhancer, and on LinkedIn there's an AI enhancer. And in Google Docs and Word, there are AI tools to rewrite your stuff. So if you do something there and copy-paste it somewhere on the internet, it's naturally going to have some elements of ChatGPT in the training data, right? And there are a lot of people who don't even bother to strip away the "I'm a language model" part. They just paste it somewhere. And it's very difficult to control for this. I think xAI has spoken about this too. I wouldn't disregard their technical accomplishment just because, for some prompts like "who are you" or "which model are you," it responds with that.
It doesn't even matter, in my opinion. For a long time, we thought, and I don't know if you agreed with us, China was behind in AI. What does this do to that race? Can we say that China is catching up, or has it caught up? I mean, if you say Meta is catching up to OpenAI and Anthropic, if you make that claim, then the same claim can be made for China catching up to America. There are a lot of papers from China that have tried to replicate o1. In fact, I saw more papers from China after the o1 announcement that tried to replicate it than from America. And the amount of compute DeepSeek has access to is roughly similar to what PhD students in the US have access to. This is not meant to criticize others; even for ourselves, at Perplexity, we decided not to train models because we thought it's a very expensive thing, and we thought there's no way to catch up with the rest.
Will you incorporate DeepSeek into Perplexity? We already are beginning to use it. I think they have an API, and they have open-sourced it, so we can host it ourselves too. It's good to start using that, because it actually allows us to do a lot of the things at a lower cost. But what I'm thinking is beyond that, which is, okay, if these guys actually could train such a great model with a good team, there's no excuse anymore for companies in the US, including ourselves, to not try to do something like that. You hear a lot in public from a lot of thought leaders in generative AI, both on the research side and on the entrepreneurial side, like Elon Musk and others, saying that China can't catch up. The stakes are too big. The geopolitical stakes: whoever dominates AI is going to dominate the economy, dominate the world. It's been talked about in those massive terms.
Are you worried about what China proved it was able to do? Firstly, I don't know if Elon ever said China can't catch up. I'm paraphrasing, just the threat of China. He's only identified the threat of letting China catch up. Sam Altman has said similar things: we can't let China win. Whoever it is, you know, I think you've got to decouple what someone like Sam says from what is in his self-interest, right? Look, my point is, whatever you did to not let them catch up didn't even matter. They ended up catching up anyway. Necessity is the mother of invention, like he said. And, you know, what's more dangerous than trying to do all the things to not let them catch up, what's actually more dangerous, is that they have the best open-source model, and all the American developers are building on that. Right. That's more dangerous, because then they get to own the mind share, the ecosystem. The entire American AI ecosystem.
Look, in general, it's known that once open source has caught up to or improved over closed-source software, all developers migrate to that. Right. It's historically known, right? When Llama was being built and becoming more widely used, there was this question: should we trust Zuckerberg? But now the question is: should we trust China? That's a very... You trust open source. It's not about whether it's Zuckerberg or whoever. Does it matter, then, if it's Chinese, if it's open source? It doesn't matter in the sense that you still have full control. You run it as your own set of weights, on your own computer. You are in charge of the model. But it's not a great look for our own talent to, you know, rely on software built by others. Even if it's open source, there's always a point where open source can stop being open source, too, right? The licenses are very favorable today. But they could close it. Exactly. Over time, they can always change the license. So it's important that we actually have people here in America building, and that's why it matters so much. Look, I still think Meta will build a better model than DeepSeek, and open-source it, whatever they'll call it, Llama 4 or 3-point-something, it doesn't matter. I think what is more key is that we don't try to focus all our energy on banning them and stopping them, and instead just try to out-compete them and win. That's just the American way of doing things: just be better. And it feels like, you know, we hear a lot more about these Chinese companies who are developing in a similar way a lot more efficiently, a lot more cost-effectively, right? Yeah.
Again, look, it's hard to fake scarcity, right? If you raise $10 billion and you decide to spend 80% of it on a compute cluster, it's hard for you to come up with the exact same solution that someone with $5 million would. And there's no point, no need, to berate those who are putting in more money. They're trying to do it as fast as they can. When we say open source, there are so many different versions. Some people criticize Meta for not publishing everything, and even DeepSeek itself isn't totally transparent. Sure. Yeah. You can go to the nth degree of open source and say, I should be able to exactly replicate your training run. But first of all, how many people even have the resources to do that? And compared to that, the amount of detail they've shared in the technical report, and actually Meta did that too, by the way, Meta's Llama 3.3 technical report is incredibly detailed and very great for science, the amount of detail these people are sharing is already a lot more than what the other companies are doing right now. When you think about how much it cost DeepSeek to do this, less than $6 million, and think about what OpenAI has spent to develop GPT models, what does that mean for the closed-source model ecosystem's trajectory and momentum? What does it mean for OpenAI? I mean, it's very clear that we'll have an open-source version of 4o, or even better than that, and much cheaper than that, completely open source, out in the public sphere. Made by OpenAI? Probably not. And I don't think they care if it's not made by them. I think they've already moved to a new paradigm, called the o1 family of models. Like Ilya Sutskever came and said, pre-training is a wall, right? I mean, he didn't exactly use that word, but he clearly said pre-training is hitting a wall. Many people have said that, right? That doesn't mean scaling is dead. I think we're scaling on different dimensions now.
The amount of time the model spends thinking at test time; reinforcement learning; trying to make the model, okay, if it doesn't know what to do for a new prompt, go and reason and collect data and interact with the world, use a bunch of tools. I think that's where things are headed. And I feel like OpenAI is more focused on that right now, instead of just a bigger, better model with reasoning capacities. But didn't you say that DeepSeek is likely to turn their attention to reasoning? 100%, I think they will. And that's why I'm pretty excited about what they'll produce next. I guess then my question is sort of, what's OpenAI's moat now?
Well, I still think that no one else has produced a system similar to o1 yet, exactly. I know that there are debates about whether o1 is actually worth it. On maybe a few prompts it's really better, but most of the time it's not producing any differentiated output from Sonnet. But at least the results they showed with o3, where they had competitive coding performance, almost at an AI-software-engineer level. Isn't it just a matter of time, though, before the internet is filled with reasoning data that DeepSeek can train on? Again, it's possible. Nobody knows yet. Yeah. So, no, it's still uncertain. Right. So maybe that uncertainty is their moat, that nobody else has the same reasoning capability yet. But by the end of this year, will there be multiple players, even in the reasoning arena? I absolutely think so. So are we seeing the commoditization of large language models? I think we'll see a similar trajectory, just like how in pre-training and post-training that sort of system got commoditized, where this year will be a lot more commoditization there. I think the reasoning kind of models will go through a similar trajectory, where in the beginning one or two players will know how to do it, but over time...
Then, who knows, right? Because OpenAI could make another advancement to focus on. Correct. But it's not easy to keep using that as their moat. If advancements keep happening again and again and again, I think the meaning of the word advancement also loses some of its value. Totally. Even now, it's very difficult, right? Because there were pre-training advancements. Yeah. And then we've moved into a different phase. Yeah. So what is guaranteed to happen is: whatever models exist today, that level of reasoning, that level of multimodal capability, in like 5 to 10x cheaper models, open source, all that's going to happen. It's just a matter of time. What is unclear is whether something like a model that reasons at test time will be extremely cheap enough that we can all just run it on our phones. I think that's not clear to me yet.
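One simple way to picture the "spending more compute at test time" idea that has run through this part of the conversation is best-of-n sampling: generate several candidate answers and keep the one a scorer rates highest. The sketch below uses placeholder generate() and score() functions, not any real model API, and is only an illustration of the general technique, not o1's or R1's method.

```python
# Best-of-n sampling: a minimal illustration of trading extra test-time
# compute for answer quality. generate() and score() are hypothetical
# placeholders standing in for a language model and a verifier.
import random

def generate(prompt: str) -> str:
    # Placeholder for a model call that returns one candidate answer.
    return f"candidate answer {random.randint(0, 9999)} for: {prompt}"

def score(prompt: str, answer: str) -> float:
    # Placeholder for a verifier / reward model that rates an answer.
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    # More samples (larger n) means more test-time compute spent per query.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda a: score(prompt, a))

print(best_of_n("What is 17 * 24?"))
```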
It feels like so much of the landscape has changed with what DeepSeek was able to prove. Could you call it China's ChatGPT moment? It's possible. I mean, I think it probably gave them a lot of confidence that we're not really behind. No matter what you do to restrict our compute, we can always figure out some workarounds. And I'm sure the team feels pumped about the results. How does this change the investment landscape, the hyperscalers that are spending tens of billions of dollars a year on CapEx, they've just ramped it up huge, and OpenAI and Anthropic that are raising billions of dollars for GPUs, essentially?
What DeepSeek told us is, you don't need... you don't necessarily need that. Yeah. I mean, look, I think it's very clear that they're going to go even harder on reasoning, because they understand that whatever they were building the previous two years is getting extremely cheap, so it doesn't make sense to justify raising that amount of funding for the same proposition. Do they need the same amount of high-end GPUs, or can you reason using the lower-end ones that DeepSeek already has? Yeah, it's hard to say no until it's proven otherwise. But I guess in the spirit of moving fast, you would want to use the high-end chips, and you would want to move faster than your competitors.
I think the best talent still wants to work in the team that made it happen first. There's always some glory to who actually did this, who's the real pioneer versus who's the fast follower, right? That was like Sam Altman's tweet, a kind of veiled response to what DeepSeek has been able to do. He kind of implied that they just copied, and anyone can copy, right? Yeah, but then you can always say that everybody copies everybody in this field. You can say Google did the transformer first; it's not OpenAI, and OpenAI just copied it. Google built the first large language models.
They didn't prioritize it, but OpenAI did it in a productized way. So you can say all this in many ways; it doesn't matter. I remember asking you, like, why don't you want to build the model? Yeah, I know. That's the glory. A year later, just one year later, you are very, very smart to not engage in that extremely expensive race that has become so competitive. You have this lead now in what everyone wants to see, which is real-world applications, killer applications of generative AI. Talk a little bit about that decision and how that guided you, and where you see Perplexity going from here.
Look, one year ago, I don't even think we had something like this. This is, what, the beginning of 2024, right? I feel like we didn't even have something like Sonnet 3.5. We had GPT-4, I believe, and nobody else was able to catch up to it. But there was no multimodal, nothing. My sense was, okay, if people with way more resources and way more talent cannot catch up, it's very difficult to play that game. So let's play a different game. Anyway, people want to use these models. There's one use case of asking questions and getting accurate answers, with sources, with real-time information, accurate information. There's still a lot of work to do there outside the model: making sure the product works reliably, keeping scaling it up with usage, keeping building custom UIs.
There's just a lot of work to do, and we will focus on that, and we will benefit from all the tailwinds of models getting better and better. That's essentially what happened. In fact, I would say Sonnet 3.5 made our product so good, in the sense that if you use Sonnet 3.5 as the model choice within Perplexity, it's very difficult to find a hallucination.
I'm not saying it's impossible, but it dramatically reduced the rate of hallucinations, which meant the problem of question answering, asking a question and getting an answer, doing fact-checks, research, going and asking anything out there, because almost all the information is on the web, was such a big unlock. And that helped us grow 10x over the course of the year in terms of usage. You've made huge strides in terms of users, and we here at CNBC hear a lot from big investors who are huge fans. Jensen Huang himself, right? He mentioned it in his keynote the other night. He's a pretty regular user. He's not just saying it.
He's actually a pretty regular user. So, a year ago, we weren't even talking about monetization, because you guys were just so new and you wanted to get yourselves out there and build some scale. But now you are increasingly looking at things like an ad model, right? Yeah, we're experimenting with it. I know there's some controversy on why we should do ads, whether you can have a truthful answer engine despite having ads.
In my opinion, we've been pretty proactive in talking about it, where we said, okay, as long as the answer is always accurate, unbiased, and not corrupted by someone's advertising budget, you only get to see some sponsored questions, and even the answers to those sponsored questions are not influenced by them. And the questions are also not picked in a way that's manipulative.
Sure, there are some things that the advertisers also want, which is they want you to know about their brand, and they want you to know the best parts of their brand, just like how, if you're introducing yourself to someone, you want them to see the best parts of you. So that's all there. But you still don't have to click on a sponsored question; you can ignore it. And we are only charging them CPM right now.
So we are still not even incentivized to make you click yet. So I think, considering all this, we're actually trying to get it right long-term, instead of going the Google way of forcing you to click on links. I remember when people were talking about the commoditization of models a year ago, and it was controversial. But now it's not controversial; it's kind of like that's happening.
Yeah, you've been keeping your eye on that. It's smart. But we benefit a lot from model commoditization, except we also need to figure out something to offer to the paid users, like a more sophisticated research agent that can do multi-step reasoning, go and do, like, 15 minutes' worth of searching, and give you an analysis, an analyst type of answer. All that's going to come. All that's going to stay in the product.
Nothing changes there. But there's a ton of questions every free user asks on a day-to-day basis that need quick, fast answers. It shouldn't be slow. And all of that will be free. Whether you like it or not, it has to be free. That's what people are used to. And that means figuring out a way to make that free traffic also monetizable.
So you're not trying to change user habits. But it's interesting, because you are trying to teach new habits to advertisers. They can't have everything that they have in a Google ten-blue-links search. What's the response been from them so far? Are they willing to accept some of the trade-offs? Yeah, I mean, that's why they are trying stuff, like Intuit is working with us, and then there are many other brands.
All these people are working with us to test. They're also excited about it. Look, everyone knows, whether they like it or not, that five to ten years from now, most people are going to be asking AIs most of the things, and not a traditional search engine. Everybody understands that. So everybody wants to be early adopters of the new platforms, the new UX, and learn from it.
They're not viewing it as, okay, you guys go figure out everything else and then we'll come later. I'm smiling because it goes back perfectly to the point you made when you first sat down today, which is that necessity is the mother of all invention, right? And that's what advertisers are essentially looking at. They're saying, this field is changing. We have to learn to adapt with it.
Okay, Aravind, I took up so much of your time. Thank you so much for taking the time.