Dwarkesh Patel - Jeff Dean & Noam Shazeer – 25 years at Google: from PageRank to AGI
Published: 2025-02-12 19:56:35
Here is a summary of the conversation between Dwarkesh Patel, Jeff Dean, and Noam Shazeer:
Dwarkesh, Jeff, and Noam discuss the evolution of Google, advancements in AI, and the future of computing. Jeff Dean, Google's Chief Scientist, and Noam Shazeer, a leading AI architect, share insights from their 25 years at the company, reflecting on its growth and their contributions.
Early days at Google:
They recall Google's early days, when Jeff seemed to know everything and the company's rapid growth led to an explosion in projects. Jeff recounts how he had initially contacted them. Noam was initially reluctant to join because he assumed Google was already too big. Noam had envisioned making his money at Google so he could eventually work on AI full time, which is exactly what happened.
Moore's Law and Hardware:
As Moore's Law progressed over the last two decades, it changed how people thought about building new systems. Initially, simply waiting for hardware to improve was sufficient, but specialized computational devices like machine-learning accelerators and TPUs have become increasingly important. Hardware advancements are now focused on such specialized devices, aligning with the algorithms' need for low-precision arithmetic and efficient data movement. The shift to hardware optimized for deep learning, and the importance of co-designing algorithms and hardware, are emphasized.
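To make the data-movement point concrete, here is a back-of-envelope sketch (not from the conversation; the matrix size and byte widths are illustrative assumptions) of why lower precision raises arithmetic intensity, i.e. how many FLOPs you get per byte moved to and from memory:

```python
# Illustrative sketch: arithmetic intensity of a square matrix multiply
# at different numeric precisions. Higher FLOPs/byte means the chip's
# arithmetic units, not memory bandwidth, are the limiting factor.

def matmul_stats(n: int, bytes_per_element: int) -> tuple[int, int]:
    """FLOPs and (idealized) bytes moved for an n x n by n x n matmul."""
    flops = 2 * n ** 3                            # one multiply + one add per term
    bytes_moved = 3 * n * n * bytes_per_element   # read A and B, write C, perfect reuse
    return flops, bytes_moved

for label, width in [("fp32", 4), ("bf16", 2), ("int8", 1)]:
    flops, data = matmul_stats(4096, width)
    print(f"{label}: {flops / data:.0f} FLOPs/byte")
```

Halving the element width doubles the FLOPs available per byte of memory traffic, which is one reason accelerator designs and model formats have moved toward bf16 and int8.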
Algorithmic Advancements:
They explore the counterfactual scenario of memory costs declining faster than the cost of arithmetic. The trade-offs considered for future TPU versions are also discussed. Quantization is a key focus, enabling models to run at reduced precision.
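As a minimal sketch of what reduced-precision models involve, here is symmetric int8 weight quantization. The scaling scheme is illustrative, not any specific TPU or production implementation:

```python
import numpy as np

def quantize(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Map float32 values into int8, scaling the largest magnitude to 127."""
    scale = float(np.abs(x).max()) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

weights = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize(weights)
restored = dequantize(q, scale)
# int8 storage is 4x smaller than fp32, at the cost of a small rounding error.
print("max abs error:", np.abs(weights - restored).max())
```

The rounding error is bounded by half the scale factor, which is why quantization works well for weights whose values cluster in a narrow range.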
AI history:
Jeff's early work on backpropagation and language models is touched upon. The motivation behind building a two-trillion-token N-gram model, and its impact on translation and other applications, are explained. The development of a system that identified cat images through unsupervised learning on YouTube frames is discussed.
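A toy count-based bigram model makes concrete what Google's two-trillion-token N-gram model was doing at vastly larger scale (the corpus here is invented, and real systems add smoothing for unseen pairs):

```python
from collections import Counter, defaultdict

def train_bigrams(tokens: list[str]) -> dict[str, Counter]:
    """Count, for each token, which tokens follow it."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def prob(counts, prev: str, nxt: str) -> float:
    """P(nxt | prev) by maximum likelihood; 0.0 if prev was never seen."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

corpus = "the cat sat on the mat the cat ran".split()
model = train_bigrams(corpus)
print(prob(model, "the", "cat"))   # "the" is followed by "cat" 2 of 3 times
```

Scaling the same idea to web-scale text is largely a distributed-counting problem, which is part of why it paired naturally with Google's infrastructure.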
Google as a company:
Google is described as an "organizing the world’s information" company, broader than information retrieval. AI's role is expanding beyond retrieval to creating and synthesizing information. Jeff expresses his vision for AI to organize and create new information for people's needs, including coding, multimodal understanding, and language translation.
AI Future:
Noam envisions a quadrillion-dollar opportunity in AI, focusing on value creation. Algorithmic advancements are key, leading to continuous improvements in AI capabilities. Jeff sees an opportunity to achieve universal access to information regardless of language.
Long Context and In-Context Learning:
They discuss the challenge of merging Google Search with in-context learning. Jeff points out that models excel at accessing information within their context window but struggle with "squishy" information stored in parameters. This leads to a discussion of how to scale up models and apply inference-time compute to improve quality. He then proposes the need for efficient algorithms to handle trillions of tokens of context, potentially revolutionizing many industries.
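The mechanism behind that precise access to the context window is attention: every position can look up every other token directly, which is also why naive cost grows quadratically with context length, the scaling problem efficient long-context algorithms aim at. A toy-sized sketch (shapes and values are illustrative):

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention over a single toy context."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                   # (n, n): every query vs every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the context
    return weights @ v                              # weighted sum of value vectors

n, d = 8, 16                                        # 8 context tokens, tiny dims
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(q, k, v)
print(out.shape)   # the (n, n) score matrix is what makes long contexts costly
```

At trillions of tokens the dense (n, n) score matrix is hopeless, which is what motivates retrieval-like or sparse alternatives.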
AI as Autonomous Engineers:
Noam and Jeff explore the potential of AI as autonomous software engineers. They highlight the importance of building distributed systems and creating the next equivalent of MapReduce or TensorFlow. The efficiency with which AI can produce initial approximations reduces labor costs substantially. They then discuss the need to integrate the AIs into a big monorepo.
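For readers unfamiliar with the reference, the canonical MapReduce example is word count, sketched here in a single process; real MapReduce shards the map and reduce phases across many machines, but the programming model is the same:

```python
from collections import Counter
from itertools import chain

def map_phase(doc: str) -> list[tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in doc.split()]

def reduce_phase(pairs) -> Counter:
    """Sum the counts for each key, across all mapped pairs."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts

docs = ["to be or not to be", "be the change"]
pairs = chain.from_iterable(map_phase(d) for d in docs)
counts = reduce_phase(pairs)
print(counts)
```

The appeal of the model is that both phases are embarrassingly parallel, which is the kind of abstraction the conversation suggests AI engineers might one day invent for themselves.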
Safety and Intelligence Takeoff:
Dwarkesh suggests that models will become much more potent and dangerous as the rate of software progress increases, while Jeff believes people should try to shape and steer the way AI is deployed in the world.
Humility and Collaboration:
To build an AI that can quickly learn different skills, researchers must demonstrate humility: a willingness to drop what doesn't work. He explains that, as a result, both top-down and bottom-up incentives will be needed to promote collaborative and flexible designs.
Micro-kitchen:
Jeff and Noam describe the layout of their Google building, including a micro-kitchen where people can have lunch amid a bit of extra background noise.