Zhou Hongyi: Sora Signals AGI Could Arrive in 1 Year Instead of 10

2024-02-18

Sora is a groundbreaking video generation model that not only produces high-quality videos but also deeply understands and simulates the real world, signaling new advances and possibilities in artificial intelligence.

On February 16th, Zhou Hongyi, the founder of 360, shared his thoughts on Sora on Weibo, stating that its emergence could shorten the timeline for realizing AGI (Artificial General Intelligence) from 10 years to one or two years.

Just the day before, on February 15th, OpenAI announced a "text-to-video" model called Sora. The model can create videos up to 60 seconds long, including highly detailed scenes, complex camera movements, and emotionally engaging characters. Even more striking, it can animate static images.

Regarding Sora's emergence, Zhou Hongyi believes that the ultimate competition in technology comes down to talent density and deep accumulation of expertise. He points out that companies with core technologies, such as OpenAI, are evidently stronger than startups. While some may think that with AI a startup only needs to be a one-person business, Zhou Hongyi finds this idea laughable.

While AI may disrupt certain industries, Zhou Hongyi believes it primarily inspires people's creativity. He notes that although machines can produce a good video, the themes, scripts, storyboarding, and dialogue coordination still require human creativity and prompts. He believes that Sora may bring significant changes to advertising, movie trailers, and the short video industry, but it is unlikely to overtake TikTok quickly. Instead, it may become a creative tool for TikTok.

Discussing Sora's greatest advantage, Zhou Hongyi states that previous text-to-video software operated on a 2D plane, manipulating graphical elements. Sora, however, can understand the real world like a human and simulate its physical laws. For example, it can comprehend the immense impact force of a tank, so it would not produce an implausible scene such as a car smashing through a tank. This is because OpenAI leverages the strengths of large language models, giving Sora the dual ability to understand and simulate the real world.

Zhou Hongyi also says that with large models as a foundation, combined with guidance from human knowledge, super tools can be created in various fields. He believes that large models will play an important role in biomedical research, protein and gene studies, as well as physics, chemistry, mathematics, and other disciplines.

He exclaims, "Once artificial intelligence connects to a camera and watches all the movies and videos on YouTube and TikTok, its understanding of the world will far surpass textual learning. This development is not far from achieving Artificial General Intelligence, possibly within one or two years."

At the same time, Zhou Hongyi acknowledges that although large models in China appear to be close to the level of GPT-3.5, there is still a gap of a year and a half compared to GPT-4. He believes that OpenAI may have some undisclosed secret weapons, such as GPT-5 or machine self-learning to generate content. Therefore, the gap between China and the United States in AI may continue to widen.