Today, Elon Musk's artificial intelligence company xAI officially launched its latest generation of large language models, Grok 3. Compared to its predecessor, Grok 2, Grok 3 has achieved significant improvements in capabilities.
Across domains such as mathematical reasoning, scientific reasoning, and code writing, Grok 3 led a range of benchmark tests, surpassing other advanced models including DeepSeek-v3, GPT-4o, and Gemini-2 pro. It performed especially well on AIME (a competition-level mathematics benchmark) and GPQA (a set of doctoral-level physics, biology, and chemistry questions). An early version of Grok 3 also scored competitively on the crowdsourced testing platform Chatbot Arena.
Grok 3 is not a single model but a family of models with multiple versions. Among them, the smaller version, Grok 3 mini, sacrifices some accuracy for faster response times. Currently, not all versions of Grok 3 have been made available.
The development cycle for Grok 3 was significantly shortened, largely thanks to xAI’s Colossus supercomputer. Built in just eight months, Colossus supplied the bulk of the computing power for Grok 3's training. Grok 3 reportedly used 100,000 NVIDIA H100 GPUs and a total of 200 million GPU hours of training, about ten times the scale of Grok 2.
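For a sense of scale, the reported figures can be turned into a rough estimate of how long each GPU was busy. The short Python sketch below does only that arithmetic. It assumes all 100,000 GPUs ran continuously for the entire job, and that "ten times the scale" refers to GPU hours; neither assumption comes from xAI, so the results are back-of-the-envelope numbers rather than reported facts.

```python
# Figures as reported in the article.
GPU_COUNT = 100_000             # NVIDIA H100 GPUs
TOTAL_GPU_HOURS = 200_000_000   # total training time in GPU hours

# Assumption (not from the article): every GPU ran for the whole job,
# so the total divides evenly across the cluster.
hours_per_gpu = TOTAL_GPU_HOURS / GPU_COUNT
days_per_gpu = hours_per_gpu / 24

# Assumption (not from the article): "ten times the scale of Grok 2"
# is measured in GPU hours.
grok2_gpu_hours = TOTAL_GPU_HOURS / 10

print(f"{hours_per_gpu:,.0f} hours per GPU")
print(f"{days_per_gpu:.1f} days of continuous running")
print(f"{grok2_gpu_hours:,.0f} implied GPU hours for Grok 2")
```

Under these assumptions, the total works out to 2,000 hours per GPU, or a little under three months of continuous operation.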
On the software side, the xAI team also optimized Grok 3. It improved the training process and incorporated techniques such as synthetic datasets, self-correction, and reinforcement learning. Together, these techniques allow Grok 3 to handle complex tasks with higher accuracy.
Grok 3 also comes in two reasoning variants: Grok 3 Reasoning and Grok 3 mini Reasoning. Like other reasoning models, these versions fact-check their own work thoroughly before providing results, which helps them avoid common errors. In several benchmark tests, Grok 3 Reasoning outperformed other reasoning models, such as OpenAI's o3-mini high.
In addition, Grok 3 features a new function called "DeepSearch." It scans the internet and the X platform for information and answers user queries with a summary. The function is intended to give users a more convenient and efficient search experience.
Notably, the planned Grok 3 voice mode did not launch as scheduled. Elon Musk confirmed on the X platform that issues remain with the voice mode, which is expected to be released in about a week.
xAI also launched a subscription service named SuperGrok, priced at $30 per month or $300 per year. Subscribers get additional reasoning and DeepSearch queries, as well as unlimited image generation. Users subscribed to the X platform's Premium+ tier will be the first to get access to Grok 3.
Going forward, xAI plans to make the Grok 3 models and DeepSearch available through its enterprise API to reach more users. xAI also stated that it will open-source the previous version, Grok 2, once Grok 3 is mature and stable, to further promote the development of artificial intelligence technology.