OpenAI Secretly Developing Q*, Advancing Towards Artificial General Intelligence

2023-11-24

According to reports, OpenAI is researching a project called Q* (pronounced Q-Star) that can solve unfamiliar mathematical problems. Some people at OpenAI believe that Q* could be a significant step towards achieving Artificial General Intelligence (AGI). However, the new model has raised concerns among some AI safety researchers, especially after a demonstration of it circulated internally at OpenAI in recent weeks, according to The Information, underscoring how quickly the technology is progressing. The model was created by OpenAI's Chief Scientist Ilya Sutskever together with top researchers Jakub Pachocki and Szymon Sidor.

Interestingly, this development comes shortly after Andrej Karpathy posted on X that he has been thinking about centralization and decentralization. Karpathy's point is that building an AI system involves trade-offs between centralized and decentralized decision-making and information, and that the best results come from balancing the two. Q-learning appears to fit naturally into that framing.

What is Q-learning?

Experts believe that Q* is built on the principles of Q-learning, a fundamental concept in artificial intelligence, particularly in reinforcement learning. Q-learning is classified as a model-free reinforcement learning algorithm and is designed to learn the value of taking a given action in a given state. Its ultimate goal is to find an optimal policy that defines the best action to take in each state, maximizing the cumulative reward over time.

Q-learning is based on the Q-function, also known as the state-action value function. This function takes two inputs, a state and an action, and returns an estimate of the total expected reward from starting in that state, taking that action, and then following the optimal policy. In simple settings, Q-learning maintains a table called the Q-table, where each row represents a state and each column represents an action. The entries in this table are Q-values, which the agent updates as it learns through exploration and exploitation, using the rule Q(s, a) ← Q(s, a) + α [r + γ max_a' Q(s', a') − Q(s, a)], where α is the learning rate, γ is the discount factor, r is the reward received, and s' is the resulting state.
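To make the mechanics concrete, here is a minimal sketch of tabular Q-learning on a toy "corridor" environment. The environment, reward scheme, and hyperparameters (ALPHA, GAMMA, EPSILON, the episode count) are illustrative assumptions chosen for this example; nothing here reflects how Q* itself is implemented.

```python
import random

# Minimal tabular Q-learning on a toy 1-D corridor.
# All values below are illustrative assumptions, not details from the Q* reporting.

N_STATES = 5          # states 0..4; reaching state 4 ends the episode with a reward
ACTIONS = [0, 1]      # 0 = move left, 1 = move right
ALPHA = 0.1           # learning rate
GAMMA = 0.9           # discount factor
EPSILON = 0.1         # exploration rate for epsilon-greedy action selection

# Q-table: one row per state, one entry per action, initialized to zero.
Q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}

def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def choose_action(state):
    """Epsilon-greedy: explore with probability EPSILON, otherwise exploit."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(Q[state], key=Q[state].get)

for episode in range(500):
    state, done = 0, False
    while not done:
        action = choose_action(state)
        next_state, reward, done = step(state, action)
        # Core Q-learning update:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(Q[next_state].values())
        Q[state][action] += ALPHA * (reward + GAMMA * best_next - Q[state][action])
        state = next_state

# After training, the greedy policy should move right in every non-terminal state.
print({s: max(Q[s], key=Q[s].get) for s in range(N_STATES)})
```

An explicit Q-table like this only works when the state space is small enough to enumerate. At the scale OpenAI operates at, the table would be replaced by a neural network that approximates the Q-function, as in deep Q-learning.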