NVIDIA Launches ChatQA: A GPT-4 Level Conversational QA Model

2024-01-22

NVIDIA researchers recently introduced ChatQA, a family of conversational question-answering (QA) models aimed at performance comparable to GPT-4. The family spans model sizes from 7B to 70B parameters. Extensive evaluation on 10 conversational QA datasets shows that the top-performing ChatQA-70B not only surpasses GPT-3.5-turbo but also reaches a level comparable to GPT-4. Notably, these results were achieved without relying on any synthetic data from ChatGPT. The ChatQA team proposed a two-stage instruction-tuning method that significantly improves the zero-shot conversational QA performance of large language models (LLMs).

To address retrieval in conversational QA, the researchers fine-tuned a dense retriever on a multi-turn QA dataset, achieving results comparable to state-of-the-art query-rewriting models while reducing deployment costs. NVIDIA demonstrated that a single-turn query retriever, fine-tuned on carefully curated conversational QA data, performs on par with state-of-the-art LLM-based query-rewriting models, without the additional computation time and potential API costs that rewriting incurs.

ChatQA also shows significant progress on questions whose answers cannot be found in the provided context. Introducing a small number of "unanswerable" samples into training was shown to substantially improve the model's ability to decline such questions. On evaluations of unanswerable cases, the leading ChatQA-70B model exhibits only a marginal performance gap compared to GPT-4.

NVIDIA is not alone in this endeavor. Several foundation models are approaching GPT-4-level capability: Google may soon release Gemini Ultra, and Mistral's CEO, Arthur Mensch, announced on French national radio that the company will release an open-source GPT-4-level model in 2024.
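The retrieval contrast described above can be illustrated with a toy sketch. This is not NVIDIA's implementation: the bag-of-words "embedding", the passages, and the hard-coded rewrite are all invented for illustration, and a real system would use a trained bi-encoder in place of `embed`. The point is the pipeline difference: query rewriting calls an LLM to turn the last turn into a standalone query before retrieving, while the ChatQA-style approach feeds the concatenated dialogue history straight to a retriever fine-tuned on multi-turn data.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for a dense encoder: lowercase bag-of-words counts.
    # Assumption: any real bi-encoder retriever would slot in here.
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, passages: list[str]) -> str:
    # Return the passage most similar to the query embedding.
    q = embed(query)
    return max(passages, key=lambda p: cosine(q, embed(p)))


# Hypothetical passage pool (invented for this example).
passages = [
    "ChatQA-70B reaches GPT-4-level accuracy on conversational QA benchmarks.",
    "The dense retriever is fine-tuned on multi-turn question answering data.",
    "Unanswerable samples teach the model to decline when no evidence exists.",
]

# A multi-turn conversation: the last user turn alone is underspecified.
history = [
    "User: Tell me about ChatQA's retriever.",
    "Assistant: It is a dense retriever.",
    "User: What data was it fine-tuned on?",
]

# Approach A (query rewriting): an LLM would rewrite the final turn into a
# standalone query; here we hard-code a plausible rewrite for illustration.
rewritten = "What data was the ChatQA dense retriever fine-tuned on?"

# Approach B (as the article describes): pass the concatenated history
# directly to the multi-turn-tuned retriever, skipping the rewrite call.
concatenated = " ".join(history)

print(retrieve(rewritten, passages))     # → the fine-tuning-data passage
print(retrieve(concatenated, passages))  # → the same passage, no rewrite step
```

Both routes land on the same passage here; the practical difference the article highlights is that approach B avoids an extra LLM call (and its latency and API cost) per user turn.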