Oxford AI researchers warn that LLMs may pose a threat to scientific truth.

2023-11-24

Last November, tech giant Meta launched a large language model called Galactica, intended to assist scientists. But rather than the breakthrough Meta had hoped for, Galactica was quietly withdrawn after three days of intense criticism.


AI researchers from the Oxford Internet Institute raised concerns about the threat LLMs pose to scientific integrity in a recent paper published in the journal "Nature Human Behaviour". Brent Mittelstadt, Chris Russell, and Sandra Wachter argue that LLMs, such as those built on the GPT-3.5 architecture, are not reliable sources of truth and can generate what are commonly called "hallucinations": answers that sound plausible but are untrue.


The authors suggest changing how LLMs are used and propose treating them as "zero-shot translators": rather than relying on an LLM as a knowledge base, users should supply the relevant information themselves and direct the model to convert it into the desired output. This makes it easier to verify that the output is factually accurate and consistent with the supplied input.
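To make the contrast concrete, here is a minimal sketch of the "translator" pattern versus knowledge-base-style prompting. The function names and the generic llm_complete callable are illustrative assumptions, not code from the paper; any real chat or completion API could be plugged in.

```python
# Sketch of the "zero-shot translator" usage pattern described above.
# llm_complete is a stand-in for whatever model API is actually used.

from typing import Callable

def summarize_as_translator(source_text: str,
                            llm_complete: Callable[[str], str]) -> str:
    """Translator-style use: the user supplies the facts, the model only
    transforms them, so the output can be checked against the input."""
    prompt = (
        "Using ONLY the source text below, write a three-sentence "
        "plain-language summary. Do not add facts that are not in the source.\n\n"
        f"Source text:\n{source_text}"
    )
    return llm_complete(prompt)

def summarize_from_memory(topic: str,
                          llm_complete: Callable[[str], str]) -> str:
    """Knowledge-base-style use (what the authors caution against): the model
    is asked to recall facts itself, so errors are much harder to detect."""
    return llm_complete(f"Summarize the key findings on {topic}.")

if __name__ == "__main__":
    # Stub standing in for a real model call, so the sketch runs as-is.
    echo_model = lambda prompt: f"[model output for a prompt of {len(prompt)} chars]"
    abstract = "We measured X under condition Y and observed a 12% increase in Z."
    print(summarize_as_translator(abstract, echo_model))
```

In the translator case, every factual claim in the output should be traceable to the provided source text, which is what makes verification tractable.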


As the paper outlines, the core issue lies in the data on which these models are trained. Language models are designed to produce helpful and persuasive responses, but there is no guarantee those responses are accurate or aligned with the facts. Because their training data is drawn from large online corpora containing false statements, opinions, and fictional writing, LLMs absorb non-factual information alongside factual content.


Professor Mittelstadt emphasizes the risk of users treating LLMs as reliable sources of information. Because the models are designed to respond in a convincing, human-like way, users can be misled into believing their answers are accurate even when they have no factual basis or present a biased view.


To protect scientific truth and education from the spread of inaccurate and biased information, the authors call for clear expectations around the responsible use of LLMs. For tasks where accuracy is crucial, the paper suggests users should include the relevant factual information in their prompts.


Professor Wachter highlights the importance of responsible LLM use within the scientific community and the need to maintain confidence in factual information. The authors warn of the harm that could result if LLMs are used hastily to generate and disseminate scientific articles.


Emphasizing the need for careful consideration, Professor Russell urges people to step back from the opportunities LLMs offer and reflect on whether a given opportunity should be handed to the technology simply because it can provide it.