"AI2 Introduces OLMo: The First 'Truly Open' Large-Scale Language Model"

2024-02-02

The Allen Institute for Artificial Intelligence (AI2) has announced a groundbreaking contribution to the open-source AI ecosystem: OLMo, a large-scale language model (LLM) built with unprecedented transparency into its internal workings.

AI2 describes OLMo as the "first truly open large-scale language model," providing not only the model's code and weights but also the complete training data, training code, evaluation benchmarks, and toolkit used to develop OLMo. This level of openness allows AI researchers to delve deep into the model's construction, enhancing our understanding of large-scale language models.

OLMo's transparency addresses a common criticism of today's popular AI models: they are effectively "black boxes," trained with undisclosed methods on undisclosed datasets. As project lead Hanna Hajishirzi puts it, "Without access to training data, researchers cannot scientifically understand how a model works." OLMo finally provides that visibility.

The model is built on AI2's Dolma dataset, a 3-trillion-token open corpus for language model pretraining, released together with the code used to generate the training data. The OLMo release provides full model weights for four variants at the 7B scale, each trained on at least 2T tokens, along with inference code, training metrics, and training logs. It also includes the evaluation suite used during development, released under the Catwalk and Paloma projects, as well as more than 500 checkpoints per model, saved every 1,000 steps of training.

Moreover, OLMo is more than an academic exercise: its performance is competitive with commercial offerings. When benchmarked against models like Meta's Llama and TII's Falcon, OLMo holds up well, even surpassing them on certain natural language tasks.

OLMo's impressive debut sets AI2 on a path of iterating toward "the world's best open language model." The institute plans to keep enhancing OLMo over time with larger model sizes, additional modalities, and new capabilities.

By developing the entire OLMo framework in the open, AI2 sets a new standard for transparency in artificial intelligence research. Microsoft's Chief Scientific Officer Eric Horvitz praises this unprecedented level of openness, stating that it "will drive numerous advances in the field of AI across the global community."

The release of OLMo is not just about providing tools; it also lays the foundation for a deeper understanding of AI models. As Yann LeCun, Chief AI Scientist at Meta, points out, open foundation models play a crucial role in fostering innovation in generative AI. The vibrant communities that form around open-source projects are essential for accelerating the development of future AI technologies.

Collaborations with institutions such as the Kempner Institute at Harvard University, AMD, CSC (home of the LUMI supercomputer), the Paul G. Allen School of Computer Science & Engineering at the University of Washington, and Databricks, among other partners, were crucial for realizing OLMo. These partnerships reflect the collaborative spirit behind the project and its aim of enabling AI experimentation and innovation on a global scale.

That collaborative spirit traces back to artificial intelligence's early days as an open academic discipline. As AI adoption accelerates, projects like OLMo are crucial for ensuring that technological progress happens in the open rather than behind closed doors. If AI is to benefit humanity, we must understand how it works, and OLMo plays a powerful role in that endeavor.