Red Hat Discusses Open Small Language Models for Responsible and Practical AI

2025-04-23

As geopolitical events reshape the world, it is no surprise that they also influence technology, and in particular the current Artificial Intelligence (AI) market, where they are changing methodologies, development approaches, and enterprise applications.

At present, expectations and reality around AI are gradually coming into balance. Significant doubts about the technology persist, but many have welcomed the arrival, even at this early stage, of models such as Llama, DeepSeek, and Baidu’s newly launched Ernie X1, which challenge the closed nature of the best-known large language models (LLMs).

In contrast, open-source development provides transparency and the ability to contribute feedback, aligning better with the concept of "responsible AI." This notion encompasses the environmental impact of large models, the ways AI is utilized, the composition of its training datasets, and issues of data sovereignty, language, and politics.

Red Hat, a company that has demonstrated the economic viability of an open-source development model for its business, aims to extend its open, collaborative, and community-driven AI approach. We recently interviewed Julio Guijarro, Red Hat’s EMEA CTO, to learn how the organization plans to harness the undeniable power of generative AI models responsibly, sustainably, and transparently, while delivering value to businesses.

Julio emphasized the need for substantial educational efforts to provide a deeper understanding of AI. He noted, “Given the significant unknowns surrounding AI’s inner workings—stemming from complex science and mathematics—it remains a 'black box' for many. The lack of transparency becomes more pronounced when AI is primarily developed in hard-to-access, closed environments.”

Moreover, challenges such as language support (European and Middle Eastern languages remain largely underserved), data sovereignty, and fundamental trust issues persist. “Data is an organization’s most valuable asset, and enterprises must be aware of the risks of exposing sensitive information to public platforms with differing privacy policies,” he added.

Red Hat's Response

Red Hat’s answer to global demand for AI is to pursue what it believes offers the greatest benefit to users, while addressing the concerns and caveats that quickly surface when organizations deploy off-the-shelf AI services.

One solution, according to Julio, lies in Small Language Models (SLMs). These can run on-premises or in hybrid clouds, using non-specialized hardware and accessing local business data. SLMs serve as compact, efficient alternatives to LLMs, designed to deliver robust performance for specific tasks while requiring significantly fewer computational resources. Some smaller cloud providers may help offload certain computing tasks, but the key is flexibility and freedom to retain critical business information close to the model when needed. This is crucial because internal organizational data evolves rapidly. “A challenge with LLMs is that they can quickly become outdated since data generation doesn’t occur in large clouds. Data arises from your environment and business processes,” he explained.
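As an illustration of that on-premises approach, here is a minimal sketch of running a compact open-weight model entirely on local, non-specialized hardware (a plain CPU). It uses the Hugging Face transformers library and a placeholder model name; neither comes from the interview, and both are assumptions for demonstration only.

```python
# Minimal sketch (illustrative only): a small open-weight model served on local
# CPU hardware, so prompts and business data never leave the environment.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="your-org/small-instruct-model",  # placeholder: any compact open model
    device=-1,  # -1 = plain CPU, no specialized GPU hardware required
)

prompt = "Summarize our internal returns policy in two sentences."
result = generator(prompt, max_new_tokens=128, do_sample=False)
print(result[0]["generated_text"])
```

Because the model and the data both stay inside the organization’s infrastructure, fast-changing internal information can be kept close to the model without ever being sent to a public platform.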

Additionally, cost is a major concern. “Your customer service LLM queries might incur hidden costs. Before AI, you knew that when querying data, the scope was limited and predictable, allowing you to calculate transaction costs. With LLMs operating iteratively, the more you use them, the better their answers get, leading to more questions and interactions—and each one comes at a price. What could have been a single query might turn into 100 depending on who uses it and how. Running models locally allows greater control since the cost scope is tied to your infrastructure, not per-query pricing.”
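To make the arithmetic behind that point concrete, here is a back-of-the-envelope sketch; every figure in it is a hypothetical placeholder, not a real price from Red Hat or any provider.

```python
# Hypothetical cost comparison (all numbers are made-up placeholders).
PRICE_PER_QUERY = 0.02         # assumed per-query price of a hosted LLM (USD)
QUERIES_PER_CASE = 100         # one question fanning out into ~100 iterative calls
CASES_PER_MONTH = 5_000

LOCAL_INFRA_PER_MONTH = 4_000  # assumed fixed monthly cost of local serving (USD)

hosted = PRICE_PER_QUERY * QUERIES_PER_CASE * CASES_PER_MONTH
print(f"Hosted, per-query pricing: ${hosted:,.0f}/month, growing with usage")
print(f"Local, fixed infrastructure: ${LOCAL_INFRA_PER_MONTH:,.0f}/month, capped by capacity")
```

The point is not the specific numbers but the shape of the two curves: per-query pricing scales with how intensively people use the model, while local infrastructure is a bounded, predictable cost.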

Organizations don’t need to prepare massive budgets for GPU procurement. Red Hat is currently optimizing models (open-source, of course) to run on standard hardware. This is achievable because the specialized models many businesses need don’t have to churn through vast, generic datasets on every query, which is what drives costs up.

“Much of the ongoing work involves people analyzing large models and removing the components that aren’t needed for a specific use case. If we want AI everywhere, small language models must lead the way. We’re also focused on supporting and improving vLLM, the inference engine project, so that people can interact with a variety of models efficiently and in a standardized way, whether locally, at the edge, or in the cloud,” said Julio.
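Since vLLM is named here, a short sketch of what standardized local inference can look like may help. The model identifier below is a placeholder, and the snippet uses vLLM’s offline Python API as an assumed example rather than anything prescribed by Red Hat.

```python
# Minimal sketch of local inference through vLLM (model name is a placeholder).
from vllm import LLM, SamplingParams

llm = LLM(model="your-org/small-instruct-model")
params = SamplingParams(temperature=0.2, max_tokens=128)

outputs = llm.generate(["What does our returns policy say about opened items?"], params)
print(outputs[0].outputs[0].text)
```

The same engine can also be run as an OpenAI-compatible HTTP server, which is what makes it possible to use one standardized interface whether the model sits locally, at the edge, or in a cloud.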

Staying Lean

Using and referencing locally relevant data means results can be tailored to specific needs. Julio cited projects in Arabic and Portuguese-speaking regions that wouldn’t have been feasible with English-centric LLMs.

Furthermore, early adopters have identified several practical issues with daily LLM usage. First is latency, which can be problematic in time-sensitive or customer-facing scenarios. It makes sense to deploy focused resources, delivering targeted results, within just one or two network hops of where they are needed.

Second is trust—an essential component of responsible AI. Red Hat advocates for open platforms, tools, and models to enhance transparency, understanding, and contributions from as many people as possible. “This matters for