Moreh Is Enhancing AI Software with AMD

2023-12-26

"Any configuration or technology that customers want to apply should be easily implemented without the involvement of any programmers. This is our goal and what sets us apart," said Junghwan Lim. Since AMD announced its MI300X and ROCm software updates, its rise in the AI market has been accelerating. The company has established partnerships with multiple AI companies to test and deliver its products. One of these companies leading the way is Moreh, based in South Korea. An early issue with AMD GPUs was their incomplete software stack, but now the company has ROCm, an alternative to CUDA. However, it is still not suitable for large GPU clusters. "Our software allows people to use AMD GPUs without any additional code, so they can run their own or larger language models without further engineering," said Junghwan Lim, Head of AI at Moreh, in an interview with AIM. Junghwan Lim previously worked as a data scientist at PUBG and at Samsung in South Korea. Currently, he is focused on developing better AI workload software at Moreh and making language models smaller and more efficient. Moreh's flagship AI software, MoAI, is positioned similar to NVIDIA's CUDA but is compatible with existing machine learning frameworks such as Meta's PyTorch, Google's TensorFlow, and even OpenAI's Triton. Currently, the company is helping AMD improve its ROCm performance. In August last year, Moreh announced that it had been using the AMD MI250 for a long time and that its performance surpassed NVIDIA. According to Moreh, when the MoAI platform drives AMD's MI250 Instinct accelerator, its GPU throughput is 116% higher than NVIDIA's A100. "We have used over 400 MI250 GPUs and some MI300X to train AI models," said Junghwan Lim. The company also plans to purchase more MI300X from AMD in the future. "If someone wants to use AMD GPUs, they can come to us without any code and run on them using our software," said Junghwan Lim. This is somewhat similar to the collaboration between Lamini and AMD, but Junghwan Lim stated that there are still differences in code requirements, as the company can also run models on their existing GPUs. The key to Moreh's success lies in its software built on top of the AMD GPU infrastructure, demonstrating performance that surpasses NVIDIA GPUs in AI model development. The MoAI platform is a comprehensive software product that is not dependent on a specific hardware vendor and supports various device backends, including AMD GPUs. "Customers want every configuration or technology to be easily implemented without the involvement of programmers. This is our goal and what sets us apart," said Junghwan Lim. "If you build AI models on a thousand GPUs, you may encounter GPU failures due to hardware or software issues," explained Junghwan Lim. "Training suddenly stops, sometimes taking hours or even days to restart." He explained that Moreh is also developing software and designing techniques to reduce this issue by parallelizing computation into different segments. Victory of Large Language Models Moreh has also completed training its own Korean LLM, which consists of 221 billion parameters and hopes to make future models open source. "We are developing something similar to GPT or Gemini, but it will be open source," said Junghwan Lim. The current model is too large to be open source, so Moreh also plans to release smaller models soon. 
"Our models will include code, weights, inference code, and everything else," emphasized Junghwan Lim, highlighting the trend of recent open source models that have some limitations. In October last year, AMD and Korean Telecom (KT) invested $22 million in Moreh's Series B funding, bringing the startup's valuation to $30 million. The company also expects its revenue to reach $30 million by the end of 2023. KT has also purchased one of the world's largest AMD GPU clusters and is using them to build AI models. "We support all of KT's clusters and cloud systems," said Junghwan Lim. KT has started focusing on the GPU cloud service provider business and also provides language model APIs, but these APIs only focus on Korean, not English. KT has been collaborating with Moreh since 2021 and claims that Moreh's technology has demonstrated superior performance in terms of speed and GPU memory capacity compared to NVIDIA's DGX. "People thought AMD GPUs were not suitable for machine learning, but the company has continuously proven them wrong," concluded Junghwan Lim.