OctoAI Introduces OctoStack Platform to Streamline Enterprise AI Model Hosting
OctoAI has introduced the OctoStack software platform, which enables companies to host artificial intelligence models on their internal infrastructure.
Many large language models are delivered through cloud-based application programming interfaces (APIs). These models are hosted on the respective developers' infrastructure, requiring customers to send their data to that infrastructure for processing. Hosting neural networks on internal hardware eliminates the need to share data with external vendors, simplifying network security and regulatory compliance for enterprises.
OctoAI says its newly launched OctoStack platform makes it easier to host AI models on internal infrastructure. The platform runs on on-premises hardware, the major public clouds, and AI-optimized infrastructure-as-a-service platforms such as CoreWeave. OctoStack also supports multiple AI accelerators from Nvidia and Advanced Micro Devices, as well as the Inferentia chips available in Amazon Web Services.
The platform is built in part on Apache TVM, an open-source compiler framework created by OctoAI's founders. TVM simplifies the task of optimizing AI models to run on many different chips.
After creating the initial version of a neural network, developers can optimize it in various ways to improve performance. One technique is operator fusion, which combines several of a model's mathematical operations into fewer, more efficient hardware computations. Another is quantization, which lowers the numerical precision of a model's weights, reducing the compute and memory needed to produce results with little loss of accuracy.
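As a concrete illustration, the sketch below shows what symmetric int8 quantization of a weight matrix looks like in principle. This is a generic Python example, not OctoAI's implementation: each float32 value is mapped to an 8-bit integer plus a shared scale factor, shrinking storage fourfold at the cost of a small reconstruction error.

    import numpy as np

    def quantize_int8(weights):
        # Symmetric quantization: map float32 values onto the int8 range
        # [-127, 127] using a single shared scale factor.
        scale = np.abs(weights).max() / 127.0
        q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        # Recover approximate float32 values from the int8 representation.
        return q.astype(np.float32) * scale

    w = np.random.randn(4, 4).astype(np.float32)
    q, s = quantize_int8(w)
    print(np.abs(w - dequantize(q, s)).max())  # small reconstruction error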
These optimizations do not carry over cleanly from one type of hardware to another: an AI model optimized for one vendor's graphics card will not necessarily run efficiently on another chipmaker's processor. The TVM technology underpinning OctoStack automates the work of re-optimizing neural networks for each chip.
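The snippet below sketches that workflow using TVM's standard Python API: a toy model expressed in TVM's Relay intermediate representation is compiled for a chosen hardware target, and retargeting it to a different chip is a matter of changing one string. The model itself is a made-up example for illustration.

    import numpy as np
    import tvm
    from tvm import relay

    # A toy model in TVM's Relay IR: y = relu(x @ W^T)
    x = relay.var("x", shape=(1, 8), dtype="float32")
    w = relay.const(np.random.randn(4, 8).astype("float32"))
    mod = tvm.IRModule.from_expr(
        relay.Function([x], relay.nn.relu(relay.nn.dense(x, w))))

    # Compile the same model for different chips by switching the target
    # string ("llvm" targets a CPU; "cuda" would target an Nvidia GPU).
    for target in ["llvm"]:
        with tvm.transform.PassContext(opt_level=3):  # level 3 enables operator fusion
            lib = relay.build(mod, target=target)
        lib.export_library(f"model_{target}.so")  # one deployable artifact per chip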
OctoAI claims that its platform helps customers run their AI infrastructure more efficiently. According to the company, an inference environment powered by OctoStack can quadruple GPU utilization compared with an AI cluster built from scratch, and cut operating costs by 50%.
Luis Ceze, co-founder and CEO of OctoAI, said, "Building viable, future-proof generative AI applications for customers requires more than cost-effective inference in the cloud. Hardware portability, model integration, fine-tuning, optimization, load balancing: these are all full-stack problems that require comprehensive solutions."
OctoStack supports popular open-source LLMs such as Meta Platforms Inc.'s Llama and Mixtral, a mixture-of-experts model developed by startup Mistral AI. The platform can also run customers' internally developed neural networks. According to OctoAI, OctoStack lets customers roll updated AI models into the inference environment over time without significant changes to the applications those models power.
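Inference platforms commonly achieve this by decoupling applications from models behind a stable HTTP API, so that swapping in a new model is a one-field change on the client side. The sketch below illustrates the pattern; the endpoint, model name, and OpenAI-style request schema are all assumptions for illustration, not OctoStack's documented interface.

    import requests

    # Hypothetical endpoint; the OpenAI-style chat schema is an assumption,
    # not OctoStack's documented API.
    ENDPOINT = "https://octostack.internal.example.com/v1/chat/completions"

    def ask(model: str, prompt: str) -> str:
        resp = requests.post(ENDPOINT, json={
            "model": model,  # operators can point this name at an updated model
            "messages": [{"role": "user", "content": prompt}],
        })
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    # Application code stays the same when the model behind the name is updated.
    print(ask("llama-3-8b-instruct", "Summarize this quarter's results."))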