MosaicML Announces Inference API and Foundation Series for Generative AI



Today, MosaicML, the leading Generative AI infrastructure provider, announced MosaicML Inference and its foundation series of models for enterprises to build on. This new offering allows developers to deploy Generative AI models quickly and easily, at a cost 15x lower than comparable services. With the addition of inference capabilities, MosaicML now offers a complete, end-to-end solution for Generative AI training and deployment at the most efficient cost available today.

Generative AI models have quickly become a catalyst for innovation across industries from healthcare to financial services to e-commerce. However, off-the-shelf models have well-documented issues around data security, model transparency, and availability. Access to the alternative—custom Generative AI models—has been limited, until now.

“We believe that MosaicML Inference is a game-changer for Generative AI. It radically reduces the cost of serving large models and enables enterprises to do so in their own secure environments. Together with the MosaicML Foundation Series, enterprises now have more capabilities than ever before to achieve their own state-of-the-art AI without concerns about cost, scale, and security.” – Naveen Rao, CEO

Organizations are Building Custom LLMs on MosaicML

Today, organizations including Replit, Stanford, and Twelve Labs are building their own custom VLMs and LLMs on MosaicML because of the control, privacy, and cost efficiencies it affords. MosaicML customers have found that smaller models trained on their own domain-specific data perform better than large generic models like GPT-3.5, the original model behind ChatGPT.

“Using the MosaicML platform, we were able to train and deploy our Ghostwriter 2.7B LLM for code generation with our own data within a week and achieve leading results.” – Amjad Masad, CEO, Replit

MosaicML Inference Curates the Best Open Source Models

MosaicML Inference delivers flexibility and choice for developers who want to add Generative AI to their applications. Developers can deploy their own custom LLMs or choose from a curated selection of the best open source LLMs available today, including the MosaicML Foundation Series of models, Instructor-XL, Dolly, and GPT-NeoX. The cost and time advantages of MosaicML Inference are attributable to efficient ML systems engineering and optimizations that enable you to serve smaller, lightweight, domain-specific models.

MosaicML Inference offers two tiers for Generative AI developers to get started easily with their model deployments:

  1. Starter Tier: Open source models curated and hosted by MosaicML, offered as API endpoints for an easy start when adding Generative AI to applications.
  2. Enterprise Tier: Custom models developed by enterprises to address specific business use cases, with model and data fully secured in the customer’s enterprise environment.

MosaicML Foundation Series

The MosaicML Foundation Series is a set of pre-trained GPT-style models for customers to fine-tune and deploy. The LLMs in this series are in many cases higher performing than comparable open source models, with unique capabilities that go beyond GPT-4. The first set of models in the series will be open-sourced to the community starting this week.

MosaicML Inference Delivers Privacy & Control

According to a recent KPMG study of 225 US executives, two-thirds of executives believe that Generative AI will have a major impact on their business, yet nearly the same percentage say they are still one or two years away from deploying it extensively in their operations. Two of the main reasons? Concerns about cybersecurity (81%) and data privacy (78%).

In addition to unprecedented cost efficiencies, MosaicML Inference lets organizations develop and deploy their own Generative AI models with complete data privacy and control. Developers can deploy on a secure cluster hosted by MosaicML or on their infrastructure of choice, such as AWS, Oracle Cloud Infrastructure, and GCP. Developers can turn a saved model checkpoint into a secure, inexpensive API hosted within their own virtual private cloud (VPC) environment in under a minute. Inference data never leaves the secured environment of the user’s infrastructure. MosaicML Inference also offers continuous monitoring of cluster and model metrics for enterprise-grade DevOps, ensuring complete transparency for model behavior.

TalkDev Bureau
The TalkDev Bureau has five well-trained writers and journalists who are well versed in the B2B enterprise technology industry and constantly in touch with industry leaders for the latest trends, opinions, and other inputs, to bring you the best and latest in the domain.
