Alibaba Cloud Strengthens Commitment to the Open-Source Community by Making Its 7-Billion-Parameter LLMs Available for Download

Alibaba Cloud, the digital technology and intelligence backbone of Alibaba Group, has announced its latest contribution to the open-source community by open-sourcing its 7-billion-parameter Large Language Models (LLMs), Qwen-7B and Qwen-7B-Chat, through its AI model community ModelScope and the collaborative AI platform Hugging Face.

Alibaba Cloud introduced its proprietary LLM, Tongyi Qianwen, in April this year. The model, which can generate human-like content in both Chinese and English, comes in several sizes, including versions with seven billion parameters and more. The open-source release covers the pre-trained 7-billion-parameter model, Qwen-7B, and its conversationally fine-tuned version, Qwen-7B-Chat.

In an effort to democratise AI technologies, the models’ code, weights, and documentation will be freely accessible to academics, researchers and commercial institutions worldwide. For commercial use, the models will be free for companies with fewer than 100 million monthly active users. Companies with more users can request a license from Alibaba Cloud.

“By open-sourcing our proprietary large language models, we aim to promote inclusive technologies and enable more developers and SMEs to reap the benefits of generative AI,” said Jingren Zhou, CTO of Alibaba Cloud Intelligence. “As a determined long-term champion of open-source initiatives, we hope that this open approach can also bring collective wisdom to further help open-source communities thrive.”

Qwen-7B was pre-trained on over 2 trillion tokens drawn from Chinese, English and other multilingual materials, code, and mathematics, covering general and professional fields. Its context length reaches 8K tokens. During training, the Qwen-7B-Chat model was aligned with human instructions. Both models can be deployed on cloud and on-premises infrastructure, which lets users fine-tune them and build their own high-quality generative models effectively and cost-efficiently.

The pre-trained Qwen-7B model scored 56.7 on the Massive Multitask Language Understanding (MMLU) benchmark, outperforming other major pre-trained open-source models of similar scale and even some larger ones. The benchmark assesses a text model’s multitask accuracy across 57 varied tasks, spanning fields such as elementary mathematics, computer science and law.

Moreover, Qwen-7B achieved the highest score among models of equivalent parameter count on the leaderboard of C-Eval, a comprehensive Chinese evaluation suite for foundation models. The suite covers 52 subjects in four major categories: humanities, social sciences, STEM and others. Qwen-7B also performed strongly on mathematics and code-generation benchmarks such as GSM8K and HumanEval.

Alibaba Cloud’s Qwen-7B model distinguished itself in several benchmarks

In July, Alibaba Cloud also introduced its AI image generator, Tongyi Wanxiang, which was designed to support developers and SMEs in their creative image expression. The cloud pioneer also unveiled ModelScopeGPT, a versatile framework designed to assist users in performing complex and specialised AI tasks across language, vision and speech domains by leveraging various AI models on ModelScope. Launched by Alibaba Cloud last year, ModelScope is an open-source AI model community currently featuring over 1,000 AI models contributed by 20 leading AI institutes.

TalkDev Bureau