PanGu-Σ Description

The scaling of large language models has led to significant advances in natural language processing, understanding, and generation. This study introduces a system that uses Ascend 910 AI processors and the MindSpore framework to train a language model with over one trillion parameters (1.085T, specifically), called PanGu-Σ. The model builds on the foundation laid by PanGu-α and extends the traditional dense Transformer into a sparse model using a concept called Random Routed Experts (RRE). It was trained on 329 billion tokens using a technique known as Expert Computation and Storage Separation (ECSS), which yielded a 6.3-fold increase in training throughput through heterogeneous computing. Experiments show that PanGu-Σ achieves state-of-the-art zero-shot results on a range of downstream Chinese NLP tasks.
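The description names Random Routed Experts without explaining the mechanism. As a rough illustration only, the sketch below shows the general idea of sending tokens to experts through a fixed random mapping rather than a learned gate. It assumes routing is keyed on the token ID and balanced evenly across experts; the function names, table layout, and sizes are hypothetical and are not taken from the PanGu-Σ code.

```python
import random


def build_random_routing_table(vocab_size, num_experts, seed=0):
    """Assign every token ID to one expert, evenly, in a fixed random order.

    Assumption for illustration: the mapping is created once and never trained.
    """
    rng = random.Random(seed)
    # Start from a balanced assignment, then shuffle it so the mapping is random.
    assignments = [token_id % num_experts for token_id in range(vocab_size)]
    rng.shuffle(assignments)
    return assignments


def route_tokens(token_ids, routing_table):
    """Group sequence positions by the expert their token ID maps to."""
    buckets = {}
    for position, token_id in enumerate(token_ids):
        expert = routing_table[token_id]
        buckets.setdefault(expert, []).append(position)
    return buckets


# Hypothetical toy sizes: 16 token IDs shared across 4 experts.
table = build_random_routing_table(vocab_size=16, num_experts=4, seed=0)
buckets = route_tokens([3, 7, 3, 12], table)
```

Because the table is fixed, the same token ID always reaches the same expert, and no gating parameters need to be learned. Whether this matches the production design in detail is not stated in the description above.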

Company Details

Company: Huawei
Year Founded: 1987
Headquarters: China
Website: huawei.com

Product Details

Platforms: SaaS, On-Premises
Type of Training: Documentation