PanGu-Σ Description

The scaling of large language models has led to significant advances in natural language processing, understanding, and generation. This study introduces a system that uses Ascend 910 AI processors and the MindSpore framework to train a language model with over one trillion parameters (1.085T, specifically), called PanGu-Σ. The model builds on the foundation laid by PanGu-α and transforms the traditional dense Transformer into a sparse one using a concept called Random Routed Experts (RRE). It was trained efficiently on a dataset of 329 billion tokens using a technique known as Expert Computation and Storage Separation (ECSS), which yielded a 6.3-fold increase in training throughput through heterogeneous computing. The experiments show that PanGu-Σ sets a new standard for zero-shot learning on various downstream Chinese NLP tasks.
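
The description does not spell out how Random Routed Experts work. As a rough illustration only, the sketch below assumes that each token ID is assigned to one expert by a fixed pseudo-random table rather than by a learned gate. The function names, vocabulary size, and expert count are hypothetical and are not taken from the PanGu-Σ code.

```python
import random


def build_routing_table(vocab_size, num_experts, seed=0):
    """Assign every token ID to one expert, uniformly at random, once.

    Assumption for illustration: the table is fixed after creation and
    contains no trainable parameters.
    """
    rng = random.Random(seed)
    return [rng.randrange(num_experts) for _ in range(vocab_size)]


def route(token_ids, table):
    """Group token positions by the expert their token ID maps to."""
    buckets = {}
    for position, token_id in enumerate(token_ids):
        buckets.setdefault(table[token_id], []).append(position)
    return buckets


# Hypothetical sizes, chosen only to keep the example small.
table = build_routing_table(vocab_size=1000, num_experts=4, seed=0)
buckets = route([5, 17, 5, 999], table)
```

Because the table is fixed, the same token ID always reaches the same expert, so positions 0 and 2 above (both token 5) land in the same bucket. Only the selected expert runs for each token, which is what makes such a layer sparse compared with a dense feed-forward block.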

Company Details

Company: Huawei
Year Founded: 1987
Headquarters: China
Website: huawei.com

Product Details

Platforms: SaaS, On-Premises
Type of Training: Documentation
