Best ChatGLM Alternatives in 2026
Find the top alternatives to ChatGLM currently available. Compare ratings, reviews, pricing, and features of ChatGLM alternatives in 2026. Slashdot lists the best ChatGLM alternatives on the market that offer competing products that are similar to ChatGLM. Sort through ChatGLM alternatives below to make the best choice for your needs
-
1
DeepSeek-V3.1-Terminus
DeepSeek
FreeDeepSeek has launched DeepSeek-V3.1-Terminus, an upgrade to the V3.1 architecture that integrates user suggestions to enhance output stability, consistency, and overall agent performance. This new version significantly decreases the occurrences of mixed Chinese and English characters as well as unintended distortions, leading to a cleaner and more uniform language generation experience. Additionally, the update revamps both the code agent and search agent subsystems to deliver improved and more dependable performance across various benchmarks. DeepSeek-V3.1-Terminus is available as an open-source model, with its weights accessible on Hugging Face, making it easier for the community to leverage its capabilities. The structure of the model remains consistent with DeepSeek-V3, ensuring it is compatible with existing deployment strategies, and updated inference demonstrations are provided for users to explore. Notably, the model operates at a substantial scale of 685B parameters and supports multiple tensor formats, including FP8, BF16, and F32, providing adaptability in different environments. This flexibility allows developers to choose the most suitable format based on their specific needs and resource constraints. -
2
Baichuan-13B
Baichuan Intelligent Technology
FreeBaichuan-13B is an advanced large-scale language model developed by Baichuan Intelligent, featuring 13 billion parameters and available for open-source and commercial use, building upon its predecessor Baichuan-7B. This model has set new records for performance among similarly sized models on esteemed Chinese and English evaluation metrics. The release includes two distinct pre-training variations: Baichuan-13B-Base and Baichuan-13B-Chat. By significantly increasing the parameter count to 13 billion, Baichuan-13B enhances its capabilities, training on 1.4 trillion tokens from a high-quality dataset, which surpasses LLaMA-13B's training data by 40%. It currently holds the distinction of being the model with the most extensive training data in the 13B category, providing robust support for both Chinese and English languages, utilizing ALiBi positional encoding, and accommodating a context window of 4096 tokens for improved comprehension and generation. This makes it a powerful tool for a variety of applications in natural language processing. -
3
Hunyuan Motion 1.0
Tencent Hunyuan
Hunyuan Motion, often referred to as HY-Motion 1.0, represents an advanced AI model designed for transforming text into 3D motion, utilizing a billion-parameter Diffusion Transformer combined with flow matching techniques to create high-quality, skeleton-based animations in mere seconds. This innovative system comprehends detailed descriptions in both English and Chinese, allowing it to generate fluid and realistic motion sequences that can easily integrate into typical 3D animation workflows by exporting into formats like SMPL, SMPLH, FBX, or BVH, which are compatible with software such as Blender, Unity, Unreal Engine, and Maya. Its sophisticated training approach includes a three-phase pipeline: extensive pre-training on thousands of hours of motion data, meticulous fine-tuning on selected sequences, and reinforcement learning informed by human feedback, all of which significantly boost its capacity to interpret intricate commands and produce motion that is not only realistic but also temporally coherent. This model stands out for its ability to adapt to various animation styles and requirements, making it a versatile tool for creators in the gaming and film industries. -
4
Reka Flash 3
Reka
Reka Flash 3 is a cutting-edge multimodal AI model with 21 billion parameters, crafted by Reka AI to perform exceptionally well in tasks such as general conversation, coding, following instructions, and executing functions. This model adeptly handles and analyzes a myriad of inputs, including text, images, video, and audio, providing a versatile and compact solution for a wide range of applications. Built from the ground up, Reka Flash 3 was trained on a rich array of datasets, encompassing both publicly available and synthetic information, and it underwent a meticulous instruction tuning process with high-quality selected data to fine-tune its capabilities. The final phase of its training involved employing reinforcement learning techniques, specifically using the REINFORCE Leave One-Out (RLOO) method, which combined both model-based and rule-based rewards to significantly improve its reasoning skills. With an impressive context length of 32,000 tokens, Reka Flash 3 competes effectively with proprietary models like OpenAI's o1-mini, making it an excellent choice for applications requiring low latency or on-device processing. The model operates at full precision with a memory requirement of 39GB (fp16), although it can be efficiently reduced to just 11GB through the use of 4-bit quantization, demonstrating its adaptability for various deployment scenarios. Overall, Reka Flash 3 represents a significant advancement in multimodal AI technology, capable of meeting diverse user needs across multiple platforms. -
5
Yi-Lightning
Yi-Lightning
Yi-Lightning, a product of 01.AI and spearheaded by Kai-Fu Lee, marks a significant leap forward in the realm of large language models, emphasizing both performance excellence and cost-effectiveness. With the ability to process a context length of up to 16K tokens, it offers an attractive pricing model of $0.14 per million tokens for both inputs and outputs, making it highly competitive in the market. The model employs an improved Mixture-of-Experts (MoE) framework, featuring detailed expert segmentation and sophisticated routing techniques that enhance its training and inference efficiency. Yi-Lightning has distinguished itself across multiple fields, achieving top distinctions in areas such as Chinese language processing, mathematics, coding tasks, and challenging prompts on chatbot platforms, where it ranked 6th overall and 9th in style control. Its creation involved an extensive combination of pre-training, targeted fine-tuning, and reinforcement learning derived from human feedback, which not only enhances its performance but also prioritizes user safety. Furthermore, the model's design includes significant advancements in optimizing both memory consumption and inference speed, positioning it as a formidable contender in its field. -
6
BitNet
Microsoft
FreeMicrosoft’s BitNet b1.58 2B4T is a breakthrough in AI with its native 1-bit LLM architecture. This model has been optimized for computational efficiency, offering significant reductions in memory, energy, and latency while still achieving high performance on various AI benchmarks. It supports a range of natural language processing tasks, making it an ideal solution for scalable and cost-effective AI implementations in industries requiring fast, energy-efficient inference and robust language capabilities. -
7
PanGu-α
Huawei
PanGu-α has been created using the MindSpore framework and utilizes a powerful setup of 2048 Ascend 910 AI processors for its training. The training process employs an advanced parallelism strategy that leverages MindSpore Auto-parallel, which integrates five different parallelism dimensions—data parallelism, operation-level model parallelism, pipeline model parallelism, optimizer model parallelism, and rematerialization—to effectively distribute tasks across the 2048 processors. To improve the model's generalization, we gathered 1.1TB of high-quality Chinese language data from diverse fields for pretraining. We conduct extensive tests on PanGu-α's generation capabilities across multiple situations, such as text summarization, question answering, and dialogue generation. Additionally, we examine how varying model scales influence few-shot performance across a wide array of Chinese NLP tasks. The results from our experiments highlight the exceptional performance of PanGu-α, demonstrating its strengths in handling numerous tasks even in few-shot or zero-shot contexts, thus showcasing its versatility and robustness. This comprehensive evaluation reinforces the potential applications of PanGu-α in real-world scenarios. -
8
HappyAccounts
AICO Arena International
$1,900 one-time paymentHappyAccounts is a unique bilingual accounting solution that supports multiple currencies and offers a variety of language combinations, including Japanese-English, Chinese-English, Spanish-English, and Korean-English. This system allows companies to maintain consistency across their accounting processes while catering to their multilingual requirements. Designed specifically for global businesses, mid-sized enterprises, and subsidiaries of multinational corporations, HappyAccounts provides an extensive suite of bilingual financial and business management tools tailored to diverse operational needs. For example, a company based in Japan might utilize the Japanese interface, while its parent organization abroad could operate in English, seamlessly accessing reports in both languages. Additionally, a Japanese firm with international branches can efficiently consolidate data from overseas and access comprehensive reports in Japanese, ensuring that communication and understanding remain clear across all levels of the organization. This versatility makes HappyAccounts an essential asset for any company navigating the complexities of global commerce. -
9
Llama 2
Meta
FreeIntroducing the next iteration of our open-source large language model, this version features model weights along with initial code for the pretrained and fine-tuned Llama language models, which span from 7 billion to 70 billion parameters. The Llama 2 pretrained models have been developed using an impressive 2 trillion tokens and offer double the context length compared to their predecessor, Llama 1. Furthermore, the fine-tuned models have been enhanced through the analysis of over 1 million human annotations. Llama 2 demonstrates superior performance against various other open-source language models across multiple external benchmarks, excelling in areas such as reasoning, coding capabilities, proficiency, and knowledge assessments. For its training, Llama 2 utilized publicly accessible online data sources, while the fine-tuned variant, Llama-2-chat, incorporates publicly available instruction datasets along with the aforementioned extensive human annotations. Our initiative enjoys strong support from a diverse array of global stakeholders who are enthusiastic about our open approach to AI, including companies that have provided valuable early feedback and are eager to collaborate using Llama 2. The excitement surrounding Llama 2 signifies a pivotal shift in how AI can be developed and utilized collectively. -
10
PanGu-Σ
Huawei
Recent breakthroughs in natural language processing, comprehension, and generation have been greatly influenced by the development of large language models. This research presents a system that employs Ascend 910 AI processors and the MindSpore framework to train a language model exceeding one trillion parameters, specifically 1.085 trillion, referred to as PanGu-{\Sigma}. This model enhances the groundwork established by PanGu-{\alpha} by converting the conventional dense Transformer model into a sparse format through a method known as Random Routed Experts (RRE). Utilizing a substantial dataset of 329 billion tokens, the model was effectively trained using a strategy called Expert Computation and Storage Separation (ECSS), which resulted in a remarkable 6.3-fold improvement in training throughput through the use of heterogeneous computing. Through various experiments, it was found that PanGu-{\Sigma} achieves a new benchmark in zero-shot learning across multiple downstream tasks in Chinese NLP, showcasing its potential in advancing the field. This advancement signifies a major leap forward in the capabilities of language models, illustrating the impact of innovative training techniques and architectural modifications. -
11
ChatGPT Plus
OpenAI
$20 per month 1 RatingWe have developed a model known as ChatGPT that engages users in dialogue. This conversational structure allows ChatGPT to effectively respond to follow-up inquiries, acknowledge errors, question faulty assumptions, and decline unsuitable requests. InstructGPT, a related model, focuses on adhering to specific instructions given in prompts and delivering comprehensive answers. ChatGPT Plus is a premium subscription service designed for ChatGPT, the conversational AI. The subscription costs $20 per month, offering subscribers several advantages: - Uninterrupted access to ChatGPT, even during high-demand periods - Accelerated response times - Access to GPT-4 - Integration of ChatGPT plugins - Capability for web-browsing with ChatGPT - Priority for new features and enhancements Currently, ChatGPT Plus is accessible to users in the United States, with plans to gradually invite individuals from our waitlist in the upcoming weeks. We also aim to broaden access and support to more countries and regions in the near future, ensuring that a wider audience can experience its benefits. -
12
GigaChat 3 Ultra
Sberbank
FreeGigaChat 3 Ultra redefines open-source scale by delivering a 702B-parameter frontier model purpose-built for Russian and multilingual understanding. Designed with a modern MoE architecture, it achieves the reasoning strength of giant dense models while using only a fraction of active parameters per generation step. Its massive 14T-token training corpus includes natural human text, curated multilingual sources, extensive STEM materials, and billions of high-quality synthetic examples crafted to boost logic, math, and programming skills. This model is not a derivative or retrained foreign LLM—it is a ground-up build engineered to capture cultural nuance, linguistic accuracy, and reliable long-context performance. GigaChat 3 Ultra integrates seamlessly with open-source tooling like vLLM, sglang, DeepSeek-class architectures, and HuggingFace-based training stacks. It supports advanced capabilities including a code interpreter, improved chat template, memory system, contextual search reformulation, and 128K context windows. Benchmarking shows clear improvements over previous GigaChat generations and competitive results against global leaders in coding, reasoning, and cross-domain tasks. Overall, GigaChat 3 Ultra empowers teams to explore frontier-scale AI without sacrificing transparency, customizability, or ecosystem compatibility. -
13
Kimi K2 Thinking
Moonshot AI
FreeKimi K2 Thinking is a sophisticated open-source reasoning model created by Moonshot AI, specifically tailored for intricate, multi-step workflows where it effectively combines chain-of-thought reasoning with tool utilization across numerous sequential tasks. Employing a cutting-edge mixture-of-experts architecture, the model encompasses a staggering total of 1 trillion parameters, although only around 32 billion parameters are utilized during each inference, which enhances efficiency while retaining significant capability. It boasts a context window that can accommodate up to 256,000 tokens, allowing it to process exceptionally long inputs and reasoning sequences without sacrificing coherence. Additionally, it features native INT4 quantization, which significantly cuts down inference latency and memory consumption without compromising performance. Designed with agentic workflows in mind, Kimi K2 Thinking is capable of autonomously invoking external tools, orchestrating sequential logic steps—often involving around 200-300 tool calls in a single chain—and ensuring consistent reasoning throughout the process. Its robust architecture makes it an ideal solution for complex reasoning tasks that require both depth and efficiency. -
14
ERNIE 3.0 Titan
Baidu
Pre-trained language models have made significant strides, achieving top-tier performance across multiple Natural Language Processing (NLP) applications. The impressive capabilities of GPT-3 highlight how increasing the scale of these models can unlock their vast potential. Recently, a comprehensive framework known as ERNIE 3.0 was introduced to pre-train large-scale models enriched with knowledge, culminating in a model boasting 10 billion parameters. This iteration of ERNIE 3.0 has surpassed the performance of existing leading models in a variety of NLP tasks. To further assess the effects of scaling, we have developed an even larger model called ERNIE 3.0 Titan, which consists of up to 260 billion parameters and is built on the PaddlePaddle platform. Additionally, we have implemented a self-supervised adversarial loss alongside a controllable language modeling loss, enabling ERNIE 3.0 Titan to produce texts that are both reliable and modifiable, thus pushing the boundaries of what these models can achieve. This approach not only enhances the model's capabilities but also opens new avenues for research in text generation and control. -
15
DeepSeek R2
DeepSeek
FreeDeepSeek R2 is the highly awaited successor to DeepSeek R1, an innovative AI reasoning model that made waves when it was introduced in January 2025 by the Chinese startup DeepSeek. This new version builds on the remarkable achievements of R1, which significantly altered the AI landscape by providing cost-effective performance comparable to leading models like OpenAI’s o1. R2 is set to offer a substantial upgrade in capabilities, promising impressive speed and reasoning abilities akin to that of a human, particularly in challenging areas such as complex coding and advanced mathematics. By utilizing DeepSeek’s cutting-edge Mixture-of-Experts architecture along with optimized training techniques, R2 is designed to surpass the performance of its predecessor while keeping computational demands low. Additionally, there are expectations that this model may broaden its reasoning skills to accommodate languages beyond just English, potentially increasing its global usability. The anticipation surrounding R2 highlights the ongoing evolution of AI technology and its implications for various industries. -
16
Megatron-Turing
NVIDIA
The Megatron-Turing Natural Language Generation model (MT-NLG) stands out as the largest and most advanced monolithic transformer model for the English language, boasting an impressive 530 billion parameters. This 105-layer transformer architecture significantly enhances the capabilities of previous leading models, particularly in zero-shot, one-shot, and few-shot scenarios. It exhibits exceptional precision across a wide range of natural language processing tasks, including completion prediction, reading comprehension, commonsense reasoning, natural language inference, and word sense disambiguation. To foster further research on this groundbreaking English language model and to allow users to explore and utilize its potential in various language applications, NVIDIA has introduced an Early Access program for its managed API service dedicated to the MT-NLG model. This initiative aims to facilitate experimentation and innovation in the field of natural language processing. -
17
Qwen2.5-Max
Alibaba
FreeQwen2.5-Max is an advanced Mixture-of-Experts (MoE) model created by the Qwen team, which has been pretrained on an extensive dataset of over 20 trillion tokens and subsequently enhanced through methods like Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). Its performance in evaluations surpasses that of models such as DeepSeek V3 across various benchmarks, including Arena-Hard, LiveBench, LiveCodeBench, and GPQA-Diamond, while also achieving strong results in other tests like MMLU-Pro. This model is available through an API on Alibaba Cloud, allowing users to easily integrate it into their applications, and it can also be interacted with on Qwen Chat for a hands-on experience. With its superior capabilities, Qwen2.5-Max represents a significant advancement in AI model technology. -
18
GLM-4.7-FlashX
Z.ai
$0.07 per 1M tokensGLM-4.7 FlashX is an efficient and rapid iteration of the GLM-4.7 large language model developed by Z.ai, designed to effectively handle real-time AI applications in both English and Chinese while maintaining the essential features of the larger GLM-4.7 family in a more resource-efficient format. This model stands alongside its counterparts, GLM-4.7 and GLM-4.7 Flash, providing enhanced coding capabilities and superior language comprehension with quicker response times and reduced resource requirements, making it ideal for situations that demand swift inference without extensive infrastructure. As a member of the GLM-4.7 series, it benefits from the model’s inherent advantages in programming, multi-step reasoning, and strong conversational skills, and it also accommodates long contexts for intricate tasks, all while being lightweight enough for deployment in environments with limited computational resources. This combination of speed and efficiency allows developers to leverage its capabilities in a wide range of applications, ensuring optimal performance in diverse scenarios. -
19
Olmo 3
Ai2
FreeOlmo 3 represents a comprehensive family of open models featuring variations with 7 billion and 32 billion parameters, offering exceptional capabilities in base performance, reasoning, instruction, and reinforcement learning, while also providing transparency throughout the model development process, which includes access to raw training datasets, intermediate checkpoints, training scripts, extended context support (with a window of 65,536 tokens), and provenance tools. The foundation of these models is built upon the Dolma 3 dataset, which comprises approximately 9 trillion tokens and utilizes a careful blend of web content, scientific papers, programming code, and lengthy documents; this thorough pre-training, mid-training, and long-context approach culminates in base models that undergo post-training enhancements through supervised fine-tuning, preference optimization, and reinforcement learning with accountable rewards, resulting in the creation of the Think and Instruct variants. Notably, the 32 billion Think model has been recognized as the most powerful fully open reasoning model to date, demonstrating performance that closely rivals that of proprietary counterparts in areas such as mathematics, programming, and intricate reasoning tasks, thereby marking a significant advancement in open model development. This innovation underscores the potential for open-source models to compete with traditional, closed systems in various complex applications. -
20
Samsung Gauss
Samsung
Samsung Gauss is an innovative AI model crafted by Samsung Electronics, designed to serve as a large language model that has been trained on an extensive array of text and code. This advanced model is capable of producing coherent text, translating various languages, creating diverse forms of artistic content, and providing informative answers to a wide range of inquiries. Although Samsung Gauss is still being refined, it has already demonstrated proficiency in a variety of tasks, such as: Following directives and fulfilling requests with careful consideration. Offering thorough and insightful responses to questions, regardless of their complexity or peculiarity. Crafting different types of creative outputs, which include poems, programming code, scripts, musical compositions, emails, and letters. To illustrate its capabilities, Samsung Gauss can translate text among numerous languages, including English, French, German, Spanish, Chinese, Japanese, and Korean, while also generating functional code tailored to specific programming needs. Ultimately, as development continues, the potential applications of Samsung Gauss are bound to expand even further. -
21
Mistral 7B
Mistral AI
FreeMistral 7B is a language model with 7.3 billion parameters that demonstrates superior performance compared to larger models such as Llama 2 13B on a variety of benchmarks. It utilizes innovative techniques like Grouped-Query Attention (GQA) for improved inference speed and Sliding Window Attention (SWA) to manage lengthy sequences efficiently. Released under the Apache 2.0 license, Mistral 7B is readily available for deployment on different platforms, including both local setups and prominent cloud services. Furthermore, a specialized variant known as Mistral 7B Instruct has shown remarkable capabilities in following instructions, outperforming competitors like Llama 2 13B Chat in specific tasks. This versatility makes Mistral 7B an attractive option for developers and researchers alike. -
22
Sparrow
DeepMind
Sparrow serves as a research prototype and a demonstration project aimed at enhancing the training of dialogue agents to be more effective, accurate, and safe. By instilling these attributes within a generalized dialogue framework, Sparrow improves our insights into creating agents that are not only safer but also more beneficial, with the long-term ambition of contributing to the development of safer and more effective artificial general intelligence (AGI). Currently, Sparrow is not available for public access. The task of training conversational AI presents unique challenges, particularly due to the complexities involved in defining what constitutes a successful dialogue. To tackle this issue, we utilize a method of reinforcement learning (RL) that incorporates feedback from individuals, which helps us understand their preferences regarding the usefulness of different responses. By presenting participants with various model-generated answers to identical questions, we gather their opinions on which responses they find most appealing, thus refining our training process. This feedback loop is crucial for enhancing the performance and reliability of dialogue agents. -
23
Orpheus TTS
Canopy Labs
Canopy Labs has unveiled Orpheus, an innovative suite of advanced speech large language models (LLMs) aimed at achieving human-like speech generation capabilities. Utilizing the Llama-3 architecture, these models have been trained on an extensive dataset comprising over 100,000 hours of English speech, allowing them to generate speech that exhibits natural intonation, emotional depth, and rhythmic flow that outperforms existing high-end closed-source alternatives. Orpheus also features zero-shot voice cloning, enabling users to mimic voices without any need for prior fine-tuning, and provides easy-to-use tags for controlling emotion and intonation. The models are engineered for low latency, achieving approximately 200ms streaming latency for real-time usage, which can be further decreased to around 100ms when utilizing input streaming. Canopy Labs has made available both pre-trained and fine-tuned models with 3 billion parameters under the flexible Apache 2.0 license, with future intentions to offer smaller models with 1 billion, 400 million, and 150 million parameters to cater to devices with limited resources. This strategic move is expected to broaden accessibility and application potential across various platforms and use cases. -
24
Phi-2
Microsoft
We are excited to announce the launch of Phi-2, a language model featuring 2.7 billion parameters that excels in reasoning and language comprehension, achieving top-tier results compared to other base models with fewer than 13 billion parameters. In challenging benchmarks, Phi-2 competes with and often surpasses models that are up to 25 times its size, a feat made possible by advancements in model scaling and meticulous curation of training data. Due to its efficient design, Phi-2 serves as an excellent resource for researchers interested in areas such as mechanistic interpretability, enhancing safety measures, or conducting fine-tuning experiments across a broad spectrum of tasks. To promote further exploration and innovation in language modeling, Phi-2 has been integrated into the Azure AI Studio model catalog, encouraging collaboration and development within the research community. Researchers can leverage this model to unlock new insights and push the boundaries of language technology. -
25
Wan2.1 represents an innovative open-source collection of sophisticated video foundation models aimed at advancing the frontiers of video creation. This state-of-the-art model showcases its capabilities in a variety of tasks, such as Text-to-Video, Image-to-Video, Video Editing, and Text-to-Image, achieving top-tier performance on numerous benchmarks. Designed for accessibility, Wan2.1 is compatible with consumer-grade GPUs, allowing a wider range of users to utilize its features, and it accommodates multiple languages, including both Chinese and English for text generation. The model's robust video VAE (Variational Autoencoder) guarantees impressive efficiency along with superior preservation of temporal information, making it particularly well-suited for producing high-quality video content. Its versatility enables applications in diverse fields like entertainment, marketing, education, and beyond, showcasing the potential of advanced video technologies.
-
26
QwQ-32B
Alibaba
FreeThe QwQ-32B model, created by Alibaba Cloud's Qwen team, represents a significant advancement in AI reasoning, aimed at improving problem-solving skills. Boasting 32 billion parameters, it rivals leading models such as DeepSeek's R1, which contains 671 billion parameters. This remarkable efficiency stems from its optimized use of parameters, enabling QwQ-32B to tackle complex tasks like mathematical reasoning, programming, and other problem-solving scenarios while consuming fewer resources. It can handle a context length of up to 32,000 tokens, making it adept at managing large volumes of input data. Notably, QwQ-32B is available through Alibaba's Qwen Chat service and is released under the Apache 2.0 license, which fosters collaboration and innovation among AI developers. With its cutting-edge features, QwQ-32B is poised to make a substantial impact in the field of artificial intelligence. -
27
Dolly
Databricks
FreeDolly is an economical large language model that surprisingly demonstrates a notable level of instruction-following abilities similar to those seen in ChatGPT. While the Alpaca team's research revealed that cutting-edge models could be encouraged to excel in high-quality instruction adherence, our findings indicate that even older open-source models with earlier architectures can display remarkable behaviors when fine-tuned on a modest set of instructional training data. By utilizing an existing open-source model with 6 billion parameters from EleutherAI, Dolly has been slightly adjusted to enhance its ability to follow instructions, showcasing skills like brainstorming and generating text that were absent in its original form. This approach not only highlights the potential of older models but also opens new avenues for leveraging existing technologies in innovative ways. -
28
Kimi K2
Moonshot AI
FreeKimi K2 represents a cutting-edge series of open-source large language models utilizing a mixture-of-experts (MoE) architecture, with a staggering 1 trillion parameters in total and 32 billion activated parameters tailored for optimized task execution. Utilizing the Muon optimizer, it has been trained on a substantial dataset of over 15.5 trillion tokens, with its performance enhanced by MuonClip’s attention-logit clamping mechanism, resulting in remarkable capabilities in areas such as advanced knowledge comprehension, logical reasoning, mathematics, programming, and various agentic operations. Moonshot AI offers two distinct versions: Kimi-K2-Base, designed for research-level fine-tuning, and Kimi-K2-Instruct, which is pre-trained for immediate applications in chat and tool interactions, facilitating both customized development and seamless integration of agentic features. Comparative benchmarks indicate that Kimi K2 surpasses other leading open-source models and competes effectively with top proprietary systems, particularly excelling in coding and intricate task analysis. Furthermore, it boasts a generous context length of 128 K tokens, compatibility with tool-calling APIs, and support for industry-standard inference engines, making it a versatile option for various applications. The innovative design and features of Kimi K2 position it as a significant advancement in the field of artificial intelligence language processing. -
29
GLM-4.5-Air
Z.ai
FreeZ.ai serves as a versatile, complimentary AI assistant that integrates presentations, writing, and coding into a seamless conversational platform. By harnessing the power of advanced language models, it enables users to create sophisticated slide decks with AI-generated slides, produce high-quality text for various purposes such as emails, reports, and blogs, and even write or troubleshoot intricate code. In addition to content generation, Z.ai excels in conducting thorough research and information retrieval, allowing users to collect data, condense lengthy documents, and break through writer's block, while its coding assistant can clarify code snippets, optimize functions, or generate scripts from the ground up. The user-friendly chat interface eliminates the need for extensive training; you simply communicate your requirements—be it a strategic presentation, marketing content, or a script for data analysis—and receive immediate, contextually pertinent outcomes. With capabilities that extend to multiple languages, including Chinese, as well as native function invocation and support for an extensive 128K token context, Z.ai is equipped to facilitate everything from idea generation to the automation of tedious writing or coding tasks, making it an invaluable tool for professionals across various fields. Its comprehensive approach ensures that users can navigate complex projects with ease and efficiency. -
30
Qwen2-VL
Alibaba
FreeQwen2-VL represents the most advanced iteration of vision-language models within the Qwen family, building upon the foundation established by Qwen-VL. This enhanced model showcases remarkable capabilities, including: Achieving cutting-edge performance in interpreting images of diverse resolutions and aspect ratios, with Qwen2-VL excelling in visual comprehension tasks such as MathVista, DocVQA, RealWorldQA, and MTVQA, among others. Processing videos exceeding 20 minutes in length, enabling high-quality video question answering, engaging dialogues, and content creation. Functioning as an intelligent agent capable of managing devices like smartphones and robots, Qwen2-VL utilizes its sophisticated reasoning and decision-making skills to perform automated tasks based on visual cues and textual commands. Providing multilingual support to accommodate a global audience, Qwen2-VL can now interpret text in multiple languages found within images, extending its usability and accessibility to users from various linguistic backgrounds. This wide-ranging capability positions Qwen2-VL as a versatile tool for numerous applications across different fields. -
31
Llama 4 Maverick
Meta
FreeLlama 4 Maverick is a cutting-edge multimodal AI model with 17 billion active parameters and 128 experts, setting a new standard for efficiency and performance. It excels in diverse domains, outperforming other models such as GPT-4o and Gemini 2.0 Flash in coding, reasoning, and image-related tasks. Llama 4 Maverick integrates both text and image processing seamlessly, offering enhanced capabilities for complex tasks such as visual question answering, content generation, and problem-solving. The model’s performance-to-cost ratio makes it an ideal choice for businesses looking to integrate powerful AI into their operations without the hefty resource demands. -
32
CodeGemma
Google
CodeGemma represents an impressive suite of efficient and versatile models capable of tackling numerous coding challenges, including middle code completion, code generation, natural language processing, mathematical reasoning, and following instructions. It features three distinct model types: a 7B pre-trained version designed for code completion and generation based on existing code snippets, a 7B variant fine-tuned for translating natural language queries into code and adhering to instructions, and an advanced 2B pre-trained model that offers code completion speeds up to twice as fast. Whether you're completing lines, developing functions, or crafting entire segments of code, CodeGemma supports your efforts, whether you're working in a local environment or leveraging Google Cloud capabilities. With training on an extensive dataset comprising 500 billion tokens predominantly in English, sourced from web content, mathematics, and programming languages, CodeGemma not only enhances the syntactical accuracy of generated code but also ensures its semantic relevance, thereby minimizing mistakes and streamlining the debugging process. This powerful tool continues to evolve, making coding more accessible and efficient for developers everywhere. -
33
Qwen3.5
Alibaba
FreeQwen3.5 represents a major advancement in open-weight multimodal AI models, engineered to function as a native vision-language agent system. Its flagship model, Qwen3.5-397B-A17B, leverages a hybrid architecture that fuses Gated DeltaNet linear attention with a high-sparsity mixture-of-experts framework, allowing only 17 billion parameters to activate during inference for improved speed and cost efficiency. Despite its sparse activation, the full 397-billion-parameter model achieves competitive performance across reasoning, coding, multilingual benchmarks, and complex agent evaluations. The hosted Qwen3.5-Plus version supports a one-million-token context window and includes built-in tool use for search, code interpretation, and adaptive reasoning. The model significantly expands multilingual coverage to 201 languages and dialects while improving encoding efficiency with a larger vocabulary. Native multimodal training enables strong performance in image understanding, video processing, document analysis, and spatial reasoning tasks. Its infrastructure includes FP8 precision pipelines and heterogeneous parallelism to boost throughput and reduce memory consumption. Reinforcement learning at scale enhances multi-step planning and general agent behavior across text and multimodal environments. Overall, Qwen3.5 positions itself as a high-efficiency foundation for autonomous digital agents capable of reasoning, searching, coding, and interacting with complex environments. -
34
EXAONE Deep
LG
FreeEXAONE Deep represents a collection of advanced language models that are enhanced for reasoning, created by LG AI Research, and come in sizes of 2.4 billion, 7.8 billion, and 32 billion parameters. These models excel in a variety of reasoning challenges, particularly in areas such as mathematics and coding assessments. Significantly, the EXAONE Deep 2.4B model outshines other models of its size, while the 7.8B variant outperforms both open-weight models of similar dimensions and the proprietary reasoning model known as OpenAI o1-mini. Furthermore, the EXAONE Deep 32B model competes effectively with top-tier open-weight models in the field. The accompanying repository offers extensive documentation that includes performance assessments, quick-start guides for leveraging EXAONE Deep models with the Transformers library, detailed explanations of quantized EXAONE Deep weights formatted in AWQ and GGUF, as well as guidance on how to run these models locally through platforms like llama.cpp and Ollama. Additionally, this resource serves to enhance user understanding and accessibility to the capabilities of EXAONE Deep models. -
35
mT5
Google
FreeThe multilingual T5 (mT5) is a highly versatile pretrained text-to-text transformer model, developed using a methodology akin to that of T5. This repository serves as a resource for replicating the findings outlined in the mT5 research paper. mT5 has been trained on the extensive mC4 corpus, which encompasses 101 different languages, including but not limited to Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, and many others. This impressive range of languages makes mT5 a valuable tool for multilingual applications across various fields. -
36
Qwen2
Alibaba
FreeQwen2 represents a collection of extensive language models crafted by the Qwen team at Alibaba Cloud. This series encompasses a variety of models, including base and instruction-tuned versions, with parameters varying from 0.5 billion to an impressive 72 billion, showcasing both dense configurations and a Mixture-of-Experts approach. The Qwen2 series aims to outperform many earlier open-weight models, including its predecessor Qwen1.5, while also striving to hold its own against proprietary models across numerous benchmarks in areas such as language comprehension, generation, multilingual functionality, programming, mathematics, and logical reasoning. Furthermore, this innovative series is poised to make a significant impact in the field of artificial intelligence, offering enhanced capabilities for a diverse range of applications. -
37
GLM-4.5
Z.ai
Z.ai has unveiled its latest flagship model, GLM-4.5, which boasts an impressive 355 billion total parameters (with 32 billion active) and is complemented by the GLM-4.5-Air variant, featuring 106 billion total parameters (12 billion active), designed to integrate sophisticated reasoning, coding, and agent-like functions into a single framework. This model can switch between a "thinking" mode for intricate, multi-step reasoning and tool usage and a "non-thinking" mode that facilitates rapid responses, accommodating a context length of up to 128K tokens and enabling native function invocation. Accessible through the Z.ai chat platform and API, and with open weights available on platforms like HuggingFace and ModelScope, GLM-4.5 is adept at processing a wide range of inputs for tasks such as general problem solving, common-sense reasoning, coding from the ground up or within existing frameworks, as well as managing comprehensive workflows like web browsing and slide generation. The architecture is underpinned by a Mixture-of-Experts design, featuring loss-free balance routing, grouped-query attention mechanisms, and an MTP layer that facilitates speculative decoding, ensuring it meets enterprise-level performance standards while remaining adaptable to various applications. As a result, GLM-4.5 sets a new benchmark for AI capabilities across numerous domains. -
38
StarCoder
BigCode
FreeStarCoder and StarCoderBase represent advanced Large Language Models specifically designed for code, developed using openly licensed data from GitHub, which encompasses over 80 programming languages, Git commits, GitHub issues, and Jupyter notebooks. In a manner akin to LLaMA, we constructed a model with approximately 15 billion parameters trained on a staggering 1 trillion tokens. Furthermore, we tailored the StarCoderBase model with 35 billion Python tokens, leading to the creation of what we now refer to as StarCoder. Our evaluations indicated that StarCoderBase surpasses other existing open Code LLMs when tested against popular programming benchmarks and performs on par with or even exceeds proprietary models like code-cushman-001 from OpenAI, the original Codex model that fueled early iterations of GitHub Copilot. With an impressive context length exceeding 8,000 tokens, the StarCoder models possess the capability to handle more information than any other open LLM, thus paving the way for a variety of innovative applications. This versatility is highlighted by our ability to prompt the StarCoder models through a sequence of dialogues, effectively transforming them into dynamic technical assistants that can provide support in diverse programming tasks. -
39
Alpaca
Stanford Center for Research on Foundation Models (CRFM)
Instruction-following models like GPT-3.5 (text-DaVinci-003), ChatGPT, Claude, and Bing Chat have seen significant advancements in their capabilities, leading to a rise in their usage among individuals in both personal and professional contexts. Despite their growing popularity and integration into daily tasks, these models are not without their shortcomings, as they can sometimes disseminate inaccurate information, reinforce harmful stereotypes, and use inappropriate language. To effectively tackle these critical issues, it is essential for researchers and scholars to become actively involved in exploring these models further. However, conducting research on instruction-following models within academic settings has posed challenges due to the unavailability of models with comparable functionality to proprietary options like OpenAI’s text-DaVinci-003. In response to this gap, we are presenting our insights on an instruction-following language model named Alpaca, which has been fine-tuned from Meta’s LLaMA 7B model, aiming to contribute to the discourse and development in this field. This initiative represents a step towards enhancing the understanding and capabilities of instruction-following models in a more accessible manner for researchers. -
40
GLM-4.1V
Zhipu AI
FreeGLM-4.1V is an advanced vision-language model that offers a robust and streamlined multimodal capability for reasoning and understanding across various forms of media, including images, text, and documents. The 9-billion-parameter version, known as GLM-4.1V-9B-Thinking, is developed on the foundation of GLM-4-9B and has been improved through a unique training approach that employs Reinforcement Learning with Curriculum Sampling (RLCS). This model accommodates a context window of 64k tokens and can process high-resolution inputs, supporting images up to 4K resolution with any aspect ratio, which allows it to tackle intricate tasks such as optical character recognition, image captioning, chart and document parsing, video analysis, scene comprehension, and GUI-agent workflows, including the interpretation of screenshots and recognition of UI elements. In benchmark tests conducted at the 10 B-parameter scale, GLM-4.1V-9B-Thinking demonstrated exceptional capabilities, achieving the highest performance on 23 out of 28 evaluated tasks. Its advancements signify a substantial leap forward in the integration of visual and textual data, setting a new standard for multimodal models in various applications. -
41
Phi-4-mini-reasoning
Microsoft
Phi-4-mini-reasoning is a transformer-based language model with 3.8 billion parameters, specifically designed to excel in mathematical reasoning and methodical problem-solving within environments that have limited computational capacity or latency constraints. Its optimization stems from fine-tuning with synthetic data produced by the DeepSeek-R1 model, striking a balance between efficiency and sophisticated reasoning capabilities. With training that encompasses over one million varied math problems, ranging in complexity from middle school to Ph.D. level, Phi-4-mini-reasoning demonstrates superior performance to its base model in generating lengthy sentences across multiple assessments and outshines larger counterparts such as OpenThinker-7B, Llama-3.2-3B-instruct, and DeepSeek-R1. Equipped with a 128K-token context window, it also facilitates function calling, which allows for seamless integration with various external tools and APIs. Moreover, Phi-4-mini-reasoning can be quantized through the Microsoft Olive or Apple MLX Framework, enabling its deployment on a variety of edge devices, including IoT gadgets, laptops, and smartphones. Its design not only enhances user accessibility but also expands the potential for innovative applications in mathematical fields. -
42
Hunyuan T1
Tencent
Tencent has unveiled the Hunyuan T1, its advanced AI model, which is now accessible to all users via the Tencent Yuanbao platform. This model is particularly adept at grasping various dimensions and potential logical connections, making it ideal for tackling intricate challenges. Users have the opportunity to explore a range of AI models available on the platform, including DeepSeek-R1 and Tencent Hunyuan Turbo. Anticipation is building for the forthcoming official version of the Tencent Hunyuan T1 model, which will introduce external API access and additional services. Designed on the foundation of Tencent's Hunyuan large language model, Yuanbao stands out for its proficiency in Chinese language comprehension, logical reasoning, and effective task performance. It enhances user experience by providing AI-driven search, summaries, and writing tools, allowing for in-depth document analysis as well as engaging prompt-based dialogues. The platform's versatility is expected to attract a wide array of users seeking innovative solutions. -
43
DeepSeek V3.1
DeepSeek
FreeDeepSeek V3.1 stands as a revolutionary open-weight large language model, boasting an impressive 685-billion parameters and an expansive 128,000-token context window, which allows it to analyze extensive documents akin to 400-page books in a single invocation. This model offers integrated functionalities for chatting, reasoning, and code creation, all within a cohesive hybrid architecture that harmonizes these diverse capabilities. Furthermore, V3.1 accommodates multiple tensor formats, granting developers the versatility to enhance performance across various hardware setups. Preliminary benchmark evaluations reveal strong results, including a remarkable 71.6% on the Aider coding benchmark, positioning it competitively with or even superior to systems such as Claude Opus 4, while achieving this at a significantly reduced cost. Released under an open-source license on Hugging Face with little publicity, DeepSeek V3.1 is set to revolutionize access to advanced AI technologies, potentially disrupting the landscape dominated by conventional proprietary models. Its innovative features and cost-effectiveness may attract a wide range of developers eager to leverage cutting-edge AI in their projects. -
44
DeepScaleR
Agentica Project
FreeDeepScaleR is a sophisticated language model comprising 1.5 billion parameters, refined from DeepSeek-R1-Distilled-Qwen-1.5B through the use of distributed reinforcement learning combined with an innovative strategy that incrementally expands its context window from 8,000 to 24,000 tokens during the training process. This model was developed using approximately 40,000 meticulously selected mathematical problems sourced from high-level competition datasets, including AIME (1984–2023), AMC (pre-2023), Omni-MATH, and STILL. Achieving an impressive 43.1% accuracy on the AIME 2024 exam, DeepScaleR demonstrates a significant enhancement of around 14.3 percentage points compared to its base model, and it even outperforms the proprietary O1-Preview model, which is considerably larger. Additionally, it excels on a variety of mathematical benchmarks such as MATH-500, AMC 2023, Minerva Math, and OlympiadBench, indicating that smaller, optimized models fine-tuned with reinforcement learning can rival or surpass the capabilities of larger models in complex reasoning tasks. This advancement underscores the potential of efficient modeling approaches in the realm of mathematical problem-solving. -
45
Stable LM
Stability AI
FreeStable LM represents a significant advancement in the field of language models by leveraging our previous experience with open-source initiatives, particularly in collaboration with EleutherAI, a nonprofit research organization. This journey includes the development of notable models such as GPT-J, GPT-NeoX, and the Pythia suite, all of which were trained on The Pile open-source dataset, while many contemporary open-source models like Cerebras-GPT and Dolly-2 have drawn inspiration from this foundational work. Unlike its predecessors, Stable LM is trained on an innovative dataset that is three times the size of The Pile, encompassing a staggering 1.5 trillion tokens. We plan to share more information about this dataset in the near future. The extensive nature of this dataset enables Stable LM to excel remarkably in both conversational and coding scenarios, despite its relatively modest size of 3 to 7 billion parameters when compared to larger models like GPT-3, which boasts 175 billion parameters. Designed for versatility, Stable LM 3B is a streamlined model that can efficiently function on portable devices such as laptops and handheld gadgets, making us enthusiastic about its practical applications and mobility. Overall, the development of Stable LM marks a pivotal step towards creating more efficient and accessible language models for a wider audience.