StableCode is a unique tool that helps developers code more efficiently by using three models. The base model was first trained on a diverse collection of programming languages from BigCode's stack-dataset (The Stack), and then further trained on popular languages such as Python, Go, Java, JavaScript, C, Markdown, and C++. In total, we trained our models on 560B tokens of code on our HPC cluster.
After the base model was established, the instruction model was tuned for specific use cases to help solve complex programming tasks. To achieve this, roughly 120,000 code instruction/response pairs in Alpaca format were used to fine-tune the base model.
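As an illustration, Alpaca-style instruction/response pairs are typically serialized into a single prompt string before fine-tuning. The helper below is a hypothetical sketch of the standard Alpaca layout; the exact template used to tune StableCode may differ.

```python
# Hypothetical sketch of the standard Alpaca prompt template; the exact
# template used to tune StableCode may differ.
def build_alpaca_prompt(instruction: str, response: str) -> str:
    """Serialize one instruction/response pair into a single training string."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}"
    )

# Example pair (illustrative, not from the actual training set)
pair = {
    "instruction": "Write a Python function that returns the nth Fibonacci number.",
    "response": (
        "def fib(n):\n"
        "    a, b = 0, 1\n"
        "    for _ in range(n):\n"
        "        a, b = b, a + b\n"
        "    return a"
    ),
}
prompt = build_alpaca_prompt(pair["instruction"], pair["response"])
print(prompt)
```

Each serialized string then becomes one training example for supervised fine-tuning of the base model.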
StableCode is also a great tool for anyone who wants to learn more about programming. The long-context model acts as an assistant, providing both single- and multi-line autocomplete suggestions, and is designed to handle a large amount of code at once.
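One way a long context window can be put to work is by packing several related source files into a single prompt, so the model sees the whole project while completing the current line. The file names, token budget, and packing scheme below are illustrative assumptions for this sketch, not StableCode specifics.

```python
# Illustrative sketch: pack multiple files into one long prompt for a
# long-context completion model. The token budget and the ~4 chars/token
# heuristic are assumptions used only for this example.
MAX_PROMPT_CHARS = 16_000 * 4

def pack_files(files: dict[str, str], cursor_prefix: str) -> str:
    """Concatenate project files, then the code before the cursor."""
    parts = [f"# file: {name}\n{body}" for name, body in files.items()]
    parts.append(cursor_prefix)  # the model would complete from here
    prompt = "\n\n".join(parts)
    return prompt[-MAX_PROMPT_CHARS:]  # keep the most recent context

# Hypothetical two-file project with the cursor mid-line in main.py
files = {
    "utils.py": "def add(a, b):\n    return a + b",
    "main.py": "from utils import add",
}
prompt = pack_files(files, "result = add(")
print(prompt)
```

The resulting prompt ends exactly at the cursor position, which is what lets a completion model suggest the next tokens for single- or multi-line autocomplete.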