Groq's mission is to set the standard in GenAI inference speed, enabling real-time AI applications to be developed today. The LPU (Language Processing Unit) inference engine is a new end-to-end processing system designed to deliver the fastest possible inference for computationally intensive applications such as AI language workloads. The LPU was designed to overcome the two bottlenecks that limit LLMs: compute density and memory bandwidth. For LLM workloads, an LPU provides greater compute capacity than either a GPU or a CPU, reducing the time it takes to calculate each word and allowing text sequences to be generated faster. By eliminating external memory bottlenecks, the LPU inference engine can also deliver orders-of-magnitude higher performance on LLMs than GPUs. Groq supports standard machine learning frameworks such as PyTorch, TensorFlow, and ONNX.