DBRX is an intriguing language model that has caught the attention of researchers. Developed by the MosaicML team at Databricks, DBRX is an open large language model (LLM) that sets a new state of the art for open LLMs. With a staggering 132 billion total parameters, of which 36 billion are active for any given input, DBRX surpasses GPT-3.5 and even challenges Gemini 1.0 Pro.
What makes DBRX particularly fascinating is its fine-grained mixture-of-experts (MoE) architecture, which enables efficient training and inference. In fact, DBRX is about 40% of the size of Grok-1 while maintaining impressive performance. On programming tasks it even outperforms specialized code models such as CodeLLaMA-70B. Beyond its technical prowess, DBRX is available to Databricks customers via APIs, and enterprises can use the same tooling to pretrain their own DBRX-class models or continue training from existing checkpoints.
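To make the "active parameters" idea concrete, here is a minimal, hypothetical sketch of top-k expert routing in PyTorch. The class names, layer sizes, and plain feed-forward experts are illustrative assumptions, not the DBRX implementation; the expert counts simply mirror the published description of 16 experts with 4 active per token.

```python
# Illustrative sketch of fine-grained mixture-of-experts (MoE) routing.
# NOT the DBRX implementation; sizes and module structure are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Expert(nn.Module):
    """A small feed-forward expert network."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class FineGrainedMoE(nn.Module):
    """Top-k routing over many small experts.

    Only k of n experts run for each token, so the layer has many total
    parameters but a much smaller number of *active* parameters per token.
    """
    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 16, k: int = 4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(Expert(d_model, d_hidden) for _ in range(n_experts))
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Pick the k highest-scoring experts per token.
        scores = self.router(x)                            # (tokens, n_experts)
        weights, idx = torch.topk(scores, self.k, dim=-1)  # (tokens, k)
        weights = F.softmax(weights, dim=-1)

        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e   # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = FineGrainedMoE(d_model=64, d_hidden=256)
    tokens = torch.randn(8, 64)
    print(layer(tokens).shape)  # torch.Size([8, 64])
```

The key design choice this sketch highlights is that more, smaller experts (fine-grained MoE) give the router many more possible expert combinations per token than a coarser split, which is one way such an architecture can improve quality without increasing the per-token compute.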
DBRX is already being adopted in applications such as SQL, where it surpasses GPT-3.5 Turbo. Its journey exemplifies the collaborative spirit of AI research, and we look forward to sharing our lessons learned with the community.