Muse is a text-to-image Transformer model that claims to achieve state-of-the-art image generation performance while being significantly more efficient than diffusion or autoregressive models. According to the website, Muse is trained on a masked modeling task in discrete token space, predicting randomly masked image tokens, and it uses a pre-trained large language model (LLM) to enable fine-grained language understanding. Muse supports image editing applications, such as inpainting, outpainting, and mask-free editing, without needing to fine-tune or invert the model. It also claims fast generation of high-quality images from text prompts: 1.3 s for 512x512 resolution or 0.5 s for 256x256 resolution on TPUv4. Target users include graphic designers, content creators, and researchers. Muse differentiates itself from competitors by claiming real-time interactive editing of images, with each update processed in 760 ms via a single pass through the base model and 8 passes through the super-resolution model.
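To make the masked-modeling idea concrete, here is a minimal sketch of the two pieces it implies: corrupting a grid of discrete image tokens during training, and filling in an all-masked canvas over a few parallel decoding steps at inference time. This is an illustrative toy, not Muse's actual implementation; the vocabulary size, mask ratio, confidence schedule, and the oracle "model" are all assumptions made for the demo.

```python
import random

MASK = -1  # sentinel for a masked token (the real model uses a learned [MASK] embedding)

def mask_tokens(tokens, mask_ratio, rng):
    """Replace a random fraction of discrete image tokens with MASK.
    In masked-modeling training, the loss is computed only at these positions."""
    n_mask = max(1, int(len(tokens) * mask_ratio))
    masked_idx = sorted(rng.sample(range(len(tokens)), n_mask))
    masked_set = set(masked_idx)
    corrupted = [MASK if i in masked_set else t for i, t in enumerate(tokens)]
    return corrupted, masked_idx

def iterative_decode(seq_len, predict_fn, steps, rng):
    """Start from an all-MASK canvas and, over a few parallel steps, commit
    the most confident predictions (a simplified confidence-based schedule)."""
    tokens = [MASK] * seq_len
    for step in range(1, steps + 1):
        preds = predict_fn(tokens)  # (token, confidence) for every position
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        keep = max(1, round(len(masked) * step / steps))
        for i in sorted(masked, key=lambda i: -preds[i][1])[:keep]:
            tokens[i] = preds[i][0]
    return tokens

if __name__ == "__main__":
    rng = random.Random(0)
    target = [rng.randrange(1024) for _ in range(16)]  # toy 4x4 grid of token ids
    corrupted, masked_idx = mask_tokens(target, mask_ratio=0.5, rng=rng)
    print(len(masked_idx))  # half of the 16 positions are masked → 8

    # oracle stand-in for the Transformer, just to exercise the decoding loop
    def oracle(tokens):
        return [(target[i], rng.random()) for i in range(len(tokens))]

    print(iterative_decode(16, oracle, steps=4, rng=rng) == target)
```

Filling many tokens per step, rather than one at a time autoregressively, is what underlies the efficiency claims above: the whole image is resolved in a handful of forward passes.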