Do you remember the all-star team Zuck recruited to form Meta Superintelligence Labs?
This is their very first release — which has been highly anticipated.
Now it isn’t state-of-the-art right out of the gates, but it is extremely competitive with some of today’s top models like Opus 4.6 and GPT 5.4
For example:
On SWE-Bench Verified, Muse Spark scored a 77.4. Opus 4.6 is at 80.8, Gemini 3.1 Pro is at 80.6 and Grok 4.2 is at 76.7.
On Humanity’s Last Exam, Muse Spark scored a 42.8, compared to Opus 4.6’s 40.0 and GPT 5.4’s 52.1.
That’s impressive for a 9-month old lab.
One of the things I love about the Al race, is how hands- on founder have been. Sergey Brin is back to continue the mission at Google. Tobi Lutke is coding again at Shopify.

Creators and entrepreneurs should be paying attention to major opportunity the Muse family of models will unlock for commerce.
The bet is essentially that whoever owns the Al layer between a person and their purchasing decisions owns commerce at scale.
And the personal context advantage Meta has is massive.
This model is just the beginning and may be the final piece that was missing.
