The Information, June 12

Meta Bought Rivos to Accelerate Its AI Chip Push. It Isn’t Working.


Summary

Meta is struggling to integrate the startup Rivos, which it acquired to develop its own chips. Problems with strategy and organizational culture are hampering its effort to reduce its dependence on Nvidia.


Meta Platforms bought semiconductor startup Rivos last year to accelerate development of in-house chips and reduce its reliance on Nvidia as it pours cash into data centers for its AI ambitions.

Now six months since the acquisition closed, Meta is struggling to make it work, beset by problems that illustrate the social media giant’s bigger challenges in actually building a chip business, according to 11 current and former employees who have worked on the chip efforts. They described months of uncertainty over strategy, shifting leadership priorities for the chips division that have hindered Meta’s ability to use Rivos’s technology, and tensions between Rivos staffers and Meta’s existing chips team over strategy and other issues.

As Meta laid off about 10% of its nearly 80,000-person workforce late last month, it cut more than a quarter of the 450 Rivos employees who had just joined, according to several of the people who have worked on the chip efforts. Mark Hayter, one of four Rivos cofounders, also told employees he was leaving.

The challenges absorbing Rivos have exposed a fundamental problem inside Meta’s chipmaking efforts, the current and former employees said. The company is trying to build hardware using management systems and incentives shaped by software development. That software culture entails rapid development cycles and constant demands for measurable individual impact, both of which are ill-suited to chip programs that need years of stable roadmaps.

A Meta spokesperson said its chip program is executing strongly while also noting that the faster-paced evolution of AI models has forced the company to “regularly evolve our silicon roadmap as we learn what our AI workloads actually need at scale.”

Meta started designing its own semiconductors in 2020 and has since deployed hundreds of thousands of its Meta Training and Inference Accelerator, or MTIA, chips in its operations, the company has said. Chief Executive Mark Zuckerberg said in April that Meta is rolling out more than 1 gigawatt of those chips, enough for a large data center.

But Meta started years behind rivals Google, which began developing its Tensor Processing Units more than a decade ago, and Amazon, which bought chip design startup Annapurna Labs in 2015.

Indeed, Meta is paying both of those companies to use their chips. It also depends heavily on Nvidia, and in February struck a deal to buy as much as 6 gigawatts of AI chips from Advanced Micro Devices.

Those chips made by other companies account for an enormous share of Meta’s capital spending, which the company said in April could double this year to as much as $145 billion. Custom chips are crucial for Meta’s hopes to reduce those costs and improve its AI infrastructure’s performance.

A Chip Roadmap that Keeps Changing

When Meta bought Rivos, a startup founded by former Apple engineers, the startup had yet to mass-produce chips. But it was developing both AI chips and central processing units, or CPUs, that Meta believed could accelerate its custom chip development plans. Terms of the transaction weren’t disclosed, but it came as Rivos was attempting to raise new venture funding at a valuation of more than $2 billion.

Meta had licensed some Rivos technology before the deal, and Meta was particularly interested in the startup’s engineering talent, several of the employees said—especially engineers who lay out chips’ physical design, a task that Meta has previously outsourced to partners such as Broadcom. Meta also coveted Rivos’s chip architecture, the blueprint for how compute engines, memory and other core blocks fit together that determines the chip’s performance.

Integrating Rivos got off to a difficult start, the people said. Some incoming employees discovered Meta didn’t yet have clear projects for them to work on, or clear guidance on how their prior work connected to Meta’s plans. Some staffers weren’t assigned work they considered meaningful until roughly three months after joining. (Others still don’t have access to documents they need for chip design work.) Several employees said managers frequently acknowledged they were still trying to determine exactly how Rivos would fit into Meta’s plan for custom chips.

The employees said Meta’s apparent lack of preparation to integrate Rivos had practical consequences, impeding the tightly coordinated, highly specialized work that chip programs depend on.

One of the first major custom chips that Meta initially planned to build largely around Rivos’s intellectual property was a chip code-named Olympus, which Meta planned to use to train its biggest AI models, several people said. But Meta changed direction after the acquisition closed, halting Olympus and shifting focus to another project known internally as Phoebe that could only handle smaller training workloads, according to people familiar with the project.

Meta removed or reduced in scope several of Phoebe’s features during its development. One person familiar with the project described relatively little new engineering work occurring on portions of the program because teams spent large amounts of time reevaluating priorities. The target date for finalizing Phoebe’s design and sending it for manufacturing was pushed from the first quarter of 2027 to the second quarter, two people said.

Compounding the challenges with Phoebe, Meta laid off some of the Rivos engineers assigned to it, the people said.

The Meta spokesperson said the changing features are consistent with “standard industry practice to ideate and be flexible” and that the company has “built the capability to ship a new chip roughly every six months.”

More broadly, Meta is moving away from attempting to train its largest AI models entirely on internally developed chips, as the gap between its technology and Nvidia’s has only grown. Meta’s current plan, which entails developing four chips in addition to Phoebe, is instead focused on inference chips that handle user queries and AI agent workloads—consistent with a broader industry shift as inference needs soar.

The Culture Clash Inside MTIA

Rivos’s integration has been further hampered by escalating tensions between existing Meta employees and incoming Rivos staff over compensation and strategy. The two sets of staffers frequently fought over whether future chips should rely primarily on Meta’s existing IP or on Rivos’s technology. These disputes became political battles because selecting core elements of the chip architecture could significantly shape Meta’s chip plans, according to several of the employees, who said the battles slowed development on some projects.

The May layoffs of about 125 Rivos employees who joined Meta through the acquisition exacerbated tensions inside the chip group.

The current and former employees said Meta’s deep roots in software have added to challenges across the chip division. Several of them identified Meta’s performance-review system as a key source of unhappiness. The reviews focus heavily on proving measurable individual impact, rather than team performance, and they happen every six months, meaning people are gauged on very short-term goals. That’s fine when you’re developing software, which can be created quickly. But semiconductor work takes much longer, making six-month reviews less practical.

Umesh Padval, managing partner at Seligman Ventures and a veteran semiconductor investor, said many software-led companies underestimate the time required to build competitive semiconductor teams and capabilities. Google and Amazon took several years to develop internal chip programs that are now largely self-sufficient, and Microsoft also faced early challenges with its in-house chips, he said.

“It’s not something you fix with a single chip,” he said. “You need to build a cadence (multiple chips over time, often several per year) before you start to get good at it.”