#ai model performance

VentureBeat

Cerebras runs trillion-parameter AI model 7x faster than GPUs with new chip

A new benchmark shows Cerebras Systems delivering trillion-parameter AI inference at nearly 1,000 tokens per second, outperforming GPU-based clouds by up to 29 times on real-world tasks. The milestone proves wafer-scale chips can handle massive open models where GPUs struggle.

May 21, 2026