
RecursiveMAS cuts multi-agent AI token use by 75% and boosts speed 2.4x
A new framework from UIUC and Stanford replaces text-heavy agent communication with latent embeddings, cutting inference costs while improving accuracy on complex tasks like code generation and medical reasoning.