July AI Chip Topics | AI Inference Chip Trend Analysis Amid a Hundred Contenders (Part 2)
At the end of 2023, Microsoft unveiled its first in-house AI accelerator, Maia 100, to build a "home-grown compute" moat for Azure. Maia 100 is built on TSMC's N5 process with CoWoS-S packaging, integrates 64 GB of HBM2e with 1.8 TB/s of total bandwidth, and carries roughly 500 MB of multi-level on-chip SRAM, enough to stage most of a model's KV-cache and cut down on round trips to HBM. The tensor cores support FP32, BF16, FP8, and Microsoft's MX 4-bit microscaling format. The chip is designed for a 700 W TDP but typically runs cloud inference at around 500 W, letting the same silicon switch dynamically between training and inference duty. Maia 100 also includes a superscalar vector processor and an asynchronous DMA (Direct Memory Access) controller that prefetches memory or network data in the background, so computation and data movement can overlap.
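To make the ~500 MB SRAM figure concrete, here is a rough back-of-envelope sizing of a decoder's KV-cache at different precisions. The model shape (a generic 7B-class, 32-layer configuration with a 4K context) is an illustrative assumption, not a Maia disclosure, and the small per-block scale overhead of MX formats is ignored.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem):
    # K and V tensors: 2 * layers * heads * head_dim values per token
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical 7B-class decoder; NOT figures disclosed for Maia 100
cfg = dict(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)
for name, b in [("FP16", 2.0), ("FP8", 1.0), ("MX 4-bit", 0.5)]:
    size = kv_cache_bytes(**cfg, bytes_per_elem=b)
    print(f"{name:>8}: {size / 2**20:7.0f} MiB per sequence")
# FP16 ~= 2048 MiB, FP8 ~= 1024 MiB, 4-bit ~= 512 MiB: only the
# low-precision cache comes close to fitting in ~500 MB of SRAM.
```

The arithmetic is the point: at FP16 even a single 4K-context sequence overflows the SRAM several times over, while a 4-bit cache is in the right ballpark, which is why the low-precision formats and the on-chip SRAM are complementary design choices.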
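Maia's exact MX implementation has not been published, but MX refers to the OCP Microscaling format family that Microsoft helped define: small blocks of elements share one power-of-two scale, and each element is stored in a narrow type such as 4-bit FP4 (E2M1). The sketch below quantizes one 32-element block in that MXFP4 style; the rounding details are simplified and not bit-exact.

```python
import numpy as np

# Representable magnitudes of the FP4 E2M1 element type (OCP MX spec)
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def mxfp4_quantize(block):
    """Quantize one 32-element block: a shared power-of-two scale
    plus 4-bit E2M1 elements. Illustrative sketch, not bit-exact."""
    amax = np.max(np.abs(block))
    if amax == 0:
        return 1.0, np.zeros_like(block)
    # Choose the shared scale so the largest element lands near the
    # E2M1 maximum of 6.0 (= 1.5 * 2**2); larger values saturate.
    scale = 2.0 ** (np.floor(np.log2(amax)) - 2)
    scaled = block / scale
    # Snap each magnitude to the nearest point on the E2M1 grid
    idx = np.abs(np.abs(scaled)[:, None] - FP4_GRID[None, :]).argmin(axis=1)
    quant = np.sign(scaled) * FP4_GRID[idx]
    return scale, quant  # dequantized value = scale * quant

block = np.random.randn(32).astype(np.float32)
scale, q = mxfp4_quantize(block)
print("max abs error:", np.max(np.abs(block - scale * q)))
```

Because the scale is a plain power of two, dequantization is an exponent adjustment rather than a multiply, which keeps the hardware cost of supporting the format low.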
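The value of an asynchronous DMA engine is that the fetch of tile i+1 can proceed while tile i is being computed on. The following double-buffered pipeline mimics that overlap with a Python thread standing in for the DMA engine; dma_fetch, compute, and run_pipeline are hypothetical names for illustration, not Maia APIs.

```python
from concurrent.futures import ThreadPoolExecutor

def dma_fetch(tile_id):
    """Stand-in for an async DMA read of one tile from HBM or the
    network (here it just fabricates data)."""
    return [tile_id] * 1024

def compute(tile):
    """Stand-in for the tensor/vector work done on one tile."""
    return sum(tile)

def run_pipeline(n_tiles):
    results = []
    # One worker models a single DMA engine running beside compute
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = dma.submit(dma_fetch, 0)      # issue first fetch early
        for i in range(n_tiles):
            tile = pending.result()             # wait for tile i
            if i + 1 < n_tiles:
                pending = dma.submit(dma_fetch, i + 1)  # prefetch i+1
            results.append(compute(tile))       # overlaps the next fetch
    return results

print(run_pipeline(4))
```

With the prefetch issued before each compute step, the memory latency of tile i+1 is hidden behind the work on tile i, which is exactly the behavior the paragraph above attributes to Maia's background DMA prefetching.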