Performance Plummets by 92%: Intel’s China-Specific AI Chip Revealed

According to media reports, Intel has revealed in its Gaudi 3 AI chip whitepaper that it is preparing to launch a “special edition” Gaudi 3 for the Chinese market.

The China-specific Gaudi 3 comes in two versions: the HL-328 OAM mezzanine card and the HL-388 PCIe accelerator card. The HL-328 is set to launch on June 24 and the HL-388 on September 24.

Compared with the original version, the China-specific Gaudi 3 keeps the same 96MB of on-chip SRAM, 128GB of HBM2e high-bandwidth memory with 3.7TB/s of bandwidth, PCIe 5.0 x16 interface, and supported decoding standards.

However, under U.S. export controls on AI chips, a chip's total processing performance (TPP) must be below 4800 for it to be exported to China. According to the reports, this means the 16-bit performance of the China-specific Gaudi 3 cannot exceed 150 TFLOPS.

The original Gaudi 3 achieves performance of up to 1835 TFLOPS on FP16/BF16, so the China-specific Gaudi 3 may need to reduce its AI performance by approximately 92% to meet U.S. export control requirements.
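The 92% figure follows directly from the two numbers quoted above. As a minimal sketch of that arithmetic (the constant and function names are illustrative, not taken from Intel's whitepaper):

```python
# Figures quoted in the article, in TFLOPS at 16-bit precision.
ORIGINAL_TFLOPS = 1835.0  # original Gaudi 3, FP16/BF16
CAPPED_TFLOPS = 150.0     # reported ceiling for the China-specific version


def reduction_percent(original: float, reduced: float) -> float:
    """Return how much smaller `reduced` is than `original`, in percent."""
    return (1.0 - reduced / original) * 100.0


perf_cut = reduction_percent(ORIGINAL_TFLOPS, CAPPED_TFLOPS)
print(f"Performance reduction: {perf_cut:.1f}%")  # prints 91.8%
```

Rounded to the nearest whole number, that gives the roughly 92% cut cited in the reports.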

Nevertheless, the reduced performance also brings a significant drop in power consumption. According to the disclosed information, the TDP (thermal design power) of the China-specific Gaudi 3 is 450 watts for both the PCIe card and the OAM card, whereas the original versions are rated at 600 watts and 900 watts, respectively.
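The size of the power savings can be worked out the same way. The sketch below uses only the TDP figures quoted above; the dictionary and function names are hypothetical, chosen for illustration:

```python
# TDP figures quoted in the article, in watts: (original, China-specific).
TDP_WATTS = {
    "PCIe card": (600, 450),
    "OAM card": (900, 450),
}


def tdp_cut_percent(original_w: float, reduced_w: float) -> float:
    """Return the relative TDP reduction in percent."""
    return (original_w - reduced_w) / original_w * 100.0


tdp_cuts = {card: tdp_cut_percent(orig, cut) for card, (orig, cut) in TDP_WATTS.items()}
for card, cut in tdp_cuts.items():
    print(f"{card}: {cut:.0f}% lower TDP")
```

On these numbers the PCIe card's TDP falls by 25% and the OAM card's by 50%, both far smaller than the reported cut in compute performance.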