One of the many big announcements at today's GTC Fall 2021 is on the GPU side, but not where many may have expected: the NVIDIA A2. While at the high end we have seen replacements for the data center GPUs, along with the workstation and even consumer markets, the one area that we have not seen get an update is the AI inference space. More specifically, the low-profile AI inference card space. The NVIDIA A2 finally updates the segment held by the NVIDIA T4.
NVIDIA A2 and Market Perspective
The NVIDIA A2 is a low-power, low-profile PCIe card. Specifically, the TDP is only 40-60W. The interface is PCIe Gen4 x8. While one may immediately question why anyone would want a GPU without an x16 connector and with such a low TDP in the data center, the reason is simply that these cards are easy to put into servers and easy to both power and cool.
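To put the x8 connector in perspective, the theoretical link bandwidth can be worked out from the per-lane signalling rate. The sketch below is a back-of-the-envelope calculation, not a measurement: the per-lane rates and the 128b/130b encoding are the published PCIe figures, the function name `pcie_bandwidth_gb_s` is our own, and packet and protocol overhead are ignored, so real throughput will be lower.

```python
# Per-lane raw signalling rate in GT/s for each PCIe generation.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}

# PCIe Gen3 through Gen5 use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Approximate one-direction link bandwidth in GB/s, before protocol overhead."""
    if gen not in GT_PER_S:
        raise ValueError(f"unsupported PCIe generation: {gen}")
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    # GT/s * efficiency gives usable Gbit/s per lane; divide by 8 for bytes.
    return GT_PER_S[gen] * ENCODING_EFFICIENCY * lanes / 8


if __name__ == "__main__":
    for gen, lanes in [(4, 8), (3, 16), (5, 8)]:
        print(f"Gen{gen} x{lanes}: {pcie_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

Under these assumptions a Gen4 x8 link works out to roughly the same bandwidth as a Gen3 x16 link, so the narrower connector does not by itself mean less host bandwidth than a previous-generation x16 card.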
One of the biggest breakthroughs of the NVIDIA T4 that we reviewed was that it was a low-profile entry in the NVIDIA line-up. It could therefore be physically slotted into places normally reserved for NICs and other generally low-profile devices. The NVIDIA T4 also had a double-digit TDP. Like the T4, the NVIDIA A2 is designed to be powered off of the PCIe bus instead of requiring external power connections. This helps to improve airflow in a chassis and reduces the system requirements for integrating the A2. With lower power, it also means that the A2 can be placed into servers at the edge, where there may be tighter power envelopes to adhere to.
We are going to put a full set of key specs below, but the card also has 16GB of GDDR6 memory. One feature we did not see listed is MIG (Multi-Instance GPU) support. Previously, NVIDIA's position was that one would be more likely to use a larger GPU with MIG to get AI inference in edge servers. While that may work for density, it does not work for cases where there is only 40-120W available and one needs 1-3 inference cards. That seems to be the market that NVIDIA is targeting here.
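The 40-120W and 1-3 card figures line up with the A2 running at its lowest configurable TDP. A minimal sketch of that arithmetic follows; the function `max_cards` and its `free_slots` parameter are hypothetical names for illustration, and the calculation assumes the whole budget is available to the cards, with no allowance for slot power limits or cooling headroom.

```python
from typing import Optional

# Configurable TDP range quoted for the NVIDIA A2, in watts.
A2_TDP_MIN_W = 40
A2_TDP_MAX_W = 60


def max_cards(budget_w: int, per_card_w: int = A2_TDP_MIN_W,
              free_slots: Optional[int] = None) -> int:
    """Number of cards that fit in a power budget, optionally capped by free slots."""
    if per_card_w <= 0:
        raise ValueError("per_card_w must be positive")
    count = max(budget_w // per_card_w, 0)
    if free_slots is not None:
        count = min(count, free_slots)
    return count


if __name__ == "__main__":
    for budget in (40, 80, 120):
        print(f"{budget}W budget: {max_cards(budget)} card(s) at {A2_TDP_MIN_W}W, "
              f"{max_cards(budget, A2_TDP_MAX_W)} at {A2_TDP_MAX_W}W")
```

At the 60W setting the same 120W envelope holds only two cards, which is the trade-off the configurable TDP introduces.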
Overall, this is a good step for NVIDIA. It also feels very late in the cycle. PCIe Gen4 is going to be replaced by PCIe Gen5 in the data center starting in Q2 2022. NVIDIA says the cards are available today, but there is only a short window until we see Gen5 devices at this point. Indeed, we have already been looking at PCIe Gen5 on the desktop. In the meantime, the market has had to use the NVIDIA T4 or use MIG with larger GPUs. This is Ampere coming to a popular and established market segment around six quarters after the NVIDIA A100 launched.
Still, we are excited to see more NVIDIA A2 servers, as the T4 has been extremely popular.
NVIDIA A2 Key Specs
Here are the NVIDIA A2 key specs from NVIDIA's website as of launch:
|Specification|NVIDIA A2|
|---|---|
|Peak FP32|4.5 TF|
|TF32 Tensor Core|9 TF / 18 TF¹|
|BFLOAT16 Tensor Core|18 TF / 36 TF¹|
|Peak FP16 Tensor Core|18 TF / 36 TF¹|
|Peak INT8 Tensor Core|36 TOPS / 72 TOPS¹|
|Peak INT4 Tensor Core|72 TOPS / 144 TOPS¹|
|Media engines|1 video encoder, 2 video decoders (includes AV1 decode)|
|GPU memory|16GB GDDR6|
|GPU memory bandwidth|200GB/s|
|Interconnect|PCIe Gen4 x8|
|Form factor|1-slot, low-profile PCIe|
|Max thermal design power (TDP)|40–60W (configurable)|
|Virtual GPU (vGPU) software support²|NVIDIA Virtual PC (vPC), NVIDIA Virtual Applications (vApps), NVIDIA RTX Virtual Workstation (vWS), NVIDIA AI Enterprise, NVIDIA Virtual Compute Server (vCS)|
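One way to read the peak compute figures against the 200GB/s memory bandwidth is a simple roofline-style estimate. The sketch below is our own back-of-the-envelope arithmetic, not an NVIDIA figure: it uses the first number of each pair in the table above, treats both compute and bandwidth as perfectly achievable peaks, and the function names `ridge_point` and `min_weight_stream_ms` are hypothetical.

```python
# Values taken from the spec table above.
MEM_BW_GB_S = 200.0   # GPU memory bandwidth
MEM_SIZE_GB = 16.0    # GPU memory capacity

# Peak throughput in tera-operations per second (first figure of each pair).
PEAK_TERA_OPS = {
    "fp32": 4.5,
    "tf32_tensor": 9.0,
    "fp16_tensor": 18.0,
    "int8_tensor": 36.0,
}


def ridge_point(peak_tera_ops: float, mem_bw_gb_s: float = MEM_BW_GB_S) -> float:
    """Operations per byte of memory traffic needed before compute is the limit."""
    return (peak_tera_ops * 1e12) / (mem_bw_gb_s * 1e9)


def min_weight_stream_ms(weights_gb: float, mem_bw_gb_s: float = MEM_BW_GB_S) -> float:
    """Lower bound in milliseconds to read a set of weights once from GPU memory."""
    return weights_gb / mem_bw_gb_s * 1e3


if __name__ == "__main__":
    for name, tera_ops in PEAK_TERA_OPS.items():
        print(f"{name}: {ridge_point(tera_ops):.1f} ops/byte to become compute-bound")
    print(f"Reading all {MEM_SIZE_GB:.0f}GB once: "
          f"{min_weight_stream_ms(MEM_SIZE_GB):.0f} ms at best")
```

Workloads that perform fewer operations per byte than the ridge point would, under these idealized assumptions, be limited by memory bandwidth rather than by the Tensor Core figures.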