NVIDIA AI Infrastructure and Operations NCA-AIIO Exam Questions

Page: 1 / 14
Total 50 questions
Question 1

Which GPUs should be used when training a neural network for self-driving cars?



Answer : A

Training neural networks for self-driving cars requires immense computational power and high-bandwidth memory to process vast datasets (e.g., sensor data, video). NVIDIA H100 GPUs, with their cutting-edge architecture and massive throughput, are ideal for these demanding workloads. L4 GPUs are optimized for inference and efficiency, while DRIVE Orin targets in-vehicle inference, not training, making H100 the best choice.

(Reference: NVIDIA AI Infrastructure and Operations Study Guide, Section on GPU Selection for Training)


Question 2

A customer is evaluating an AI cluster for training and is questioning why they should use a large number of nodes. Why would multi-node training be advantageous?



Answer : A

Multi-node training is advantageous when a model's size---its parameters, gradients, optimizer states, and activations---exceeds the memory capacity of a single GPU. By sharding the model across multiple GPUs and nodes (using techniques such as tensor, pipeline, or fully sharded data parallelism), training becomes feasible; plain data parallelism additionally speeds training by splitting each batch across model replicas. User count and inference scale are unrelated to training architecture needs, which focus on compute and memory distribution.
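The memory argument can be made concrete with some back-of-the-envelope arithmetic. The sketch below (illustrative, not from the study guide) assumes mixed-precision training with the Adam optimizer, where each parameter typically carries fp16 weights and gradients plus fp32 master weights and two fp32 Adam moments:

```python
# Rough training-memory estimate per parameter, assuming mixed precision
# with Adam: 2 B fp16 weights + 2 B fp16 grads + 4 B fp32 master weights
# + 4 B + 4 B fp32 Adam moments = 16 bytes/param (activations excluded).
def training_memory_gb(n_params: float) -> float:
    bytes_per_param = 2 + 2 + 4 + 4 + 4
    return n_params * bytes_per_param / 1e9

# A 70B-parameter model needs ~1120 GB for these states alone -- far
# beyond the 80 GB of a single H100, hence sharding across many GPUs.
print(f"{training_memory_gb(70e9):.0f} GB")
```

Even before counting activations, the state for a 70B-parameter model is an order of magnitude larger than one GPU's memory, which is exactly the situation multi-node sharding addresses.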

(Reference: NVIDIA AI Infrastructure and Operations Study Guide, Section on Multi-Node Training Benefits)


Question 3

What is the primary command for checking the GPU utilization on a single DGX H100 system?



Answer : A

The nvidia-smi (System Management Interface) command is the primary tool for checking GPU utilization on NVIDIA systems, including the DGX H100. It provides real-time metrics such as utilization percentage, memory usage, and power draw. NVML (NVIDIA Management Library) is an API, not a command, and ctop monitors containers rather than GPUs, so nvidia-smi is the standard tool.
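On a DGX H100 (or any machine with the NVIDIA driver installed), typical invocations look like the following; the query flags shown are standard nvidia-smi options:

```
# One-shot summary: per-GPU utilization, memory, power, and processes
nvidia-smi

# Machine-readable utilization, refreshed every second
nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total \
           --format=csv -l 1
```

The CSV query form is handy for scripting and for feeding monitoring pipelines, while the bare command gives a quick interactive snapshot.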

(Reference: NVIDIA DGX H100 System Documentation, Monitoring Section)


Question 4

Which architecture is the core concept behind large language models?



Answer : C

The Transformer model is the foundational architecture for modern large language models (LLMs). Introduced in the paper 'Attention is All You Need,' it uses stacked layers of self-attention mechanisms and feed-forward networks, often in encoder-decoder or decoder-only configurations, to efficiently capture long-range dependencies in text. While BERT (a specific Transformer-based model) and attention mechanisms (a component of Transformers) are related, the Transformer itself is the core concept. State space models are an alternative approach, not the primary basis for LLMs.
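To make the core operation tangible, here is a minimal single-head scaled dot-product self-attention in NumPy, the building block stacked inside every Transformer layer (a sketch with made-up dimensions, not an LLM implementation):

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over token matrix x."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])           # pairwise token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ v                                # mix values by attention

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))                       # 4 tokens, model dim 8
Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)
print(out.shape)
```

Because every token attends to every other token in one matrix product, long-range dependencies are captured directly, which is the property that distinguishes Transformers from recurrent architectures.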

(Reference: NVIDIA AI Infrastructure and Operations Study Guide, Section on Large Language Models)


Question 5

What is the name of NVIDIA's SDK that accelerates machine learning?



Answer : C

The CUDA Deep Neural Network library (cuDNN) is NVIDIA's SDK specifically designed to accelerate machine learning, particularly deep learning tasks. It provides highly optimized implementations of neural network primitives---such as convolutions, pooling, normalization, and activation functions---leveraging GPU parallelism. Clara focuses on healthcare applications, and RAPIDS accelerates data science workflows, but cuDNN is the core SDK for machine learning acceleration.

(Reference: NVIDIA cuDNN Documentation, Introduction)


Question 6

How many 1 Gb Ethernet in-band network connections are in a DGX H100 system?



Answer : C

The DGX H100 system uses high-speed NVIDIA ConnectX-7 adapters (dual-port QSFP112, supporting up to 400 Gb/s) for in-band management and storage traffic, with no 1 Gb Ethernet interfaces allocated to in-band networks. A single 1 GbE RJ45 port exists, but it is reserved for out-of-band Baseboard Management Controller (BMC) tasks, not in-band connectivity.

(Reference: NVIDIA DGX H100 System Documentation, Networking Section)


Question 7

What is the maximum number of MIG instances that an H100 GPU provides?



Answer : A

The NVIDIA H100 GPU supports up to 7 Multi-Instance GPU (MIG) partitions, allowing it to be divided into seven isolated instances for multi-tenant or mixed workloads. This capability leverages the H100's architecture to maximize resource flexibility and efficiency, with 7 being the documented maximum.
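The partitioning is driven through the `nvidia-smi mig` subcommands. The commands below illustrate the workflow on an H100 80GB, whose smallest profile (assumed here to be `1g.10gb`) can be instantiated seven times; they require admin privileges and MIG-capable hardware:

```
# Enable MIG mode on GPU 0 (may require a GPU reset to take effect)
sudo nvidia-smi -i 0 -mig 1

# List the GPU-instance profiles the device supports
nvidia-smi mig -lgip

# Create seven of the smallest instances, then list them
sudo nvidia-smi mig -cgi 1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb -C
nvidia-smi mig -lgi
```

Each resulting instance has its own dedicated compute slices and memory partition, giving hardware-level isolation between the seven tenants.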

(Reference: NVIDIA H100 GPU Documentation, MIG Section)

