Intended Audience
These pages are intended for network administrators who are responsible for configuring and managing NVOS platforms.
Related Documentation
The following table lists the documents referenced in this user manual.
| Document Name | Description |
|---|---|
| Q32xx and Q34xx XDR 800Gb/s InfiniBand Switch Systems User Manual | This manual describes the installation and basic use of the NVIDIA XDR InfiniBand switch systems based on the NVIDIA Quantum™-3 switch ASIC. |
| NVIDIA NVOS Release Notes | Provides information about the supported platforms, changes and new features, software known issues, and bug fixes. See the Enterprise Support Portal for more information. |
Terminology
| Term | Description |
|---|---|
| AAA | Authentication, Authorization, and Accounting. A security framework for controlling and tracking user access within a computer network. Authentication verifies user credentials; Authorization grants or denies privileges; Accounting tracks user activities and resource consumption. |
| ACL | Access Control List. A set of filtering rules applied to interfaces or ports that control network traffic based on source/destination IP, port number, or protocol type. |
| ARP | Address Resolution Protocol. Maps IP addresses to physical MAC addresses for communication within a local area network (LAN). |
| ASIC | Application-Specific Integrated Circuit. A custom-designed chip optimized for specific functions such as packet forwarding in network switches. |
| BIOS | Basic Input/Output System. Firmware that initializes and tests hardware components during system boot and loads the operating system. |
| BMC | Baseboard Management Controller. A service processor that enables out-of-band hardware monitoring and management (temperature, fan speed, power). |
| CLI | Command-Line Interface. A text-based interface used to configure, manage, and monitor the switch through command input. |
| Core Dump | A file containing memory and process state at the time of a crash, used for debugging. |
| Crashdump | Diagnostic data collected automatically after a system or process failure. |
| CPLD | Complex Programmable Logic Device. A programmable logic device used for control logic, often handling board-level management and power sequencing. |
| DHCP | Dynamic Host Configuration Protocol. Automatically assigns IP addresses and network configuration parameters to devices on a network. |
| DNS | Domain Name System. Translates human-readable domain names into corresponding IP addresses for network communication. |
| ERoT | External Root of Trust. A secure cryptographic hardware component that verifies system integrity and authenticates firmware during boot. |
| EEPROM | Electrically Erasable Programmable Read-Only Memory. Non-volatile memory used to store firmware or configuration data that must persist across power cycles. |
| Ethernet | A family of networking technologies used for LAN communication, defining physical and data link layer protocols. |
| FPGA | Field Programmable Gate Array. A reconfigurable integrated circuit that can be programmed post-manufacturing to implement custom logic or accelerate specific functions. |
| Fabric | The interconnected topology of switches and links forming a unified network for data transfer, commonly used in high-performance environments. |
| FRU | Field Replaceable Unit. A hardware component that can be replaced without special tools, typically including fans, power supplies, and management modules. |
| FTP / TFTP / SFTP | File Transfer Protocols. Protocols used to transfer files between systems. FTP uses TCP; TFTP is simplified and connectionless; SFTP provides secure encrypted transfer over SSH. |
| gNMI | gRPC Network Management Interface. A streaming-based protocol for network configuration and telemetry using gRPC, supporting both configuration and real-time monitoring. |
| GUI | Graphical User Interface. A visual interface allowing users to interact with and configure the system via icons, menus, and windows rather than command lines. |
| Host | A device connected to a network that can send, receive, and process data with other network devices. |
| ICMP | Internet Control Message Protocol. Carries control messages and error notifications between network devices; commonly used for ping tests. |
| I2C | Inter-Integrated Circuit. A low-speed serial communication bus used internally on circuit boards for connecting components such as sensors, EEPROMs, or CPLDs. |
| LACP | Link Aggregation Control Protocol. A protocol for combining multiple physical network links into a single logical interface for redundancy and performance. |
| LAG | Link Aggregation Group. A logical grouping of multiple physical links to increase bandwidth and provide redundancy. |
| Loopback Interface | A virtual interface used for testing, routing, and management reachability. |
| MAC | Media Access Control Address. A unique hardware identifier assigned to each network interface for communication at the data link layer. |
| MTU | Maximum Transmission Unit. The largest payload size (in bytes) that can be sent in a single packet without fragmentation. |
| NTP | Network Time Protocol. Synchronizes system clocks of devices across a network to maintain accurate timekeeping. |
| NVRAM | Non-Volatile Random Access Memory. Memory that retains data even when power is turned off, often used for configuration storage. |
| NVLink | A high-speed interconnect developed by NVIDIA that allows direct communication between GPUs or between GPU and CPU. |
| NVOS | NVIDIA Operating System. Provides system and network management functionality for NVLink switches. |
| Network Adapter | A hardware component that enables communication between a computer or device and a network. |
| Overlay Network | A virtual network built on top of a physical underlay to abstract or isolate traffic, e.g., VXLAN-based networks. |
| PCIe | Peripheral Component Interconnect Express. A high-speed interface standard for connecting hardware components like NICs and GPUs. |
| QoS | Quality of Service. A set of mechanisms that prioritize or limit network traffic based on policies to ensure performance for critical applications. |
| REST API | Representational State Transfer API. A web-based interface that allows external systems to configure and monitor devices using HTTP/HTTPS requests. |
| Routing Table | A data table stored in a switch or router that lists paths to particular network destinations. |
| SA | Subnet Agent. A process running on each node that communicates with the Subnet Manager (SM) to maintain fabric topology and routing data. |
| SCP | Secure Copy Protocol. A secure file transfer method that uses SSH to copy files between local and remote hosts. |
| SM | Subnet Manager. The central process that initializes and maintains the InfiniBand or NVLink subnet, assigns local IDs (LIDs), and manages routing. |
| SNMP | Simple Network Management Protocol. Used for monitoring and managing network devices and collecting operational data. |
| SNTP | Simple Network Time Protocol. A simplified version of NTP for basic clock synchronization. |
| SPI | Serial Peripheral Interface. A synchronous serial communication interface used for short-distance communication between microcontrollers and peripherals. |
| SSH | Secure Shell. A cryptographic network protocol enabling secure remote login and command execution between devices. |
| syslog | A standard protocol for sending system log or event messages to a centralized log server for monitoring and analysis. |
| TACACS+ | Terminal Access Controller Access-Control System Plus. A protocol that provides centralized authentication, authorization, and accounting for network device access. |
| TPM | Trusted Platform Module. A hardware chip that provides hardware-based security, cryptographic key storage, and secure boot validation. |
| Underlay Network | The physical network that provides IP connectivity for overlay networks or tunnels. |
| VRF | Virtual Routing and Forwarding. Enables multiple routing tables to coexist on the same switch, allowing network segmentation and traffic isolation. |
| VLAN | Virtual Local Area Network. A logical grouping of devices that communicate as if on the same LAN, even if physically separated. |
| Watchdog Timer | A hardware or software timer that resets the system if it becomes unresponsive. |
| ZTP | Zero-Touch Provisioning. Automates initial device configuration by allowing the switch to fetch and apply configuration files and firmware on first boot. |
System Features
| Feature | Detail |
|---|---|
| Software management | |
| Logging | |
| Management interface | |
| Chassis management | |
| Network management interfaces | |
| Security | |
| Cables & transceivers | |
InfiniBand Features
| Feature | Detail |
|---|---|
| Subnet manager | |
| IB port management | |
| IB Fabric | |