SSH Configuration and Usage in Agent Deployment/Uninstall
The Cable Validation Tool (CVT) uses SSH for deploying and managing agents on Linux-based devices (hosts and switches). This document provides verification and QA teams with comprehensive guidance on SSH configuration, testing procedures, troubleshooting, and validation criteria for agent deployment and uninstall operations.
SSH Architecture
System Components
The CVT system uses multiple SSH components to handle different types of device connections:
-
SSH Connection Management
-
Base SSH client for establishing secure connections
-
Linux-specific SSH client for command execution on hosts and Linux switches
-
SFTP client for secure file transfers during deployment
-
Specialized client for MLNX-OS switch communication
-
-
Agent Deployment System
-
Linux agent deployment handler for hosts and Linux switches
-
MLNX-OS agent deployment handler for Mellanox switches
-
Device Support Matrix
| Device Type | OS Type | SSH Usage | Authentication |
|---|---|---|---|
| Host | Linux | SSH + SFTP | Password or SSH Keys |
| Switch | Linux (Cumulus, NVOS) | SSH + SFTP | Password (required) |
| Switch | MLNX-OS | JSON API (not SSH) | Password only |
Important Notes:
-
SSH is NOT used for MLNX-OS switches - they use JSON API over HTTP/HTTPS
-
For switches (including Linux switches), password authentication is required as the agent uses these credentials to communicate with the switch for port information retrieval
-
Supported Linux switch operating systems: Cumulus Linux, NVOS (for NVLink and XDR switches)
SSH Configuration
Environment Variables
Configure SSH behavior using these environment variables in /etc/cablevalidation/cvt_env.conf:
[ssh]
# SSH private key file path for HOST devices only
# NOTE: SSH keys are NOT used for switch devices (switches require password authentication)
# Switch passwords are mandatory as agents use them to communicate with switches for port information
# Path must be accessible inside the collector container
CV_SSH_KEY_FILE=
# SSH connection timeout in seconds (default: 20)
# Applied to both SSH commands and SFTP transfers
SSH_CONN_TIMEOUT=20
# Enable automatic SSH key discovery (default: true)
# Only applies to HOST devices when no password is provided
# Searches: SSH agent, ~/.ssh/id_rsa, ~/.ssh/id_dsa, ~/.ssh/id_ecdsa, ~/.ssh/id_ed25519
SSH_LOOK_FOR_KEYS=true
Key Configuration Details
-
CV_SSH_KEY_FILE
-
Only used for HOST devices (switches always require passwords)
-
Must be a container-accessible path
-
Leave empty for automatic key discovery
-
Switch devices cannot use SSH keys due to agent communication requirements
-
-
SSH_CONN_TIMEOUT
-
Connection timeout in seconds
-
Applies to both SSH commands and SFTP transfers
-
Increase for slow networks, decrease for faster failure detection
-
-
SSH_LOOK_FOR_KEYS
-
Only affects HOST devices (not applicable to switches)
-
When enabled, searches standard SSH key locations
-
Only used when no password is provided for hosts
-
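For test tooling, the three settings above can be loaded and checked before a deployment run. The following is a minimal sketch, assuming the [ssh] section of cvt_env.conf can be read as INI text; `SshSettings` and `parse_ssh_settings` are hypothetical names, and this is not necessarily how CVT itself reads the file.

```python
import configparser
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SshSettings:
    key_file: Optional[str]  # CV_SSH_KEY_FILE; None when left empty
    conn_timeout: int        # SSH_CONN_TIMEOUT, in seconds
    look_for_keys: bool      # SSH_LOOK_FOR_KEYS


def parse_ssh_settings(text: str) -> SshSettings:
    """Read the [ssh] section, falling back to the documented defaults."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep variable names case-sensitive
    parser.read_string(text)
    if not parser.has_section("ssh"):
        parser.add_section("ssh")
    ssh = parser["ssh"]

    key_file = ssh.get("CV_SSH_KEY_FILE", fallback="").strip() or None
    conn_timeout = ssh.getint("SSH_CONN_TIMEOUT", fallback=20)
    look_for_keys = ssh.getboolean("SSH_LOOK_FOR_KEYS", fallback=True)

    if conn_timeout <= 0:
        raise ValueError("SSH_CONN_TIMEOUT must be a positive number of seconds")
    return SshSettings(key_file, conn_timeout, look_for_keys)
```

An empty CV_SSH_KEY_FILE is returned as `None`, which matches the "leave empty for automatic key discovery" behavior described above.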
Authentication Methods
1. Password Authentication
-
Supported: All Linux devices (hosts and switches)
-
Configuration: Set credentials using CVT credential management
-
Usage: MANDATORY for all switches (required for agent communication with switch for port information)
-
Usage: Optional for hosts (can use SSH keys instead)
2. SSH Key Authentication
-
Supported: HOST devices only (NOT supported for switches)
-
Configuration: Set CV_SSH_KEY_FILE or enable SSH_LOOK_FOR_KEYS
-
Usage: Available for hosts only
-
Switch Limitation: Switches cannot use SSH keys because deployed agents need password credentials to communicate with the switch OS for retrieving port information
Authentication Priority (for hosts only)
-
SSH key authentication (if key available and no password set)
-
Password authentication (if password provided)
-
Automatic key discovery (if SSH_LOOK_FOR_KEYS=true and no password)
Switch Authentication Requirements
-
All switch types require password authentication
-
SSH keys are not supported for switches
-
Passwords are used by agents for ongoing switch communication
-
Supported switch OS types: Cumulus Linux, NVOS (NVLink/XDR switches), SONiC
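The authentication rules above can be expressed as one selection function, which is useful when writing QA cases for each branch. This is a sketch under assumptions: `select_auth` is a hypothetical helper, and the returned keyword names mirror the parameters of paramiko's `SSHClient.connect` (`password`, `key_filename`, `allow_agent`, `look_for_keys`, `timeout`). Whether CVT passes them in exactly this way is not confirmed by this document.

```python
from typing import Any, Dict, Optional


def select_auth(device_type: str, os_type: str, username: str,
                password: Optional[str] = None,
                key_file: Optional[str] = None,
                look_for_keys: bool = True,
                timeout: int = 20) -> Dict[str, Any]:
    """Return SSH connect arguments following the documented rules."""
    if os_type.upper() == "MLNX-OS":
        raise ValueError("MLNX-OS switches use the JSON API, not SSH")

    kwargs: Dict[str, Any] = {
        "username": username,
        "timeout": timeout,
        "allow_agent": False,
        "look_for_keys": False,
    }

    if device_type == "switch":
        # Switches always need a password; SSH keys are not supported.
        if not password:
            raise ValueError("switches require password authentication")
        kwargs["password"] = password
        return kwargs

    # Hosts: follow the documented priority order.
    if key_file and not password:
        kwargs["key_filename"] = key_file
    elif password:
        kwargs["password"] = password
    elif look_for_keys:
        # Automatic discovery: SSH agent plus the standard ~/.ssh key files.
        kwargs["allow_agent"] = True
        kwargs["look_for_keys"] = True
    else:
        raise ValueError("no usable authentication method for host")
    return kwargs
```

Raising an error for a switch without a password makes the "password is mandatory" rule fail fast, before any connection attempt.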
Agent Deployment Process
Linux Devices Deployment Flow
-
Preparation Phase
-
System generates deployment script from template
-
Script customized with environment-specific values (image URLs, checksums, configuration)
-
Temporary deployment file created for transfer
-
-
File Transfer Phase (SFTP)
-
Secure connection established to target device
-
Deployment script uploaded to /tmp directory on target device
-
Connection closed after successful transfer
-
-
Execution Phase (SSH)
-
SSH connection established for command execution
-
Deployment script executed with elevated privileges
-
Cleanup commands remove temporary files
-
Connection closed after completion
-
-
Validation Phase
-
Deployment results logged and validated
-
Temporary files removed from both systems
-
Success/failure status reported
-
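The four phases above can be sketched as a single function over an abstract transport. Everything named here is hypothetical: the `Transport` interface, the `RecordingTransport` stand-in, the `$`-style template placeholders, and the exact `sudo bash` and `rm -f` commands are assumptions chosen for illustration, not CVT's actual implementation.

```python
import posixpath
from string import Template
from typing import Dict, List, Protocol


class Transport(Protocol):
    """Hypothetical interface; a real one would wrap SFTP and SSH sessions."""
    def upload(self, content: str, remote_path: str) -> None: ...
    def run(self, command: str) -> int: ...


class RecordingTransport:
    """In-memory stand-in that records calls instead of touching a device."""
    def __init__(self, exit_code: int = 0) -> None:
        self.exit_code = exit_code
        self.calls: List[str] = []

    def upload(self, content: str, remote_path: str) -> None:
        self.calls.append(f"upload {remote_path}")

    def run(self, command: str) -> int:
        self.calls.append(f"run {command}")
        return self.exit_code


def deploy(transport: Transport, template: str, values: Dict[str, str],
           script_name: str = "install_agent.sh") -> bool:
    # Preparation: fill the template with environment-specific values.
    script = Template(template).substitute(values)
    remote_path = posixpath.join("/tmp", script_name)
    # File transfer: upload the rendered script to /tmp on the target.
    transport.upload(script, remote_path)
    try:
        # Execution: run the script with elevated privileges.
        exit_code = transport.run(f"sudo bash {remote_path}")
    finally:
        # Cleanup: remove the temporary script even if execution failed.
        transport.run(f"rm -f {remote_path}")
    # Validation: report success or failure to the caller.
    return exit_code == 0
```

Keeping the transport behind an interface lets a test suite check the phase ordering with `RecordingTransport` and no real device.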
Deployment Script Features
The install_agent.sh script handles:
-
Architecture detection (x86_64, aarch64)
-
Docker prerequisite checks
-
Image download with checksum verification
-
Container deployment with appropriate parameters
-
GPU support detection (for specific hardware)
-
LLDP socket mounting (for ethernet monitoring)
-
Comprehensive logging to /var/log/cvt_deployment.log
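To reproduce the checksum verification step independently, for example when a QA run reports a corrupted image, a downloaded file can be hashed and compared with the expected value. The sketch below assumes SHA-256. The algorithm that install_agent.sh actually uses is not stated here, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256",
                chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: Path, expected_hex: str,
                     algorithm: str = "sha256") -> bool:
    """Compare the file's digest with the expected hex string."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```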
Agent Uninstall Process
Linux Devices Uninstall Flow
-
Preparation
-
Generate uninstall script from template (uninstall_agent.sh)
-
Create temporary uninstall file
-
-
File Transfer and Execution
-
Same SFTP upload process as deployment
-
Execute uninstall script with sudo privileges
-
-
Uninstall Operations
-
Container shutdown and removal
-
Docker image cleanup
-
System resource cleanup
-
Credential Management
Setting Credentials
The CVT system supports multiple levels of credential configuration:
Default Credentials
-
Configure default username/password for all switches (password required)
-
Configure default username/password for all hosts (password can be empty if using SSH keys)
-
Applied when no specific credentials are found
Node-Specific Credentials
-
Set unique credentials for individual devices
-
Override default credentials for specific IP addresses
-
For hosts: password can be empty when using SSH key authentication
-
For switches: password is always required
-
Highest priority in credential resolution
Credential Profiles
-
Group devices with common credentials
-
Assign profile names to device groups
-
Manage credentials for multiple devices centrally
-
Same password rules apply: switches require passwords, hosts can use empty passwords with SSH keys
Credential Priority
-
Node-specific credentials
-
Credential profile credentials (if assigned)
-
Default credentials for device type
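The priority order above, together with the rule that switches always need a password, can be sketched as a lookup function. The data structures here (`Credentials`, the per-IP and per-profile dictionaries) are hypothetical and only illustrate the resolution order; CVT's real storage model may differ.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Credentials:
    username: str
    password: str = ""  # may be empty for hosts that use SSH keys


def resolve_credentials(ip: str, device_type: str,
                        node_creds: Dict[str, Credentials],
                        profile_of: Dict[str, str],
                        profiles: Dict[str, Credentials],
                        defaults: Dict[str, Credentials]) -> Credentials:
    """Pick credentials: node-specific, then profile, then type default."""
    if ip in node_creds:
        creds = node_creds[ip]
    elif profile_of.get(ip) in profiles:
        creds = profiles[profile_of[ip]]
    else:
        creds = defaults[device_type]

    if device_type == "switch" and not creds.password:
        raise ValueError(f"{ip}: switches require a password")
    return creds
```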
Troubleshooting
Common SSH Issues
-
Authentication Failures
SSH Authentication failure: please check device credentials
-
Verify credentials in CVT credential management
-
Check if SSH keys are properly configured for hosts
-
Ensure SSH service is running on target device
-
-
Connection Timeouts
Failed to execute commands on node: <IP>: Connection timeout
-
Increase SSH_CONN_TIMEOUT value
-
Check network connectivity to target device
-
Verify firewall rules allow SSH (port 22)
-
-
Permission Denied
Failed to execute commands on node: <IP>: Permission denied
-
Verify sudo access for the user account
-
Check if password is required for sudo
-
Ensure user has Docker access permissions
-
-
File Transfer Failures
Failed to upload deployment file
-
Check SFTP connectivity
-
Verify write permissions to /tmp directory
-
Ensure sufficient disk space on target device
-
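When triaging many log lines, the messages and checks listed above can be turned into a small lookup. This is only a convenience sketch: the marker strings and checks are copied from this section, while the function name and the case-insensitive substring matching are assumptions.

```python
from typing import Dict, List

# Message fragments from this section mapped to the checks listed for them.
REMEDIATION: Dict[str, List[str]] = {
    "SSH Authentication failure": [
        "Verify credentials in CVT credential management",
        "Check if SSH keys are properly configured for hosts",
        "Ensure SSH service is running on target device",
    ],
    "Connection timeout": [
        "Increase SSH_CONN_TIMEOUT",
        "Check network connectivity to target device",
        "Verify firewall rules allow SSH (port 22)",
    ],
    "Permission denied": [
        "Verify sudo access for the user account",
        "Check if password is required for sudo",
        "Ensure user has Docker access permissions",
    ],
    "Failed to upload deployment file": [
        "Check SFTP connectivity",
        "Verify write permissions to /tmp",
        "Ensure sufficient disk space on target device",
    ],
}


def suggest_checks(log_line: str) -> List[str]:
    """Return the checks for the first known marker found in the line."""
    lowered = log_line.lower()
    for marker, checks in REMEDIATION.items():
        if marker.lower() in lowered:
            return checks
    return []
```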
SSH Key Issues
-
Key File Not Found
-
Verify CV_SSH_KEY_FILE path is accessible in container
-
Check file permissions (should be 600 or 400)
-
Ensure key file is mounted into container if using Docker volumes
-
-
Key Format Issues
-
Ensure key is in OpenSSH format (not PuTTY or other formats)
-
Verify key format compatibility with paramiko SSH library
-
Check for proper key file structure and encoding
-
-
Key Permission Problems
-
Verify SSH key file permissions are restrictive (600 or 400)
-
Ensure correct ownership of key files
-
Check that key files are readable by the CVT process
-
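The key file checks above (presence, permissions, readability, format) can be run as a preflight from inside the collector container. The sketch below is a hypothetical helper. It recognizes PuTTY keys by their `PuTTY-User-Key-File-` header and PEM or OpenSSH keys by a `-----BEGIN ... PRIVATE KEY-----` first line, which is a heuristic and not a full parse.

```python
import os
import stat
from pathlib import Path
from typing import List


def check_key_file(path: str) -> List[str]:
    """Return a list of problems found with the key file (empty if none)."""
    key = Path(path)
    if not key.is_file():
        return [f"{path}: not found (is it mounted into the container?)"]

    problems: List[str] = []
    mode = stat.S_IMODE(key.stat().st_mode)
    if mode not in (0o600, 0o400):
        problems.append(f"{path}: mode {mode:03o}, expected 600 or 400")

    if not os.access(path, os.R_OK):
        problems.append(f"{path}: not readable by the current process")
        return problems

    lines = key.read_text(errors="replace").splitlines()
    header = lines[0].strip() if lines else ""
    if header.startswith("PuTTY-User-Key-File-"):
        problems.append(f"{path}: PuTTY format, convert to OpenSSH format")
    elif not (header.startswith("-----BEGIN ")
              and header.endswith("PRIVATE KEY-----")):
        problems.append(f"{path}: does not look like a PEM/OpenSSH private key")
    return problems
```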
Deployment Script Issues
-
Docker Not Available
docker is not installed, it is required for running the agent
-
Install Docker on target device
-
Ensure Docker service is running
-
Add user to docker group if needed
-
-
Image Download Failures
Failed to fetch the image from server
-
Check network connectivity to image server
-
Verify image URL is accessible
-
Check firewall rules for HTTP/HTTPS traffic
-
-
Checksum Verification Failures
Checksum verification failed!
-
Image may be corrupted during download
-
Network issues during transfer
-
Script will automatically retry download
-
Debugging Steps
-
Enable Debug Logging
-
Check deployment logs: /var/log/cvt_deployment.log on target device
-
Review CVT collector logs for SSH connection details
-
-
Manual SSH Testing
-
Test SSH connectivity with specified timeout values
-
Verify SFTP connectivity for file transfer operations
-
Test with specific SSH keys when configured
-
Validate authentication methods work as expected
-
-
Network Connectivity Testing
-
Verify basic network connectivity to target devices
-
Test SSH port accessibility (default port 22)
-
Check for firewall or network restrictions
-
Validate network latency and timeout settings
-
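A basic reachability probe for the SSH port can be scripted with the standard library, using the same timeout value that SSH_CONN_TIMEOUT would apply. This is a minimal sketch with a hypothetical function name. It only confirms that a TCP connection can be opened and does not test authentication.

```python
import socket


def ssh_port_reachable(host: str, port: int = 22,
                       timeout: float = 20.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

A `False` result for a device that is otherwise pingable usually points at a firewall rule or a stopped SSH service, which matches the checks listed above.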
Best Practices
Security
-
SSH Key Management
-
Use dedicated SSH keys for CVT operations
-
Rotate keys regularly
-
Restrict key access with proper file permissions
-
Consider using SSH agent forwarding in containers
-
-
Credential Security
-
Use strong passwords
-
Implement credential rotation policies
-
Use credential profiles for device groups
-
Store credentials securely (CVT encrypts stored credentials)
-
-
Network Security
-
Use SSH key authentication when possible
-
Implement network segmentation
-
Configure firewall rules appropriately
-
Consider using SSH jump hosts for isolated networks
-
Performance
-
Connection Management
-
Adjust SSH_CONN_TIMEOUT based on network conditions
-
Use parallel deployment for multiple devices
-
Monitor deployment worker limits
-
-
Resource Management
-
Ensure sufficient bandwidth for image transfers
-
Monitor disk space on target devices
-
Clean up temporary files after deployment
-
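Parallel deployment with a bounded number of workers can be sketched with a thread pool. The function below is illustrative only: `deploy_many`, the `deploy_one` callback, and the default of 8 workers are assumptions, and CVT's own worker limits are configured elsewhere.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def deploy_many(hosts: Iterable[str],
                deploy_one: Callable[[str], bool],
                max_workers: int = 8) -> Dict[str, bool]:
    """Run deploy_one for each host in parallel and collect the outcomes."""
    results: Dict[str, bool] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(deploy_one, host): host for host in hosts}
        for future in as_completed(futures):
            host = futures[future]
            try:
                results[host] = bool(future.result())
            except Exception:
                # One failing device must not abort the whole batch.
                results[host] = False
    return results
```

The returned mapping can feed directly into the success-rate monitoring recommended in the operational practices.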
Operational
-
Monitoring
-
Monitor deployment success rates
-
Track authentication failures
-
Review deployment logs regularly
-
-
Documentation
-
Maintain inventory of SSH keys and their usage
-
Document credential profiles and their assignments
-
Keep network topology documentation updated
-
Container Considerations
When running CVT in containers:
-
SSH Key Access
-
Mount SSH keys into container using volume mapping
-
Configure CV_SSH_KEY_FILE to point to container-accessible path
-
Verify key file permissions and ownership within container
-
-
Network Access
-
Ensure container can reach target devices
-
Configure network settings for direct device access
-
Verify container networking doesn't block SSH connections
-
-
SSH Agent
-
Forward SSH agent for key-based authentication
-
Configure agent socket mounting for container access
-
Verify agent accessibility within container environment
-
Troubleshooting Agent Deployment Issues on MLNX-OS Switches
If a node returns one of the following error messages during agent deployment, try the corresponding steps:
-
Error:
"% Failed to load image:[/var/opt/tms/images/cables_agent_latest.tar.gz].operation cancelled. check the log for details"
This can indicate that the node does not have enough free space to copy the agent image. Free up space and try again.
-
Error:
"% Rule numbers must be consecutive: cannot have 40 without 39 (chain INPUT)"
This error suggests that the switch is not accepting the IP filter rule needed to permit communication between the collector and the agent on port 8251.
The switch must support IP filter rules and have at least 40 default rules.
CVT checks whether IP filter is enabled. The new rule is added only if IP filter is enabled and "Packet filtering for IPv4 Table filter" is set to enabled.
If IP filter is not enabled, CVT does not add any rule, because all ports are allowed by default.
Check whether IP filter is enabled: show ip filter. To enable it: ip filter enable
If IP filter is enabled but the default rules are missing, add them: ip filter reset-to-default-rules
The new rule being added is: ip filter chain input rule insert 40 target accept dup-delete dest-port 8251 protocol tcp
Note: EDR switches are not currently supported, because they run only older software versions and do not have IP filter rules enabled and defined by default.
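The IP filter decision described above can be written down as a small planning helper for test cases. This is a sketch: the function name and boolean inputs are hypothetical, the command strings are the ones quoted in this section, and the helper only returns the commands without connecting to a switch or parsing `show ip filter` output.

```python
from typing import List

RESET_DEFAULTS = "ip filter reset-to-default-rules"
ALLOW_AGENT_PORT = ("ip filter chain input rule insert 40 target accept "
                    "dup-delete dest-port 8251 protocol tcp")


def plan_ip_filter_commands(filter_enabled: bool,
                            default_rules_present: bool) -> List[str]:
    """List the switch commands expected for a given IP filter state."""
    if not filter_enabled:
        # All ports are allowed by default, so no rule is needed.
        return []
    commands: List[str] = []
    if not default_rules_present:
        # Rule 40 cannot be inserted until the default rules exist.
        commands.append(RESET_DEFAULTS)
    commands.append(ALLOW_AGENT_PORT)
    return commands
```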