[Remote] Network DevOps Engineer, RDMA Fabric Automation at Vultr

Remote

Compensation: Not specified
Experience Level: N/A
Job Type: N/A
Visa: Not specified
Industries: N/A

Requirements

  • Solid understanding of modern data center networking: EVPN-VXLAN, BGP, MLAG, QoS, and traffic engineering
  • Deep familiarity with RoCEv2, RDMA transport tuning, ECN/PFC, and lossless Ethernet design
  • Strong experience with automation frameworks like Ansible, and languages like Python, Golang, Rust, or PHP
  • Comfort working with telemetry and monitoring stacks — Prometheus, Grafana, Loki, ELK, or similar
  • Previous experience integrating with NetBox, Nautobot, OpsMill, or similar systems as the source of truth for topology and configuration
  • Familiarity with CI/CD systems (GitHub Actions, Jenkins, ArgoCD) for continuous delivery of network automation
  • Strong Linux networking background, including namespaces, netlink, and system-level debugging

Responsibilities

  • Automate deployment and operations of large-scale RDMA (RoCEv2) Ethernet fabrics across Vultr data centers
  • Build Ansible and Python-based frameworks to provision, validate, and remediate underlay and overlay networks
  • Integrate network automation with Vultr’s source-of-truth systems (NetBox, OpsMill) for intent-driven configuration and validation
  • Develop telemetry ingestion and correlation pipelines (gNMI, Prometheus, Kafka, custom collectors) for real-time network health and performance metrics
  • Collaborate with platform, orchestration, and product engineering teams to optimize RDMA performance, PFC/ECN behavior, and path symmetry across fabrics
  • Implement CI/CD workflows for network configuration changes — validation, pre-checks, and rollbacks
  • Investigate complex network behaviors across layers — flow hashing, congestion domains, ECMP, and overlay interactions
  • Contribute to the design of next-generation GPU and AI interconnect fabrics, ensuring seamless integration into Vultr’s global network architecture

Vultr

Cloud infrastructure provider with global deployment

About Vultr

Vultr provides cloud infrastructure services, specializing in high-performance SSD VPS (Solid State Drive Virtual Private Servers) that can be deployed globally in just 60 seconds. Its services include cloud compute instances, storage solutions, and networking capabilities that let clients deploy and manage resources easily. Unlike many competitors, Vultr operates on a subscription-based model in which clients pay only for the resources they use, making it a cost-effective option for businesses of all sizes. Its customer support team handles over 35,000 requests monthly, and the company aims to simplify cloud computing for developers, startups, and enterprises across more than 150 countries.

Headquarters: West Palm Beach, Florida
Year Founded: 2014
Total Funding: $323.9M
Company Stage: Late VC
Industries: Data & Analytics, Enterprise Software
Employees: 51-200

Benefits

Remote Work Options
401(k) Retirement Plan
401(k) Company Match
Professional Development Budget
Paid Vacation
Paid Sick Leave
Home Office Stipend
Phone/Internet Stipend
Gym Membership

Risks

Increased competition in AI infrastructure could erode Vultr's market share.
Rapid expansion may lead to operational challenges if not managed carefully.
Dependence on partnerships poses risks if strategic misalignments occur.

Differentiation

Vultr offers over 20 data center locations across 5 continents for low latency.
The company provides high-performance cloud infrastructure at a fraction of Big Tech costs.
Vultr's SSD VPS can be deployed globally in just 60 seconds.

Upsides

Vultr secured $3.5B for global expansion in AI infrastructure and cloud computing.
Partnerships with AMD, Broadcom, and Juniper enhance Vultr's AI and cloud capabilities.
Vultr's competitive pricing model attracts businesses seeking cost-effective cloud solutions.
