Senior Datacenter Systems Architect at Sustainable Talent

Hillsboro, Oregon, United States

Compensation: Not Specified
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Technology, Datacenter, HPC

Requirements

  • 8–15+ years in UNIX/Linux systems engineering, system administration, or HPC/compute infrastructure roles
  • Expert-level knowledge of Linux internals (kernel, storage subsystems, networking stack, cgroups, systemd, NUMA, etc.)
  • Proven experience architecting and running large-scale compute clusters or farms (HPC, HCI, GPU clusters, or bare-metal automation environments)
  • Deep understanding of compute, network, and storage architectures end-to-end
  • Demonstrated skill in root-cause analysis at multiple layers, including deep NFSv3/v4 troubleshooting, packet-level analysis, kernel performance tuning, and distributed storage (NetApp, Ceph, Lustre, BeeGFS, etc.)
  • Strong networking fundamentals: TCP/IP, VLANs, BGP, LACP, RoCE/RDMA, NIC offloading
  • Strong automation skills: Python, Bash, Ansible, Terraform, or other IaC tools
  • Experience with PXE provisioning, Kickstart, bare-metal deployments, and OS image pipelines
  • Certifications strongly preferred: UNIX/Linux certs (RHCE, RHCSA, Linux Foundation), Networking certs (CCNP, CCIE, JNCIP, etc.), Storage certs (NetApp NCIE/NCDA or similar)

Responsibilities

  • Architect, scale, and optimize complex UNIX/Linux-based compute clusters, GPU farms, and high-density datacenter systems
  • Own the design and strategy for on-prem HPC/GPU compute environments including OS architecture, distributed storage, network tuning, and interconnects
  • Perform deep-dive troubleshooting across all layers — kernel, network stack, RPC/NFS, storage protocols, firmware, drivers, bootloaders, and orchestration systems
  • Lead automation efforts using Python, Bash, Ansible, and IaC to eliminate manual processes and improve system reliability
  • Drive configuration standards for compute, network, and storage layers across bare-metal systems
  • Collaborate with architects, system software teams, networking teams, and hardware engineering to ensure platform scalability
  • Own operational excellence: uptime, performance tuning, incident response processes, and long-term platform strategy
  • Mentor and technically lead junior engineers and datacenter technicians

Skills

UNIX
Linux
HPC
GPU
kernel
NUMA
distributed storage
network tuning
Ansible
Python
Bash
IaC
bare-metal
NFS
RPC

Sustainable Talent

AI-powered talent recruitment solutions

About Sustainable Talent

Sustainable Talent is dedicated to modernizing the recruitment field, leveraging AI technology for efficient real-time recruiting and staff augmentation, and offering both full-time and contract roles. The company stands out as a bridge for talent in startups and established industry leaders, matching top companies with the best individuals. Its commitment to cutting-edge technology and its focus on sustainable talent retention make it an ideal workplace for those interested in the forefront of HR tech and innovative staffing solutions.

Headquarters: Wayne, NJ 07470, USA
Year Founded: 2009
Company Stage: Unknown
Industries: Consulting, Enterprise Software, Education
Employees: 11-50

Benefits

Health Insurance
Unlimited Paid Time Off
Hybrid Work Options

Risks

Increased competition from AI-driven recruitment platforms.
Remote work trend reduces demand for traditional staffing services.
Economic uncertainty may lead to hiring freezes.

Differentiation

Specializes in remote workforce management consulting services.
Expertise in diverse talent acquisition and inclusive workplace strategies.
Emphasizes commitment to sustainable business practices.

Upsides

Rising demand for remote work optimization services.
Growing emphasis on diversity and inclusion initiatives.
Expansion of the gig economy offers new staffing opportunities.
