Member of Technical Staff, GPU Optimization at Captions

New York, New York, United States

Compensation: $215,000 – $300,000
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: AI, Video

Requirements

  • Bachelor's degree in Computer Science, Electrical/Computer Engineering, or equivalent practical experience
  • 3+ years of hands-on experience writing and optimizing CUDA kernels for production ML workloads
  • Deep understanding of GPU architecture: memory hierarchies, warp scheduling, tensor cores, register pressure, and occupancy tuning
  • Strong Python skills and familiarity with PyTorch internals, TorchScript, and distributed data-parallel training
  • Proven track record profiling and accelerating large-scale training and inference jobs (e.g., mixed precision, kernel fusion, custom collectives)
  • Comfort working in Linux environments with modern CI/CD, containerization, and cluster managers such as Kubernetes

Preferred Qualifications

  • Advanced degree (MS/PhD) in Computer Science, Electrical/Computer Engineering, or related field
  • Experience with multi-modal AI systems, particularly video generation or computer vision models
  • Familiarity with

Responsibilities

  • Optimize model training and inference pipelines, including data loading, preprocessing, checkpointing, and deployment, for throughput, latency, and memory efficiency on NVIDIA GPUs
  • Design, implement, and benchmark custom CUDA and Triton kernels for performance-critical operations
  • Integrate low-level optimizations into PyTorch-based codebases, including custom ops, low-precision formats, and TorchInductor passes
  • Profile and debug the entire stack—from kernel launches to multi-GPU I/O paths—using Nsight, nvprof, PyTorch Profiler, and custom tools
  • Work closely with colleagues to co-design model architectures and data pipelines that are hardware-friendly and maintain state-of-the-art quality
  • Stay on the cutting edge of GPU and compiler tech (e.g., Hopper features, CUDA Graphs, Triton, FlashAttention, and more) and evaluate their impact
  • Collaborate with infrastructure and backend experts to improve cluster orchestration, scaling strategies, and observability for large experiments
  • Provide clear, data-driven insights and trade-offs between performance, quality, and cost
  • Contribute to a culture of fast iteration, thoughtful profiling, and performance-centric design

Skills

Key technologies and capabilities for this role

CUDA, PyTorch, Triton, GPU Optimization, Distributed Inference, Model Training, Inference Pipelines, Data Loading, Preprocessing, Checkpointing

Questions & Answers

Common questions about this position

What is the salary range for this position?

The salary range is $215K - $300K.

Is this role remote or does it require in-office work?

All roles require you to be in-person at the NYC HQ located in Union Square.

What key skills are required for this GPU Optimization role?

Expertise in CUDA, PyTorch, and generative models is essential, along with experience designing custom CUDA or Triton kernels, optimizing model training and inference pipelines on NVIDIA GPUs, and profiling with tools like Nsight and PyTorch Profiler.

What is the company culture like at Captions?

The company fosters a culture of fast iteration, thought leadership, and outsized impact for early team members, with a rapidly growing team of ambitious engineers, researchers, and others based in NYC.

What makes a strong candidate for this role?

Strong candidates are experts at GPU performance optimization who get excited about squeezing performance from modern GPUs, have deep knowledge of CUDA, PyTorch, Triton, and related tools, and can collaborate to co-design hardware-friendly architectures while staying on the cutting edge of GPU tech.

Captions

Video captioning and translation services

About Captions

Captions.ai enhances video content by providing captioning and translation services tailored for content creators, social media influencers, marketing agencies, and businesses. Their main offerings include automatic subtitle generation, translation into 28 languages, and video compression to improve performance. These tools simplify the video production process, allowing users to produce professional-quality videos with ease. Unlike many competitors, Captions.ai uses a freemium model, offering basic services for free while charging for advanced features, which helps attract a large user base and convert free users into paying customers. The company's goal is to make high-quality video content accessible to a wider audience, and recent funding will support their growth and product development.

Headquarters: New York City, New York
Year Founded: 2021
Total Funding: $82.7M
Company Stage: Series C
Industries: Consumer Software, Entertainment
Employees: 51-200

Benefits

Health Insurance
Dental Insurance
Vision Insurance
401(k) Retirement Plan
401(k) Company Match
Commuter Benefits
Wellness Program
Unlimited Paid Time Off
Flexible Work Hours

Risks

Increased competition from startups like Beeble AI could challenge Captions' market position.
Integration challenges from the AlpacaML acquisition may delay product enhancements.
Rapid expansion may stretch resources, potentially affecting service quality.

Differentiation

Captions offers AI-powered video editing with automatic subtitle generation and language dubbing.
The platform supports video compression for optimized performance and accessibility.
Captions uses a freemium model to attract a wide user base and convert free users to paid plans.

Upsides

Captions secured $60 million in Series C funding, indicating strong investor confidence.
The acquisition of AlpacaML enhances Captions' creative tools with AI rendering capabilities.
Expansion to web and desktop platforms increases accessibility and user engagement.
