Sr Staff R&D Engineer at The Walt Disney Company

Nicasio, California, United States

Compensation: Not Specified
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Entertainment, Media

Requirements

  • MSc or PhD in Computer Science, Electrical Engineering, Applied Math, or a related field with a focus on AI/ML and multi-modal signal processing
  • 5 years of professional experience in applied ML, with a deep focus on audio-centric AI/ML research and deployment
  • Expertise in building and scaling models using PyTorch, with fluency in training, fine-tuning, and inference for deep neural networks
  • Demonstrated experience developing generative models such as VAE, GAN, diffusion models, or neural vocoders (e.g., HiFi-GAN, WaveNet)
  • Deep understanding of audio-specific ML domains, including source separation, speech enhancement, music processing, and cross-modal tasks
  • Experience with MLOps tooling (e.g., Weights & Biases, MLflow, Datachain), Docker-based containerization, and scalable infrastructure for distributed training
  • Fluency in audio signal processing fundamentals and the integration of DSP into ML pipelines
  • Proven ability to contribute to architectural planning, research strategy, and production deployment in complex, multi-stakeholder environments

Preferred Qualifications

  • Familiarity with audio/text/video multi-modal frameworks and cross-domain representations
  • Experience implementing real-time or near-real-time inference pipelines in cloud or edge environments (e.g., AWS, GCP, on-prem GPUs)
  • Working knowledge of latent diffusion audio models (e.g., stable-audio, AudioLDM, AudioGen)
  • Strong knowledge of industry-standard audio datasets and benchmarks (LibriSpeech, VCTK, MUSDB, etc.)

Responsibilities

  • Lead the research, design, and implementation of state-of-the-art machine learning algorithms for speech processing, voice transfer, source separation, and upmixing in media post-production environments
  • Drive the architecture and deployment of scalable model training pipelines using PyTorch and distributed computing frameworks
  • Develop novel generative audio models, including latent diffusion, flow-based models, variational autoencoders, and neural vocoders, optimized for professional soundtrack production
  • Own end-to-end model lifecycle management: pretraining, fine-tuning, validation, inference optimization, and CI/CD integration
  • Guide the development of personalized model adaptation workflows to support per-user tuning, cross-project continuity, and flexible deployment
  • Collaborate with product, platform, and engineering leads to define integration strategies within a secure, cloud-optimized SaaS environment
  • Stay at the forefront of generative audio, multi-modal modeling, and self-supervised learning, translating emerging research into applied innovation
  • Contribute to internal tooling and infrastructure that improves iteration speed, reproducibility, and explainability of deployed models
  • Mentor junior researchers and engineers, and contribute to a culture of rigorous experimentation, collaboration, and continuous improvement

Skills

PyTorch
Machine Learning
Speech Processing
Source Separation
Upmixing
Neural Vocoders
Latent Diffusion Models
Variational Autoencoders
Flow-based Models
Distributed Computing
CI/CD
Generative Audio
Model Training Pipelines

The Walt Disney Company

Leading producers & providers of entertainment and information

About The Walt Disney Company

Headquarters: N/A
Year Founded: 1923
Company Stage: N/A
Employees: 10,001+
