Staff Data Engineer at The Walt Disney Company

Nicasio, California, United States

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Entertainment, Media

Requirements

  • Master’s Degree with preference for PhD in Data Engineering/Science, Computer Science, Signal Processing, or a related field
  • 8+ years of experience in data engineering or data science with a focus on building pipelines for AI/ML applications
  • Proficiency in Python, with expertise in data manipulation libraries such as Pandas, NumPy, and PyTorch’s data utilities
  • Hands-on experience with audio processing libraries and tools (e.g., Librosa, FFmpeg, SoX) for handling complex audio formats
  • Familiarity with scalable pipeline tools like GitLab, Apache Spark, Airflow, or Luigi, and experience with containerized workflows (Docker, Kubernetes)
  • Strong understanding of data pipeline requirements for model training, retraining, and evaluation in iterative research workflows
  • Experience with immersive and multichannel audio formats
  • Knowledge of cloud-based platforms and tools for storage and processing, such as AWS S3, Redshift, or Google BigQuery
  • Strong problem-solving skills, with a proactive mindset for addressing evolving data challenges

Preferred Qualifications

  • Experience integrating data pipelines with AI/ML workflows, including active learning and model retraining
  • Familiarity with audio-specific datasets and metadata management strategies
  • Knowledge of machine learning principles and how data quality impacts model performance
  • Experience with distributed training pipelines and large-scale dataset processing
  • Contributions to open-source projects or published research in the fields of data science or audio processing
  • Experience with visualization tools (e.g., Tableau, Matplotlib) for quality assurance and exploratory data analysis
  • Expertise in designing systems to support AI/ML model monitoring and retraining over time

Responsibilities

  • Design, implement, and maintain scalable, automated data pipelines for the ingestion, preprocessing, and transformation of large-scale audio datasets
  • Ensure pipelines support efficient model training and retraining workflows, enabling continuous improvement of AI/ML models
  • Collaborate with AI/ML researchers to define data requirements and integrate feedback to improve data pipeline functionality
  • Develop advanced preprocessing techniques for immersive and multichannel audio formats (e.g., Dolby Atmos, higher-order ambisonics)
  • Automate data cleaning, normalization, and augmentation processes to prepare datasets for various model architectures, including foundation models and transformers
  • Integrate external datasets and APIs while ensuring compliance with legal and ethical data usage standards
  • Monitor and optimize pipeline performance to handle complex and dynamic data structures effectively
  • Create tools and workflows for annotating, labeling, and curating datasets, including the use of active learning methods
  • Perform exploratory data analysis to uncover trends, validate dataset quality, and identify data gaps

Skills

Data Pipelines
ETL
Data Ingestion
Data Preprocessing
Data Transformation
Audio Processing
Dolby Atmos
Ambisonics
Data Cleaning
Data Normalization
Data Augmentation
Machine Learning
Transformers
API Integration
Active Learning

The Walt Disney Company

Leading producers & providers of entertainment and information

About The Walt Disney Company

Headquarters: N/A
Year Founded: 1923
Company Stage: N/A
Employees: 10,001+
