Data Engineer – Clips/ML Data at Modal

New York, New York, United States

Compensation: $170,000 – $275,000
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Gaming, Technology, Machine Learning

Requirements

  • 5+ years of experience in data engineering, backend systems, or related roles. Experience with video data or ML infrastructure is a plus
  • Deep knowledge of ETL/ELT pipelines, distributed systems, and streaming data architectures (e.g., Kafka, Spark, Flink)
  • Strong proficiency with Python, Scala, Go, or similar languages used in data-intensive environments
  • Experience with cloud infrastructure (e.g., AWS, GCP) and modern data stack tools (e.g., dbt, Airflow, Parquet, Arrow)
  • Track record of designing systems with extreme scale and performance requirements
  • Deep understanding of data QA methodologies, anomaly detection, and automated testing in production systems
  • Passion for mentorship and team development; able to upskill engineers and advocate for engineering excellence
  • A bias toward ownership, urgency, and a desire to build systems that just work, even at scale

Responsibilities

  • Architect and operate petabyte-scale ingestion pipelines
  • Design automated QA guard-rails (schema validation, anomaly detection, deduplication)
  • Build high-performance ETL and feature-extraction jobs to process and index hundreds of millions of clips into columnar/video-native formats
  • Own the end-to-end data ingestion stack (desktop & mobile recorders, upload services, CDN)
  • Establish real-time monitoring, lineage, and “five-nines” SLAs, driving continuous improvement across storage, compute, and network layers
  • Partner with research and product to curate high-signal data slices and data-health metrics, and to accelerate model experimentation
  • Champion security, privacy, and governance: implement robust RBAC, audit trails, and compliant retention policies for sensitive gameplay footage and user inputs
  • Mentor and uplevel engineers (including internal Medal platform talent), fostering a culture of craftsmanship, documentation, and ruthless focus on data excellence

Skills

ETL
Data Pipelines
Schema Validation
Anomaly Detection
Deduplication
Feature Extraction
Real-time Monitoring
Data Lineage
RBAC
Video Processing
ML Infrastructure

Modal

Employee training and skill development platform

About Modal

Modal Learning focuses on improving employee performance through skill development for businesses. Their main product, the Modal Mastery Platform, uses active learning techniques, including live cohort sessions, labs, and one-on-one coaching, to help employees engage with the material effectively. Unlike competitors, Modal Learning offers a subscription model that provides structured eight-week training programs, aligning skill development with organizational goals. The company's goal is to empower employees and help organizations retain talent by providing clear career development paths.

Headquarters: San Francisco, California
Year Founded: 2021
Total Funding: $30.8M
Company Stage: Early VC
Industries: Consulting, Education
Employees: 51-200

Risks

Competition from established players and emerging startups could dilute Modal's market share.
Focus on data and AI may limit appeal to companies seeking broader skill development.
Economic downturns could reduce corporate spending on employee training, impacting revenue.

Differentiation

Modal offers personalized technical skills training with on-demand coaching.
The platform uses cohort-based learning to enhance engagement and retention.
Modal's strategic skills planning aligns training with business goals.

Upsides

Increased demand for personalized learning experiences boosts Modal's market potential.
Growing emphasis on data and AI skills aligns with Modal's course offerings.
Subscription model provides steady revenue and predictable growth for Modal.
