Job Description: MLOps Engineer
Salary: $140K - $180K
Location Type: Remote
Employment Type: Full-Time
About Reality Defender
Reality Defender provides accurate, multimodal AI-generated media detection solutions to enable enterprises and governments to identify and prevent deepfake-driven fraud in real time. The winner of RSA's 2024 Innovation Sandbox, a Y Combinator graduate, and backed by DCVC, Accenture, IBM, and Booz Allen Hamilton, Reality Defender is the first company to pioneer multimodal and multi-model detection of AI-generated media. Our web app and platform-agnostic API, built by our research-forward team, ensure that our customers can swiftly and securely mitigate fraud and cybersecurity risks in real time with a frictionless, robust solution.
Reality Defender's solutions stand out because they are:
- Proven: Accurate in the real world and continuously engineered to be resilient.
- Multimodal: Detects impersonations in any multimedia format.
- Real Time: Automated alerting of ongoing deepfake attempts.
- Integrated: Flexible deployment options across existing tech stacks and applications.
Responsibilities
- Architect and manage our core MLOps infrastructure for model training, validation, and high-availability inference serving.
- Develop and own our CI/CD/CT (Continuous Integration, Delivery, and Training) pipelines to automate the testing and deployment of ML models.
- Implement comprehensive monitoring and alerting for model performance, data drift, and system health to guarantee production stability and uptime.
- Implement and maintain security best practices throughout the ML lifecycle, including data privacy, access management, and infrastructure hardening, in close collaboration with security and engineering teams.
- Partner closely with the AI and Engineering teams to streamline workflows, remove bottlenecks, and empower them to deliver value faster.
Minimum Qualifications
- BS in Computer Science, a related technical field, or equivalent practical experience.
- 3+ years of professional experience in an MLOps, DevOps, or Software Engineering role with a focus on infrastructure.
- Hands-on experience with at least one major cloud provider (e.g., AWS, GCP, Azure).
- Strong proficiency with containerization and orchestration technologies (e.g., Docker, Kubernetes).
- Demonstrated experience designing and implementing automated CI/CD pipelines from scratch (e.g., using Jenkins or GitHub Actions).
Preferred Qualifications
- MS in Computer Science or a related technical field.
- Proficient in Python, with experience writing scientific software and collaborating in code-centric research environments.
- Deep familiarity with AWS and Terraform: codified VPCs, EKS clusters, IAM least-privilege policies, and multi-account landing zones are second nature to you.
- Comfortable with ML workflow orchestration and metadata tools such as MLflow or Airflow, and experienced in Linux system administration.
- Skilled in configuring monitoring and observability platforms like Weights & Biases or Datadog, with the ability to integrate GPU-level metrics and build real-time dashboards tracking utilization, memory, error rates, drift, and latency across training and inference.
- Strong grasp of the end-to-end machine learning lifecycle, from data ingestion and processing through model training, evaluation, deployment, and monitoring.
- Experience working with human-centered, complex, and often messy datasets, with domain knowledge in social sciences or adjacent fields such as behavioral research, human-computer interaction, or digital media.