Senior Machine Learning Engineer - (Platform)
Coinbase
Full Time
Senior (5 to 8 years)
Candidates should have 8+ years of software engineering experience with distributed systems and 4+ years of hands-on experience building ML platforms. Deep expertise in Python and modern ML infrastructure tools is required, along with proven experience with Kubernetes, containerization, and cloud platforms. A strong background in performance optimization and scalability is also necessary, as is experience with Ray, JupyterHub, MLflow, or similar ML platforms. Technical expertise should cover distributed computing and container tooling such as Ray, Kubernetes, and Docker; ML platforms such as MLflow and Kubeflow; infrastructure tools such as Terraform; languages such as Python and Go; observability tools such as Prometheus and Grafana; and CI/CD tools such as GitLab and Jenkins. Differentiators include contributions to open-source ML infrastructure projects, experience with real-time, high-throughput inference systems, and a background in cybersecurity or threat detection.
The Senior ML Platform Engineer will design and implement enterprise-scale ML infrastructure using Ray, Kubernetes, and cloud-native technologies. They will architect high-performance model serving solutions, build robust, scalable systems for model training, deployment, and monitoring, and lead technical decisions for critical ML platform components. Responsibilities also include developing automated ML pipelines using Airflow and MLflow, implementing monitoring and observability solutions, optimizing resource utilization, and designing fault-tolerant, highly available ML systems. Additionally, they will optimize large-scale distributed systems for maximum throughput, implement advanced memory management strategies, and design and optimize real-time inference systems.
Cloud-native endpoint security solutions provider
CrowdStrike specializes in cybersecurity, focusing on protecting businesses from cyber threats through cloud-native endpoint security solutions. Its main product, the Falcon platform, includes services like Falcon Pro, which replaces traditional antivirus with next-generation antivirus that integrates threat intelligence, Falcon Insight for endpoint detection and response, and Falcon Device Control to manage connected devices. Unlike many competitors, CrowdStrike sells its services by subscription, allowing clients to choose different levels of protection based on their needs. The company serves a diverse clientele, including many Fortune 100 companies, and is recognized as a leader in the cybersecurity field, known for its effectiveness in threat detection and response.