Senior Site Reliability Engineer (AWS, AI/ML, & APM)
Granicus
Full Time
Senior (5 to 8 years)
Palo Alto, California, United States
Common questions about this position
This information is not specified in the job description.
The position is hybrid.
Requirements include 5+ years of experience as a Reliability Engineer or in a similar role; strong proficiency in GPU cloud infrastructure and in programming or scripting; experience with container orchestration such as Kubernetes, IaC tools such as Terraform, and observability tools; and strong problem-solving skills.
The role involves working in a fast-paced, rapidly scaling company, collaborating with researchers and engineers, participating in an on-call rotation, and focusing on high-reliability GPU infrastructure for AI/ML.
Strong candidates have 5+ years of reliability engineering experience in fast-paced environments; expertise in GPU cloud infrastructure, Kubernetes, IaC, and observability tools; and, preferably, SRE experience in AI/ML.
Develops multimodal AI technologies for creativity
Luma AI develops multimodal artificial intelligence technologies that enhance human creativity and capabilities. Their main product, the Dream Machine, allows users to interact with various types of data, enabling creative professionals, businesses, and developers to explore innovative applications of AI. Unlike many competitors, Luma AI focuses on integrating multiple modes of interaction, which broadens the possibilities for users. The company operates on a subscription model, providing access to its AI tools and services, and aims to lead the way in AI-driven creativity and productivity.