Former Founder
Roboflow | Full Time
Expert & Leadership (9+ years)
San Francisco, California, United States
Candidates should possess strong experience with deep learning frameworks such as PyTorch, JAX, or TensorFlow, and expertise in compiler development or optimizations related to distributed training and inference workflows. A proven track record of contributing to open-source projects, particularly in machine learning or high-performance computing, is highly valued, along with experience collaborating with external partners.
The AI Performance Optimization Engineer will develop the Thunder compiler, an open-source project built in collaboration with a strategic partner, focusing on performance-oriented model optimizations for distributed training and inference. They will also write optimized kernels in CUDA or Triton, integrate Thunder throughout the PyTorch Lightning ecosystem, engage with the community to champion its growth, and support Thunder's adoption across the industry, working closely with the Lightning team.
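For context on the kind of work described above: Thunder is a trace-based compiler, meaning it records the operations a model performs, rewrites that trace (for example, by fusing operations), and then executes the optimized version. A minimal sketch of that general pipeline in plain Python follows; all names here are illustrative inventions for this sketch, not Thunder's actual API.

```python
# Illustrative sketch of a trace-and-transform pipeline, the general idea
# behind compilers like Thunder. Names (Op, trace, fuse_adds, run) are
# invented for this example and are not Thunder's real API.
from dataclasses import dataclass

@dataclass
class Op:
    name: str       # operation name, e.g. "add" or "mul"
    operand: float  # constant operand applied to the running value

def trace(program):
    """Record the ops a program would perform instead of executing them."""
    tape = []
    class Recorder:
        def add(self, c): tape.append(Op("add", c)); return self
        def mul(self, c): tape.append(Op("mul", c)); return self
    program(Recorder())
    return tape

def fuse_adds(tape):
    """Optimization pass: collapse consecutive adds into a single add."""
    out = []
    for op in tape:
        if out and op.name == "add" and out[-1].name == "add":
            out[-1] = Op("add", out[-1].operand + op.operand)
        else:
            out.append(op)
    return out

def run(tape, x):
    """Interpret the (possibly optimized) trace on a concrete input."""
    for op in tape:
        x = x + op.operand if op.name == "add" else x * op.operand
    return x

def program(v):
    v.add(1.0).add(2.0).mul(3.0)  # computes (x + 1 + 2) * 3

tape = fuse_adds(trace(program))  # three recorded ops fuse down to two
```

Real compilers in this space operate on framework-level graphs and dispatch fused regions to hand-written CUDA or Triton kernels, but the trace, transform, and execute structure is the same.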
AI development platform for coding and deployment
Lightning AI provides a platform for developing artificial intelligence applications, supporting users throughout the entire AI lifecycle, from initial idea to final deployment. The platform is accessible via a web browser, allowing developers and data scientists to code, prototype, and train AI models on GPUs without extensive setup. It operates on a subscription model, offering a cloud-based AI Studio that functions like a virtual laptop with persistent storage and environments. This setup enables users to code on CPUs, debug on GPUs, and scale their projects across multiple nodes. Key features include tools such as PyTorch Lightning, Fabric, Lit-GPT, and torchmetrics, which help optimize and scale AI models. Lightning AI aims to provide a user-friendly, comprehensive solution for both enterprises and individual developers looking to enhance their AI development capabilities.