Senior Software Engineer, Vision Language Models
Motional - Full Time
- Junior (1 to 2 years)
Candidates should have significant experience solving hard problems in PyTorch, multimodal data, and distributed systems. Expertise in Python and PyTorch is required, along with practical experience working with the full development pipeline from data processing to training and inference. Experience processing large-scale text data is necessary, and familiarity with interleaved data spanning audio, video, image, and/or text is a bonus. Hands-on experience in developing or benchmarking LLMs, Vision Language Models, Audio Language Models, or generative video models is also essential. Experience in the design and development of annotation tools and synthetic data is good to have.
The Senior Research Engineer will design and develop large-scale annotation efforts for model post-training. They will build tools to evaluate and benchmark multimodal language models and develop large-scale AI training and inference methods. Ensuring efficient implementation of models and systems for data processing and training is crucial. The engineer will also build tools to visualize, evaluate, and filter datasets, collaborate with research and engineering teams across Luma to transfer research to products and services, and implement cutting-edge product prototypes based on multimodal generative AI.
Develops multimodal AI technologies for creativity
Luma AI develops multimodal artificial intelligence technologies that enhance human creativity and capabilities. Its main product, the Dream Machine, allows users to interact with various types of data, enabling creative professionals, businesses, and developers to explore innovative applications of AI. Unlike many competitors, Luma AI focuses on integrating multiple modes of interaction, which broadens the possibilities for users. The company operates on a subscription model, providing access to its AI tools and services, and aims to lead the way in AI-driven creativity and productivity.