AI Researcher (Perception)
Tavus
Full Time
Senior (5 to 8 years)
San Francisco, California, United States
Candidates should possess a Ph.D. or Master’s degree in Computer Science, AI, or a related field, complemented by significant research experience in areas such as video understanding, large language models, domain adaptation, representation learning, or action recognition, ideally with supporting projects, contributions, and publications. Strong proficiency in Python and PyTorch is essential, and experience developing and deploying large-scale ML models in production or optimizing large model training is a plus. Fluency in Korean and English is required.
As a Lead Research Scientist (Finetuning), you will conduct pioneering work in video understanding, multimodal learning, and AI agents: identifying critical research problems, designing innovative solutions, and running effective experiments. You will lead finetuning efforts for video embedding and video language models, working closely with the MLE and Solutions Engineering teams to bring that work into production. You will also develop data strategies, define evaluation methodologies, and communicate your findings clearly to team leads and fellow researchers, contributing to TwelveLabs’ broader research roadmap and to a culture of open communication and dynamic idea exchange.
AI system for video content understanding
Twelve Labs works on artificial intelligence for video understanding. Its system analyzes videos to extract key features such as actions, objects, and speech, then transforms that information into vector representations that enable fast semantic search across large video datasets. The company positions its platform as faster and more effective than many existing models, and developers and product managers can integrate the technology through an API. Twelve Labs aims to make all videos searchable, improving how businesses use their video content.