Research Internship (Fall 2025)
Cohere - Full Time
- Internship
Candidates should be proficient in Python and PyTorch. Research experience or a strong interest in Large Language Models, Vision Language Models, Video Language Models, Video Representation Learning, Video Understanding, or Action Recognition is preferred. A proactive, responsible approach to work is essential, as are effective communication and collaboration skills. Experience with real-world deep learning projects and a record of publications at top-tier AI conferences are also desirable.
The research intern will participate in AI research projects related to the Video Foundation Model, including hypothesis validation and research development. They will design and discuss strategies for data collection and annotation necessary for model training, and regularly communicate with team members, including project leads, to provide feedback on ongoing projects.
AI system for video content understanding
Twelve Labs focuses on artificial intelligence and video understanding by developing a system that analyzes videos to extract key features like actions, objects, and speech. This information is transformed into vector representations, enabling fast semantic search within large video datasets. The company differentiates itself by providing a platform that is faster and more effective than many existing models, allowing developers and product managers to easily integrate its technology through an API. Twelve Labs aims to make all videos searchable, enhancing the way businesses utilize video content.