Data Acquisition Specialist at Twelve Labs

Seoul, South Korea

Compensation: Not Specified
Experience Level: Junior (1 to 2 years), Mid-level (3 to 4 years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence, Media, Broadcasting, Security

Requirements

  • Understanding of the full lifecycle of machine learning data collection, selection, and review
  • Execution-oriented mindset with the ability to proactively secure data ("If there's no data, I'll find it")
  • Detail-oriented, careful not to overlook even small errors, with a strong sense of responsibility ("The data I select determines model performance")
  • Strong communication skills, with the ability to collaborate actively and communicate clearly across roles
  • Experience with simple data preprocessing and automation using Python, SQL, Shell, etc.

Preferred Qualifications

  • Experience in developing and operating ML training data QA or labeling guidelines
  • Experience in managing annotation quality for video/multimodal datasets
  • Experience collecting and processing open-source/web-based public datasets
  • Experience in QA from the perspective of AI ethics, bias removal, and data responsibility

Responsibilities

  • Secure and select multimodal data (video, audio, text, etc.) for large-scale multimodal foundation model training
  • Design and operate evaluation datasets and query sets that reflect real-world use cases from industries such as OTT, IPTV, security control, and advertising
  • Perform qualitative and quantitative quality evaluations based on accuracy, diversity, and representativeness of collected data
  • Design and execute data quality management (QA) processes such as noise removal, duplicate filtering, and labeling guide refinement
  • Lead precise communication and quality improvement loops with data engineers, researchers, annotators, and other collaborators

Skills

Python
SQL
Shell
Data Acquisition
Data QA
Multimodal Data
Annotation
Data Preprocessing
Noise Removal
Deduplication
Labeling Guidelines
Bias Mitigation

Twelve Labs

AI system for video content understanding

About Twelve Labs

Twelve Labs focuses on artificial intelligence and video understanding by developing a system that analyzes videos to extract key features like actions, objects, and speech. This information is transformed into vector representations, enabling fast semantic search within large video datasets. The company differentiates itself by providing a platform that is faster and more effective than many existing models, allowing developers and product managers to easily integrate its technology through an API. Twelve Labs aims to make all videos searchable, enhancing the way businesses utilize video content.

Headquarters: San Francisco, California
Year Founded: 2021
Total Funding: $104.2M
Company Stage: Early VC
Industries: Enterprise Software, AI & Machine Learning, Education
Employees: 51-200

Risks

Increased competition from emerging AI startups in video understanding.
Rapid AI advancements require continuous innovation, straining resources.
Potential over-reliance on key investors like SK Telecom and Databricks.

Differentiation

Twelve Labs offers a comprehensive AI system for multimodal video understanding.
Their technology transforms video content into vector representations for fast semantic search.
The platform's API allows easy integration into clients' systems with minimal effort.

Upsides

Growing demand for AI-driven video analytics in education, healthcare, and security sectors.
Recent $30 million funding enhances technology development and strategic partnerships.
Collaboration with cloud providers boosts scalability and efficiency of video processing.
