Data Acquisition Specialist at Twelve Labs

Seoul, South Korea

Compensation: Not Specified
Experience Level: Junior (1 to 2 years), Mid-level (3 to 4 years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence, Media, Broadcasting, Security

Requirements

  • Understanding of the full lifecycle of machine learning data collection, selection, and review
  • Execution-oriented mindset with the ability to proactively secure data ("If there's no data, I'll find it")
  • Attention to detail that catches even small errors, paired with a strong sense of responsibility ("The data I select determines model performance")
  • Strong communication skills to actively collaborate and communicate clearly with various roles
  • Experience with simple data preprocessing and automation using Python, SQL, Shell, etc.

Preferred Qualifications

  • Experience in developing and operating ML training data QA or labeling guidelines
  • Experience in managing annotation quality for video/multimodal datasets
  • Experience collecting and processing open-source/web-based public datasets
  • Experience in QA from the perspective of AI ethics, bias removal, and data responsibility

Responsibilities

  • Secure and select multimodal data (video, audio, text, etc.) for large-scale multimodal foundation model training
  • Design and operate evaluation datasets and query sets reflecting real-world use cases from industries such as OTT, IPTV, security control, and advertising
  • Perform qualitative and quantitative quality evaluations based on accuracy, diversity, and representativeness of collected data
  • Design and execute data quality management (QA) processes such as noise removal, duplicate filtering, and labeling guide refinement
  • Lead precise communication and quality improvement loops with data engineers, researchers, annotators, and other collaborators

Skills

Key technologies and capabilities for this role

Python, SQL, Shell, Data Acquisition, Data QA, Multimodal Data, Annotation, Data Preprocessing, Noise Removal, Deduplication, Labeling Guidelines, Bias Mitigation

Questions & Answers

Common questions about this position

What is the work location or arrangement for this role?

The position uses a hybrid work setup that combines autonomy and collaboration. Twelve Labs has offices in San Francisco and Seoul.

What benefits are offered to employees?

Benefits include a MacBook and 700,000 KRW worth of remote work equipment support with replacement every 3 years, a monthly 600,000 KRW corporate card for meals and transportation, an office snack bar, a 2-week year-end winter break, an annual health checkup, and English education program support.

What skills or experience are required for this role?

Required experience includes understanding the full lifecycle of machine learning data collection, selection, and review; an execution-oriented mindset to secure data; detail-oriented responsibility; strong communication for collaboration; and experience with Python, SQL, Shell for data preprocessing and automation.

What is the company culture like at Twelve Labs?

The company emphasizes core values like honesty and reflection towards self and team, perseverance and humility without fearing failure or feedback, continuous learning to elevate team capabilities, and enjoying the process of solving challenging problems together.

What makes a strong candidate for this position?

Strong candidates have a deep understanding of data processes from collection to QA, a proactive mindset to source data independently, meticulous attention to detail with ownership of data quality, excellent communication skills, and hands-on experience with tools like Python, SQL, and Shell. Preferred experience includes ML data QA, multimodal annotation, and open-source dataset handling.

Twelve Labs

AI system for video content understanding

About Twelve Labs

Twelve Labs focuses on artificial intelligence and video understanding by developing a system that analyzes videos to extract key features like actions, objects, and speech. This information is transformed into vector representations, enabling fast semantic search within large video datasets. The company differentiates itself by providing a platform that is faster and more effective than many existing models, allowing developers and product managers to easily integrate its technology through an API. Twelve Labs aims to make all videos searchable, enhancing the way businesses utilize video content.

Headquarters: San Francisco, California
Year Founded: 2021
Total Funding: $104.2M
Company Stage: Early VC
Industries: Enterprise Software, AI & Machine Learning, Education
Employees: 51-200

Risks

Increased competition from emerging AI startups in video understanding.
Rapid AI advancements require continuous innovation, straining resources.
Potential over-reliance on key investors like SK Telecom and Databricks.

Differentiation

Twelve Labs offers a comprehensive AI system for multimodal video understanding.
Their technology transforms video content into vector representations for fast semantic search.
The platform's API allows easy integration into clients' systems with minimal effort.

Upsides

Growing demand for AI-driven video analytics in education, healthcare, and security sectors.
Recent $30 million funding enhances technology development and strategic partnerships.
Collaboration with cloud providers boosts scalability and efficiency of video processing.
