Data Acquisition Specialist at Twelve Labs

Seoul, South Korea

Compensation: Not Specified

Experience Level: Junior (1 to 2 years), Mid-level (3 to 4 years)

Job Type: Full Time

Visa: Unknown

Industries: Artificial Intelligence, Media, Broadcasting, Security

Requirements

Understanding of the full lifecycle of machine learning data collection, selection, and review
Execution-oriented mindset with the ability to proactively secure data ("If there's no data, I'll find it")
Detail-oriented, careful not to overlook even small errors, with a strong sense of responsibility ("The data I select determines model performance")
Strong communication skills to actively collaborate and communicate clearly with various roles
Experience with simple data preprocessing and automation using Python, SQL, Shell, etc.

Preferred Qualifications

Experience in developing and operating ML training data QA or labeling guidelines
Experience in managing annotation quality for video/multimodal datasets
Experience collecting and processing open-source/web-based public datasets
Experience in QA from the perspective of AI ethics, bias removal, and data responsibility

Responsibilities

Secure and select multimodal data (video, audio, text, etc.) for large-scale multimodal foundation model training
Design and operate evaluation datasets and query sets reflecting real-world use cases from industries such as OTT, IPTV, security control, and advertising
Perform qualitative and quantitative quality evaluations based on accuracy, diversity, and representativeness of collected data
Design and execute data quality management (QA) processes such as noise removal, duplicate filtering, and labeling guide refinement
Lead precise communication and quality improvement loops with data engineers, researchers, annotators, and other collaborators

Skills

Key technologies and capabilities for this role

Python, SQL, Shell, Data Acquisition, Data QA, Multimodal Data, Annotation, Data Preprocessing, Noise Removal, Deduplication, Labeling Guidelines, Bias Mitigation

Questions & Answers

Common questions about this position

What is the work location or arrangement for this role?

The position is hybrid, combining autonomy and collaboration, with offices in San Francisco and Seoul.

What benefits are offered to employees?

Benefits include a MacBook and 700,000 KRW of remote work equipment support (replaced every 3 years), a monthly 600,000 KRW corporate card for meals and transportation, an office snack bar, a 2-week year-end winter break, an annual health checkup, and English education program support.

What skills or experience are required for this role?

Requirements include an understanding of the full lifecycle of machine learning data collection, selection, and review; an execution-oriented mindset for proactively securing data; detail-oriented responsibility; strong communication skills for collaboration; and experience with simple data preprocessing and automation using Python, SQL, Shell, etc.

What is the company culture like at Twelve Labs?

The company emphasizes core values like honesty and reflection towards self and team, perseverance and humility without fearing failure or feedback, continuous learning to elevate team capabilities, and enjoying the process of solving challenging problems together.

What makes a strong candidate for this position?

Strong candidates have a deep understanding of data processes from collection to QA, a proactive mindset to source data independently, meticulous attention to detail with ownership of data quality, excellent communication skills, and hands-on experience with tools like Python, SQL, and Shell. Preferred experience includes ML data QA, multimodal annotation, and open-source dataset handling.

Twelve Labs

AI system for video content understanding

About Twelve Labs

Twelve Labs focuses on artificial intelligence and video understanding by developing a system that analyzes videos to extract key features like actions, objects, and speech. This information is transformed into vector representations, enabling fast semantic search within large video datasets. The company differentiates itself by providing a platform that is faster and more effective than many existing models, allowing developers and product managers to easily integrate its technology through an API. Twelve Labs aims to make all videos searchable, enhancing the way businesses utilize video content.

Headquarters: San Francisco, California

Year Founded: 2021

Total Funding: $104.2M

Company Stage: Early VC

Industries: Enterprise Software, AI & Machine Learning, Education

Employees: 51-200

Risks

Increased competition from emerging AI startups in video understanding.

Rapid AI advancements require continuous innovation, straining resources.

Potential over-reliance on key investors like SK Telecom and Databricks.

Differentiation

Twelve Labs offers a comprehensive AI system for multimodal video understanding.

Their technology transforms video content into vector representations for fast semantic search.

The platform's API allows easy integration into clients' systems with minimal effort.

Upsides

Growing demand for AI-driven video analytics in education, healthcare, and security sectors.

Recent $30 million funding enhances technology development and strategic partnerships.

Collaboration with cloud providers boosts scalability and efficiency of video processing.
