Atlas AI Training Data Curation Internship: Fueling Knowledge Graph Intelligence at Cognite

Oslo, Oslo, Norway

Compensation: Not Specified
Experience Level: Internship
Job Type: Internship
Visa: Unknown
Industries: Oil & Gas, Chemicals, Pharmaceuticals, Manufacturing, Energy

Requirements

  • Passion for Machine Learning and AI
  • Ability to design and implement strategies for collecting and annotating high-quality training data specific to industrial knowledge graph query generation
  • Experience working with domain experts to ensure data accuracy and relevance
  • Proficiency in Python for developing scripts and tools to automate data cleaning, transformation, and formatting
  • Knowledge of data privacy and compliance standards
  • Familiarity with fine-tuning techniques for pre-trained large language models (LLMs)
  • Experience utilizing cloud AI platforms (Google Cloud AI Platform, AWS SageMaker, Azure Machine Learning) for model training and deployment
  • Skills in monitoring training progress, analyzing model performance metrics, and iterating on strategies
  • Machine Learning skills

Responsibilities

  • Curate and prepare datasets for training and fine-tuning large language models (LLMs) to generate precise queries for industrial knowledge graphs
  • Run curated datasets through evaluation frameworks
  • Design and implement strategies for collecting and annotating high-quality training data
  • Work with domain experts to ensure accuracy and relevance of curated data
  • Develop scripts and tools in Python to automate data cleaning, transformation, and formatting for model training
  • Ensure data privacy and compliance standards are met during curation
  • Experiment with various fine-tuning techniques for pre-trained LLMs on curated datasets
  • Utilize cloud AI platforms (Google Cloud AI Platform, AWS SageMaker, Azure Machine Learning) for model training and deployment
  • Monitor training progress, analyze model performance metrics, and iterate on fine-tuning strategies
  • Provide actionable insights and recommendations to the product team based on data analysis and model performance
  • Author clear documentation on data curation processes, fine-tuning experiments, and evaluation methodologies

Skills

Machine Learning
AI
Large Language Models
Knowledge Graphs
Data Curation
Data Annotation
Industrial Data

Cognite

Industrial data management for asset-heavy industries

About Cognite

Cognite specializes in managing industrial data and facilitating digital transformation for asset-heavy industries like oil and gas, power and utilities, and manufacturing. Its main product, Cognite Data Fusion, integrates and organizes data from various sources, making it easier for businesses to analyze and use this information. This process, known as data contextualization, makes the data more relevant for decision-making. Cognite operates on a software-as-a-service (SaaS) model: clients subscribe to its software, which gives Cognite a consistent revenue stream and gives clients access to ongoing updates. Cognite also offers consulting services to help clients optimize their use of the software. The company's goal is to help industries improve their operations through better data management and digital solutions.

Headquarters: Bærum, Norway
Year Founded: 2016
Total Funding: $219.2M
Company Stage: Series B
Industries: Data & Analytics, Industrial & Manufacturing, Enterprise Software
Employees: 501-1,000

Benefits

Competitive Compensation + 401(k) with employer matching
Health, Dental, Vision & Disability Coverages with premiums fully covered for employees and all dependents
Unlimited PTO + flexibility to enjoy it
Paid Parental Leave Program
Learning & Development Stipends
Global Mobility & Exchange Program
Company Paid Friday Lunch via DoorDash + Fully Stocked Fridges in the offices

Risks

Emerging industrial AI startups pose a threat to Cognite's market share.
Geopolitical tensions could disrupt operations in key regions like the Middle East.
Reliance on cloud providers introduces risks related to data security and outages.

Differentiation

Cognite Data Fusion integrates and contextualizes data for asset-heavy industries.
Cognite offers a subscription-based SaaS model, ensuring continuous software updates.
Cognite's solutions enhance safety, sustainability, and efficiency in industrial operations.

Upsides

Cognite's partnership with Google Cloud enhances scalability and security for data management.
The launch of Cognite Embedded opens new innovation avenues for industrial software builders.
Cognite's joint venture with Saudi Aramco expands its influence in the MENA region.
