Research Engineer - Multimodal Companion Agent at DeepMind

Tokyo, Japan

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Artificial Intelligence, Technology

Requirements

  • Expertise in LLM post-training and evaluation
  • Experience with multimodal domains (vision, audio, text)
  • Proficiency in developing and optimizing multimodal AI models
  • Skills in building and maintaining robust data pipelines for training and evaluation
  • Ability to design, implement, and run experiments using metrics, prompt engineering, and few-shot learning
  • Knowledge of implementing algorithms for analyzing user interactions via vision and audio
  • Experience working closely with research scientists and engineers in a collaborative environment
  • Passion for AI, human-computer interaction, and agentic technologies

Responsibilities

  • Translate research concepts into practical implementations by developing and optimizing multimodal AI models, and building and maintaining robust data pipelines for training and evaluation
  • Design, implement, and run experiments to evaluate the performance and robustness of multimodal companion AI agents, using metrics and techniques like prompt engineering and few-shot learning
  • Implement algorithms to enable the agent to analyze user interactions via vision and audio, providing contextually relevant assistance by voice
  • Work closely with research scientists and engineers, contributing to team discussions, sharing knowledge, and actively participating in code reviews to foster a collaborative environment
  • Proactively identify and address technical challenges, stay updated on the latest AI advancements, and focus on developing solutions that can be effectively integrated into Google products and services, contributing to product impact

Skills

Key technologies and capabilities for this role

LLMs, Multimodal AI, Vision Processing, Audio Processing, Text Processing, LLM Post-Training, LLM Evaluation, AI Agents

Questions & Answers

Common questions about this position

Where is this position located?

The role is based in Tokyo, where you will join the team focused on multimodal companion agents.

What skills are required for this Research Engineer role?

Key skills include expertise in large language models (LLMs), particularly in the multimodal domain (vision, audio, text); LLM post-training and evaluation; and the ability to implement and optimize multimodal research concepts.

What is the team environment like at Google DeepMind?

You will collaborate with a world-class, cross-functional team of researchers and software engineers in a dynamic and collaborative environment, working on cutting-edge research with a focus on safety, ethics, and public benefit.

What does the role involve?

The role focuses on developing state-of-the-art multimodal companion agents powered by Gemini, including implementation and optimization of research concepts across domains like education, health, and gaming.

What is the compensation for this position?

This information is not specified in the job description.

DeepMind

Develops artificial general intelligence systems

About DeepMind

This company leads in the field of artificial general intelligence (AGI), with notable applications across healthcare, energy management, and biotechnology. Their work in early diagnostic tools for eye diseases, optimizing energy usage in major data centers, and groundbreaking contributions to protein structure prediction underlines their commitment to harnessing AI for diverse practical applications. The company's dedication to pushing the boundaries of AI technology not only propels the industry forward but also creates a dynamic and impactful working environment for its employees.

Headquarters: London, United Kingdom
Year Founded: 2010
Total Funding: $4.9M
Company Stage: ACQUISITION
Industries: AI & Machine Learning, Biotechnology
Employees: 1,001-5,000

Benefits

Performance Bonus

Risks

Emerging AI models may challenge DeepMind's current strategies.
Backlash against AI models like Gemini poses reputational risks.
Labeling AI-generated content could increase operational complexity for DeepMind.

Differentiation

DeepMind combines AI, ML, and neuroscience for general-purpose learning algorithms.
DeepMind's AlphaFold model advances protein folding research significantly.
GraphCast by DeepMind offers rapid, accurate ten-day weather forecasts.

Upsides

AI-driven drug discovery is set to grow significantly in 2024.
AlphaCode 2 showcases AI's potential in competitive programming.
DeepMind's AI tools are transforming music creation and meteorology.
