Vision Researcher – Multimodal Understanding & Generation in Foundation Models at Tencent

Bellevue, Washington, United States

Compensation: $149,000 – $279,800
Experience Level: Senior (5 to 8 years), Expert & Leadership (9+ years)
Job Type: Full Time
Visa: Unknown
Industries: Technology, Artificial Intelligence

Requirements

  • Master’s or Ph.D. degree in Computer Science, Artificial Intelligence, Computer Vision, Machine Learning, or a related field
  • Proven multi-modal research experience in relevant areas, with familiarity with state-of-the-art technologies and a strong publication record in top-tier conferences or journals such as CVPR, ICCV, ECCV, NeurIPS, ICLR, or ICML
  • Proficiency with mainstream open-source tools and frameworks relevant to the field, and strong engineering skills to support research implementation; candidates with influential GitHub projects or contributions to high-impact open-source communities are preferred
  • Strong team spirit and ability to collaborate across disciplines, excellent communication skills, intellectual curiosity, and a goal-oriented, problem-solving mindset

Responsibilities

  • Serve as a domain expert in computer vision and collaborate with researchers from other modalities to drive cutting-edge research in native multimodal foundation models, including novel architecture design and modeling for “2D + time” and “3D + time” scenarios
  • Explore the training and design of large models for understanding and generating representations of the physical world, multimodal reasoning, and self-evolving continual learning
  • Stay up to date with the latest advancements in academia and industry; actively participate in international conferences and workshops, and engage with leading global research teams
  • Contribute impactful research outcomes to the open-source community or transfer technologies to internal product teams

Skills

Key technologies and capabilities for this role

Computer Vision, Multimodal Learning, Foundation Models, Machine Learning, Deep Learning, 3D Vision, Video Understanding, Large Models, PyTorch, TensorFlow

Questions & Answers

Common questions about this position

What is the salary range for this Vision Researcher position?

The expected base pay range is $149,000.00 to $279,800.00 per year, with actual pay varying based on job-related knowledge, skills, and experience. Employees may also be eligible for a sign-on payment, relocation package, and restricted stock units on a case-by-case basis.

Where is this position located?

The position is located in Bellevue, Washington, United States.

What qualifications and skills are required for this role?

Candidates need a Master’s or Ph.D. in Computer Science, AI, Computer Vision, Machine Learning, or a related field; proven multi-modal research experience with strong publications in top conferences such as CVPR, ICCV, and NeurIPS; proficiency with open-source tools; strong engineering skills; and excellent collaboration and communication abilities.

What benefits are offered for this position?

Benefits include medical, dental, vision, life and disability coverage, 401(k) plan participation, 15-25 days of vacation depending on tenure, up to 13 holidays, and up to 10 days of paid sick leave per year, subject to plan terms and possible adjustments.

What makes a strong candidate for this Vision Researcher role?

Strong candidates have a Master’s or Ph.D., proven multi-modal research with top-tier publications, proficiency in open-source tools and engineering, influential GitHub contributions, and strong collaboration skills.

Tencent

Internet platform for social, gaming, fintech

About Tencent

Tencent is a technology company that focuses on enhancing the daily lives of internet users and assisting businesses in their digital transformation. It operates in various sectors, including social networking, entertainment, fintech, and cloud computing. Tencent's main products include WeChat, a messaging and mobile payment app with over a billion users, and Tencent Games, which produces popular video games like Honor of Kings and PUBG Mobile. The company generates revenue through online advertising, subscription services, in-app purchases, mobile payments, and cloud services. Unlike many competitors, Tencent has a diverse business model that allows it to serve both individual users and enterprises effectively. The goal of Tencent is to enrich user experiences and support businesses in their digital journeys.

Headquarters: Shenzhen, China
Year Founded: 1998
Total Funding: $31.5M
Company Stage: IPO
Industries: Consumer Software, Enterprise Software, Fintech, AI & Machine Learning, Gaming
Employees: 10,001+

Benefits

Professional Development Budget

Risks

Tencent's addition to the US blacklist may affect its operations and partnerships.
Developing a mobile version of Call of Duty may lead to competitive tensions with Microsoft.
Investment in blockchain exposes Tencent to volatile regulatory environments.

Differentiation

Tencent's WeChat app integrates messaging, social media, and mobile payments seamlessly.
Tencent Games is a global leader with popular titles like Honor of Kings and PUBG Mobile.
Tencent Cloud offers scalable solutions for businesses, enhancing digital transformation efforts.

Upsides

Tencent's investment in blockchain technology could enhance its fintech and cloud services.
The Hunyuan-Large language model advances Tencent's AI capabilities in social networking and gaming.
Collaboration with DYXnet on AI solutions opens new avenues in digital transformation services.
