Machine Learning Engineer (NLP) at Tonic

San Francisco, California, United States

Compensation: Not Specified
Experience Level: Junior (1 to 2 years)
Job Type: Full Time
Visa: Unknown
Industries: Information Technology, Artificial Intelligence

Requirements

  • 3+ years of professional experience in applied ML or data science with a focus on NLP
  • Proficiency in Python and deep learning frameworks such as PyTorch and Hugging Face Transformers
  • Hands-on experience with experiment tracking (e.g., Weights & Biases), distributed training (e.g., Accelerate), and model serving (e.g., vLLM)
  • Comfort working independently and iterating quickly
  • Strong communication and collaboration skills
  • Bonus Points: Experience with supervised and reinforcement learning fine-tuning (e.g., TRL); familiarity with data privacy, PII redaction, or healthcare data; a public portfolio, blog, or open-source contributions that demonstrate technical depth and curiosity

Responsibilities

  • Build and ship models. Fine-tune and evaluate transformer-based models (e.g., RoBERTa, Gemma, LLaMA) to support PII redaction, entity extraction, and synthetic data generation
  • Own the ML lifecycle. You’ll own the full path from prototype to production, from dataset curation and experiment tracking to model deployment and monitoring
  • Collaborate cross-functionally. Partner with Product and Design to shape how ML models drive user-facing features, and work with the broader engineering team to integrate them into scalable systems
  • Experiment responsibly. Document your experiments, evaluate results rigorously, and help push the frontier of safe and explainable AI for data privacy

Skills

Python
Deep Learning
PyTorch
Hugging Face Transformers
NLP
Experiment Tracking
Weights & Biases
Distributed Training
Accelerate
Model Serving
vLLM
Data Privacy
PII Redaction
LLMs
RoBERTa
Gemma
LLaMA
Entity Extraction
Synthetic Data Generation

Tonic

Data management solutions for developers and teams

About Tonic

Tonic.ai provides data management solutions aimed at software developers, data scientists, and quality assurance teams. Their platform enables users to de-identify, subset, and synthesize data, which helps protect sensitive information while still making it useful for testing and development. Tonic.ai operates on a subscription-based model, offering various service tiers to accommodate different organizational needs. This approach allows clients, ranging from small startups to large enterprises, to automate data pipelines and generate realistic demo data, ultimately saving time and reducing bugs in software development. Tonic.ai stands out from competitors by seamlessly integrating with both SQL and NoSQL databases, making it a versatile choice for data-driven organizations. The company's goal is to enhance data privacy and streamline data management processes to accelerate software development cycles.

Headquarters: San Francisco, California
Year Founded: 2018
Total Funding: $45.6M
Company Stage: Series B
Industries: Data & Analytics, Enterprise Software
Employees: 51-200

Benefits

Competitive salary and equity
Unlimited paid time off
401k plan with employer contribution
Medical, dental, and vision insurance
One Medical membership
Generous parental leave policy
Remote-friendly work environment

Risks

Competition from CustomGPT.ai threatens Tonic's position in AI-driven data solutions.
Shift towards RAG may require Tonic to adapt its offerings to stay competitive.
Pay-as-you-go model could pressure Tonic's subscription-based business model.

Differentiation

Tonic specializes in synthetic data for privacy-preserving software development and testing.
The company offers tools for database subsetting, de-identification, and data synthesis.
Tonic's platform integrates with SQL and NoSQL databases, enhancing its versatility.

Upsides

Growing interest in synthetic data boosts Tonic's AI development opportunities.
Rising adoption of RAG systems aligns with Tonic's data synthesis capabilities.
Cloud-based solutions drive demand for Tonic's scalable, flexible platforms.
