Senior AI Engineer, GenAI & ML Evaluation Frameworks - Grafana Ops, AI/ML | USA | Remote at Grafana Labs

United States

Compensation: Not Specified
Experience Level: Senior (5 to 8 years)
Job Type: Full Time
Visa: Unknown
Industries: Software

Requirements

  • Experience designing and implementing evaluation frameworks for AI/ML systems
  • Familiarity with prompt engineering, structured output evaluation, and context-window management in LLM systems
  • High autonomy, with the ability to collaborate and translate team goals into clear, testable criteria supported by effective tooling
  • Experience working in environments with rapid iteration and experimental development (bonus point)
  • Pragmatic mindset that values reproducibility, developer experience, and thoughtful trade-offs when scaling GenAI systems (bonus point)
  • Passion for minimizing human toil and building AI systems that actively support engineers (bonus point)

Responsibilities

  • Design and implement robust evaluation frameworks for GenAI and LLM-based systems, including golden test sets, regression tracking, LLM-as-judge methods, and structured output verification
  • Develop tooling to enable automated, low-friction evaluation of model outputs, prompts, and agent behaviors
  • Define and refine metrics for both structure and semantics, ensuring alignment with realistic use cases and operational constraints
  • Lead the development of dataset management processes and guide teams across Grafana in best practices for GenAI evaluation

Skills

Generative AI
Large Language Models (LLMs)
Evaluation Frameworks
Observability
AI/ML

Grafana Labs

Observability and monitoring solutions provider

About Grafana Labs

Grafana Labs specializes in observability and monitoring solutions for cloud infrastructure and applications. Its main product, Grafana, is an open-source metrics dashboard that allows users to visualize and analyze data from various sources. This helps businesses monitor the performance and health of their systems in real time. Grafana Labs serves a wide range of clients, including large enterprises and individual developers, particularly in sectors like technology, finance, healthcare, and retail. Unlike many competitors, Grafana Labs offers both open-source and commercial products, generating revenue through premium features, enterprise support, and managed cloud services. The company's goal is to provide essential tools for monitoring and visualizing data, ensuring that digital services are reliable and efficient.

Headquarters: New York City, New York
Year Founded: 2014
Total Funding: $783.2M
Company Stage: Series D
Industries: Data & Analytics, Enterprise Software
Employees: 1,001-5,000

Benefits

30 days of paid vacation each year on top of national holidays, parental leave, & sick leave
Health coverage
4% contribution match on our 401(k)
$1,500 learning and development stipend
Udemy subscription
Complimentary subscription to Headspace
Discounts on a wide variety of services, including entertainment, food, and fitness
Remote Work Option
Global Employee Assistance Program

Risks

Increased competition from Datadog, Dynatrace, and New Relic pressures Grafana to innovate.
Recent critical security vulnerabilities could affect customer trust and adoption.
Rapid regional expansion may strain resources and impact service quality.

Differentiation

Grafana Labs offers a unique open-source observability platform with extensive customization.
The company provides both self-managed and fully managed observability solutions for diverse needs.
Grafana Labs integrates AI to enhance data analysis and monitoring capabilities.

Upsides

Grafana Labs raised $270M, boosting its valuation to over $6 billion in 2024.
The company surpassed $250M in annual recurring revenue with over 5,000 customers.
Expansion into Southeast Asia shows Grafana's commitment to localized cloud services.
