Responsibilities:\u003C/h3>\u003Cul style=\"min-height:1.5em\">\u003Cli>\u003Cp style=\"min-height:1.5em\">Own evaluation pipelines — design, build, and automate offline and live evals that keep our speech and multimodal models honest in production.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Harness the data — create tooling for safe, versioned, privacy-aware dataset curation and discovery.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Ship models, not slide decks — partner with research and infra to prototype, train, and deploy state-of-the-art voice models that power \u003Ca target=\"_blank\" rel=\"noopener noreferrer nofollow\" href=\"https://www.sesame.com/research/crossing_the_uncanny_valley_of_voice?utm_source=chatgpt.com\">Sesame’s real-time companion\u003C/a> experience.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Squeeze silicon — scale training and inference for LLM-class workloads; chase latency, throughput, and cost until the graphs flatten.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Wire up monitoring and live evals — surface quality regressions before users or PMs notice.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Move at startup speed — take ideas from whiteboard to production in days, not quarters; leave a clean trail of tests and dashboards behind.\u003Cbr />\u003C/p>\u003C/li>\u003C/ul>\u003Ch3>\u003Cstrong>Required Qualifications:\u003C/strong>\u003C/h3>\u003Cul style=\"min-height:1.5em\">\u003Cli>\u003Cp style=\"min-height:1.5em\">Expert-level PyTorch.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Proven software engineer who loves ML; comfortable writing production code across the stack.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Hands-on experience training or fine-tuning large language or other large-scale models with a variety of techniques.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Evaluation expert — you’ve designed metrics and harnesses that actually predict user happiness.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Deep knowledge of the ML lifecycle: dataset ops, training pipelines, eval frameworks, deployment, and monitoring.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">History of shipping complex projects to production—especially user-facing, online ML systems—despite shifting requirements and surprise roadblocks.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">High agency and the judgment to know when to sprint solo vs. pull in the squad.\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Track record of setting technical direction, driving consensus, and partnering smoothly with product, infra, and research.\u003C/p>\u003C/li>\u003C/ul>\u003Cp style=\"min-height:1.5em\">\u003C/p>\u003Cp style=\"min-height:1.5em\">Sesame is committed to a workplace where everyone feels valued, respected, and empowered. We welcome all qualified applicants, embracing diversity in race, gender, identity, orientation, ability, and more. We provide reasonable accommodations for applicants with disabilities—contact careers@sesame.com for assistance.\u003C/p>\u003Cp style=\"min-height:1.5em\">\u003C/p>\u003Cp style=\"min-height:1.5em\">\u003Cstrong>Full-time Employee Benefits: \u003C/strong>\u003C/p>\u003Cul style=\"min-height:1.5em\">\u003Cli>\u003Cp style=\"min-height:1.5em\">401k matching\u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">100% employer-paid health, vision, and dental benefits \u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Unlimited PTO and sick time \u003C/p>\u003C/li>\u003Cli>\u003Cp style=\"min-height:1.5em\">Flexible spending account matching (medical FSA) \u003C/p>\u003C/li>\u003C/ul>\u003Cp style=\"min-height:1.5em\">Benefits do not apply to contingent/contract workers \u003C/p>","https://jobs.ashbyhq.com/sesame/5e846cd7-5719-440c-b8ef-2fba501847ae",{"id":136,"name":137,"urlSafeSlug":137,"logo":138},[597,598,599],{"city":28,"region":28,"country":29},{"city":169,"region":170,"country":29},{"city":172,"region":173,"country":29},"2025-09-11T07:17:59.442Z","Candidates must possess expert-level PyTorch skills and be proven software engineers comfortable writing production code across the stack. A strong background in ML is required, including hands-on experience training or fine-tuning large language or other large-scale models using various techniques. Expertise in ML evaluation, including designing metrics and harnesses that predict user happiness, is essential. Deep knowledge of the entire ML lifecycle, encompassing dataset operations, training pipelines, evaluation frameworks, deployment, and monitoring, is also necessary. A history of shipping complex, user-facing, online ML systems to production, even with shifting requirements and unexpected obstacles, is crucial. Candidates should demonstrate high agency, good judgment for independent work versus collaboration, and a track record of setting technical direction, building consensus, and partnering effectively with product, infra, and research teams.","The ML Engineer will be responsible for owning evaluation pipelines, including designing, building, and automating offline and live evaluations to ensure the quality of speech and multimodal models in production. They will also create tooling for safe, versioned, and privacy-aware dataset curation and discovery. A key responsibility is to partner with research and infrastructure teams to prototype, train, and deploy state-of-the-art voice models that power the real-time companion experience. The role involves scaling training and inference for LLM-class workloads, optimizing for latency, throughput, and cost. Additionally, the engineer will implement monitoring and live evaluations to detect quality regressions before they impact users or product managers. The position requires moving at startup speed, taking ideas from concept to production rapidly, and maintaining clean code with comprehensive tests and dashboards.",{"employment":604,"compensation":606,"experience":607,"visaSponsorship":611,"location":612,"skills":613,"industries":621},{"type":605},{"id":181,"name":182,"description":375},{"minAnnualSalary":184,"maxAnnualSalary":185,"currency":186,"details":23},{"experienceLevels":608},[609],{"id":436,"name":437,"description":610},"Build upon established skills and take on more responsibility.",{"type":11},{"type":177},[502,561,614,615,616,617,618,619,620],"Large Language Models","Dataset Curation","Model Deployment","Model Monitoring","Evaluation Metrics","Production Code","Software Engineering",[622],{"id":283,"name":399},["Reactive",624],{"$ssite-config":625},{"env":626,"name":627,"url":628},"production","nuxt-app","https://jobo.world",["Set"],["ShallowReactive",631],{"company-Sesame":-1,"company-jobs-1397a7e1-1afb-42b5-be4a-86a32f80c811-carousel":-1},"/company/Sesame",{}]