ML Engineer (Remote, NYC, Austin)
Trunk Tools - Full Time
- Senior (5 to 8 years)
Candidates should have at least 6 years of experience in software engineering or machine learning; for candidates holding a PhD, 3+ years is preferred. They must have experience designing and training large language models or other large-scale generative models, with deep expertise in NLP, sequence modeling, and transformer architectures. Proficiency in Python and ML libraries such as PyTorch or TensorFlow is required, along with strong engineering skills in building scalable ML pipelines. Experience with RL-based fine-tuning and with evaluating generative systems is also necessary, as is the ability to lead technical projects and collaborate effectively across teams. A Bachelor’s degree in Computer Science, Engineering, or a related field is the minimum qualification.
The Machine Learning Engineer will lead the development, training, and deployment of large language and multimodal foundation models specifically tailored to clinical and biomedical domains. They will apply and refine state-of-the-art techniques such as supervised fine-tuning, reinforcement learning-based methods, parameter-efficient fine-tuning, prompt tuning, and retrieval-augmented generation. This role involves collaborating cross-functionally with researchers, clinicians, and engineers to design ML-driven solutions for healthcare, building scalable infrastructure for distributed training of large models, and designing and evaluating models for robustness, bias mitigation, factual consistency, and explainability within healthcare contexts. Furthermore, the engineer will stay current with the latest research in generative AI and contribute back to the community through publications and open-source initiatives.
Healthcare data platform for research analytics
Truveta provides a platform that allows researchers to access and analyze patient data to enhance patient care and study the safety and effectiveness of treatments. The platform, known as Truveta Studio, offers immediate and compliant access to patient-level data, which is sourced from over 30 health systems and includes information from more than 100 million patients across the United States. This data is updated daily and comes from over 800 hospitals and 20,000 clinics. Truveta Studio is designed to simplify the data access process, making it cost-effective for researchers by charging them only for the data and analytics they use. Unlike many competitors, Truveta focuses on providing transparent pricing and efficient access to comprehensive healthcare data. The company's goal is to empower researchers in the healthcare and life sciences sectors to gain valuable insights that can lead to improved patient outcomes.