AI Researcher (Voice)
Tavus, Full Time
Senior (5 to 8 years)
Common questions about this position
The salary range is $150K - $220K.
Yes, this is a remote position.
The role requires expertise in pioneering Latent Space Models (LSMs), building neural audio codecs, developing steerable generative models for speech synthesis, creating embedding systems for latent space factorization, leveraging latent recombination for synthetic data generation, and training multimodal speech-to-speech systems.
This information is not specified in the job description.
Strong candidates will have deep expertise in audio AI, experience with neural codecs, generative models, latent space techniques, and multimodal systems, with a passion for solving fundamental challenges in voice AI at scale.
Speech recognition APIs for audio transcription
Deepgram builds AI speech recognition technology and offers a set of APIs that developers use to transcribe and understand audio content. Its clients range from startups to large organizations such as NASA, and together they process millions of audio minutes daily. The APIs are designed to be fast, accurate, scalable, and cost-effective, which suits businesses that handle large volumes of audio data. Deepgram operates on a pay-per-use model: clients are charged for the amount of audio they transcribe, so the company's revenue grows with client usage.