Research Scientist, Speech Technologies

Posted on April 28, 2025

Job description

About Us:

Hippocratic AI is building a safety-focused large language model (LLM) for the healthcare industry. Our team, composed of former researchers from Microsoft, Meta, NVIDIA, Apple, Stanford, Johns Hopkins, and Hugging Face, is reinventing foundation model training and alignment to create AI-powered conversational agents for real-time patient-AI interactions.

Why Join Our Team:

  • Innovative Mission: We are developing a safe, healthcare-focused large language model (LLM) designed to revolutionize health outcomes on a global scale.
  • Visionary Leadership: Hippocratic AI was co-founded by CEO Munjal Shah, alongside a group of physicians, hospital administrators, healthcare professionals, and artificial intelligence researchers from leading institutions, including El Camino Health, Johns Hopkins, Stanford, Microsoft, Google, and NVIDIA.
  • Strategic Investors: We have raised a total of $278 million in funding, backed by top investors such as Andreessen Horowitz, General Catalyst, Kleiner Perkins, NVIDIA’s NVentures, Premji Invest, SV Angel, and six health systems.
  • World-Class Team: Our team is composed of leading experts in healthcare and artificial intelligence, ensuring our technology is safe, effective, and capable of delivering meaningful improvements to healthcare delivery and outcomes.

Responsibilities:

  • Design, develop, evaluate, and update data-driven models for speech-first applications.
  • Participate in research activities including the application and evaluation of speech technologies in the medical domain.
  • Research and implement state-of-the-art (SOTA) models for conversational speech recognition from zero to one.

Basic Qualifications:

  • PhD with 3+ years of experience in speech recognition or a related field, or a Master's degree with 5+ years of hands-on experience with automatic speech recognition (ASR).
  • Experience designing and developing algorithms for accurate and efficient speech recognition for both streaming and non-streaming use cases.
  • Experience with training, evaluating, and optimizing ASR models for various factors including accuracy, latency, and resource utilization.
  • Experience with preprocessing and curating large speech datasets for training models.
  • Strong programming skills with working knowledge of Python and C++.
  • Comfort working in a Linux/Unix command-line environment.
  • Team player with good communication skills (oral and written).

Preferred Qualifications:

  • Experience with building state-of-the-art ASR solutions, including setting up data pipelines, model architectures, and evaluation pipelines from zero to one.
  • Hands-on experience with ESPnet, Kaldi, and PyTorch.
  • Experience with CUDA.
  • Experience leveraging large language models (LLMs) for enhanced speech recognition tasks.
  • Experience with neural/end-to-end (E2E) endpoint modeling.
  • Publications in top-tier journals in the field of speech recognition or NLP.

How to apply

Click the "Apply for this Job" button on the job posting page to submit your application.