Senior Data Engineer, AI Infrastructure
Posted on November 24, 2025
About Arbiter
Arbiter is the AI-powered care orchestration system that unites healthcare. Today, healthcare runs on $100B+ in fragmented point solutions that can't see the full picture. We replace them with a single intelligent system that sits on top of EMRs and existing workflows, unifies clinical, policy, and financial data, then automates the actions that close care gaps, starting with site-of-care optimization.
Backed by one of the largest seed rounds in health tech history and operators who bring the expertise and distribution to scale nationally, we're building the connected infrastructure healthcare should have had all along.
Engineering Culture & Values
We are a high-performing group of engineers dedicated to delivering innovative, high-quality solutions to our clients and business partners. We believe in:
- Engineering Excellence: Taking immense pride in our technical craft and the products we build, treating both with utmost respect and care.
- Impact-Driven Development: Engineering high-quality, fault-tolerant, and highly scalable systems that evolve with business needs while minimizing disruption.
- Collaboration Over Ego: Valuing exceptional work and groundbreaking ideas above individual credit. We seek talented engineers who thrive in a fast-paced environment and ship often to deliver significant impact.
- Continuous Growth: Fostering an environment of continuous learning, mentorship, and professional development, where you can deepen your expertise and grow your career.
Responsibilities
- AI/ML Pipeline Development: Design, develop, and maintain robust, scalable data pipelines specifically for our AI models. This includes data ingestion, cleaning, transformation, classification, and tagging to create high-quality, reliable training and evaluation datasets.
- MLOps & Infrastructure: Build and manage the AI infrastructure to support the full machine learning lifecycle. This includes automating model training, versioning, deployment, and monitoring (CI/CD for ML).
- Embedding & Vector Systems: Architect and operate scalable systems for generating, storing, and serving embeddings. Implement and manage vector databases to power retrieval-augmented generation (RAG) and semantic search for our AI agents.
- AI Platform & Tooling: Champion and build core tooling, frameworks, and standards for the AI/ML platform. Develop systems that enable AI engineers to iterate quickly and self-serve for model development and deployment.
- Cross-Functional Collaboration: Partner closely with AI engineers, product managers, and software engineers to understand their needs. Translate complex model requirements into stable, scalable infrastructure and data solutions.
- Mentorship & Growth: Actively participate in mentoring junior engineers, contributing to our team's growth through technical guidance, code reviews, and knowledge sharing.
- Hiring & Onboarding: Play an active role in interviewing and onboarding new team members, helping to build a world-class data engineering organization.
Minimum Qualifications
- 8+ years of deep, hands-on experience in Data Engineering, MLOps, or AI/ML Infrastructure, ideally within a high-growth tech environment.
- Exceptional expertise in data structures, algorithms, and distributed systems.
- Mastery of Python for large-scale data processing and ML applications.
- Extensive experience designing, building, and optimizing complex, fault-tolerant data pipelines specifically for ML models (e.g., feature engineering, training data generation).
- Profound understanding and hands-on experience with cloud-native data and AI platforms, especially Google Cloud Platform (GCP) (e.g., Vertex AI, BigQuery, Dataflow, GKE).
- Strong experience with containerization (Docker) and orchestration (Kubernetes) for deploying and scaling applications.
- Demonstrated experience with modern ML orchestration (e.g., Kubeflow, Airflow), data transformation (dbt), and MLOps principles.
- Deep knowledge of unit, integration, and functional testing strategies, and the ability to implement them.
- Experience providing technical leadership and guidance, and thinking strategically and analytically to solve problems.
- Strong communication skills and the ability to work well in a diverse team setting.
- Demonstrated experience working with many cross-functional partners.
Preferred Qualifications
- Experience with vector databases (e.g., Pinecone, Elasticsearch) and building embedding generation pipelines.
- Experience with MLOps platforms and tools (e.g., MLflow, Weights & Biases) for experiment tracking and model management.
- Experience with advanced data extraction and correlation techniques, especially from unstructured medical data sources (e.g., PDF charts, clinical notes).
- Familiarity with deep learning frameworks (e.g., TensorFlow, PyTorch).
- Familiarity with data governance, data security, and compliance frameworks (e.g., HIPAA, GDPR) in a highly regulated industry.
Additional Details
This role can be remote, or onsite at our New York City or Boca Raton offices. Either way, you'll work in a fast-paced, collaborative environment where great ideas move quickly from whiteboard to production.
Job Benefits
- Highly Competitive Salary & Equity Package: Designed to rival top FAANG compensation, including meaningful equity.
- Generous Paid Time Off (PTO): To ensure a healthy work-life balance.
- Comprehensive Health, Vision, and Dental Insurance: Robust coverage for you and your family.
- Life and Disability Insurance: Providing financial security.
- SIMPLE IRA Matching: To support your long-term financial goals.
- Professional Development Budget: Support for conferences, courses, and certifications to fuel your continuous learning.
- Wellness Programs: Initiatives to support your physical and mental health.
Salary Range
The annual base salary range for this position is $180,000-$240,000. Actual compensation offered to the successful candidate may vary from the posted hiring range based on work experience, skill level, and other factors.
How to Apply
Please visit the job application page at https://jobs.ashbyhq.com/arbiter-ai/62b37a6e-c2df-4356-b555-e4297668f2c8/application and click on the "Apply for this Job" button to submit your application.