Solutions Engineer (Media)

Posted on April 23, 2026

Company Overview

We are building Protege to solve the biggest unmet need in AI — getting access to the right training data. The Protege platform facilitates the secure, efficient, and privacy-centric exchange of AI training data.
We are backed by world-class investors and already powering partnerships with ambitious AI teams. Our culture is lean, fast-moving, and built for people who thrive on ambiguity, ownership, and impact.

Role Overview

We are hiring a Solutions Engineer for our media vertical to connect Protege’s media catalog with customer AI data needs. This applied data curation and delivery role focuses on normalizing, validating, and operationalizing partner datasets for downstream AI use cases.
You will become an expert in Protege’s catalog of audio, video, and motion capture content and deliver datasets that meet both technical and conceptual requirements on tight timelines.

What You’ll Do

Own data quality and curate media datasets: work with Sales and Solutions to translate customer requirements into curation strategies, and handle imperfect partner data.
  • Normalize and standardize datasets for reliable downstream use
  • Query and analyze the media catalog using SQL, internal APIs, and metadata tools
  • Build validation checks and workflows to ensure dataset integrity before delivery
  • Use AI tools and embeddings to surface and refine clip-level content
  • Run iterative sample reviews with customers and refine selections to meet specifications

Catalog Expertise & Cross-functional Work

Build deep expertise in the catalog structure, track content coverage and modality mix, and partner with Product and Partnerships to inform sourcing priorities.
Work across product, data, and customer teams to ensure content packaging meets technical, ethical, and licensing requirements, and develop scripts and tools that improve curation efficiency.

What Success Looks Like

30 days: Get operational by building an understanding of the catalog, delivery lifecycle, and core tools.
60 days: Lead dataset sampling and curation for active use cases and surface insights on catalog coverage and metadata quality.
90 days: Create repeatable QA and delivery workflows that increase consistency and speed, and provide feedback that shapes product and sourcing roadmaps.

What You Bring

  • 4-7 years in data science, media analytics, technical curation, or similar hands-on data roles
  • Strong SQL proficiency and experience querying large, messy datasets
  • Experience with media metadata, embeddings, or unstructured content
  • Ability to translate customer or model requirements into concrete dataset specifications
  • High standards for data quality, operational rigor, and clear communication
  • Comfort in ambiguous, fast-moving environments

Nice to Have

  • Familiarity with video/audio processing, embeddings, or multimodal AI workflows
  • Prior experience curating or packaging datasets for machine learning
  • Background in content analysis, recommendation systems, or information retrieval

Working with Protege

We move fast but thoughtfully, value clarity and autonomy, and maintain a kind, direct, and inclusive culture. Everyone is hands-on and focused on impact.

How to Apply

To apply, visit the job application page at: https://jobs.ashbyhq.com/protege/e2d2ef75-9e8c-42fc-9a9e-6a2b0d6593c6/application

Application Instructions

Click the "Apply for this Job" button on the job page and complete the application form. Include your resume and any relevant examples or links to past work.