AI Evaluation – Safety Specialist

Work from home Full-time role
This description is a summary of our understanding of the job description.

Role Description

At Mercor, we believe the foundation of AI safety is high-quality human data. Models can’t evaluate themselves — they need humans who can apply structured judgment to complex, nuanced outputs.

We’re building a flexible pod of Safety specialists: contributors from both technical and non-technical backgrounds who will serve as expert data annotators. This pod will annotate and evaluate AI behaviors to ensure the systems are safe.

No prior annotation experience is required — instead, we’re looking for people with the ability to make careful, consistent decisions in ambiguous situations.

This role may include reviewing AI outputs that touch on sensitive topics such as bias, misinformation, or harmful behaviors. All work is text-based, and participation in higher-sensitivity projects is optional and supported by clear guidelines and wellness resources.

Qualifications

  • You bring experience in model evaluation, structured annotation, or applied research.
  • You are skilled at spotting biases, inconsistencies, or subtle unsafe behaviors that automated systems may miss.
  • You can explain and defend your reasoning with clarity.
  • You thrive in a fast-moving, experimental environment where evaluation methods evolve quickly.
  • Examples of past titles: Machine Learning Research Assistant, AI Evaluator, Data Scientist, Applied Scientist, Research Engineer, AI Safety Fellow, Annotation Specialist, Data Labeling Analyst, AI Ethics Researcher.

Requirements

  • Produce high-quality human data by annotating AI outputs against safety criteria (e.g., bias, misinformation, disallowed content, unsafe reasoning).
  • Apply harm taxonomies and guidelines consistently, even when tasks are ambiguous.
  • Document your reasoning to improve guidelines.
  • Collaborate to provide the human data that powers AI safety research, model improvements, and risk audits.

Benefits

  • Work at the frontier of AI safety, providing the human data that shapes how advanced systems behave.
  • Gain experience in a rapidly growing field with direct impact on how labs deploy frontier AI responsibly.
  • Be part of a team committed to making AI systems safer, more trustworthy, and better aligned with human values.

Company Description

Mercor is a talent marketplace that connects top experts with leading AI labs and research organizations.

Our investors include Benchmark, General Catalyst, Adam D’Angelo, Larry Summers, and Jack Dorsey.

Thousands of professionals across law, engineering, research, and creative fields collaborate with Mercor on frontier AI projects shaping the future.

The pay rate for this role may vary by project, customer, and content category. Compensation will be aligned with the level of expertise required, the sensitivity of the material, and the scope of work for each engagement.
