AI Evaluation – Safety Specialist

Work from home | Full-time role | Hiring

This description is a summary of our understanding of the job description. Click the 'Apply' button to find out more.

Role Description

At Mercor, we believe the foundation of AI safety is high-quality human data. Models can’t evaluate themselves — they need humans who can apply structured judgment to complex, nuanced outputs.

We’re building a flexible pod of Safety specialists: contributors from both technical and non-technical backgrounds who will serve as expert data annotators. This pod will annotate and evaluate AI behaviors to ensure the systems are safe.

No prior annotation experience is required — instead, we’re looking for people with the ability to make careful, consistent decisions in ambiguous situations.

This role may include reviewing AI outputs that touch on sensitive topics such as bias, misinformation, or harmful behaviors. All work is text-based, and participation in higher-sensitivity projects is optional and supported by clear guidelines and wellness resources.

Qualifications

You bring experience in model evaluation, structured annotation, or applied research.
You are skilled at spotting biases, inconsistencies, or subtle unsafe behaviors that automated systems may miss.
You can explain and defend your reasoning with clarity.
You thrive in a fast-moving, experimental environment where evaluation methods evolve quickly.
Examples of past titles: Machine Learning Research Assistant, AI Evaluator, Data Scientist, Applied Scientist, Research Engineer, AI Safety Fellow, Annotation Specialist, Data Labeling Analyst, AI Ethics Researcher.

Responsibilities

Produce high-quality human data by annotating AI outputs against safety criteria (e.g., bias, misinformation, disallowed content, unsafe reasoning).
Apply harm taxonomies and guidelines consistently, even when tasks are ambiguous.
Document your reasoning to improve guidelines.
Collaborate to provide the human data that powers AI safety research, model improvements, and risk audits.

Benefits

Work at the frontier of AI safety, providing the human data that shapes how advanced systems behave.
Gain experience in a rapidly growing field with direct impact on how labs deploy frontier AI responsibly.
Be part of a team committed to making AI systems safer, more trustworthy, and better aligned with human values.

Company Description

Mercor is a talent marketplace that connects top experts with leading AI labs and research organizations.

Our investors include Benchmark, General Catalyst, Adam D’Angelo, Larry Summers, and Jack Dorsey.

Thousands of professionals across law, engineering, research, and creative fields collaborate with Mercor on frontier AI projects shaping the future.

The pay rate for this role may vary by project, customer, and content category. Compensation will be aligned with the level of expertise required, the sensitivity of the material, and the scope of work for each engagement.
