Image Challenge

SAFE: Image Edit Detection and Localization Challenge 2025

Hosted by: UL Research Institutes — Digital Safety Research Institute (DSRI)
Co-located with the SynRDinBAS: Synthetic Realities and Data in Biometric Analysis and Security Workshop @ WACV 2026

Detecting partially-synthetic images containing subtle manipulations.


Overview

The SAFE: Image Edit Detection and Localization Challenge 2025 focuses on bridging the research gap in detecting partially synthetic images. Participants will compete to create the best-performing synthetic image detectors as measured on novel, private datasets.

Localized changes, such as subtle face edits, background inpainting, or object insertions/removals, are harder to detect and more likely to deceive viewers than fully synthetic content.

Target tasks include:

  • Detection of images generated by state-of-the-art generative models (image/text-to-image, diffusion, etc.)
  • Detection & localization of object additions and removals
  • Identification of inpainted or altered regions
  • Image classification: authentic vs. partially synthetic vs. fully synthetic

By anchoring the challenge in subtle manipulations, we aim to drive innovation in detection methods that move beyond binary detection toward fine-grained reasoning about where and how an image was altered.

Ready to participate?

Register your team to get started. The principal investigator fills out the registration form; the organizers then manually approve the registration and issue your access token.

Evaluations for this challenge will run on the Dyff Platform hosted by UL DSRI.


Updates

2026-01-14

We are releasing a new version of the main challenge task, titled Main Task 1 -- v2, that fixes issues we identified with the original task. The new v2 supersedes the old version. Submissions to v1 of the main task will close at midnight UTC on January 15, 2026. The ID of the new task is 132e0ce09a1145f8af12fd800ae2479e and its canonical name is main-1-v2.

2025-12-08

Main Task 1 is now open for submissions! This is the first main task for this challenge. Results on this task will contribute to team rankings in the challenge.

2025-11-24

Submissions for the Pilot 1 task are now open! See the example submission repo for instructions on preparing a submission.

2025-11-17

The HuggingFace community for the challenge is now open. The HF community includes an example submission repository and a pilot task dataset.

Participation

For the most detailed instructions, refer to the example submission repository on HuggingFace.

Register for the Challenge

The principal investigator of each participating team should fill out the registration form. The challenge organizers must manually approve your registration. After approval, you will receive an access token for the Dyff Platform that will allow you to manage your team information, make submissions, and review non-public results and system logs.

Implement your detector model

Submissions must be in the form of a containerized web service that implements a standard JSON API over HTTP. You will submit a runnable Docker image and, optionally, a volume of data files to be mounted in the running Docker container.

We provide an example submission repository that contains a specification and reference implementation of the required interface as well as step-by-step instructions for creating a new submission. You can use this code as a starting point, but you don't have to.
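As a rough illustration of the general shape of a JSON-over-HTTP detector service, here is a minimal sketch using only the Python standard library. The route `/predict`, the request field `image_b64`, and the response fields `label`, `score`, and `num_bytes` are hypothetical placeholders, not the challenge's interface; the specification in the example submission repository is the authoritative definition.

```python
import base64
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def predict(image_bytes: bytes) -> dict:
    """Placeholder detector that always answers 'Natural' (label 0).

    Replace with real model inference. The response fields are hypothetical.
    """
    return {"label": 0, "score": 0.0, "num_bytes": len(image_bytes)}


def handle_request(body: bytes) -> Tuple[int, dict]:
    """Decode a JSON request body and return (HTTP status, JSON-able response)."""
    try:
        payload = json.loads(body)
        # 'image_b64' is a hypothetical field name used for illustration only.
        image_bytes = base64.b64decode(payload["image_b64"], validate=True)
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected JSON with a base64 'image_b64' field"}
    return 200, predict(image_bytes)


class DetectorHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/predict":  # hypothetical route
            status, response = 404, {"error": "unknown route"}
        else:
            length = int(self.headers.get("Content-Length", 0))
            status, response = handle_request(self.rfile.read(length))
        data = json.dumps(response).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000) -> None:
    """Run the service; in a container this would be the entrypoint."""
    HTTPServer(("0.0.0.0", port), DetectorHandler).serve_forever()
```

Keeping the request handling in a plain function (`handle_request`) separate from the HTTP plumbing makes it possible to test the detector logic locally without starting a server or building the Docker image.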

Submitted systems will run in a virtual machine with access to the following resources:

  • GPU: 1x Nvidia L4
  • CPU: 4
  • Memory: 24Gi

Submit your detector model for evaluation

You can build your submission package yourself and submit it using a CLI tool (preferred), or you can build your submission in a HuggingFace Space and submit the Space using a web form.

Teams will be allowed a limited number of submissions per day.

Datasets

New datasets will be generated specifically for this challenge, with a substantial portion synthetically manipulated at varying granularities and annotated for both detection and localization.

Blind Evaluation

This challenge uses novel, private datasets for evaluation. To maximize the validity of the performance measurements:

  • No training data will be provided. Participants may train on any data they have rights to use.
  • Evaluation data will not be published. Organizers will publish general descriptions of the collection methodology only.
  • Small data samples are provided only to validate submission flow and are not representative of the actual evaluation data content.

Challenge Tasks

This is a script-based competition: your model repository runs on our infrastructure on private data and must output its predictions in a standard format.

Pilot Task: System Testing and Competition Design Validation

The Pilot Task allows participants to test the submission process, system behavior, and evaluation flow before the release of official competition tasks.

It is not intended to reflect the complexity, realism, or diversity of the final datasets.

Goals:

  • Verify submission pipeline (upload, evaluation, logging, etc.)
  • Allow organizers to validate infrastructure and parameters
  • Familiarize teams with submission mechanics before full competition

Main Task 1

This first Main Task evaluates the performance of synthetic image detectors on three capabilities: detection, classification, and localization. The task dataset consists of images in four categories:

  • 0 = Natural -- A natural (un-manipulated) image
  • 1 = FullySynthesized -- The entire image is synthetic
  • 2 = LocallyEdited -- The image was manipulated locally using traditional image processing methods (e.g., in-painting or splicing)
  • 3 = LocallySynthesized -- The image was manipulated locally by synthesizing content

Each image contains manipulations from at most one category.

Detection performance is the detector's ability to detect image manipulation of any kind. We formulate this as a binary classification problem of discriminating Natural images vs. all other classes.
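The mapping from the four category labels to the binary detection target can be sketched as follows. The numeric codes come from the category list above; the Python constant names and the choice of 1 as the "manipulated" class are assumptions made for illustration.

```python
from enum import IntEnum


class Category(IntEnum):
    """The four image categories, using the numeric codes from the task description."""
    NATURAL = 0
    FULLY_SYNTHESIZED = 1
    LOCALLY_EDITED = 2
    LOCALLY_SYNTHESIZED = 3


def to_binary(label: int) -> int:
    """Collapse a 4-way label to the detection target.

    Returns 0 for Natural and 1 for any manipulated class (assumed encoding).
    Raises ValueError for a label outside 0..3.
    """
    return 0 if Category(label) is Category.NATURAL else 1


labels = [0, 1, 2, 3, 0]
binary = [to_binary(y) for y in labels]
print(binary)  # [0, 1, 1, 1, 0]
```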

Classification performance is the detector's ability to classify images by the type of manipulation they exhibit, if any. We formulate this as a multi-class classification problem.

Localization performance is the detector's ability to determine where the image has been manipulated. We formulate this as an object detection task with a single kind of object -- namely, manipulated pixels.
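To make "manipulated pixels" concrete, the sketch below scores a predicted region against a ground-truth region using two common overlap measures, intersection-over-union (IoU) and pixel-level F1. It assumes both regions are given as binary masks (nested lists of 0/1) of the same size; the actual prediction format and scoring code for the challenge are defined by the organizers, and returning 1.0 when both masks are empty is a convention chosen here, not one stated by the challenge.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Count true-positive, false-positive, and false-negative pixels."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def mask_iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of the manipulated-pixel sets."""
    tp, fp, fn = confusion_counts(pred, truth)
    union = tp + fp + fn
    return tp / union if union else 1.0  # both masks empty: assumed perfect


def pixel_f1(pred: Mask, truth: Mask) -> float:
    """Harmonic mean of pixel-level precision and recall."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0  # both masks empty: assumed perfect


truth = [[0, 1, 1],
         [0, 1, 1]]
pred = [[1, 0, 1],
        [0, 1, 1]]
print(mask_iou(pred, truth))  # 0.6  (3 shared pixels / 5 in the union)
print(pixel_f1(pred, truth))  # 0.75 (2*3 / (2*3 + 1 + 1))
```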

For each capability, we define one primary score, which is the score that determines the ranking of challenge submissions in that category. We also define a number of secondary scores that may provide additional insight into detector performance but that do not contribute to rankings in the challenge.

Evaluation

Primary Metrics
Detection (Accuracy, Balanced Accuracy, AUC) and Localization (IoU, pixel-level F1).
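For reference, the three detection metrics named above can be computed for a binary problem as in the following sketch. It uses the standard textbook definitions with 0 = Natural and 1 = manipulated; the 0.5 decision threshold and the example data are assumptions for illustration, and the organizers' scoring code may differ in details such as thresholding or tie handling.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of images whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recall; requires both classes to be present."""
    recalls = []
    for cls in (0, 1):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of samples, which is
    fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1, 1, 1]
scores = [0.1, 0.6, 0.4, 0.7, 0.8, 0.9]          # detector confidence of manipulation
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(balanced_accuracy(y_true, y_pred))  # 0.625
print(roc_auc(y_true, scores))            # 0.875
```

Balanced accuracy is the more informative of the two label-based scores when the classes are imbalanced, since plain accuracy can be inflated by always predicting the majority class.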

Submissions will be evaluated on private data created for this challenge, and the evaluations will be run on private computing infrastructure. Evaluation data will not be released publicly. The organizers will publish a description of the dataset creation methodology.

The competition will maintain both a public leaderboard and a private leaderboard.

Datasets may differ between the two to ensure fair and unbiased evaluation.

Rules

To ensure a fair and rigorous evaluation process for the Synthetic and AI Forensic Evaluations (SAFE) — Synthetic Image Authenticity Challenge, all participants must adhere to the following:

1. Leaderboards

  1. Both public and private leaderboards will be maintained.
  2. The private leaderboard will serve as the basis for final ranking.

2. Submission Limits

Participants will be limited in the number of daily submissions.

3. Confidentiality

  1. Participants agree not to publicly compare results with others until those results are published outside the conference venue.
  2. Participants are free to publish and use their own results independently.

4. Appropriate Use

  1. Use of provided computing resources for any non-challenge-related purposes is prohibited.
  2. Participants should take appropriate precautions to protect their authorization credentials (API tokens, etc.) and report account compromise or misuse to the challenge organizers immediately.

5. Recognition and Awards

Top performers may be eligible for research grants and travel support to WACV for invited teams.

6. Compliance

  1. All rules and guidelines issued by the organizers must be followed.
  2. Failure to comply may result in disqualification or exclusion from future challenges.

By participating in the SAFE Challenge, you agree to uphold these rules and contribute to advancing the field of synthetic image forensics.

Schedule

Event                            | Date                           | Status
Starter project released         | November 17, 2025              | HuggingFace community
Pilot Task Opens                 | November 24, 2025              | Open
Main Task 1 Opens                | December 1 - December 8, 2025  | Open
End of Evaluation Phase          | February 27, 2026              |
Workshop Session & Results @ WACV | March 6–10, 2026              | SynRDinBAS Workshop

Dates subject to change; final timelines will be posted.

Helpful Resources

  • Contact The Organizers
  • Join Our Discord
  • Detailed Instructions & Code Examples
  • Dyff Platform Documentation

Research Grant Winners

Top performers may be eligible for research grants and travel support to WACV for invited teams.

Results Publications

To Be Announced
