DetectiveSAMv2

DetectiveSAMv2 is an inference-only image forgery localization bundle built around SAM2. This repo is v2-only: the code path, config sidecar, tests, and demo commands all target the detective_sam_v2 checkpoint.

The full runnable bundle with model weights is hosted on Hugging Face at https://huggingface.co/Gertlek/DetectiveSAMv2.

What is bundled

  • Inference checkpoint configs under checkpoints/
  • SAM2 config under sam2configs/
  • Poster demo pairs under demo/cocoglide/, demo/flux_test/, and demo/qwen_test/
  • A drop-in single-image slot at demo/user_image/demo_input.png

Built-in checkpoint alias:

  • detective_sam_v2

The detective_sam_v2 alias expects checkpoints/detective_sam_v2.pth and the bundled YAML sidecar.

DetectiveSAMv2 Release

This release packages the v2 architecture and benchmarked weights without the old DetectiveSAMv1 compatibility branches.

Main changes:

  • Evidence features built from unadapted target features, raw perturbation-stream features, and adapted target deltas.
  • Spatial cross-attention feature adapters for flexible information sharing across target and perturbation streams.
  • Transformer evidence mask adapter for direct prompt-mask prediction.
  • V2-only adapter implementation in detectivesam_inference/models/adapters.py.
  • Three perturbation streams by default: Gaussian blur, JPEG compression, and Gaussian noise.
  • Training scaled beyond the original SIDA/MagicBrush mix with PicoBanana 10k clean, UltraEdit 10k, and AutoEditForge-generated QWEN/FLUX train data designed to promote generalization to new editors.
  • Validation expanded across trained-on and held-out modern edit benchmarks, including QWEN-Bench, FLUX-Bench, CoCoGLIDE, AutoSplice, and NanoBanana.
  • JSON and YAML checkpoint sidecar support, including training checkpoint containers with model or ema keys.
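
The last item mentions training checkpoint containers that carry weights under model or ema keys. A minimal sketch of how such a container could be unwrapped is shown below. `extract_state_dict` is a hypothetical helper, not a function from detectivesam_inference, and preferring the ema weights when both keys are present is an assumption rather than documented behavior.

```python
def extract_state_dict(container, prefer_ema=True):
    """Pick the weights out of an already-loaded checkpoint object.

    Assumption: a training container is a dict holding the weights under
    "model" and/or "ema"; any other dict is treated as a bare state dict.
    """
    if not isinstance(container, dict):
        raise TypeError("expected a dict-like checkpoint")
    order = ("ema", "model") if prefer_ema else ("model", "ema")
    for key in order:
        if isinstance(container.get(key), dict):
            return container[key]
    return container  # already a plain state dict


# Toy stand-ins for tensors, just to show the selection logic.
weights = extract_state_dict({"model": {"w": 1}, "ema": {"w": 2}})
```

In practice the container would come from `torch.load(path, map_location="cpu")` before being passed in.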

Benchmark Scores

Scores below are IoU / F1 in percent for the 10-epoch v2 checkpoint dsamv2_best6info_10ep_20260623_060047/best_model.pth. Paper baseline numbers are from the Detective SAM ICLR 2026 paper tables for the original Detective SAM model.

| Dataset | Split | DetectiveSAMv2 (IoU / F1) | Paper Detective SAM (IoU / F1) | Delta (IoU / F1) |
|---|---|---|---|---|
| MagicBrush | trained on | 62.73 / 74.38 | 46.48 / 57.55 | +16.25 / +16.83 |
| SIDA | trained on | 55.34 / 65.72 | 54.55 / 65.29 | +0.79 / +0.43 |
| UltraEdit 10k | trained on | 67.03 / 76.71 | 27.74 / 35.54 | +39.29 / +41.17 |
| PicoBanana 10k clean | trained on | 41.17 / 55.75 | n/a | n/a |
| FLUX-Bench | trained on | 46.10 / 58.02 | 18.70 / 21.28 | +27.40 / +36.74 |
| QWEN-Bench | trained on | 46.32 / 57.94 | 20.41 / 22.29 | +25.91 / +35.65 |
| CoCoGLIDE | not trained on | 47.17 / 58.83 | 44.74 / 51.50 | +2.43 / +7.33 |
| AutoSplice | not trained on | 58.39 / 71.40 | 46.90 / 60.30 | +11.49 / +11.10 |
| NanoBanana | not trained on | 33.81 / 46.79 | 19.34 / 20.77 | +14.47 / +26.02 |

PicoBanana 10k clean is the curated training-set split evaluated with the inference package at threshold 0.5; it has no Detective SAM paper baseline. QWEN-Bench and FLUX-Bench are reported as trained-on because the training mix includes corresponding AutoEditForge-generated QWEN and FLUX datasets.
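
To make the reported metrics concrete, here is a minimal sketch of per-image IoU and F1 computed from a probability map binarized at threshold 0.5. It is an illustration only: `iou_and_f1` is a hypothetical name, the use of `>=` at the threshold and the handling of empty masks are assumptions, and this README does not say how the inference package aggregates scores across images.

```python
def iou_and_f1(prob_map, gt_mask, threshold=0.5):
    """IoU and F1 for one image.

    prob_map and gt_mask are nested lists (rows of pixels) of equal shape;
    gt_mask holds 0/1 values, prob_map holds probabilities in [0, 1].
    """
    tp = fp = fn = 0
    for prob_row, gt_row in zip(prob_map, gt_mask):
        for p, g in zip(prob_row, gt_row):
            pred = p >= threshold
            if pred and g:
                tp += 1      # forged pixel correctly flagged
            elif pred:
                fp += 1      # clean pixel flagged as forged
            elif g:
                fn += 1      # forged pixel missed
    union = tp + fp + fn
    if union == 0:
        return 1.0, 1.0      # convention chosen for this sketch only
    return tp / union, 2 * tp / (2 * tp + fp + fn)


iou, f1 = iou_and_f1([[0.9, 0.6], [0.4, 0.1]], [[1, 0], [1, 0]])
```

Multiply by 100 to compare with the percent figures in the table.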

Grouped results using the same checkpoint:

| Group | Weighted IoU / F1 | Simple IoU / F1 |
|---|---|---|
| Trained-on | 54.33 / 66.01 | 53.12 / 64.75 |
| Not-trained-on | 56.54 / 69.39 | 46.46 / 59.01 |
| Overall | 54.61 / 66.45 | 50.90 / 62.84 |

The not-trained-on group follows the requested reporting split and includes NanoBanana, CoCoGLIDE, and AutoSplice.
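
The Simple column can be reproduced as an unweighted mean of the rounded per-dataset DetectiveSAMv2 scores, as the sketch below shows. The helper name `simple_mean` is illustrative, and half-up rounding is an assumption that happens to match the published figures. The Weighted column cannot be recomputed from this README because the per-dataset image counts are not listed.

```python
from decimal import Decimal, ROUND_HALF_UP

# (IoU, F1) in percent, copied from the DetectiveSAMv2 column of the table.
TRAINED_ON = {
    "MagicBrush": ("62.73", "74.38"),
    "SIDA": ("55.34", "65.72"),
    "UltraEdit 10k": ("67.03", "76.71"),
    "PicoBanana 10k clean": ("41.17", "55.75"),
    "FLUX-Bench": ("46.10", "58.02"),
    "QWEN-Bench": ("46.32", "57.94"),
}
NOT_TRAINED_ON = {
    "CoCoGLIDE": ("47.17", "58.83"),
    "AutoSplice": ("58.39", "71.40"),
    "NanoBanana": ("33.81", "46.79"),
}


def simple_mean(scores):
    """Unweighted mean of (IoU, F1) pairs, rounded half-up to two decimals."""
    n = Decimal(len(scores))
    cent = Decimal("0.01")
    iou = sum(Decimal(i) for i, _ in scores.values()) / n
    f1 = sum(Decimal(f) for _, f in scores.values()) / n
    return (iou.quantize(cent, rounding=ROUND_HALF_UP),
            f1.quantize(cent, rounding=ROUND_HALF_UP))


for name, group in [("Trained-on", TRAINED_ON),
                    ("Not-trained-on", NOT_TRAINED_ON),
                    ("Overall", {**TRAINED_ON, **NOT_TRAINED_ON})]:
    iou, f1 = simple_mean(group)
    print(f"{name}: {iou} / {f1}")
```

Decimal arithmetic is used instead of floats so that values such as 53.115 round the same way as in the table.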

Setup

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Then download the large weights from Hugging Face:

pip install -U huggingface_hub
hf download Gertlek/DetectiveSAMv2 \
  checkpoints/detective_sam_v2.pth \
  sam2configs/sam2.1_hiera_base_plus.pt \
  --local-dir .

The expected checkpoint paths are also documented in checkpoints/README.md and sam2configs/README.md.
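
If you prefer to fetch the weights from Python rather than the CLI, the following sketch uses `hf_hub_download` from huggingface_hub with the same repo id and file paths as the command above. `download_weights` and `WEIGHT_FILES` are hypothetical names introduced here, and the function needs network access when called.

```python
# File paths inside the Hugging Face repo, as used in the CLI command above.
WEIGHT_FILES = (
    "checkpoints/detective_sam_v2.pth",
    "sam2configs/sam2.1_hiera_base_plus.pt",
)


def download_weights(local_dir=".", repo_id="Gertlek/DetectiveSAMv2"):
    """Download each weight file into local_dir, keeping its relative path."""
    # Imported lazily so this module loads even without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=repo_id, filename=name, local_dir=local_dir)
        for name in WEIGHT_FILES
    ]
```

Calling `download_weights()` from the repo root should leave the files at the paths the detective_sam_v2 alias expects.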

Hugging Face Usage

For the simplest setup, clone the Hugging Face repo directly:

git lfs install
git clone https://huggingface.co/Gertlek/DetectiveSAMv2
cd DetectiveSAMv2
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python -m detectivesam_inference.predict \
  --output-dir outputs/poster_baseline

Poster Demo Flows

1. Live single-image demo

Place your image at demo/user_image/demo_input.png, then run:

python -m detectivesam_inference.predict \
  --output-dir outputs/poster_user_image

In this mode the CLI reuses the target image as its own source reference so the demo stays runnable with a single image.

2. Bundled CocoGlide example

If demo/user_image/demo_input.png is absent, the default predict command falls back to the bundled CocoGlide sample banana_28809.

python -m detectivesam_inference.predict \
  --output-dir outputs/poster_baseline
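
The default input selection described in flows 1 and 2 can be pictured with the sketch below. It only mirrors the behavior stated in this README and is not the CLI's actual code: `resolve_default_inputs` is a hypothetical function, and the on-disk location of the bundled sample is left out because it is not given here.

```python
from pathlib import Path

USER_IMAGE = Path("demo/user_image/demo_input.png")
FALLBACK_SAMPLE = "banana_28809"  # bundled CocoGlide sample named in this README


def resolve_default_inputs(repo_root="."):
    """Choose between the drop-in user image and the bundled sample."""
    user_image = Path(repo_root) / USER_IMAGE
    if user_image.is_file():
        # Single-image mode: the target doubles as its own source reference.
        return {"mode": "user_image", "source": user_image, "target": user_image}
    return {"mode": "bundled", "sample": FALLBACK_SAMPLE}
```

Removing or renaming demo/user_image/demo_input.png is therefore enough to switch the same command back to the bundled example.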

3. Bundled modern-edit examples

Flux example:

python -m detectivesam_inference.predict \
  --checkpoint detective_sam_v2 \
  --source demo/flux_test/source/548.png \
  --target demo/flux_test/target/548.png \
  --mask demo/flux_test/mask/548.png \
  --output-dir outputs/poster_flux

Qwen example:

python -m detectivesam_inference.predict \
  --checkpoint detective_sam_v2 \
  --source demo/qwen_test/source/166.png \
  --target demo/qwen_test/target/166.png \
  --mask demo/qwen_test/mask/166.png \
  --output-dir outputs/poster_qwen

4. Bundled CocoGlide subset sweep

Use this to evaluate the bundled banana and train CocoGlide demo pairs.

python -m detectivesam_inference.evaluate \
  --checkpoint detective_sam_v2 \
  --dataset-root demo/cocoglide \
  --output-dir outputs/poster_eval_cocoglide \
  --num-visualizations 2

Outputs

Each predict run writes a compact set of visual artifacts plus a JSON summary:

  • <name>_comparison.png
  • <name>_probability.png
  • <name>_pred_mask.png
  • <name>_pred_overlay.png
  • <name>_summary.json

If a ground-truth mask is provided, the run also saves:

  • <name>_gt_mask.png
  • <name>_gt_overlay.png

The evaluate command writes summary.json plus a few visualization examples under visualizations/.
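
To inspect results programmatically, a small sketch like the one below can collect the JSON summaries from an output directory. `load_summaries` is a hypothetical helper, and it makes no assumption about the fields inside each file since the summary schema is not documented here.

```python
import json
from pathlib import Path


def load_summaries(output_dir):
    """Map each summary file name under output_dir to its parsed JSON."""
    out = Path(output_dir)
    # predict writes <name>_summary.json; evaluate writes summary.json.
    paths = sorted(out.glob("*_summary.json")) + sorted(out.glob("summary.json"))
    return {p.name: json.loads(p.read_text(encoding="utf-8")) for p in paths}


if __name__ == "__main__":
    for name, data in load_summaries("outputs/poster_baseline").items():
        print(name, sorted(data) if isinstance(data, dict) else type(data).__name__)
```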

Notes

  • The runtime selects cuda automatically when available and otherwise runs on CPU.
  • Checkpoint settings come from YAML or JSON sidecars in checkpoints/; you only need the alias or checkpoint path.
  • The Hugging Face repo bundles the v2 .pth checkpoint and SAM2 .pt weight file.
  • This repo does not include training code or training-only dependencies.
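
The automatic device selection in the first note typically amounts to a check like the one sketched here. `pick_device` is a hypothetical name and the package's own logic may differ; the fallback to CPU when PyTorch is not importable is added only so the snippet runs anywhere.

```python
def pick_device():
    """Return "cuda" when a CUDA device is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # assumption for this sketch: no PyTorch means CPU
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```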