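Corruption Benchmarking

Introduction

We provide tools to test object detection and instance segmentation models on the image corruption benchmark defined in Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming. This page is a basic tutorial on how to use the benchmark.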

@article{michaelis2019winter,
  title={Benchmarking Robustness in Object Detection:
    Autonomous Driving when Winter is Coming},
  author={Michaelis, Claudio and Mitzkus, Benjamin and
    Geirhos, Robert and Rusak, Evgenia and
    Bringmann, Oliver and Ecker, Alexander S. and
    Bethge, Matthias and Brendel, Wieland},
  journal={arXiv:1907.07484},
  year={2019}
}

[Figure: image corruption example]

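About the benchmark

To submit results to the benchmark, please visit the benchmark homepage.

The benchmark is modelled after the imagenet-c benchmark, which was originally published in Benchmarking Neural Network Robustness to Common Corruptions and Perturbations (ICLR 2019) by Dan Hendrycks and Thomas Dietterich.

The image corruption functions are included in this library, but they can also be installed separately with: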

pip install imagecorruptions

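Compared to imagenet-c, a few changes had to be made to handle images of arbitrary size and greyscale images. We also modified the "motion blur" and "snow" corruptions to remove the dependency on a Linux-specific library, which would otherwise have to be installed separately. For details, please refer to the imagecorruptions repository.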

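Inference with pretrained models

We provide a testing script to evaluate a model's performance on any combination of the corruptions provided in the benchmark.

Test a dataset

[x] single GPU testing
[ ] multiple GPU testing
[ ] visualize detection results

You can use the following command to test a model's performance under the 15 corruptions used in the benchmark.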

# single-gpu testing
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}]

Alternatively, a specific group of corruptions can be selected.

# noise
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --corruptions noise

# blur
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --corruptions blur

# weather
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --corruptions weather

# digital
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] --corruptions digital
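
To make the group names above concrete, here is a minimal sketch of how they could expand into individual corruptions. The grouping is an assumption based on the imagenet-c categories, and the names are assumed to match the identifiers accepted by `--corruptions`; `CORRUPTION_GROUPS` and `expand_corruptions` are hypothetical helpers for illustration, not part of the test script.

```python
# Assumed grouping, following the imagenet-c categories (3 + 4 + 4 + 4 = 15).
# The identifiers are assumed to match those accepted by --corruptions.
CORRUPTION_GROUPS = {
    "noise": ["gaussian_noise", "shot_noise", "impulse_noise"],
    "blur": ["defocus_blur", "glass_blur", "motion_blur", "zoom_blur"],
    "weather": ["snow", "frost", "fog", "brightness"],
    "digital": ["contrast", "elastic_transform", "pixelate", "jpeg_compression"],
}


def expand_corruptions(selection):
    """Expand group names into corruption names, keeping order and dropping duplicates.

    Names that are not group names are passed through unchanged.
    """
    expanded = []
    for name in selection:
        for corruption in CORRUPTION_GROUPS.get(name, [name]):
            if corruption not in expanded:
                expanded.append(corruption)
    return expanded


# Mixing a group name with an individual corruption name.
print(expand_corruptions(["weather", "gaussian_noise"]))
```

Under this assumption, selecting all four groups yields the same 15 corruptions as running the script without `--corruptions`.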

Or a custom set of corruptions can be selected, e.g.:

# gaussian noise, zoom blur and snow
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions gaussian_noise zoom_blur snow

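Finally, the corruption severities to evaluate can be chosen. Severity 0 corresponds to clean data, and the strength of the corruption increases from severity 1 to 5.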

# severity 1
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --severities 1

# severities 0,2,4
python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --severities 0 2 4
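
When many combinations of corruptions and severities need to be evaluated, it can help to assemble the command line programmatically. The following is a minimal sketch that uses only the flags shown above; `build_command` is a hypothetical helper, and the config and checkpoint paths are placeholders.

```python
import shlex

SCRIPT = "tools/analysis_tools/test_robustness.py"


def build_command(config, checkpoint, corruptions=None, severities=None, out=None):
    """Build the argument list for one robustness test run (hypothetical helper)."""
    cmd = ["python", SCRIPT, config, checkpoint]
    if out is not None:
        cmd += ["--out", out]
    if corruptions:
        cmd += ["--corruptions", *corruptions]
    if severities:
        cmd += ["--severities", *(str(s) for s in severities)]
    return cmd


# Placeholder paths; substitute your own config and checkpoint files.
cmd = build_command(
    "CONFIG_FILE.py",
    "CHECKPOINT_FILE.pth",
    corruptions=["gaussian_noise", "zoom_blur", "snow"],
    severities=[0, 2, 4],
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the evaluation.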

Results for modelzoo models

The results on COCO 2017val are shown in the table below.

Model	Backbone	Style	Lr schd	box AP (clean)	box AP (corrupted)	box %	mask AP (clean)	mask AP (corrupted)	mask %
Faster R-CNN	R-50-FPN	pytorch	1x	36.3	18.2	50.2	-	-	-
Faster R-CNN	R-101-FPN	pytorch	1x	38.5	20.9	54.2	-	-	-
Faster R-CNN	X-101-32x4d-FPN	pytorch	1x	40.1	22.3	55.5	-	-	-
Faster R-CNN	X-101-64x4d-FPN	pytorch	1x	41.3	23.4	56.6	-	-	-
Faster R-CNN	R-50-FPN-DCN	pytorch	1x	40.0	22.4	56.1	-	-	-
Faster R-CNN	X-101-32x4d-FPN-DCN	pytorch	1x	43.4	26.7	61.6	-	-	-
Mask R-CNN	R-50-FPN	pytorch	1x	37.3	18.7	50.1	34.2	16.8	49.1
Mask R-CNN	R-50-FPN-DCN	pytorch	1x	41.1	23.3	56.7	37.2	20.7	55.7
Cascade R-CNN	R-50-FPN	pytorch	1x	40.4	20.1	49.7	-	-	-
Cascade Mask R-CNN	R-50-FPN	pytorch	1x	41.2	20.7	50.2	35.7	17.6	49.3
RetinaNet	R-50-FPN	pytorch	1x	35.6	17.8	50.1	-	-	-
Hybrid Task Cascade	X-101-64x4d-FPN-DCN	pytorch	1x	50.6	32.7	64.7	43.8	28.1	64.0

Results may vary slightly because the corruptions are applied randomly.
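
The box % and mask % columns appear to be the corrupted AP expressed as a percentage of the clean AP. The sketch below recomputes that ratio from the rounded table entries; `relative_performance` is a hypothetical helper. Recomputed values can differ from the table by about 0.1, presumably because the table was derived from unrounded APs.

```python
def relative_performance(clean_ap, corrupted_ap):
    """Return corrupted AP as a percentage of clean AP."""
    if clean_ap <= 0:
        raise ValueError("clean_ap must be positive")
    return 100.0 * corrupted_ap / clean_ap


# (clean box AP, corrupted box AP) copied from the table above.
rows = {
    "Faster R-CNN R-50-FPN": (36.3, 18.2),
    "Mask R-CNN R-50-FPN": (37.3, 18.7),
    "Hybrid Task Cascade X-101-64x4d-FPN-DCN": (50.6, 32.7),
}

for model, (clean, corrupted) in rows.items():
    print(f"{model}: {relative_performance(clean, corrupted):.1f}%")
```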