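Migrating configuration files from MMDetection 2.x to 3.x

Configuration files changed substantially between MMDetection 2.x and 3.x. This document explains how to migrate a 2.x configuration file to 3.x.

The earlier tutorial, Learn about Configs, used Mask R-CNN as an example to introduce the structure of a 3.x configuration file. This document follows the same structure to show how each part of a 2.x configuration maps to 3.x.

Model configuration

The model configuration is largely unchanged from 2.x. The parameters of the model's backbone, neck, and head, as well as train_cfg and test_cfg, are the same as in 2.x.

The main addition in MMDetection 3.x is the DataPreprocessor module, which is configured under model.data_preprocessor. It preprocesses the input data: it normalizes the input images, pads images of different sizes so they can be stacked into a batch, and moves the images from host memory to GPU memory (VRAM). This configuration replaces the Normalize and Pad transforms that earlier versions placed in train_pipeline and test_pipeline.

2.x config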
# Image normalization parameters
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_rgb=True)
pipeline=[
    ...,
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),  # Padding the image to multiples of 32
    ...
]
3.x config
model = dict(
    data_preprocessor=dict(
        type='DetDataPreprocessor',
        # Image normalization parameters
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        bgr_to_rgb=True,
        # Image padding parameters
        pad_mask=True,  # In instance segmentation, the mask needs to be padded
        pad_size_divisor=32)  # Padding the image to multiples of 32
)
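
The mapping between the two snippets above can be sketched as a small function. This is only an illustration: to_data_preprocessor is a hypothetical helper, not part of MMDetection, and it assumes the 2.x normalization settings are available as a plain dict.

```python
def to_data_preprocessor(img_norm_cfg, size_divisor=None, pad_mask=False):
    """Hypothetical helper: build a 3.x data_preprocessor dict from 2.x settings."""
    preprocessor = dict(
        type='DetDataPreprocessor',
        mean=list(img_norm_cfg['mean']),
        std=list(img_norm_cfg['std']),
        # The 2.x `to_rgb` flag corresponds to `bgr_to_rgb` in 3.x
        bgr_to_rgb=img_norm_cfg.get('to_rgb', True))
    if size_divisor is not None:
        # Replaces dict(type='Pad', size_divisor=...) from the 2.x pipeline
        preprocessor['pad_size_divisor'] = size_divisor
    if pad_mask:
        # Needed when the model also consumes instance masks
        preprocessor['pad_mask'] = True
    return preprocessor


img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_rgb=True)
model = dict(
    data_preprocessor=to_data_preprocessor(
        img_norm_cfg, size_divisor=32, pad_mask=True))
print(model['data_preprocessor'])
```

In practice the 3.x dict is usually written by hand; the function only makes the field-by-field correspondence explicit.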

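Dataset and evaluator configuration

The dataset and evaluator configuration changed substantially from 2.x. The migration is described in three parts: dataloader and dataset, data transform pipeline, and evaluator.

Dataloader and dataset configuration

In 3.x, the data loading settings are aligned with PyTorch's official DataLoader, which makes them easier to understand and to get started with. The settings for training, validation, and testing live in train_dataloader, val_dataloader, and test_dataloader respectively, and each dataloader can be given different parameters. The accepted arguments are largely the same as those of the PyTorch DataLoader.

As a result, parameters that could not be configured in 2.x, such as sampler, batch_sampler, and persistent_workers, are now set in the configuration file, giving users more flexibility.

The dataset itself is configured through train_dataloader.dataset, val_dataloader.dataset, and test_dataloader.dataset, which correspond to data.train, data.val, and data.test in 2.x.

2.x config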
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_train2017.json',
        img_prefix=data_root + 'train2017/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline))
3.x config
train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,  # Keep worker processes alive between epochs instead of recreating them
    sampler=dict(type='DefaultSampler', shuffle=True),  # Default sampler, supports both distributed and non-distributed training
    batch_sampler=dict(type='AspectRatioBatchSampler'),  # Default batch_sampler, used to ensure that images in the batch have similar aspect ratios, so as to better utilize graphics memory
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='annotations/instances_train2017.json',
        data_prefix=dict(img='train2017/'),
        filter_cfg=dict(filter_empty_gt=True, min_size=32),
        pipeline=train_pipeline))
# In version 3.x, validation and test dataloaders can be configured independently
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='annotations/instances_val2017.json',
        data_prefix=dict(img='val2017/'),
        test_mode=True,
        pipeline=test_pipeline))
test_dataloader = val_dataloader  # The configuration of the testing dataloader is the same as that of the validation dataloader, which is omitted here
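
The field renames in this comparison (samples_per_gpu to batch_size, workers_per_gpu to num_workers, img_prefix to data_prefix) can be sketched as follows. to_dataloader is a hypothetical helper, and the data_root value 'data/coco/' is an assumption chosen only for the example.

```python
def to_dataloader(data, split, pipeline, data_root, shuffle):
    """Hypothetical helper: build a 3.x dataloader dict from one split of a 2.x `data` dict."""
    old = data[split]

    def relative(path):
        # 3.x passes data_root separately, so drop it from the 2.x paths
        return path[len(data_root):] if path.startswith(data_root) else path

    num_workers = data['workers_per_gpu']
    return dict(
        batch_size=data['samples_per_gpu'],
        num_workers=num_workers,
        # PyTorch only accepts persistent_workers=True when num_workers > 0
        persistent_workers=num_workers > 0,
        sampler=dict(type='DefaultSampler', shuffle=shuffle),
        dataset=dict(
            type=old['type'],
            data_root=data_root,
            ann_file=relative(old['ann_file']),
            data_prefix=dict(img=relative(old['img_prefix'])),
            pipeline=pipeline))


data_root = 'data/coco/'  # assumed location, for illustration only
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='CocoDataset',
        ann_file=data_root + 'annotations/instances_train2017.json',
        img_prefix=data_root + 'train2017/'))
train_dataloader = to_dataloader(
    data, 'train', pipeline=[], data_root=data_root, shuffle=True)
print(train_dataloader['dataset'])
```

The sketch does not add batch_sampler, filter_cfg, or test_mode, and it reuses samples_per_gpu for every split, whereas the 3.x example above sets batch_size=1 for validation. Those settings still have to be chosen per dataloader.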

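Data transform pipeline configuration

As mentioned above, image normalization and padding have been moved out of train_pipeline and test_pipeline and into model.data_preprocessor. The 3.x pipelines therefore no longer need the Normalize and Pad transforms.

The transforms that pack the data have also been refactored: Collect and DefaultFormatBundle are merged into PackDetInputs, which packs the pipeline output into the input format expected by the model. For details on the input format, see the data flow documentation.

The following uses the Mask R-CNN train_pipeline as an example of migrating from a 2.x config to a 3.x config.

2.x config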
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
3.x config
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackDetInputs')
]
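
The changes between the two training pipelines above can be written down as a short conversion routine. migrate_pipeline is a hypothetical helper that covers only the transforms and argument renames shown in this section; it is a sketch, not a general migration tool.

```python
# Transforms whose work moved to model.data_preprocessor or PackDetInputs in 3.x
REMOVED = {'Normalize', 'Pad', 'DefaultFormatBundle', 'Collect'}
# 2.x argument name -> 3.x argument name, per transform type
RENAMED_ARGS = {
    'Resize': {'img_scale': 'scale'},
    'RandomFlip': {'flip_ratio': 'prob'},
}


def migrate_pipeline(pipeline_2x):
    """Hypothetical helper: convert a simple 2.x train pipeline to 3.x."""
    pipeline_3x = []
    for step in pipeline_2x:
        if step['type'] in REMOVED:
            continue
        renames = RENAMED_ARGS.get(step['type'], {})
        pipeline_3x.append({renames.get(k, k): v for k, v in step.items()})
    # A single packing transform replaces DefaultFormatBundle + Collect
    pipeline_3x.append(dict(type='PackDetInputs'))
    return pipeline_3x


img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline_2x = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
train_pipeline = migrate_pipeline(train_pipeline_2x)
for step in train_pipeline:
    print(step)
```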

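For test_pipeline, besides removing the Normalize and Pad transforms, test-time augmentation (TTA) has been separated from the normal testing process and MultiScaleFlipAug has been removed. For how to use the new TTA, see the TTA documentation.

The following again uses Mask R-CNN, this time its test_pipeline, as an example of migrating from a 2.x config to a 3.x config.

2.x config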
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
3.x config
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='Resize', scale=(1333, 800), keep_ratio=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor'))
]
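
The test pipeline change amounts to unwrapping a single-scale, no-flip MultiScaleFlipAug. A minimal sketch of that step follows; migrate_test_pipeline is a hypothetical helper, and it deliberately refuses wrappers that perform real TTA, since those belong to the separate 3.x TTA setup.

```python
META_KEYS = ('img_id', 'img_path', 'ori_shape', 'img_shape', 'scale_factor')
# Inner transforms that are not carried over to the 3.x test pipeline
DROPPED = {'RandomFlip', 'Normalize', 'Pad', 'ImageToTensor', 'Collect'}


def migrate_test_pipeline(pipeline_2x):
    """Hypothetical helper: unwrap a single-scale, no-flip MultiScaleFlipAug."""
    pipeline_3x = []
    for step in pipeline_2x:
        if step['type'] != 'MultiScaleFlipAug':
            pipeline_3x.append(dict(step))
            continue
        if step.get('flip', False) or isinstance(step['img_scale'], list):
            raise ValueError('real TTA: configure it through the 3.x TTA setup')
        for inner in step['transforms']:
            if inner['type'] in DROPPED:
                continue
            inner = dict(inner)
            if inner['type'] == 'Resize':
                # The scale lived on the wrapper in 2.x; in 3.x Resize owns it
                inner['scale'] = step['img_scale']
            pipeline_3x.append(inner)
    pipeline_3x.append(dict(type='PackDetInputs', meta_keys=META_KEYS))
    return pipeline_3x


img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
test_pipeline_2x = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
test_pipeline = migrate_test_pipeline(test_pipeline_2x)
for step in test_pipeline:
    print(step)
```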

Some data augmentation transforms have also been refactored. The following maps the transforms used in 2.x to their 3.x counterparts.

Resize
2.x config
dict(type='Resize',
     img_scale=(1333, 800),
     keep_ratio=True)
3.x config
dict(type='Resize',
     scale=(1333, 800),
     keep_ratio=True)

RandomResize
2.x config
dict(
    type='Resize',
    img_scale=[
        (1333, 640), (1333, 800)],
    multiscale_mode='range',
    keep_ratio=True)
3.x config
dict(
    type='RandomResize',
    scale=[
        (1333, 640), (1333, 800)],
    keep_ratio=True)

RandomChoiceResize
2.x config
dict(
    type='Resize',
    img_scale=[
        (1333, 640), (1333, 672),
        (1333, 704), (1333, 736),
        (1333, 768), (1333, 800)],
    multiscale_mode='value',
    keep_ratio=True)
3.x config
dict(
    type='RandomChoiceResize',
    scales=[
        (1333, 640), (1333, 672),
        (1333, 704), (1333, 736),
        (1333, 768), (1333, 800)],
    keep_ratio=True)

RandomFlip
2.x config
dict(type='RandomFlip', flip_ratio=0.5)
3.x config
dict(type='RandomFlip', prob=0.5)
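
The Resize entries above differ only in how img_scale and multiscale_mode are set, so the choice of the 3.x transform can be sketched as a dispatch on those two fields. migrate_resize is a hypothetical helper; it assumes that a list-valued img_scale without an explicit multiscale_mode means 'range', which is the 2.x default as far as this sketch is concerned.

```python
def migrate_resize(cfg):
    """Hypothetical helper: map a 2.x Resize dict to its 3.x counterpart."""
    cfg = dict(cfg)
    img_scale = cfg.pop('img_scale')
    mode = cfg.pop('multiscale_mode', 'range')  # assumed 2.x default
    if not isinstance(img_scale, list):
        # A single scale stays a plain Resize
        return dict(cfg, type='Resize', scale=img_scale)
    if mode == 'range':
        # Sample a scale between the two given bounds
        return dict(cfg, type='RandomResize', scale=img_scale)
    if mode == 'value':
        # Pick one of the listed scales
        return dict(cfg, type='RandomChoiceResize', scales=img_scale)
    raise ValueError(f'unsupported multiscale_mode: {mode!r}')


fixed = migrate_resize(
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True))
ranged = migrate_resize(
    dict(type='Resize', img_scale=[(1333, 640), (1333, 800)],
         multiscale_mode='range', keep_ratio=True))
choice = migrate_resize(
    dict(type='Resize', img_scale=[(1333, 640), (1333, 672), (1333, 800)],
         multiscale_mode='value', keep_ratio=True))
print(fixed['type'], ranged['type'], choice['type'])
```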

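Evaluator configuration

In 3.x, accuracy evaluation is no longer tied to the dataset; it is performed by an evaluator instead. The evaluator configuration has two parts, val_evaluator and test_evaluator: val_evaluator evaluates on the validation dataset and test_evaluator on the test dataset. Together they correspond to the evaluation field in 2.x.

The following shows how evaluators in 2.x correspond to those in 3.x.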

COCO
2.x config
data = dict(
    val=dict(
        type='CocoDataset',
        ann_file=data_root + 'annotations/instances_val2017.json'))
evaluation = dict(metric=['bbox', 'segm'])
3.x config
val_evaluator = dict(
    type='CocoMetric',
    ann_file=data_root + 'annotations/instances_val2017.json',
    metric=['bbox', 'segm'],
    format_only=False)

Pascal VOC
2.x config
data = dict(
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt'))
evaluation = dict(metric='mAP')
3.x config
val_evaluator = dict(
    type='VOCMetric',
    metric='mAP',
    eval_mode='11points')

OpenImages
2.x config
data = dict(
    val=dict(
        type='OpenImagesDataset',
        ann_file=data_root + 'annotations/validation-annotations-bbox.csv',
        img_prefix=data_root + 'OpenImages/validation/',
        label_file=data_root + 'annotations/class-descriptions-boxable.csv',
        hierarchy_file=data_root +
        'annotations/bbox_labels_600_hierarchy.json',
        meta_file=data_root + 'annotations/validation-image-metas.pkl',
        image_level_ann_file=data_root +
        'annotations/validation-annotations-human-imagelabels-boxable.csv'))
evaluation = dict(interval=1, metric='mAP')
3.x config
val_evaluator = dict(
    type='OpenImagesMetric',
    iou_thrs=0.5,
    ioa_thrs=0.5,
    use_group_of=True,
    get_supercategory=True)

CityScapes
2.x config
data = dict(
    val=dict(
        type='CityScapesDataset',
        ann_file=data_root +
        'annotations/instancesonly_filtered_gtFine_val.json',
        img_prefix=data_root + 'leftImg8bit/val/',
        pipeline=test_pipeline))
evaluation = dict(metric=['bbox', 'segm'])
3.x config
val_evaluator = [
    dict(
        type='CocoMetric',
        ann_file=data_root +
        'annotations/instancesonly_filtered_gtFine_val.json',
        metric=['bbox', 'segm']),
    dict(
        type='CityScapesMetric',
        ann_file=data_root +
        'annotations/instancesonly_filtered_gtFine_val.json',
        seg_prefix=data_root + '/gtFine/val',
        outfile_prefix='./work_dirs/cityscapes_metric/instance')
]
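
For the COCO case, the evaluator is assembled from two places in the 2.x config: the annotation file of data.val and the metric list of evaluation. A minimal sketch of that step, where coco_val_evaluator is a hypothetical helper and the data_root value is an assumption used only for the example:

```python
def coco_val_evaluator(data, evaluation):
    """Hypothetical helper covering only the COCO case."""
    return dict(
        type='CocoMetric',
        # The annotation file is now given to the metric, not only to the dataset
        ann_file=data['val']['ann_file'],
        metric=evaluation['metric'],
        format_only=False)


data_root = 'data/coco/'  # assumed location, for illustration only
data = dict(
    val=dict(
        type='CocoDataset',
        ann_file=data_root + 'annotations/instances_val2017.json'))
evaluation = dict(metric=['bbox', 'segm'])

val_evaluator = coco_val_evaluator(data, evaluation)
test_evaluator = val_evaluator  # reuse it when testing on the same split
print(val_evaluator)
```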

Training and testing configuration

2.x config
runner = dict(
    type='EpochBasedRunner',  # Type of training loop
    max_epochs=12)  # Maximum number of training epochs
evaluation = dict(interval=2)  # Interval for evaluation, check the performance every 2 epochs
3.x config
train_cfg = dict(
    type='EpochBasedTrainLoop',  # Type of training loop, please refer to https://github.com/open-mmlab/mmengine/blob/main/mmengine/runner/loops.py
    max_epochs=12,  # Maximum number of training epochs
    val_interval=2)  # Interval for validation, check the performance every 2 epochs
val_cfg = dict(type='ValLoop')  # Type of validation loop
test_cfg = dict(type='TestLoop')  # Type of testing loop
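
The correspondence above folds the 2.x runner and evaluation.interval into a single train_cfg. A short sketch of that mapping, using a hypothetical migrate_loops helper that handles only the epoch-based case shown here:

```python
def migrate_loops(runner, evaluation):
    """Hypothetical helper: derive the 3.x loop configs from 2.x runner/evaluation."""
    if runner['type'] != 'EpochBasedRunner':
        raise ValueError('this sketch only covers EpochBasedRunner')
    train_cfg = dict(
        type='EpochBasedTrainLoop',
        max_epochs=runner['max_epochs'],
        # evaluation.interval becomes train_cfg.val_interval
        val_interval=evaluation.get('interval', 1))
    return train_cfg, dict(type='ValLoop'), dict(type='TestLoop')


runner = dict(type='EpochBasedRunner', max_epochs=12)
evaluation = dict(interval=2)
train_cfg, val_cfg, test_cfg = migrate_loops(runner, evaluation)
print(train_cfg)
```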

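Optimizer configuration

The optimizer and gradient clipping settings have moved to the optim_wrapper field. The following shows how the optimizer configuration in 2.x corresponds to 3.x.

2.x config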
optimizer = dict(
    type='SGD',  # Optimizer: Stochastic Gradient Descent
    lr=0.02,  # Base learning rate
    momentum=0.9,  # SGD with momentum
    weight_decay=0.0001)  # Weight decay
optimizer_config = dict(grad_clip=None)  # Configuration for gradient clipping, set to None to disable
3.x config
optim_wrapper = dict(  # Configuration for the optimizer wrapper
    type='OptimWrapper',  # Type of optimizer wrapper, you can switch to AmpOptimWrapper to enable mixed precision training
    optimizer=dict(  # Optimizer configuration, supports various PyTorch optimizers, please refer to https://pytorch.ac.cn/docs/stable/optim.html#algorithms
        type='SGD',  # SGD
        lr=0.02,  # Base learning rate
        momentum=0.9,  # SGD with momentum
        weight_decay=0.0001),  # Weight decay
    clip_grad=None,  # Configuration for gradient clipping, set to None to disable. For usage, please see https://mmengine.readthedocs.io/en/latest/tutorials/optimizer.html
    )
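
In other words, the 2.x optimizer dict is nested unchanged under optim_wrapper.optimizer, and optimizer_config.grad_clip becomes optim_wrapper.clip_grad. A sketch of that wrapping follows; migrate_optimizer is a hypothetical helper, and the clipping values in the second call are example values, not a recommendation.

```python
def migrate_optimizer(optimizer, optimizer_config=None):
    """Hypothetical helper: wrap 2.x optimizer settings into a 3.x optim_wrapper."""
    grad_clip = (optimizer_config or {}).get('grad_clip')
    return dict(
        type='OptimWrapper',
        optimizer=dict(optimizer),  # same keys as in 2.x
        clip_grad=grad_clip)        # renamed from grad_clip


optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)

# No gradient clipping, as in the example above
optim_wrapper = migrate_optimizer(optimizer, dict(grad_clip=None))

# With gradient clipping enabled (example values)
clipped = migrate_optimizer(
    optimizer, dict(grad_clip=dict(max_norm=35, norm_type=2)))
print(optim_wrapper)
print(clipped['clip_grad'])
```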

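The learning rate settings have likewise moved from the lr_config field to the param_scheduler field. The param_scheduler configuration is closer to PyTorch's learning rate schedulers and is more flexible. The following shows how the learning rate configuration in 2.x corresponds to 3.x.

2.x config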
lr_config = dict(
    policy='step',  # Use multi-step learning rate strategy during training
    warmup='linear',  # Use linear learning rate warmup
    warmup_iters=500,  # End warmup at iteration 500
    warmup_ratio=0.001,  # Coefficient for learning rate warmup
    step=[8, 11],  # Learning rate decay at which epochs
    gamma=0.1)  # Learning rate decay coefficient

3.x config
param_scheduler = [
    dict(
        type='LinearLR',  # Use linear learning rate warmup
        start_factor=0.001, # Coefficient for learning rate warmup
        by_epoch=False,  # Update the learning rate during warmup at each iteration
        begin=0,  # Starting from the first iteration
        end=500),  # End at the 500th iteration
    dict(
        type='MultiStepLR',  # Use multi-step learning rate strategy during training
        by_epoch=True,  # Update the learning rate at each epoch
        begin=0,   # Starting from the first epoch
        end=12,  # Ending at the 12th epoch
        milestones=[8, 11],  # Learning rate decay at which epochs
        gamma=0.1)  # Learning rate decay coefficient
]
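
The single lr_config dict is split into one scheduler per phase: a LinearLR for warmup, counted in iterations, and a MultiStepLR for the decay, counted in epochs. The sketch below reproduces that split for the 'step' policy with linear warmup only; migrate_step_lr is a hypothetical helper, and max_epochs has to be passed in because lr_config does not carry it.

```python
def migrate_step_lr(lr_config, max_epochs):
    """Hypothetical helper: convert a 2.x 'step' lr_config to a 3.x param_scheduler."""
    if lr_config['policy'] != 'step':
        raise ValueError('this sketch only covers policy="step"')
    schedulers = []
    if lr_config.get('warmup') == 'linear':
        schedulers.append(dict(
            type='LinearLR',
            start_factor=lr_config['warmup_ratio'],
            by_epoch=False,  # warmup is counted in iterations
            begin=0,
            end=lr_config['warmup_iters']))
    schedulers.append(dict(
        type='MultiStepLR',
        by_epoch=True,  # milestones are counted in epochs
        begin=0,
        end=max_epochs,
        milestones=list(lr_config['step']),
        gamma=lr_config.get('gamma', 0.1)))
    return schedulers


lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11],
    gamma=0.1)
param_scheduler = migrate_step_lr(lr_config, max_epochs=12)
for scheduler in param_scheduler:
    print(scheduler)
```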

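For how to migrate other learning rate policies, see the learning rate migration documentation of MMEngine.

Migrating other configurations

Checkpoint saving configuration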

Set the save interval
2.x config
checkpoint_config = dict(
    interval=1)
3.x config
default_hooks = dict(
    checkpoint=dict(
        type='CheckpointHook',
        interval=1))

Save the best model
2.x config
evaluation = dict(
    save_best='auto')
3.x config
default_hooks = dict(
    checkpoint=dict(
        type='CheckpointHook',
        save_best='auto'))

Keep the latest checkpoints
2.x config
checkpoint_config = dict(
    max_keep_ckpts=3)
3.x config
default_hooks = dict(
    checkpoint=dict(
        type='CheckpointHook',
        max_keep_ckpts=3))
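
All three settings end up in the same CheckpointHook entry of default_hooks, even though save_best came from the evaluation field in 2.x. A sketch that combines them, using a hypothetical migrate_checkpoint helper limited to the three options listed above:

```python
def migrate_checkpoint(checkpoint_config=None, evaluation=None):
    """Hypothetical helper: merge 2.x checkpoint settings into one CheckpointHook."""
    hook = dict(type='CheckpointHook')
    for key in ('interval', 'max_keep_ckpts'):
        if checkpoint_config and key in checkpoint_config:
            hook[key] = checkpoint_config[key]
    # save_best used to live under `evaluation` in 2.x
    if evaluation and 'save_best' in evaluation:
        hook['save_best'] = evaluation['save_best']
    return dict(checkpoint=hook)


default_hooks = migrate_checkpoint(
    checkpoint_config=dict(interval=1, max_keep_ckpts=3),
    evaluation=dict(save_best='auto'))
print(default_hooks)
```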

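Logging configuration

In MMDetection 3.x, log printing and log visualization are handled by the logger and the visualizer in MMEngine respectively. The following compares the configuration for printing and visualizing logs in MMDetection 2.x and 3.x.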

Set the log printing interval
2.x config
log_config = dict(interval=50)
3.x config
default_hooks = dict(
    logger=dict(type='LoggerHook', interval=50))
# Optional: set moving average window size
log_processor = dict(
    type='LogProcessor', window_size=50)

Visualize logs with TensorBoard or WandB
2.x config
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook'),
        dict(type='MMDetWandbHook',
             init_kwargs={
                'project': 'mmdetection',
                'group': 'maskrcnn-r50-fpn-1x-coco'
             },
             interval=50,
             log_checkpoint=True,
             log_checkpoint_metadata=True,
             num_eval_images=100)
    ])
3.x config
vis_backends = [
    dict(type='LocalVisBackend'),
    dict(type='TensorboardVisBackend'),
    dict(type='WandbVisBackend',
         init_kwargs={
            'project': 'mmdetection',
            'group': 'maskrcnn-r50-fpn-1x-coco'
         })
]
visualizer = dict(
    type='DetLocalVisualizer',
    vis_backends=vis_backends,
    name='visualizer')
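
Read side by side, each 2.x logger hook in the comparison above lines up with one visualizer backend in 3.x. The sketch below encodes exactly that pairing and nothing more: migrate_logging and the BACKENDS table are hypothetical, and options of MMDetWandbHook other than init_kwargs are dropped because the comparison shows no 3.x counterpart for them.

```python
# Pairing taken from the comparison above (hypothetical lookup table)
BACKENDS = {
    'TextLoggerHook': 'LocalVisBackend',
    'TensorboardLoggerHook': 'TensorboardVisBackend',
    'MMDetWandbHook': 'WandbVisBackend',
}


def migrate_logging(log_config):
    """Hypothetical helper: split a 2.x log_config into logger hook and visualizer."""
    logger_hook = dict(type='LoggerHook', interval=log_config['interval'])
    vis_backends = []
    for hook in log_config.get('hooks', []):
        backend = dict(type=BACKENDS[hook['type']])
        if 'init_kwargs' in hook:
            backend['init_kwargs'] = hook['init_kwargs']
        vis_backends.append(backend)
    visualizer = dict(
        type='DetLocalVisualizer',
        vis_backends=vis_backends,
        name='visualizer')
    return dict(logger=logger_hook), visualizer


log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook'),
        dict(type='MMDetWandbHook',
             init_kwargs={'project': 'mmdetection',
                          'group': 'maskrcnn-r50-fpn-1x-coco'},
             interval=50,
             log_checkpoint=True),
    ])
default_hooks, visualizer = migrate_logging(log_config)
print(default_hooks)
print(visualizer)
```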

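For tutorials on visualization, see the Visualization tutorial of MMDetection.

Runtime configuration

The runtime configuration fields have been adjusted in 3.x. The correspondence is as follows.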

2.x config
cudnn_benchmark = False
opencv_num_threads = 0
mp_start_method = 'fork'
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
3.x config
env_cfg = dict(
    cudnn_benchmark=False,
    mp_cfg=dict(mp_start_method='fork',
                opencv_num_threads=0),
    dist_cfg=dict(backend='nccl'))
log_level = 'INFO'
load_from = None
resume = False
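
The regrouping of these fields can be sketched as below. migrate_runtime is a hypothetical helper. Its handling of resume_from rests on an assumption that should be checked against the MMEngine documentation: that in 3.x, setting resume=True together with load_from resumes training from that checkpoint. The checkpoint path in the second call is made up for the example.

```python
def migrate_runtime(old):
    """Hypothetical helper: regroup 2.x runtime fields into the 3.x layout."""
    env_cfg = dict(
        cudnn_benchmark=old.get('cudnn_benchmark', False),
        mp_cfg=dict(
            mp_start_method=old.get('mp_start_method', 'fork'),
            opencv_num_threads=old.get('opencv_num_threads', 0)),
        # dist_params is renamed to env_cfg.dist_cfg
        dist_cfg=dict(old.get('dist_params', dict(backend='nccl'))))
    resume_from = old.get('resume_from')
    return dict(
        env_cfg=env_cfg,
        log_level=old.get('log_level', 'INFO'),
        # Assumption: resume=True plus load_from resumes from that checkpoint
        load_from=resume_from or old.get('load_from'),
        resume=resume_from is not None)


old_runtime = dict(
    cudnn_benchmark=False,
    opencv_num_threads=0,
    mp_start_method='fork',
    dist_params=dict(backend='nccl'),
    log_level='INFO',
    load_from=None,
    resume_from=None)
new_runtime = migrate_runtime(old_runtime)

# Hypothetical checkpoint path, for illustration only
resumed = migrate_runtime(dict(old_runtime, resume_from='work_dirs/example/epoch_6.pth'))
print(new_runtime)
print(resumed['resume'], resumed['load_from'])
```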