Customize Models¶
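We group model components into five types:
- Backbone: usually a fully convolutional network (FCN) that extracts feature maps, e.g., ResNet, MobileNet.
- Neck: the component between the backbone and the heads, e.g., FPN, PAFPN.
- Head: the component for a specific task, e.g., bounding box prediction and mask prediction.
- RoI extractor: the part that extracts RoI features from feature maps, e.g., RoI Align.
- Loss: the component in a head that computes losses, e.g., FocalLoss, L1Loss, and GHMLoss.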
Develop New Components¶
Add a New Backbone¶
Here we use MobileNet as an example to show how to develop a new component.
1. Define a new backbone (e.g., MobileNet)¶
Create a new file mmdet/models/backbones/mobilenet.py.
import torch.nn as nn
from mmdet.registry import MODELS
@MODELS.register_module()
class MobileNet(nn.Module):
    def __init__(self, arg1, arg2):
        pass
    def forward(self, x):  # should return a tuple
        pass
2. Import the module¶
You can either add the following line to mmdet/models/backbones/__init__.py:
from .mobilenet import MobileNet
Alternatively, add the following to the config file:
custom_imports = dict(
    imports=['mmdet.models.backbones.mobilenet'],
    allow_failed_imports=False)
Importing the module through the config file avoids modifying the original code.
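Conceptually, `custom_imports` asks the framework to import the listed modules so that their `@MODELS.register_module()` decorators run. A minimal sketch of that idea, using only `importlib`, might look like the following. `import_custom_modules` is a hypothetical helper written for illustration, not the library's own function, and the module names in the demo are stand-ins.

```python
import importlib


def import_custom_modules(imports, allow_failed_imports=False):
    """Hypothetical helper: import each dotted module path for its side effects."""
    imported = []
    for name in imports:
        try:
            imported.append(importlib.import_module(name))
        except ImportError:
            # Mirror the flag's intent: either fail loudly or skip the module.
            if not allow_failed_imports:
                raise
            imported.append(None)
    return imported


# 'json' stands in for a real custom module; the second name does not exist.
mods = import_custom_modules(['json', 'no_such_module_xyz'],
                             allow_failed_imports=True)
print([m.__name__ if m else None for m in mods])
```

With `allow_failed_imports=False`, as in the config above, a misspelled module path would raise immediately instead of being skipped.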
3. Use the backbone in your config file¶
model = dict(
    ...
    backbone=dict(
        type='MobileNet',
        arg1=xxx,
        arg2=xxx),
    ...
)
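To make the link between `@MODELS.register_module()` and `type='MobileNet'` more concrete, here is a minimal sketch of a registry. `ToyRegistry`, `TOY_MODELS`, and `ToyBackbone` are hypothetical names used only for illustration. The real registry is considerably richer, but the core idea is assumed to be the same: the `type` string selects a registered class, and the remaining keys are passed to its `__init__`.

```python
class ToyRegistry:
    """Hypothetical, simplified stand-in for a module registry."""

    def __init__(self):
        self._modules = {}

    def register_module(self):
        def decorator(cls):
            # Store the class under its own name so configs can refer to it.
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg):
        args = dict(cfg)  # copy so the caller's config is left untouched
        cls = self._modules[args.pop('type')]
        return cls(**args)


TOY_MODELS = ToyRegistry()


@TOY_MODELS.register_module()
class ToyBackbone:
    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2


backbone = TOY_MODELS.build(dict(type='ToyBackbone', arg1=1, arg2=2))
print(type(backbone).__name__, backbone.arg1, backbone.arg2)
```

This also suggests why the import step matters: a class that is never imported never runs its decorator, so its name cannot be resolved from a config.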
Add a New Neck¶
1. Define a neck (e.g., PAFPN)¶
Create a new file mmdet/models/necks/pafpn.py.
import torch.nn as nn
from mmdet.registry import MODELS
@MODELS.register_module()
class PAFPN(nn.Module):
    def __init__(self,
                in_channels,
                out_channels,
                num_outs,
                start_level=0,
                end_level=-1,
                add_extra_convs=False):
        pass
    def forward(self, inputs):
        # implementation is omitted
        pass
2. Import the module¶
You can either add the following line to mmdet/models/necks/__init__.py:
from .pafpn import PAFPN
Alternatively, add the following to the config file:
custom_imports = dict(
    imports=['mmdet.models.necks.pafpn'],
    allow_failed_imports=False)
Importing the module through the config file avoids modifying the original code.
3. Modify the config file¶
neck=dict(
    type='PAFPN',
    in_channels=[256, 512, 1024, 2048],
    out_channels=256,
    num_outs=5)
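As a rough illustration of how `in_channels`, `out_channels`, and `num_outs` relate, the sketch below tracks only tensor shapes through an FPN-style neck. The helper `neck_output_shapes` is hypothetical. It assumes every input level is projected to `out_channels` and that each extra output level comes from a stride-2 downsampling of the previous one; a real neck such as PAFPN may differ in detail.

```python
def neck_output_shapes(input_shapes, in_channels, out_channels, num_outs):
    """Hypothetical helper: (C, H, W) shapes produced by an FPN-style neck."""
    assert len(input_shapes) == len(in_channels)
    outs = []
    for (c, h, w), expected_c in zip(input_shapes, in_channels):
        assert c == expected_c, 'backbone channels must match in_channels'
        outs.append((out_channels, h, w))  # channels projected, size kept
    while len(outs) < num_outs:
        _, h, w = outs[-1]
        # Assumed stride-2 downsampling for each extra level.
        outs.append((out_channels, (h - 1) // 2 + 1, (w - 1) // 2 + 1))
    return outs


# Illustrative feature shapes for four backbone levels.
features = [(256, 64, 64), (512, 32, 32), (1024, 16, 16), (2048, 8, 8)]
shapes = neck_output_shapes(features, [256, 512, 1024, 2048], 256, 5)
for shape in shapes:
    print(shape)
```

Under these assumptions, four input levels with `num_outs=5` yield five outputs that all share 256 channels, the last one being an extra, coarser level.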
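Add a New Head¶
Here we use Double Head R-CNN as an example to show how to develop a new head.
First, add a new bbox head in mmdet/models/roi_heads/bbox_heads/double_bbox_head.py. Double Head R-CNN implements a new bbox head for object detection. To implement a bbox head, we mainly need to implement the key functions of the new module, as follows.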
from typing import Tuple
import torch.nn as nn
from mmcv.cnn import ConvModule
from mmengine.model import BaseModule, ModuleList
from torch import Tensor
from mmdet.models.backbones.resnet import Bottleneck
from mmdet.registry import MODELS
from mmdet.utils import ConfigType, MultiConfig, OptConfigType, OptMultiConfig
from .bbox_head import BBoxHead
@MODELS.register_module()
class DoubleConvFCBBoxHead(BBoxHead):
    r"""Bbox head used in Double-Head R-CNN
    .. code-block:: none
                                          /-> cls
                      /-> shared convs ->
                                          \-> reg
        roi features
                                          /-> cls
                      \-> shared fc    ->
                                          \-> reg
    """  # noqa: W605
    def __init__(self,
                 num_convs: int = 0,
                 num_fcs: int = 0,
                 conv_out_channels: int = 1024,
                 fc_out_channels: int = 1024,
                 conv_cfg: OptConfigType = None,
                 norm_cfg: ConfigType = dict(type='BN'),
                 init_cfg: MultiConfig = dict(
                     type='Normal',
                     override=[
                         dict(type='Normal', name='fc_cls', std=0.01),
                         dict(type='Normal', name='fc_reg', std=0.001),
                         dict(
                             type='Xavier',
                             name='fc_branch',
                             distribution='uniform')
                     ]),
                 **kwargs) -> None:
        kwargs.setdefault('with_avg_pool', True)
        super().__init__(init_cfg=init_cfg, **kwargs)
    def forward(self, x_cls: Tensor, x_reg: Tensor) -> Tuple[Tensor]:
        ...  # implementation omitted
Second, implement a new RoI head if necessary. We plan to inherit the new DoubleHeadRoIHead from StandardRoIHead. StandardRoIHead already implements the following functions.
from typing import List, Optional, Tuple
import torch
from torch import Tensor
from mmdet.registry import MODELS, TASK_UTILS
from mmdet.structures import DetDataSample
from mmdet.structures.bbox import bbox2roi
from mmdet.utils import ConfigType, InstanceList
from ..task_modules.samplers import SamplingResult
from ..utils import empty_instances, unpack_gt_instances
from .base_roi_head import BaseRoIHead
@MODELS.register_module()
class StandardRoIHead(BaseRoIHead):
    """Simplest base roi head including one bbox head and one mask head."""
    def init_assigner_sampler(self) -> None:
        ...
    def init_bbox_head(self, bbox_roi_extractor: ConfigType,
                       bbox_head: ConfigType) -> None:
        ...
    def init_mask_head(self, mask_roi_extractor: ConfigType,
                       mask_head: ConfigType) -> None:
        ...
    def forward(self, x: Tuple[Tensor],
                rpn_results_list: InstanceList) -> tuple:
        ...
    def loss(self, x: Tuple[Tensor], rpn_results_list: InstanceList,
             batch_data_samples: List[DetDataSample]) -> dict:
        ...
    def _bbox_forward(self, x: Tuple[Tensor], rois: Tensor) -> dict:
        ...
    def bbox_loss(self, x: Tuple[Tensor],
                  sampling_results: List[SamplingResult]) -> dict:
        ...
    def mask_loss(self, x: Tuple[Tensor],
                  sampling_results: List[SamplingResult], bbox_feats: Tensor,
                  batch_gt_instances: InstanceList) -> dict:
        ...
    def _mask_forward(self,
                      x: Tuple[Tensor],
                      rois: Tensor = None,
                      pos_inds: Optional[Tensor] = None,
                      bbox_feats: Optional[Tensor] = None) -> dict:
        ...
    def predict_bbox(self,
                     x: Tuple[Tensor],
                     batch_img_metas: List[dict],
                     rpn_results_list: InstanceList,
                     rcnn_test_cfg: ConfigType,
                     rescale: bool = False) -> InstanceList:
        ...
    def predict_mask(self,
                     x: Tuple[Tensor],
                     batch_img_metas: List[dict],
                     results_list: InstanceList,
                     rescale: bool = False) -> InstanceList:
        ...
Double Head's modification is mainly in the _bbox_forward logic; all other logic is inherited from StandardRoIHead. In mmdet/models/roi_heads/double_roi_head.py, we implement the new RoI head as follows:
from typing import Tuple
from torch import Tensor
from mmdet.registry import MODELS
from .standard_roi_head import StandardRoIHead
@MODELS.register_module()
class DoubleHeadRoIHead(StandardRoIHead):
    """RoI head for `Double Head RCNN <https://arxiv.org/abs/1904.06493>`_.
    Args:
        reg_roi_scale_factor (float): The scale factor to extend the rois
            used to extract the regression features.
    """
    def __init__(self, reg_roi_scale_factor: float, **kwargs):
        super().__init__(**kwargs)
        self.reg_roi_scale_factor = reg_roi_scale_factor
    def _bbox_forward(self, x: Tuple[Tensor], rois: Tensor) -> dict:
        """Box head forward function used in both training and testing.
        Args:
            x (tuple[Tensor]): List of multi-level img features.
            rois (Tensor): RoIs with the shape (n, 5) where the first
                column indicates batch id of each RoI.
        Returns:
             dict[str, Tensor]: Usually returns a dictionary with keys:
                - `cls_score` (Tensor): Classification scores.
                - `bbox_pred` (Tensor): Box energies / deltas.
                - `bbox_feats` (Tensor): Extract bbox RoI features.
        """
        bbox_cls_feats = self.bbox_roi_extractor(
            x[:self.bbox_roi_extractor.num_inputs], rois)
        bbox_reg_feats = self.bbox_roi_extractor(
            x[:self.bbox_roi_extractor.num_inputs],
            rois,
            roi_scale_factor=self.reg_roi_scale_factor)
        if self.with_shared_head:
            bbox_cls_feats = self.shared_head(bbox_cls_feats)
            bbox_reg_feats = self.shared_head(bbox_reg_feats)
        cls_score, bbox_pred = self.bbox_head(bbox_cls_feats, bbox_reg_feats)
        bbox_results = dict(
            cls_score=cls_score,
            bbox_pred=bbox_pred,
            bbox_feats=bbox_cls_feats)
        return bbox_results
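The `reg_roi_scale_factor` argument enlarges each RoI before the regression features are extracted, while the classification branch uses the original RoI. The sketch below illustrates one plausible way to do that rescaling on a single RoI in plain Python. `rescale_roi` is a hypothetical helper, and it assumes the box is scaled about its center, which is not confirmed by the text above.

```python
def rescale_roi(roi, scale_factor):
    """Hypothetical helper: enlarge an RoI around its center.

    roi is (batch_idx, x1, y1, x2, y2), matching the (n, 5) layout above.
    """
    batch_idx, x1, y1, x2, y2 = roi
    cx, cy = (x1 + x2) * 0.5, (y1 + y2) * 0.5
    new_w = (x2 - x1) * scale_factor
    new_h = (y2 - y1) * scale_factor
    # The batch index is untouched; only the box corners move.
    return (batch_idx, cx - new_w * 0.5, cy - new_h * 0.5,
            cx + new_w * 0.5, cy + new_h * 0.5)


cls_roi = (0, 10.0, 10.0, 20.0, 20.0)   # used for classification features
reg_roi = rescale_roi(cls_roi, 1.3)     # used for regression features
print(reg_roi)
```

The larger box gives the regression branch more surrounding context, while the classification branch keeps the tighter crop.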
Last, users need to add the modules in mmdet/models/roi_heads/bbox_heads/__init__.py and mmdet/models/roi_heads/__init__.py so that the corresponding registry can find and load them.
Alternatively, users can add the following to the config file:
custom_imports=dict(
    imports=['mmdet.models.roi_heads.double_roi_head', 'mmdet.models.roi_heads.bbox_heads.double_bbox_head'])
Importing the modules through the config file achieves the same result.
The config file of Double Head R-CNN is as follows:
_base_ = '../faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py'
model = dict(
    roi_head=dict(
        type='DoubleHeadRoIHead',
        reg_roi_scale_factor=1.3,
        bbox_head=dict(
            _delete_=True,
            type='DoubleConvFCBBoxHead',
            num_convs=4,
            num_fcs=2,
            in_channels=256,
            conv_out_channels=1024,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=80,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0., 0., 0., 0.],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=2.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=2.0))))
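Since MMDetection 2.0, the config system supports inheriting configs, so users can focus on their modifications. Double Head R-CNN mainly uses the new DoubleHeadRoIHead and the new DoubleConvFCBBoxHead; the arguments are set according to the __init__ function of each module.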
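The role of `_delete_=True` in an inherited config can be pictured with a small dictionary merge. The sketch below is a simplified illustration under the assumption that child keys override base keys recursively, and that `_delete_=True` discards the base dictionary at that level. `merge_cfg` is a hypothetical function, and the base values are trimmed, illustrative stand-ins rather than the real base config.

```python
def merge_cfg(base, override):
    """Hypothetical sketch of inheritance-style config merging."""
    override = dict(override)
    # With _delete_=True the base dict at this level is dropped entirely.
    merged = {} if override.pop('_delete_', False) else dict(base)
    for key, value in override.items():
        if isinstance(value, dict):
            base_value = merged.get(key)
            if not isinstance(base_value, dict):
                base_value = {}
            merged[key] = merge_cfg(base_value, value)
        else:
            merged[key] = value
    return merged


base = dict(
    roi_head=dict(
        type='StandardRoIHead',
        bbox_head=dict(type='Shared2FCBBoxHead', fc_out_channels=1024)))
child = dict(
    roi_head=dict(
        type='DoubleHeadRoIHead',
        reg_roi_scale_factor=1.3,
        bbox_head=dict(
            _delete_=True, type='DoubleConvFCBBoxHead', num_convs=4)))

model = merge_cfg(base, child)
print(model['roi_head']['bbox_head'])
```

Without `_delete_=True`, keys from the base `bbox_head` that the new head does not accept would survive the merge and be passed to its `__init__`.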
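Add a New Loss¶
Suppose you want to add a new loss, MyLoss, for bounding box regression. To add a new loss function, implement it in mmdet/models/losses/my_loss.py. The weighted_loss decorator enables the loss to be weighted for each element.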
import torch
import torch.nn as nn
from mmdet.registry import MODELS
from .utils import weighted_loss
@weighted_loss
def my_loss(pred, target):
    assert pred.size() == target.size() and target.numel() > 0
    loss = torch.abs(pred - target)
    return loss
@MODELS.register_module()
class MyLoss(nn.Module):
    def __init__(self, reduction='mean', loss_weight=1.0):
        super(MyLoss, self).__init__()
        self.reduction = reduction
        self.loss_weight = loss_weight
    def forward(self,
                pred,
                target,
                weight=None,
                avg_factor=None,
                reduction_override=None):
        assert reduction_override in (None, 'none', 'mean', 'sum')
        reduction = (
            reduction_override if reduction_override else self.reduction)
        loss_bbox = self.loss_weight * my_loss(
            pred, target, weight, reduction=reduction, avg_factor=avg_factor)
        return loss_bbox
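To give a feel for what a decorator like `weighted_loss` adds, here is a simplified, list-based sketch. `toy_weighted_loss` and `toy_l1_loss` are hypothetical stand-ins that work on plain Python lists rather than tensors. The sketch assumes the wrapper multiplies by per-element weights and then applies the requested reduction, with `avg_factor` replacing the element count when averaging; details of the real decorator may differ.

```python
import functools


def toy_weighted_loss(loss_func):
    """Hypothetical, list-based stand-in for a weighted-loss decorator."""

    @functools.wraps(loss_func)
    def wrapper(pred, target, weight=None, reduction='mean', avg_factor=None):
        assert reduction in ('none', 'mean', 'sum')
        losses = loss_func(pred, target)  # one loss value per element
        if weight is not None:
            losses = [l * w for l, w in zip(losses, weight)]
        if reduction == 'none':
            return losses
        if reduction == 'sum':
            if avg_factor is not None:
                raise ValueError('avg_factor cannot be used with "sum"')
            return sum(losses)
        # reduction == 'mean': divide by avg_factor when it is given.
        denom = len(losses) if avg_factor is None else avg_factor
        return sum(losses) / denom

    return wrapper


@toy_weighted_loss
def toy_l1_loss(pred, target):
    return [abs(p - t) for p, t in zip(pred, target)]


pred, target = [1.0, 2.0, 4.0], [1.0, 0.0, 1.0]
print(toy_l1_loss(pred, target, reduction='none'))
print(toy_l1_loss(pred, target, weight=[1.0, 0.0, 1.0], reduction='sum'))
print(toy_l1_loss(pred, target, avg_factor=2))
```

The element-wise function stays as simple as `my_loss` above, while weighting and reduction are handled once in the wrapper.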
Then add MyLoss to mmdet/models/losses/__init__.py:
from .my_loss import MyLoss, my_loss
Alternatively, add the following to the config file:
custom_imports=dict(
    imports=['mmdet.models.losses.my_loss'])
Importing the module through the config file achieves the same result.
To use it, modify the loss_xxx field. Since MyLoss is for regression, you need to modify the loss_bbox field in the head.
loss_bbox=dict(type='MyLoss', loss_weight=1.0)
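For context, the fragment below sketches where `loss_bbox` could sit in a config override. It assumes a two-stage detector whose box head lives under `roi_head.bbox_head`, as in the Double Head R-CNN config earlier; other detectors may nest the field differently.

```python
# Assumed nesting for a two-stage detector; adjust to your model's layout.
model = dict(
    roi_head=dict(
        bbox_head=dict(
            loss_bbox=dict(type='MyLoss', loss_weight=1.0))))

print(model['roi_head']['bbox_head']['loss_bbox'])
```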