
news 2025/7/1 23:37:25


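Recommended column for autumn-recruitment interviews: Summary of Interview Questions for Deep Learning Algorithm Engineers 【百面算法工程师】

💡💡💡All programs in this column have been tested and run successfully💡💡💡

Column contents: "YOLOv5 Getting Started + Improvements" column introduction & table of contents | 60+ articles so far, covering improvements to detection heads, loss functions, Backbone, Neck, NMS and more

MobileNetV2 is an efficient convolutional neural network architecture designed for the compute budgets of mobile and embedded devices. By introducing inverted residuals and linear bottleneck layers, it reduces the amount of computation and the number of parameters while keeping good accuracy, and it performs well on image classification, object detection and semantic segmentation at low complexity. After introducing the main principles, this article walks step by step through adding and modifying the module code, and provides the complete modified code at the end so that it can be run directly; beginners can follow along as well.

Column address: YOLOv5 Improvements + Getting Started, continuously updated with effective accuracy-boosting methods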

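1. Principle

2. Adding MobileNet v2 to YOLOv5

2.1 Code implementation of MobileNet v2

2.2 Adding a new yaml file

2.3 Registering the module

2.4 Running the program

3. Complete code

4. GFLOPs

5. Going further

6. Summary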

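1. Principle

Paper: MobileNetV2: Inverted Residuals and Linear Bottlenecks

Official code: official code repository

The following explanation of the principle comes from @太阳花的小绿豆

From the MobileNet v1 structure table you can see that the network is a straight pipe like VGG, without shortcut connections such as those in ResNet. Some people also report that the DW (depthwise) convolutions in MobileNet v1 easily degenerate during training, so the results are not as good as hoped. So let us move on to MobileNet v2.

MobileNet v2 was proposed by the Google team in 2018. Compared with MobileNet v1 it is more accurate and the model is smaller. The highlight of MobileNet v1 is the DW convolution; the highlight of MobileNet v2 is the inverted residual block. The comparison figure shows the ResNet residual block on the left and the MobileNet v2 inverted residual block on the right. The residual block is 1x1 convolution (reduce channels) -> 3x3 convolution -> 1x1 convolution (restore channels). The inverted residual block does the opposite: 1x1 convolution (expand channels) -> 3x3 DW convolution -> 1x1 convolution (reduce channels). Why do it this way? The paper explains that high-dimensional features lose less information when passed through a ReLU activation. (Note that the inverted residual block mostly uses ReLU6, but the last 1x1 convolution uses a linear activation.)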
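The "narrow-wide-narrow" versus "wide-narrow-wide" contrast described above can be made concrete by listing channel widths. This is only an illustrative sketch: the function names are hypothetical, and the reduction factor of 4 for the ResNet bottleneck and the expansion factor of 6 are assumed example values.

```python
def residual_widths(channels, reduction=4):
    """Channel widths through a ResNet bottleneck: wide -> narrow -> narrow -> wide."""
    hidden = channels // reduction
    return [channels, hidden, hidden, channels]


def inverted_residual_widths(c_in, c_out, expand_ratio=6):
    """Channel widths through an inverted residual block: narrow -> wide -> wide -> narrow."""
    hidden = c_in * expand_ratio
    return [c_in, hidden, hidden, c_out]


# Activation after each of the three convolutions of the inverted residual block:
# the final 1x1 projection is linear (no ReLU6).
INVERTED_ACTIVATIONS = ["ReLU6", "ReLU6", "linear"]

print(residual_widths(256))              # [256, 64, 64, 256]
print(inverted_residual_widths(24, 32))  # [24, 144, 144, 32]
```

The 3x3 convolution sits at the narrowest point of the residual block but at the widest point of the inverted residual block, which is why a cheap depthwise convolution is used there.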

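When using the inverted residual block, note that not every block has a shortcut connection. A shortcut exists only when stride=1 and the input and output feature maps have the same shape. Two tensors can only be added when their shapes match, and stride=1 alone does not guarantee that the input and output channel counts are equal.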

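In the MobileNet v2 network structure table, t is the expansion factor (the factor applied by the first 1x1 convolution in the inverted residual block), c is the number of output channels, n is the number of times the inverted residual block is repeated, and s is the stride. Note that the stride applies only to the first block of the n repeats; all later blocks use stride 1.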
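As a rough sketch of how the (t, c, n, s) rows and the shortcut rule work together, the snippet below expands each row into individual blocks. The row values are the bottleneck rows of the paper's architecture table as I recall them, and the 32-channel input is the output of the stem convolution; `expand_rows` and the dictionary keys are hypothetical names used only for illustration.

```python
# (t, c, n, s): expansion factor, output channels, repeats, stride of the first repeat
MOBILENETV2_ROWS = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def expand_rows(rows, c_in=32):
    """Expand table rows into one entry per inverted residual block."""
    blocks = []
    for t, c, n, s in rows:
        for i in range(n):
            stride = s if i == 0 else 1              # only the first repeat uses stride s
            shortcut = stride == 1 and c_in == c     # shapes must match to add x + y
            blocks.append({"in": c_in, "out": c, "t": t,
                           "stride": stride, "shortcut": shortcut})
            c_in = c                                 # next block consumes this block's output
    return blocks


blocks = expand_rows(MOBILENETV2_ROWS)
for b in blocks:
    print(b)
```

The first block of the 96-channel and 320-channel rows has stride 1 but still no shortcut, because its input channel count differs from its output channel count. The expanded list also lines up with the backbone in the yaml file of section 2.2, where each row is split into a first block plus its remaining repeats.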

2. Adding MobileNet v2 to YOLOv5

2.1 Code implementation of MobileNet v2

Key step 1: add the following code to yolov5/models/common.py

# common.py already imports numpy as np, torch and torch.nn as nn, and defines Conv and DWConv.
# SEBlock, CBAM and conv_bn are not defined in this snippet; RepVGGBlock needs them to be
# available in common.py (MobileNetV2_Block itself only references RepVGGBlock in fuse()).
# In stock YOLOv5, Conv applies an activation only when `act` is True or an nn.Module;
# a string such as 'ReLU' silently becomes nn.Identity(), so a module is passed here.


class conv_bn_relu_maxpool(nn.Module):
    def __init__(self, c1, c2):  # ch_in, ch_out
        super(conv_bn_relu_maxpool, self).__init__()
        self.conv = Conv(c1, c2, k=3, s=2, p=1, g=1, act=nn.ReLU())
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)

    def forward(self, x):
        return self.maxpool(self.conv(x))

    def fuse(self):
        # Assumes a Conv that provides fuse(); stock YOLOv5 fuses Conv + BN in the model's own fuse().
        self.conv.fuse()


class RepVGGBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, use_se=False, use_cbam=False,
                 padding=1, dilation=1, groups=1, padding_mode='zeros', deploy=False):
        super(RepVGGBlock, self).__init__()
        self.deploy = deploy
        self.groups = groups
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.dilation = dilation
        self.padding_mode = padding_mode

        padding_11 = padding - kernel_size // 2  # padding for the parallel 1x1 branch
        self.nonlinearity = nn.ReLU()

        if use_se or use_cbam:
            if use_se:
                self.se = SEBlock(out_channels, internal_neurons=out_channels // 16)
            if use_cbam:
                self.se = CBAM(out_channels, internal_neurons=out_channels // 16)
        else:
            self.se = nn.Identity()

        if deploy:
            # Inference form: a single re-parameterized convolution
            self.rbr_reparam = nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
                                         kernel_size=kernel_size, stride=stride, padding=padding,
                                         dilation=dilation, groups=groups, bias=True,
                                         padding_mode=padding_mode)
        else:
            # Training form: identity (BN only) + 3x3 conv-BN + 1x1 conv-BN branches
            self.rbr_identity = nn.BatchNorm2d(num_features=in_channels) \
                if out_channels == in_channels and stride == 1 else None
            self.rbr_dense = conv_bn(in_channels=in_channels, out_channels=out_channels,
                                     kernel_size=kernel_size, stride=stride, padding=padding, groups=groups)
            self.rbr_1x1 = conv_bn(in_channels=in_channels, out_channels=out_channels, kernel_size=1,
                                   stride=stride, padding=padding_11, groups=groups)

    def get_equivalent_kernel_bias(self):
        kernel3x3, bias3x3 = self._fuse_bn_tensor(self.rbr_dense)
        kernel1x1, bias1x1 = self._fuse_bn_tensor(self.rbr_1x1)
        kernelid, biasid = self._fuse_bn_tensor(self.rbr_identity)
        return kernel3x3 + self._pad_1x1_to_3x3_tensor(kernel1x1) + kernelid, bias3x3 + bias1x1 + biasid

    def _pad_1x1_to_3x3_tensor(self, kernel1x1):
        if kernel1x1 is None:
            return 0
        return torch.nn.functional.pad(kernel1x1, [1, 1, 1, 1])

    def _fuse_bn_tensor(self, branch):
        # Fold the BatchNorm statistics of one branch into an equivalent kernel and bias
        if branch is None:
            return 0, 0
        if isinstance(branch, nn.Sequential):
            kernel = branch.conv.weight
            running_mean = branch.bn.running_mean
            running_var = branch.bn.running_var
            gamma = branch.bn.weight
            beta = branch.bn.bias
            eps = branch.bn.eps
        else:
            assert isinstance(branch, (nn.BatchNorm2d, nn.SyncBatchNorm))
            if not hasattr(self, 'id_tensor'):
                # Identity branch expressed as a 3x3 kernel with a single 1 at the centre
                input_dim = self.in_channels // self.groups
                kernel_value = np.zeros((self.in_channels, input_dim, 3, 3), dtype=np.float32)
                for i in range(self.in_channels):
                    kernel_value[i, i % input_dim, 1, 1] = 1
                self.id_tensor = torch.from_numpy(kernel_value).to(branch.weight.device)
            kernel = self.id_tensor
            running_mean = branch.running_mean
            running_var = branch.running_var
            gamma = branch.weight
            beta = branch.bias
            eps = branch.eps
        std = (running_var + eps).sqrt()
        t = (gamma / std).reshape(-1, 1, 1, 1)
        return kernel * t, beta - running_mean * gamma / std

    def forward(self, inputs):
        if hasattr(self, 'rbr_reparam'):
            return self.nonlinearity(self.se(self.rbr_reparam(inputs)))
        if self.rbr_identity is None:
            id_out = 0
        else:
            id_out = self.rbr_identity(inputs)
        return self.nonlinearity(self.se(self.rbr_dense(inputs) + self.rbr_1x1(inputs) + id_out))

    def fuse(self):
        # Merge the three training branches into one convolution (only if not already fused)
        if not self.deploy:
            self.rbr_reparam = nn.Conv2d(in_channels=self.in_channels, out_channels=self.out_channels,
                                         kernel_size=self.kernel_size, stride=self.stride,
                                         padding=self.padding, dilation=self.dilation, groups=self.groups,
                                         bias=True, padding_mode=self.padding_mode
                                         ).requires_grad_(False).to(self.rbr_dense.conv.weight.device)
            self.deploy = True
            kernel, bias = self.get_equivalent_kernel_bias()
            self.rbr_reparam.weight.data = kernel
            self.rbr_reparam.bias.data = bias
            self.forward = self.fusevggforward
            self.__delattr__('rbr_identity')
            self.rbr_dense.__delattr__('conv')
            self.rbr_dense.__delattr__('bn')
            self.rbr_1x1.__delattr__('conv')
            self.rbr_1x1.__delattr__('bn')
            del self._modules['rbr_dense']
            del self._modules['rbr_1x1']

    def fusevggforward(self, inputs):
        return self.nonlinearity(self.se(self.rbr_reparam(inputs)))


class MobileNetV2_Block(nn.Module):
    def __init__(self, inp, oup, stride=1, expand_ratio=1):
        super(MobileNetV2_Block, self).__init__()
        assert stride in [1, 2]
        self.stride = stride
        # Shortcut only when the input and output shapes are identical
        self.identity = stride == 1 and inp == oup
        hidden_dim = int(round(inp * expand_ratio))
        act = nn.ReLU()

        if expand_ratio != 1:
            self.conv = nn.Sequential(
                Conv(inp, hidden_dim, k=1, s=1, p=0, act=act),           # 1x1 expansion
                DWConv(hidden_dim, hidden_dim, k=3, s=stride, act=act),  # 3x3 depthwise
                Conv(hidden_dim, oup, k=1, s=1, p=0, act=False),         # 1x1 linear projection
            )
        else:
            # expand_ratio == 1: no expansion convolution
            self.conv = nn.Sequential(
                DWConv(hidden_dim, hidden_dim, k=3, s=stride, act=act),  # 3x3 depthwise
                Conv(hidden_dim, oup, k=1, s=1, p=0, act=False),         # 1x1 linear projection
            )

    def forward(self, x):
        y = self.conv(x)
        if self.identity:
            return x + y
        return y

    def fuse(self):
        # Assumes Conv/DWConv provide fuse(); only useful if the model's fuse() calls this method.
        for m in self.conv:
            if isinstance(m, (Conv, DWConv, RepVGGBlock)):
                m.fuse()

2.2 Adding a new yaml file

Key step 2: create a new file yolov5_MobileNetv2.yaml under yolov5/models and copy the following into it

Object detection yaml file

# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# MobileNetV2 backbone
# MobileNetV2_Block args: [out_ch, stride, expand_ratio]
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 2]],                # 0-p1/2
   [-1, 1, MobileNetV2_Block, [16, 1, 1]],   # 1
   [-1, 1, MobileNetV2_Block, [24, 2, 6]],   # 2-p2/4
   [-1, 1, MobileNetV2_Block, [24, 1, 6]],   # 3
   [-1, 1, MobileNetV2_Block, [32, 2, 6]],   # 4-p3/8
   [-1, 2, MobileNetV2_Block, [32, 1, 6]],   # 5
   [-1, 1, MobileNetV2_Block, [64, 2, 6]],   # 6-p4/16
   [-1, 3, MobileNetV2_Block, [64, 1, 6]],   # 7
   [-1, 1, MobileNetV2_Block, [96, 1, 6]],   # 8
   [-1, 2, MobileNetV2_Block, [96, 1, 6]],   # 9
   [-1, 1, MobileNetV2_Block, [160, 2, 6]],  # 10-p5/32
   [-1, 2, MobileNetV2_Block, [160, 1, 6]],  # 11
   [-1, 1, MobileNetV2_Block, [320, 1, 6]],  # 12
   [-1, 1, SPPF, [1024, 5]],  # 13
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [256, 1, 1]],  # 14
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 15
   [[-1, 9], 1, Concat, [1]],  # 16 cat backbone P4
   [-1, 1, C3, [256, False]],  # 17

   [-1, 1, Conv, [128, 1, 1]],  # 18
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 19
   [[-1, 5], 1, Concat, [1]],  # 20 cat backbone P3
   [-1, 1, C3, [128, False]],  # 21 (P3/8-small)

   [-1, 1, Conv, [128, 3, 2]],  # 22
   [[-1, 18], 1, Concat, [1]],  # 23 cat head P4
   [-1, 1, C3, [256, False]],  # 24 (P4/16-medium)

   [-1, 1, Conv, [256, 3, 2]],  # 25
   [[-1, 14], 1, Concat, [1]],  # 26 cat head P5
   [-1, 1, C3, [512, False]],  # 27 (P5/32-large)

   [[21, 24, 27], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

Segmentation yaml file (uses the Segment head)

# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# MobileNetV2 backbone
# MobileNetV2_Block args: [out_ch, stride, expand_ratio]
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 3, 2]],                # 0-p1/2
   [-1, 1, MobileNetV2_Block, [16, 1, 1]],   # 1
   [-1, 1, MobileNetV2_Block, [24, 2, 6]],   # 2-p2/4
   [-1, 1, MobileNetV2_Block, [24, 1, 6]],   # 3
   [-1, 1, MobileNetV2_Block, [32, 2, 6]],   # 4-p3/8
   [-1, 2, MobileNetV2_Block, [32, 1, 6]],   # 5
   [-1, 1, MobileNetV2_Block, [64, 2, 6]],   # 6-p4/16
   [-1, 3, MobileNetV2_Block, [64, 1, 6]],   # 7
   [-1, 1, MobileNetV2_Block, [96, 1, 6]],   # 8
   [-1, 2, MobileNetV2_Block, [96, 1, 6]],   # 9
   [-1, 1, MobileNetV2_Block, [160, 2, 6]],  # 10-p5/32
   [-1, 2, MobileNetV2_Block, [160, 1, 6]],  # 11
   [-1, 1, MobileNetV2_Block, [320, 1, 6]],  # 12
   [-1, 1, SPPF, [1024, 5]],  # 13
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [256, 1, 1]],  # 14
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 15
   [[-1, 9], 1, Concat, [1]],  # 16 cat backbone P4
   [-1, 1, C3, [256, False]],  # 17

   [-1, 1, Conv, [128, 1, 1]],  # 18
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],  # 19
   [[-1, 5], 1, Concat, [1]],  # 20 cat backbone P3
   [-1, 1, C3, [128, False]],  # 21 (P3/8-small)

   [-1, 1, Conv, [128, 3, 2]],  # 22
   [[-1, 18], 1, Concat, [1]],  # 23 cat head P4
   [-1, 1, C3, [256, False]],  # 24 (P4/16-medium)

   [-1, 1, Conv, [256, 3, 2]],  # 25
   [[-1, 14], 1, Concat, [1]],  # 26 cat head P5
   [-1, 1, C3, [512, False]],  # 27 (P5/32-large)

   [[21, 24, 27], 1, Segment, [nc, anchors, 32, 256]],  # Segment(P3, P4, P5)
  ]

Tip: the yaml files above use depth_multiple and width_multiple of 1.0, which is the YOLOv5l setting. To apply the module to YOLOv5n/s/m/x, you only need to set the corresponding depth_multiple and width_multiple.

# YOLOv5n
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.25  # layer channel multiple

# YOLOv5s
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# YOLOv5m
depth_multiple: 0.67  # model depth multiple
width_multiple: 0.75  # layer channel multiple

# YOLOv5l
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

# YOLOv5x
depth_multiple: 1.33  # model depth multiple
width_multiple: 1.25  # layer channel multiple
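To see what these two multiples do to a single yaml row, here is a minimal sketch of the scaling arithmetic. It mirrors my understanding of how YOLOv5's parse_model scales the repeat count and output channels (rounding channels up to a multiple of 8); `scale_layer` and `VARIANTS` are hypothetical names, and the real function only applies the channel scaling to modules it has been told about (see step 2.3).

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scale_layer(number, out_channels, depth_multiple, width_multiple):
    """Scale a yaml row's repeat count and output channels (assumed parse_model behaviour)."""
    n = max(round(number * depth_multiple), 1) if number > 1 else number
    c2 = make_divisible(out_channels * width_multiple, 8)
    return n, c2


# (depth_multiple, width_multiple) pairs from the list above
VARIANTS = {"n": (0.33, 0.25), "s": (0.33, 0.50), "m": (0.67, 0.75),
            "l": (1.0, 1.0), "x": (1.33, 1.25)}

# Backbone row 7: [-1, 3, MobileNetV2_Block, [64, 1, 6]]
for name, (gd, gw) in VARIANTS.items():
    print(name, scale_layer(3, 64, gd, gw))
```

Because the repeats and channels shrink together, the smaller variants change the backbone considerably compared with the original MobileNetV2 layout, so it is worth checking the printed model summary after switching multiples.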

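2.3 Registering the module

Key step 3: register "MobileNetV2_Block" in the parse_model function in yolo.py, that is, add the class name used in the yaml file to the set of modules whose input and output channels parse_model fills in.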
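Why registration matters can be shown with a simplified stand-in for the argument handling in parse_model. This is a sketch under assumptions, not the actual YOLOv5 source: `build_args` and the empty stand-in classes are hypothetical, and only the rule "for registered modules, prepend the input channel count to the yaml args" is modelled.

```python
class Conv:                # stand-in for models.common.Conv
    pass


class MobileNetV2_Block:   # stand-in for the block added in step 2.1
    pass


def build_args(module, yaml_args, c_in, registered):
    """Return (constructor_args, c_out) the way a channel-aware parser would."""
    if module in registered:
        c_out = yaml_args[0]
        return [c_in, c_out, *yaml_args[1:]], c_out  # prepend input channels
    return list(yaml_args), c_in                     # args passed through unchanged


# yaml row 1: [-1, 1, MobileNetV2_Block, [16, 1, 1]], fed by the 32-channel stem
with_registration = build_args(MobileNetV2_Block, [16, 1, 1], 32, {Conv, MobileNetV2_Block})
without_registration = build_args(MobileNetV2_Block, [16, 1, 1], 32, {Conv})

print(with_registration)     # ([32, 16, 1, 1], 16)
print(without_registration)  # ([16, 1, 1], 32)
```

With registration, the block receives (inp=32, oup=16, stride=1, expand_ratio=1), which matches the arguments printed in the model summary of section 2.4. Without it, the yaml values would be read as (inp=16, oup=1, stride=1) and the network would be built with the wrong channels or fail at the first forward pass.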

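2.4 Running the program

In train.py, set the cfg parameter to the path of yolov5_MobileNetv2.yaml

An absolute path is recommended so that the file is always found

🚀Run the program; if output like the following appears, the module was added successfully🚀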

                 from  n    params  module                                  arguments
  0                -1  1       928  models.common.Conv                      [3, 32, 3, 2]
  1                -1  1       896  models.common.MobileNetV2_Block         [32, 16, 1, 1]
  2                -1  1      5136  models.common.MobileNetV2_Block         [16, 24, 2, 6]
  3                -1  1      8832  models.common.MobileNetV2_Block         [24, 24, 1, 6]
  4                -1  1     10000  models.common.MobileNetV2_Block         [24, 32, 2, 6]
  5                -1  2     29696  models.common.MobileNetV2_Block         [32, 32, 1, 6]
  6                -1  1     21056  models.common.MobileNetV2_Block         [32, 64, 2, 6]
  7                -1  3    162816  models.common.MobileNetV2_Block         [64, 64, 1, 6]
  8                -1  1     66624  models.common.MobileNetV2_Block         [64, 96, 1, 6]
  9                -1  2    236544  models.common.MobileNetV2_Block         [96, 96, 1, 6]
 10                -1  1    155264  models.common.MobileNetV2_Block         [96, 160, 2, 6]
 11                -1  2    640000  models.common.MobileNetV2_Block         [160, 160, 1, 6]
 12                -1  1    473920  models.common.MobileNetV2_Block         [160, 320, 1, 6]
 13                -1  1    708928  models.common.SPPF                      [320, 1024, 5]
 14                -1  1    262656  models.common.Conv                      [1024, 256, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 9]  1         0  models.common.Concat                    [1]
 17                -1  1    321024  models.common.C3                        [352, 256, 1, False]
 18                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]
 19                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 20           [-1, 5]  1         0  models.common.Concat                    [1]
 21                -1  1     78592  models.common.C3                        [160, 128, 1, False]
 22                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]
 23          [-1, 18]  1         0  models.common.Concat                    [1]
 24                -1  1    296448  models.common.C3                        [256, 256, 1, False]
 25                -1  1    590336  models.common.Conv                      [256, 256, 3, 2]
 26          [-1, 14]  1         0  models.common.Concat                    [1]
 27                -1  1   1182720  models.common.C3                        [512, 512, 1, False]
 28      [21, 24, 27]  1    229245  Detect                                  [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
YOLOv5 summary: 347 layers, 5662397 parameters, 5662397 gradients, 11.5 GFLOPs
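As a sanity check, the params column for the MobileNetV2_Block rows can be reproduced by hand. The sketch below assumes each Conv/DWConv is a bias-free convolution followed by BatchNorm (two learnable values per channel) and that the depthwise convolution uses one group per channel; the helper names are hypothetical.

```python
def conv_bn_params(c_in, c_out, k, groups=1):
    """Learnable parameters of a bias-free k x k convolution followed by BatchNorm."""
    return (c_in // groups) * c_out * k * k + 2 * c_out


def mobilenetv2_block_params(c_in, c_out, expand_ratio):
    """Learnable parameters of one MobileNetV2_Block as defined in section 2.1."""
    hidden = int(round(c_in * expand_ratio))
    total = 0
    if expand_ratio != 1:
        total += conv_bn_params(c_in, hidden, 1)               # 1x1 expansion
    total += conv_bn_params(hidden, hidden, 3, groups=hidden)  # 3x3 depthwise
    total += conv_bn_params(hidden, c_out, 1)                  # 1x1 projection
    return total


print(mobilenetv2_block_params(32, 16, 1))      # layer 1
print(mobilenetv2_block_params(16, 24, 6))      # layer 2
print(2 * mobilenetv2_block_params(32, 32, 6))  # layer 5 (block repeated twice)
```

If the numbers you get from your own run differ from these, the most likely causes are a changed width_multiple or a block that was not registered in parse_model.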

3. Complete code

https://pan.baidu.com/s/1uxmTHtaXpeL-hWyP1-me1w?pwd=r6hg

Extraction code: r6hg

4. GFLOPs

For how GFLOPs are calculated, see: 百面算法工程师 | Convolution Basics: Convolution
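As a rough illustration of where the savings come from, the sketch below counts multiply-accumulate operations (MACs) for a standard 3x3 convolution and for the depthwise + pointwise pair that replaces it. The layer sizes (32 to 64 channels on an 80x80 output) are arbitrary example values and `conv_macs` is a hypothetical helper; note that tools differ on whether one MAC is reported as one or two FLOPs.

```python
def conv_macs(c_in, c_out, k, out_h, out_w, groups=1):
    """Multiply-accumulates of a k x k convolution producing an out_h x out_w map."""
    return (c_in // groups) * k * k * c_out * out_h * out_w


# Standard 3x3 convolution, 32 -> 64 channels, 80 x 80 output
standard = conv_macs(32, 64, 3, 80, 80)

# Depthwise 3x3 (one filter per channel) followed by a pointwise 1x1 convolution
separable = conv_macs(32, 32, 3, 80, 80, groups=32) + conv_macs(32, 64, 1, 80, 80)

print(standard)              # 117964800
print(separable)             # 14950400
print(standard / separable)  # about 7.9x fewer MACs
```

The ratio separable / standard equals 1/c_out + 1/k^2, so with 3x3 kernels the depthwise separable form costs roughly one eighth to one ninth of a standard convolution when the channel count is large.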

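GFLOPs before the modification

GFLOPs after the modification

~~I do not have a GPU at hand right now. I will fill this in once I have one again; if you need the numbers, please measure them yourself.~~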

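5. Going further

This change can be combined with loss-function or convolution-module changes for multiple improvements

YOLOv5 Improvements | Loss Functions | EIoU, SIoU, WIoU, DIoU, FocuSIoU and other loss functions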

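6. Summary

MobileNetV2 is an efficient convolutional neural network architecture proposed by Google in 2018 and designed for mobile and embedded devices. By introducing the inverted residual structure and linear bottleneck layers, it improves computational efficiency and significantly reduces the amount of computation and the number of parameters. MobileNetV2 performs well on image classification, object detection and semantic segmentation, and on resource-constrained devices it can deliver accuracy close to that of larger models.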
