Introduction to YOLOv10
Main contributions:
- NMS-free consistent dual assignments: YOLOv10 proposes a dual label assignment strategy that eliminates non-maximum suppression (NMS). It combines the strengths of one-to-many and one-to-one assignment, improving efficiency while preserving performance.
- Efficient network design
  - Lightweight classification head: reduces computational overhead without significantly affecting performance.
  - Spatial-channel decoupled downsampling: decouples spatial downsampling from channel adjustment, cutting computational cost.
  - Rank-guided block design: adapts the block design to the intrinsic rank of each stage, reducing redundancy and improving efficiency.
  - Large-kernel convolution and partial self-attention (PSA): enlarge the receptive field and strengthen global modeling with little extra computational cost.
- Consistent dual assignment strategy
  - One-to-many assignment: during training, multiple predicted boxes are assigned to each ground-truth object. This provides rich supervision signals and optimizes better.
  - One-to-one assignment: only one predicted box is assigned to each ground-truth object, which avoids NMS, but the weaker supervision tends to slow convergence and degrade performance.
  - Dual-head architecture: during training the model runs two prediction heads, one trained with one-to-many assignment and one with one-to-one assignment. The model thus benefits from the rich supervision of one-to-many assignment during training, while inference uses only the one-to-one head's predictions, enabling efficient NMS-free inference (a minimal sketch follows this list).
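To make the train-with-both, infer-with-one idea concrete, here is a minimal sketch. The module names (backbone, one2many_head, one2one_head) are illustrative placeholders, not YOLOv10's actual module names, and the real heads share most of their parameters:

import torch
import torch.nn as nn

class DualHeadDetector(nn.Module):
    """Sketch: two heads during training, only the one-to-one head at inference."""

    def __init__(self, backbone, one2many_head, one2one_head):
        super().__init__()
        self.backbone = backbone
        self.one2many_head = one2many_head  # supervised with one-to-many assignment (rich supervision)
        self.one2one_head = one2one_head    # supervised with one-to-one assignment (NMS-free)

    def forward(self, x):
        feats = self.backbone(x)
        if self.training:
            # Both heads are supervised during training; their losses are summed.
            return self.one2many_head(feats), self.one2one_head(feats)
        # At inference only the one-to-one head is kept, so no NMS step is needed.
        return self.one2one_head(feats)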
Head optimization
- Combining the one-to-many and one-to-one box assignment strategies, the network adds both types of head modules; during inference only the one-to-one head is kept (the shared matching metric is sketched below).
- Relative to the classification head, the regression head carries more importance for performance.
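What makes the dual assignment "consistent" is that, per the YOLOv10 paper, both heads rank candidate matches with the same metric, m = s · p^α · IoU^β, where s is the spatial prior, p the classification score, and IoU the overlap with the ground-truth box; the one-to-one head's single match then agrees with the one-to-many head's best match. A minimal sketch, where the α/β defaults are assumed (common one-to-many settings), not quoted from the paper:

def match_metric(p, iou, s=1.0, alpha=0.5, beta=6.0):
    """Consistent matching metric m = s * p**alpha * iou**beta.

    The one-to-many branch keeps the top-k pairs under this score, the
    one-to-one branch keeps only the top-1, so its choice stays aligned
    with the richer branch. alpha/beta here are assumed defaults.
    """
    return s * (p ** alpha) * (iou ** beta)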
Efficiency-driven model design
- Spatial-channel decoupled downsampling: first use pointwise (1×1) convolution to adjust the channel dimension, then use depthwise convolution for spatial downsampling (see the parameter-count sketch after this list).
- Rank-guided block design: proposes a compact inverted block (CIB) structure, which uses cheap depthwise convolutions for spatial mixing and efficient pointwise (1×1) convolutions for channel mixing, as in figure (b), as an efficient basic building block.
- As model scale grows, the receptive field naturally widens and the benefit of large-kernel convolution diminishes, so the authors adopt large-kernel convolutions only for small model scales.
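To see why the decoupling pays off, a quick parameter count comparing a standard strided 3×3 convolution against the pointwise-then-depthwise pair (bias-free convolutions, BN ignored; the channel sizes are illustrative, not from the paper):

import torch.nn as nn

c1, c2 = 256, 512

# Coupled baseline: one 3x3 stride-2 conv changes channels and resolution at once.
coupled = nn.Conv2d(c1, c2, 3, 2, 1, bias=False)

# Decoupled: a 1x1 pointwise conv adjusts channels, then a 3x3 depthwise
# stride-2 conv (groups=c2) performs the spatial downsampling.
decoupled = nn.Sequential(
    nn.Conv2d(c1, c2, 1, 1, 0, bias=False),
    nn.Conv2d(c2, c2, 3, 2, 1, groups=c2, bias=False),
)

params = lambda m: sum(p.numel() for p in m.parameters())
print(params(coupled))    # 9 * c1 * c2    = 1,179,648
print(params(decoupled))  # c1*c2 + 9*c2   =   135,680  (~8.7x fewer)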
CIB
import torch
import torch.nn as nn


def autopad(k, p=None, d=1):
    """Pad to 'same' shape outputs (helper from ultralytics)."""
    if d > 1:
        k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k]
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]
    return p


class Conv(nn.Module):
    """Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation)."""

    default_act = nn.SiLU()  # default activation

    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
        """Initialize Conv layer with given arguments including activation."""
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()

    def forward(self, x):
        """Apply convolution, batch normalization and activation to input tensor."""
        return self.act(self.bn(self.conv(x)))

    def forward_fuse(self, x):
        """Apply convolution and activation with BN already folded into the conv weights."""
        return self.act(self.conv(x))


class CIB(nn.Module):
    """Compact Inverted Block: depthwise convs for spatial mixing, pointwise convs for channel mixing."""

    def __init__(self, c1, c2, shortcut=True, e=0.5, lk=False):
        """Initialize the block with input/output channels, shortcut option, expansion ratio, and an
        optional large-kernel (RepVGGDW) middle stage."""
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = nn.Sequential(
            Conv(c1, c1, 3, g=c1),  # depthwise 3x3: spatial mixing
            Conv(c1, 2 * c_, 1),  # pointwise: channel expansion
            Conv(2 * c_, 2 * c_, 3, g=2 * c_) if not lk else RepVGGDW(2 * c_),  # depthwise, or large-kernel variant
            Conv(2 * c_, c2, 1),  # pointwise: channel projection
            Conv(c2, c2, 3, g=c2),  # depthwise 3x3: spatial mixing
        )
        self.add = shortcut and c1 == c2

    def forward(self, x):
        """Apply the block, adding a residual connection when input and output shapes match."""
        return x + self.cv1(x) if self.add else self.cv1(x)
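CIB refers to RepVGGDW for its large-kernel variant (lk=True), which is not reproduced in this excerpt. A minimal sketch consistent with the large-kernel depthwise idea; the kernel sizes (a 7×7 depthwise branch summed with a parallel 3×3 depthwise branch) are assumptions here, and the real Ultralytics block can additionally be re-parameterized into a single conv for deployment:

class RepVGGDW(nn.Module):
    """Sketch of a RepVGG-style large-kernel depthwise block (kernel sizes assumed)."""

    def __init__(self, ed):
        super().__init__()
        self.conv = Conv(ed, ed, 7, 1, g=ed, act=False)   # large-kernel depthwise branch
        self.conv1 = Conv(ed, ed, 3, 1, g=ed, act=False)  # small-kernel depthwise branch
        self.act = nn.SiLU()

    def forward(self, x):
        # Branch outputs are summed; at deploy time the branches can be fused into one conv.
        return self.act(self.conv(x) + self.conv1(x))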
class C2fCIB(C2f):
    """C2f variant that swaps the standard Bottleneck blocks for CIB blocks."""

    def __init__(self, c1, c2, n=1, shortcut=False, lk=False, g=1, e=0.5):
        """Initialize with args ch_in, ch_out, number, shortcut, large-kernel flag, groups, expansion.
        C2f (and the Bottleneck it builds) is defined further below."""
        super().__init__(c1, c2, n, shortcut, g, e)
        self.m = nn.ModuleList(CIB(self.c, self.c, shortcut, e=1.0, lk=lk) for _ in range(n))


class Attention(nn.Module):
    """Multi-head self-attention over flattened spatial positions, with a depthwise-conv positional encoding."""

    def __init__(self, dim, num_heads=8, attn_ratio=0.5):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.key_dim = int(self.head_dim * attn_ratio)
        self.scale = self.key_dim ** -0.5
        nh_kd = self.key_dim * num_heads
        h = dim + nh_kd * 2
        self.qkv = Conv(dim, h, 1, act=False)
        self.proj = Conv(dim, dim, 1, act=False)
        self.pe = Conv(dim, dim, 3, 1, g=dim, act=False)  # positional encoding

    def forward(self, x):
        B, C, H, W = x.shape
        N = H * W
        qkv = self.qkv(x)
        q, k, v = qkv.view(B, self.num_heads, self.key_dim * 2 + self.head_dim, N).split(
            [self.key_dim, self.key_dim, self.head_dim], dim=2
        )
        attn = (q.transpose(-2, -1) @ k) * self.scale
        attn = attn.softmax(dim=-1)
        x = (v @ attn.transpose(-2, -1)).view(B, C, H, W) + self.pe(v.reshape(B, C, H, W))
        x = self.proj(x)
        return x
class PSA(nn.Module):
    """Partial Self-Attention: runs attention and an FFN on only half of the channels."""

    def __init__(self, c1, c2, e=0.5):
        super().__init__()
        assert c1 == c2
        self.c = int(c1 * e)
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv(2 * self.c, c1, 1)
        self.attn = Attention(self.c, attn_ratio=0.5, num_heads=self.c // 64)
        self.ffn = nn.Sequential(Conv(self.c, self.c * 2, 1), Conv(self.c * 2, self.c, 1, act=False))

    def forward(self, x):
        a, b = self.cv1(x).split((self.c, self.c), dim=1)  # `a` bypasses attention entirely
        b = b + self.attn(b)
        b = b + self.ffn(b)
        return self.cv2(torch.cat((a, b), 1))


class SCDown(nn.Module):
    """Spatial-channel decoupled downsampling: 1x1 conv sets channels, depthwise conv applies the stride."""

    def __init__(self, c1, c2, k, s):
        super().__init__()
        self.cv1 = Conv(c1, c2, 1, 1)
        self.cv2 = Conv(c2, c2, k=k, s=s, g=c2, act=False)

    def forward(self, x):
        return self.cv2(self.cv1(x))


class C2f(nn.Module):
    """Faster Implementation of CSP Bottleneck with 2 convolutions."""

    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):
        """Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number,
        shortcut, groups, expansion. `Bottleneck` is the standard ultralytics block (not reproduced here)."""
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.ModuleList(
            Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n)
        )

    def forward(self, x):
        """Forward pass through C2f layer."""
        y = list(self.cv1(x).chunk(2, 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))

    def forward_split(self, x):
        """Forward pass using split() instead of chunk()."""
        y = list(self.cv1(x).split((self.c, self.c), 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))
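As a quick smoke test of the two blocks above that need no undefined dependencies (the shapes and channel counts are illustrative, not from the paper):

import torch

x = torch.randn(1, 256, 20, 20)    # dummy feature map
psa = PSA(256, 256)                # attention over half of 256 channels (2 heads of dim 64)
down = SCDown(256, 512, k=3, s=2)  # 1x1 conv to 512 channels, then stride-2 depthwise conv

print(psa(x).shape)   # torch.Size([1, 256, 20, 20])
print(down(x).shape)  # torch.Size([1, 512, 10, 10])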