
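FPN network structure explained, with architecture diagrams

A first look at FPN

FPN stands for Feature Pyramid Network. A feature pyramid is the pyramid-shaped stack of feature maps at different scales extracted by a deep convolutional neural network (DCNN). The paper proposes a new way of fusing these features. Although it was published some time ago, the structure is still in common use and is especially worth trying when detecting small objects.

The aim of the paper is to make good use of the semantic information at the different scales of the feature pyramid. Many ways of combining multi-scale features already existed before it, and the paper opens by reviewing them: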

[Figure: ways of using multi-scale features, panels (a) to (e)]

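(a) Featurized image pyramid. To obtain features at different scales, the same image is resized to several sizes and each copy is fed through the network separately, computing a feature map and a prediction for each. This improves accuracy, but the compute cost is too high to be practical in industrial applications.
(b) Single feature map. The structure commonly used by classification networks. Deep features carry rich semantic information, which suits classification. Because classification is insensitive to object position, the shallow features, which are rich in positional information, are not reused, and this is also why such networks detect small objects poorly.
(c) Pyramidal feature hierarchy. The way SSD uses multi-scale features: predictions are made on feature maps at different scales. The authors devote a paragraph to this scheme, arguing that the shallow features SSD uses are not "shallow" enough, and that they found even shallower features to be very important for detecting small objects.
(d) Feature Pyramid Network. The subject of this article: a new feature fusion scheme that improves accuracy while keeping speed in check. Details follow below.
(e) The structure used by U-Net. Its overall shape is similar to (d), but predictions are made only at the last layer.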

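FPN structure in detail

The structure of FPN is fairly simple and can be summarized as: feature extraction, upsampling, feature fusion, and multi-scale feature output. The input is an image of arbitrary size and the output is a set of feature maps at several scales. Like U-Net, the network is divided into a bottom-up part and a top-down part. Bottom-up is the feature extraction pass and corresponds to the encoder in U-Net. The paper uses ResNet as the backbone, built from the following bottleneck structure: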

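[Figure: ResNet bottleneck block]

The top-down part takes the deepest feature map and upsamples it level by level to the resolution of the corresponding bottom-up output, fusing it with that output at each level. Fusion is by element-wise addition, whereas U-Net fuses by concatenation (I covered the difference between the two in my earlier U-Net article, so I won't repeat it here). The enlarged part of the figure below contains three steps: 1. upsample the output of the level above by a factor of 2; 2. apply a 1×1 convolution to the corresponding bottom-up feature map so that the channel counts match; 3. add the results of the two previous steps.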

[Figure: FPN top-down pathway and lateral connections, with one merge step enlarged]
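
The three merge steps can be sketched in a few lines of PyTorch. This is only an illustration under stated assumptions: the function name `upsample_add`, the standalone `lateral_conv` layer, and the tensor shapes (ResNet-50 style C4 and P5 for a 224×224 input) are chosen here for the example and are not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def upsample_add(top, lateral):
    """Upsample `top` to the spatial size of `lateral` (nearest neighbor) and add them."""
    _, _, h, w = lateral.shape
    return F.interpolate(top, size=(h, w), mode="nearest") + lateral


# Assumed shapes: C4 has 1024 channels at 14x14, P5 has 256 channels at 7x7.
lateral_conv = nn.Conv2d(1024, 256, kernel_size=1)  # step 2: match channel counts
c4 = torch.randn(1, 1024, 14, 14)
p5 = torch.randn(1, 256, 7, 7)

p4 = upsample_add(p5, lateral_conv(c4))  # steps 1 and 3: upsample, then add
print(p4.shape)  # torch.Size([1, 256, 14, 14])
```

Passing the target size explicitly, rather than a fixed scale factor, keeps the addition valid when the two maps do not differ by exactly a factor of 2.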
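That is the basic structure of FPN: simple and effective, which is in keeping with Kaiming He's usual style. The implementation is described next.

Code implementation

The FPN structure is simple and clearly described in the paper, so it is worth implementing yourself when you have time. Below are the paper's descriptions of the network structure alongside a PyTorch implementation. Comments and discussion are welcome.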

Bottom-Up

This process is independent of the backbone convolutional architectures, and in this paper we present results using ResNets.

The paper chooses ResNet as the bottom-up pathway, so the ResNet from torchvision can be used directly:
self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
self.bn1 = nn.BatchNorm2d(64)
self.relu = nn.ReLU(inplace=True)
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
# Bottom-up stages
self.layer1 = self._make_layer(block, 64, layers[0], stride=1)
self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
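
To see what these strides mean for the pyramid, the following sketch computes the spatial size of C2 to C5 for a square input using the standard convolution output-size formula. The helper names `conv_out` and `bottom_up_sizes` are made up for this example, and it assumes each stride-2 stage downsamples with a 3×3, padding-1 convolution, as torchvision's bottleneck block does.

```python
def conv_out(n, kernel, stride, padding):
    """Output length of a convolution or pooling layer along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def bottom_up_sizes(size):
    """Return the spatial size of C2..C5 for a square input of side `size`."""
    n = conv_out(size, kernel=7, stride=2, padding=3)  # conv1
    n = conv_out(n, kernel=3, stride=2, padding=1)     # maxpool
    sizes = {"C2": n}                                  # layer1 has stride 1
    for name in ("C3", "C4", "C5"):                    # layer2..layer4 have stride 2
        n = conv_out(n, kernel=3, stride=2, padding=1)
        sizes[name] = n
    return sizes


print(bottom_up_sizes(224))  # {'C2': 56, 'C3': 28, 'C4': 14, 'C5': 7}
```

In other words, C2 to C5 sit at strides 4, 8, 16 and 32 relative to the input image.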

Top layer

To start the iteration, we simply attach a 1×1 convolutional layer on C5 to produce the coarsest resolution map.
We set d = 256 in this paper and thus all extra convolutional layers have 256-channel outputs.

A 1×1 convolution is applied to C5 (the output of layer4) so that every level of the feature pyramid has 256 channels.
self.toplayer = conv1x1(2048, 256)

Top-Down

With a coarser-resolution feature map, we upsample the spatial resolution by a factor of 2 (using nearest neighbor upsampling for simplicity).

Each upsampling step has a factor of 2 and uses nearest-neighbor interpolation.
F.interpolate(x, size=(H, W), mode='nearest')
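
As a small illustration of what nearest-neighbor upsampling does, the sketch below uses `F.interpolate` (the current PyTorch name for this operation) on a tiny tensor. The tensors and the odd-sized case are made up for the example.

```python
import torch
import torch.nn.functional as F

# A 2x2 map upsampled to 4x4: every value is repeated in a 2x2 block.
x = torch.tensor([[[[1.0, 2.0],
                    [3.0, 4.0]]]])  # shape (1, 1, 2, 2)
y = F.interpolate(x, size=(4, 4), mode="nearest")
print(y[0, 0])
# tensor([[1., 1., 2., 2.],
#         [1., 1., 2., 2.],
#         [3., 3., 4., 4.],
#         [3., 3., 4., 4.]])

# Why pass `size` rather than a scale factor: with odd-sized inputs the
# finer map may be 15x15 while the coarser one is 8x8.
top = torch.zeros(1, 1, 8, 8)
by_factor = F.interpolate(top, scale_factor=2, mode="nearest")
by_size = F.interpolate(top, size=(15, 15), mode="nearest")
print(by_factor.shape[-2:], by_size.shape[-2:])
# torch.Size([16, 16]) torch.Size([15, 15])
```

Only the `size` variant can be added element-wise to a 15×15 lateral map.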

The upsampled map is then merged with the corresponding bottom-up map (which undergoes a 1×1 convolutional layer to reduce channel dimensions) by element-wise addition.

The bottom-up outputs C2, C3 and C4 each need a 1×1 convolution so that every level of the feature pyramid has 256 channels.
self.laterallayer1 = conv1x1(1024, 256)  # applied to C4
self.laterallayer2 = conv1x1( 512, 256)  # applied to C3
self.laterallayer3 = conv1x1( 256, 256)  # applied to C2
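
The article uses the helpers `conv1x1` and `conv3x3` without showing their definitions. A minimal sketch, assuming they are thin wrappers around `nn.Conv2d` (the padding and bias settings here are guesses, not taken from the original code):

```python
import torch
import torch.nn as nn


def conv1x1(in_channels, out_channels):
    """1x1 convolution: changes the channel count, keeps the spatial size."""
    return nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=1, padding=0)


def conv3x3(in_channels, out_channels):
    """3x3 convolution with padding 1 (assumed), so the spatial size is kept."""
    return nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1)


# Example on a C3-like tensor (512 channels, 28x28).
c3 = torch.randn(1, 512, 28, 28)
lateral = conv1x1(512, 256)(c3)
smoothed = conv3x3(256, 256)(lateral)
print(lateral.shape, smoothed.shape)
# torch.Size([1, 256, 28, 28]) torch.Size([1, 256, 28, 28])
```

The input channel counts 256, 512, 1024 and 2048 match ResNets built from bottleneck blocks (ResNet-50 and deeper). With the basic-block variants (ResNet-18/34) the stage outputs have 64, 128, 256 and 512 channels, so the lateral and top layers would need different input sizes.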

Finally, we append a 3×3 convolution on each merged map to generate the final feature map, which is to reduce the aliasing effect of upsampling.

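Finally, a 3×3 convolution is applied to each merged map to produce the final feature map, which reduces the aliasing effect of upsampling.
# Final conv layers
self.finalconv1 = conv3x3(256, 256)
self.finalconv2 = conv3x3(256, 256)
self.finalconv3 = conv3x3(256, 256)

With that, all the basic building blocks are in place. The whole forward pass is then: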
def forward(self, x):
    # Bottom-up
    c1 = self.relu(self.bn1(self.conv1(x)))
    c1 = self.maxpool(c1)
    c2 = self.layer1(c1)
    c3 = self.layer2(c2)
    c4 = self.layer3(c3)
    c5 = self.layer4(c4)
    # Top layer and top-down
    p5 = self.toplayer(c5)
    p4 = self._upsample_add(p5, self.laterallayer1(c4))
    p3 = self._upsample_add(p4, self.laterallayer2(c3))
    p2 = self._upsample_add(p3, self.laterallayer3(c2))
    # Final conv layers
    p4 = self.finalconv1(p4)
    p3 = self.finalconv2(p3)
    p2 = self.finalconv3(p2)
    return p2, p3, p4, p5

In the paper, FPN is embedded as a component in networks such as Fast R-CNN to improve their performance. Can FPN be used directly for semantic segmentation? Yes. One approach is to sum all the feature maps output by FPN into a single map and upsample it to the original image resolution to get the output, which also gives good results.
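
One possible reading of that segmentation idea is sketched below. The class name `SumFusionHead`, the 1×1 classifier, the choice to bring every map to the resolution of P2 before summing, and the bilinear final upsampling are all assumptions made for this example; the article does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SumFusionHead(nn.Module):
    """Sum FPN outputs at the finest resolution, then predict per-pixel classes."""

    def __init__(self, in_channels=256, num_classes=2):
        super().__init__()
        # Assumed: a 1x1 convolution maps the fused features to class scores.
        self.classifier = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, feats, image_size):
        # feats is ordered from finest to coarsest, e.g. (p2, p3, p4, p5).
        target = feats[0].shape[-2:]
        fused = feats[0]
        for f in feats[1:]:
            fused = fused + F.interpolate(f, size=target, mode="nearest")
        logits = self.classifier(fused)
        # Upsample the class scores to the original image resolution.
        return F.interpolate(logits, size=image_size, mode="bilinear",
                             align_corners=False)


# Dummy pyramid for a 224x224 image (strides 4, 8, 16, 32).
feats = [torch.randn(1, 256, s, s) for s in (56, 28, 14, 7)]
head = SumFusionHead(in_channels=256, num_classes=3)
out = head(feats, image_size=(224, 224))
print(out.shape)  # torch.Size([1, 3, 224, 224])
```

Summation works here because every pyramid level already has the same number of channels (256); only the spatial sizes need to be aligned first.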
