
RTSeg: Real-time Semantic Segmentation Comparative Study

Abstract: Most of the research on semantic segmentation focuses only on increasing the accuracy of segmentation models, with little attention to computationally efficient solutions. Real-time segmentation is therefore well worth pursuing. The benchmark is built around feature extraction and decoding methods.

Feature extraction: VGG16, ResNet18, MobileNet and ShuffleNet

Decoding: SkipNet, U-Net, Dilation Frontend

Dataset: Cityscapes dataset for urban scenes

1. Introduction

FCN: upsampling with transposed convolution
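
To make the transposed convolution mentioned above concrete, here is a minimal 1-D sketch in plain Python. It is an illustration only, not code from the paper: the function name `transposed_conv1d`, the kernel values, and the lack of padding are all assumptions chosen for brevity. Each input value is scattered into the output through the kernel, and the stride spaces those contributions apart, which is what makes the output longer than the input.

```python
def transposed_conv1d(x, kernel, stride=2):
    """Upsample a 1-D signal with a transposed convolution (no padding).

    Hypothetical helper for illustration; real networks use learned 2-D kernels.
    """
    out_len = (len(x) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, value in enumerate(x):
        # Each input element adds a scaled copy of the kernel to the output.
        for k, weight in enumerate(kernel):
            out[i * stride + k] += value * weight
    return out


if __name__ == "__main__":
    # With a kernel of ones and stride 2, every input value is simply repeated.
    print(transposed_conv1d([1.0, 2.0, 3.0], [1.0, 1.0], stride=2))
    # -> [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
```

With a learned kernel instead of fixed ones, the same mechanism lets the network learn its own upsampling filter.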

Datasets: Pascal, NYU RGBD, Cityscapes and Mapillary

ENet's accuracy is too low; real-time methods, including ICNet, all give unsatisfactory results.

  1. Provide feature extraction and decoding methods; the decoding method is termed the meta-architecture
  2. Present a trade-off between accuracy and computational efficiency
  3. ShuffleNet leads to a 143x GFLOPs reduction in comparison to SegNet

 

2. Benchmarking framework

2.1 Meta-architectures

downsampling factor is 32

SkipNet

U-Net

Dilation frontend

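Dilated convolutions are used in place of downsampled feature maps. Dilated convolution lets the network keep a sufficiently large receptive field without destroying pixel structure through pooling and strided convolution.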
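
The receptive-field claim above can be checked with a little arithmetic. The sketch below is illustrative only and is not the paper's network: the helper name `receptive_field` and the two toy two-layer stacks are assumptions. It uses the standard recurrence in which each layer enlarges the receptive field by (kernel - 1) x dilation x the accumulated stride.

```python
def receptive_field(layers):
    """Return (receptive_field, output_stride) for a stack of conv layers.

    layers: list of (kernel_size, stride, dilation) tuples, input to output.
    Hypothetical helper for illustration.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf, jump


if __name__ == "__main__":
    # Toy stack A: a stride-2 3x3 conv followed by a plain 3x3 conv.
    strided = [(3, 2, 1), (3, 1, 1)]
    # Toy stack B: stride removed, second conv dilated by 2 instead.
    dilated = [(3, 1, 1), (3, 1, 2)]
    print(receptive_field(strided))  # -> (7, 2)
    print(receptive_field(dilated))  # -> (7, 1)
```

Both stacks see 7 input pixels per output, but the dilated one keeps an output stride of 1, so the feature map retains full resolution.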

2.2 Feature extraction architectures

 

3. Experiments

3.1 Experimental setup

Weighted cross entropy loss
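
As a rough sketch of what a weighted cross entropy loss computes, the plain-Python function below weights each pixel's loss by the weight of its ground-truth class. The notes do not say how the class weights are derived or how the loss is normalized, so the weights here are made-up values and dividing by the sum of weights is an assumption (one common convention).

```python
import math


def weighted_cross_entropy(logits, labels, class_weights):
    """Weighted cross entropy averaged over pixels.

    logits: list of per-pixel score lists (one score per class)
    labels: list of ground-truth class indices, one per pixel
    class_weights: list of per-class weights (hypothetical values)
    """
    total, weight_sum = 0.0, 0.0
    for scores, label in zip(logits, labels):
        # Numerically stable log-softmax for the true class.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        log_prob = scores[label] - log_z
        w = class_weights[label]
        total += -w * log_prob
        weight_sum += w
    # Assumption: normalize by the total weight rather than the pixel count.
    return total / weight_sum


if __name__ == "__main__":
    logits = [[2.0, 0.0], [2.0, 0.0]]  # both pixels predicted as class 0
    labels = [0, 1]                    # the second pixel is actually class 1
    print(weighted_cross_entropy(logits, labels, [1.0, 1.0]))
    print(weighted_cross_entropy(logits, labels, [1.0, 3.0]))
```

Raising the weight of class 1 makes the misclassified class-1 pixel dominate the average, which is the mechanism used to counter class imbalance.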

Adam optimizer

Learning rate is set to 1e-4

Batch normalization (BN)

L2 regularization with weight decay rate of 5e-4 is utilized to avoid over-fitting
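
To show how the learning rate and weight decay listed above enter an update, here is a single-parameter Adam step in plain Python. Only `lr=1e-4` and `weight_decay=5e-4` come from the notes. The `beta1`, `beta2` and `eps` values are the usual Adam defaults and are an assumption, as is adding the L2 term to the gradient (the classic coupled form rather than decoupled weight decay).

```python
import math


def adam_step(w, grad, m, v, t, lr=1e-4, weight_decay=5e-4,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar weight; returns (new_w, new_m, new_v).

    t is the 1-based step count. beta1, beta2, eps are assumed defaults.
    """
    g = grad + weight_decay * w            # L2 regularization added to the gradient
    m = beta1 * m + (1 - beta1) * g        # first-moment estimate
    v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
    m_hat = m / (1 - beta1 ** t)           # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


if __name__ == "__main__":
    w, m, v = adam_step(w=1.0, grad=0.5, m=0.0, v=0.0, t=1)
    print(w)  # the first step moves the weight by roughly lr
```

On the first step the bias-corrected ratio is close to 1 in magnitude, so the weight moves by about the learning rate regardless of the gradient scale.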

The feature extractor part of the network is initialized with the corresponding encoder pre-trained on ImageNet

Input image resolution is 512x1024
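
Combining this input resolution with the downsampling factor of 32 noted in Section 2.1 gives the size of the coarsest feature map. The small sketch below does the arithmetic; the helper name is hypothetical, and reaching the factor of 32 through five stride-2 stages is an assumption about the encoder layout.

```python
def feature_map_size(height, width, num_stride2_stages=5):
    """Return (out_height, out_width, factor) after repeated stride-2 downsampling.

    Hypothetical helper; assumes each stage halves both spatial dimensions.
    """
    factor = 2 ** num_stride2_stages
    return height // factor, width // factor, factor


if __name__ == "__main__":
    print(feature_map_size(512, 1024))  # -> (16, 32, 32)
```

The decoder therefore has to recover a 512x1024 prediction from a 16x32 grid, which is why the choice of decoding method matters.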

3.2 Semantic Segmentation results

Semantic segmentation is evaluated using mean intersection over union (mIOU), per-class IOU, and per-category IOU
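
A minimal sketch of how per-class IoU and mIoU can be computed from a confusion matrix, in plain Python. This is an illustration rather than the official Cityscapes evaluation: the function names are hypothetical, ignored or void labels are not handled, and per-category IoU (which groups classes into categories) is left out.

```python
def confusion_matrix(pred, target, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def per_class_iou(cm):
    """IoU = TP / (TP + FP + FN) for each class; NaN if the class never appears."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(row[c] for row in cm) - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(ious):
    """Average the per-class IoUs, skipping classes with undefined IoU."""
    valid = [x for x in ious if x == x]  # NaN != NaN
    return sum(valid) / len(valid)


if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    ious = per_class_iou(confusion_matrix(pred, target, num_classes=2))
    print(ious)            # class 0: 1/2, class 1: 2/3
    print(mean_iou(ious))  # about 0.583
```

Accumulating one confusion matrix over the whole validation set and computing IoU once at the end avoids averaging per-image scores, which would weight small images and rare classes differently.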