Object detection

The task of classifying and localizing multiple objects in an image is called object detection. A common approach is to take a CNN that was trained to classify and locate a single object, and then slide it across the image.
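The sliding step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the image is a 2D list of pixels, the window is square, and `predict_fn` is a hypothetical stand-in for the trained CNN that returns a score and a box relative to the crop it receives. None of these names come from the original text.

```python
def sliding_windows(image_height, image_width, window_size, stride):
    """Yield (top, left, bottom, right) for each square window position."""
    for top in range(0, image_height - window_size + 1, stride):
        for left in range(0, image_width - window_size + 1, stride):
            yield (top, left, top + window_size, left + window_size)


def detect_with_sliding_window(image, predict_fn, window_size, stride):
    """Run a single-object model on every window and collect its predictions.

    predict_fn is a hypothetical placeholder for the trained CNN: it maps a
    crop to (score, (top, left, bottom, right)), with the box given relative
    to the crop.
    """
    detections = []
    height, width = len(image), len(image[0])
    for top, left, bottom, right in sliding_windows(height, width, window_size, stride):
        crop = [row[left:right] for row in image[top:bottom]]
        score, (t, l, b, r) = predict_fn(crop)
        # Shift the predicted box back into full-image coordinates.
        detections.append((score, (top + t, left + l, top + b, left + r)))
    return detections
```

Note that the model is called once per window, so neighboring windows that all see the same object will each report it.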

This technique is simple and intuitive, but it detects the same object multiple times, at slightly different positions. Some post-processing is then needed to get rid of the unnecessary bounding boxes. A common approach is called non-max suppression (NMS). Here is how it works:

  • First, add an extra objectness output (a confidence score) to the CNN, to estimate the probability that a flower (the object of interest in this example) is actually present in the image. Alternatively, you could add a “no flower” class, but this usually does not work as well. The objectness output must use the sigmoid activation function, and it can be trained with the binary cross-entropy loss. Then remove all bounding boxes whose objectness score is below some threshold: this drops the boxes that do not actually contain a flower.
  • Find the remaining bounding box with the highest objectness score, and remove all other remaining boxes that overlap it substantially (e.g., with an IoU greater than 60%).
  • Repeat step 2 until there are no more bounding boxes to remove.
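The three steps above can be sketched as follows. This is a minimal pure-Python illustration, not a reference implementation: the `(x1, y1, x2, y2)` box format, the function names, and the default score threshold of 0.5 are assumptions made for the example; only the 60% IoU threshold comes from the text.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, score_threshold=0.5, iou_threshold=0.6):
    """Return the indices of the boxes that survive non-max suppression."""
    # Step 1: drop boxes whose objectness score is below the threshold
    # (0.5 is an assumed default, not a value from the text).
    candidates = [i for i, s in enumerate(scores) if s >= score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)

    kept = []
    while candidates:
        # Step 2: keep the highest-scoring remaining box...
        best = candidates.pop(0)
        kept.append(best)
        # ...and remove every other box that overlaps it too much.
        candidates = [i for i in candidates
                      if iou(boxes[best], boxes[i]) <= iou_threshold]
        # Step 3: the loop repeats until no candidates remain.
    return kept
```

For example, given two heavily overlapping boxes around the same object and one box elsewhere, only the higher-scoring of the overlapping pair and the separate box are kept.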

This simple approach to object detection works well, but it requires running the CNN many times, so it is quite slow. Fortunately, there is a much faster way to slide a CNN across an image: using a fully convolutional network (FCN).
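To get a feel for why the plain sliding approach is slow, the sketch below counts how many times the CNN would have to run for a single window size. The image size, window size, and stride are made-up illustration values, and `count_windows` is a hypothetical helper name.

```python
def count_windows(image_height, image_width, window_size, stride):
    """Number of square window positions, i.e. number of CNN forward passes."""
    if window_size > image_height or window_size > image_width:
        return 0
    rows = (image_height - window_size) // stride + 1
    cols = (image_width - window_size) // stride + 1
    return rows * cols


# Assumed example values: a 224x224 image, 64x64 windows, stride of 16 pixels.
print(count_windows(224, 224, 64, 16))  # prints 121
```

Sliding windows of several sizes multiplies this count further, which is the cost that the FCN approach avoids.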