Counting Objects by Blockwise Classification

IEEE Transactions on Circuits and Systems for Video Technology (2020)

Abstract
In this paper, we introduce the idea of blockwise classification to count objects. The current mainstream approach to object counting is to regress a density map or a redundant count map with a deep convolutional neural network (CNN). However, these methods suffer from two critical issues: inaccurately generated regression targets and severe sample imbalance. First, the ground-truth density map is generated by convolving the dot map with a Gaussian kernel. Because an inappropriately sized kernel can spill onto the background or fail to cover the objects, this approach introduces a form of noise and therefore creates ambiguity when training the networks. Second, objects are often distributed inhomogeneously within images, which gives rise to a data collection bias. This leads to a long-tailed distribution of region counts, a typical characteristic of imbalanced samples; as a result, underestimation in high-density regions and overestimation in low-density regions are common. We address these two issues within one framework: blockwise count-level classification. The intuition behind this idea is that, while it may not be possible to provide an exact count for a pixel or patch, it is possible to state with high confidence that the count of a region falls within a certain interval. Our method classifies the count level of each block, where count levels are produced by nonlinearly quantizing the continuous counts, thus transforming the imbalance of sample patch counts into a class imbalance of count levels. Consequently, an information-entropy-inspired loss can be applied to alleviate this issue. Through ablation studies, we analyze the impact of imbalanced data, Gaussian kernel sizes, and quantization errors, as well as the effectiveness of each module in our method.
Without bells and whistles, our method outperforms or performs competitively with other state-of-the-art approaches on seven object-counting benchmarks: four crowd-counting datasets (ShanghaiTech, WorldExpo’10, UCF-QNRF, and UCF_CC_50), one vehicle-counting dataset (TRANCOS), one maize-tassel-counting dataset (MTC), and one challenging sonar fish-counting dataset that we constructed. The results suggest that our framework provides a strong and improved baseline for object counting.
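
The quantization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `block_counts` and `count_levels`, the block size, and the roughly log-spaced interval edges are all assumptions chosen for illustration, and the abstract does not specify the actual intervals or the loss. The sketch sums dot annotations inside non-overlapping blocks and then maps each continuous count to an integer count level.

```python
import numpy as np

def block_counts(dot_map, block):
    """Sum annotation dots inside each non-overlapping block x block cell.

    Hypothetical helper: rows/columns that do not fill a whole block are
    dropped to keep the example short.
    """
    h, w = dot_map.shape
    h, w = h - h % block, w - w % block
    cells = dot_map[:h, :w].reshape(h // block, block, w // block, block)
    return cells.sum(axis=(1, 3))

def count_levels(counts, edges):
    """Map continuous counts to integer class labels given interval edges."""
    return np.digitize(counts, edges)

# Assumed, roughly log-spaced interval edges (not taken from the paper):
# level 0: count 0, level 1: 1, level 2: 2-3, level 3: 4-7,
# level 4: 8-15, level 5: 16 or more.
edges = np.array([0.5, 1.5, 3.5, 7.5, 15.5])

# Toy 4x4 dot map: three objects in the top-left block, one bottom-right.
dot_map = np.zeros((4, 4))
dot_map[0, 0] = dot_map[0, 1] = dot_map[1, 1] = 1
dot_map[2, 3] = 1

counts = block_counts(dot_map, block=2)
levels = count_levels(counts, edges)
```

With nonlinear edges like these, sparse blocks get fine-grained levels while dense blocks share wider intervals, so the long tail of region counts becomes a class imbalance that a classification loss can reweight.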
Keywords
Kernel,Nonhomogeneous media,Task analysis,Feature extraction,Quantization (signal),Convolutional neural networks,Benchmark testing