LIN Jun, PAN Wenbo, YOU Jun, et al. Hard example mining method for visual perception based on multi-sensor fusion[J]. Electric Drive for Locomotives, 2021(6): 93-99. DOI: 10.13890/j.issn.1000-128x.2021.06.013.
Hard Example Mining Method for Visual Perception Based on Multi-sensor Fusion
Hard examples for visual perception can effectively improve object detection performance in autonomous driving scenes; however, they are difficult to obtain in the real world. A hard example mining method for visual perception based on multi-sensor fusion was presented. Obstacle targets segmented from radar point clouds were used to verify the targets detected in images. Hard examples of image object detection and unlabeled samples in the real open world can be identified through the mapping relationship between multiple sensors, grounded in actual obstacles. These hard examples were used to train a new object detection model, which was then deployed remotely through the cloud-edge collaboration mechanism to realize optimization and iterative update of the model. Experiments show that this method can effectively mine and collect hard examples in autonomous driving scenes, and that these examples significantly improve object detection performance through incremental transfer learning. The algorithm also provides important guidance for autonomous driving scenarios in rail transit and other fields.
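The core mining step described above can be illustrated with a minimal sketch: image detections are cross-checked against obstacle clusters segmented from the point cloud and projected into the image plane, and unmatched boxes on either side become candidate hard examples. The function and box representation below are hypothetical illustrations, not the paper's actual implementation; the IoU threshold of 0.3 is an assumed parameter.

```python
# Hypothetical sketch of cross-modal verification for hard example mining.
# Boxes are axis-aligned (x1, y1, x2, y2) tuples in image coordinates.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def mine_hard_examples(cam_dets, lidar_boxes, iou_thr=0.3):
    """cam_dets: camera detector outputs; lidar_boxes: obstacle clusters
    segmented from the point cloud and projected into the image plane.
    Returns (missed, unsupported):
      - lidar obstacles with no matching image detection (likely misses),
      - image detections with no lidar support (likely false alarms).
    Both sets are candidate hard examples for incremental retraining."""
    missed = [lb for lb in lidar_boxes
              if all(iou(lb, d) < iou_thr for d in cam_dets)]
    unsupported = [d for d in cam_dets
                   if all(iou(d, lb) < iou_thr for lb in lidar_boxes)]
    return missed, unsupported
```

In this sketch, a lidar-confirmed obstacle that the detector misses is the most valuable case: it is an automatically labeled positive that the current model fails on, which is exactly the kind of sample the abstract proposes to feed back through transfer learning.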
Keywords
hard example mining; multi-sensor fusion; deep learning; visual perception; autonomous driving
LUO Jiangping, YU Xizhuo, CAO Jingwei, et al. Intelligent rail flaw detection system based on deep learning and support vector machine[J]. Electric Drive for Locomotives, 2021(2): 100-107.
HE Deqiang, JIANG Zhou, CHEN Jiyong, et al. Research on detection of bird nests in overhead catenary based on deep convolutional neural network[J]. Electric Drive for Locomotives, 2019(4): 126-130.
REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN:towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
REDMON J, FARHADI A. YOLOv3: an incremental improvement[DB/OL]. (2018-04-08) [2021-10-22]. https://arxiv.org/abs/1804.02767.
BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[DB/OL]. (2020-04-23) [2021-10-22]. https://arxiv.org/abs/2004.10934.
GENG C X, HUANG S J, CHEN S C. Recent advances in open set recognition: a survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(10): 3614-3631.
GAMA J, ŽLIOBAITĖ I, BIFET A, et al. A survey on concept drift adaptation[J]. ACM Computing Surveys, 2014, 46(4): 1-37.
WAN S H, CHEN Z J, ZHANG T, et al. Bootstrapping face detection with hard negative examples[DB/OL]. (2016-08-07) [2021-10-22]. https://arxiv.org/abs/1608.02236v1.
SHRIVASTAVA A, GUPTA A, GIRSHICK R. Training region-based object detectors with online hard example mining[C]//IEEE. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas: IEEE, 2016: 761-769.
WANG X L, SHRIVASTAVA A, GUPTA A. A-fast-RCNN: hard positive generation via adversary for object detection[C]//IEEE. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu: IEEE, 2017: 3039-3048.
ROSENBERG C, HEBERT M, SCHNEIDERMAN H. Semi-supervised self-training of object detection models[C]//IEEE. 2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV/MOTION'05)-Volume 1. Breckenridge: IEEE, 2005: 29-36.
TANG K, RAMANATHAN V, LI F F, et al. Shifting weights: adapting object detectors from image to video[C]//Curran Associates Inc. Proceedings of the 25th International Conference on Neural Information Processing Systems-Volume 1. Nevada: NIPS, 2012: 638-646.
JIN S Y, ROYCHOWDHURY A, JIANG H Z, et al. Unsupervised hard example mining from videos for improved object detection[M]//FERRARI V, HEBERT M, SMINCHISESCU C, et al. Computer Vision - ECCV 2018. Cham: Springer, 2018: 316-333.
CALTAGIRONE L, BELLONE M, SVENSSON L, et al. LIDAR-camera fusion for road detection using fully convolutional neural networks[J]. Robotics and Autonomous Systems, 2019, 111: 125-131.
KRISPEL G, OPITZ M, WALTNER G, et al. FuseSeg: LiDAR point cloud segmentation fusing multi-modal data[C]//IEEE. 2020 IEEE Winter Conference on Applications of Computer Vision (WACV). Snowmass: IEEE, 2020: 1863-1872.
JOCHER G, STOKEN A, BOROVEC J, et al. ultralytics/yolov5: v4.0 - nn.SiLU() activations, Weights & Biases logging, PyTorch Hub integration[DB/OL]. (2021-01-05) [2021-10-22]. https://doi.org/10.5281/zenodo.4418161.
ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 12993-13000.
CHARLES R Q, SU H, KAICHUN M, et al. PointNet: deep learning on point sets for 3D classification and segmentation[C]//IEEE. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu: IEEE, 2017: 77-85.
WANG Jun. Research on key technologies of environment perception system for unmanned vehicles[D]. Hefei: University of Science and Technology of China, 2016.
LIU Jian. Research on key technologies in unmanned vehicle driving environment modelling based on 3D lidar[D]. Hefei: University of Science and Technology of China, 2016.