Improving Deep Learning by Incorporating Semi-automatic Moving Object Annotation and Filtering for Vision-based Vehicle Detection
| Field | Value |
| --- | --- |
| Title | Improving Deep Learning by Incorporating Semi-automatic Moving Object Annotation and Filtering for Vision-based Vehicle Detection |
| Publication Type | Conference Paper |
| Year of Publication | 2019 |
| Authors | Feng, Ri-Chen; Lin, Daw-Tung; Chen, Ken-Min; Lin, Yi-Yao; Liu, Chin-De |
| Conference Name | 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC) |
| Publisher | IEEE |
| ISBN Number | 978-1-7281-4569-3 |
| Keywords | artifactual data, automatic foreground object extraction, big image data, Computer vision, data augmentation, Data models, dataset construction, Deep Learning, deep learning models, deep learning neural networks, deep learning you-only-look-once model, deep video, feature extraction, human-in-the-loop quick selection, image annotation, image filtering, image motion analysis, learning (artificial intelligence), Metrics, neural nets, object detection, Object segmentation, pubcrawl, resilience, Resiliency, road vehicles, Scalability, security monitoring systems, semiautomatic moving object annotation, semiautomatic moving object annotation method, street intersection surveillance videos, Task Analysis, Tools, traffic engineering computing, Training, Vehicle detection, video analytics applications, video frames, video signal processing, video surveillance, vision-based vehicle detection |
| Abstract | Deep learning has undergone tremendous advancements in computer vision studies. The training of deep learning neural networks depends on a large amount of ground-truth data. However, labeling ground-truth data is a labor-intensive task, particularly for large-volume video analytics applications such as video surveillance and vehicle detection for autonomous driving. This paper presents a rapid and accurate method for associative searching in big image data obtained from security monitoring systems. We developed a semi-automatic moving object annotation method for improving deep learning models. The proposed method comprises three stages: automatic foreground object extraction, object annotation in subsequent video frames, and dataset construction using human-in-the-loop quick selection. The proposed method expedites both dataset collection and ground-truth annotation. In contrast to data augmentation and generative models, the proposed method produces a large amount of real data, which may improve training results and avoid the adverse effects of artifactual data. We applied the constructed annotation dataset to train a deep learning you-only-look-once (YOLO) model to perform vehicle detection on street-intersection surveillance videos. Experimental results demonstrated that detection performance improved from a mean average precision (mAP) of 83.99 to 88.03. |
| URL | https://ieeexplore.ieee.org/document/8914169 |
| DOI | 10.1109/SMC.2019.8914169 |
| Citation Key | feng_improving_2019 |
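
To give a concrete feel for the pipeline the abstract describes, the sketch below illustrates the first stage (automatic foreground object extraction) together with export of candidate boxes in YOLO's normalized label format. The paper does not publish code, so this is a minimal stand-in built on assumptions: OpenCV's MOG2 background subtractor substitutes for the authors' unspecified foreground extractor, and the video path, blob-area threshold, and single `vehicle` class (index 0) are all hypothetical.

```python
"""Illustrative sketch (not the authors' released code): automatic foreground
object extraction via background subtraction, with candidate boxes written in
YOLO's label format for later human-in-the-loop screening."""
import os
import cv2

MIN_AREA = 900          # assumed: discard small blobs (noise, partial shadows)
VEHICLE_CLASS_ID = 0    # assumed: single "vehicle" class mapped to YOLO index 0


def extract_foreground_boxes(video_path):
    """Yield (frame_index, frame, [(x, y, w, h), ...]) for each video frame."""
    cap = cv2.VideoCapture(video_path)
    # MOG2 is a common background-subtraction choice; it stands in here for
    # the paper's "automatic foreground object extraction" stage.
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)
        # MOG2 marks shadow pixels as 127; threshold keeps only true foreground.
        _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
        # Close small holes so each vehicle yields one connected blob.
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        boxes = [cv2.boundingRect(c) for c in contours
                 if cv2.contourArea(c) >= MIN_AREA]
        yield idx, frame, boxes
        idx += 1
    cap.release()


def write_yolo_labels(label_path, boxes, img_w, img_h):
    """Write one YOLO line per box: class x_center y_center width height,
    all normalized to [0, 1] by the image dimensions."""
    with open(label_path, "w") as f:
        for x, y, w, h in boxes:
            xc = (x + w / 2) / img_w
            yc = (y + h / 2) / img_h
            f.write(f"{VEHICLE_CLASS_ID} {xc:.6f} {yc:.6f} "
                    f"{w / img_w:.6f} {h / img_h:.6f}\n")


if __name__ == "__main__":
    # Hypothetical input video and output directory.
    os.makedirs("labels", exist_ok=True)
    for idx, frame, boxes in extract_foreground_boxes("intersection.mp4"):
        h, w = frame.shape[:2]
        write_yolo_labels(f"labels/frame_{idx:06d}.txt", boxes, w, h)
```

In the authors' workflow, candidate boxes like these would be propagated across subsequent frames and then filtered by the human-in-the-loop quick-selection step; only the accepted labels feed YOLO training, which is where the reported mAP gain (83.99 to 88.03) comes from.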