Biblio

Filters: Keyword is image annotation
2022-06-06
Madono, Koki, Nakano, Teppei, Kobayashi, Tetsunori, Ogawa, Tetsuji.  2020.  Efficient Human-In-The-Loop Object Detection using Bi-Directional Deep SORT and Annotation-Free Segment Identification. 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). :1226–1233.
The present study proposes a method for detecting objects with a high recall rate for human-supported video annotation. In recent years, automatic annotation techniques such as object detection and tracking have become more powerful; however, detecting and tracking occluded, small, and blurred objects remains difficult. Annotating such objects inevitably requires manual work. For this reason, we envision a human-supported video annotation framework in which over-detected objects (i.e., false positives) are allowed in order to minimize oversights (i.e., false negatives) in automatic annotation, and the over-detected objects are then removed manually. This study attempts to achieve human-in-the-loop object detection with an emphasis on suppressing oversights in the former stage of the aforementioned annotation framework: bi-directional deep SORT is proposed to reliably capture missed objects, and annotation-free segment identification (AFSID) is proposed to identify video frames in which manual annotation is not required. These methods reinforce each other, increasing the detection rate while reducing the burden of human intervention. Experimental comparisons on a pedestrian video dataset demonstrated that bi-directional deep SORT with AFSID captured object candidates at a higher recall rate than the existing deep SORT while reducing manual effort compared to manual annotation at regular intervals.
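The abstract's core idea can be illustrated with a minimal sketch: run tracking forward and backward in time, take the union of the two passes' boxes per frame to recover missed objects, and flag frames where both passes fully agree as "annotation-free" (no manual check needed). The function names (`iou`, `merge_bidirectional`) and the agreement rule are illustrative assumptions, not the paper's exact procedure.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def merge_bidirectional(fwd, bwd, thr=0.5):
    """fwd/bwd: per-frame lists of boxes from the forward/backward pass.
    Returns merged boxes plus a per-frame flag that is True when both
    passes agree (a candidate annotation-free frame)."""
    merged, annotation_free = [], []
    for f_boxes, b_boxes in zip(fwd, bwd):
        out = list(f_boxes)
        for bb in b_boxes:  # keep backward-only boxes: recovered misses
            if all(iou(bb, fb) < thr for fb in f_boxes):
                out.append(bb)
        merged.append(out)
        # frame needs no manual check if neither pass added anything new
        annotation_free.append(len(out) == len(f_boxes) == len(b_boxes))
    return merged, annotation_free
```

Taking the union of the two passes deliberately over-detects, matching the framework's preference for false positives over false negatives.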
2020-07-03
Feng, Ri-Chen, Lin, Daw-Tung, Chen, Ken-Min, Lin, Yi-Yao, Liu, Chin-De.  2019.  Improving Deep Learning by Incorporating Semi-automatic Moving Object Annotation and Filtering for Vision-based Vehicle Detection. 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC). :2484–2489.

Deep learning has undergone tremendous advancements in computer vision. Training deep neural networks depends on a considerable amount of ground-truth data. However, labeling ground-truth data is a labor-intensive task, particularly for large-volume video analytics applications such as video surveillance and vehicle detection for autonomous driving. This paper presents a rapid and accurate method for associative searching in big image data obtained from security monitoring systems. We developed a semi-automatic moving object annotation method for improving deep learning models. The proposed method comprises three stages: automatic foreground object extraction, object annotation in subsequent video frames, and dataset construction using human-in-the-loop quick selection. The proposed method also expedites dataset collection and ground-truth annotation. In contrast to data augmentation and generative models, it produces a large amount of real data, which may improve training results and avoid the adverse effects of artifactual data. We applied the constructed annotation dataset to train a deep learning you-only-look-once (YOLO) model to perform vehicle detection on street-intersection surveillance videos. Experimental results demonstrated that detection performance improved from a mean average precision (mAP) of 83.99 to 88.03.
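For readers unfamiliar with the mAP figures quoted above (83.99 to 88.03), here is a minimal sketch of how a single-class average precision is typically computed from score-ranked detections, using all-points interpolation; the function name and input format are illustrative, and real evaluations (e.g., PASCAL VOC or COCO protocols) add IoU matching and per-class averaging.

```python
def average_precision(detections, num_gt):
    """detections: list of (score, is_true_positive) pairs.
    num_gt: number of ground-truth boxes for this class.
    Returns the all-points-interpolated average precision."""
    detections = sorted(detections, key=lambda d: -d[0])
    tp = fp = 0
    points = []  # (recall, precision) after each ranked detection
    for _, is_tp in detections:
        if is_tp:
            tp += 1
        else:
            fp += 1
        points.append((tp / num_gt, tp / (tp + fp)))
    # integrate precision over recall, taking the max precision to the right
    ap, prev_recall = 0.0, 0.0
    for i, (recall, _) in enumerate(points):
        p_interp = max(p for _, p in points[i:])
        ap += (recall - prev_recall) * p_interp
        prev_recall = recall
    return ap
```

mAP is then the mean of this quantity over all object classes (here, presumably vehicle categories).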

2018-01-23
Baragchizadeh, A., Karnowski, T. P., Bolme, D. S., O’Toole, A. J.  2017.  Evaluation of Automated Identity Masking Method (AIM) in Naturalistic Driving Study (NDS). 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017). :378–385.

Identity masking methods have been developed in recent years for use in multiple applications aimed at protecting privacy. There is only limited work, however, targeted at evaluating the effectiveness of these methods, with only a handful of studies testing identity masking effectiveness for human perceivers. Here, we employed human participants to evaluate identity masking algorithms on video data of drivers, which contains subtle movements of the face and head. We evaluated the effectiveness of the “personalized supervised bilinear regression method for Facial Action Transfer (FAT)” de-identification algorithm. We also evaluated an edge-detection filter as an alternate “fill-in” method for cases in which face tracking failed due to abrupt or fast head motions. Our primary goal was to develop methods for human-based evaluation of the effectiveness of identity masking. To this end, we designed and conducted two experiments to address the effectiveness of masking in preventing recognition and in preserving action perception. (1) How effective is an identity masking algorithm? We conducted a face recognition experiment and employed Signal Detection Theory (SDT) to measure human accuracy and decision bias. The accuracy results show that both masks (FAT mask and edge-detection) are effective, but that neither completely eliminated recognition. However, the decision bias data suggest that both masks altered the participants' response strategy and made them less likely to affirm identity. (2) How effectively does the algorithm preserve actions? We conducted two experiments on facial behavior annotation. Results showed that masking had a negative effect on annotation accuracy for the majority of actions, with differences across action types. Notably, the FAT mask preserved actions better than the edge-detection mask. To our knowledge, this is the first study to evaluate a de-identification method aimed at preserving facial actions that employs human evaluators in a laboratory setting.
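The SDT measures this abstract refers to have a standard computation: sensitivity d′ and decision criterion c are derived from hit and false-alarm rates via the inverse standard-normal CDF (z-transform). The sketch below shows that textbook computation; the function name is illustrative, and the paper's exact analysis may differ.

```python
from statistics import NormalDist  # Python 3.8+; provides inv_cdf

def sdt_measures(hit_rate, fa_rate):
    """Return (d_prime, criterion_c) for given hit and false-alarm rates.
    Rates must be strictly between 0 and 1 (a log-linear or 1/(2N)
    correction is usually applied to rates of exactly 0 or 1)."""
    z = NormalDist().inv_cdf          # standard-normal quantile function
    d_prime = z(hit_rate) - z(fa_rate)
    c = -(z(hit_rate) + z(fa_rate)) / 2  # c > 0 means conservative responding
    return d_prime, c
```

In this framing, a mask that drives d′ toward 0 hides identity effectively, while a positive shift in c corresponds to the reported finding that masked faces made participants less likely to affirm an identity match.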