Biblio
In the field of scene understanding, researchers have mainly focused on using video/images to extract different elements in a scene. The computational as well as monetary cost associated with such implementations is high. This paper proposes a low-cost system which uses sound-based techniques in order to jointly perform localization as well as fingerprinting of the sound sources. A network of embedded nodes is used to sense the sound inputs. Phase-based sound localization and Support-Vector Machine classification are used to locate and classify elements of the scene, respectively. The fusion of all this data presents a complete “picture” of the scene. The proposed concepts are applied to a vehicular-traffic case study. Experiments show that the system has a fingerprinting accuracy of up to 97.5%, localization error less than 4 degrees and scene prediction accuracy of 100%.