Biblio

Filters: Keyword is multimedia
2020-08-07
Liu, Bo, Xiong, Jian, Wu, Yiyan, Ding, Ming, Wu, Cynthia M.  2019.  Protecting Multimedia Privacy from Both Humans and AI. 2019 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB). :1–6.
With the development of artificial intelligence (AI), multimedia privacy issues have become more challenging than ever. AI-assisted malicious entities can steal private information from multimedia data more easily than humans can. Traditional multimedia privacy protection methods consider only human adversaries and are therefore ineffective against AI-assisted attackers. In this paper, we develop a new framework and new algorithms that can protect image privacy from both humans and AI. We combine adversarial image perturbation, which is effective against AI, with obfuscation techniques aimed at human adversaries. Experiments show that our proposed methods work well for all types of attackers.
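
As a rough illustration of the idea in the abstract above, the following sketch chains an obfuscation step aimed at human viewers with a perturbation step aimed at a classifier. It is not the authors' algorithm: the pixelation filter, the toy linear scorer, and the names `pixelate`, `perturb_against_linear`, `score`, and `eps` are all assumptions chosen to keep the example self-contained.

```python
def pixelate(img, block):
    """Obfuscate for human viewers: replace each block x block tile with its mean.

    `img` is a list of rows of floats in [0, 1] (a hypothetical grayscale image).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r0 in range(0, h, block):
        for c0 in range(0, w, block):
            tile = [(r, c)
                    for r in range(r0, min(r0 + block, h))
                    for c in range(c0, min(c0 + block, w))]
            mean = sum(img[r][c] for r, c in tile) / len(tile)
            for r, c in tile:
                out[r][c] = mean
    return out


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def score(img, weights):
    """Toy linear scorer (an assumption): the sum of weight * pixel."""
    return sum(x * w for row, wrow in zip(img, weights) for x, w in zip(row, wrow))


def perturb_against_linear(img, weights, eps):
    """Take one sign-gradient step that lowers the toy linear score.

    For a linear scorer the gradient with respect to each pixel is its weight,
    so moving each pixel by -eps * sign(weight) cannot increase the score.
    Pixels are clipped back into [0, 1].
    """
    return [[min(1.0, max(0.0, x - eps * sign(w))) for x, w in zip(row, wrow)]
            for row, wrow in zip(img, weights)]


# Usage: obfuscate first, then perturb the obfuscated image.
image = [[0.2, 0.4], [0.6, 0.8]]
weights = [[1.0, -1.0], [0.5, -0.5]]
obfuscated = pixelate(image, block=2)
protected = perturb_against_linear(obfuscated, weights, eps=0.1)
```

A real attacker model would be a deep network rather than a linear scorer, and the perturbation budget would be tuned so that the change stays small; the sketch only shows how the two protections can be composed.
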
2018-02-27
Soleymani, Mohammad, Riegler, Michael, Halvorsen, Pål.  2017.  Multimodal Analysis of Image Search Intent: Intent Recognition in Image Search from User Behavior and Visual Content. Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. :251–259.

Users search for multimedia content with different underlying motivations or intentions. The study of user search intentions is an emerging topic in information retrieval, since understanding why a user is searching for content is crucial for satisfying the user's need. In this paper, we aimed to automatically recognize a user's intent for image search in the early stage of a search session. We designed seven different search scenarios under the intent conditions of finding items, re-finding items and entertainment. We collected facial expressions, physiological responses, eye gaze and implicit user interactions from 51 participants who performed seven different search tasks on a custom-built image retrieval platform. We analyzed the users' spontaneous and explicit reactions under different intent conditions. Finally, we trained machine learning models to predict users' search intentions from the visual content of the visited images, the user interactions and the spontaneous responses. After fusing the visual and user interaction features, our system achieved an F1 score of 0.722 for classifying three classes in a user-independent cross-validation. We found that eye gaze and implicit user interactions, including mouse movements and keystrokes, are the most informative features. Given that the most promising results are obtained by modalities that can be captured unobtrusively and online, the results demonstrate the feasibility of deploying such methods for improving multimedia retrieval platforms.
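
The evaluation protocol mentioned in the abstract above, a user-independent cross-validation scored with F1 over three classes, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say how F1 was averaged, so macro averaging is an assumption here, and the function names, participant IDs, and class labels are hypothetical.

```python
def user_independent_folds(participant_ids, n_folds):
    """Yield (train_indices, test_indices) with whole participants held out.

    Every sample from a given participant lands in the same fold, so no
    participant contributes to both the training and the test split.
    """
    unique = sorted(set(participant_ids))
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique)}
    for k in range(n_folds):
        train = [i for i, pid in enumerate(participant_ids) if fold_of[pid] != k]
        test = [i for i, pid in enumerate(participant_ids) if fold_of[pid] == k]
        yield train, test


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Usage with made-up data: three participants, two samples each.
participants = ["p1", "p1", "p2", "p2", "p3", "p3"]
folds = list(user_independent_folds(participants, n_folds=3))

labels = ["find", "refind", "entertain"]
f1 = macro_f1(["find", "find", "refind", "entertain"],
              ["find", "refind", "refind", "entertain"],
              labels)
```

Holding out whole participants is what makes the estimate user-independent: the model is always tested on people it has never seen during training.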

2017-05-22
Khanwalkar, Sanket, Balakrishna, Shonali, Jain, Ramesh.  2016.  Exploration of Large Image Corpuses in Virtual Reality. Proceedings of the 2016 ACM on Multimedia Conference. :596–600.

With the increasing capture of photos and their proliferation on social media, there is a pressing need for a more intuitive and versatile image search and exploration system. Image search systems have long been confined to legacy 2D screens and the keyword text box. With the recent advances in Virtual Reality (VR) technology, a move towards an immersive VR environment will redefine the image navigation experience. To this end, we propose a VR platform that gathers images from various sources and addresses the 5 Ws of image search: what, where, when, who and why. We achieve this by providing the user with two modes of interactive exploration: (i) a mode that allows graph-based navigation of an image dataset, using a steering wheel visualization, along multiple dimensions such as time, location, visual concept and people, and (ii) a mode that provides an intuitive exploration of the image dataset using a logical hierarchy of visual concepts. Our contributions include creating a VR image exploration experience that is intuitive and allows image navigation along multiple dimensions.