Reinforcement Learning inspired Deep Learned Compositional Model for Decision Making in Tracking

Submitted by grigby1 on Mon, 10/05/2020 - 2:01pm

Title	Reinforcement Learning inspired Deep Learned Compositional Model for Decision Making in Tracking
Publication Type	Conference Paper
Year of Publication	2018
Authors	Chakraborty, Anit, Dutta, Sayandip, Bhattacharyya, Siddhartha, Platos, Jan, Snasel, Vaclav
Conference Name	2018 Fourth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN)
Date Published	Nov. 2018
Publisher	IEEE
ISBN Number	978-1-5386-7638-7
Keywords	cluttered background, Compositional Models, compositionality, Computational modeling, Computer science, decision making, deep learned compositional model, Deep Neural Network, deep reinforcement, heuristic methods, human bodies, human inputs, image representation, incessant decision making, learning (artificial intelligence), Markov processes, Mathematical model, multiple large video datasets, neural nets, object tracking, occlusion handling, partial occlusions, partially observable Markov decision making, penalty based training, pose estimation, pose estimation capabilities, previous location, pubcrawl, reinforcement learning, skeleton based part representation, Streaming media, target tracking, tracker, Tracking, video signal processing
Abstract	We formulate a tracker that performs incessant decision making in order to track objects under challenges such as partial occlusion, a moving camera, and cluttered backgrounds. In the process, the agent must decide whether to keep tracking the object when it is occluded or has temporarily moved out of the frame, based on a prediction from its previous location, or to reinitialize the tracker on the belief that the target has been lost. Instead of heuristic methods, we depend on reward- and penalty-based training that helps the agent reach an optimal solution via a partially observable Markov decision process (POMDP). Furthermore, we employ a deeply learned compositional model to estimate human pose in order to better handle occlusion without needing human input. By learning the compositionality of human bodies via a deep neural network, the agent can make better decisions about the presence or absence of a human in a frame under occlusion. We adapt a skeleton-based part representation and do away with the large spatial state requirement. This especially helps in cases where the orientation of the target in focus is unorthodox. Finally, we demonstrate that deep reinforcement learning based training, coupled with pose estimation capabilities, allows us to train and tag multiple large video datasets much more quickly than previous works.
URL	https://ieeexplore.ieee.org/document/8718691
DOI	10.1109/ICRCICN.2018.8718691
Citation Key	chakraborty_reinforcement_2018
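
The abstract describes an agent that learns, from rewards and penalties, whether to keep tracking or to reinitialize. The sketch below is only a toy illustration of that idea and not the authors' method: the state names, the action names, the reward values, and the functions `reward`, `train`, and `policy` are all hypothetical, and the target state is assumed to be fully observed, which sidesteps the partial observability the paper addresses.

```python
import random

# Hypothetical coarse target states and tracker actions (not from the paper).
STATES = ("visible", "occluded", "lost")
ACTIONS = ("keep_tracking", "reinitialize")


def reward(state, action):
    """Return +1 for the decision assumed correct in this toy, else a -1 penalty."""
    correct = "reinitialize" if state == "lost" else "keep_tracking"
    return 1.0 if action == correct else -1.0


def train(episodes=2000, alpha=0.1, epsilon=0.2, seed=0):
    """Learn action values with an epsilon-greedy, one-step (bandit-style) update."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = rng.choice(STATES)
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        # Move the estimate toward the observed reward or penalty.
        q[(state, action)] += alpha * (reward(state, action) - q[(state, action)])
    return q


def policy(q):
    """Pick the highest-valued action for each state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


if __name__ == "__main__":
    for state, action in policy(train()).items():
        print(state, "->", action)
```

In this toy setting the learned policy keeps tracking while the target is visible or occluded and reinitializes once it is lost. A closer analogue of the abstract would replace the observed state with a belief derived from image features and pose estimates.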
