Biblio
Hierarchical approaches for representation learning have the ability to encode relevant features at multiple scales or levels of abstraction. However, most hierarchical approaches exploit only the last level in the hierarchy, or provide a multiscale representation that holds a significant amount of redundancy. We argue that removing redundancy across the multiple levels of abstraction is important for an efficient representation of compositionality in object-based representations. With the perspective of feature learning as a data compression operation, we propose a new greedy inference algorithm for hierarchical sparse coding. Convolutional matching pursuit with a L0-norm constraint was used to encode the input signal into compact and non-redundant codes distributed across levels of the hierarchy. Simple and complex synthetic datasets of temporal signals were created to evaluate the encoding efficiency and compare with the theoretical lower bounds on the information rate for those signals. Empirical evidence have shown that the algorithm is able to infer near-optimal codes for simple signals. However, it failed for complex signals with strong overlapping between objects. We explain the inefficiency of convolutional matching pursuit that occurred in such case. This brings new insights about the NP-hard optimization problem related to using L0-norm constraint in inferring optimally compact and distributed object-based representations.