Biblio
Filters: Author is Guo, Tao
Cross-Layer Aggregation with Transformers for Multi-Label Image Classification. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :3448–3452.
2022. The multi-label image classification task aims to predict multiple object labels in a given image and faces the challenge of variable-sized objects. Limited by the size of CNN convolution kernels, existing CNN-based methods have difficulty capturing global dependencies and effectively fusing multi-layer features, which is critical for this task. Recently, transformers have used multi-head attention to extract features with long-range dependencies. Inspired by this, this paper proposes a Cross-layer Aggregation with Transformers (CAT) framework, which leverages transformers to capture the long-range dependencies of CNN-based features with a Long Range Dependencies module and aggregates the features layer by layer with a Cross-Layer Fusion module. To make the framework efficient, a multi-head pre-max attention is designed to reduce the computation cost when fusing the high-resolution features of lower layers. On two widely used benchmarks (i.e., VOC2007 and MS-COCO), CAT provides a stable improvement over the baseline and achieves competitive performance.
The Explicit Coding Rate Region of Symmetric Multilevel Diversity Coding. 2018 Information Theory and Applications Workshop (ITA). :1–9.
2018. It is well known that superposition coding, namely separately encoding the independent sources, is optimal for symmetric multilevel diversity coding (SMDC) (Yeung-Zhang 1999). However, the characterization of the coding rate region therein involves uncountably many linear inequalities and the constant term (i.e., the lower bound) in each inequality is given in terms of the solution of a linear optimization problem. Thus this implicit characterization of the coding rate region does not enable the determination of the achievability of a given rate tuple. In this paper, we first obtain closed-form expressions of these uncountably many inequalities. Then we identify a finite subset of inequalities that is sufficient for characterizing the coding rate region. This gives an explicit characterization of the coding rate region. We further show by the symmetry of the problem that only a much smaller subset of this finite set of inequalities needs to be verified in determining the achievability of a given rate tuple. Yet, the cardinality of this smaller set grows at least exponentially fast with L.