Random Manhattan Indexing
Title | Random Manhattan Indexing |
Publication Type | Conference Paper |
Year of Publication | 2014 |
Authors | Zadeh, B.Q., Handschuh, S. |
Conference Name | 2014 25th International Workshop on Database and Expert Systems Applications (DEXA) |
Date Published | September |
Keywords | Computational modeling, Context, data reduction, dimensionality reduction, dimensionality reduction technique, Equations, indexing, L1 normed VSM, Manhattan distance, Mathematical model, natural language text, random Manhattan indexing, random projection, retrieval models, RMI, sparse Cauchy random projections, text analysis, vector space model, Vectors |
Abstract | Vector space models (VSMs) are mathematically well-defined frameworks that have been widely used in text processing. In these models, high-dimensional, often sparse vectors represent text units. In an application, the similarity of vectors -- and hence of the text units that they represent -- is computed by a distance formula. The high dimensionality of the vectors, however, is a barrier to the performance of methods that employ VSMs. Consequently, a dimensionality reduction technique is employed to alleviate this problem. This paper introduces a new method, called Random Manhattan Indexing (RMI), for the construction of L1-normed VSMs at reduced dimensionality. RMI combines the construction of a VSM and dimensionality reduction into an incremental, and thus scalable, procedure. To attain this goal, RMI employs sparse Cauchy random projections. |
DOI | 10.1109/DEXA.2014.51 |
Citation Key | 6974850 |
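
The idea summarized in the abstract, building reduced vectors incrementally so that Manhattan (L1) distances can still be estimated, can be illustrated with a small sketch. This is not the authors' implementation: the class `CauchyL1Index`, its methods, and the parameters `dim` and `seed` are hypothetical names chosen for illustration. For simplicity the sketch draws dense standard Cauchy index vectors and uses a sample-median estimator; the sparse projections and the exact estimator used in the paper are not reproduced here.

```python
import math
import random
import statistics


class CauchyL1Index:
    """Toy incremental index (hypothetical; not the paper's code).

    Every text unit is kept as a fixed-size vector of length `dim`,
    no matter how many distinct context elements are observed.
    """

    def __init__(self, dim=2000, seed=0):
        self.dim = dim
        self.index_vectors = {}  # context element -> random index vector
        self.unit_vectors = {}   # text unit -> accumulated reduced vector
        self._rng = random.Random(seed)

    def _index_vector(self, context):
        # Draw the index vector once, the first time a context is seen.
        # tan(pi * (U - 0.5)) with U uniform on [0, 1) is standard Cauchy.
        if context not in self.index_vectors:
            self.index_vectors[context] = [
                math.tan(math.pi * (self._rng.random() - 0.5))
                for _ in range(self.dim)
            ]
        return self.index_vectors[context]

    def observe(self, unit, context, weight=1.0):
        # Incremental update: add the weighted index vector of the
        # context element to the vector of the text unit.
        row = self.unit_vectors.setdefault(unit, [0.0] * self.dim)
        for i, r in enumerate(self._index_vector(context)):
            row[i] += weight * r

    def estimate_l1(self, unit_a, unit_b):
        # Each coordinate difference is Cauchy-distributed with scale
        # equal to the L1 distance of the original vectors, and the
        # median of the absolute value of such a variable is that scale.
        a, b = self.unit_vectors[unit_a], self.unit_vectors[unit_b]
        return statistics.median(abs(x - y) for x, y in zip(a, b))


# Usage with made-up co-occurrence counts.
index = CauchyL1Index(dim=2000, seed=42)
for context, count in {"bank": 3, "river": 1}.items():
    index.observe("doc_a", context, count)
for context, count in {"bank": 1, "money": 4}.items():
    index.observe("doc_b", context, count)

# Exact L1 distance in the original space: |3-1| + |1-0| + |0-4| = 7.
estimate = index.estimate_l1("doc_a", "doc_b")
print(round(estimate, 2))
```

The estimate is random and only approximates the exact distance of 7; its accuracy improves as `dim` grows. A median is used rather than a mean because Cauchy variables have no finite mean, so averaging the absolute differences would not converge.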