This project seeks to understand how inaccurate messages propagate over large-scale information networks used by the general public, how the public responds to such inaccuracy, and which content- or metadata-related features make certain messages more error-resistant or error-prone than others. The results could help build a platform that accurately identifies errors propagating on an information network and effectively controls their spread.
The technical objective of this project is to design efficient information extraction techniques that reliably extract features from short, often noisy, microblog-like messages. Specifically, it aims to develop a variety of content models (e.g., graph-based modeling, sentiment-based coding, and shingle- and user-frequency-based metrics) that make the extraction techniques more resilient to the noise and high volume common on real-world microblog platforms. The project also includes a case study on a large-scale real-world microblog platform to test the effectiveness of the proposed approaches and whether they outperform existing techniques.
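As a rough illustration of what a shingle-based metric over short messages might look like, the sketch below compares two messages by the Jaccard overlap of their word-level shingles. The function names, the shingle size `k=3`, and the whitespace tokenization are assumptions made for exposition, not the project's actual design.

```python
def shingles(text, k=3):
    """Return the set of word-level k-shingles of a message.

    Tokenization is plain lowercase whitespace splitting (an assumption);
    a message shorter than k tokens yields a single shorter shingle.
    """
    tokens = text.lower().split()
    if not tokens:
        return set()
    if len(tokens) < k:
        return {tuple(tokens)}
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined here as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def shingle_similarity(msg1, msg2, k=3):
    """Hypothetical metric: Jaccard overlap of the two messages' shingle sets."""
    return jaccard(shingles(msg1, k), shingles(msg2, k))
```

Under these assumptions, a message and a lightly edited repost of it share most of their shingles and score close to 1, while unrelated messages score close to 0. This is one plausible way to trace near-duplicate content as it spreads.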