2019-12-16
Malviya, Vikas, Rai, Sawan, Gupta, Atul.  2018.  Development of a Plugin Based Extensible Feature Extraction Framework. Proceedings of the 33rd Annual ACM Symposium on Applied Computing. :1840–1847.

An important ingredient in a successful recipe for solving machine learning problems is the availability of a suitable dataset. However, such a dataset may have to be extracted from large volumes of unstructured and semi-structured data such as programming code, scripts, and text. In this work, we propose a plug-in based, extensible feature extraction framework, which we have prototyped as a tool. The proposed framework is demonstrated by extracting features from two different sources, one semi-structured and one unstructured. The semi-structured data comprised web page and script based data, whereas the unstructured data was taken from email for spam filtering. The usefulness of the tool was also assessed with respect to ease of programming.
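
To make the plug-in idea concrete, here is a minimal sketch of how such a framework might be organized. It is not the authors' tool: the names `FeaturePlugin`, `Framework`, `UrlCount`, and `ScriptTagCount`, the one-plug-in-per-feature design, and the sample input are all assumptions made for illustration.

```python
import re
from abc import ABC, abstractmethod


class FeaturePlugin(ABC):
    """Hypothetical plug-in interface: each plug-in computes one named feature."""

    name: str

    @abstractmethod
    def extract(self, text: str) -> float:
        """Return the feature value for a raw input document."""


class Framework:
    """Hypothetical core: holds registered plug-ins and runs them on an input."""

    def __init__(self) -> None:
        self._plugins: dict[str, FeaturePlugin] = {}

    def register(self, plugin: FeaturePlugin) -> None:
        # New features are added by registering a plug-in, not by editing the core.
        self._plugins[plugin.name] = plugin

    def extract(self, text: str) -> dict[str, float]:
        # Run every registered plug-in and collect one feature vector.
        return {name: p.extract(text) for name, p in self._plugins.items()}


class UrlCount(FeaturePlugin):
    """Illustrative feature: number of http(s) URLs in the text."""

    name = "url_count"

    def extract(self, text: str) -> float:
        return float(len(re.findall(r"https?://\S+", text)))


class ScriptTagCount(FeaturePlugin):
    """Illustrative feature: number of opening <script> tags."""

    name = "script_tag_count"

    def extract(self, text: str) -> float:
        return float(len(re.findall(r"<script\b", text, flags=re.IGNORECASE)))


framework = Framework()
framework.register(UrlCount())
framework.register(ScriptTagCount())

# Made-up sample input mixing markup and free text.
sample = '<script src="http://example.com/a.js"></script> Visit http://example.com now'
features = framework.extract(sample)
print(features)
```

Under this assumed design, supporting a new data source such as email would mean writing further `FeaturePlugin` subclasses and registering them, leaving `Framework` unchanged.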