Natural Language Processing Characterization of Recurring Calls in Public Security Services

Title: Natural Language Processing Characterization of Recurring Calls in Public Security Services
Publication Type: Book
Year of Publication: Submitted
Keywords: Human Behavior, natural language processing, pubcrawl, Resiliency, Scalability
Abstract: Extracting knowledge from unstructured data silos, a legacy of old applications, is mandatory for improving the governance of today's cities and fostering the creation of smart cities. Texts in natural language often compose such data. Nevertheless, the inference of useful information from a linguistic-computational analysis of natural language data is an open challenge. In this paper, we propose a clustering method to analyze textual data employing the unsupervised machine learning algorithms k-means and hierarchical clustering. We assess different vector representation methods for text, similarity metrics, and the number of clusters that best matches the data. We evaluate the methods using a real database of a public record service of security occurrences. The results show that the k-means algorithm using Euclidean distance extracts non-trivial knowledge, reaching up to 93% accuracy in a set of test samples while identifying the 12 most prevalent occurrence patterns.
URL: https://ieeexplore.ieee.org/document/9049821
Citation Key: noauthor_natural_nodate
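
Illustrative sketch (not from the paper): the abstract describes clustering text records with k-means under Euclidean distance after converting each text to a vector. The Python below is a minimal, standard-library sketch of that general idea, assuming a TF-IDF representation, whitespace tokenisation, and a deterministic farthest-point initialisation chosen only so the toy run is reproducible. The sample records, the function names (`tfidf_vectors`, `kmeans`), and k = 2 are all hypothetical; the paper's actual preprocessing, vector representations, and parameters are not given in this record.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn raw texts into L2-normalised TF-IDF vectors (assumed representation)."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenised for tok in toks})
    n_docs = len(docs)
    # Document frequency: in how many texts each term appears.
    df = Counter(tok for toks in tokenised for tok in set(toks))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        vec = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, max_iter=50):
    """Plain k-means with Euclidean distance.

    Initialisation is farthest-point (an assumption made here for
    determinism): start from the first vector, then repeatedly add the
    vector farthest from its nearest already-chosen centroid.
    """
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(euclidean(v, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical occurrence texts, invented for illustration only.
records = [
    "loud music noise complaint",
    "stolen car theft reported",
    "noise complaint about loud music",
    "car theft stolen vehicle",
]

vocab, vectors = tfidf_vectors(records)
labels, centroids = kmeans(vectors, k=2)

for text, label in zip(records, labels):
    print(label, text)
```

On this toy input the two noise complaints fall into one cluster and the two theft reports into the other. Choosing the number of clusters, comparing similarity metrics, and trying hierarchical clustering, all of which the abstract mentions, are outside the scope of this sketch.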