Biblio

Filters: Author is Marulli, Fiammetta
2022-08-03
de Biase, Maria Stella, Marulli, Fiammetta, Verde, Laura, Marrone, Stefano.  2021.  Improving Classification Trustworthiness in Random Forests. 2021 IEEE International Conference on Cyber Security and Resilience (CSR). :563–568.
Machine learning algorithms are becoming increasingly widespread in both industrial and societal settings. This diffusion is becoming a critical aspect of new software-intensive applications because of the need to react quickly to changes in data, even temporary ones. This paper investigates how to improve the reliability of machine-learning-based classification by extending Random Forests with Bayesian Network models. Such models, combined with a mechanism that adjusts the reputation level of individual learners, may improve overall classification trustworthiness. A small example from the healthcare domain demonstrates the proposed approach.
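As a rough illustration of what adjusting the reputation level of single learners could look like, the sketch below weights each learner's vote by a reputation score and nudges that score after every labelled example. It is not the paper's method: the Bayesian Network component is omitted, and the function names, the update rule, and the `rate` parameter are assumptions made only for this example.

```python
from collections import defaultdict


def weighted_vote(predictions, reputations):
    """Return the label whose supporters have the highest total reputation."""
    scores = defaultdict(float)
    for label, reputation in zip(predictions, reputations):
        scores[label] += reputation
    return max(scores, key=scores.get)


def update_reputations(predictions, reputations, true_label, rate=0.1):
    """Move each reputation toward 1 after a correct vote and toward 0 otherwise.

    The exponential-smoothing rule and the default rate are illustrative
    assumptions, not values taken from the paper.
    """
    return [
        reputation + rate * ((1.0 if label == true_label else 0.0) - reputation)
        for label, reputation in zip(predictions, reputations)
    ]


# Three hypothetical learners (e.g. trees of a forest) vote on one sample.
votes = ["sick", "healthy", "healthy"]
reputations = [0.9, 0.3, 0.3]

decision = weighted_vote(votes, reputations)
reputations = update_reputations(votes, reputations, true_label="healthy")
```

In this toy run the single high-reputation learner outvotes the two low-reputation ones, and the update step then lowers its reputation because the true label turned out to be different.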
2022-01-25
Marulli, Fiammetta, Balzanella, Antonio, Campanile, Lelio, Iacono, Mauro, Mastroianni, Michele.  2021.  Exploring a Federated Learning Approach to Enhance Authorship Attribution of Misleading Information from Heterogeneous Sources. 2021 International Joint Conference on Neural Networks (IJCNN). :1–8.
Authorship Attribution (AA) is currently used in several applications, including fraud detection and anti-plagiarism checks; the task can leverage stylometry and Natural Language Processing techniques. In this work, we explored strategies to enhance the performance of an AA task for the automatic detection of false and misleading information (e.g., fake news). We set up a stylometry-based text classification model for AA that uses recurrent deep neural networks, and implemented two learning tasks trained on the same collection of fake and real news in order to compare their performance: one is based on a Federated Learning architecture, the other on a centralized architecture. The goal was to distinguish potentially fake information from true information when the fake news comes from heterogeneous sources with different styles. Preliminary experiments show that the distributed approach significantly improves recall with respect to the centralized model. As expected, precision was lower in the distributed model. This loss of precision and the statistical heterogeneity of the data remain open issues that will be investigated in future work.
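The abstract does not say which aggregation scheme the federated setup uses. For readers unfamiliar with the general idea, the sketch below shows one common choice, federated averaging, in which a server combines locally trained parameters weighted by each client's sample count. The function name and the toy numbers are assumptions made for illustration only.

```python
def federated_average(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by client sample counts.

    client_weights: one flat list of floats per client, all the same length.
    client_sizes:   number of local training samples held by each client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical news sources train locally, then share parameters only.
local_models = [[1.0, 2.0], [3.0, 4.0]]
sample_counts = [1, 3]

global_model = federated_average(local_models, sample_counts)
```

Because only parameters leave each client, the raw texts from the heterogeneous sources never have to be pooled in one place, which is the practical difference from the centralized architecture.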