Biblio

List
Filter

Found 4 results

Filters: Keyword is multi-task learning [Clear All Filters]

2022-05-06

Bai, Zilong, Hu, Beibei. 2021. A Universal Bert-Based Front-End Model for Mandarin Text-To-Speech Synthesis. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :6074–6078.

The front-end text processing module is considered as an essential part that influences the intelligibility and naturalness of a Mandarin text-to-speech system significantly. For commercial text-to-speech systems, the Mandarin front-end should meet the requirements of high accuracy and low time latency while also ensuring maintainability. In this paper, we propose a universal BERT-based model that can be used for various tasks in the Mandarin front-end without changing its architecture. The feature extractor and classifiers in the model are shared for several sub-tasks, which improves the expandability and maintainability. We trained and evaluated the model with polyphone disambiguation, text normalization, and prosodic boundary prediction for single task modules and multi-task learning. Results show that, the model maintains high performance for single task modules and shows higher accuracy and lower time latency for multi-task modules, indicating that the proposed universal front-end model is promising as a maintainable Mandarin front-end for commercial applications.

2022-02-07

Lee, Shan-Hsin, Lan, Shen-Chieh, Huang, Hsiu-Chuan, Hsu, Chia-Wei, Chen, Yung-Shiu, Shieh, Shiuhpyng. 2021. EC-Model: An Evolvable Malware Classification Model. 2021 IEEE Conference on Dependable and Secure Computing (DSC). :1–8.

Malware evolves quickly as new attack, evasion and mutation techniques are commonly used by hackers to build new malicious malware families. For malware detection and classification, multi-class learning model is one of the most popular machine learning models being used. To recognize malicious programs, multi-class model requires malware types to be predefined as output classes in advance which cannot be dynamically adjusted after the model is trained. When a new variant or type of malicious programs is discovered, the trained multi-class model will be no longer valid and have to be retrained completely. This consumes a significant amount of time and resources, and cannot adapt quickly to meet the timely requirement in dealing with dynamically evolving malware types. To cope with the problem, an evolvable malware classification deep learning model, namely EC-Model, is proposed in this paper which can dynamically adapt to new malware types without the need of fully retraining. Consequently, the reaction time can be significantly reduced to meet the timely requirement of malware classification. To our best knowledge, our work is the first attempt to adopt multi-task, deep learning for evolvable malware classification.

2018-12-10

Gujral, Aditya, Chaspari, Theodora, Timmons, Adela C., Kim, Yehsong, Barrett, Sarah, Margolin, Gayla. 2018. Population-specific Detection of Couples' Interpersonal Conflict Using Multi-task Learning. Proceedings of the 20th ACM International Conference on Multimodal Interaction. :229–233.

The inherent diversity of human behavior limits the capabilities of general large-scale machine learning systems, that usually require ample amounts of data to provide robust descriptors of the outcomes of interest. Motivated by this challenge, personalized and population-specific models comprise a promising line of work for representing human behavior, since they can make decisions for clusters of people with common characteristics, reducing the amount of data needed for training. We propose a multi-task learning (MTL) framework for developing population-specific models of interpersonal conflict between couples using ambulatory sensor and mobile data from real-life interactions. The criteria for population clustering include global indices related to couples' relationship quality and attachment style, person-specific factors of partners' positivity, negativity, and stress levels, as well as fluctuating factors of daily emotional arousal obtained from acoustic and physiological indices. Population-specific information is incorporated through a MTL feed-forward neural network (FF-NN), whose first layers capture the common information across all data samples, while its last layers are specific to the unique characteristics of each population. Our results indicate that the proposed MTL FF-NN trained solely on the sensor-based acoustic, linguistic, and physiological modalities provides unweighted and weighted F1-scores of 0.51 and 0.75, respectively, outperforming the corresponding baselines of a single general FF-NN trained on the entire dataset and separate FF-NNs trained on each population cluster individually. These demonstrate the feasibility of such ambulatory systems for detecting real-life behaviors and possibly intervening upon them, and highlights the importance of taking into account the inherent diversity of different populations from the general pool of data.

2017-09-15

Shi, Tianlin, Agostinelli, Forest, Staib, Matthew, Wipf, David, Moscibroda, Thomas. 2016. Improving Survey Aggregation with Sparsely Represented Signals. Proceedings of the 22Nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. :1845–1854.

In this paper, we develop a new aggregation technique to reduce the cost of surveying. Our method aims to jointly estimate a vector of target quantities such as public opinion or voter intent across time and maintain good estimates when using only a fraction of the data. Inspired by the James-Stein estimator, we resolve this challenge by shrinking the estimates to a global mean which is assumed to have a sparse representation in some known basis. This assumption has lead to two different methods for estimating the global mean: orthogonal matching pursuit and deep learning. Both of which significantly reduce the number of samples needed to achieve good estimates of the true means of the data and, in the case of presidential elections, can estimate the outcome of the 2012 United States elections while saving hundreds of thousands of samples and maintaining accuracy.