Biblio

Filters: Keyword is cosine similarity
2022-07-15
Wang, Yan, Allouache, Yacine, Joubert, Christian.  2021.  A Staffing Recommender System based on Domain-Specific Knowledge Graph. 2021 Eighth International Conference on Social Network Analysis, Management and Security (SNAMS). :1–6.
In the economic environment, job matching is always a challenge involving the evolution of knowledge and skills. A good match between skills and jobs can stimulate economic growth. A Recommender System (RecSys), as one kind of job matching, can help candidates predict future jobs relevant to their preferences. However, RecSys still suffers from the cold-start and data-sparsity problems. Content-based filtering in RecSys needs data adapted to the specific staffing tasks of Bidirectional Encoder Representations from Transformers (BERT). In this paper, we propose a job RecSys based on skills and locations using a domain-specific Knowledge Graph (KG). The system has three parts: a pipeline of Named Entity Recognition (NER) and Relation Extraction (RE) using BERT; a standardization system for pre-processing, semantic enrichment and semantic similarity measurement; and a domain-specific KG. Two different relations in the KG are computed by cosine similarity and Term Frequency-Inverse Document Frequency (TF-IDF), respectively. The raw data used in the staffing RecSys include 3000 job-offer descriptions from Indeed, 126 Curricula Vitae (CVs) in English from Kaggle and 106 CVs in French from Linx of Capgemini Engineering. The staffing RecSys is integrated under a microservices architecture. The autonomy and effectiveness of the staffing RecSys are verified experimentally using Discounted Cumulative Gain (DCG). Finally, we propose several potential directions for future research.
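The DCG metric named in this abstract can be illustrated with a minimal sketch. The abstract does not say which gain formulation the authors used, so the linear-gain variant below is an assumption, and the relevance grades in `ranked` are hypothetical values chosen only for illustration.

```python
import math

def dcg(relevances):
    """Discounted Cumulative Gain with linear gain (an assumed variant)."""
    # The item at 1-based rank i is discounted by log2(i + 1).
    return sum(rel / math.log2(i + 1)
               for i, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical graded relevances of recommended jobs, in ranked order.
ranked = [3, 2, 0, 1]
print(dcg(ranked), ndcg(ranked))
```

Normalising by the ideal ordering makes scores comparable across candidates with different numbers of relevant jobs.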
2022-07-05
Wang, Zhiwen, Zhang, Qi, Sun, Hongtao, Hu, Jiqiang.  2021.  Detection of False Data Injection Attacks in smart grids based on cubature Kalman Filtering. 2021 33rd Chinese Control and Decision Conference (CCDC). :2526–2532.
False data injection attacks (FDIAs) in smart grids can offset the power measurement data and bypass the traditional bad data detection mechanism. To solve this problem, this paper proposes a new detection mechanism called the cosine similarity ratio, which is based on the dynamic estimation algorithm of the square root cubature Kalman filter (SRCKF). That is, detection is based on the change in the cosine similarity between the actual measurement and the predictive measurement before and after the attack. When the system is suddenly attacked, the actual measurement changes abruptly. However, the predictive measurement does not follow it promptly, owing to the delay of Kalman filter estimation. Consequently, the cosine similarity between the two changes at this moment. This causes the ratio of the cosine similarity at this moment to that at the initial moment to fluctuate considerably compared with safe operation. If the detection threshold is triggered, the system is judged to be under attack. Finally, the standard IEEE 14-bus test system is used in simulation experiments to verify the effectiveness of the proposed detection method.
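The ratio test described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the predicted measurements would come from an SRCKF, which is not implemented here, and the function names, the "deviation from 1" rule, the threshold value and the sample vectors are all hypothetical rather than taken from the paper.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def similarity_ratio(actual, predicted, baseline):
    """Current cosine similarity divided by the similarity at the initial moment.

    `baseline` must be non-zero.
    """
    return cosine_similarity(actual, predicted) / baseline

def is_attack(actual, predicted, baseline, threshold=0.05):
    # Assumed rule: flag an attack when the ratio drifts from 1 by more
    # than `threshold`. The paper's exact threshold test is not given.
    return abs(similarity_ratio(actual, predicted, baseline) - 1.0) > threshold

# Hypothetical measurement vectors; `predicted` stands in for filter output.
predicted = [1.00, 0.98, 1.02]
baseline = cosine_similarity([1.00, 0.98, 1.02], predicted)
normal = [1.01, 0.97, 1.02]
attacked = [1.00, 1.60, 0.50]
print(is_attack(normal, predicted, baseline),
      is_attack(attacked, predicted, baseline))
```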
2021-11-29
Gupta, Hritvik, Patel, Mayank.  2020.  Study of Extractive Text Summarizer Using The Elmo Embedding. 2020 Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC). :829–834.
In recent times, data excess has become a major problem in fields such as education, news, blogs and social media. With such a vast and growing amount of text data, it has become challenging for a human to extract only the valuable data in a concise form. In other words, summarizing text enables humans to retrieve the relevant and useful parts. Text summarization extracts data from a document and generates a short, concise text of that document. One of the most widely used approaches is the automatic text summarizer, which analyzes large textual data and condenses it into short summaries containing its valuable information. Automatic text summarizers are further divided into two types: 1) extractive text summarizers and 2) abstractive text summarizers. This article examines the extractive approach, in which the model generates a concise summary by picking the most relevant sentences from the text document. This paper focuses on retrieving the valuable data using the Elmo embedding in extractive text summarization. Elmo embedding is a contextual embedding that has previously been used by many researchers in abstractive text summarization techniques, but this paper focuses on using it in an extractive text summarizer.
2021-02-23
Park, S. H., Park, H. J., Choi, Y.  2020.  RNN-based Prediction for Network Intrusion Detection. 2020 International Conference on Artificial Intelligence in Information and Communication (ICAIIC). :572–574.
We investigate a prediction model using an RNN for network intrusion detection in industrial IoT environments. For intrusion detection, we use anomaly detection methods that estimate the next packet, then measure and score its distance from the real packet to decide whether the packet is normal or abnormal. When the packets were learned by the LSTM model, two-grams and a sliding window of N-grams showed the best performance in terms of errors, and the LSTM model performed best compared with other data mining regression techniques. Finally, cosine similarity was used as a scoring function, and anomaly detection was performed by setting a boundary on the cosine similarity above which a packet is considered normal.
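As a rough sketch of the two ideas in this abstract, the code below builds sliding-window training pairs from a sequence and scores a predicted packet against the real one. The LSTM itself is omitted. The window construction is one plausible reading of "sliding window of N-gram", and the function names, the boundary value of 0.9 and the sample vectors are assumptions made for illustration.

```python
import math

def sliding_windows(sequence, n):
    """Pair each length-n context with the element that follows it."""
    return [(tuple(sequence[i:i + n]), sequence[i + n])
            for i in range(len(sequence) - n)]

def cosine_score(predicted, actual):
    """Cosine similarity between predicted and observed packet vectors."""
    dot = sum(p * a for p, a in zip(predicted, actual))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_a = math.sqrt(sum(a * a for a in actual))
    if norm_p == 0 or norm_a == 0:
        return 0.0
    return dot / (norm_p * norm_a)

def is_normal(predicted, actual, boundary=0.9):
    # Hypothetical boundary: scores at or above it count as normal traffic.
    return cosine_score(predicted, actual) >= boundary

# Two-gram contexts over a toy sequence of packet tokens.
print(sliding_windows([1, 2, 3, 4, 5], 2))
print(is_normal([1, 2, 3], [1, 2, 3]), is_normal([1, 0, 0], [0, 1, 0]))
```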
2020-07-03
Adari, Suman Kalyan, Garcia, Washington, Butler, Kevin.  2019.  Adversarial Video Captioning. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :24–27.
In recent years, developments in the field of computer vision have allowed deep learning-based techniques to surpass human-level performance. However, these advances have also culminated in the advent of adversarial machine learning techniques, capable of launching targeted image captioning attacks that easily fool deep learning models. Although attacks in the image domain are well studied, little work has been done in the video domain. In this paper, we show it is possible to extend prior attacks in the image domain to the video captioning task, without heavily affecting the video's playback quality. We demonstrate our attack against a state-of-the-art video captioning model, by extending a prior image captioning attack known as Show and Fool. To the best of our knowledge, this is the first successful method for targeted attacks against a video captioning model, which is able to inject 'subliminal' perturbations into the video stream, and force the model to output a chosen caption with up to 0.981 cosine similarity, achieving near-perfect similarity to chosen target captions.
2017-09-19
Bo, Li, Jinzhen, Wang, Ping, Zhao, Zhongjiang, Yan, Mao, Yang.  2016.  Research of Recognition System of Web Intrusion Detection Based on Storm. Proceedings of the Fifth International Conference on Network, Communication and Computing. :98–102.

Based on Storm, a distributed, reliable, fault-tolerant real-time data stream processing system, we propose a recognition system for web intrusion detection. The system is based on machine learning, a feature selection algorithm using TF-IDF (Term Frequency–Inverse Document Frequency) and an optimised cosine similarity algorithm. It aims for a low false positive rate and a higher detection rate of attacks and malicious behavior in real time, in order to protect the security of user data. Comparative analysis of experiments shows that the system's intrusion recognition rate and false positive rate have improved to some extent, so it can better complete the intrusion detection work.
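The combination of TF-IDF weighting and cosine similarity mentioned here can be sketched in a few lines. This uses the plain textbook formulas, not the paper's "optimised" cosine similarity, whose details the abstract does not give. The tokenised requests are hypothetical, and Storm integration is out of scope.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per tokenised document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({term: (count / len(doc)) * math.log(n_docs / df[term])
                        for term, count in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical tokenised web requests.
requests = [
    ["get", "index", "html"],
    ["get", "login", "user", "admin"],
    ["get", "login", "user", "admin", "union", "select"],
]
vecs = tfidf_vectors(requests)
print(cosine(vecs[0], vecs[1]), cosine(vecs[1], vecs[2]))
```

Terms that occur in every request, such as `get` here, receive an IDF of zero and so do not contribute to the similarity.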

2015-05-05
Eun Hee Ko, Klabjan, D.  2014.  Semantic Properties of Customer Sentiment in Tweets. Advanced Information Networking and Applications Workshops (WAINA), 2014 28th International Conference on. :657–663.

An increasing number of people are using online social networking services (SNSs), and a significant amount of information related to consumption experiences is shared in this new media form. Text mining is an emerging technique for mining useful information from the web. We aim, in particular, at discovering semantic patterns in tweets from consumers' discussions on social media. Specifically, the purposes of this study are twofold: 1) finding similarity and dissimilarity between two sets of textual documents that reflect consumers' sentiment polarities, i.e., positive vs. negative opinions, and 2) deriving actual content from the textual data that has a semantic trend. The considered tweets include consumers' opinions on US retail companies (e.g., Amazon, Walmart). Cosine similarity and K-means clustering methods are used to achieve the former goal, and Latent Dirichlet Allocation (LDA), a popular topic modeling algorithm, is used for the latter purpose. This is the first study to discover semantic properties of textual data in a consumption context beyond sentiment analysis. In addition to the major findings, we applied LDA to the same data and drew latent topics that represent consumers' positive and negative opinions on social media.