Biblio

Filters: Keyword is vector space model
2019-02-25
Karamollaoglu, H., Dogru, İ A., Dorterler, M..  2018.  Detection of Spam E-mails with Machine Learning Methods. 2018 Innovations in Intelligent Systems and Applications Conference (ASYU). :1–5.

E-mail has become an indispensable means of communication, but its widespread use has brought problems with it. The most important of these is spam: unwanted e-mails, often composed of advertisements or offensive content, sent without the recipient's request. This study analyzes the content of e-mails written in Turkish using two machine learning methods, the Naive Bayes classifier and the vector space model, to determine whether each e-mail is spam and to classify it accordingly. Both methods are evaluated against several criteria and their performance is compared.
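To illustrate the Naive Bayes side of this comparison, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. It is not the authors' implementation: the function names `train_nb` and `predict_nb`, the smoothing choice, and the tiny English toy corpus are all assumptions made for illustration, whereas the paper works on Turkish e-mails.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Count documents per class and tokens per class.

    docs is a list of (tokens, label) pairs. All names here are hypothetical.
    """
    class_counts = Counter(label for _, label in docs)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, token_counts, vocab

def predict_nb(model, tokens):
    """Return the label with the highest log posterior under add-one smoothing."""
    class_counts, token_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_tokens = sum(token_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            likelihood = (token_counts[label][tok] + 1) / (total_tokens + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data (assumed, not taken from the paper).
train = [
    (["win", "free", "prize", "now"], "spam"),
    (["free", "offer", "click", "now"], "spam"),
    (["meeting", "agenda", "tomorrow"], "ham"),
    (["project", "report", "meeting"], "ham"),
]
model = train_nb(train)
spam_guess = predict_nb(model, ["free", "prize", "click"])
ham_guess = predict_nb(model, ["meeting", "report"])
```

On this toy corpus the first message is labeled `spam` and the second `ham`. A real system would also need tokenization and normalization suited to Turkish, which this sketch leaves out.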

2018-01-16
Gurjar, S. P. S., Pasupuleti, S. K..  2016.  A privacy-preserving multi-keyword ranked search scheme over encrypted cloud data using MIR-tree. 2016 International Conference on Computing, Analytics and Security Trends (CAST). :533–538.

With the increasing popularity of cloud computing, data owners are motivated to outsource their sensitive data to cloud servers for flexibility and reduced cost in data management. However, privacy is a major concern when outsourcing data to the cloud, so data owners typically encrypt documents before outsourcing them. As the volume of data is increasing at a dramatic rate, it is essential to develop efficient and reliable ciphertext search techniques so that data owners can easily access and update cloud data. In this paper, we propose a privacy-preserving multi-keyword ranked search scheme over encrypted cloud data that also provides data integrity, using a new authenticated data structure, the MIR-tree. The MIR-tree based index combines the widely used vector space model and the TF×IDF model in index construction and query generation. We use an inverted file index for storing word-digests, which provides efficient and fast relevance computation between the query and cloud data. We design an authentication set (AS) for authenticating the queries and verifying the top-k search results. Because of the tree-based index, our scheme achieves optimal search efficiency and reduces the communication overhead of verifying the search results. The analysis shows the security and efficiency of our scheme.
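The ranking idea behind such a scheme, combining the vector space model with TF×IDF weights and returning the top-k documents, can be sketched on plaintext data as follows. This is only an illustration under stated assumptions: it omits encryption, the MIR-tree, the word-digest index, and the authentication set entirely, and the names `build_index` and `top_k`, the logarithmic IDF variant, and the toy documents are hypothetical.

```python
import math
from collections import Counter

def build_index(docs):
    """Build unit-length TF x IDF vectors for a dict of doc_id -> token list."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # document frequency: count each term once per doc
    idf = {t: math.log(n / df[t]) for t in df}  # assumed IDF variant
    vectors = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        vec = {t: tf[t] * idf[t] for t in tf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        # Normalize so that long documents are not favored; keep zero vectors as-is.
        vectors[doc_id] = {t: w / norm for t, w in vec.items()} if norm else vec
    return idf, vectors

def top_k(query, idf, vectors, k):
    """Rank documents by inner product with the IDF-weighted query vector."""
    q = {t: idf[t] for t in set(query) if t in idf}
    scores = {
        doc_id: sum(w * vec.get(t, 0.0) for t, w in q.items())
        for doc_id, vec in vectors.items()
    }
    # Sort by descending score, breaking ties by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]

# Toy collection (assumed for illustration).
docs = {
    "d1": ["cloud", "storage", "encryption"],
    "d2": ["cloud", "search", "ranking", "search"],
    "d3": ["tree", "index", "search"],
}
idf, vectors = build_index(docs)
result = top_k(["search", "ranking"], idf, vectors, k=2)
```

Here `result` ranks `d2` ahead of `d3`, since `d2` contains both query keywords and the rarer one carries more weight. In a privacy-preserving scheme the same scores would have to be computed over encrypted vectors, which is the part this sketch does not attempt.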

2015-05-05
Zadeh, B.Q., Handschuh, S..  2014.  Random Manhattan Indexing. Database and Expert Systems Applications (DEXA), 2014 25th International Workshop on. :203-208.

Vector space models (VSMs) are mathematically well-defined frameworks that have been widely used in text processing. In these models, high-dimensional, often sparse vectors represent text units. In an application, the similarity of vectors, and hence of the text units that they represent, is computed by a distance formula. The high dimensionality of vectors, however, is a barrier to the performance of methods that employ VSMs. Consequently, a dimensionality reduction technique is employed to alleviate this problem. This paper introduces a new method, called Random Manhattan Indexing (RMI), for the construction of L1-normed VSMs at reduced dimensionality. RMI combines the construction of a VSM and dimension reduction into an incremental, and thus scalable, procedure. To attain its goal, RMI employs sparse Cauchy random projections.
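The general idea of building reduced-dimension vectors incrementally while preserving L1 (Manhattan) distances can be sketched as below. This is a simplified stand-in, not the RMI algorithm itself: it uses dense standard Cauchy index vectors rather than the sparse ones the abstract mentions, and it estimates the distance with the sample median of absolute coordinate differences, a known estimator for Cauchy projections. The class name `IncrementalL1Index`, its methods, and the toy co-occurrence counts are hypothetical.

```python
import math
import random
from statistics import median

def cauchy_index_vector(dim, rng):
    """Draw a dense vector of standard Cauchy samples via the inverse CDF."""
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(dim)]

class IncrementalL1Index:
    """Accumulates reduced vectors one observation at a time (assumed design)."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.index_vectors = {}  # context -> fixed random index vector
        self.vectors = {}        # item -> reduced-dimension vector

    def observe(self, item, context, count=1.0):
        """Add `count` co-occurrences of item with context to the item's vector."""
        if context not in self.index_vectors:
            self.index_vectors[context] = cauchy_index_vector(self.dim, self.rng)
        r = self.index_vectors[context]
        vec = self.vectors.setdefault(item, [0.0] * self.dim)
        for i in range(self.dim):
            vec[i] += count * r[i]

    def l1_estimate(self, a, b):
        """Estimate the L1 distance between the original count vectors of a and b.

        Each coordinate difference is Cauchy-distributed with scale equal to
        the true L1 distance, and the median of its absolute value is that scale.
        """
        va, vb = self.vectors[a], self.vectors[b]
        return median(abs(x - y) for x, y in zip(va, vb))

# Toy co-occurrence counts (assumed); the true L1 distance is 0 + 1 + 1 + 2 = 4.
counts = {
    "cat": {"pet": 3, "fur": 2, "meow": 1},
    "dog": {"pet": 3, "fur": 1, "bark": 2},
}
index = IncrementalL1Index(dim=500, seed=42)
for item, contexts in counts.items():
    for context, n in contexts.items():
        index.observe(item, context, n)
estimate = index.l1_estimate("cat", "dog")
```

With 500 dimensions the estimate should land close to the true distance of 4, with the error shrinking as `dim` grows. The full high-dimensional count vectors are never materialized: each observation updates the reduced vector directly, which is what makes a procedure of this kind incremental.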