Biblio

List
Filter

Found 135 results

Filters: Keyword is Clustering algorithms [Clear All Filters]

2021-05-25

Chao, Henry, Stark, Benjamin, Samarah, Mohammad. 2019. Analysis of Learning Modalities Towards Effective Undergraduate Cybersecurity Education Design. 2019 IEEE International Conference on Engineering, Technology and Education (TALE). :1—6.

Cybersecurity education is a critical component of today's computer science and IT curriculum. To provide for a highly effective cybersecurity education, we propose using machine-learning techniques to identify common learning modalities of cybersecurity students in order to optimize how cybersecurity core topics, threats, tools and techniques are taught. We test various hypothesis, e.g. that students of selected VARK learning styles will outperform their peers. The results indicate that for the class assignments in our study preference of read/write and kinesthetic modalities yielded the best results. This further indicates that specific learning instruments can be tailored for students based on their individual VARK learning styles.

Santos, Bernardo, Dzogovic, Bruno, Feng, Boning, Jacot, Niels, Do, Van Thuan, Do, Thanh Van. 2020. Improving Cellular IoT Security with Identity Federation and Anomaly Detection. 2020 5th International Conference on Computer and Communication Systems (ICCCS). :776—780.

As we notice the increasing adoption of Cellular IoT solutions (smart-home, e-health, among others), there are still some security aspects that can be improved as these devices can suffer various types of attacks that can have a high-impact over our daily lives. In order to avoid this, we present a multi-front security solution that consists on a federated cross-layered authentication mechanism, as well as a machine learning platform with anomaly detection techniques for data traffic analysis as a way to study devices' behavior so it can preemptively detect attacks and minimize their impact. In this paper, we also present a proof-of-concept to illustrate the proposed solution and showcase its feasibility, as well as the discussion of future iterations that will occur for this work.

Kore, Ashwini, Patil, Shailaja. 2020. Robust Cross-Layer Security Framework For Internet of Things Enabled Wireless Sensor Networks. 2020 International Conference on Emerging Smart Computing and Informatics (ESCI). :142—147.

The significant development of Internet of Things (IoT) paradigm for monitoring the real-time applications using the wireless communication technologies leads to various challenges. The secure data transmission and privacy is one of the key challenges of IoT enabled Wireless Sensor Networks (WSNs) communications. Due to heterogeneity of attackers like Man-in-Middle Attack (MIMA), the present single layered security solutions are not sufficient. In this paper, the robust cross-layer trust computation algorithm for MIMA attacker detection proposed for IoT enabled WSNs called IoT enabled Cross-Layer Man-in-Middle Attack Detection System (IC-MADS). In IC-MADS, first robust clustering method proposed to form the clusters and cluster head (CH) preference. After clustering, for every sensor node, its trust value computed using the parameters of three layers such as MAC, Physical, and Network layers to protect the network communications in presence of security threats. The simulation results prove that IC-MADS achieves better protection against MIMA attacks with minimum overhead and energy consumption.

2021-05-13

Everson, Douglas, Cheng, Long. 2020. Network Attack Surface Simplification for Red and Blue Teams. 2020 IEEE Secure Development (SecDev). :74–80.

Network port scans are a key first step to developing a true understanding of a network-facing attack surface. However in large-scale networks, the data resulting from such scans can be too numerous for Red Teams to process for manual and semiautomatic testing. Indiscriminate port scans can also compromise a Red Team seeking to quickly gain a foothold on a network. A large attack surface can even complicate Blue Team activities like threat hunting. In this paper we provide a cluster analysis methodology designed to group similar hosts to reduce security team workload and Red Team observability. We also measure the Internet-facing network attack surface of 13 organizations by clustering their hosts based on similarity. Through a case study we demonstrate how the output of our clustering technique provides new insight to both Red and Blue Teams, allowing them to quickly identify potential high-interest points on the attack surface.

2021-04-27

Li, Y., Zhou, Y., Hu, K., Sun, N., Ke, K.. 2020. A Security Situation Prediction Method Based on Improved Deep Belief Network. 2020 IEEE 2nd International Conference on Civil Aviation Safety and Information Technology (ICCASIT. :594–598.

With the rapid development of smart grids and the continuous deepening of informatization, while realizing remote telemetry and remote control of massive data-based grid operation, electricity information network security problems have become more serious and prominent. A method for electricity information network security situation prediction method based on improved deep belief network is proposed in this paper. Firstly, the affinity propagation clustering algorithm is used to determine the depth of the deep belief network and the number of hidden layer nodes based on sample parameters. Secondly, continuously adjust the scaling factor and crossover probability in the differential evolution algorithm according to the population similarity. Finally, a chaotic search method is used to perform a second search for the best individuals and similarity centers of each generation of the population. Simulation experiments show that the proposed algorithm not only enhances the generalization ability of electricity information network security situation prediction, but also has higher prediction accuracy.

2021-03-29

Mar, Z., Oo, K. K.. 2020. An Improvement of Apriori Mining Algorithm using Linked List Based Hash Table. 2020 International Conference on Advanced Information Technologies (ICAIT). :165–169.

Today, the huge amount of data was using in organizations around the world. This huge amount of data needs to process so that we can acquire useful information. Consequently, a number of industry enterprises discovered great information from shopper purchases found in any respect times. In data mining, the most important algorithms for find frequent item sets from large database is Apriori algorithm and discover the knowledge using the association rule. Apriori algorithm was wasted times for scanning the whole database and searching the frequent item sets and inefficient of memory requirement when large numbers of transactions are in consideration. The improved Apriori algorithm is adding and calculating third threshold may increase the overhead. So, in the aims of proposed research, Improved Apriori algorithm with LinkedList and hash tabled is used to mine frequent item sets from the transaction large amount of database. This method includes database is scanning with Improved Apriori algorithm and frequent 1-item sets counts with using the hash table. Then, in the linked list saved the next frequent item sets and scanning the database. The hash table used to produce the frequent 2-item sets Therefore, the database scans the only two times and necessary less processing time and memory space.

2021-03-22

Li, Y., Zhou, W., Wang, H.. 2020. F-DPC: Fuzzy Neighborhood-Based Density Peak Algorithm. IEEE Access. 8:165963–165972.

Clustering is a concept in data mining, which divides a data set into different classes or clusters according to a specific standard, making the similarity of data objects in the same cluster as large as possible. Clustering by fast search and find of density peaks (DPC) is a novel clustering algorithm based on density. It is simple and novel, only requiring fewer parameters to achieve better clustering effect, without the requirement for iterative solution. And it has expandability and can detect the clustering of any shape. However, DPC algorithm still has some defects, such as it employs the clear neighborhood relations to calculate local density, so it cannot identify the neighborhood membership of different values of points from the distance of points and It is impossible to accurately cluster the data of the multi-density peak. The fuzzy neighborhood density peak clustering algorithm is proposed for this shortcoming (F-DPC): novel local density is defined by the fuzzy neighborhood relationship. The fuzzy set theory can be used to make the fuzzy neighborhood function of local density more sensitive, so that the clustering for data set of various shapes and densities is more robust. Experiments show that the algorithm has high accuracy and robustness.

2021-03-01

Raj, C., Khular, L., Raj, G.. 2020. Clustering Based Incident Handling For Anomaly Detection in Cloud Infrastructures. 2020 10th International Conference on Cloud Computing, Data Science Engineering (Confluence). :611–616.

Incident Handling for Cloud Infrastructures focuses on how the clustering based and non-clustering based algorithms can be implemented. Our research focuses in identifying anomalies and suspicious activities that might happen inside a Cloud Infrastructure over available datasets. A brief study has been conducted, where a network statistics dataset the NSL-KDD, has been chosen as the model to be worked upon, such that it can mirror the Cloud Infrastructure and its components. An important aspect of cloud security is to implement anomaly detection mechanisms, in order to monitor the incidents that inhibit the development and the efficiency of the cloud. Several methods have been discovered which help in achieving our present goal, some of these are highlighted as the following; by applying algorithm such as the Local Outlier Factor to cancel the noise created by irrelevant data points, by applying the DBSCAN algorithm which can detect less denser areas in order to identify their cause of clustering, the K-Means algorithm to generate positive and negative clusters to identify the anomalous clusters and by applying the Isolation Forest algorithm in order to implement decision based approach to detect anomalies. The best algorithm would help in finding and fixing the anomalies efficiently and would help us in developing an Incident Handling model for the Cloud.

2021-02-22

Oliver, J., Ali, M., Hagen, J.. 2020. HAC-T and Fast Search for Similarity in Security. 2020 International Conference on Omni-layer Intelligent Systems (COINS). :1–7.

Similarity digests have gained popularity for many security applications like blacklisting/whitelisting, and finding similar variants of malware. TLSH has been shown to be particularly good at hunting similar malware, and is resistant to evasion as compared to other similarity digests like ssdeep and sdhash. Searching and clustering are fundamental tools which help the security analysts and security operations center (SOC) operators in hunting and analyzing malware. Current approaches which aim to cluster malware are not scalable enough to keep up with the vast amount of malware and goodware available in the wild. In this paper, we present techniques which allow for fast search and clustering of TLSH hash digests which can aid analysts to inspect large amounts of malware/goodware. Our approach builds on fast nearest neighbor search techniques to build a tree-based index which performs fast search based on TLSH hash digests. The tree-based index is used in our threshold based Hierarchical Agglomerative Clustering (HAC-T) algorithm which is able to cluster digests in a scalable manner. Our clustering technique can cluster digests in O (n logn) time on average. We performed an empirical evaluation by comparing our approach with many standard and recent clustering techniques. We demonstrate that our approach is much more scalable and still is able to produce good cluster quality. We measured cluster quality using purity on 10 million samples obtained from VirusTotal. We obtained a high purity score in the range from 0.97 to 0.98 using labels from five major anti-virus vendors (Kaspersky, Microsoft, Symantec, Sophos, and McAfee) which demonstrates the effectiveness of the proposed method.

Koda, S., Kambara, Y., Oikawa, T., Furukawa, K., Unno, Y., Murakami, M.. 2020. Anomalous IP Address Detection on Traffic Logs Using Novel Word Embedding. 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC). :1504–1509.

This paper presents an anomalous IP address detection algorithm for network traffic logs. It is based on word embedding techniques derived from natural language processing to extract the representative features of IP addresses. However, the features extracted from vanilla word embeddings are not always compatible with machine learning-based anomaly detection algorithms. Therefore, we developed an algorithm that enables the extraction of more compatible features of IP addresses for anomaly detection than conventional methods. The proposed algorithm optimizes the objective functions of word embedding-based feature extraction and anomaly detection, simultaneously. According to the experimental results, the proposed algorithm outperformed conventional approaches; it improved the detection performance from 0.876 to 0.990 in the area under the curve criterion in a task of detecting the IP addresses of attackers from network traffic logs.

2021-02-10

Gomes, G., Dias, L., Correia, M.. 2020. CryingJackpot: Network Flows and Performance Counters against Cryptojacking. 2020 IEEE 19th International Symposium on Network Computing and Applications (NCA). :1—10.

Cryptojacking, the appropriation of users' computational resources without their knowledge or consent to obtain cryp-tocurrencies, is a widespread attack, relatively easy to implement and hard to detect. Either browser-based or binary, cryptojacking lacks robust and reliable detection solutions. This paper presents a hybrid approach to detect cryptojacking where no previous knowledge about the attacks or training data is needed. Our Cryp-tojacking Intrusion Detection Approach, Cryingjackpot, extracts and combines flow and performance counter-based features, aggregating hosts with similar behavior by using unsupervised machine learning algorithms. We evaluate Cryingjackpot experimentally with both an artificial and a hybrid dataset, achieving F1-scores up to 97%.

2021-01-28

Lin, G., Zhao, H., Zhao, L., Gan, X., Yao, Z.. 2020. Differential Privacy Information Publishing Algorithm based on Cluster Anonymity. 2020 International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE). :226—233.

With the development of Internet technology, the attacker gets more and more complex background knowledge, which makes the anonymous model susceptible to background attack. Although the differential privacy model can resist the background attack, it reduces the versatility of the data. In this paper, this paper proposes a differential privacy information publishing algorithm based on clustering anonymity. The algorithm uses the cluster anonymous algorithm based on KD tree to cluster the original data sets and gets anonymous tables by anonymous operation. Finally, the algorithm adds noise to the anonymous table to satisfy the definition of differential privacy. The algorithm is compared with the DCMDP (Density-Based Clustering Mechanism with Differential Privacy, DCMDP) algorithm under different privacy budgets. The experiments show that as the privacy budget increases, the algorithm reduces the information loss by about 80% of the published data.

Wang, N., Song, H., Luo, T., Sun, J., Li, J.. 2020. Enhanced p-Sensitive k-Anonymity Models for Achieving Better Privacy. 2020 IEEE/CIC International Conference on Communications in China (ICCC). :148—153.

To our best knowledge, the p-sensitive k-anonymity model is a sophisticated model to resist linking attacks and homogeneous attacks in data publishing. However, if the distribution of sensitive values is skew, the model is difficult to defend against skew attacks and even faces sensitive attacks. In practice, the privacy requirements of different sensitive values are not always identical. The “one size fits all” unified privacy protection level may cause unnecessary information loss. To address these problems, the paper quantifies privacy requirements with the concept of IDF and concerns more about sensitive groups. Two enhanced anonymous models with personalized protection characteristic, that is, (p,αisg) -sensitive k-anonymity model and (pi,αisg)-sensitive k-anonymity model, are then proposed to resist skew attacks and sensitive attacks. Furthermore, two clustering algorithms with global search and local search are designed to implement our models. Experimental results show that the two enhanced models have outstanding advantages in better privacy at the expense of a little data utility.

Li, Y., Chen, J., Li, Q., Liu, A.. 2020. Differential Privacy Algorithm Based on Personalized Anonymity. 2020 5th IEEE International Conference on Big Data Analytics (ICBDA). :260—267.

The existing anonymized differential privacy model adopts a unified anonymity method, ignoring the difference of personal privacy, which may lead to the problem of excessive or insufficient protection of the original data [1]. Therefore, this paper proposes a personalized k-anonymity model for tuples (PKA) and proposes a differential privacy data publishing algorithm (DPPA) based on personalized anonymity, firstly based on the tuple personality factor set by the user in the original data set. The values are classified and the corresponding privacy protection relevance is calculated. Then according to the tuple personality factor classification value, the data set is clustered by clustering method with different anonymity, and the quasi-identifier attribute of each cluster is aggregated and noise-added to realize anonymized differential privacy; finally merge the subset to get the data set that meets the release requirements. In this paper, the correctness of the algorithm is analyzed theoretically, and the feasibility and effectiveness of the proposed algorithm are verified by comparison with similar algorithms.

2021-01-22

Alghamdi, A. A., Reger, G.. 2020. Pattern Extraction for Behaviours of Multi-Stage Threats via Unsupervised Learning. 2020 International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA). :1—8.

Detection of multi-stage threats such as Advanced Persistent Threats (APT) is extremely challenging due to their deceptive approaches. Sequential events of threats might look benign when performed individually or from different addresses. We propose a new unsupervised framework to identify patterns and correlations of malicious behaviours by analysing heterogeneous log-files. The framework consists of two main phases of data analysis to extract inner-behaviours of log-files and then the patterns of those behaviours over analysed files. To evaluate the framework we have produced a (publicly available) labelled version of the SotM43 dataset. Our results demonstrate that the framework can (i) efficiently cluster inner-behaviours of log-files with high accuracy and (ii) extract patterns of malicious behaviour and correlations between those patterns from real-world data.

2020-12-28

Padmapriya, S., Valli, R., Jayekumar, M.. 2020. Monitoring Algorithm in Malicious Vehicular Adhoc Networks. 2020 International Conference on System, Computation, Automation and Networking (ICSCAN). :1—6.

Vehicular Adhoc Networks (VANETs) ensures road safety by communicating with a set of smart vehicles. VANET is a subset of Mobile Adhoc Networks (MANETs). VANET enabled vehicles helps in establishing communication services among one another or with the Road Side Unit (RSU). Information transmitted in VANET is distributed in an open access environment and hence security is one of the most critical issues related to VANET. Although each vehicle is not a source of all communications, most contact depends on the information that other vehicles receive from it. That vehicle must be able to assess, determine and respond locally on the information obtained from other vehicles to protect VANET from malicious act. Of this reason, message verification in VANET is more difficult due to the protection and privacy issues of the participating vehicles. To overcome security threats, we propose Monitoring Algorithm that detects malicious nodes based on the pre-selected threshold value. The threshold value is compared with the distrust value which is inherently tagged with each vehicle. The proposed Monitoring Algorithm not only detects malicious vehicles, but also isolates the malicious vehicles from the network. The proposed technique is simulated using Network Simulator2 (NS2) tool. The simulation result illustrated that the proposed Monitoring Algorithm outperforms the existing algorithms in terms of malicious node detection, network delay, packet delivery ratio and throughput, thereby uplifting the overall performance of the network.

Marichamy, V. S., Natarajan, V.. 2020. A Study of Big Data Security on a Partitional Clustering Algorithm with Perturbation Technique. 2020 International Conference on Smart Electronics and Communication (ICOSEC). :482—486.

Partitional Clustering Algorithm (PCA) on the Hadoop Distributed File System is to perform big data securities using the Perturbation Technique is the main idea of the proposed work. There are numerous clustering methods available that are used to categorize the information from the big data. PCA discovers the cluster based on the initial partition of the data. In this approach, it is possible to develop a security safeguarding of data that is impoverished to allow the calculations and communication. The performances were analyzed on Health Care database under the studies of various parameters like precision, accuracy, and F-score measure. The outcome of the results is to demonstrate that this method is used to decrease the complication in preserving privacy and better accuracy than that of the existing techniques.

2020-12-02

Wang, W., Xuan, S., Yang, W., Chen, Y.. 2019. User Credibility Assessment Based on Trust Propagation in Microblog. 2019 Computing, Communications and IoT Applications (ComComAp). :270—275.

Nowadays, Microblog has become an important online social networking platform, and a large number of users share information through Microblog. Many malicious users have released various false news driven by various interests, which seriously affects the availability of Microblog platform. Therefore, the evaluation of Microblog user credibility has become an important research issue. This paper proposes a microblog user credibility evaluation algorithm based on trust propagation. In view of the high consumption and low precision caused by malicious users' attacking algorithms and manual selection of seed sets by establishing false social relationships, this paper proposes two optimization strategies: pruning algorithm based on social activity and similarity and based on The seed node selection algorithm of clustering. The pruning algorithm can trim off the attack edges established by malicious users and normal users. The seed node selection algorithm can efficiently select the highly available seed node set, and finally use the user social relationship graph to perform the two-way propagation trust scoring, so that the low trusted user has a lower trusted score and thus identifies the malicious user. The related experiments verify the effectiveness of the trustworthiness-based user credibility evaluation algorithm in the evaluation of Microblog user credibility.

Gliksberg, J., Capra, A., Louvet, A., García, P. J., Sohier, D.. 2019. High-Quality Fault-Resiliency in Fat-Tree Networks (Extended Abstract). 2019 IEEE Symposium on High-Performance Interconnects (HOTI). :9—12.

Coupling regular topologies with optimized routing algorithms is key in pushing the performance of interconnection networks of HPC systems. In this paper we present Dmodc, a fast deterministic routing algorithm for Parallel Generalized Fat-Trees (PGFTs) which minimizes congestion risk even under massive topology degradation caused by equipment failure. It applies a modulo-based computation of forwarding tables among switches closer to the destination, using only knowledge of subtrees for pre-modulo division. Dmodc allows complete re-routing of topologies with tens of thousands of nodes in less than a second, which greatly helps centralized fabric management react to faults with high-quality routing tables and no impact to running applications in current and future very large-scale HPC clusters. We compare Dmodc against routing algorithms available in the InfiniBand control software (OpenSM) first for routing execution time to show feasibility at scale, and then for congestion risk under degradation to demonstrate robustness. The latter comparison is done using static analysis of routing tables under random permutation (RP), shift permutation (SP) and all-to-all (A2A) traffic patterns. Results for Dmodc show A2A and RP congestion risks similar under heavy degradation as the most stable algorithms compared, and near-optimal SP congestion risk up to 1% of random degradation.

2020-12-01

Sun, P., Yin, S., Man, W., Tao, T.. 2018. Research of Personalized Recommendation Algorithm Based on Trust and User's Interest. 2018 International Conference on Robots Intelligent System (ICRIS). :153—156.

Most traditional recommendation algorithms only consider the binary relationship between users and projects, these can basically be converted into score prediction problems. But most of these algorithms ignore the users's interests, potential work factors or the other social factors of the recommending products. In this paper, based on the existing trustworthyness model and similarity measure, we puts forward the concept of trust similarity and design a joint interest-content recommendation framework to suggest users which videos to watch in the online video site. In this framework, we first analyze the user's viewing history records, tags and establish the user's interest characteristic vector. Then, based on the updated vector, users should be clustered by sparse subspace clust algorithm, which can improve the efficiency of the algorithm. We certainly improve the calculation of similarity to help users find better neighbors. Finally we conduct experiments using real traces from Tencent Weibo and Youku to verify our method and evaluate its performance. The results demonstrate the effectiveness of our approach and show that our approach can substantially improve the recommendation accuracy.

2020-09-04

Qader, Karwan, Adda, Mo. 2019. DOS and Brute Force Attacks Faults Detection Using an Optimised Fuzzy C-Means. 2019 IEEE International Symposium on INnovations in Intelligent SysTems and Applications (INISTA). :1—6.

This paper explains how the commonly occurring DOS and Brute Force attacks on computer networks can be efficiently detected and network performance improved, which reduces costs and time. Therefore, network administrators attempt to instantly diagnose any network issues. The experimental work used the SNMP-MIB parameter datasets, which are collected via a specialised MIB dataset consisting of seven types of attack as noted in section three. To resolves such issues, this researched carried out several important contributions which are related to fault management concerns in computer network systems. A central task in the detection of the attacks relies on MIB feature behaviours using the suggested SFCM method. It was concluded that the DOS and Brute Force fault detection results for three different clustering methods demonstrated that the proposed SFCM detected every data point in the related group. Consequently, the FPC approached 1.0, its highest record, and an improved performance solution better than the EM methods and K-means are based on SNMP-MIB variables.

2020-08-13

Yang, Huiting, Bai, Yunxiao, Zou, Zhenwan, Shi, Yuanyuan, Chen, Shuting, Ni, Chenxi. 2019. Research on Security Self-defense of Power Information Network Based on Artificial Intelligence. 2019 IEEE 4th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC). 1:1248—1251.

By studying the problems of network information security in power system, this paper proposes a self-defense research and solution for power information network based on artificial intelligence. At the same time, it proposes active defense new technologies such as vulnerability scanning, baseline scanning, network security attack and defense drills in power information network security, aiming at improving the security level of network information and ensuring the security of the information network in the power system.

2020-07-06

Ben, Yongming, Han, Yanni, Cai, Ning, An, Wei, Xu, Zhen. 2019. An Online System Dependency Graph Anomaly Detection based on Extended Weisfeiler-Lehman Kernel. MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM). :1–6.

Modern operating systems are typical multitasking systems: Running multiple tasks at the same time. Therefore, a large number of system calls belonging to different processes are invoked at the same time. By associating these invocations, one can construct the system dependency graph. In rapidly evolving system dependency graphs, how to quickly find outliers is an urgent issue for intrusion detection. Clustering analysis based on graph similarity will help solve this problem. In this paper, an extended Weisfeiler-Lehman(WL) kernel is proposed. Firstly, an embedded vector with indefinite dimensions is constructed based on the original dependency graph. Then, the vector is compressed with Simhash to generate a fingerprint. Finally, anomaly detection based on clustering is carried out according to these fingerprints. Our scheme can achieve prominent detection with high efficiency. For validation, we choose StreamSpot, a relevant prior work, to act as benchmark, and use the same data set as it to carry out evaluations. Experiments show that our scheme can achieve the highest detection precision of 98% while maintaining a perfect recall performance. Moreover, both quantitative and visual comparisons demonstrate the outperforming clustering effect of our scheme than StreamSpot.

2020-06-01

Lili, Yu, Lei, Zhang, Jing, Li, Fanbo, Meng. 2018. A PSO clustering based RFID middleware. 2018 4th International Conference on Control, Automation and Robotics (ICCAR). :222–225.

In current, RFID (Radio Frequency Identification) Middleware is widely used in nearly all RFID applications, and provides service for raw data capturing, security data reading/writing as well as sensors controlling. However, as the existing Middlewares were organized with rigorous data comparison and data encryption, it is seriously affecting the usefulness and execution efficiency. Thus, in order to improve the utilization rate of effective data, increase the reading/writing speed as well as preserving the security of RFID, this paper proposed a PSO (Particle swarm optimization) clustering scheme to accelerate the procedure of data operation. Then with the help of PSO cluster, the RFID Middleware can provide better service for both data security and data availability. At last, a comparative experiment is proposed, which is used to further verify the advantage of our proposed scheme.

2020-05-22

Yan, Donghui, Wang, Yingjie, Wang, Jin, Wang, Honggang, Li, Zhenpeng. 2018. K-nearest Neighbor Search by Random Projection Forests. 2018 IEEE International Conference on Big Data (Big Data). :4775—4781.

K-nearest neighbor (kNN) search has wide applications in many areas, including data mining, machine learning, statistics and many applied domains. Inspired by the success of ensemble methods and the flexibility of tree-based methodology, we propose random projection forests, rpForests, for kNN search. rpForests finds kNNs by aggregating results from an ensemble of random projection trees with each constructed recursively through a series of carefully chosen random projections. rpForests achieves a remarkable accuracy in terms of fast decay in the missing rate of kNNs and that of discrepancy in the kNN distances. rpForests has a very low computational complexity. The ensemble nature of rpForests makes it easily run in parallel on multicore or clustered computers; the running time is expected to be nearly inversely proportional to the number of cores or machines. We give theoretical insights by showing the exponential decay of the probability that neighboring points would be separated by ensemble random projection trees when the ensemble size increases. Our theory can be used to refine the choice of random projections in the growth of trees, and experiments show that the effect is remarkable.