Biblio

Found 789 results

Filters: Keyword is learning (artificial intelligence)
2021-03-09
Muhammad, A., Asad, M., Javed, A. R..  2020.  Robust Early Stage Botnet Detection using Machine Learning. 2020 International Conference on Cyber Warfare and Security (ICCWS). :1—6.

Among the different types of malware, botnets are emerging as the most serious risk to cybersecurity, as they provide a platform for criminal operations (e.g., Distributed Denial of Service (DDoS) attacks, malware distribution, phishing, click fraud, and identity theft). Existing botnet detection techniques work only on specific botnet Command and Control (C&C) protocols and do not provide early-stage botnet detection. In this paper, we propose an approach for early-stage botnet detection. The proposed approach first selects the optimal features using feature selection techniques. Next, it feeds these features to machine learning classifiers to evaluate botnet detection performance. Experiments reveal that the proposed approach efficiently classifies normal and malicious traffic at an early stage. The proposed approach achieves an accuracy of 99%, a True Positive Rate (TPR) of 0.99, and a False Positive Rate (FPR) of 0.007, and provides an efficient detection rate in comparison with the existing approaches.
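A minimal sketch of the feature-selection-then-classification pipeline the abstract describes, using scikit-learn. The CSV path, column names, number of selected features, and choice of mutual-information selection with a random forest are illustrative assumptions, not details from the paper.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

df = pd.read_csv("flow_features.csv")           # hypothetical flow-level feature file
X, y = df.drop(columns=["label"]), df["label"]  # label: benign vs. botnet

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

# Select an "optimal" subset of features, then train a classifier on it.
selector = SelectKBest(mutual_info_classif, k=10).fit(X_tr, y_tr)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(selector.transform(X_tr), y_tr)

print(classification_report(y_te, clf.predict(selector.transform(X_te))))
```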

Yerima, S. Y., Alzaylaee, M. K..  2020.  Mobile Botnet Detection: A Deep Learning Approach Using Convolutional Neural Networks. 2020 International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA). :1—8.

Android, being the most widespread mobile operating system, is increasingly becoming a target for malware. Malicious apps designed to turn mobile devices into bots that may form part of a larger botnet have become quite common, thus posing a serious threat. This calls for more effective methods to detect botnets on the Android platform. Hence, in this paper, we present a deep learning approach for Android botnet detection based on Convolutional Neural Networks (CNN). Our proposed botnet detection system is implemented as a CNN-based model that is trained on 342 static app features to distinguish between botnet apps and normal apps. The trained botnet detection model was evaluated on a set of 6,802 real applications containing 1,929 botnets from the publicly available ISCX botnet dataset. The results show that our CNN-based approach had the highest overall prediction accuracy compared to other popular machine learning classifiers. Furthermore, the performance results observed from our model were better than those reported in previous studies on machine learning based Android botnet detection.
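A rough PyTorch sketch of a 1D CNN over a 342-dimensional static feature vector, in the spirit of the model described above. The layer sizes, kernel widths, and treating the feature vector as a single-channel sequence are assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class BotnetCNN(nn.Module):
    def __init__(self, n_features=342):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=4), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=4), nn.ReLU(), nn.MaxPool1d(2),
        )
        with torch.no_grad():  # infer the flattened size from a dummy forward pass
            flat = self.conv(torch.zeros(1, 1, n_features)).numel()
        self.fc = nn.Sequential(nn.Flatten(), nn.Linear(flat, 64), nn.ReLU(),
                                nn.Linear(64, 2))  # botnet app vs. normal app

    def forward(self, x):              # x: (batch, 342) static feature vectors
        return self.fc(self.conv(x.unsqueeze(1)))

model = BotnetCNN()
logits = model(torch.randint(0, 2, (8, 342)).float())
print(logits.shape)                    # torch.Size([8, 2])
```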

Kamilin, M. H. B., Yamaguchi, S..  2020.  White-Hat Worm Launcher Based on Deep Learning in Botnet Defense System. 2020 IEEE International Conference on Consumer Electronics - Asia (ICCE-Asia). :1—2.

This paper proposes a deep learning-based white-hat worm launcher for a Botnet Defense System (BDS). A BDS uses white-hat botnets to defend an IoT system against malicious botnets, and the white-hat worm launcher launches white-hat worms to create those botnets according to the strategy decided by the BDS. The proposed launcher uses deep learning to learn where white-hat worms should be placed in order to successfully drive out malicious botnets. Given a system situation invaded by malicious botnets, it predicts the worms' placement from the learning result and launches them accordingly. We confirmed the effect of the proposed launcher through a simulation-based evaluation.

Herrera, A. E. Hinojosa, Walshaw, C., Bailey, C..  2020.  Improving Black Box Classification Model Veracity for Electronics Anomaly Detection. 2020 15th IEEE Conference on Industrial Electronics and Applications (ICIEA). :1092–1097.
Data-driven classification models are useful for assessing the quality of manufactured electronics. Because decisions are taken based on these models, their veracity is relevant, covering aspects such as accuracy, transparency, and clarity. The proposed BB-Stepwise algorithm aims to improve the transparency and accuracy of black-box classification models. K-Nearest Neighbours (KNN) is a black-box model which is easy to implement and has achieved good classification performance in different applications. In this paper, KNN-Stepwise is illustrated for fault detection of electronic devices. The results show that the proposed algorithm was able to improve the accuracy, veracity, and transparency of KNN models, achieving higher transparency and clarity, and at least similar accuracy, compared with Decision Tree models.
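A rough approximation of the stepwise-with-KNN idea using scikit-learn's generic forward feature selection; the actual BB-Stepwise algorithm in the paper may differ, and the dataset here is a stand-in for electronics test data.

```python
from sklearn.datasets import load_breast_cancer     # stand-in for electronics fault data
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5)

# Greedily add features while they improve cross-validated KNN accuracy.
sfs = SequentialFeatureSelector(knn, n_features_to_select=5, direction="forward", cv=5)
sfs.fit(X, y)

print("selected feature indices:", sfs.get_support(indices=True))
print("CV accuracy on selected features:",
      cross_val_score(knn, sfs.transform(X), y, cv=5).mean())
```

Reporting the selected feature indices is what gives the otherwise black-box KNN model a degree of transparency.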
Mashhadi, M. J., Hemmati, H..  2020.  Hybrid Deep Neural Networks to Infer State Models of Black-Box Systems. 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). :299–311.
Inferring the behavior model of a running software system is quite useful for several automated software engineering tasks, such as program comprehension, anomaly detection, and testing. Most existing dynamic model inference techniques are white-box, i.e., they require the source code to be instrumented to obtain run-time traces. However, in many systems, instrumenting the entire source code is not possible (e.g., when using black-box third-party libraries) or might be very costly. Unfortunately, most black-box techniques that detect states over time are either univariate, make assumptions on the data distribution, or have limited power for learning over a long period of past behavior. To overcome these issues, in this paper we propose a hybrid deep neural network that accepts as input a set of time series, one per input/output signal of the system, and applies a set of convolutional and recurrent layers to learn the non-linear correlations between signals and the patterns over time. We have applied our approach to a real UAV auto-pilot solution from our industry partner with half a million lines of C code. We ran 888 random recent system-level test cases and inferred states over time. Our comparison with several traditional time series change point detection techniques showed that our approach improves their performance by up to 102% in finding state change points, measured by F1 score. We also showed that our state classification algorithm provides on average a 90.45% F1 score, which improves on traditional classification algorithms by up to 17%.
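A minimal PyTorch sketch of a hybrid convolutional + recurrent network over a set of input/output signal time series, in the spirit of the hybrid model described above. The number of signals, window length, number of states, and layer sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

class HybridStateNet(nn.Module):
    def __init__(self, n_signals=8, n_states=4):
        super().__init__()
        # Convolution over time learns local patterns across all signals.
        self.conv = nn.Sequential(nn.Conv1d(n_signals, 32, kernel_size=5, padding=2),
                                  nn.ReLU())
        # Recurrent layer captures longer-range temporal correlations.
        self.rnn = nn.GRU(input_size=32, hidden_size=64, batch_first=True)
        self.head = nn.Linear(64, n_states)

    def forward(self, x):                 # x: (batch, time, n_signals)
        h = self.conv(x.transpose(1, 2))  # -> (batch, 32, time)
        out, _ = self.rnn(h.transpose(1, 2))
        return self.head(out[:, -1])      # predict the state at the end of the window

model = HybridStateNet()
print(model(torch.randn(16, 100, 8)).shape)   # torch.Size([16, 4])
```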
Cui, W., Li, X., Huang, J., Wang, W., Wang, S., Chen, J..  2020.  Substitute Model Generation for Black-Box Adversarial Attack Based on Knowledge Distillation. 2020 IEEE International Conference on Image Processing (ICIP). :648–652.
Although deep convolutional neural networks (CNNs) perform well in many computer vision tasks, their classification mechanism is very vulnerable when exposed to the perturbations of adversarial attacks. In this paper, we propose a new algorithm to generate a substitute model for black-box CNN models by using knowledge distillation. The proposed algorithm distills multiple CNN teacher models into a compact student model that serves as a substitute for the black-box CNN models to be attacked. Black-box adversarial samples can consequently be generated on this substitute model by using various white-box attacking methods. According to our experiments on ResNet18 and DenseNet121, our algorithm boosts the attack success rate (ASR) by 20% by training the substitute model based on knowledge distillation.
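A hedged sketch of multi-teacher knowledge distillation: the student is trained to match the averaged softened predictions of several teacher models, plus a standard label loss. The temperature and weighting are illustrative choices, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits_list, labels, T=4.0, alpha=0.7):
    # Average the teachers' softened probability distributions.
    teacher_probs = torch.stack(
        [F.softmax(t / T, dim=1) for t in teacher_logits_list]).mean(dim=0)
    # KL term pulls the student toward the teacher ensemble; CE term keeps label accuracy.
    kl = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  teacher_probs, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kl + (1 - alpha) * ce

# Example with random tensors standing in for real model outputs.
student = torch.randn(32, 10, requires_grad=True)
teachers = [torch.randn(32, 10) for _ in range(3)]
labels = torch.randint(0, 10, (32,))
print(distillation_loss(student, teachers, labels).item())
```

White-box attacks (e.g., gradient-based methods) would then be run against the trained student, and the resulting adversarial samples transferred to the black-box target.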
MATSUNAGA, Y., AOKI, N., DOBASHI, Y., KOJIMA, T..  2020.  A Black Box Modeling Technique for Distortion Stomp Boxes Using LSTM Neural Networks. 2020 International Conference on Artificial Intelligence in Information and Communication (ICAIIC). :653–656.
This paper describes experimental results of modeling distortion-effect stomp boxes using a machine learning approach. Our proposed technique models a distortion stomp box as a neural network consisting of LSTM layers. In this approach, the neural network is employed to learn the nonlinear behavior of the distortion stomp boxes. All the parameters for replicating the distortion sound are estimated through the training process, using the input and output signals obtained from some commercial stomp boxes. The experimental results indicate that the proposed technique may be appropriate for replicating the distortion sound using the well-trained neural networks.
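A minimal sketch of the black-box modeling idea: an LSTM maps a clean audio sequence to the distorted output, sample by sample, trained on paired input/output recordings. The layer sizes, the MSE objective, and the synthetic stand-in signals are assumptions.

```python
import torch
import torch.nn as nn

class DistortionLSTM(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):          # x: (batch, samples, 1) clean input signal
        h, _ = self.lstm(x)
        return self.out(h)         # (batch, samples, 1) predicted distorted signal

model = DistortionLSTM()
clean = torch.randn(4, 2048, 1)        # stand-in for recorded input frames
distorted = torch.tanh(5 * clean)      # stand-in for the pedal's measured output
loss = nn.functional.mse_loss(model(clean), distorted)
loss.backward()
print(loss.item())
```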
2021-03-04
Nugraha, B., Nambiar, A., Bauschert, T..  2020.  Performance Evaluation of Botnet Detection using Deep Learning Techniques. 2020 11th International Conference on Network of the Future (NoF). :141—149.

Botnets are one of the major threats on the Internet. They are used for malicious activities to compromise the basic network security goals, namely Confidentiality, Integrity, and Availability. For reliable botnet detection and defense, deep learning-based approaches were recently proposed. In this paper, four different deep learning models, namely Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), hybrid CNN-LSTM, and Multi-layer Perceptron (MLP), are applied for botnet detection, and simulation studies are carried out using the CTU-13 botnet traffic dataset. We use several performance metrics such as accuracy, sensitivity, specificity, precision, and F1 score to evaluate the performance of each model on classifying both known and unknown (zero-day) botnet traffic patterns. The results show that our deep learning models can accurately and reliably detect both known and unknown botnet traffic, and show better performance than other deep learning models.

Matin, I. Muhamad Malik, Rahardjo, B..  2020.  A Framework for Collecting and Analysis PE Malware Using Modern Honey Network (MHN). 2020 8th International Conference on Cyber and IT Service Management (CITSM). :1—5.

Nowadays, Windows is a very popular operating system, especially among users with limited computer knowledge. However, often unnoticed by such users, the security threat to the Windows operating system is very high. Security threats can take the form of illegal exploitation of the system, and the most common attack uses malware. Determining the characteristics of malware through dynamic and static analysis techniques depends heavily on the availability of malware samples. A honeypot is the most effective malware collection technique, but a honeypot cannot determine the file format of the collected malware. File format information is needed for malware analysis focused on Windows-based malware. For this reason, we propose a framework that can collect malware information as well as identify PE file formats. In this study, we collected malware samples using a Modern Honey Network. Next, we performed feature extraction to determine the PE file format. Then, we classified the types of malware using VirusTotal scanning. As a result, we obtained 1,222 malware samples, from which we successfully extracted 945 PE malware samples. This study can help researchers in other research fields, such as machine learning and deep learning, for malware detection.
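A hedged sketch of the PE-identification step: given files collected by the honeypot, keep only those that parse as valid PE binaries. This assumes the third-party `pefile` package; the directory layout is hypothetical.

```python
import os
import pefile

def filter_pe_samples(sample_dir):
    pe_files, others = [], []
    for name in os.listdir(sample_dir):
        path = os.path.join(sample_dir, name)
        try:
            pe = pefile.PE(path, fast_load=True)   # raises PEFormatError if not a PE file
            pe.close()
            pe_files.append(path)
        except pefile.PEFormatError:
            others.append(path)
    return pe_files, others

pe_files, others = filter_pe_samples("honeypot_samples/")   # hypothetical directory
print(f"{len(pe_files)} PE samples, {len(others)} non-PE samples")
```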

Carlini, N., Farid, H..  2020.  Evading Deepfake-Image Detectors with White- and Black-Box Attacks. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). :2804—2813.

It is now possible to synthesize highly realistic images of people who do not exist. Such content has, for example, been implicated in the creation of fraudulent social media profiles responsible for disinformation campaigns. Significant efforts are, therefore, being deployed to detect synthetically-generated content. One popular forensic approach trains a neural network to distinguish real from synthetic content. We show that such forensic classifiers are vulnerable to a range of attacks that reduce the classifier to near-0% accuracy. We develop five attack case studies on a state-of-the-art classifier that achieves an area under the ROC curve (AUC) of 0.95 on almost all existing image generators, when only trained on one generator. With full access to the classifier, we can flip the lowest bit of each pixel in an image to reduce the classifier's AUC to 0.0005; perturb 1% of the image area to reduce the classifier's AUC to 0.08; or add a single noise pattern in the synthesizer's latent space to reduce the classifier's AUC to 0.17. We also develop a black-box attack that, with no access to the target classifier, reduces the AUC to 0.22. These attacks reveal significant vulnerabilities of certain image-forensic classifiers.
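A tiny sketch of the simplest perturbation mentioned above: flipping the lowest bit of every 8-bit pixel, which is visually imperceptible but can alter classifier scores. The random image stands in for a synthesized image.

```python
import numpy as np

img = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
flipped = img ^ 1                       # XOR with 1 flips the least significant bit

print(np.max(np.abs(img.astype(int) - flipped.astype(int))))   # 1: maximum per-pixel change
# `flipped` would then be fed to the forensic classifier to measure the drop in AUC.
```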

Kalin, J., Ciolino, M., Noever, D., Dozier, G..  2020.  Black Box to White Box: Discover Model Characteristics Based on Strategic Probing. 2020 Third International Conference on Artificial Intelligence for Industries (AI4I). :60—63.

In Machine Learning, White Box Adversarial Attacks rely on knowledge of the underlying model attributes. This work focuses on discovering two distinct pieces of model information: the underlying architecture and the primary training dataset. With the process in this paper, a structured set of input probes and the output of the model become the training data for a deep classifier. Two subdomains in Machine Learning are explored - image-based classifiers and text transformers with GPT-2. With image classification, the focus is on exploring commonly deployed architectures and datasets available in popular public libraries. Using a single transformer architecture with multiple levels of parameters, text generation is explored by fine-tuning on different datasets. Each dataset explored in the image and text domains is distinguishable from the others. The diversity in text transformer outputs implies further research is needed to successfully classify architecture attribution in the text domain.

Crescenzo, G. D., Bahler, L., McIntosh, A..  2020.  Encrypted-Input Program Obfuscation: Simultaneous Security Against White-Box and Black-Box Attacks. 2020 IEEE Conference on Communications and Network Security (CNS). :1—9.

We consider the problem of protecting cloud services from simultaneous white-box and black-box attacks. Recent research in cryptographic program obfuscation considers the problem of protecting the confidentiality of programs and any secrets in them. In this model, a provable program obfuscation solution makes white-box attacks on the program no more useful than black-box attacks. Motivated by very recent results showing successful black-box attacks on machine learning programs run by cloud servers, we propose and study the approach of augmenting the program obfuscation solution model so as to achieve, in at least some class of application scenarios, program confidentiality in the presence of both white-box and black-box attacks. We propose and formally define encrypted-input program obfuscation, where a key is shared between the entity obfuscating the program and the entity encrypting the program's inputs. We believe this model might be of interest in practical scenarios where cloud programs operate over encrypted data received by associated sensors (e.g., Internet of Things, Smart Grid). Under standard intractability assumptions, we show various results that are not known in the traditional cryptographic program obfuscation model; most notably: Yao's garbled circuit technique implies encrypted-input program obfuscation hiding all gates of an arbitrary polynomial circuit; and very efficient encrypted-input program obfuscation for range membership programs and a class of machine learning programs (i.e., decision trees). The performance of the latter solutions has only a small constant overhead over the equivalent unobfuscated program.

2021-03-01
Tan, R., Khan, N., Guan, L..  2020.  Locality Guided Neural Networks for Explainable Artificial Intelligence. 2020 International Joint Conference on Neural Networks (IJCNN). :1–8.
In current deep network architectures, deeper layers tend to contain hundreds of independent neurons, which makes it hard for humans to understand how they interact with each other. By organizing the neurons by correlation, humans can observe how clusters of neighbouring neurons interact with each other. In this paper, we propose a novel back-propagation algorithm, called Locality Guided Neural Network (LGNN), for training networks in a way that preserves locality between neighbouring neurons within each layer of a deep network. Heavily motivated by the Self-Organizing Map (SOM), the goal is to enforce a local topology on each layer of a deep network such that neighbouring neurons are highly correlated with each other. This method contributes to the domain of Explainable Artificial Intelligence (XAI), which aims to alleviate the black-box nature of current AI methods and make them understandable by humans. Our method aims to achieve XAI in deep learning without changing the structure of current models or requiring any post-processing. This paper focuses on Convolutional Neural Networks (CNNs), but the method can theoretically be applied to any type of deep learning architecture. In our experiments, we train various VGG and Wide ResNet (WRN) networks for image classification on CIFAR100. In-depth analyses presenting both qualitative and quantitative results demonstrate that our method is capable of enforcing a topology on each layer while achieving a small increase in classification accuracy.
Taylor, E., Shekhar, S., Taylor, G. W..  2020.  Response Time Analysis for Explainability of Visual Processing in CNNs. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). :1555–1558.
Explainable artificial intelligence (XAI) methods rely on access to model architecture and parameters, which is not always feasible for most users, practitioners, and regulators. Inspired by cognitive psychology, we present a case for response times (RTs) as a technique for XAI. RTs are observable without access to the model. Moreover, dynamic inference models performing conditional computation generate variable RTs for visual learning tasks depending on hierarchical representations. We show that MSDNet, a conditional computation model with an early-exit architecture, exhibits slower RTs for images with more complex features in the ObjectNet test set, as well as the human phenomenon of scene grammar, where object recognition depends on intrascene object-object relationships. These results cast light on MSDNet's feature space without opening the black box and illustrate the promise of RT methods for XAI.
Kuppa, A., Le-Khac, N.-A..  2020.  Black Box Attacks on Explainable Artificial Intelligence(XAI) methods in Cyber Security. 2020 International Joint Conference on Neural Networks (IJCNN). :1–8.

The cybersecurity community is slowly leveraging Machine Learning (ML) to combat ever-evolving threats. One of the biggest drivers for successful adoption of these models is how well domain experts and users are able to understand and trust their functionality. As these black-box models are being employed to make important predictions, the demand for transparency and explainability from stakeholders is increasing. Explanations supporting the output of ML models are crucial in cyber security, where experts require far more information from the model than a simple binary output for their analysis. Recent approaches in the literature have focused on three different areas: (a) creating and improving explainability methods which help users better understand the internal workings of ML models and their outputs; (b) attacks on interpreters in the white-box setting; and (c) defining the exact properties and metrics of the explanations generated by models. However, they have not covered the security properties and threat models relevant to the cybersecurity domain, nor attacks on explainable models in black-box settings. In this paper, we bridge this gap by proposing a taxonomy for Explainable Artificial Intelligence (XAI) methods, covering various security properties and threat models relevant to the cyber security domain. We design a novel black-box attack for analyzing the consistency, correctness, and confidence security properties of gradient-based XAI methods. We validate our proposed system on three security-relevant datasets and models, and demonstrate that the method achieves the attacker's goal of misleading either both the classifier and the explanation report, or only the explainability method without affecting the classifier output. Our evaluation of the proposed approach shows promising results and can help in designing secure and robust XAI methods.

Perisetty, A., Bodempudi, S. T., Shaik, P. Rahaman, Kumar, B. L. N. Phaneendra.  2020.  Classification of Hyperspectral Images using Edge Preserving Filter and Nonlinear Support Vector Machine (SVM). 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS). :1050–1054.
A hyperspectral image is acquired with a special sensor that collects information continuously. This sensor provides abundant data from the captured scene, and the highly voluminous data in such an image enable the extraction of materials and other valuable items in it. This paper proposes a methodology to extract rich information from hyperspectral images. As the information is collected in a contiguous manner, there is a need to extract spectral bands that are uncorrelated. A factor analysis based dimensionality reduction technique is employed to extract the spectral bands, and a weighted least squares filter is used to obtain the spatial information from the data. Because the spatial filter preserves edges, much information is retained during the feature extraction phase. Finally, a nonlinear SVM is applied to assign a class label to the pixels in the image. The research work is tested on the standard Indian Pines dataset. The performance of the proposed method on this dataset is assessed through various accuracy measures; these accuracies are 96%, 92.6%, and 95.4%, improving over the other methods. This methodology can be applied to forestry applications to extract various real-world metrics.
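A hedged sketch of the spectral part of this pipeline: factor-analysis dimensionality reduction followed by a nonlinear (RBF) SVM on per-pixel spectra. The edge-preserving spatial filtering step from the paper is omitted, and the data here are synthetic stand-ins for Indian Pines pixels; the component count and SVM parameters are assumptions.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X = np.random.rand(2000, 200)              # 2000 pixels x 200 spectral bands (synthetic)
y = np.random.randint(0, 16, size=2000)    # 16 land-cover classes, as in Indian Pines

# Reduce the correlated spectral bands to a small number of latent factors.
X_red = FactorAnalysis(n_components=20, random_state=0).fit_transform(X)

X_tr, X_te, y_tr, y_te = train_test_split(X_red, y, test_size=0.3, random_state=0)
svm = SVC(kernel="rbf", C=10, gamma="scale").fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, svm.predict(X_te)))
```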
2021-02-23
Khan, M., Rehman, O., Rahman, I. M. H., Ali, S..  2020.  Lightweight Testbed for Cybersecurity Experiments in SCADA-based Systems. 2020 International Conference on Computing and Information Technology (ICCIT-1441). :1—5.

A rapid rise in cyber-attacks on Cyber Physical Systems (CPS) has been observed in the last decade. It is even more concerning that several of these attacks targeted critical infrastructures, succeeded, and resulted in significant physical and financial damage. Experimental testbeds capable of providing a flexible, scalable, and interoperable platform for executing various cybersecurity experiments are highly needed by all stakeholders. A container-based SCADA testbed is presented in this work as a potential platform for executing cybersecurity experiments. Through this testbed, network traffic containing ARP spoofing is generated, representing a man-in-the-middle (MITM) attack. While doing so, different systems within the network are scanned, which represents a reconnaissance attack. The network traffic generated by both the ARP spoofing and the network scanning is captured and further used for preparing a dataset. The dataset is utilized for training a network classification model through a machine learning algorithm. The performance of the trained model is evaluated through a series of tests, where promising results are obtained.

Hartpence, B., Kwasinski, A..  2020.  Combating TCP Port Scan Attacks Using Sequential Neural Networks. 2020 International Conference on Computing, Networking and Communications (ICNC). :256—260.

Port scans are a persistent problem on contemporary communication networks. Typically used as an attack reconnaissance tool, they can also create problems with application performance and throughput. This paper describes an architecture that deploys sequential neural networks (NNs) to classify packets, separate TCP datagrams, determine the type of TCP packet, and detect port scans. Sequential networks allow this lengthy task to learn from the current environment and to be broken up into component parts. Following classification, analysis is performed in order to discover scan attempts. We show that neural networks can be used to successfully classify general packetized traffic at recognition rates above 99%, and more complex TCP classes at rates that are also above 99%. We demonstrate that this specific communications task can successfully be broken up into smaller workloads. When tested against actual NMAP scan pcap files, this model successfully discovers open ports and scan attempts with the same high percentage and low false positives.

Al-Emadi, S., Al-Mohannadi, A., Al-Senaid, F..  2020.  Using Deep Learning Techniques for Network Intrusion Detection. 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT). :171—176.
In recent years, there has been a significant increase in network intrusion attacks, which raises great concern from the privacy and security aspects. Due to the advancement of technology, cyber-security attacks are becoming so complex that current detection systems are not sufficient to address this issue. Therefore, an implementation of an intelligent and effective network intrusion detection system would be crucial to solve this problem. In this paper, we use deep learning techniques, namely Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), to design an intelligent detection system which is able to detect different network intrusions. Additionally, we evaluate the performance of the proposed solution using different evaluation metrics, and we present a comparison between the results of the proposed models to find the best model for the network intrusion detection system.
Liu, J., Xiao, K., Luo, L., Li, Y., Chen, L..  2020.  An intrusion detection system integrating network-level intrusion detection and host-level intrusion detection. 2020 IEEE 20th International Conference on Software Quality, Reliability and Security (QRS). :122—129.
With the rapid development of the Internet, the issue of cyber security has gained increasing attention. An Intrusion Detection System (IDS) is an effective technique to defend against cyber-attacks and reduce security losses. However, the challenge of IDS lies in the diversity of cyber-attackers and the frequently changing data, which require a flexible and efficient solution. To address this problem, machine learning approaches are being applied in the IDS field. In this paper, we propose an efficient, scalable, neural-network-based hybrid IDS framework combining a Host-level IDS (HIDS) and a Network-level IDS (NIDS). We applied autoencoders (AE) to the NIDS and designed the HIDS using word embeddings and a convolutional neural network. To evaluate the IDS, many experiments are performed on the public datasets NSL-KDD and ADFA. It can detect many attacks and reduce the security risk with high efficiency and excellent scalability.
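A minimal sketch of the autoencoder idea behind the network-level component: train on benign traffic features only and flag records whose reconstruction error exceeds a threshold. The feature dimension, layer sizes, training loop, and 3-sigma threshold rule are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class AE(nn.Module):
    def __init__(self, dim=41):                      # e.g. NSL-KDD records have 41 features
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 16), nn.ReLU(), nn.Linear(16, 8))
        self.dec = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, dim))

    def forward(self, x):
        return self.dec(self.enc(x))

ae = AE()
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
benign = torch.rand(1024, 41)                        # stand-in for normalized benign records

for _ in range(50):                                  # short training loop on benign data only
    opt.zero_grad()
    loss = nn.functional.mse_loss(ae(benign), benign)
    loss.backward()
    opt.step()

with torch.no_grad():
    benign_err = ((ae(benign) - benign) ** 2).mean(dim=1)
    threshold = benign_err.mean() + 3 * benign_err.std()   # assumed 3-sigma anomaly rule
    test = torch.rand(10, 41)
    test_err = ((ae(test) - test) ** 2).mean(dim=1)
print("flagged as intrusions:", (test_err > threshold).nonzero().flatten().tolist())
```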
Liao, D., Huang, S., Tan, Y., Bai, G..  2020.  Network Intrusion Detection Method Based on GAN Model. 2020 International Conference on Computer Communication and Network Security (CCNS). :153—156.

Existing network intrusion detection methods have few labeled samples available during the training process, and their detection accuracy is not high. To solve this problem, this paper designs a network intrusion detection method based on the GAN model, using the adversarial idea contained in GANs. The model enhances the original training set by continuously generating samples, thereby expanding the labeled sample set. In order to realize multi-class classification of samples, this paper transforms the original binary classification model of the generative adversarial network into a supervised multi-class classification model. The training loss function is redefined, and the corresponding training method and parameter settings are obtained. Under the same experimental conditions, several performance indicators are used to compare the detection ability of the proposed method with the original classification model and other models. The experimental results show that the method proposed in this paper is more stable and robust, achieves an accurate detection rate, has good generalization ability, and can effectively realize network intrusion detection.
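A hedged sketch of turning a GAN discriminator into a supervised multi-class classifier, as described above: instead of a single real/fake output it predicts K traffic classes plus one extra "generated" class. The feature size, K, network widths, and the single combined loss shown here are assumptions for illustration.

```python
import torch
import torch.nn as nn

N_FEATURES, K = 41, 5          # flow features; 5 traffic classes (1 normal + 4 attack types)

discriminator = nn.Sequential(
    nn.Linear(N_FEATURES, 64), nn.LeakyReLU(0.2),
    nn.Linear(64, 32), nn.LeakyReLU(0.2),
    nn.Linear(32, K + 1),      # classes 0..K-1 = real traffic types, class K = generated
)

generator = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, N_FEATURES),
)

real = torch.rand(32, N_FEATURES)
real_labels = torch.randint(0, K, (32,))
fake = generator(torch.randn(32, 16))
fake_labels = torch.full((32,), K, dtype=torch.long)   # generated samples get the extra label

# Supervised multi-class loss over real labeled traffic plus generated samples.
loss = nn.functional.cross_entropy(discriminator(torch.cat([real, fake])),
                                   torch.cat([real_labels, fake_labels]))
loss.backward()
print(loss.item())
```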

Shah, A., Clachar, S., Minimair, M., Cook, D..  2020.  Building Multiclass Classification Baselines for Anomaly-based Network Intrusion Detection Systems. 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA). :759—760.
This paper showcases multiclass classification baselines using different machine learning algorithms and neural networks for distinguishing legitimate network traffic from direct and obfuscated network intrusions. This research derives its baselines from the Advanced Security Network Metrics & Tunneling Obfuscations dataset. The dataset captured legitimate and obfuscated malicious TCP communications on selected vulnerable network services. The multiclass classification NIDS is able to distinguish obfuscated and direct network intrusions with up to 95% accuracy.
Xia, H., Gao, N., Peng, J., Mo, J., Wang, J..  2020.  Binarized Attributed Network Embedding via Neural Networks. 2020 International Joint Conference on Neural Networks (IJCNN). :1—8.
Traditional attributed network embedding methods are designed to map the structural and attribute information of networks jointly into a continuous Euclidean space, while recently a novel branch of them, named binarized attributed network embedding, has emerged to learn binary codes in Hamming space, aiming to save time and memory costs and to naturally fit the node retrieval task. However, current binarized attributed network embedding methods are scarce and mostly ignore the local attribute similarity between each pair of nodes. Besides, none of them attempt to control the independence of each dimension (bit) of the learned binary representation vectors. As existing methods still need improvement, we propose an unsupervised Neural-based Binarized Attributed Network Embedding (NBANE) approach. Firstly, we inherit the Weisfeiler-Lehman proximity matrix from predecessors to aggregate high-order features for each node. Secondly, we feed the aggregated features into an autoencoder with an attribute similarity penalizing term and an orthogonality term to make further dimension reduction. To solve the problem of integer optimization, we adopt the relaxation-quantization method during the process of training neural networks. Empirically, we evaluate the performance of NBANE through node classification and clustering tasks on three real-world datasets and study a case on fast retrieval in academic networks. Our method achieves better performance than state-of-the-art baseline methods of various types.
Liu, W., Park, E. K., Krieger, U., Zhu, S. S..  2020.  Smart e-Health Security and Safety Monitoring with Machine Learning Services. 2020 29th International Conference on Computer Communications and Networks (ICCCN). :1—6.

This research provides security and safety extensions to a blockchain-based solution targeting e-health. The Advanced Blockchain platform is extended with intelligent monitoring for security and machine learning for detecting patient treatment medication safety issues. Because stringent HIPAA, HITECH, EU-GDPR, and other regional regulations dictate security, safety, and privacy requirements, e-health blockchains have to cover mandatory disclosure of violations or enforcement of policies during transaction flows involving healthcare. Our service solution further provides the benefits of resolving abnormal flows in a medical treatment process, providing accountability of the service providers, enabling a trusted health information environment for institutions to handle medication safely, giving patients a better safety guarantee, and enabling the authorities to supervise the security and safety of e-health blockchains. The capabilities can be generalized to support a uniform smart solution across industry in a variety of blockchain applications.

2021-02-22
Li, M., Zhang, Y., Sun, Y., Wang, W., Tsang, I. W., Lin, X..  2020.  I/O Efficient Approximate Nearest Neighbour Search based on Learned Functions. 2020 IEEE 36th International Conference on Data Engineering (ICDE). :289–300.
Approximate nearest neighbour search (ANNS) in high-dimensional space is a fundamental problem in many applications, such as multimedia databases, computer vision, and information retrieval. Among many solutions, data-sensitive hashing-based methods are effective for this problem, yet few of them are designed for external storage scenarios and hence are not optimized for I/O efficiency during query processing. In this paper, we introduce a novel data-sensitive indexing and query processing framework for ANNS with an emphasis on optimizing I/O efficiency, especially sequential I/Os. The proposed index consists of several lists of point IDs, ordered by values that are obtained by learned hashing (i.e., mapping) functions on each corresponding data point. The functions are learned from the data and approximately preserve the order in the high-dimensional space. We consider two instantiations of the functions (linear and non-linear), both learned from the data with novel objective functions. We also develop an I/O-efficient ANNS framework based on the index. Comprehensive experiments on six benchmark datasets show that our proposed methods with the learned index structure perform much better than the state-of-the-art external memory-based ANNS methods in terms of I/O efficiency and accuracy.
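A hedged sketch of the core idea: learn a mapping that roughly preserves the ordering of points, store IDs sorted by the mapped value, and answer a query by scanning one small contiguous (hence sequential-I/O-friendly) range around the query's position. A single PCA direction stands in for the paper's learned linear and non-linear functions, and the candidate-range size is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(10000, 64))              # stand-in for a high-dimensional dataset

# "Learn" a 1-D mapping: projection onto the first principal direction.
direction = np.linalg.svd(data - data.mean(0), full_matrices=False)[2][0]
keys = data @ direction
order = np.argsort(keys)                         # point IDs sorted by mapped value

def ann_query(q, n_candidates=200, k=5):
    pos = np.searchsorted(keys[order], q @ direction)
    lo = max(0, pos - n_candidates // 2)
    cand = order[lo:lo + n_candidates]           # one contiguous, sequential read
    dists = np.linalg.norm(data[cand] - q, axis=1)
    return cand[np.argsort(dists)[:k]]           # exact re-ranking within the candidates

query = rng.normal(size=64)
print(ann_query(query))
```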