Biblio
The number of sensors and embedded devices in an urban area can be on the order of thousands. New low-power wide-area (LPWA) wireless network technologies have been proposed to support this large number of asynchronous, low-bandwidth devices. Among them, Cooperative UltraNarrowband (C-UNB) is a clean-slate cellular network technology for connecting these devices to a remote site or data collection server. C-UNB employs small-bandwidth channels and a lightweight random access protocol. In this paper, a new application is investigated: the use of C-UNB wireless networks to support the Advanced Metering Infrastructure (AMI), in order to facilitate communication between smart meters and utilities. To this end, we adapted a mathematical model for C-UNB and implemented a network simulation module in NS-3 to represent C-UNB's physical and medium access control layers. For the application layer, we implemented the DLMS/COSEM protocol (Device Language Message Specification / Companion Specification for Energy Metering). Details of the simulation module are presented, and we conclude that the simulation results support those of the mathematical model.
In the paradigm of network coding, information-theoretic security is considered in the presence of wiretappers, each of whom can access an arbitrary subset of edges up to a certain size, referred to as the security level. Secure network coding is applied to prevent leakage of the source information to the wiretappers. In this paper, we consider the problem of secure network coding for flexible pairs of information rate and security level with any fixed dimension (equal to the sum of rate and security level). We present a novel approach for designing a secure linear network code (SLNC) such that the same SLNC can be applied for all rate and security-level pairs with the fixed dimension. We further develop a polynomial-time algorithm for efficient implementation and prove that there is no penalty on the required field size for the existence of SLNCs, in terms of the best known lower bound by Guang and Yeung. Finally, by applying our approach as a crucial building block, we construct a family of SLNCs that not only can be applied to all possible pairs of rate and security level but also share a common local encoding kernel at each intermediate node in the network.
Since security and fault tolerance are two important metrics of data storage, they bring both opportunities and challenges for distributed data storage and transactions. Traditional transaction systems for storage resources generally run in a centralized mode, which results in high cost, vendor lock-in, single-point-of-failure risk, DDoS attacks, and information-security concerns. Therefore, this paper proposes a distributed transaction method for cloud storage based on smart contracts. First, to guarantee fault tolerance and decrease the storage cost for erasure coding, a VCG-based auction mechanism is proposed for storage transactions, and we deploy and implement the proposed mechanism by designing a corresponding smart contract. In particular, we address the problem of how to implement a VCG-like mechanism in a blockchain environment. We simulate the proposed storage transaction method on a private Ethereum chain. The results show that the proposed transaction model can realize competitive trading of storage resources and ensure the safe and economical operation of resource trading.
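As a point of reference for the auction design above, the following minimal sketch shows how a VCG-style reverse auction for storage could be computed; the "k cheapest providers" allocation rule, the provider names, and the prices are illustrative assumptions, not the paper's actual contract logic.

```python
# Minimal sketch of a VCG-style reverse auction for storage providers.
# Assumption: the buyer needs k shards (e.g., for erasure coding) and picks
# the k cheapest bids; the paper's on-chain mechanism may differ.
def vcg_storage_auction(bids, k):
    """bids: dict provider -> asking price per shard; k: shards needed."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1])
    winners = [provider for provider, _ in ranked[:k]]
    # VCG payment = externality a winner imposes on the others; with a
    # "k cheapest" allocation this reduces to the (k+1)-th lowest bid.
    clearing_price = ranked[k][1] if len(ranked) > k else ranked[-1][1]
    return winners, {provider: clearing_price for provider in winners}

winners, payments = vcg_storage_auction(
    {"nodeA": 5, "nodeB": 7, "nodeC": 6, "nodeD": 9}, k=2)
print(winners, payments)  # ['nodeA', 'nodeC'] {'nodeA': 7, 'nodeC': 7}
```

Paying every winner the first losing bid keeps the mechanism truthful in this simplified setting: no provider can gain by overstating its cost.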
The increasing adoption of 3D printing in many safety- and mission-critical applications exposes 3D printers to a variety of cyber attacks that may result in catastrophic consequences if the printing process is compromised. For example, the mechanical properties (e.g., physical strength, thermal resistance, dimensional stability) of 3D printed objects could be significantly degraded if a simple printing setting is maliciously changed. To address this challenge, this study proposes a model-free, real-time, online process monitoring approach that is capable of detecting and defending against cyber-physical attacks on the firmware of 3D printers. Specifically, we explore the potential attacks on, and consequences for, four key printing attributes (infill path, printing speed, layer thickness, and fan speed) and then formulate the attack models. Based on the intrinsic relation between the printing attributes and the physical observations, our defense model is established by systematically analyzing the multi-faceted, real-time measurements collected from an accelerometer, a magnetometer, and a camera. A Kalman filter and a Canny filter are used to map and estimate the first three of these critical toolpath attributes (infill path, printing speed, and layer thickness), and Mel-frequency cepstrum coefficients are used to extract features for fan speed estimation. Experimental results show that, for a complex 3D printed design, our method achieves a Hausdorff distance of 4% of the model dimension for the infill path estimate, a 6.07% Mean Absolute Percentage Error (MAPE) for the speed estimate, a 9.57% MAPE for the layer thickness estimate, and 96.8% accuracy for fan speed identification. Our study demonstrates that this new approach can effectively defend against cyber-physical attacks on 3D printers and the 3D printing process.
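For reference, the MAPE quoted above is a simple average of relative errors; a minimal sketch follows, with hypothetical per-layer speed values rather than the paper's measurements.

```python
# Mean Absolute Percentage Error (MAPE), as quoted for the speed and layer
# thickness estimates above; the sample values are hypothetical.
def mape(actual, predicted):
    assert len(actual) == len(predicted) and all(a != 0 for a in actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

print(mape([60.0, 60.0, 45.0], [57.2, 63.1, 46.0]))  # per-layer speed in mm/s
```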
Phishing e-mails are a form of spam that aims to collect sensitive personal information about users over the network. Since the main purpose of this behavior is to harm users financially, it is vital to detect phishing and spam e-mails immediately to prevent unauthorized access to users' sensitive information. Detecting phishing e-mails requires a fast and robust classification method: considering the billions of e-mails on the Internet, the classification process must be completed in a limited time. In this work, we present early results on the classification of spam e-mail using deep learning and machine learning methods. We use word2vec to represent e-mails instead of the popular keyword-based or other rule-based methods. The vector representations are then fed into a neural network to create a learning model. We tested our method on an open dataset and achieved accuracy levels above 96% with the deep learning classification methods, compared with standard machine learning algorithms.
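The abstract does not give the exact network architecture, so the sketch below is only a minimal illustration of the pipeline it describes: per-e-mail averaged word2vec vectors fed into a small feed-forward classifier. It assumes gensim 4.x and scikit-learn; the toy corpus, labels, and hyperparameters are placeholders.

```python
# Sketch: represent e-mails as averaged word2vec vectors, then train a small
# neural network classifier. Corpus, labels, and settings are illustrative.
import numpy as np
from gensim.models import Word2Vec
from sklearn.neural_network import MLPClassifier

emails = [["verify", "your", "account", "now"], ["meeting", "agenda", "attached"]]
labels = [1, 0]  # 1 = phishing/spam, 0 = legitimate

w2v = Word2Vec(sentences=emails, vector_size=100, window=5, min_count=1, epochs=20)

def email_vector(tokens, model):
    # Average the vectors of known words; zero vector if none are known.
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

X = np.vstack([email_vector(e, w2v) for e in emails])
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500).fit(X, labels)
print(clf.predict(X))
```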
In this paper, we analyze the evolution of Certificate Transparency (CT) over time and explore the implications of exposing certificate DNS names from the perspective of security and privacy. We find that certificates in CT logs have seen exponential growth. Website support for CT has also increased steadily, with 33% of established connections now supporting CT. With the increasing deployment of CT, there are also concerns about information leakage due to all certificates being visible in CT logs. To understand this threat, we introduce a CT honeypot and show that data from CT logs is being used to identify targets for scanning campaigns only minutes after certificate issuance. We present and evaluate a methodology to learn and validate new subdomains from the vast number of domains extracted from CT-logged certificates.
Phishing is a form of cybercrime in which an attacker impersonates a real person or institution, presenting themselves as an official person or entity through e-mail or other communication media. In this type of cyber attack, the attacker sends malicious links or attachments through phishing e-mails that can perform various functions, including capturing the victim's login credentials or account information. These e-mails harm victims through financial loss and identity theft. In this study, a software tool called "Anti Phishing Simulator" was developed, providing information about the phishing detection problem and how to detect phishing e-mails. The software detects phishing and spam e-mails by examining message contents and classifies the spam words added to its database using a Bayesian algorithm.
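As a rough illustration of Bayesian classification of spam words in message contents, the sketch below uses a multinomial naive Bayes model; the messages, labels, and vocabulary are illustrative, not the tool's actual database.

```python
# Sketch: Bayesian spam-word classification of message contents with a
# multinomial naive Bayes model; all data here are illustrative placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

messages = ["urgent verify your bank account", "lunch at noon tomorrow",
            "claim your free prize now", "project report attached"]
labels = [1, 0, 1, 0]  # 1 = phishing/spam, 0 = legitimate

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)          # word-count features
model = MultinomialNB().fit(X, labels)

print(model.predict(vectorizer.transform(["free prize account"])))  # likely [1]
```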
Fourier domain mode locked (FDML) lasers, in which the sweep period of the swept bandpass filter is synchronized with the roundtrip time of the optical field, are broadband and rapidly tunable fiber ring laser systems that exhibit rich dynamics. A detailed understanding is important from a fundamental point of view and is also required to improve current FDML lasers, which have not yet reached their coherence limit. Here, we study the formation of localized patterns in the intensity trace of FDML laser systems based on a master equation approach [1] derived from the nonlinear Schrödinger equation for polarization maintaining setups, which shows excellent agreement with experimental data. A variety of localized patterns and chaotic or bistable operation modes were previously discovered in [2–4] by investigating primarily quasi-static regimes within a narrow sweep bandwidth, where a delay differential equation model was used. In particular, the formation of so-called holes, which are characterized by a dip in the intensity trace and a rapid phase jump, is described. Such holes have tentatively been associated with Nozaki-Bekki holes, which are solutions of the complex Ginzburg-Landau equation. In Fig. 1 (b) to (d), small sections of a numerical solution of our master equation are presented for a partially dispersion compensated polarization maintaining FDML laser setup. Within our approach, we are able to study the full sweep dynamics over a broad sweep range of more than 100 nm, which allows us to identify different co-existing intensity patterns within a single sweep. In general, high-frequency distortions in the intensity trace of FDML lasers [5] are mainly caused by synchronization mismatches due to fiber dispersion or a detuning of the roundtrip time of the optical field from the sweep period of the swept bandpass filter. These timing errors lead to rich and complex dynamics over many roundtrips and are a major source of noise, greatly affecting imaging and sensing applications; for example, the imaging quality in optical coherence tomography, where FDML lasers are superior sources, is significantly reduced [5].
Brain-Computer Interfaces (BCI) aim to provide a better quality of life to people suffering from neuromuscular disabilities. This paper establishes a BCI paradigm that provides a biometric security option for locking and unlocking personal computers or mobile phones. Although it is primarily meant for people with neurological disorders, its application can safely be extended to normal users. The proposed scheme decodes the electroencephalogram signals generated by the brains of subjects while they select a sequence of dots in a (6×6) 2-dimensional array representing a pattern lock. While selecting the right dot in a row, the subject yields a P300 signal, which is later decoded by the brain-computer interface system to infer the subject's intention. If the right dots in all 6 rows are correctly selected, the subject yields P300 signals six times, which, once decoded by the BCI system, allow the subject to access the system. Because of intra-subject variation in the amplitude and wave shape of the P300 signal, a type-2 fuzzy classifier is employed to classify the presence/absence of the P300 signal in the desired window. A comparison of the performance of the proposed classifier with others is also included. The functionality of the proposed system has been validated using training instances generated for 30 subjects. Experimental results confirm that the classification accuracy of the present scheme is above 90% irrespective of the subject.
Traditionally, power grid vulnerability assessment methods treat node vulnerability and edge vulnerability separately, which leads to inaccurate evaluation results; a unified framework for power grid vulnerability assessment is still required. Thus, this paper proposes a universal method for vulnerability assessment of power grids by establishing a complex network model with uniform weighting of nodes and edges. The concept of a virtual edge is introduced into the weighted complex network model of the power system, and selection functions for edge weights and virtual edge weights are constructed based on electrical and physical parameters. In addition, in order to reflect the electrical characteristics of power grids more accurately, a weighted betweenness evaluation index incorporating transmission efficiency is defined. Finally, the method is demonstrated on the IEEE 39-bus system, and the results prove the effectiveness of the proposed method.
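For illustration, weighted betweenness centrality on a small graph can be computed as below with networkx; the bus names and edge weights are placeholders, not the paper's electrically derived weight-selection functions or its transmission-efficiency extension.

```python
# Sketch: betweenness centrality on a weighted grid model with networkx.
# Edge weights stand in for the electrically derived weights (e.g., based on
# line parameters) described in the abstract.
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([
    ("bus1", "bus2", 0.25), ("bus2", "bus3", 0.10),
    ("bus1", "bus3", 0.40), ("bus3", "bus4", 0.15),
])

node_bc = nx.betweenness_centrality(G, weight="weight", normalized=True)
edge_bc = nx.edge_betweenness_centrality(G, weight="weight", normalized=True)
print(sorted(node_bc.items(), key=lambda kv: -kv[1]))  # most critical buses first
```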
The collection of students' sensitive data provokes adverse reactions against Learning Analytics and decreases confidence in its adoption. The laws and policies that surround the use of educational data are not enough to ensure the privacy, security, validity, integrity and reliability of students' data. This problem has been identified through a literature review and can be addressed by adding a technological layer of automated checking rules on top of these policies. The aim of this thesis is to investigate an emerging technology, blockchain, to preserve the identity of students and secure their data. In a first stage, a systematic literature review will be conducted in order to set the context of the research. Afterwards, following the scientific method, we will develop a blockchain-based solution to automate rules and constraints, with the aim of giving students governance of their data and ensuring data privacy and security.
The rapid rise of cyber-crime activities and the growing number of devices threatened by them place software security issues in the spotlight. As around 90% of all attacks exploit known types of security issues, finding vulnerable components and applying existing mitigation techniques is a viable practical approach to fighting cyber-crime. In this paper, we investigate how state-of-the-art machine learning techniques, including a popular deep learning algorithm, perform in predicting functions with possible security vulnerabilities in JavaScript programs. We applied 8 machine learning algorithms to build prediction models using a new dataset constructed for this research from the vulnerability information in the public databases of the Node Security Project and the Snyk platform, together with code-fixing patches from GitHub. We used static source code metrics as predictors and an extensive grid-search algorithm to find the best performing models. We also examined the effect of various re-sampling strategies for handling the imbalanced nature of the dataset. The best performing algorithm was KNN, which created a model for the prediction of vulnerable functions with an F-measure of 0.76 (0.91 precision and 0.66 recall). Moreover, deep learning, tree- and forest-based classifiers, and SVM were competitive, with F-measures over 0.70. Although the F-measures did not vary significantly across the re-sampling strategies, the distribution of precision and recall did change: no re-sampling produced models favoring high precision, while re-sampling strategies balanced the precision and recall measures.
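The following minimal sketch illustrates the kind of grid-searched KNN pipeline with re-sampling described above; it assumes scikit-learn and imbalanced-learn, and the feature matrix, labels, and parameter grid are placeholders rather than the paper's dataset or settings.

```python
# Sketch: grid-searched KNN over static source-code metrics, with SMOTE
# oversampling of the rare "vulnerable" class. Features, labels, and the
# grid are illustrative placeholders.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
X = rng.random((200, 10))           # e.g., LOC, complexity, nesting depth, ...
y = np.array([1] * 20 + [0] * 180)  # 1 = vulnerable function (minority class)

pipe = Pipeline([("resample", SMOTE(k_neighbors=3, random_state=0)),
                 ("knn", KNeighborsClassifier())])
grid = GridSearchCV(pipe,
                    {"knn__n_neighbors": [3, 5, 9],
                     "knn__weights": ["uniform", "distance"]},
                    scoring="f1", cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```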
To build secure communications software, Vulnerability Prediction Models (VPMs) are used to predict vulnerable modules in a software system before software security testing. Many software security metrics have been proposed for designing VPMs. In this paper, we predict vulnerable classes in a software system by establishing the system's weighted software network; the metrics are obtained from node attributes in this weighted software network. We designed and implemented a crawler tool to collect all public security vulnerabilities in Mozilla Firefox, and the prediction model was trained and tested on these data. The results show that the VPM based on the weighted software network performs well in accuracy, precision, and recall, and, compared with other studies, prediction performance is greatly improved in terms of precision and recall.
In-network caching is a feature shared by all proposed Information-Centric Networking (ICN) architectures, as it is critical to achieving more efficient retrieval of content. However, the default "cache everything everywhere" universal caching scheme has given rise to several privacy threats. Timing attacks are one such privacy breach, where attackers probe caches and use timing analysis of data retrievals to identify whether content was retrieved from the data source or from the cache, the latter case implying that this content was requested recently. We previously proposed a betweenness centrality based caching strategy to mitigate such attacks by increasing user anonymity and demonstrated its efficacy in a transit-stub topology. In this paper, we further investigate the effect of betweenness centrality based caching on cache privacy and user anonymity in more general synthetic and real-world Internet topologies. It has also been shown that an attacker with access to multiple compromised routers can locate and track a mobile user by carrying out multiple timing analysis attacks from various parts of the network. We extend our privacy evaluation to a scenario with mobile users and show that a betweenness centrality based caching policy provides a mobile user with path privacy by increasing an attacker's difficulty in locating a moving user or identifying his/her route.
Set-valued database publication has attracted much attention due to its benefits for various applications such as recommendation systems and marketing analysis. However, publishing the original database directly is risky, since an unauthorized party may violate individual privacy by associating and analyzing relations between individuals and sets of items in the published database, which is known as an identity linkage attack. Generally, such an attack is performed using the attacker's background knowledge obtained through prior investigation, and this adversary knowledge should be taken into account during data anonymization. Various data anonymization schemes have been proposed to prevent identity linkage attacks. However, in existing schemes, either data utility or data properties are greatly reduced after excessive database modification, and consequently data recipients come to distrust the released database. In this paper, we propose a new data anonymization scheme, called sibling suppression, which incurs minimal loss of data utility and maintains data properties such as the database size and the number of records. The scheme uses multiple sets of adversary knowledge, and items in a category of adversary knowledge are replaced by other items in the same category. Several experiments with a real dataset show that our method preserves data utility with minimal loss and maintains data properties identical to those of the original database.
Science gateways open up the possibility of reproducible science as they integrate reusable techniques, data and workflow management systems, security mechanisms, and high-performance computing (HPC). We introduce BioinfoPortal, a science gateway that integrates a suite of bioinformatics applications using HPC and data management resources provided by the Brazilian National HPC System (SINAPAD). BioinfoPortal follows the Software as a Service (SaaS) model, and the web server is freely available for academic use. The goal of this paper is to describe the science gateway and its usage, addressing the challenges of designing a multiuser computational platform for parallel/distributed execution of large-scale bioinformatics applications on Brazilian HPC resources. We also present a study of the performance and scalability of several bioinformatics applications executed in these HPC environments and perform machine learning analyses to predict HPC allocation/usage features that could improve the performance of the bioinformatics applications offered via BioinfoPortal.
The promise of big data relies on the release and aggregation of data sets. When these data sets contain sensitive information about individuals, it has been scalable and convenient to protect the privacy of those individuals by de-identification. However, studies show that the combination of de-identified data sets with other data sets risks re-identification of some records. Some studies have shown how to measure this risk in specific contexts where certain types of public data sets (such as voter rolls) are assumed to be available to attackers. To the extent that it can be accomplished, such analyses enable the threat of compromise to be balanced against the benefits of sharing data. For example, a study that might save lives by enabling medical research may be justified in light of a sufficiently low probability of compromise from sharing de-identified data. In this paper, we introduce a general probabilistic re-identification framework that can be instantiated in specific contexts to estimate the probability of compromise based on explicit assumptions. We further propose a baseline of such assumptions that enables a first-cut estimate of risk for practical case studies; we refer to the framework with these assumptions as the Naive Re-identification Framework (NRF). As a case study, we show how NRF can be applied to analyze and quantify the risk of re-identification arising from releasing de-identified medical data in the context of publicly available social media data. The results of this case study show that NRF can be used to obtain a meaningful quantification of the re-identification risk, compare the risk across different social media, and assess the risks of combinations of various demographic attributes and medical conditions that individuals may voluntarily disclose on social media.
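As a toy illustration of the kind of first-cut estimate such a framework produces (not the NRF itself, whose assumptions are not detailed in the abstract), the sketch below takes the re-identification probability of a record to be the reciprocal of the number of people sharing its quasi-identifiers; the population counts are hypothetical.

```python
# Illustrative first-cut re-identification estimate: under a uniform-match
# assumption, the probability of linking a disclosed record to one individual
# is 1 / (number of people sharing the same quasi-identifiers).
# The population counts below are hypothetical.
population = {
    # (age band, gender, ZIP3): number of people in that group
    ("30-34", "F", "021"): 4200,
    ("30-34", "F", "937"):   35,
    ("85+",   "M", "937"):    3,
}

def reid_probability(quasi_identifiers):
    group_size = population.get(quasi_identifiers, 0)
    return 0.0 if group_size == 0 else 1.0 / group_size

for qid in population:
    print(qid, round(reid_probability(qid), 4))
```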

