Visible to the public Biblio

Filters: Keyword is metadata  [Clear All Filters]
2023-09-01
Shang, Siyuan, Zhou, Aoyang, Tan, Ming, Wang, Xiaohan, Liu, Aodi.  2022.  Access Control Audit and Traceability Forensics Technology Based on Blockchain. 2022 4th International Conference on Frontiers Technology of Information and Computer (ICFTIC). :932—937.
Access control includes authorization of security administrators and access of users. Aiming at the problems of log information storage difficulty and easy tampering faced by auditing and traceability forensics of authorization and access in cross-domain scenarios, we propose an access control auditing and traceability forensics method based on Blockchain, whose core is Ethereum Blockchain and IPFS interstellar mail system, and its main function is to store access control log information and trace forensics. Due to the technical characteristics of blockchain, such as openness, transparency and collective maintenance, the log information metadata storage based on Blockchain meets the requirements of distribution and trustworthiness, and the exit of any node will not affect the operation of the whole system. At the same time, by storing log information in the blockchain structure and using mapping, it is easy to locate suspicious authorization or judgment that lead to permission leakage, so that security administrators can quickly grasp the causes of permission leakage. Using this distributed storage structure for security audit has stronger anti-attack and anti-risk.
2023-08-23
Zhang, Chaochao, HOU, RUI.  2022.  Security Support on Memory Controller for Heap Memory Safety. 2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :248—257.
Memory corruption attacks have existed for multiple decades, and have become a major threat to computer systems. At the same time, a number of defense techniques have been proposed by research community. With the wide adoption of CPU-based memory safety solutions, sophisticated attackers tend to tamper with system memory via direct memory access (DMA) attackers, which leverage DMA-enabled I/O peripherals to fully compromise system memory. The Input-Output Memory Management Units (IOMMUs) based solutions are widely believed to mitigate DMA attacks. However, recent works point out that attackers can bypass IOMMU-based protections by manipulating the DMA interfaces, which are particularly vulnerable to race conditions and other unsafe interactions.State-of-the-art hardware-supported memory protections rely on metadata to perform security checks on memory access. Consequently, the additional memory request for metadata results in significant performance degradation, which limited their feasibility in real world deployments. For quantitative analysis, we separate the total metadata access latency into DRAM latency, on-chip latency, and cache latency, and observe that the actual DRAM access is less than half of the total latency. To minimize metadata access latency, we propose EMC, a low-overhead heap memory safety solution that implements a tripwire based mechanism on the memory controller. In addition, by using memory controller as a natural gateway of various memory access data paths, EMC could provide comprehensive memory safety enforcement to all memory data paths from/to system physical memory. Our evaluation shows an 0.54% performance overhead on average for SPEC 2017 workloads.
2023-06-22
Hasegawa, Taichi, Saito, Taiichi, Sasaki, Ryoichi.  2022.  Analyzing Metadata in PDF Files Published by Police Agencies in Japan. 2022 IEEE 22nd International Conference on Software Quality, Reliability, and Security Companion (QRS-C). :145–151.
In recent years, new types of cyber attacks called targeted attacks have been observed. It targets specific organizations or individuals, while usual large-scale attacks do not focus on specific targets. Organizations have published many Word or PDF files on their websites. These files may provide the starting point for targeted attacks if they include hidden data unintentionally generated in the authoring process. Adhatarao and Lauradoux analyzed hidden data found in the PDF files published by security agencies in many countries and showed that many PDF files potentially leak information like author names, details on the information system and computer architecture. In this study, we analyze hidden data of PDF files published on the website of police agencies in Japan and compare the results with Adhatarao and Lauradoux's. We gathered 110989 PDF files. 56% of gathered PDF files contain personal names, organization names, usernames, or numbers that seem to be IDs within the organizations. 96% of PDF files contain software names.
ISSN: 2693-9371
2023-06-02
Sharad Sonawane, Hritesh, Deshmukh, Sanika, Joy, Vinay, Hadsul, Dhanashree.  2022.  Torsion: Web Reconnaissance using Open Source Intelligence. 2022 2nd International Conference on Intelligent Technologies (CONIT). :1—4.

Internet technology has made surveillance widespread and access to resources at greater ease than ever before. This implied boon has countless advantages. It however makes protecting privacy more challenging for the greater masses, and for the few hacktivists, supplies anonymity. The ever-increasing frequency and scale of cyber-attacks has not only crippled private organizations but has also left Law Enforcement Agencies(LEA's) in a fix: as data depicts a surge in cases relating to cyber-bullying, ransomware attacks; and the force not having adequate manpower to tackle such cases on a more microscopic level. The need is for a tool, an automated assistant which will help the security officers cut down precious time needed in the very first phase of information gathering: reconnaissance. Confronting the surface web along with the deep and dark web is not only a tedious job but which requires documenting the digital footprint of the perpetrator and identifying any Indicators of Compromise(IOC's). TORSION which automates web reconnaissance using the Open Source Intelligence paradigm, extracts the metadata from popular indexed social sites and un-indexed dark web onion sites, provided it has some relating Intel on the target. TORSION's workflow allows account matching from various top indexed sites, generating a dossier on the target, and exporting the collected metadata to a PDF file which can later be referenced.

2023-03-31
Navuluri, Karthik, Mukkamala, Ravi, Ahmad, Aftab.  2016.  Privacy-Aware Big Data Warehouse Architecture. 2016 IEEE International Congress on Big Data (BigData Congress). :341–344.
Along with the ever increasing growth in data collection and its mining, there is an increasing fear of compromising individual and population privacy. Several techniques have been proposed in literature to preserve privacy of collected data while storing and processing. In this paper, we propose a privacy-aware architecture for storing and processing data in a Big Data warehouse. In particular, we propose a flexible, extendable, and adaptable architecture that enforces user specified privacy requirements in the form of Embedded Privacy Agreements. The paper discusses the details of the architecture with some implementation details.
2023-02-03
Liang, Xiubo, Guo, Ningxiang, Hong, Chaoqun.  2022.  A Certificate Authority Scheme Based on Trust Ring for Consortium Nodes. 2022 International Conference on High Performance Big Data and Intelligent Systems (HDIS). :90–94.
The access control mechanism of most consortium blockchain is implemented through traditional Certificate Authority scheme based on trust chain and centralized key management such as PKI/CA at present. However, the uneven power distribution of CA nodes may cause problems with leakage of certificate keys, illegal issuance of certificates, malicious rejection of certificates issuance, manipulation of issuance logs and metadata, it could compromise the security and dependability of consortium blockchain. Therefore, this paper design and implement a Certificate Authority scheme based on trust ring model that can not only enhance the reliability of consortium blockchain, but also ensure high performance. Combined public key, transformation matrix and elliptic curve cryptography are applied to the scheme to generate and store keys in a cluster of CA nodes dispersedly and securely for consortium nodes. It greatly reduced the possibility of malicious behavior and key leakage. To achieve the immutability of logs and metadata, the scheme also utilized public blockchain and smart contract technology to organize the whole procedure of certificate issuance, the issuance logs and metadata for certificate validation are stored in public blockchain. Experimental results showed that the scheme can surmount the disadvantages of the traditional scheme while maintaining sufficiently good performance, including issuance speed and storage efficiency of certificates.
2023-01-05
Nusrat Zahan, Thomas Zimmermann, Patrice Godefroid, Brendan Murphy, Chandra Maddila, Laurie Williams.  2022.  What are Weak Links in the npm Supply Chain? ICSE-SEIP '22: Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice.

Modern software development frequently uses third-party packages, raising the concern of supply chain security attacks. Many attackers target popular package managers, like npm, and their users with supply chain attacks. In 2021 there was a 650% year-on-year growth in security attacks by exploiting Open Source Software's supply chain. Proactive approaches are needed to predict package vulnerability to high-risk supply chain attacks. The goal of this work is to help software developers and security specialists in measuring npm supply chain weak link signals to prevent future supply chain attacks by empirically studying npm package metadata.

In this paper, we analyzed the metadata of 1.63 million JavaScript npm packages. We propose six signals of security weaknesses in a software supply chain, such as the presence of install scripts, maintainer accounts associated with an expired email domain, and inactive packages with inactive maintainers. One of our case studies identified 11 malicious packages from the install scripts signal. We also found 2,818 maintainer email addresses associated with expired domains, allowing an attacker to hijack 8,494 packages by taking over the npm accounts. We obtained feedback on our weak link signals through a survey responded to by 470 npm package developers. The majority of the developers supported three out of our six proposed weak link signals. The developers also indicated that they would want to be notified about weak links signals before using third-party packages. Additionally, we discussed eight new signals suggested by package developers.

2022-09-30
Alom, Ifteher, Eshita, Romana Mahjabin, Ibna Harun, Anam, Ferdous, Md Sadek, Kamrul Bashar Shuhan, Mirza, Chowdhury, Mohammad Jabed M, Shahidur Rahman, Mohammad.  2021.  Dynamic Management of Identity Federations using Blockchain. 2021 IEEE International Conference on Blockchain and Cryptocurrency (ICBC). :1–9.
Federated Identity Management (FIM) is a model of identity management in which different trusted organizations can provide secure online services to their uses. Security Assertion Markup Language (SAML) is one of the widely-used technologies for FIM. However, a SAML-based FIM has two significant issues: the metadata (a crucial component in SAML) has security issues, and federation management is hard to scale. The concept of dynamic identity federation has been introduced, enabling previously unknown entities to join in a new federation facilitating inter-organization service provisioning to address federation management's scalability issue. However, the existing dynamic federation approaches have security issues concerning confidentiality, integrity, authenticity, and transparency. In this paper, we present the idea of facilitating dynamic identity federations utilizing blockchain technology to improve the existing approaches' security issues. We demonstrate its architecture based on a rigorous threat model and requirement analysis. We also discuss its implementation details, current protocol flows and analyze its performance to underline its applicability.
2022-08-26
de Moura, Ralf Luis, Franqueira, Virginia N. L., Pessin, Gustavo.  2021.  Towards Safer Industrial Serial Networks: An Expert System Framework for Anomaly Detection. 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI). :1197—1205.

Cyber security is a topic of increasing relevance in relation to industrial networks. The higher intensity and intelligent use of data pushed by smart technology (Industry 4.0) together with an augmented integration between the operational technology (production) and the information technology (business) parts of the network have considerably raised the level of vulnerabilities. On the other hand, many industrial facilities still use serial networks as underlying communication system, and they are notoriously limited from a cyber security perspective since protection mechanisms available for ТСР/IР communication do not apply. Therefore, an attacker gaining access to a serial network can easily control the industrial components, potentially causing catastrophic incidents, jeopardizing assets and human lives. This study proposes a framework to act as an anomaly detection system (ADS) for industrial serial networks. It has three ingredients: an unsupervised К-means component to analyse message content, a knowledge-based Expert System component to analyse message metadata, and a voting process to generate alerts for security incidents, anomalous states, and faults. The framework was evaluated using the Proflbus-DP, a network simulator which implements a serial bus system. Results for the simulated traffic were promising: 99.90% for accuracy, 99,64% for precision, and 99.28% for F1-Score. They indicate feasibility of the framework applied to serial-based industrial networks.

2022-07-29
Badran, Sultan, Arman, Nabil, Farajallah, Mousa.  2021.  An Efficient Approach for Secure Data Outsourcing using Hybrid Data Partitioning. 2021 International Conference on Information Technology (ICIT). :418—423.
This paper presents an implementation of a novel approach, utilizing hybrid data partitioning, to secure sensitive data and improve query performance. In this novel approach, vertical and horizontal data partitioning are combined together in an approach that called hybrid partitioning and the new approach is implemented using Microsoft SQL server to generate divided/partitioned relations. A group of proposed rules is applied to the query request process using query binning (QB) and Metadata of partitioning. The proposed approach is validated using experiments involving a collection of data evaluated by outcomes of advanced stored procedures. The suggested approach results are satisfactory in achieving the properties of defining the data security: non-linkability and indistinguishability. The results of the proposed approach were satisfactory. The proposed novel approach outperforms a well-known approach called PANDA.
2022-07-14
Ayub, Md. Ahsan, Sirai, Ambareen.  2021.  Similarity Analysis of Ransomware based on Portable Executable (PE) File Metadata. 2021 IEEE Symposium Series on Computational Intelligence (SSCI). :1–6.
Threats, posed by ransomware, are rapidly increasing, and its cost on both national and global scales is becoming significantly high as evidenced by the recent events. Ransomware carries out an irreversible process, where it encrypts victims' digital assets to seek financial compensations. Adversaries utilize different means to gain initial access to the target machines, such as phishing emails, vulnerable public-facing software, Remote Desktop Protocol (RDP), brute-force attacks, and stolen accounts. To combat these threats of ransomware, this paper aims to help researchers gain a better understanding of ransomware application profiles through static analysis, where we identify a list of suspicious indicators and similarities among 727 active ran-somware samples. We start with generating portable executable (PE) metadata for all the studied samples. With our domain knowledge and exploratory data analysis tasks, we introduce some of the suspicious indicators of the structure of ransomware files. We reduce the dimensionality of the generated dataset by using the Principal Component Analysis (PCA) technique and discover clusters by applying the KMeans algorithm. This motivates us to utilize the one-class classification algorithms on the generated dataset. As a result, the algorithms learn the common data boundary in the structure of our studied ransomware samples, and thereby, we achieve the data-driven similarities. We use the findings to evaluate the trained classifiers with the test samples and observe that the Local Outlier Factor (LoF) performs better on all the selected feature spaces compared to the One-Class SVM and the Isolation Forest algorithms.
2022-04-25
Mubarak, Sinil, Habaebi, Mohamed Hadi, Islam, Md Rafiqul, Khan, Sheroz.  2021.  ICS Cyber Attack Detection with Ensemble Machine Learning and DPI using Cyber-kit Datasets. 2021 8th International Conference on Computer and Communication Engineering (ICCCE). :349–354.

Digitization has pioneered to drive exceptional changes across all industries in the advancement of analytics, automation, and Artificial Intelligence (AI) and Machine Learning (ML). However, new business requirements associated with the efficiency benefits of digitalization are forcing increased connectivity between IT and OT networks, thereby increasing the attack surface and hence the cyber risk. Cyber threats are on the rise and securing industrial networks are challenging with the shortage of human resource in OT field, with more inclination to IT/OT convergence and the attackers deploy various hi-tech methods to intrude the control systems nowadays. We have developed an innovative real-time ICS cyber test kit to obtain the OT industrial network traffic data with various industrial attack vectors. In this paper, we have introduced the industrial datasets generated from ICS test kit, which incorporate the cyber-physical system of industrial operations. These datasets with a normal baseline along with different industrial hacking scenarios are analyzed for research purposes. Metadata is obtained from Deep packet inspection (DPI) of flow properties of network packets. DPI analysis provides more visibility into the contents of OT traffic based on communication protocols. The advancement in technology has led to the utilization of machine learning/artificial intelligence capability in IDS ICS SCADA. The industrial datasets are pre-processed, profiled and the abnormality is analyzed with DPI. The processed metadata is normalized for the easiness of algorithm analysis and modelled with machine learning-based latest deep learning ensemble LSTM algorithms for anomaly detection. The deep learning approach has been used nowadays for enhanced OT IDS performances.

2022-04-13
Chen, Ping-Xiang, Chen, Shuo-Han, Chang, Yuan-Hao, Liang, Yu-Pei, Shih, Wei-Kuan.  2021.  Facilitating the Efficiency of Secure File Data and Metadata Deletion on SMR-based Ext4 File System. 2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC). :728–733.
The efficiency of secure deletion is highly dependent on the data layout of underlying storage devices. In particular, owing to the sequential-write constraint of the emerging Shingled Magnetic Recording (SMR) technology, an improper data layout could lead to serious write amplification and hinder the performance of secure deletion. The performance degradation of secure deletion on SMR drives is further aggravated with the need to securely erase the file system metadata of deleted files due to the small-size nature of file system metadata. Such an observation motivates us to propose a secure-deletion and SMR-aware space allocation (SSSA) strategy to facilitate the process of securely erasing both the deleted files and their metadata simultaneously. The proposed strategy is integrated within the widely-used extended file system 4 (ext4) and is evaluated through a series of experiments to demonstrate the effectiveness of the proposed strategy. The evaluation results show that the proposed strategy can reduce the secure deletion latency by 91.3% on average when compared with naive SMR-based ext4 file system.
Solanke, Abiodun A., Chen, Xihui, Ramírez-Cruz, Yunior.  2021.  Pattern Recognition and Reconstruction: Detecting Malicious Deletions in Textual Communications. 2021 IEEE International Conference on Big Data (Big Data). :2574–2582.
Digital forensic artifacts aim to provide evidence from digital sources for attributing blame to suspects, assessing their intents, corroborating their statements or alibis, etc. Textual data is a significant source of artifacts, which can take various forms, for instance in the form of communications. E-mails, memos, tweets, and text messages are all examples of textual communications. Complex statistical, linguistic and other scientific procedures can be manually applied to this data to uncover significant clues that point the way to factual information. While expert investigators can undertake this task, there is a possibility that critical information is missed or overlooked. The primary objective of this work is to aid investigators by partially automating the detection of suspicious e-mail deletions. Our approach consists in building a dynamic graph to represent the temporal evolution of communications, and then using a Variational Graph Autoencoder to detect possible e-mail deletions in this graph. Our model uses multiple types of features for representing node and edge attributes, some of which are based on metadata of the messages and the rest are extracted from the contents using natural language processing and text mining techniques. We use the autoencoder to detect missing edges, which we interpret as potential deletions; and to reconstruct their features, from which we emit hypotheses about the topics of deleted messages. We conducted an empirical evaluation of our model on the Enron e-mail dataset, which shows that our model is able to accurately detect a significant proportion of missing communications and to reconstruct the corresponding topic vectors.
2022-04-12
Rane, Prachi, Rao, Aishwarya, Verma, Diksha, Mhaisgawali, Amrapali.  2021.  Redacting Sensitive Information from the Data. 2021 International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON). :1—5.
Redaction of personal, confidential and sensitive information from documents is becoming increasingly important for individuals and organizations. In past years, there have been many well-publicized cases of data leaks from various popular companies. When the data contains sensitive information, these leaks pose a serious threat. To protect and conceal sensitive information, many companies have policies and laws about processing and sanitizing sensitive information in business documents.The traditional approach of manually finding and matching millions of words and then redacting is slow and error-prone. This paper examines different models to automate the identification and redaction of personal and sensitive information contained within the documents using named entity recognition. Sensitive entities example person’s name, bank account details or Aadhaar numbers targeted for redaction, are recognized based on the file’s content, providing users with an interactive approach to redact the documents by changing selected sensitive terms.
2022-03-22
Huang, Jianming, Hua, Yu.  2021.  A Write-Friendly and Fast-Recovery Scheme for Security Metadata in Non-Volatile Memories. 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA). :359—370.
Non-Volatile Memories (NVMs) require security mechanisms, e.g., counter mode encryption and integrity tree verification, which are important to protect systems in terms of encryption and data integrity. These security mechanisms heavily rely on extra security metadata that need to be efficiently and accurately recovered after system crashes or power off. Established SGX integrity tree (SIT) becomes efficient to protect system integrity and however fails to be restored from leaves, since the computations of SIT nodes need their parent nodes as inputs. To recover the security metadata with low write overhead and short recovery time, we propose an efficient and instantaneous persistence scheme, called STAR, which instantly persists the modifications of security metadata without extra memory writes. STAR is motivated by our observation that the parent nodes in cache are modified due to persisting their child nodes. STAR stores the modifications of parent nodes in their child nodes and persists them just using one atomic memory write. To eliminate the overhead of persisting the modifications, STAR coalesces the modifications and MACs in the evicted metadata. For fast recovery and verification of the metadata, STAR uses bitmap lines in asynchronous DRAM refresh (ADR) to indicate the locations of stale metadata, and constructs a cached merkle tree to verify the correctness of the recovery process. Our evaluation results show that compared with state-of-the-art work, our proposed STAR delivers high performance, low write traffic, low energy consumption and short recovery time.
2022-02-24
Castellano, Giovanna, Vessio, Gennaro.  2021.  Deep Convolutional Embedding for Digitized Painting Clustering. 2020 25th International Conference on Pattern Recognition (ICPR). :2708–2715.
Clustering artworks is difficult for several reasons. On the one hand, recognizing meaningful patterns in accordance with domain knowledge and visual perception is extremely difficult. On the other hand, applying traditional clustering and feature reduction techniques to the highly dimensional pixel space can be ineffective. To address these issues, we propose to use a deep convolutional embedding model for digitized painting clustering, in which the task of mapping the raw input data to an abstract, latent space is jointly optimized with the task of finding a set of cluster centroids in this latent feature space. Quantitative and qualitative experimental results show the effectiveness of the proposed method. The model is also capable of outperforming other state-of-the-art deep clustering approaches to the same problem. The proposed method can be useful for several art-related tasks, in particular visual link retrieval and historical knowledge discovery in painting datasets.
Singh, Parwinder, Acharya, Kartikeya Satish, Beliatis, Michail J., Presser, Mirko.  2021.  Semantic Search System For Real Time Occupancy. 2021 IEEE International Conference on Internet of Things and Intelligence Systems (IoTaIS). :49–55.
This paper presents an IoT enabled real time occupancy semantic search system leveraging ETSI defined context information and interface meta model standard- ``Next Generation Service Interface for Linked Data'' (NGSI-LD). It facilitates interoperability, integration and federation of information exchange related to spatial infrastructure among geo-distributed deployed IoT entities, different stakeholders, and process domains. This system, in the presented use case, solves the problem of adhoc booking of meetings in real time through semantic discovery of spatial data and metadata related to room occupancy and thus enables optimum utilization of spatial infrastructure in university campuses. Therefore, the proposed system has the capability to save on effort, cost and productivity in institutional spatial management contexts in the longer run and as well provide a new enriched user experience in smart public buildings. Additionally, the system empowers different stakeholders to plan, forecast and fulfill their spatial infrastructure requirements through semantic data search analysis and real time data driven planning. The initial performance results of the system have shown quick response enabled semantic discovery of data and metadata (textless2 seconds mostly). The proposed system would be a steppingstone towards smart management of spatial infrastructure which offers scalability, federation, vendor agnostic ecosystem, seamless interoperability and integration and security by design. The proposed system provides the fundamental work for its extension and potential in relevant spatial domains of the future.
2022-01-31
Mueller, Tobias.  2021.  Let’s Attest! Multi-modal Certificate Exchange for the Web of Trust. 2021 International Conference on Information Networking (ICOIN). :758—763.
On the Internet, trust is difficult to obtain. With the rise of the possibility of obtaining gratis x509 certificates in an automated fashion, the use of TLS for establishing secure connections has significantly increased. However, other use cases, such as end-to-end encrypted messaging, do not yet have an easy method of managing trust in the public keys. This is particularly true for personal communication where two people want to securely exchange messages. While centralised solutions, such as Signal, exist, decentralised and federated protocols lack a way of conveniently and securely exchanging personal certificates. This paper presents a protocol and an implementation for certifying OpenPGP certificates. By offering multiple means of data transport protocols, it achieves robust and resilient certificate exchange between an attestee, the party whose key certificate is to be certified, and an attestor, the party who will express trust in the certificate once seen. The data can be transferred either via the Internet or via proximity-based technologies, i.e. Bluetooth or link-local networking. The former presents a challenge when the parties interested in exchanging certificates are not physically close, because an attacker may tamper with the connection. Our evaluation shows that a passive attacker learns nothing except the publicly visible metadata, e.g. the timings of the transfer while an active attacker can either have success with a very low probability or be detected by the user.
2021-08-31
Natarajan, K, Shaik, Vaheedbasha.  2020.  Transparent Data Encryption: Comparative Analysis and Performance Evaluation of Oracle Databases. 2020 Fifth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN). :137—142.
This Transparent Data Encryption (TDE) can provide enormous benefits to the Relational Databases in the aspects of Data Security, Cryptographic Encryption, and Compliances. For every transaction, the stored data must be decrypted before applying the updates as well as should be encrypted before permanently storing back at the storage level. By adding this extra functionality to the database, the general thinking denotes that the Database (DB) going to hit some performance overhead at the CPU and storage level. However, The Oracle Corporation has adversely claimed that their latest Oracle DB version 19c TDE feature can provide significant improvement in the optimization of CPU and no overhead at the storage level for data processing. Impressively, it is true. the results of this paper prove too. Most interestingly the results also revealed about highly impacted components in the servers which are not yet disclosed in any of the previous research work. This paper completely concentrates on CPU, IO, and RAM performance analysis and identifying the bottlenecks along with possible solutions.
2021-08-12
Jaigirdar, Fariha Tasmin, Rudolph, Carsten, Bain, Chris.  2020.  Prov-IoT: A Security-Aware IoT Provenance Model. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :1360—1367.
A successful application of an Internet of Things (IoT) based network depends on the accurate and successful delivery of a large amount of data collected from numerous sources. However, the highly dynamic nature of IoT network prevents the establishment of clear security perimeters and hampers the understanding of security aspects. Risk assessment in such networks requires good situational awareness with respect to security. Therefore, a comprehensive view of data propagation including information on security controls can improve security analysis and risk assessment in each layer of data propagation in an IoT architecture. Documentation of metadata is already used in data provenance to identify who generates which data, how, and when. However, documentation of security information is not seen as relevant for data provenance graphs. In this paper, we discuss the importance of adding security metadata in a data provenance graph. We propose a novel IoT Provenance model, Prov-IoT, which documents the history of data records considering data processing and aggregation along with security metadata to enable a foundation for trust in data. The model portrays a comprehensive framework and outlines the identification of information to be included in designing a security-aware provenance graph. This can be beneficial for uncovering system fault or intrusion. Also, it can be useful for decision-based systems for security analysis and risk estimation. We design an associated class diagram for the Prov-IoT model. Finally, we use an IoT healthcare example scenario to demonstrate the impact of the proposed model.
2021-08-05
Ramasubramanian, Muthukumaran, Muhammad, Hassan, Gurung, Iksha, Maskey, Manil, Ramachandran, Rahul.  2020.  ES2Vec: Earth Science Metadata Keyword Assignment using Domain-Specific Word Embeddings. 2020 SoutheastCon. :1—6.
Earth science metadata keyword assignment is a challenging problem. Dataset curators select appropriate keywords from the Global Change Master Directory (GCMD) set of keywords. The keywords are integral part of search and discovery of these datasets. Hence, selection of keywords are crucial in increasing the discoverability of datasets. Utilizing machine learning techniques, we provide users with automated keyword suggestions as an improved approach to complement manual selection. We trained a machine learning model that leverages the semantic embedding ability of Word2Vec models to process abstracts and suggest relevant keywords. A user interface tool we built to assist data curators in assignment of such keywords is also described.
Bogatu, Alex, Fernandes, Alvaro A. A., Paton, Norman W., Konstantinou, Nikolaos.  2020.  Dataset Discovery in Data Lakes. 2020 IEEE 36th International Conference on Data Engineering (ICDE). :709—720.
Data analytics stands to benefit from the increasing availability of datasets that are held without their conceptual relationships being explicitly known. When collected, these datasets form a data lake from which, by processes like data wrangling, specific target datasets can be constructed that enable value- adding analytics. Given the potential vastness of such data lakes, the issue arises of how to pull out of the lake those datasets that might contribute to wrangling out a given target. We refer to this as the problem of dataset discovery in data lakes and this paper contributes an effective and efficient solution to it. Our approach uses features of the values in a dataset to construct hash- based indexes that map those features into a uniform distance space. This makes it possible to define similarity distances between features and to take those distances as measurements of relatedness w.r.t. a target table. Given the latter (and exemplar tuples), our approach returns the most related tables in the lake. We provide a detailed description of the approach and report on empirical results for two forms of relatedness (unionability and joinability) comparing them with prior work, where pertinent, and showing significant improvements in all of precision, recall, target coverage, indexing and discovery times.
Alecakir, Huseyin, Kabukcu, Muhammet, Can, Burcu, Sen, Sevil.  2020.  Discovering Inconsistencies between Requested Permissions and Application Metadata by using Deep Learning. 2020 International Conference on Information Security and Cryptology (ISCTURKEY). :56—56.
Android gives us opportunity to extract meaningful information from metadata. From the security point of view, the missing important information in metadata of an application could be a sign of suspicious application, which could be directed for extensive analysis. Especially the usage of dangerous permissions is expected to be explained in app descriptions. The permission-to-description fidelity problem in the literature aims to discover such inconsistencies between the usage of permissions and descriptions. This study proposes a new method based on natural language processing and recurrent neural networks. The effect of user reviews on finding such inconsistencies is also investigated in addition to application descriptions. The experimental results show that high precision is obtained by the proposed solution, and the proposed method could be used for triage of Android applications.
Wang, Xiaowen, Huang, Yan.  2020.  Research on Semantic Based Metadata Method of SWIM Information Service. 2020 IEEE 2nd International Conference on Civil Aviation Safety and Information Technology (ICCASIT. :1121—1125.
Semantic metadata is an important means to promote the integration of information and services and improve the level of search and discovery automation. Aiming at the problems that machine is difficult to handle service metadata description and lack of information metadata description in current SWIM information services, this paper analyzes the methods of metadata sematic empowerment and mainstream semantic metadata standards related to air traffic control system, constructs the SWIM information, and service sematic metadata model based on semantic expansion. The method of semantic metadata model mapping is given from two aspects of service and data, which can be used to improve the level of information sharing and intelligent processing.