Biblio
Filters: Keyword is Big Data [Clear All Filters]
The Analysis and Development of an XAI Process on Feature Contribution Explanation. 2022 IEEE International Conference on Big Data (Big Data). :5039–5048.
.
2022. Explainable Artificial Intelligence (XAI) research focuses on effective explanation techniques to understand and build AI models with trust, reliability, safety, and fairness. Feature importance explanation summarizes feature contributions for end-users to make model decisions. However, XAI methods may produce varied summaries that lead to further analysis to evaluate the consistency across multiple XAI methods on the same model and data set. This paper defines metrics to measure the consistency of feature contribution explanation summaries under feature importance order and saliency map. Driven by these consistency metrics, we develop an XAI process oriented on the XAI criterion of feature importance, which performs a systematical selection of XAI techniques and evaluation of explanation consistency. We demonstrate the process development involving twelve XAI methods on three topics, including a search ranking system, code vulnerability detection and image classification. Our contribution is a practical and systematic process with defined consistency metrics to produce rigorous feature contribution explanations.
Open Source Software Computed Risk Framework. 2022 IEEE 17th International Conference on Computer Sciences and Information Technologies (CSIT). :172–175.
.
2022. The increased dissemination of open source software to a broader audience has led to a proportional increase in the dissemination of vulnerabilities. These vulnerabilities are introduced by developers, some intentionally or negligently. In this paper, we work to quantity the relative risk that a given developer represents to a software project. We propose using empirical software engineering based analysis on the vast data made available by GitHub to create a Developer Risk Score (DRS) for prolific contributors on GitHub. The DRS can then be aggregated across a project as a derived vulnerability assessment, we call this the Computational Vulnerability Assessment Score (CVAS). The CVAS represents the correlation between the Developer Risk score across projects and vulnerabilities attributed to those projects. We believe this to be a contribution in trying to quantity risk introduced by specific developers across open source projects. Both of the risk scores, those for contributors and projects, are derived from an amalgamation of data, both from GitHub and outside GitHub. We seek to provide this risk metric as a force multiplier for the project maintainers that are responsible for reviewing code contributions. We hope this will lead to a reduction in the number of introduced vulnerabilities for projects in the Open Source ecosystem.
ISSN: 2766-3639
A Mapping Study on Privacy Attacks in Big Data and IoT. 2022 13th International Conference on Information and Communication Technology Convergence (ICTC). :1158–1163.
.
2022. Application domains like big data and IoT require a lot of user data collected and analyzed to extract useful information, and those data might include user's sensitive and personal information. Hence, it is strongly required to ensure the privacy of user data before releasing them in the public space. Since the fields of IoT and big data are constantly evolving with new types of privacy attacks and prevention mechanisms, there is an urgent need for new research and surveys to develop an overview of the state-of-art. We conducted a systematic mapping study on selected papers related to user privacy in IoT and big data, published between 2010 to 2021. This study focuses on identifying the main privacy objectives, attacks and measures taken to prevent the attacks in the two application domains. Additionally, a visualized classification of the existing attacks is presented along with privacy metrics to draw similarities and dissimilarities among different attacks.
ISSN: 2162-1241
Star-Bridge: a topological multidimensional subgraph analysis to detect fraudulent nodes and rings in telecom networks. 2022 IEEE International Conference on Big Data (Big Data). :2239–2242.
.
2022. Fraud mechanisms have evolved from isolated actions performed by single individuals to complex criminal networks. This paper aims to contribute to the identification of potentially relevant nodes in fraud networks. Whilst traditional methods for fraud detection rely on identifying abnormal patterns, this paper proposes STARBRIDGE: a new linear and scalable, ranked out, parameter free method to identify fraudulent nodes and rings based on Bridging, Influence and Control metrics. This is applied to the telecommunications domain where fraudulent nodes form a star-bridge-star pattern. Over 75% of nodes involved in fraud denote control, bridging centrality and doubled the influence scores, when compared to non-fraudulent nodes in the same role, stars and bridges being chief positions.
Schedule or Wait: Age-Minimization for IoT Big Data Processing in MEC via Online Learning. IEEE INFOCOM 2022 - IEEE Conference on Computer Communications. :1809–1818.
.
2022. The age of data (AoD) is identified as one of the most novel and important metrics to measure the quality of big data analytics for Internet-of-Things (IoT) applications. Meanwhile, mobile edge computing (MEC) is envisioned as an enabling technology to minimize the AoD of IoT applications by processing the data in edge servers close to IoT devices. In this paper, we study the AoD minimization problem for IoT big data processing in MEC networks. We first propose an exact solution for the problem by formulating it as an Integer Linear Program (ILP). We then propose an efficient heuristic for the offline AoD minimization problem. We also devise an approximation algorithm with a provable approximation ratio for a special case of the problem, by leveraging the parametric rounding technique. We thirdly develop an online learning algorithm with a bounded regret for the online AoD minimization problem under dynamic arrivals of IoT requests and uncertain network delay assumptions, by adopting the Multi-Armed Bandit (MAB) technique. We finally evaluate the performance of the proposed algorithms by extensive simulations and implementations in a real test-bed. Results show that the proposed algorithms outperform existing approaches by reducing the AoD around 10%.
ISSN: 2641-9874
The Impact of Big Data Analytics on Traffic Prediction. 2022 International Conference on Advanced Aspects of Software Engineering (ICAASE). :1–6.
.
2022. The Internet of Vehicles (IoVs) performs the rapid expansion of connected devices. This massive number of devices is constantly generating a massive and near-real-time data stream for numerous applications, which is known as big data. Analyzing such big data to find, predict, and control decisions is a critical solution for IoVs to enhance service quality and experience. Thus, the main goal of this paper is to study the impact of big data analytics on traffic prediction in IoVs. In which we have used big data analytics steps to predict the traffic flow, and based on different deep neural models such as LSTM, CNN-LSTM, and GRU. The models are validated using evaluation metrics, MAE, MSE, RMSE, and R2. Hence, a case study based on a real-world road is used to implement and test the efficiency of the traffic prediction models.
Construction of Intelligent Emergency Response Technology System Based on Big Data Technology. 2022 International Conference on Big Data, Information and Computer Network (BDICN). :59–62.
.
2022. This paper analyzes the problems existing in the existing emergency management technology system in China from various perspectives, and designs the construction of intelligent emergency system in combination with the development of new generation of Internet of Things, big data, cloud computing and artificial intelligence technology. The overall design is based on scientific and technological innovation to lead the reform of emergency management mechanism and process reengineering to build an intelligent emergency technology system characterized by "holographic monitoring, early warning, intelligent research and accurate disposal". To build an intelligent emergency management system that integrates intelligent monitoring and early warning, intelligent emergency disposal, efficient rehabilitation, improvement of emergency standards, safety and operation and maintenance construction.
Research on Network Security Protection System Based on Computer Big Data Era. 2022 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS). :1487–1490.
.
2022. This paper designs a network security protection system based on artificial intelligence technology from two aspects of hardware and software. The system can simultaneously collect Internet public data and secret-related data inside the unit, and encrypt it through the TCM chip solidified in the hardware to ensure that only designated machines can read secret-related materials. The data edge-cloud collaborative acquisition architecture based on chip encryption can realize the cross-network transmission of confidential data. At the same time, this paper proposes an edge-cloud collaborative information security protection method for industrial control systems by combining end-address hopping and load balancing algorithms. Finally, using WinCC, Unity3D, MySQL and other development environments comprehensively, the feasibility and effectiveness of the system are verified by experiments.
Privacy security protection based on data life cycle. 2022 World Automation Congress (WAC). :433–436.
.
2022. Large capacity, fast-paced, diversified and high-value data are becoming a hotbed of data processing and research. Privacy security protection based on data life cycle is a method to protect privacy. It is used to protect the confidentiality, integrity and availability of personal data and prevent unauthorized access or use. The main advantage of using this method is that it can fully control all aspects related to the information system and its users. With the opening of the cloud, attackers use the cloud to recalculate and analyze big data that may infringe on others' privacy. Privacy protection based on data life cycle is a means of privacy protection based on the whole process of data production, collection, storage and use. This approach involves all stages from the creation of personal information by individuals (e.g. by filling out forms online or at work) to destruction after use for the intended purpose (e.g. deleting records). Privacy security based on the data life cycle ensures that any personal information collected is used only for the purpose of initial collection and destroyed as soon as possible.
ISSN: 2154-4824
A Demo of a Software Platform for Ubiquitous Big Data Engineering, Visualization, and Analytics, via Reconfigurable Micro-Services, in Smart Factories. 2022 IEEE International Conference on Smart Computing (SMARTCOMP). :1–3.
.
2022. Intelligent, smart, Cloud, reconfigurable manufac-turing, and remote monitoring, all intersect in modern industry and mark the path toward more efficient, effective, and sustain-able factories. Many obstacles are found along the path, including legacy machineries and technologies, security issues, and software that is often hard, slow, and expensive to adapt to face unforeseen challenges and needs in this fast-changing ecosystem. Light-weight, portable, loosely coupled, easily monitored, variegated software components, supporting Edge, Fog and Cloud computing, that can be (re)created, (re)configured and operated from remote through Web requests in a matter of milliseconds, and that rely on libraries of ready-to-use tasks also extendable from remote through sub-second Web requests, constitute a fertile technological ground on top of which fourth-generation industries can be built. In this demo it will be shown how starting from a completely virgin Docker Engine, it is possible to build, configure, destroy, rebuild, operate, exclusively from remote, exclusively via API calls, computation networks that are capable to (i) raise alerts based on configured thresholds or trained ML models, (ii) transform Big Data streams, (iii) produce and persist Big Datasets on the Cloud, (iv) train and persist ML models on the Cloud, (v) use trained models for one-shot or stream predictions, (vi) produce tabular visualizations, line plots, pie charts, histograms, at real-time, from Big Data streams. Also, it will be shown how easily such computation networks can be upgraded with new functionalities at real-time, from remote, via API calls.
ISSN: 2693-8340
A Step Towards Improvement in Classical Honeypot Security System. 2022 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COM-IT-CON). 1:720–725.
.
2022. Data security is a vast term that doesn’t have any limits, but there are a certain amount of tools and techniques that could help in gaining security. Honeypot is among one of the tools that are designated and designed to protect the security of a network but in a very dissimilar manner. It is a system that is designed and developed to be compromised and exploited. Honeypots are meant to lure the invaders, but due to advancements in computing systems parallelly, the intruding technologies are also attaining their gigantic influence. In this research work, an approach involving apache-spark (a Big Data Technique) would be introduced in order to use it with the Honeypot System. This work includes an extensive study based on several research papers, through which elaborated experiment-based result has been expressed on the best known open-source honeypot systems. The preeminent possible method of using The Honeypot with apache spark in the sequential channel would also be proposed with the help of a framework diagram.
Research on Communication Network Security Detection System based on Computer Big Data. 2022 IEEE International Conference on Advances in Electrical Engineering and Computer Applications (AEECA). :273–276.
.
2022. With the development of information networks, cloud computing, big data, and virtualization technologies promote the emergence of various new network applications to meet the needs of various Internet services. A security protection system for virtual host in cloud computing center is proposed in the article. The system takes “security as a service” as the starting point, takes virtual machines as the core, and takes virtual machine clusters as the unit to provide unified security protection against the borderless characteristics of virtualized computing. The thesis builds a network security protection system for APT attacks; uses the system dynamics method to establish a system capability model, and conducts simulation analysis. The simulation results prove the validity and rationality of the network communication security system framework and modeling analysis method proposed in the thesis. Compared with traditional methods, this method has more comprehensive modeling and analysis elements, and the deduced results are more instructive.
Data Integrity Auditing Scheme for Preserving Security in Cloud based Big Data. 2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS). :609–613.
.
2022. Cloud computing has become an integral part of medical big data. The cloud has the capability to store the large data volumes has attracted more attention. The integrity and privacy of patient data are some of the issues that cloud-based medical big data should be addressed. This research work introduces data integrity auditing scheme for cloud-based medical big data. This will help minimize the risk of unauthorized access to the data. Multiple copies of the data are stored to ensure that it can be recovered quickly in case of damage. This scheme can also be used to enable doctors to easily track the changes in patients' conditions through a data block. The simulation results proved the effectiveness of the proposed scheme.
ISSN: 2768-5330
Securing Big Data: A Survey on Security Solutions. 2022 13th International Conference on Electrical Engineering (ICEENG). :145–149.
.
2022. Big Data (BD) is the combination of several technologies which address the gathering, analyzing and storing of massive heterogeneous data. The tremendous spurt of the Internet of Things (IoT) and different technologies are the fundamental incentive behind this enduring development. Moreover, the analysis of this data requires high-performance servers for advanced and parallel data analytics. Thus, data owners with their limited capabilities may outsource their data to a powerful but untrusted environment, i.e., the Cloud. Furthermore, data analytic techniques performed on external cloud may arise various security intimidations regarding the confidentiality and the integrity of the aforementioned; transferred, analyzed, and stored data. To countermeasure these security issues and challenges, several techniques have been addressed. This survey paper aims to summarize and emphasize the security threats within Big Data framework, in addition, it is worth mentioning research work related to Big Data Analytics (BDA).
Analysis of a Joint Data Security Architecture Integrating Artificial Intelligence and Cloud Computing in the Era of Big Data. 2022 4th International Conference on Smart Systems and Inventive Technology (ICSSIT). :988–991.
.
2022. This article analyzes the analysis of the joint data security architecture that integrates artificial intelligence and cloud computing in the era of big data. The article discusses and analyzes the integrated applications of big data, artificial intelligence and cloud computing. As an important part of big data security protection, joint data security Protecting the technical architecture is not only related to the security of joint data in the big data era, but also has an important impact on the overall development of the data era. Based on this, the thesis takes the big data security and joint data security protection technical architecture as the research content, and through a simple explanation of big data security, it then conducts detailed research on the big data security and joint data security protection technical architecture from five aspects and thinking.
The legal debate about personal data privacy at a time of big data mining and searching: Making big data researchers cooperating with lawmakers to find solutions for the future. 2016 First IEEE International Conference on Computer Communication and the Internet (ICCCI). :354–357.
.
2016. At the same time as Big Data technologies are being constantly refined, the legislation relating to data privacy is changing. The invalidation by the Court of Justice of the European Union on October 6, 2015, of the agreement known as “Safe Harbor”, negotiated by the European Commission on behalf of the European Union with the United States has two consequences. The first is to announce its replacement by a new, still fragile, program, the “Privacy Shield”, which isn't yet definitive and which could also later be repealed by the Court of Justice of the European Union. For example, we are expecting to hear the opinion in mid-April 2016 of the group of data protection authorities for the various states of the European Union, known as G29. The second is to mobilize the Big Data community to take control of the question of data privacy management and to put in place an adequate internal program.
Privacy-Aware Big Data Warehouse Architecture. 2016 IEEE International Congress on Big Data (BigData Congress). :341–344.
.
2016. Along with the ever increasing growth in data collection and its mining, there is an increasing fear of compromising individual and population privacy. Several techniques have been proposed in literature to preserve privacy of collected data while storing and processing. In this paper, we propose a privacy-aware architecture for storing and processing data in a Big Data warehouse. In particular, we propose a flexible, extendable, and adaptable architecture that enforces user specified privacy requirements in the form of Embedded Privacy Agreements. The paper discusses the details of the architecture with some implementation details.
Big Data Privacy Based on Differential Privacy a Hope for Big Data. 2014 International Conference on Computational Intelligence and Communication Networks. :776–781.
.
2014. In era of information age, due to different electronic, information & communication technology devices and process like sensors, cloud, individual archives, social networks, internet activities and enterprise data are growing exponentially. The most challenging issues are how to effectively manage these large and different type of data. Big data is one of the term named for this large and different type of data. Due to its extraordinary scale, privacy and security is one of the critical challenge of big data. At the every stage of managing the big data there are chances that privacy may be disclose. Many techniques have been suggested and implemented for privacy preservation of large data set like anonymization based, encryption based and others but unfortunately due to different characteristic (large volume, high speed, and unstructured data) of big data all these techniques are not fully suitable. In this paper we have deeply analyzed, discussed and suggested how an existing approach "differential privacy" is suitable for big data. Initially we have discussed about differential privacy and later analyze how it is suitable for big data.
Research on Enterprise Information Security and Privacy Protection in Big Data Environment. 2021 3rd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI). :324–327.
.
2021. With the development of information technology, extracting important data that people need from the vast information has become the key to a successful era. Therefore, big data technology is increasingly recognized by the public. While creating a lot of commercial value for enterprises, it also brings huge challenges to information security and privacy. In the big data environment, data has become an important medium for corporate decision-making, and information security and privacy protection have become the “army battleground” in corporate competition. Therefore, information security and privacy protection are getting more and more attention from enterprises, which also determines whether enterprises can occupy a place in the fiercely competitive market. This article analyzes the information security and privacy protection issues of enterprises in the big data environment from three aspects. Starting from the importance and significance of big data protection, it analyzes the security and privacy issues of big data in enterprise applications, and finally conducts information security and privacy protection for enterprises. Privacy protection puts forward relevant suggestions.
Privacy, consumer trust and big data: Privacy by design and the 3 C'S. 2015 ITU Kaleidoscope: Trust in the Information Society (K-2015). :1–5.
.
2015. The growth of ICTs and the resulting data explosion could pave the way for the surveillance of our lives and diminish our democratic freedoms, at an unimaginable scale. Consumer mistrust of an organization's ability to safeguard their data is at an all time high and this has negative implications for Big Data. The timing is right to be proactive about designing privacy into technologies, business processes and networked infrastructures. Inclusiveness of all objectives can be achieved through consultation, co-operation, and collaboration (3 C's). If privacy is the default, without diminishing functionality or other legitimate interests, then trust will be preserved and innovation will flourish.
Privacy-Preserving Big Data Stream Mining: Opportunities, Challenges, Directions. 2017 IEEE International Conference on Data Mining Workshops (ICDMW). :992–994.
.
2017. This paper explores recent achievements and novel challenges of the annoying privacy-preserving big data stream mining problem, which consists in applying mining algorithms to big data streams while ensuring the privacy of data. Recently, the emerging big data analytics context has conferred a new light to this exciting research area. This paper follows the so-depicted research trend.
ISSN: 2375-9259
Privacy-Preserving Big Data Exchange: Models, Issues, Future Research Directions. 2021 IEEE International Conference on Big Data (Big Data). :5081–5084.
.
2021. Big data exchange is an emerging problem in the context of big data management and analytics. In big data exchange, multiple entities exchange big datasets beyond the common data integration or data sharing paradigms, mostly in the context of data federation architectures. How to make big data exchange while ensuring privacy preservation constraintsƒ The latter is a critical research challenge that is gaining momentum on the research community, especially due to the wide family of application scenarios where it plays a critical role (e.g., social networks, bio-informatics tools, smart cities systems and applications, and so forth). Inspired by these considerations, in this paper we provide an overview of models and issues in the context of privacy-preserving big data exchange research, along with a selection of future research directions that will play a critical role in next-generation research.
Privacy Preserving Big Data Publishing. 2018 International Congress on Big Data, Deep Learning and Fighting Cyber Terrorism (IBIGDELFT). :24–29.
.
2018. In order to gain more benefits from big data, they must be shared, published, analyzed and processed without having any harm or facing any violation and finally get better values from these analytics. The literature reports that this analytics brings an issue of privacy violations. This issue is also protected by law and bring fines to the companies, institutions or individuals. As a result, data collectors avoid to publish or share their big data due to these concerns. In order to obtain plausible solutions, there are a number of techniques to reduce privacy risks and to enable publishing big data while preserving privacy at the same time. These are known as privacy-preserving big data publishing (PPBDP) models. This study presents the privacy problem in big data, evaluates big data components from privacy perspective, privacy risks and protection methods in big data publishing, and reviews existing privacy-preserving big data publishing approaches and anonymization methods in literature. The results were finally evaluated and discussed, and new suggestions were presented.
Property Graph Access Control Using View-Based and Query-Rewriting Approaches. 2022 IEEE/ACS 19th International Conference on Computer Systems and Applications (AICCSA). :1–2.
.
2022. Managing and storing big data is non-trivial for traditional relational databases (RDBMS). Therefore, the NoSQL (Not Only SQL) database management system emerged. It is ca-pable of handling the vast amount and the heterogeneity of data. In this research, we are interested in one of its trending types, the graph database, namely, the Directed Property Graph (DPG). This type of database is powerful in dealing with complex relationships (\$\textbackslashmathrme.\textbackslashmathrmg\$., social networks). However, its sen-sitive and private data must be protected against unauthorized access. This research proposes a security model that aims at exploiting and combining the benefits of Access Control, View-Based, and Query-Rewriting approaches. This is a novel combination for securing DPG.
ISSN: 2161-5330
Critical Data Security Model: Gap Security Identification and Risk Analysis In Financial Sector. 2022 17th Iberian Conference on Information Systems and Technologies (CISTI). :1–6.
.
2022. In this paper, we proposed a data security model of a big data analytical environment in the financial sector. Big Data can be seen as a trend in the advancement of technology that has opened the door to a new approach to understanding and decision making that is used to describe the vast amount of data (structured, unstructured and semi-structured) that is too time consuming and costly to load a relational database for analysis. The increase in cybercriminal attacks on an organization’s assets results in organizations beginning to invest in and care more about their cybersecurity points and controls. The management of business-critical data is an important point for which robust cybersecurity controls should be considered. The proposed model is applied in a datalake and allows the identification of security gaps on an analytical repository, a cybersecurity risk analysis, design of security components and an assessment of inherent risks on high criticality data in a repository of a regulated financial institution. The proposal was validated in financial entities in Lima, Peru. Proofs of concept of the model were carried out to measure the level of maturity focused on: leadership and commitment, risk management, protection control, event detection and risk management. Preliminary results allowed placing the entities in level 3 of the model, knowing their greatest weaknesses, strengths and how these can affect the fulfillment of business objectives.
ISSN: 2166-0727