A Scalable Meta-Model for Big Data Security Analyses
Title | A Scalable Meta-Model for Big Data Security Analyses |
Publication Type | Conference Paper |
Year of Publication | 2016 |
Authors | Yang, B., Zhang, T. |
Conference Name | 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS) |
Date Published | April 2016 |
Publisher | IEEE |
ISBN Number | 978-1-5090-2403-2 |
Keywords | Big Data, Big Data security analysis, Conferences, Data models, learning (artificial intelligence), linear regression, linear regression models, linear regressions, machine learning algorithms, matrix algebra, Meta-Model, meta-model matrix, meta-model sufficient statistics, network anomaly detection, Predictive models, pubcrawl, regression analysis, Scalability, scalable meta-model, Scalable Security, security, security analyses, security of data, statistical data models, sufficient statistics, Training data |
Abstract | This paper proposes a highly scalable framework that can be applied to detect network anomalies at the per-flow level by constructing a meta-model for a family of machine learning algorithms or statistical data models. The approach is scalable and attainable because the raw data needs to be accessed only once: it is processed, computed, and transformed into a much smaller meta-model matrix that can reside in system RAM. The meta-model matrix can be calculated through disposable updating operations at the per-row level: once a flow's information has been processed, it is no longer needed for calculating the meta-model matrix. While the proposed framework covers both Gaussian and non-Gaussian data, the focus of this work is on linear regression models. Specifically, a new concept called meta-model sufficient statistics is proposed to analyze a group of models, for which exact, not approximate, results are derived. In addition, the proposed framework can quickly discover an optimal statistical or computer model from a family of candidate models without rescanning the raw dataset. This suggests that an extremely efficient and effective theory and method is possible for big data security analysis. |
URL | https://ieeexplore.ieee.org/document/7502264/ |
DOI | 10.1109/BigDataSecurity-HPSC-IDS.2016.71 |
Citation Key | yang_scalable_2016 |
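The abstract describes a single pass over the raw data that folds each per-flow row into a small in-memory matrix, after which a family of linear regression models can be fitted exactly without rescanning. The abstract does not give the paper's construction, so the following is only a minimal sketch under an assumption: that the meta-model matrix can be illustrated by the cross-product matrix of the augmented row [1, x, y], which is the classical sufficient statistic for ordinary least squares. The function names (`update`, `solve`, `fit`), the sample flows, and the use of residual sum of squares to compare candidates are all hypothetical choices made for illustration.

```python
from itertools import combinations


def update(M, x, y):
    """Fold one row into the (p+2)x(p+2) cross-product matrix; the row can then be discarded."""
    z = [1.0, *x, y]  # augmented row: intercept, features, response
    for i, zi in enumerate(z):
        for j, zj in enumerate(z):
            M[i][j] += zi * zj


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    w = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][k] * w[k] for k in range(r + 1, n))
        w[r] = (aug[r][n] - s) / aug[r][r]
    return w


def fit(M, features):
    """Fit y ~ intercept + chosen features using only M; return (coefficients, rss)."""
    idx = [0] + [f + 1 for f in features]  # row/column 0 is the intercept
    y = len(M) - 1                         # last row/column is the response
    A = [[M[i][j] for j in idx] for i in idx]  # X'X block for this sub-model
    b = [M[i][y] for i in idx]                 # X'y block for this sub-model
    w = solve(A, b)
    rss = M[y][y] - sum(wi * bi for wi, bi in zip(w, b))  # y'y - w'X'y
    return w, rss


# Hypothetical per-flow rows (features, response), generated from y = 1 + 3*x1 + 2*x2.
flows = [([1.0, 2.0], 8.0), ([2.0, 1.0], 9.0), ([3.0, 5.0], 20.0),
         ([4.0, 3.0], 19.0), ([5.0, 4.0], 24.0)]

p = 2
M = [[0.0] * (p + 2) for _ in range(p + 2)]
for x, y in flows:  # the only pass over the raw data
    update(M, x, y)

# Every non-empty feature subset is fitted from M alone, with no rescan.
candidates = [c for k in range(1, p + 1) for c in combinations(range(p), k)]
results = {c: fit(M, c) for c in candidates}
best = min(results, key=lambda c: results[c][1])
```

In this sketch the matrix has (p+2)² entries regardless of how many rows are scanned, which is why it can stay in memory. Comparing candidates by raw residual sum of squares always favors the largest model; a penalized criterion would be needed in practice, and the abstract does not say which criterion the paper uses.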