Biblio
Filters: Keyword is Iterative algorithms
Towards Unveiling Dark Web Structured Data. 2021 IEEE International Conference on Big Data (Big Data). :5275–5282.
2021. Anecdotal evidence suggests that Web-search engines, together with knowledge graphs and bases such as YAGO [46], DBPedia [13], Freebase [16], and the Google Knowledge Graph [52], provide rapid access to most structured information on the Web. However, a closer look reveals a so-called "knowledge gap" [18] that remains largely in the dark. For example, a person searching for a relevant job opening has to spend at least 3 hours per week for several months [2] just searching job postings on numerous online job-search engines and employer websites. The reason this seemingly simple task cannot be completed by typing a few keyword queries into a search engine and getting all relevant results in seconds rather than hours is that access to structured data on the Web is still rudimentary. While searching for a job we have many parameters in mind: not just the job title but usually also location, salary range, a remote-work option (given the recent shift to hybrid workplaces), and many others. Ideally, we would like to write a SQL-style query selecting all job postings that satisfy our requirements, but this is currently impossible, because job postings (and all other Web tables) are structured in many different ways and scattered all over the Web. There is neither a Web-scale generalizable algorithm nor a system to locate and normalize all relevant tables in a category of interest from millions of sources. Here we describe a new scalable iterative training-data generation algorithm, evaluated on a corpus of hundreds of millions of Web tables [39], that produces the high-quality training data required to train deep- and machine-learning models capable of generalizing to Web scale. Models trained on such enriched training data deal efficiently with Web-scale heterogeneity, in contrast to the poor generalization of models trained without enrichment [20], [25], [38]. Such models are instrumental in bridging the knowledge gap for structured data on the Web.
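The abstract does not spell out the algorithm, but an iterative training-data generation loop is often self-training style: a model fit on a small seed set labels an unlabeled pool, and only high-confidence predictions are promoted back into the training set. A minimal sketch under that assumption (the keyword-overlap "model" and all names are illustrative, not from the paper):

```python
# Hypothetical iterative training-data enrichment loop (self-training style).
# A toy "model" scores unlabeled examples; confident ones are added back.

def score(example, keywords):
    """Toy model: fraction of trusted keywords present in the example."""
    words = set(example.lower().split())
    return len(words & keywords) / max(len(keywords), 1)

def enrich_training_data(seed, pool, rounds=3, threshold=0.5):
    labeled = list(seed)          # (text, label) pairs
    remaining = list(pool)        # unlabeled texts
    for _ in range(rounds):
        # "Train": collect keywords from positively labeled examples.
        keywords = {w for text, label in labeled if label
                    for w in text.lower().split()}
        confident, rest = [], []
        for text in remaining:
            (confident if score(text, keywords) >= threshold else rest).append(text)
        # Promote confident predictions into the training set.
        labeled += [(text, True) for text in confident]
        remaining = rest
        if not confident:         # fixed point reached: stop early
            break
    return labeled, remaining

seed = [("software engineer remote salary", True),
        ("cooking recipe pasta", False)]
pool = ["senior engineer salary listed", "remote software position",
        "garden tools sale"]
labeled, leftover = enrich_training_data(seed, pool)
```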
MITOS: Optimal Decisioning for the Indirect Flow Propagation Dilemma in Dynamic Information Flow Tracking Systems. 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS). :1090–1100.
2020. Dynamic Information Flow Tracking (DIFT), also called Dynamic Taint Analysis (DTA), is a technique for tracking information as it flows through a program's execution. Specifically, some inputs or data get tainted, and these taint marks (tags) then propagate, usually at the instruction level. While DIFT has been a fundamental concept in computer and network security for the past decade, it still faces open challenges that impede its widespread application in practice, one of them being the indirect-flow propagation dilemma: should the tags involved in an indirect flow, e.g., in a control or address dependency, be propagated? Propagating all these tags, as is done for direct flows, leads to overtainting (all taintable objects become tainted), while not propagating them leads to undertainting (the information flow becomes incomplete). In this paper, we analytically model this decisioning problem for indirect flows, considering various tradeoffs including undertainting versus overtainting and the importance of heterogeneous code semantics and context. To tackle this problem, we design MITOS, a distributed-optimization algorithm that decides whether to propagate indirect flows by properly weighting all these tradeoffs; it has low complexity, is scalable, and can flexibly adapt to different application scenarios and security needs of large distributed systems. Additionally, MITOS is applicable to most DIFT systems that consider an arbitrary number of tag types, and it introduces the key properties of fairness and tag-balancing to the DIFT field. To demonstrate MITOS's applicability in practice, we implement and evaluate MITOS on top of an open-source DIFT system and shed light on the open problem.
We also perform a case study with a real in-memory-only attack and show that, compared to traditional DIFT, MITOS simultaneously improves (i) the system's spatiotemporal overhead (by up to 40%) and (ii) the system's fingerprint on suspected bytes (by up to 167%), even though these metrics usually conflict.
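The dilemma is easy to see in miniature. Below, direct flows always propagate tags, while indirect flows go through a weighted decision, loosely mirroring the tradeoff MITOS optimizes; the importance score and threshold here are made-up illustrations, not the paper's model:

```python
# Minimal taint-propagation sketch. Direct assignments always propagate
# tags; indirect (e.g. control-dependent) assignments propagate only if
# a weighted importance score clears a threshold, trading undertainting
# against overtainting. Weights and threshold are illustrative.

tags = {}  # variable name -> set of taint tags

def assign(dst, srcs):
    """Direct flow: dst receives the union of its sources' tags."""
    tags[dst] = set().union(*(tags.get(s, set()) for s in srcs))

def indirect_assign(dst, srcs, importance, threshold=0.5):
    """Indirect flow: propagate only when the dependency's importance
    (a context/semantics score in [0, 1]) exceeds the threshold."""
    if importance >= threshold:
        assign(dst, srcs)
    else:
        tags.setdefault(dst, set())

tags["user_input"] = {"T1"}
assign("x", ["user_input"])                   # direct: x inherits T1
indirect_assign("y", ["x"], importance=0.8)   # important branch: propagate
indirect_assign("z", ["x"], importance=0.2)   # low importance: suppress
```

Propagating every indirect flow (threshold 0) reproduces overtainting; suppressing them all (threshold above 1) reproduces undertainting.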
Attribution Based Approach for Adversarial Example Generation. SoutheastCon 2021. :1–6.
2021. Neural networks with deep architectures have been used to construct state-of-the-art classifiers that can match human-level accuracy in areas such as image classification. However, many of these classifiers can be fooled by examples slightly modified from their original forms. In this work, we propose a novel approach for generating adversarial examples that uses only the attribution information of the features and perturbs only those features that are highly influential on the classifier's output. We call this approach Attribution Based Adversarial Generation (ABAG). To demonstrate its effectiveness, three somewhat arbitrary algorithms are proposed and examined. In the first algorithm, all non-zero attributions are utilized and the associated features perturbed; in the second, only the top-n most positive and top-n most negative attributions are used and the corresponding features perturbed; and in the third, the level of perturbation is increased iteratively until an adversarial example is discovered. All three algorithms are implemented, and experiments are performed on the well-known MNIST dataset. The results show that adversarial examples can be generated very efficiently, proving the validity and efficacy of ABAG: utilizing attributions for the generation of adversarial examples. Furthermore, as shown by examples, ABAG can be adapted to provide a systematic search approach that generates adversarial examples by perturbing a minimal number of features.
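The third (iterative) variant can be sketched in a few lines. The linear "classifier" and gradient-times-input attribution below are toy stand-ins, not the paper's models; only the top-n features by attribution magnitude are perturbed, and the perturbation grows until the decision flips:

```python
# Sketch of an iterative attribution-guided attack in the spirit of ABAG.
# The linear model and attribution method are illustrative stand-ins.

def classify(x, w, b=0.0):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def attributions(x, w):
    # For a linear model, gradient * input is a common attribution choice.
    return [wi * xi for wi, xi in zip(w, x)]

def abag_iterative(x, w, top_n=2, step=0.1, max_iters=100):
    a = attributions(x, w)
    # Indices of the top-n features by absolute attribution.
    top = sorted(range(len(x)), key=lambda i: -abs(a[i]))[:top_n]
    original = classify(x, w)
    adv = list(x)
    for _ in range(max_iters):
        for i in top:
            # Push each influential feature against its attribution sign.
            adv[i] -= step * (1 if a[i] > 0 else -1)
        if classify(adv, w) != original:
            return adv
    return None  # no adversarial example found within the budget

x = [1.0, 0.5, -0.2]
w = [0.8, 0.3, 0.1]
adv = abag_iterative(x, w)
```

Because only high-attribution features move, the example stays close to the original in all remaining coordinates.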
A Review of Reconstruction Algorithms in Compressive Sensing. 2020 International Conference on Advances in Computing, Communication & Materials (ICACCM). :322–325.
2020. Compressive Sensing (CS) is a promising technology for signal acquisition. CS reduces the number of measurements needed to obtain signals that are sparse or compressible in some basis. The sparse or compressible nature of a signal can be exposed by transforming it into a suitable domain. Depending on the signal's sparsity, CS samples signals below the Nyquist rate. An optimization problem then needs to be solved to recover the original signal. Relatively few studies have surveyed this reconstruction step; therefore, in this paper, the reconstruction algorithms for sparse signal recovery in CS are presented systematically. The discussion of the various reconstruction algorithms in this paper will help readers understand these algorithms efficiently.
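One of the iterative families such a survey typically covers is Iterative Hard Thresholding (IHT): repeat a gradient step toward the measurements, then keep only the s largest-magnitude coefficients. A self-contained toy sketch (the 2x3 measurement matrix and signal are made up; real code would use numpy/scipy):

```python
# Toy Iterative Hard Thresholding (IHT) recovery of a 1-sparse signal
# from 2 measurements. Pure-Python helpers keep the sketch dependency-free.

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matvec_t(A, y):
    return [sum(A[i][j] * y[i] for i in range(len(A)))
            for j in range(len(A[0]))]

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries, zero the rest."""
    keep = sorted(range(len(x)), key=lambda i: -abs(x[i]))[:s]
    return [xi if i in keep else 0.0 for i, xi in enumerate(x)]

def iht(A, y, s, step=1.0, iters=50):
    x = [0.0] * len(A[0])
    for _ in range(iters):
        r = [yi - zi for yi, zi in zip(y, matvec(A, x))]  # residual
        g = matvec_t(A, r)                                # gradient step
        x = hard_threshold([xi + step * gi for xi, gi in zip(x, g)], s)
    return x

A = [[1.0, 0.0, 0.5],
     [0.0, 1.0, 0.5]]      # 2 measurements of a length-3 signal
x_true = [1.0, 0.0, 0.0]   # 1-sparse ground truth
y = matvec(A, x_true)      # sampled below the "Nyquist" count of 3
x_hat = iht(A, y, s=1)
```

Greedy methods (OMP, CoSaMP) and convex relaxations (basis pursuit) follow the same pattern of iterating until the residual is explained by a sparse support.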
A Hole in the Ladder: Interleaved Variables in Iterative Conditional Branching. 2020 IEEE 27th Symposium on Computer Arithmetic (ARITH). :56–63.
2020. Modular exponentiation is crucial to the RSA cryptographic protocol, and variants inspired by the Montgomery ladder have been studied to provide more secure algorithms. In this paper, we abstract away the iterative conditional branching used in the Montgomery ladder and formalize the systems of equations necessary to obtain what we call the semi-interleaved and fully-interleaved ladder properties. In particular, we design fault-injection attacks able to obtain bits of the secret against semi-interleaved ladders, including the Montgomery ladder, but not against the more secure fully-interleaved ladders. We also apply these equations to extend the Montgomery ladder for both the semi- and fully-interleaved cases, thus proposing novel and more secure algorithms for computing the modular exponentiation.
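For reference, the classic Montgomery ladder that the paper abstracts from looks like this; both branches of the iterative conditional perform one multiply and one square, which is what makes the ladder attractive against simple side-channel analysis (the paper's interleaved variants and fault analysis are not reproduced here):

```python
# Montgomery ladder for modular exponentiation. The two registers r0, r1
# maintain the invariant r1 == r0 * base (mod modulus) at every step,
# and each branch does the same operation count.

def montgomery_ladder(base, exponent, modulus):
    r0, r1 = 1, base % modulus
    for bit in bin(exponent)[2:]:      # scan exponent bits MSB -> LSB
        if bit == '0':
            r1 = (r0 * r1) % modulus
            r0 = (r0 * r0) % modulus
        else:
            r0 = (r0 * r1) % modulus
            r1 = (r1 * r1) % modulus
    return r0
```

A fault injected into one register of this (semi-interleaved) ladder perturbs the invariant in a bit-dependent way, which is the kind of leakage the paper's attacks exploit.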
Traffic Off-Loading over Uncertain Shared Spectrums with End-to-End Session Guarantee. 2020 IEEE 92nd Vehicular Technology Conference (VTC2020-Fall). :1–5.
2020. As a promising solution to spectrum shortage, spectrum sharing has received tremendous interest recently. However, under the different sharing policies of different licensees, the shared spectrum is heterogeneous both temporally and spatially, and it is usually uncertain due to the unpredictable activities of incumbent users. In this paper, considering this spectrum uncertainty, we propose a spectrum-sharing-based delay-tolerant traffic off-loading (SDTO) scheme. To capture the available heterogeneous shared bands, we adopt a mesh cognitive radio network and employ the multi-hop transmission mode. To statistically guarantee the end-to-end (E2E) session request under the uncertain spectrum supply, we formulate the SDTO scheme as a stochastic optimization problem, which is transformed into a mixed-integer nonlinear programming (MINLP) problem. Then, a coarse-fine-search-based iterative heuristic algorithm is proposed to solve the MINLP problem. Simulation results demonstrate that the proposed SDTO scheme can schedule the network resources well while providing an E2E session guarantee.
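The coarse-fine iterative heuristic named above follows a generic pattern: evaluate a coarse grid of candidates, then repeatedly zoom in around the best one. The paper's actual objective and constraints form a MINLP; the 1-D toy objective below is our own illustration of the search pattern only:

```python
# Generic coarse-then-fine iterative search over a 1-D parameter.
# Each round evaluates a coarse grid, then narrows the interval
# around the best candidate for the next (finer) round.

def coarse_fine_search(f, lo, hi, coarse_points=10, rounds=4):
    best = None
    for _ in range(rounds):
        step = (hi - lo) / coarse_points
        candidates = [lo + i * step for i in range(coarse_points + 1)]
        best = min(candidates, key=f)          # coarse pass
        lo, hi = best - step, best + step      # zoom in for the fine pass
    return best

# Toy objective with its minimum at x = 3.2 (illustrative only).
f = lambda x: (x - 3.2) ** 2
x_star = coarse_fine_search(f, 0.0, 10.0)
```

Each round shrinks the search interval by a factor of coarse_points/2, so precision improves geometrically with a fixed evaluation budget per round.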