Bibliography
Oblivious linear-function evaluation (OLE) is a secure two-party protocol allowing a receiver to learn any linear combination of a pair of field elements held by a sender. OLE serves as a common building block for secure computation of arithmetic circuits, analogously to the role of oblivious transfer (OT) for boolean circuits. A useful extension of OLE is vector OLE (VOLE), allowing the receiver to learn any linear combination of two vectors held by the sender. In several applications of OLE, one can replace a large number of instances of OLE by a smaller number of instances of VOLE. This motivates the goal of amortizing the cost of generating long instances of VOLE. We suggest a new approach for fast generation of pseudo-random instances of VOLE via a deterministic local expansion of a pair of short correlated seeds and no interaction. This provides the first example of compressing a non-trivial and cryptographically useful correlation with good concrete efficiency. Our VOLE generators can be used to enhance the efficiency of a host of cryptographic applications. These include secure arithmetic computation and non-interactive zero-knowledge proofs with reusable preprocessing. Our VOLE generators are based on a novel combination of function secret sharing (FSS) for multi-point functions and linear codes in which decoding is intractable. Their security can be based on variants of the learning parity with noise (LPN) assumption over large fields that resist known attacks. We provide several constructions that offer tradeoffs between different efficiency measures and the underlying intractability assumptions.
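For reference, the OLE and VOLE functionalities described above can be stated concisely (notation ours): the sender holds field elements $a, b$, the receiver holds $x$ and learns only
\[ \mathrm{OLE}\big((a,b),x\big) = a\cdot x + b \in \mathbb{F}, \]
while in the vector variant the sender holds $\mathbf{u}, \mathbf{v} \in \mathbb{F}^n$ and the receiver learns
\[ \mathrm{VOLE}\big((\mathbf{u},\mathbf{v}),x\big) = x\cdot\mathbf{u} + \mathbf{v} \in \mathbb{F}^n, \]
which is why a single long VOLE can replace many parallel OLE instances that share the receiver's input.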
Algebraic natural proofs were recently introduced by Forbes, Shpilka and Volk (Proc. of the 49th Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 653–664, 2017) and independently by Grochow, Kumar, Saks and Saraf (CoRR, abs/1701.01717, 2017) as an attempt to transfer Razborov and Rudich's famous barrier result (J. Comput. Syst. Sci., 55(1): 24–35, 1997) for Boolean circuit complexity to algebraic complexity theory. Razborov and Rudich's barrier result relies on a widely believed assumption, namely, the existence of pseudo-random generators. Unfortunately, there is no known analogous theory of pseudo-randomness in the algebraic setting. Therefore, Forbes et al. use a concept called succinct hitting sets instead. This assumption is related to polynomial identity testing, but it is currently not clear how plausible this assumption is. Forbes et al. are only able to construct succinct hitting sets against rather weak models of arithmetic circuits. Generalized matrix completion is the following problem: given a matrix with affine linear forms as entries, find an assignment to the variables in the linear forms such that the rank of the resulting matrix is minimal. We call this rank the completion rank. Computing the completion rank is an NP-hard problem. As our first main result, we prove that it is also NP-hard to determine whether a given matrix can be approximated by matrices of completion rank $\leq b$. The minimum quantity $b$ for which this is possible is called the border completion rank (similar to the border rank of tensors). Naturally, algebraic natural proofs can only prove lower bounds for such border complexity measures. Furthermore, these border complexity measures play an important role in the geometric complexity theory program. Using our hardness result above, we can prove the following barrier: we construct a small family of matrices with affine linear forms as entries and a bound $b$, such that at least one of these matrices does not have an algebraic natural proof of polynomial size against all matrices of border completion rank $b$, unless coNP $\subseteq \exists$BPP. This is an algebraic barrier result that is based on a well-established and widely believed conjecture. The complexity class $\exists$BPP is known to be a subset of the better-known complexity class MA. Thus $\exists$BPP can be replaced by MA in the statements of all our results. With similar techniques, we can also prove that tensor rank is hard to approximate. Furthermore, we prove a similar result for the variety of matrices with permanent zero. There are no algebraic natural proofs of polynomial size for the variety of matrices with permanent zero, unless $\mathrm{P}^{\#\mathrm{P}} \subseteq \exists$BPP. On the other hand, we are able to prove that the geometric complexity theory approach initiated by Mulmuley and Sohoni (SIAM J. Comput. 31(2): 496–526, 2001) yields proofs of polynomial size for this variety, therefore overcoming the natural proofs barrier in this case.
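In symbols (our notation), the objects discussed are matrices of affine linear forms
\[ M(x) = A_0 + \sum_{i=1}^{m} x_i A_i, \qquad \mathrm{cr}(M) = \min_{x} \operatorname{rank} M(x), \]
where $\mathrm{cr}(M)$ is the completion rank; the border completion rank of $M$ is then the smallest $b$ such that $M$ is a limit of matrices of completion rank $\leq b$.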
We present a new connection between self-adjusting binary search trees (BSTs) and heaps, two fundamental, extensively studied, and practically relevant families of data structures (Allen, Munro, 1978; Sleator, Tarjan, 1983; Fredman, Sedgewick, Sleator, Tarjan, 1986; Wilber, 1989; Fredman, 1999; Iacono, Özkan, 2014). Roughly speaking, we map an arbitrary heap algorithm within a broad and natural model to a corresponding BST algorithm with the same cost on a dual sequence of operations (i.e., the same sequence with the roles of time and key-space switched). This is the first general transformation between the two families of data structures. There is a rich theory of dynamic optimality for BSTs (i.e., the theory of competitiveness between BST algorithms). The lack of an analogous theory for heaps has been noted in the literature (e.g., Pettie, 2005, 2008). Through our connection, we transfer all instance-specific lower bounds known for BSTs to a general model of heaps, initiating a theory of dynamic optimality for heaps. On the algorithmic side, we obtain a new, simple and efficient heap algorithm, which we call the smooth heap. We show the smooth heap to be the heap counterpart of Greedy, the BST algorithm with the strongest proven and conjectured properties in the literature, conjectured to be instance-optimal (Lucas, 1988; Munro, 2000; Demaine et al., 2009). Assuming the optimality of Greedy, the smooth heap is also optimal within our model of heap algorithms. Intriguingly, the smooth heap, although derived from a non-practical BST algorithm, is simple and easy to implement (e.g., it stores no auxiliary data besides the keys and tree pointers). It can be seen as a variation on the popular pairing heap data structure, extending it with a ``power-of-two-choices'' type of heuristic. For the smooth heap we obtain instance-specific upper bounds, with applications in adaptive sorting, and we see it as a promising candidate for the long-standing question of a simpler alternative to Fibonacci heaps.
Sticky notes are ubiquitous in design processes because of their tangibility and ease of use. Yet, they have well-known limitations in professional design processes, as documentation and distribution are cumbersome at best. This paper compares the use of sticky notes in ideation with a remediated digital sticky notes setup. The paper contributes a nuanced understanding of what happens when remediating a physical design tool into digital space, by emphasizing focus shifts and breakdowns caused by the technology, but also benefits and promises inherent in the digital media. Despite users' preference for creating physical notes, handling digital notes on boards was easier, and the potential for proper documentation makes the digital setup a possible alternative. While the analogy in our remediation supported a transfer of learned handling, the users' experiences across technological setups impact their use and understanding, yielding new concerns regarding cross-device transfer and collaboration.
Process variation (PV) can cause accuracy loss in analog neural network (ANN) processors, hinder their scaling down, and degrade their feasibility. This paper first analyses the impact of PV on the performance of ANN chips, then proposes an in-situ transfer learning method at the system level that reduces PV's influence using low-precision back-propagation. Simulation results show the proposed method increases the tolerance of operating-point drift by 50% and the tolerance of mismatch by 70% to 100%, with less than 1% accuracy loss on benchmarks. It also reduces memory usage by 66.7% and improves the energy efficiency of multiplication in the learning stage by about 50×, compared with a conventional full-precision (32-bit float) training system.
In this paper we discuss a simple and inexpensive method to introduce students to Newton's law of cooling using only their smartphones, according to the Bring-Your-Own-Device philosophy. A popular experiment in basic thermodynamics, both at high-school and at university level, is the determination of the specific heat of solids and liquids using a water calorimeter, resorting in many cases to a mercury thermometer. With our approach the analog instrument is quickly turned into a digital device by analyzing the movement of the mercury with a video tracker. Thus, using very simple labware and the students' smartphones or tablets, it is possible to observe the decay behavior of the temperature of a liquid left to cool at room temperature. The dependence of the time constant on the mass and surface of the liquid can be easily probed, and the results of the different groups in the classroom can be brought together to observe the linear dependence.
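For reference, the decay behavior observed in the experiment follows Newton's law of cooling:
\[ \frac{dT}{dt} = -k\,(T - T_{\mathrm{room}}) \quad\Longrightarrow\quad T(t) = T_{\mathrm{room}} + (T_0 - T_{\mathrm{room}})\, e^{-t/\tau}, \]
where $T_0$ is the initial temperature and $\tau = 1/k$ is the time constant whose dependence on the mass and surface of the liquid the students probe.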
Recent years have witnessed rapid growth in the domain of the Internet of Things (IoT). This network of billions of devices generates and exchanges huge amounts of data. Limited cache capacity and memory bandwidth make transferring and processing such data on traditional CPUs and GPUs highly inefficient, both in terms of energy consumption and delay. However, many IoT applications are statistical at heart and can accept a degree of inaccuracy in their computation. This enables designers to reduce the complexity of processing by approximating the results for a desired accuracy. In this paper, we propose an ultra-efficient approximate processing in-memory architecture, called APIM, which exploits the analog characteristics of non-volatile memories to support addition and multiplication inside the crossbar memory, while storing the data. The proposed design eliminates the overhead involved in transferring data to the processor by virtually bringing the processor inside memory. APIM dynamically configures the precision of computation for each application in order to tune the level of accuracy during runtime. Our experimental evaluation running six general OpenCL applications shows that the proposed design achieves up to 20x performance improvement and provides 480x improvement in energy-delay product, while ensuring acceptable quality of service. In exact mode, it achieves 28x energy savings and 4.8x speedup compared to state-of-the-art GPU cores.
We transfer a key idea from the field of sentiment analysis to a new domain: community question answering (cQA). The cQA task we are interested in is the following: given a question and a thread of comments, we want to re-rank the comments so that the ones that are good answers to the question are ranked higher than the bad ones. We notice that good vs. bad comments use specific vocabulary and that one can often predict the goodness/badness of a comment even ignoring the question, based on the comment contents alone. This leads us to the idea of building a good/bad polarity lexicon as an analogue of the positive/negative sentiment polarity lexicons commonly used in sentiment analysis. In particular, we use pointwise mutual information to build large-scale goodness polarity lexicons in a semi-supervised manner, starting with a small number of initial seeds. The evaluation results show an improvement of 0.7 MAP points absolute over a very strong baseline, and state-of-the-art performance on SemEval-2016 Task 3.
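In the usual formulation of such lexicons (our notation), the goodness polarity of a word $w$ is scored as
\[ \mathrm{polarity}(w) = \mathrm{PMI}(w, \mathit{good}) - \mathrm{PMI}(w, \mathit{bad}), \qquad \mathrm{PMI}(w, c) = \log \frac{P(w, c)}{P(w)\,P(c)}, \]
with the probabilities estimated from comments labeled, or seed-propagated, as good or bad.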
Recent methods for learning vector space representations of words (word embeddings), such as GloVe and Word2Vec, have succeeded in capturing fine-grained semantic and syntactic regularities. We analyzed the effectiveness of these methods for e-commerce recommender systems by transferring the sequence of items generated by a user's browsing journey in an e-commerce website into a sentence of words. We examined the prediction of fine-grained item similarity (such as the item most similar to an iPhone 6 64GB smartphone) and item analogy (such as iPhone 5 is to iPhone 6 as Samsung S5 is to Samsung S6) using real-life users' browsing history from an online European department store. Our results reveal that such methods outperform related models such as singular value decomposition (SVD) with respect to item similarity and analogy tasks across different product categories. Furthermore, these methods produce a highly condensed item vector space representation (item embedding) with a behaviorally meaningful sub-structure. These vectors can be used as features in a variety of recommender system applications. In particular, we used these vectors as features in neural-network-based models for anonymous user recommendation based on a session's first few clicks. We found that a recurrent neural network that preserves the order of a user's clicks outperforms a standard neural network, item-to-item similarity, and SVD (recall@10 value of 42% based on the first three clicks) for this task.
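A minimal sketch of the sessions-as-sentences idea, using the off-the-shelf gensim Word2Vec implementation (the session data, item identifiers, and hyperparameters below are illustrative assumptions, not the paper's setup):
\begin{verbatim}
# Treat each browsing session as a "sentence" whose "words" are item IDs.
from gensim.models import Word2Vec

sessions = [
    ["iphone_5", "iphone_6_64gb", "case_x"],   # hypothetical sessions
    ["samsung_s5", "samsung_s6", "charger_y"],
    # ... one list of item IDs per user session
]

# Skip-gram model over the item sequences (sg=1); tiny corpus, so min_count=1.
model = Word2Vec(sessions, vector_size=100, window=5, min_count=1, sg=1)

# Item similarity: nearest neighbours in the learned embedding space.
print(model.wv.most_similar("iphone_6_64gb", topn=3))

# Item analogy: iphone_5 is to iphone_6_64gb as samsung_s5 is to ...?
print(model.wv.most_similar(positive=["samsung_s5", "iphone_6_64gb"],
                            negative=["iphone_5"], topn=1))
\end{verbatim}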
FPGAs have been used as accelerators in a wide variety of domains such as learning, search, genomics, signal processing, compression, analytics and so on. In recent years, the availability of tools and flows such as high-level synthesis has made it even easier to accelerate a variety of high-performance computing applications onto FPGAs. In this paper, we propose a systematic methodology for optimizing the performance of an accelerated block using the notion of compute intensity to guide optimizations in high-level synthesis. We demonstrate the effectiveness of our methodology on an FPGA implementation of a non-uniform discrete Fourier transform (NUDFT), used to convert a wireless channel model from the time domain to the frequency domain. The acceleration of this particular computation can be used to improve the performance and capacity of wireless channel simulation, which has wide applications in the system-level design and performance evaluation of wireless networks. Our results show that our FPGA implementation outperforms the same code offloaded onto GPUs and CPUs by 1.6x and 10x respectively, in performance as measured by the throughput of the accelerated block. The gains in performance per watt versus GPUs and CPUs are 15.6x and 41.5x respectively.
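Compute intensity here is presumably the standard arithmetic-intensity ratio, which bounds attainable throughput in the roofline sense:
\[ I = \frac{W}{Q}\ \Big[\tfrac{\text{ops}}{\text{byte}}\Big], \qquad \text{throughput} \;\leq\; \min\big(P_{\mathrm{peak}},\; I \cdot B_{\mathrm{mem}}\big), \]
where $W$ counts arithmetic operations, $Q$ the bytes moved, $P_{\mathrm{peak}}$ the peak compute rate, and $B_{\mathrm{mem}}$ the memory bandwidth; optimizations that raise $I$ make the accelerated block compute-bound rather than memory-bound.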
This paper presents an integrated Analog Delay Line (ADL) for analog RF signal processing. The design is inspired by a Bucket Brigade Device (BBD) structure: it transfers charges from a sampled input signal stage after stage, and thus belongs to the family of Charge-Coupled Devices (CCDs). This ADL is fully differential with Common Mode (CM) control. The 28nm Fully Depleted Silicon on Insulator (FDSOI) technology from STMicroelectronics is used for the design. Further results come from simulations using Cadence Spectre.
Capturing knowledge via learned latent vector representations of words, images, and knowledge graph (KG) entities has shown state-of-the-art performance in computer vision, computational linguistics, and KG tasks. Recent results demonstrate that learning such representations across modalities can be beneficial, since each modality captures complementary information. However, those approaches are limited to concepts with cross-modal alignments in the training data, which are available for only a few concepts. In particular, far fewer embeddings exist for visual objects than for words or KG entities. We investigate whether a word embedding (e.g., for "apple") can still capture information from other modalities even if there is no matching concept within the other modalities (i.e., no images or KG entities of apples, but of oranges, as pictured in the title analogy). The empirical results of our knowledge transfer approach demonstrate that word embeddings do benefit from extrapolating information across modalities even for concepts that are not represented in the other modalities. Interestingly, this applies most to concrete concepts (e.g., dragonfly), while abstract concepts (e.g., animal) benefit most if aligned concepts are available in the other modalities.
We introduce a simplified island model with behavior similar to that of λ islands, each running a (1+1) EA, optimizing the Maze fitness function, and investigate the effects of the migration topology on the ability of the simplified island model to track the optimum of a dynamic fitness function. More specifically, we prove that there exist choices of model parameters for which using a unidirectional ring as the migration topology allows the model to track the oscillating optimum through n Maze-like phases with high probability, while using a complete graph as the migration topology results in the island model losing track of the optimum with overwhelming probability. Additionally, we prove that if migration occurs only rarely, denser migration topologies may be advantageous. This serves to illustrate that while a less dense migration topology may be useful when optimizing dynamic functions with oscillating behavior, and requires less problem-specific knowledge to determine when migration may be allowed to occur, care must be taken to ensure that a sufficient amount of migration occurs during the optimization process.
The fabrication process introduces inherent variability into the attributes of transistors (in particular length, width, and oxide thickness). As a result, every chip is physically unique. The physical uniqueness of microelectronic components can be used for multiple security applications. Physically Unclonable Functions (PUFs) are built to extract the physical uniqueness of microelectronic components and make it usable for secure applications. However, the microelectronic components used in PUF designs suffer from external, environmental variations that impact PUF behavior. Variations of temperature gradients during manufacturing can bias the PUF responses. Variations of temperature or thermal noise during PUF operation change the behavior of the circuit and can introduce errors in PUF responses. Detailed knowledge of the behavior of PUFs operating under various environmental factors is needed to reliably extract and demonstrate the uniqueness of the chips. In this work, we present a detailed and exhaustive analysis of the behavior of two PUF designs, a ring oscillator PUF and a timing path violation PUF. We have implemented both PUFs on FPGAs fabricated by Xilinx, and analyzed their behavior while varying temperature and supply voltage. Our experiments quantify the robustness of each design, demonstrate their sensitivity to temperature, and show the impact that supply voltage has on the uniqueness of the analyzed PUFs.
The Convolutional Neural Network (CNN) is a powerful technique widely used in computer vision that demands far more computation and memory resources than traditional solutions. The emerging metal-oxide resistive random-access memory (RRAM) and RRAM crossbar have shown great potential for neuromorphic applications with high energy efficiency. However, the interfaces between analog RRAM crossbars and digital peripheral functions, namely Analog-to-Digital Converters (ADCs) and Digital-to-Analog Converters (DACs), consume most of the area and energy of an RRAM-based CNN design due to the large amount of intermediate data in CNN. In this paper, we propose an energy-efficient structure for RRAM-based CNN. Based on an analysis of the data distribution, a quantization method is proposed to reduce the intermediate data to 1 bit and eliminate DACs. An energy-efficient structure using input data as selection signals is proposed to reduce the ADC cost of merging results from multiple crossbars. The experimental results show that the proposed method and structure can save 80% of the area and more than 95% of the energy while maintaining the same or comparable classification accuracy of CNN on MNIST.
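A minimal sketch of the flavor of 1-bit quantization described, assuming a simple threshold derived from the observed data distribution (the median rule here is our illustrative choice, not the paper's exact method):
\begin{verbatim}
import numpy as np

def binarize(x, threshold):
    # Map each intermediate value to a single bit relative to a
    # distribution-derived threshold, so crossbar inputs need no DAC.
    return (x >= threshold).astype(np.uint8)

activations = np.random.randn(1000)        # stand-in intermediate data
threshold = np.median(activations)         # cut chosen from the distribution
bits = binarize(activations, threshold)    # 1-bit representation
\end{verbatim}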
The current-mode Analog-to-Digital Converter (ADC) has drawn much attention due to its high operating speed, immunity to power and ground noise, etc. However, $2^n - 1$ comparators are required in a traditional n-bit current-mode ADC design, leading to inevitably high power consumption and large chip area. In this work, we propose a low-power and compact current-mode Multi-Threshold Comparator (MTC) based on the giant Spin Hall Effect (SHE). The two threshold currents of the proposed SHE-MTC are 200 μA and 250 μA, with 1 ns switching time. The proposed current-mode hybrid spin-CMOS flash ADC based on the SHE-MTC reduces the number of comparators almost by half (to $2^{n-1}$), correspondingly reducing the required current mirror branches, total power consumption, and chip area. Moreover, due to the non-volatility of the SHE-MTC, the front-end analog circuits can be switched off when not required, further increasing power efficiency. The device dynamics of the SHE-MTC are simulated using a numerical device model based on the Landau-Lifshitz-Gilbert (LLG) equation with Spin-Transfer Torque (STT) and SHE terms. Device-circuit co-simulation in SPICE (45nm CMOS technology) has shown that the average power dissipation of the proposed ADC is 1.9 mW, operating at 500 MS/s with a 1.2 V power supply. The INL and DNL are within 0.23 LSB and 0.32 LSB, respectively.
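For reference, the device model mentioned is the standard Landau-Lifshitz-Gilbert dynamics augmented with torque terms (our notation):
\[ \frac{d\mathbf{m}}{dt} = -\gamma\,\mathbf{m}\times\mathbf{H}_{\mathrm{eff}} + \alpha\,\mathbf{m}\times\frac{d\mathbf{m}}{dt} + \boldsymbol{\tau}_{\mathrm{STT}} + \boldsymbol{\tau}_{\mathrm{SHE}}, \]
where $\mathbf{m}$ is the unit magnetization, $\gamma$ the gyromagnetic ratio, $\alpha$ the Gilbert damping constant, and the two torque terms capture the spin-transfer and spin Hall contributions to switching.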
Information retrieval methods, especially for multimedia data, have evolved towards the integration of multiple sources of evidence in the analysis of the relevance of items for a given user search task. In this context, to attenuate the semantic gap between low-level features extracted from the content of digital objects and high-level semantic concepts (objects, categories, etc.), and to make systems adaptive to different user needs, interactive models have brought the user closer to the retrieval loop, allowing user-system interaction mainly through implicit or explicit relevance feedback. Analogously, diversity promotion has emerged as an alternative for tackling ambiguous or underspecified queries. Additionally, several works have addressed the issue of minimizing the user effort required to provide relevance assessments while keeping an acceptable overall effectiveness. This thesis discusses, proposes, and experimentally analyzes multimodal and interactive diversity-oriented information retrieval methods. This work comprehensively covers the interactive information retrieval literature and also discusses recent advances, the great research challenges, and promising research opportunities. We have proposed and evaluated two relevance-diversity trade-off enhancement workflows, which integrate multiple information from images, such as visual features, textual metadata, geographic information, and user credibility descriptors. In turn, as an integration of interactive retrieval and diversity promotion techniques, for maximizing the coverage of multiple query interpretations/aspects and speeding up the information transfer between the user and the system, we have proposed and evaluated a multimodal online learning-to-rank method trained with relevance feedback over diversified results. Our experimental analysis shows that the joint usage of multiple information sources positively impacted the relevance-diversity balancing algorithms. Our results also suggest that the integration of multimodal relevance-based filtering and reranking is effective in improving result relevance and also boosts diversity promotion methods. Furthermore, through a thorough experimental analysis, we have investigated several research questions related to the possibility of improving result diversity while keeping or even improving relevance in interactive search sessions. Moreover, we analyze how much the diversification effort affects overall search session results and how different diversification approaches behave for the different data modalities. By analyzing overall and per-feedback-iteration effectiveness, we show that introducing diversity may harm initial results, whereas it significantly enhances overall session effectiveness, not only in terms of relevance and diversity, but also in how early the user is exposed to the same amount of relevant items and diversity.
Users’ online behaviors, such as ratings and examination of items, are recognized as among the most valuable sources of information for learning users’ preferences in order to make personalized recommendations. However, most previous works focus on modeling only one type of user behavior, such as numerical ratings or browsing records, which are referred to as explicit feedback and implicit feedback, respectively. In this article, we study a Semisupervised Collaborative Recommendation (SSCR) problem with labeled feedback (for explicit feedback) and unlabeled feedback (for implicit feedback), in analogy to the well-known Semisupervised Learning (SSL) setting with labeled and unlabeled instances. SSCR is associated with two fundamental challenges, namely, the heterogeneity of the two types of user feedback and the uncertainty of the unlabeled feedback. In response, we design a novel Self-Transfer Learning (sTL) algorithm to iteratively identify and integrate likely positive unlabeled feedback, inspired by the general forward/backward process in machine learning. The merit of sTL is its ability to learn users’ preferences from heterogeneous behaviors in a joint and selective manner. We conduct extensive empirical studies of sTL and several very competitive baselines on three large datasets. The experimental results show that our sTL is significantly better than the state-of-the-art methods.
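The identify-and-integrate loop follows the general self-training pattern; the sketch below shows that generic pattern with a placeholder classifier and promotion rule, not the paper's exact sTL algorithm:
\begin{verbatim}
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_transfer(X_labeled, y_labeled, X_unlabeled, rounds=3, k=100):
    # Iteratively promote the most confidently positive unlabeled
    # feedback into the labeled set, then retrain (generic self-training).
    X, y, pool = X_labeled.copy(), y_labeled.copy(), X_unlabeled.copy()
    clf = LogisticRegression().fit(X, y)   # y must contain both classes
    for _ in range(rounds):
        if len(pool) == 0:
            break
        scores = clf.predict_proba(pool)[:, 1]   # P(positive)
        top = np.argsort(scores)[-k:]            # likely positives
        X = np.vstack([X, pool[top]])
        y = np.concatenate([y, np.ones(len(top))])
        pool = np.delete(pool, top, axis=0)
        clf = LogisticRegression().fit(X, y)     # retrain on enlarged set
    return clf
\end{verbatim}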
A power-efficient Delta-Sigma (ΔΣ) analog-to-digital converter (ADC) with an embedded programmable-gain control function for various smart sensor applications is presented. It consists of a programmable-gain switched-capacitor ΔΣ modulator followed by a digital decimation filter for down-sampling. The programmable function is realized with programmable loop-filter coefficients using a capacitor array. The coefficient control is accomplished while keeping the locations of the poles of the noise transfer function fixed, so the stability of the designed closed-loop transfer function is assured. The proposed gain control method helps the ADC optimize its performance as the input signal magnitude varies. The gain controllability requires a negligible amount of additional energy-consuming or area-occupying circuitry. The power-efficient programmable-gain ADC (PGADC) is well suited for sensor devices. The gain can be programmed from 0 to 18 dB in 6 dB steps. Measurements show that the PGADC achieves 15.2-bit resolution and 12.4-bit noise-free resolution with 99.9% reliability. The chip operates from a 3.3 V analog supply and a 1.8 V digital supply, while consuming only 97 μA of analog current and 37 μA of digital current. The analog core area is 0.064 mm² in a standard 0.18-μm CMOS process.
Head portraits are popular in traditional painting. Automating portrait painting is challenging, as the human visual system is sensitive to the slightest irregularities in human faces. Applying generic painting techniques often deforms facial structures. On the other hand, existing portrait painting techniques are mainly designed for the graphite style and/or are based on image analogies: an example painting as well as its original unpainted version are required. This limits their domain of applicability. We present a new technique for transferring the painting from one head portrait onto another. Unlike previous work, our technique requires only the example painting and is not restricted to a specific style. We impose novel spatial constraints by locally transferring the color distributions of the example painting. This better captures the painting texture and maintains the integrity of facial structures. We generate a solution through Convolutional Neural Networks, and we present an extension to video, in which motion is exploited to reduce temporal inconsistencies and the shower-door effect. Our approach transfers the painting style while maintaining the identity of the input photograph. In addition, it significantly reduces facial deformations compared to the state of the art.
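A minimal sketch of the color-distribution transfer idea, reduced to a global per-channel mean/variance match (the paper applies it locally, under spatial constraints; this simplification is ours):
\begin{verbatim}
import numpy as np

def transfer_color_stats(target, example):
    # Shift each channel of the target photograph so its mean and
    # standard deviation match those of the example painting.
    t = target.astype(np.float64)
    e = example.astype(np.float64)
    for c in range(t.shape[-1]):
        t_mu, t_sd = t[..., c].mean(), t[..., c].std() + 1e-8
        e_mu, e_sd = e[..., c].mean(), e[..., c].std()
        t[..., c] = (t[..., c] - t_mu) / t_sd * e_sd + e_mu
    return np.clip(t, 0, 255).astype(np.uint8)
\end{verbatim}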
Humans are often able to generalize knowledge learned from a single exemplar. In this paper, we present a novel integration of mental simulation and analogical generalization algorithms into a cognitive robotic architecture that enables a similarly rudimentary generalization capability in robots. Specifically, we show how a robot can generate variations of a given scenario and then use the results of those new scenarios, run in a physics simulator, to generate generalized action scripts using analogical mappings. The generalized action scripts then allow the robot to perform the originally learned activity in a wider range of scenarios with different types of objects, without the need for additional exploration or practice. In a proof-of-concept demonstration, we show how the robot can generalize from a previously learned pick-and-place action, performed with a single arm on an object with a handle, to a two-arm pick-and-place action on a cylindrical object with no handle.
Although computing students may enjoy it when their instructors teach using analogies, it is unknown to what extent these analogies are useful for their learning. This study examines the value of analogies when used to introduce three introductory computing topics. The value of these analogies may be evident during the teaching process itself (short term), in subsequent exams (long term), or in students' ability to apply their understanding to related non-technical areas (transfer). Comparing results between an experimental group (analogy) and a control group (no analogy), we find potential value for analogies in short-term learning. However, no solid evidence was found to support analogies as valuable for students in the long term or for knowledge transfer. Specific demographic groups were examined, and promising preliminary findings are presented.
Information system developers and administrators often overlook critical security requirements and best practices. This may be due to a lack of tools and techniques that allow practitioners to tailor security knowledge to their particular context. To explore the impact of new security methods, we must improve our ability to study the effect of security tools and methods on software and system development. In this paper, we present early findings of an experiment to assess the extent to which the number and type of examples used in security training stimuli impact security problem solving. To motivate this research, we formulate hypotheses from analogical transfer theory in psychology. The independent variables include the number of problem surfaces and schemas, and the dependent variable is answer accuracy. Our study results do not show a statistically significant difference in performance when the number and types of examples are varied. We discuss the limitations, threats to validity, and opportunities for future studies in this area.