Biblio
With the advent of the big data era, information systems have exhibited some new features, including boundary obfuscation, system virtualization, unstructured and diversification of data types, and low coupling among function and data. These features not only lead to a big difference between big data technology (DT) and information technology (IT), but also promote the upgrading and evolution of network security technology. In response to these changes, in this paper we compare the characteristics between IT era and DT era, and then propose four DT security principles: privacy, integrity, traceability, and controllability, as well as active and dynamic defense strategy based on "propagation prediction, audit prediction, dynamic management and control". We further discuss the security challenges faced by DT and the corresponding assurance strategies. On this basis, the big data security technologies can be divided into four levels: elimination, continuation, improvement, and innovation. These technologies are analyzed, combed and explained according to six categories: access control, identification and authentication, data encryption, data privacy, intrusion prevention, security audit and disaster recovery. The results will support the evolution of security technologies in the DT era, the construction of big data platforms, the designation of security assurance strategies, and security technology choices suitable for big data.
In this paper, we present a formal verification tool for the Ethereum Virtual Machine (EVM) bytecode. To precisely reason about all possible behaviors of the EVM bytecode, we adopted KEVM, a complete formal semantics of the EVM, and instantiated the K-framework's reachability logic theorem prover to generate a correct-by-construction deductive verifier for the EVM. We further optimized the verifier by introducing EVM-specific abstractions and lemmas to improve its scalability. Our EVM verifier has been used to verify various high-profile smart contracts including the ERC20 token, Ethereum Casper, and DappHub MakerDAO contracts.
Anti-reverse engineering is one of the core technologies of software intellectual property protection, prevailing techniques of which are static and dynamic obfuscation. Static obfuscation can only prevent static analysis with code mutation done before execution by compressing, encrypting and obfuscating. Dynamic obfuscation can prevent both static and dynamic analysis, which changes code while being executed. Popular dynamic obfuscation techniques include self-modifying code and virtual machine protection. Despite the higher safety, dynamic obfuscation has its problems: 1) code appear in plain text remains a long time; 2) control flow is exposable; 3) time and space overheads are too big. This paper presents a binary protection scheme using dynamic fine-grained code hiding and obfuscation named dynFCHO. In this scheme, basic blocks to be protected are hidden in original code and will be restored while being executed. Code obfuscation is also implemented additionally to enhance safety. Experiments prove that dynFCHO can effectively resist static and dynamic analysis without destructing original software functions. It can be used on most binary programs compiled by standard compilers. This scheme can be widely used with the advantages of strong protection, light-weight implementation, and good extendibility.
We will demonstrate a conversational products recommendation agent. This system shows how we combine research in personalized recommendation systems with research in dialogue systems to build a virtual sales agent. Based on new deep learning technologies we developed, the virtual agent is capable of learning how to interact with users, how to answer user questions, what is the next question to ask, and what to recommend when chatting with a human user. Normally a descent conversational agent for a particular domain requires tens of thousands of hand labeled conversational data or hand written rules. This is a major barrier when launching a conversation agent for a new domain. We will explore and demonstrate the effectiveness of the learning solution even when there is no hand written rules or hand labeled training data.