Biblio
Web archives about school shootings consist of webpages that may or may not be relevant to the events of interest. There are 3 main goals of this work; first is to clean the webpages, which involves getting rid of the stop words and non-relevant parts of a webpage. The second goal is to select just webpages relevant to the events of interest. The third goal is to upload the cleaned and relevant webpages to Apache Solr so that they are easily accessible. We show the details of all the steps required to achieve these goals. The results show that representative Web archives are noisy, with 2% - 40% relevant content. By cleaning the archives, we aid researchers to focus on relevant content for their analysis.
Cloud computing is widely deployed to handle challenges such as big data processing and storage. Due to the outsourcing and sharing feature of cloud computing, security is one of the main concerns that hinders the end users to shift their businesses to the cloud. A lot of cryptographic techniques have been proposed to alleviate the data security issues in cloud computing, but most of these works focus on solving a specific security problem such as data sharing, comparison, searching, etc. At the same time, little efforts have been done on program security and formalization of the security requirements in the context of cloud computing. We propose a formal definition of the security of cloud computing, which captures the essence of the security requirements of both data and program. Analysis of some existing technologies under the proposed definition shows the effectiveness of the definition. We also give a simple look-up table based solution for secure cloud computing which satisfies the given definition. As FPGA uses look-up table as its main computation component, it is a suitable hardware platform for the proposed secure cloud computing scheme. So we use FPGAs to implement the proposed solution for k-means clustering algorithm, which shows the effectiveness of the proposed solution.