Visible to the public Biblio

Filters: Author is Chen, Di  [Clear All Filters]
2023-02-17
Chen, Di.  2022.  Practice on the Data Service of University Scientific Research Management Based on Cloud Computing. 2022 World Automation Congress (WAC). :424–428.
With the continuous development of computer technology, the coverage of informatization solutions covers all walks of life and all fields of society. For colleges and universities, teaching and scientific research are the basic tasks of the school. The scientific research ability of the school will affect the level of teachers and the training of students. The establishment of a good scientific research environment has become a more important link in the development of universities. SR(Scientific research) data is a prerequisite for SR activities. High-quality SR management data services are conducive to ensuring the quality and safety of SRdata, and further assisting the smooth development of SR projects. Therefore, this article mainly conducts research and practice on cloud computing-based scientific research management data services in colleges and universities. First, analyze the current situation of SR data management in colleges and universities, and the results show that the popularity of SR data management in domestic universities is much lower than that of universities in Europe and the United States, and the data storage awareness of domestic researchers is relatively weak. Only 46% of schools have developed SR data management services, which is much lower than that of European and American schools. Second, analyze the effect of CC(cloud computing )on the management of SR data in colleges and universities. The results show that 47% of SR believe that CC is beneficial to the management of SR data in colleges and universities to reduce scientific research costs and improve efficiency, the rest believe that CC can speed up data storage and improve security by acting on SR data management in colleges and universities.
ISSN: 2154-4824
2017-05-16
Chen, Di, Zhang, Qin.  2016.  Streaming Algorithms for Robust Distinct Elements. Proceedings of the 2016 International Conference on Management of Data. :1433–1447.

We study the problem of estimating distinct elements in the data stream model, which has a central role in traffic monitoring, query optimization, data mining and data integration. Different from all previous work, we study the problem in the noisy data setting, where two different looking items in the stream may reference the same entity (determined by a distance function and a threshold value), and the goal is to estimate the number of distinct entities in the stream. In this paper, we formalize the problem of robust distinct elements, and develop space and time-efficient streaming algorithms for datasets in the Euclidean space, using a novel technique we call bucket sampling. We also extend our algorithmic framework to other metric spaces by establishing a connection between bucket sampling and the theory of locality sensitive hashing. Moreover, we formally prove that our algorithms are still effective under small distinct elements ambiguity. Our experiments demonstrate the practicality of our algorithms.