Biblio

Filters: Author is Han, Peiyi
2023-02-02
Wang, Zirui, Duan, Shaoming, Wu, Chengyue, Lin, Wenhao, Zha, Xinyu, Han, Peiyi, Liu, Chuanyi.  2022.  Generative Data Augmentation for Non-IID Problem in Decentralized Clinical Machine Learning. 2022 4th International Conference on Data Intelligence and Security (ICDIS). :336–343.
Swarm learning (SL) is an emerging, promising decentralized machine learning paradigm that has achieved high performance in clinical applications. SL removes the central structure required by federated learning by combining edge computing with a blockchain-based peer-to-peer network. Although results are promising under the assumption of independent and identically distributed (IID) data across participants, SL suffers from performance degradation as the degree of non-IID data increases. To address this problem, we propose a generative augmentation framework for swarm learning called SL-GAN, which augments the non-IID data by generating synthetic data from participants. SL-GAN trains generators and discriminators locally and periodically aggregates them via a randomly elected coordinator in the SL network. Under standard assumptions, we theoretically prove the convergence of SL-GAN using stochastic approximations. Experimental results demonstrate that SL-GAN outperforms state-of-the-art methods on three real-world clinical datasets, including Tuberculosis, Leukemia, and COVID-19.
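The training loop described in the abstract (local updates, then periodic aggregation through a randomly elected coordinator) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how parameters are merged, so plain element-wise averaging is an assumption, the fixed per-participant "gradients" are placeholders for real generator and discriminator training, and every name here (local_update, elect_coordinator, aggregate, run_rounds, sync_every) is hypothetical.

```python
import random

def local_update(params, grad, lr=0.1):
    """One placeholder local step (plain gradient descent); a stand-in for GAN training."""
    return [p - lr * g for p, g in zip(params, grad)]

def elect_coordinator(participant_ids, rng):
    """Pick one participant uniformly at random to act as coordinator."""
    return rng.choice(participant_ids)

def aggregate(param_sets):
    """Element-wise mean of the participants' parameter vectors (assumed merge rule)."""
    n = len(param_sets)
    return [sum(vals) / n for vals in zip(*param_sets)]

def run_rounds(params_by_id, grads_by_id, rounds, sync_every, seed=0):
    """Train locally each round; every `sync_every` rounds, merge and broadcast."""
    rng = random.Random(seed)
    ids = sorted(params_by_id)
    coordinators = []
    for r in range(1, rounds + 1):
        # Local phase: each participant updates its own parameters.
        for pid in ids:
            params_by_id[pid] = local_update(params_by_id[pid], grads_by_id[pid])
        # Periodic phase: a randomly elected coordinator merges and redistributes.
        if r % sync_every == 0:
            coordinators.append(elect_coordinator(ids, rng))
            merged = aggregate([params_by_id[pid] for pid in ids])
            for pid in ids:
                params_by_id[pid] = list(merged)
    return params_by_id, coordinators
```

In this sketch the coordinator is only recorded, since the merge itself is simulated in one process; in a real peer-to-peer deployment the elected node would be the one that collects and redistributes the parameters.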
2020-07-09
Liu, Chuanyi, Han, Peiyi, Dong, Yingfei, Pan, Hezhong, Duan, Shaoming, Fang, Binxing.  2019.  CloudDLP: Transparent and Automatic Data Sanitization for Browser-Based Cloud Storage. 2019 28th International Conference on Computer Communication and Networks (ICCCN). :1–8.

Because cloud storage services are broadly used in enterprises for online sharing and collaboration, sensitive information in images or documents can easily leak outside the trusted enterprise premises through such services. Existing solutions to this problem have not fully explored the tradeoffs among application performance, service scalability, and user data privacy. Therefore, we propose CloudDLP, a generic approach for enterprises to automatically sanitize sensitive data in images and documents in browser-based cloud storage. To the best of our knowledge, CloudDLP is the first system that automatically and transparently detects and sanitizes both sensitive images and textual documents without compromising user experience or application functionality on browser-based cloud storage. To prevent sensitive information from escaping the enterprise premises, CloudDLP uses deep learning methods to detect sensitive information in both images and textual documents. We have evaluated the proposed method on a number of typical cloud applications. Our experimental results show that it can achieve transparent and automatic data sanitization on cloud storage services with relatively low overhead, while preserving most application functionality.
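The detect-then-sanitize flow for textual documents can be sketched in a few lines. CloudDLP itself uses deep learning for detection; the sketch below swaps in simple regular expressions purely for illustration, and the pattern set, labels, and function names (PATTERNS, detect, sanitize) are hypothetical rather than part of the system described above.

```python
import re

# Illustrative patterns only; a stand-in for the deep learning detector.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "CARD": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def detect(text):
    """Return sorted (start, end, label) spans for every pattern match."""
    findings = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            findings.append((m.start(), m.end(), label))
    return sorted(findings)

def sanitize(text):
    """Replace each detected span with a bracketed label before upload."""
    out = text
    # Work from the end of the text so earlier offsets stay valid.
    for start, end, label in sorted(detect(text), reverse=True):
        out = out[:start] + f"[{label}]" + out[end:]
    return out
```

Under these assumptions, a document would be passed through sanitize on the enterprise side before it is sent to the cloud storage service, so that only the redacted text leaves the premises.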