Breaking the Privacy Paradox: Pushing AI to the Edge with Provable Guarantees
Connected edge devices such as mobile devices, wearables, and autonomous vehicles generate massive amounts of data every day, and the machine-learning-based intelligent services built on this data are transforming many spheres of human life, including healthcare, entertainment, and industry. The traditional process for developing a machine learning application is to gather a large dataset, train a model on it, and run the trained model on a cloud server. Given the growing tension between the need for big data and the need for privacy protection, it is increasingly attractive to push artificial intelligence (AI) to the edge, e.g., by enabling edge devices to train machine learning models collaboratively while keeping their data local. Deploying this distributed learning architecture, however, raises a set of challenges: new privacy risks, limited computation and communication resources, heterogeneous data and devices, and security vulnerabilities. This work aims to improve privacy, efficiency, and security when pushing AI to the edge. Specifically, we propose privacy-preserving distributed learning schemes that provide rigorous privacy guarantees for every device in the learning system, and improved methods built on them that reduce the privacy loss and the communication cost of each device at the same time. We also design an incentive mechanism to encourage users to contribute their raw data to these private distributed learning systems. In addition, accounting for the heterogeneous energy resources of edge devices, we develop a data offloading and queueing mechanism that improves energy efficiency, and we expose the vulnerability of machine learning systems by attacking a recommendation system with malicious devices. The proposed methods are validated both theoretically, through rigorous analysis, and experimentally, through extensive experiments on real-world datasets.
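To make the collaborative-training idea concrete: one common way (in the spirit of federated learning with differentially private updates, though not necessarily the exact scheme proposed in this work) is for each device to compute a gradient on its local data, clip it, and perturb it with Gaussian noise before the server averages the updates, so raw data never leaves a device. The sketch below is a minimal illustration under those assumptions; the names (`local_update`, `federated_round`) and parameters (`clip`, `noise_std`) are hypothetical choices for this example only.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, clip=1.0, noise_std=0.5, rng=None):
    """One device's private step: a clipped MSE gradient for linear
    regression on local data, perturbed with Gaussian noise (the
    Gaussian mechanism used in differential privacy). Hypothetical
    parameters, for illustration only."""
    rng = rng if rng is not None else np.random.default_rng(0)
    grad = 2.0 * X.T @ (X @ weights - y) / len(y)            # local gradient
    grad *= min(1.0, clip / (np.linalg.norm(grad) + 1e-12))  # clip its norm
    grad += rng.normal(0.0, noise_std * clip, size=grad.shape)  # add DP noise
    return weights - lr * grad

def federated_round(weights, devices, rng):
    """Server side of one round: average the devices' locally
    perturbed models; only model updates, never raw data, are shared."""
    updates = [local_update(weights, X, y, rng=rng) for X, y in devices]
    return np.mean(updates, axis=0)
```

Averaging across devices also averages out part of the injected noise, which is one reason such schemes can tolerate per-device perturbation while still converging to a useful global model.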