When and How to Protect? Modeling Repeated Interactions with Computing Services Under Uncertainty
Abstract
Modern computing systems and web applications often provide services to entities without knowing their type: a request may come from an honest entity seeking a genuine response, or from a malicious entity attempting to compromise the system through security and privacy attacks. Designing effective defense mechanisms for such systems is therefore a non-trivial task. The problem is even harder against a strategic adversary who repeatedly interacts with the target system while imitating an honest end-user, infers as much information as possible, and then stealthily attacks at the opportune moment. In such an interaction, the system's goal is to strategically block malicious entities without significantly degrading the quality of service provided to honest entities, while a malicious entity's objective is to compromise the target system as quickly and cost-efficiently as possible without being detected. This dissertation studies this classical trade-off in modern computing services and web applications by focusing on three use-cases that exhibit this conundrum.

In the first use-case, we consider a mobile operating system attempting to regulate access to zero-permission (permission-less) sensors such as accelerometers, gyroscopes, and ambient light sensors. These sensors are critical for many mobile applications, but malicious applications can misuse their data to infer private information about users. The mobile system must therefore strategically decide under what conditions to share sensor data without knowing the type (malicious or honest) of the requesting application. In the first research thrust of this dissertation, we address this trade-off by modeling the strategic interactions between mobile applications and a defense mechanism (the mobile system) as a two-player, discrete-time, imperfect-information game known as a Signaling Game.

The second use-case comprises a black-box machine learning model that returns a label for each query sent by an end-user, together with an explanation (or attribution) for that label. Such explanations can be very useful to honest users for understanding model decisions, but malicious users can misuse repeated explanations to recover private model information such as parameters and training data. The model must therefore strategically decide when to stop sharing explanations with end-users without knowing their type (malicious or honest). In the second research thrust, we address this trade-off by modeling the dynamics of the explanation variance generated by a system (an ML model together with its explanation technique) for the predictions/labels of end-user queries. Specifically, we model the interactions between an end-user and the system as a two-player, continuous-time Signaling Game in which the variance of the explanations generated by the system evolves according to a stochastic differential equation (SDE). This modeling and analysis helps us determine the optimal explanation-variance threshold at which an attacker would launch explanation-based threshold attacks against the system.
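To make the continuous-time setting concrete, one minimal illustration (the symbols below are ours and not necessarily the exact dynamics analyzed in the dissertation) is to let the explanation variance $V_t$ follow a controlled diffusion

\[ dV_t = \mu(V_t, a_t)\, dt + \sigma(V_t)\, dW_t, \qquad V_0 = v_0, \]

where $a_t$ denotes the end-user's querying strategy, $W_t$ is a standard Brownian motion, and $\mu$ and $\sigma$ capture how querying drives the variance. Under such dynamics, an explanation-based threshold attack corresponds to the attacker acting at the first hitting time $\tau^* = \inf\{ t \ge 0 : V_t \ge \bar{v} \}$ of a variance threshold $\bar{v}$, and the analysis in this thrust seeks the threshold value that is optimal for the attacker.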
The third use-case is a federated learning scenario in which multiple clients (malicious or honest) cooperatively learn a global model, computed by a server, in a distributed or decentralized fashion. In such a setting, malicious clients want to cheat the server by stealthily sending false or incorrect updates, while the server wants to detect such malicious updates in a timely fashion so that they do not corrupt the global model. In the third research thrust, we address this trade-off by designing a Bayesian defense mechanism on the server side. Specifically, we employ concepts from non-parametric Bayesian modeling to compute a probabilistic measure that can be leveraged in the detection of malicious updates, with the aim of decoupling detection from local training characteristics such as the clients' data distributions, the attack strategy of the malicious clients, and the number of clients selected in a federated learning training round.
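As a rough illustration of how such a server-side probabilistic measure could be computed, the sketch below fits a truncated Dirichlet-process Gaussian mixture (a standard non-parametric Bayesian density model) to the updates received in one round and flags the least likely clients. The function name, parameters, and data are hypothetical stand-ins and do not reproduce the dissertation's actual mechanism.

    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    def score_client_updates(updates, flag_quantile=0.05):
        """Assign a probabilistic score to each client's update and flag outliers.

        updates: array of shape (n_clients, d), e.g. flattened or summarized
                 model updates received by the server in one training round.
        """
        # Truncated Dirichlet-process Gaussian mixture as a non-parametric
        # Bayesian model of the honest-update population.
        dp_mixture = BayesianGaussianMixture(
            n_components=5,
            weight_concentration_prior_type="dirichlet_process",
            covariance_type="diag",
            max_iter=500,
            random_state=0,
        )
        dp_mixture.fit(updates)

        # Per-client log-likelihood under the fitted mixture serves as the
        # probabilistic measure; unusually unlikely updates are flagged.
        log_prob = dp_mixture.score_samples(updates)
        cutoff = np.quantile(log_prob, flag_quantile)
        return log_prob, log_prob < cutoff

    # Example: 18 honest clients plus 2 clients sending shifted (malicious) updates.
    rng = np.random.default_rng(7)
    honest = rng.normal(0.0, 0.1, size=(18, 5))
    malicious = rng.normal(1.0, 0.1, size=(2, 5))
    scores, suspicious = score_client_updates(np.vstack([honest, malicious]))

The sketch only conveys the general shape of a density-based, non-parametric Bayesian scoring step; the mechanism developed in this thrust is designed so that the resulting measure does not depend on the clients' data distributions, attack strategies, or the number of clients sampled per round.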