Robust Countermeasures for Adversarial Attacks on Deep Learning, Deep Reinforcement Learning, and Deepfake
Abstract
Machine Learning (ML) algorithms are in demand in almost every field, yet even as they become commonplace, they remain poorly understood: their complex architectures are opaque even to the scientists who build them. The same sophistication that makes ML algorithms such sought-after solutions also leaves ample room for malicious exploitation, raising urgent questions about the reliability of ML-based systems in security-critical applications. This thesis addresses this urgent need by developing new approaches for detecting attacks on Machine Learning models.
This thesis begins with a survey of recent test-time attacks on ML algorithms, together with a comprehensive overview of current state-of-the-art countermeasures. After examining the strengths and vulnerabilities of these defenses, we propose a hybrid image purification defense against adversarial attacks based on adaptive clustering of robust semantic representations.
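To make the clustering-based purification idea concrete, the following is a minimal sketch, not the thesis's actual pipeline: clusters are fit on clean "semantic" feature vectors, and a possibly adversarial representation is blended back toward its nearest centroid before classification. The encoder features are simulated with random vectors here, and the cluster count and blending weight are illustrative assumptions.

```python
# Minimal sketch (not the thesis's actual method): purification by pulling a
# possibly-adversarial representation toward clusters fit on clean representations.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in "semantic representations" of clean training images (e.g., encoder features).
clean_features = rng.normal(size=(1000, 64))

# Fit clusters on the clean representation space.
kmeans = KMeans(n_clusters=16, n_init=10, random_state=0).fit(clean_features)

def purify(feature: np.ndarray, blend: float = 0.5) -> np.ndarray:
    """Blend a test-time representation toward its nearest clean cluster centroid.

    blend=1.0 replaces the feature with the centroid; blend=0.0 leaves it untouched.
    """
    centroid = kmeans.cluster_centers_[kmeans.predict(feature[None, :])[0]]
    return (1.0 - blend) * feature + blend * centroid

# A (hypothetical) perturbed representation gets projected back toward the clean manifold.
adversarial_feature = clean_features[0] + 0.3 * rng.normal(size=64)
purified_feature = purify(adversarial_feature)
```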
Next, we propose a hybrid security method that minimizes the vulnerabilities of a reinforcement learning (RL) agent navigating an adversarial scenario with under-specified dynamics. To reduce these vulnerabilities, we shrink the model the RL agent must learn by representing part of the environment as a static graph, so that the agent learns only the causal relations it needs to safely navigate immediate unknowns.
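As a toy illustration of this idea, under assumed details not stated in the abstract, the sketch below hands the agent a fixed, known graph as the static part of its model, so tabular Q-learning only has to estimate the value of moves along known edges rather than the full environment dynamics. The graph, reward, and hyperparameters are hypothetical.

```python
# Illustrative sketch (assumptions: toy graph-navigation task, tabular Q-learning).
# The topology is given as a static graph, so the agent only learns the value of
# moving along known edges instead of the full dynamics.
import random

# Static part of the model: a fixed, known graph of states and allowed moves.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "goal"],
    "goal": [],
}
REWARD = {("D", "goal"): 1.0}  # hypothetical sparse reward

q = {(s, s2): 0.0 for s, nbrs in GRAPH.items() for s2 in nbrs}
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for _ in range(500):
    state = "A"
    while state != "goal":
        nbrs = GRAPH[state]
        # epsilon-greedy choice restricted to edges of the static graph
        if random.random() < epsilon:
            nxt = random.choice(nbrs)
        else:
            nxt = max(nbrs, key=lambda s2: q[(state, s2)])
        r = REWARD.get((state, nxt), 0.0)
        future = max((q[(nxt, s2)] for s2 in GRAPH[nxt]), default=0.0)
        q[(state, nxt)] += alpha * (r + gamma * future - q[(state, nxt)])
        state = nxt
```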
Lastly, we study countermeasures against deepfake attacks, beginning with an evaluation of current state-of-the-art defenses. The vast majority of leading countermeasures are Deep Learning based deepfake detectors, but these methods suffer from a trade-off between robustness and explainability. We overcome this impasse by proposing an explainable deepfake detector with weakly supervised deep attention data augmentation.
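The sketch below illustrates one way attention can serve both roles described above, explanation and augmentation; it is an assumed toy architecture, not the proposed detector. A tiny CNN produces a spatial attention map that indicates which regions drive the real-vs-fake prediction, and an attention-dropping step masks the most-attended regions to create augmented training views.

```python
# Minimal sketch (assumed toy architecture, not the thesis's detector): a small CNN
# whose attention map both explains the prediction and guides data augmentation via
# attention dropping, in the spirit of weakly supervised attention augmentation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyAttentionDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.attention = nn.Conv2d(32, 1, 1)      # 1-channel spatial attention map
        self.classifier = nn.Linear(32, 2)        # real vs. fake

    def forward(self, x):
        feats = self.features(x)                     # (B, 32, H, W)
        attn = torch.sigmoid(self.attention(feats))  # (B, 1, H, W), explainable map
        pooled = (feats * attn).mean(dim=(2, 3))     # attention-weighted pooling
        return self.classifier(pooled), attn

def attention_drop(x, attn, threshold=0.7):
    """Augmentation: mask the most-attended regions so the model must find
    complementary forgery cues elsewhere in the image."""
    attn = F.interpolate(attn, size=x.shape[-2:], mode="bilinear", align_corners=False)
    keep = (attn < threshold * attn.amax(dim=(2, 3), keepdim=True)).float()
    return x * keep

model = TinyAttentionDetector()
images = torch.randn(4, 3, 64, 64)               # dummy batch
logits, attn_map = model(images)
augmented = attention_drop(images, attn_map.detach())
logits_aug, _ = model(augmented)                 # second pass on augmented views
```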