In security, the arms race problem is a well-known analogue for describing a game-theoretic setting between multiple parties who compete over certain assets to achieve their respective goals, e.g., the race between malware writers and anti-virus vendors, or between spammers and spam detectors. Adversarial machine learning likewise involves at least two parties, namely the learner and the adversary. Driven by the opposing objectives of learner and adversary, a "reactive" arms race is typically initiated by the highly intelligent and adaptive adversary. "Reactive" here refers to the learner's initial unawareness: the learner naively sticks to the stationarity assumption until the adversary exploits some vulnerability in the design of the algorithm. Damage caused by the adversary is ideally discovered quickly by the learner and countered either by retraining on new data that excludes the malicious samples, or by adding or removing features that are intrinsically sensitive to those attacks. Even the most efficient countermeasures, however, incur a significant response delay, which gives the adversary enough time to obtain what it is after.
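As a concrete, if simplified, picture of the reactive response, the following is a minimal sketch of the retraining countermeasure. It assumes scikit-learn; the synthetic data and the `flagged` mask standing in for the output of a real detection step are illustrative assumptions, not part of any specific system.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical state after an attack has been discovered: (X, y) is
# the current training set, and `flagged` marks the samples that a
# separate detection step identified as malicious. Both are synthetic.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] > 0).astype(int)
flagged = rng.random(200) < 0.05  # placeholder for a real detector

# Reactive countermeasure: retrain on the data with the suspected
# malicious samples excluded.
clf = LogisticRegression().fit(X[~flagged], y[~flagged])
print("retrained on", int((~flagged).sum()), "of", len(y), "samples")
```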
![[chp02_sec01_arms_race.png]]
`Figure.1: Arms race problem in both reactive and proactive modes` ^fig-1
The "reactive" arms race problem exists commonly in security engineering. A common approach widely used in its domain, especially in cryptography is *security by obscurity*. It protects confidential information by keeping it under an encrypted shell, and the malicious users are not supposed to get through to the protection. Although this mechanism is still valuable in the domain of adversarial learning, we advocate and investigate another paradigm of securing a learning system, that is, instead of reactively responding to attacks, the system should be secured proactively in the design phase, so called "proactive" arms race problem. Following the paradigm of *security by design*, the learners are now aware of the possible breakthrough of the stationary assumption, and can anticipate potential attacks by modelling the adversaries. On one hand, such proactive security strategy offers the system higher level of security by taking potential vulnerabilities into account before designing the learning algorithm, on the other hand, the adversaries would spend much more effort to explore the systems before they are detected. The robustness is then guaranteed for a longer time and even without much effort of human intervention. In [[Arm Race Problem#^fig-1|Figure.1]] we depicted the concepts of both reactive and proactive arms race problems, which are of central role for the adversarial learning. On the left hand side, the "reactive" arms race problem is formulated in four stages. Firstly, the adversary examines an existing learning system's potential vulnerabilities driven by a certain incentive, e.g., economic gain, privacy. By analysing the system, the adversary devise an effective attack and perform on the system. Then the attack may cause suspicious system behaviours which are then discovered by the learner. Finally, the learner responses to the attack by retraining on new data or refining features accordingly. In proactive mode on the right hand side, the learner assumes the existence of adversaries and models the adversaries before the real attacks happen. By simulating the potential attacks, the learners can evaluate its negative impact on current system, whose security is then enhanced again at the design phase. Although in proactive arms race the adversary is not explicitly shown, we expect the learners' capability of modelling the potential adversaries highly depends on the properties of the adversaries, who are extremely adaptive during the iterations of the arms races.
Indeed, the arms race in an adversarial learning environment is iterative and evolves over time. The concept diagram in [[Arm Race Problem#^fig-1|Figure.1]] illustrates a full cycle of adversarial learning and implies that the game between adversary and learner can be either a one-shot or an iterated game. Although the importance of equilibria in this game-theoretic setting cannot be neglected, in this work we focus on the one-shot game as the atomic game in which balance is reached. Brückner & Scheffer (2009) explored the properties of Nash-equilibrial prediction models and showed that, under certain conditions, a unique Nash equilibrium between the adversary and the learner does exist. Nevertheless, iterated games and the equilibria of various adversarial games remain an important direction for future research.
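To make the one-shot setting concrete, the static prediction game can be written in the following generic form, paraphrasing the setting studied by Brückner & Scheffer (2009); the cost functions $\theta_{\ell}, \theta_{a}$, the model parameters $w$, and the data transformation $\phi$ are abstract placeholders, not a specific instantiation:

$$
\begin{aligned}
w^{*} &\in \arg\min_{w} \; \theta_{\ell}\bigl(w, \phi^{*}(D)\bigr),\\
\phi^{*} &\in \arg\min_{\phi} \; \theta_{a}\bigl(w^{*}, \phi(D)\bigr),
\end{aligned}
$$

where the learner chooses $w$ and the adversary chooses a transformation $\phi$ of the training data $D$; a pair $(w^{*}, \phi^{*})$ from which neither player can lower its own cost unilaterally is a Nash equilibrium of the one-shot game.

Finally, building on the arms race problem defined above, we summarise the problems to be investigated in this work as follows: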
- **Analysis of the targeted learning process**
    - A targeted learning process, e.g., a classifier, must first be thoroughly studied and carefully analysed, so that a potential adversary can be modelled based on knowledge of that process. The theoretical properties of the learning algorithm are a central issue in this analysis.
- **Adversary modelling**
    - Potential attacks are modelled against a particular, thoroughly evaluated learning process, and their impact on the learner is examined (see the sketch after this list).
- **Design of countermeasures**
    - Countermeasures against adversarial attacks are integrated directly at the design phase. The learning algorithms assume a possibly non-stationary data distribution and can adapt to concept shift; they are designed to be more robust and more resilient to the uncertainty of data sampling.
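To ground the adversary-modelling and countermeasure steps, the following is a minimal, self-contained sketch under simple, illustrative assumptions: the adversary is modelled as a label-flipping poisoner corrupting a fraction of the training set, and the design-phase countermeasure is a k-NN-based sanitisation step before retraining. Both the attack model and the sanitisation rule are placeholder choices for illustration, not the methods developed in this work.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(2)

# Clean synthetic training data and a held-out test set.
X = rng.normal(size=(500, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_te = rng.normal(size=(200, 5))
y_te = (X_te[:, 0] + X_te[:, 1] > 0).astype(int)

# Adversary model: flip the labels of a random 20% of training points.
y_pois = y.copy()
flip = rng.choice(len(y), size=len(y) // 5, replace=False)
y_pois[flip] = 1 - y_pois[flip]

# Naive learner, trained directly on the poisoned data.
naive = LogisticRegression().fit(X, y_pois)

# Design-phase countermeasure (one simple choice): drop points whose
# label disagrees with the majority of their k nearest neighbours,
# then retrain on the sanitised set.
knn = KNeighborsClassifier(n_neighbors=7).fit(X, y_pois)
keep = knn.predict(X) == y_pois
robust = LogisticRegression().fit(X[keep], y_pois[keep])

print("naive accuracy: ", naive.score(X_te, y_te))
print("robust accuracy:", robust.score(X_te, y_te))
```

On this toy data the sanitised learner typically recovers most of the clean accuracy; the point is only to show where, in the pipeline, a design-phase countermeasure sits relative to the modelled attack.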