Title | Exploring Adversaries to Defend Audio CAPTCHA |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Shekhar, Heemany; Moh, Melody; Moh, Teng-Sheng
Conference Name | 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA) |
Date Published | December
Keywords | adversarial audio captcha, adversarial audio datasets, adversarial examples attack, attack accuracy, attack models, authorisation, basic iterative method, BIM, bots, CAPTCHA, captchas, Classification algorithms, composability, Deep Learning, deep learning models, deepFool, distortion, handicapped aids, Human Behavior, Internet, learning (artificial intelligence), machine learning, machine learning algorithms, medium background noise, near-sighted users, neural nets, Noise measurement, pubcrawl, security, single audio captcha, Speech recognition, spoken digits, Web sites, Web-based authentication method, websites |
Abstract | CAPTCHA is a web-based authentication method used by websites to distinguish between humans (valid users) and bots (attackers). Audio captcha is an accessible captcha intended for visually impaired users, such as color-blind, blind, and near-sighted users. First, this paper analyzes how secure current audio captchas are against attacks using machine learning (ML) and deep learning (DL) models. Each audio captcha is made up of five, seven, or ten random digits [0-9] spoken one after the other, with varying background noise throughout the length of the audio. If the ML or DL model correctly identifies all spoken digits in a single audio captcha, in the correct order of occurrence, we consider that captcha to be broken and the attack to be successful. Throughout the paper, accuracy refers to the attack model's success at breaking audio captchas; the higher the attack accuracy, the less secure the audio captchas are. In our baseline experiments, we found that attack models could break audio captchas with no or medium background noise, regardless of the number of spoken digits, with nearly 99% to 100% accuracy. In contrast, audio captchas with high background noise were relatively more secure, with an attack accuracy of 85%. Second, we propose that adversarial example algorithms can be used to create a new kind of audio captcha that is more resilient to attacks. We found that even after retraining the models on the new adversarial audio data, the attack accuracy remained as low as 25% to 36%. Lastly, we explore the benefits of creating adversarial audio captchas through different algorithms such as the Basic Iterative Method (BIM) and DeepFool. We found that as long as the attacker has less than a 45% sample from each kind of adversarial audio dataset, the defense will be successful at preventing attacks.
DOI | 10.1109/ICMLA.2019.00192 |
Citation Key | shekhar_exploring_2019 |