Biblio
The resistance to attacks aimed to break CAPTCHA challenges and the effectiveness, efficiency and satisfaction of human users in solving them called usability are the two major concerns while designing CAPTCHA schemes. User-friendliness, universality, and accessibility are related dimensions of usability, which must also be addressed adequately. With recent advances in segmentation and optical character recognition techniques, complex distortions, degradations and transformations are added to text-based CAPTCHA challenges resulting in their reduced usability. The extent of these deformations can be decreased if some additional security mechanism is incorporated in such challenges. This paper proposes an additional security mechanism that can add an extra layer of protection to any text-based CAPTCHA challenge, making it more challenging for bots and scripts that might be used to attack websites and web applications. It proposes the use of hidden text-boxes for user entry of CAPTCHA string which serves as honeypots for bots and automated scripts. The honeypot technique is used to trick bots and automated scripts into filling up input fields which legitimate human users cannot fill in. The paper reports implementation of honeypot technique and results of tests carried out over three months during which form submissions were logged for analysis. The results demonstrated great effectiveness of honeypots technique to improve security control and usability of text-based CAPTCHA challenges.
Text-based CAPTCHAs are still commonly used to attempt to prevent automated access to web services. By displaying an image of distorted text, they attempt to create a challenge image that OCR software can not interpret correctly, but a human user can easily determine the correct response to. This work focuses on a CAPTCHA used by a popular Chinese language question-and-answer website and how resilient it is to modern machine learning methods. While the majority of text-based CAPTCHAs focus on transcription tasks, the CAPTCHA solved in this work is based on localization of inverted symbols in a distorted image. A convolutional neural network (CNN) was created to evaluate the likelihood of a region in the image belonging to an inverted character. It is used with a feature map and clustering to identify potential locations of inverted characters. Training of the CNN was performed using curriculum learning and compared to other potential training methods. The proposed method was able to determine the correct response in 95.2% of cases of a simulated CAPTCHA and 67.6% on a set of real CAPTCHAs. Potential methods to increase difficulty of the CAPTCHA and the success rate of the automated solver are considered.
In this paper we explore the potential for securing a distributed Arabic Optical Character Recognition (OCR) system via cloud computing technology in a pervasive and mobile environment. The goal of the system is to achieve full accuracy, high speed and security when taking into account large vocabularies and amounts of documents. This issue has been resolved by integrating the recognition process and the security issue with multiprocessing and distributed computing technologies.