학술논문

GeeSolver: A Generic, Efficient, and Effortless Solver with Self-Supervised Learning for Breaking Text Captchas
Document Type
Conference
Source
2023 IEEE Symposium on Security and Privacy (SP) SP Security and Privacy (SP), 2023 IEEE Symposium on. :1649-1666 May, 2023
Subject
Components, Circuits, Devices and Systems
Computing and Processing
Privacy
Self-supervised learning
Feature extraction
Real-time systems
Decoding
Security
Data mining
Text-based-captchas
deep-learning
self-supervised-learning
masked-autoencoders
Language
ISSN
2375-1207
Abstract
Although text-based captcha, which is used to differentiate between human users and bots, has faced many attack methods, it remains a widely used security mechanism and is employed by some websites. Some deep learning-based text captcha solvers have shown excellent results, but the labor-intensive and time-consuming labeling process severely limits their viability. Previous works attempted to create easy-to-use solvers using a limited collection of labeled data. However, they are hampered by inefficient preprocessing procedures and inability to recognize the captchas with complicated security features.In this paper, we propose GeeSolver, a generic, efficient, and effortless solver for breaking text-based captchas based on self-supervised learning. Our insight is that numerous difficult-to-attack captcha schemes that "damage" the standard font of characters are similar to image masks. And we could leverage masked autoencoders (MAE) to improve the captcha solver to learn the latent representation from the "unmasked" part of the captcha images. Specifically, our model consists of a ViT encoder as latent representation extractor and a well-designed decoder for captcha recognition. We apply MAE paradigm to train our encoder, which enables the encoder to extract latent representation from local information (i.e., without masking part) that can infer the corresponding character. Further, we freeze the parameters of the encoder and leverage a few labeled captchas and many unlabeled captchas to train our captcha decoder with semi-supervised learning.Our experiments with real-world captcha schemes demonstrate that GeeSolver outperforms the state-of-the-art methods by a large margin using a few labeled captchas. We also show that GeeSolver is highly efficient as it can solve a captcha within 25 ms using a desktop CPU and 9 ms using a desktop GPU. Besides, thanks to latent representation extraction, we successfully break the hard-to-attack captcha schemes, proving the generality of our solver. We hope that our work will help security experts to revisit the design and availability of text-based captchas. The code is available at https://github.com/NSSL-SJTU/GeeSolver.