GENPass: A Multi-Source Deep Learning Model For Password Guessing
Date
2019-09-11
Citation of Original Publication
Z. Xia, P. Yi, Y. Liu, B. Jiang, W. Wang and T. Zhu, "GENPass: A Multi-Source Deep Learning Model for Password Guessing," in IEEE Transactions on Multimedia. doi: 10.1109/TMM.2019.2940877 keywords: {Password;Neural networks;Deep learning;Gallium nitride;Training;Computational modeling;Markov processes}, URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8832180&isnumber=4456689
Rights
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Attribution 4.0 International (CC BY 4.0)
https://creativecommons.org/licenses/by/4.0/
Abstract
Passwords have become today’s dominant method
of authentication. Because brute-force attacks with tools such as
HashCat and John the Ripper have proven impractical,
research has shifted to password guessing. State-of-the-art
approaches such as the Markov model and probabilistic context-free grammar (PCFG) are all based on statistical probability.
These approaches require a large amount of computation, which
is time-consuming. Neural networks have proven more accurate
and practical for password guessing than traditional methods.
However, a raw neural network model is not well suited to cross-site attacks because each dataset has its own features. Our work
aims to generalize leaked passwords and improve
performance in cross-site attacks.
In this paper, we propose GENPass, a multi-source deep
learning model for generating “general” passwords. GENPass
learns from several datasets and uses adversarial generation to
ensure that the output wordlist maintains high accuracy across
different datasets. The password generator of GENPass is PCFG+LSTM
(PL). We are the first to combine a neural network with PCFG.
Compared with long short-term memory (LSTM) alone, PL increases
the matching rate by 16%-30% in cross-site tests when learning
from a single dataset. GENPass uses several PL models to learn
from datasets and generate passwords. The results demonstrate that
the matching rate of GENPass is 20% higher than that achieved by
simply mixing datasets in the cross-site test. Furthermore, we propose
GENPass with probability (GENPass-pro), an updated version
of GENPass that further increases the matching rate.
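
Two terms in the abstract can be made concrete with a small sketch. It is illustrative only and is not the authors' implementation. It assumes the usual PCFG convention of splitting a password into runs of letters (L), digits (D), and special characters (S), and it assumes that "matching rate" means the fraction of unique test-set passwords that appear in the generated wordlist. The function names `char_class`, `pcfg_structure`, and `matching_rate` are hypothetical.

```python
from itertools import groupby


def char_class(ch: str) -> str:
    """Map a character to a PCFG-style class: L(etter), D(igit), or S(pecial)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def pcfg_structure(password: str) -> list[tuple[str, int]]:
    """Split a password into consecutive same-class runs, e.g. L8 D3 S1."""
    return [(cls, len(list(run))) for cls, run in groupby(password, key=char_class)]


def matching_rate(guesses, test_set) -> float:
    """Fraction of unique test-set passwords found in the guess list.

    Assumed definition for illustration; the paper may count duplicates
    or normalize differently.
    """
    targets = set(test_set)
    if not targets:
        return 0.0
    return len(targets & set(guesses)) / len(targets)


if __name__ == "__main__":
    print(pcfg_structure("password123!"))
    # Cross-site flavor: guesses learned from one site, tested on another.
    guesses = ["123456", "password", "qwerty"]
    other_site = ["password", "letmein", "123456", "dragon"]
    print(matching_rate(guesses, other_site))
```

In a cross-site test, `guesses` would come from a model trained on one leaked dataset and `test_set` from a different site, which is the setting in which the abstract reports its improvements.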