I am trying to implement Pass2Edit (this paper: see sections 3.1 and 3.2). It takes the original password and the current password as input strings and tries to model the edit behaviour between them. The following is what it looks like:
The input of the neural network is the password pair, and the output is the probability of each transformation state. From the paper I understand that the model works by:
- Firstly, the input passes through the embedding layer, and each one-hot encoded password character is converted into a 256-dimensional vector (i.e., v_orig_i and v_cur_i)
- Next, v_orig_i and v_cur_i are concatenated into v_i, which is then fed to a 3-layer GRU (the hidden layer dimension is 256)
- Finally, the output of the GRU for the last character is passed through a 2-layer FC (i.e., fully connected layer, where the hidden layer dimension is 512), and the probability of each transformation t_i is obtained through the softmax layer.
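To make the three steps above concrete, here is my rough attempt in Keras. It is only a sketch of my reading, not the paper's code: sharing one embedding layer between the two inputs, treating index 0 as `<placeholder>` and using it as padding, the ReLU activation in the FC layer, and the name `build_model` are all my own assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

MAX_LEN = 30                     # maximum password length from the paper
VOCAB = 51                       # 48 keys + <shift>, <caps>, <placeholder>
NUM_CLASSES = 30 * 51 + 30 + 1   # 1,561 atomic operations


def build_model():
    # Two integer sequences of key ids, both padded to MAX_LEN (assumption).
    orig_in = keras.Input(shape=(MAX_LEN,), dtype="int32", name="orig")
    cur_in = keras.Input(shape=(MAX_LEN,), dtype="int32", name="cur")

    # Assumption: one embedding table shared by both passwords.
    embed = layers.Embedding(VOCAB, 256, name="key_embedding")
    v_orig = embed(orig_in)                              # (batch, 30, 256)
    v_cur = embed(cur_in)                                # (batch, 30, 256)

    # Concatenate the two embeddings position by position.
    v = layers.Concatenate(axis=-1)([v_orig, v_cur])     # (batch, 30, 512)

    # 3-layer GRU; only the last layer returns just its final step.
    h = layers.GRU(256, return_sequences=True)(v)
    h = layers.GRU(256, return_sequences=True)(h)
    h = layers.GRU(256)(h)                               # (batch, 256)

    # 2-layer FC: hidden layer of 512, then softmax over all operations.
    h = layers.Dense(512, activation="relu")(h)          # ReLU is my guess
    out = layers.Dense(NUM_CLASSES, activation="softmax")(h)
    return keras.Model(inputs=[orig_in, cur_in], outputs=out)
```

Without any masking, the final GRU step for a short password is a padding step, which is part of what I am unsure about; `mask_zero=True` on the `Embedding` layer looks like one possible option.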
Specifically, after each password is transformed into a key sequence,
the character set Σ includes 48 types of characters that can be
entered through the EN-US standard keyboard, as well as <shift>,
<caps> and <placeholder> (48+3=51). If we limit the length of the password to no more
than 30 (i.e., 0≤p<30), then the total number of atomic operations is
|t|=30∗51+30+1=1,561, where 30∗51 is the category # of insertions, 30
is the category # of deletions, and 1 represents the EOS operation. In
this light, our one-step prediction process can essentially be seen as
a 1,561-class multi-classification problem.
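To check that I read the class count correctly, here is the arithmetic from the excerpt as a small Python sketch. The ordering of the classes (insertions first, then deletions, then EOS) and the helper names `ins_id`, `del_id` and `decode` are my own assumptions; the excerpt only gives the totals.

```python
MAX_LEN = 30    # positions p with 0 <= p < 30
NUM_KEYS = 51   # 48 keyboard characters + <shift>, <caps>, <placeholder>

NUM_INS = MAX_LEN * NUM_KEYS          # 30 * 51 = 1530 insertion classes
NUM_DEL = MAX_LEN                     # 30 deletion classes
NUM_CLASSES = NUM_INS + NUM_DEL + 1   # plus one EOS class = 1561
EOS_ID = NUM_INS + NUM_DEL            # EOS gets the last index (assumption)


def ins_id(pos, key):
    """Class index of INS(pos, key), with key given as an id in [0, 51)."""
    return pos * NUM_KEYS + key


def del_id(pos):
    """Class index of DEL(pos)."""
    return NUM_INS + pos


def decode(class_id):
    """Map a class index back to (operation, position, key id)."""
    if 0 <= class_id < NUM_INS:
        pos, key = divmod(class_id, NUM_KEYS)
        return ("INS", pos, key)
    if NUM_INS <= class_id < EOS_ID:
        return ("DEL", class_id - NUM_INS, None)
    if class_id == EOS_ID:
        return ("EOS", None, None)
    raise ValueError(f"class id out of range: {class_id}")
```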
I am very new to writing an RNN model, and I am not able to translate this into a Keras GRU implementation (which seems like it should be fairly easy).
Specifically:
- Since the dataset contains variable-length password pairs and the authors have limited the password length to 30, does that mean that for a password of length l<30, the remaining GRU steps just do not engage?
- The same goes for the number of output classes. Since the model treats this as a 1,561-class prediction problem, some classes are simply irrelevant when l<30, for example the class INS(14, “a”) when the password length is 8.
- How do I incorporate the caps key, shift key and “placeholders” they mention in the paper?
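For reference, this is my current guess at all three points in plain Python, so it is clear where I may be going wrong. Everything here is an assumption on my part rather than something the paper states: the exact 48 keys, wrapping runs of two or more capitals in `<caps>`, using `<shift>` for single capitals and symbols, treating `<placeholder>` (id 0) as padding, and masking out classes whose position exceeds the password length.

```python
import string

MAX_LEN = 30
# My guess at the 48 unshifted keys of an EN-US keyboard (47 keys + space).
BASE_KEYS = "`1234567890-=qwertyuiop[]\\asdfghjkl;'zxcvbnm,./ "
# Characters produced by holding shift, aligned with the first 47 keys.
SHIFTED = "~!@#$%^&*()_+QWERTYUIOP{}|ASDFGHJKL:\"ZXCVBNM<>?"

KEY_TO_ID = {"<placeholder>": 0, "<shift>": 1, "<caps>": 2}
for k in BASE_KEYS:
    KEY_TO_ID[k] = len(KEY_TO_ID)
NUM_KEYS = len(KEY_TO_ID)          # 51
NUM_INS = MAX_LEN * NUM_KEYS       # 1530
SHIFT_MAP = dict(zip(SHIFTED, BASE_KEYS))


def to_key_sequence(password):
    """Turn a password into the key presses that would type it (my guess)."""
    keys, i = [], 0
    while i < len(password):
        j = i
        while j < len(password) and password[j] in string.ascii_uppercase:
            j += 1
        if j - i >= 2:                     # run of capitals: toggle caps lock
            keys.append("<caps>")
            keys.extend(c.lower() for c in password[i:j])
            keys.append("<caps>")
            i = j
        elif password[i] in SHIFT_MAP:     # single capital or shifted symbol
            keys.extend(["<shift>", SHIFT_MAP[password[i]]])
            i += 1
        elif password[i] in BASE_KEYS:
            keys.append(password[i])
            i += 1
        else:
            raise ValueError(f"unsupported character: {password[i]!r}")
    return keys


def encode_and_pad(password):
    """Key ids, right-padded with <placeholder> (id 0) up to MAX_LEN."""
    ids = [KEY_TO_ID[k] for k in to_key_sequence(password)]
    if len(ids) > MAX_LEN:
        raise ValueError("key sequence longer than MAX_LEN")
    return ids + [0] * (MAX_LEN - len(ids))


def valid_class_mask(length):
    """True for the classes that make sense for a sequence of this length."""
    mask = [False] * (NUM_INS + MAX_LEN + 1)
    for pos in range(min(length + 1, MAX_LEN)):   # insert at 0..length
        for key in range(NUM_KEYS):
            mask[pos * NUM_KEYS + key] = True
    for pos in range(length):                     # delete at 0..length-1
        mask[NUM_INS + pos] = True
    mask[-1] = True                               # EOS is always allowed
    return mask
```

With this layout, a length-8 sequence leaves INS(14, “a”) masked out, but I do not know whether the paper masks such classes at all or simply lets the model learn that they never occur.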
An outline of the model, some clarity on how l<30 passwords will work and a way to put in caps key, shift key and “placeholders” would be really helpful. Thanks!