I am trying to re-implement Krizhevsky et al., 2012 using PyTorch, and I am confused about precisely how the second and third convolutional layers of the AlexNet model communicate (the same applies to the inputs from the fifth to the sixth layer and from the sixth to the seventh, but I am omitting those from my question here).
In the figure below, two “filters” pass the output of the top half both to the next layer’s top half and to its bottom half. Similarly, two “filters” in the bottom half pass its output to both the next layer’s bottom half and top half.
I don’t have enough reputation points to embed images, so here’s a partial screenshot of Figure 1 from Krizhevsky et al. (2012).
How are the outputs of the second layer being passed to the third layer?
I read the paper, and unless I missed something, the authors don’t precisely spell out how the outputs are passed from the second to the third layer. I skimmed through a number of blog posts and Git repositories, but the descriptions are mostly high-level, and most implementations don’t seem to split the model across two GPUs at all.
The most relevant thing I could find was the following passage from the convnet2 readme:
Here, the layers conv2a and conv2b take both conv1a and conv1b as inputs. An implicit copy operation is performed in order to get the output of conv1a into the input of conv2b, as well as the output of conv1b into the input of conv2a.
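If I read that passage literally, I think it corresponds to something like the following in single-device PyTorch. The layer names follow the quote, and the channel counts and kernel sizes are just placeholders I picked for illustration:

```python
import torch
import torch.nn as nn

# Placeholder layer sizes, only to illustrate the "implicit copy" from the quote.
conv1a = nn.Conv2d(3, 48, kernel_size=11, stride=4, padding=2)
conv1b = nn.Conv2d(3, 48, kernel_size=11, stride=4, padding=2)
conv2a = nn.Conv2d(96, 128, kernel_size=5, padding=2)
conv2b = nn.Conv2d(96, 128, kernel_size=5, padding=2)

img = torch.randn(1, 3, 224, 224)

# The "implicit copy": both conv2a and conv2b see the outputs of both conv1a and conv1b.
out1 = torch.cat([conv1a(img), conv1b(img)], dim=1)  # (1, 96, 55, 55)
out2a, out2b = conv2a(out1), conv2b(out1)            # each (1, 128, 55, 55)
```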
My best guess is that the `out_channels` parameter of each half of the second layer should actually be 128 rather than 256, and that the outputs of the top and bottom halves should then be concatenated with `torch.cat([output_from_top_half, output_from_bottom_half], dim=1)` and passed to both the top and bottom halves of the third layer. After some back-and-forth with Claude, it suggested setting `groups=2` in the second convolution layer instead of manually splitting and concatenating.
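To make the question concrete, here is a minimal sketch of the two interpretations side by side, covering only the second-to-third-layer connection. The 27×27 spatial size is a placeholder, and I am leaving out the ReLU, pooling, and response normalization in between, so this is not meant to be a faithful AlexNet implementation:

```python
import torch
import torch.nn as nn

# Dummy output of the first layer: 96 channels, split 48/48 between the two halves.
x = torch.randn(1, 96, 27, 27)
x_top, x_bot = x[:, :48], x[:, 48:]

# Interpretation 1: two explicit towers with manual concatenation.
# Each half of layer 2 maps 48 -> 128 channels; the halves are then concatenated
# so that each half of layer 3 sees all 256 channels.
conv2_top = nn.Conv2d(48, 128, kernel_size=5, padding=2)
conv2_bot = nn.Conv2d(48, 128, kernel_size=5, padding=2)
conv3_top = nn.Conv2d(256, 192, kernel_size=3, padding=1)
conv3_bot = nn.Conv2d(256, 192, kernel_size=3, padding=1)

out2 = torch.cat([conv2_top(x_top), conv2_bot(x_bot)], dim=1)  # (1, 256, 27, 27)
out3 = torch.cat([conv3_top(out2), conv3_bot(out2)], dim=1)    # (1, 384, 27, 27)

# Interpretation 2: single tensors, with groups=2 on the second layer.
# groups=2 makes conv2 act as two independent 48 -> 128 convolutions,
# while conv3 remains an ordinary convolution over all 256 channels.
conv2_grouped = nn.Conv2d(96, 256, kernel_size=5, padding=2, groups=2)
conv3_full = nn.Conv2d(256, 384, kernel_size=3, padding=1)

out3_grouped = conv3_full(conv2_grouped(x))

print(out3.shape, out3_grouped.shape)  # both torch.Size([1, 384, 27, 27])
```

As far as I can tell, both versions produce outputs of the same shape (and, I think, the same connectivity pattern).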
I am confused about what the correct interpretation is here and would appreciate any helpful pointers!