What do these CAN_BUS values represent and why are they used as positional encodings for BEVFormer?
BEVFormer is a transformer-based architecture that is widely used as a component of other architectures. It learns bird's-eye-view (BEV) representations of scenes for self-driving cars. While studying its source code to understand how it works, I came across these lines: