Deciphering src_key_padding_mask
A deep dive into src_key_padding_mask and how to check that it is applied correctly.
Hypothesis-Driven Mindset and First-Principles Thinking
PhD Reflection (and, hopefully, advice for prospective students)
Intricacies of nn.CrossEntropyLoss Ignore Index and Gradients
A deep dive into ignore_index and how it affects the gradients.