Notes: torch.nn.functional.kl_div
Notes on torch.nn.functional.kl_div
How RoPE is implemented in practice.
Fundamentals and implementation of PCA
Removing the bias term from LogitLens.
Relationship between RMSNorm and LayerNorm, and rewriting LayerNorm using the centering matrix.