Einstein Notation and Fast Matrix Multiplications
We discuss how Einstein notation can help clarify why kij loops are used when implementing naive matrix multiplication.
30 November 2025
This algorithm is still relevant as of the time of this blog post, since the naive way is fundamentally the way most (all?) libraries implement, albeit with blocking methods (architecture specific) for performance. Theoretical methods have not taken over yet.
Suppose we want to compute the product of two matrices and save it into a matrix . In Einstein notation, we would write the entry of as
where is a dummy index. To completely fill in matrix , we must loop over as well. Altogether, we must loop over indices . But the order in which we do so does not matter by Fubini's / commutativity of addition. This raises the question of which index we should put into the hot loop.
The insight is that in row-major order languages, to obey spatial locality, we fix the row and iterate over the columns. In Einstein notation, this corresponds to fixing the top index and moving the bottom one. Looking at equation (1), we see a repeat of index on both sides of the equation (namely on matrices C and B). Therefore, we choose hot index , which means choosing either or .
Contrast this to the naive implementation, where the dummy index is hot. In that case, disobeys spatial locality since matrix multiplication has an access with stride , where is the number of columns of .