The Big Picture
Matrix multiplication is one of the main computational building blocks in modern machine learning. Neural networks use it to combine inputs with learned weights. Large language models, or LLMs, are stacked neural networks that perform huge numbers of these matrix multiplications to transform text into predictions.
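The idea can be sketched in a few lines of NumPy. This is a minimal illustration, not code from any particular framework: a single "layer" mixes 3 input features into 2 outputs through a weight matrix, and every output is a weighted combination of every input.

```python
import numpy as np

# One example with 3 input features, shape (1, 3).
inputs = np.array([[1.0, 2.0, 3.0]])

# A learned weight matrix, shape (3, 2): each column is one "recipe"
# for mixing the inputs into a single output value.
weights = np.array([[0.1, 0.4],
                    [0.2, 0.5],
                    [0.3, 0.6]])

# Matrix multiplication combines inputs with weights, shape (1, 2).
outputs = inputs @ weights
print(outputs)  # → [[1.4 3.2]]
```

A real network stacks many such layers, with far larger matrices, and an LLM applies them billions of times per generated token — which is why the total cost is dominated by matrix multiplications.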
If you understand matrix multiplication as "mixing numbers according to learned recipes," you already have the foundation for understanding why neural networks and LLMs are so compute-heavy.