When mathematicians think about matrix multiplication (and matrices in general), they don't really think about "rotating" matrices like in the animation above, but rather about operators and their composition. Matrix multiplication is what it is because that's how function composition works.
Look: consider two functions, f(x, y) = (x + 2y, 3x + 4y), and g(x, y) = (-x + 3y, 4x - y). What's f(g(x, y))? Well, let's work it out, it's simple algebra:

f(g(x, y)) = f(-x + 3y, 4x - y) = ((-x + 3y) + 2(4x - y), 3(-x + 3y) + 4(4x - y)) = (-x + 3y + 8x - 2y, -3x + 9y + 16x - 4y) = (7x + y, 13x + 5y).

Whew, that was some hassle to keep track of everything. Now, here's what mathematicians typically do instead: they introduce matrices to make it much easier to keep track of the operations:
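If you'd rather convince yourself numerically than redo the algebra, here's a quick plain-Python sketch (fg is just my name for the hand-derived formula above) that spot-checks the composition at a few points:

```python
# The two functions from the post, written out directly:
def f(x, y):
    return (x + 2*y, 3*x + 4*y)

def g(x, y):
    return (-x + 3*y, 4*x - y)

def fg(x, y):
    # the hand-derived formula for f(g(x, y))
    return (7*x + y, 13*x + 5*y)

# spot-check: f(g(...)) should match (7x + y, 13x + 5y) everywhere
for (x, y) in [(1, 0), (0, 1), (2, -3), (5, 7)]:
    assert f(*g(x, y)) == fg(x, y)
print("composition formula checks out")
```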
Let e_0 = (1, 0), and e_1 = (0, 1). Then f(e_0) = f(1, 0) = (1, 3) = e_0 + 3 e_1, and f(e_1) = f(0, 1) = (2, 4) = 2 e_0 + 4 e_1. Thus, mathematicians would write that f in basis e_0, e_1 is represented by the matrix
[1 2]
[3 4]
so that when you multiply it by the (column) vector [x, y], you get

[1 2]   [x]   [x + 2y]
[3 4] * [y] = [3x + 4y]
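In code, "multiply the matrix by a column vector" is just each row dotted with the vector; a minimal plain-Python sketch (matvec is a hypothetical helper name, not a library call):

```python
# Matrix of f, read off column-by-column from f(e_0) = (1, 3) and f(e_1) = (2, 4):
F = [[1, 2],
     [3, 4]]

def matvec(M, v):
    # standard matrix-vector product: entry i is row i of M dotted with v
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# applying F to [x, y] reproduces f(x, y) = (x + 2y, 3x + 4y):
print(matvec(F, [5, 7]))  # f(5, 7) = (5 + 14, 15 + 28) = [19, 43]
```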
Similarly, g(e_0) = (-1, 4) = -e_0 + 4e_1, g(e_1) = (3, -1) = 3e_0 - e_1, so it's represented by the matrix:

[-1  3]
[ 4 -1]

Now, let's multiply the matrix of f by the matrix of g:

[1 2]   [-1  3]   [1*(-1) + 2*4    1*3 + 2*(-1)]   [ 7  1]
[3 4] * [ 4 -1] = [3*(-1) + 4*4    3*3 + 4*(-1)] = [13  5]

and when we multiply the resulting matrix by the column vector [x, y]:

[ 7  1]   [x]   [7x + y]
[13  5] * [y] = [13x + 5y]

So what we got was in fact our original calculation of f(g(x, y)) = (7x + y, 13x + 5y). The conclusion here is that matrix multiplication is what function composition forces it to be.
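The same bookkeeping as a small Python sketch (matmul is a hypothetical helper written out by hand, not a library call), so you can check that the matrix of f times the matrix of g really encodes the composition (7x + y, 13x + 5y):

```python
def matmul(A, B):
    # entry (i, j) of A*B is row i of A dotted with column j of B --
    # exactly the bookkeeping done by hand above
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

F = [[1, 2], [3, 4]]    # matrix of f
G = [[-1, 3], [4, -1]]  # matrix of g

# F*G should be the matrix of the composition f(g(x, y)):
print(matmul(F, G))  # [[7, 1], [13, 5]]
```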
Thanks all in the thread for the explanations. I had matrix multiplication at uni (IT faculty, many years ago) as part of the algebra courses: I memorized it, I passed the exam, and forgot the topic. Though, I don't remember anyone explaining back then why it's used or useful.
My early understanding, after reading your responses and the wiki article, is that it's useful when we have some input data (a vector) which then undergoes sequential manipulation by several functions, and we want to get the result in one step instead of many?
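That's the right idea: multiply the matrices of the steps once, then apply the single combined matrix to as many input vectors as you like. A plain-Python sketch under that reading (matmul2 and apply are hypothetical helper names, hard-coded to 2x2 for brevity):

```python
# "Combine once, apply many times": multiply the matrices of the two steps
# a single time, then hit every input vector with the combined matrix.
def matmul2(A, B):  # 2x2 matrix product, written out entry by entry
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def apply(M, v):  # 2x2 matrix times column vector
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

F = [[1, 2], [3, 4]]     # matrix of f
G = [[-1, 3], [4, -1]]   # matrix of g
FG = matmul2(F, G)       # combined once, up front

for v in [[1, 0], [0, 1], [2, -3]]:
    # one multiplication replaces the two-step g-then-f pipeline
    assert apply(FG, v) == apply(F, apply(G, v))
print("one combined step agrees with the two-step pipeline")
```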