Working with matrices can be fun. A lot of data science can be done without heavy matrix math, but plenty of complex and interesting analysis is difficult without it, so it is good to get plenty of practice with these little data structures.
The code I use in this post can be found on my GitHub; the link is at the bottom of this post.
Adding and Subtracting Two Matrices
First we start with two matrices, A and B. We can use the following R code to generate them.
A <- matrix(c(2, 0, 1, 3), ncol = 2)
B <- matrix(c(5, 2, 4, -1), ncol = 2)
And then we have two matrices that look like the following:
Matrix A
     [,1] [,2]
[1,]    2    1
[2,]    0    3
Matrix B
     [,1] [,2]
[1,]    5    4
[2,]    2   -1
Now, if I wanted to add these two matrices together, I could do it by hand, which is not too difficult, or I could use some super simple R code and get an answer immediately.
# Find A + B
A + B
A single line of code and you get a result:
     [,1] [,2]
[1,]    7    5
[2,]    2    2
If you were to do the calculation by hand and compare, you would get the same result.
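To make the "by hand" comparison concrete, here is a minimal sketch that adds the two matrices one element at a time and checks the result against `A + B`. The name `by_hand` and the nested loop are my own illustration and are not part of the original script.

```r
A <- matrix(c(2, 0, 1, 3), ncol = 2)
B <- matrix(c(5, 2, 4, -1), ncol = 2)

# Build the sum one element at a time, the way you would on paper
by_hand <- matrix(0, nrow = nrow(A), ncol = ncol(A))
for (i in seq_len(nrow(A))) {
  for (j in seq_len(ncol(A))) {
    by_hand[i, j] <- A[i, j] + B[i, j]
  }
}

# Compare the loop result with R's built-in element-wise addition
all(by_hand == A + B)
```

The loop only exists to show what `A + B` does under the hood; in practice the one-liner is the version to use.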
Now let's subtract matrix B from A. Again, you could take your time doing it by hand, or use the super easy R code below:
# Find A - B
A - B
And this gives us the following output:
     [,1] [,2]
[1,]   -3   -3
[2,]   -2    4
Working through large matrices by hand can be tedious, but R makes these kinds of calculations quick and easy.
Using diag()
One of the most used matrices is the identity matrix, which is central to finding the inverse of a matrix, and the inverse is what lets you "divide" two matrices. In R, an identity matrix is created with the diag() function. Besides creating an identity matrix with 1’s running down the diagonal, diag() also accepts a single number or a vector as its x argument to use as the diagonal values. For example, if we wanted the values 4, 1, 2, and 3 on the diagonal, the code would look like the following:
diag(x = c(4, 1, 2, 3), nrow = 4)
Since it is not being assigned to a variable, the matrix is printed directly to the console.
     [,1] [,2] [,3] [,4]
[1,]    4    0    0    0
[2,]    0    1    0    0
[3,]    0    0    2    0
[4,]    0    0    0    3
While this particular matrix may not be especially useful, the identity matrix that diag() can create underpins more complex calculations such as inverting, and therefore dividing, matrices.
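As a rough sketch of how the identity matrix ties into inverses and "division", the example below reuses A and B from earlier. The base R function `solve()` and the matrix-multiplication operator `%*%` do not appear in the original script; I am bringing them in here for illustration, along with the names `I2` and `A_inv`.

```r
A <- matrix(c(2, 0, 1, 3), ncol = 2)
B <- matrix(c(5, 2, 4, -1), ncol = 2)

# Passing a single number n to diag() gives the n x n identity matrix
I2 <- diag(2)

# solve() with one argument returns the inverse of a square, invertible matrix
A_inv <- solve(A)

# Multiplying a matrix by its inverse gives back the identity
# (all.equal() tolerates floating-point rounding)
all.equal(A %*% A_inv, I2)

# "Dividing" B by A means multiplying B by the inverse of A
B %*% A_inv
```

Note that `%*%` is true matrix multiplication, while `*` multiplies element by element, so the two are not interchangeable here.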
Generating a Matrix
So you want to generate a matrix? How would you generate a 5 by 5 matrix that has 3’s on the diagonal, 1’s across the rest of the first row, and 2’s down the rest of the first column? The R code would look something like this:
C <- diag(x = 3, nrow = 5)
C[1, 2:5] <- 1
C[2:5, 1] <- 2
C
After running the code you get this output:
     [,1] [,2] [,3] [,4] [,5]
[1,]    3    1    1    1    1
[2,]    2    3    0    0    0
[3,]    2    0    3    0    0
[4,]    2    0    0    3    0
[5,]    2    0    0    0    3
That was a lot of code at once, so let us walk through it. First I used the diag() function described earlier to put the 3’s along the diagonal. It is critical that this step comes first because diag(), called this way, creates a brand new matrix rather than filling values into an existing one, so running it later would wipe out anything already set. To fill in the remaining values, I specified the rows and columns to change and set them to either 1 or 2. In the brackets on line 2 above, the first number is the row number, and 2:5 means columns 2 through 5, so that line sets the rest of the first row to 1. The following line uses similar code to set rows 2 through 5 of the first column to 2.
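For comparison, here is a sketch of a different ordering that starts from a matrix of zeros and sets the diagonal last. It relies on the replacement form `diag(D) <- 3`, which is base R but is not used in the original script; the matrix name `D` is also just for illustration.

```r
# The three-step version from above
C <- diag(x = 3, nrow = 5)
C[1, 2:5] <- 1
C[2:5, 1] <- 2

# Alternative: start with all zeros and fill the pieces in another order
D <- matrix(0, nrow = 5, ncol = 5)
D[1, ] <- 1     # whole first row, including position [1, 1]
D[, 1] <- 2     # whole first column, overwriting position [1, 1]
diag(D) <- 3    # replacement form: sets the diagonal of the existing matrix

# Check that both approaches build the same matrix
all.equal(C, D)
```

Setting the diagonal last means the first row and column can be filled whole, without working out the 2:5 ranges, at the cost of writing position [1, 1] three times.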
R is really good at making difficult math, such as dealing with matrices, simple and easy to follow. In languages without built-in matrix support, the same steps can take noticeably more code, and more code means more room for bugs. This is something about R that I love, and it is only going to further enable my addiction to the language.
Links:
GitHub: https://github.com/SimonLiles/LIS4370RProgramming/blob/main/LIS4370Mod6.R