Equations
Solving equations is an important part of mathematics. If we are working with more than one unknown at a time, we need to solve systems of equations. You may already know how to solve a system of linear equations, but matrices provide a more compact way to arrive at the solution. Matrices are also easier to manipulate on a computer or calculator. Both of these facts will become more important when you work with larger systems.
Let's look at a system of linear equations. The system
$$
\begin{aligned}
5x_1 + 3x_2 &= 93 \\
-4x_1 - 2x_2 &= -66
\end{aligned}
$$
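can be written in matrix form as AX = B where
$$
A = \begin{bmatrix} 5 & 3 \\ -4 & -2 \end{bmatrix}, \quad
X = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad
B = \begin{bmatrix} 93 \\ -66 \end{bmatrix}.
$$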
However, you will usually see Ax = b rather than AX = B because most authors use small letters to represent vectors. You can multiply this out to convince yourself that AX = B does represent this system.
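One way to convince yourself numerically is sketched below in Python. The helper name `mat_vec` and the trial values for the unknowns are assumptions made for this illustration; they do not come from the text.

```python
# Coefficient matrix and right-hand side of the system above.
A = [[5, 3],
     [-4, -2]]
B = [93, -66]

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (flat list)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

# For arbitrary trial values of (x1, x2), row i of A times X reproduces
# the left-hand side of equation i, so AX = B encodes the same system.
for x1, x2 in [(1, 0), (0, 1), (2, -7)]:
    left_sides = [5 * x1 + 3 * x2, -4 * x1 - 2 * x2]
    assert mat_vec(A, [x1, x2]) == left_sides
```

Each entry of the product is one row of A combined with the unknowns, which is exactly the left side of one equation.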
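When you learned to solve systems of linear equations, you learned that you can interchange two equations, multiply an equation by a nonzero constant, or add a multiple of one equation to another equation without changing the solution.
We can also solve this system entirely in matrix form. We use the same rules, and we call them elementary row operations (EROs). The EROs tell us that we can interchange two rows, multiply a row by a nonzero constant, or add a multiple of one row to another row.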
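If you were given a system of the form
$$
\begin{aligned}
x_1 &= a \\
x_2 &= b
\end{aligned}
$$
you would be able to "solve" it instantly because you only have to read off the solution. If this system were written using matrix notation, it would look like this:
$$
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} a \\ b \end{bmatrix}
$$
The matrix
$$
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
$$
is the 2 by 2 identity matrix. Because you can just read off the solution when a system is in this form, our first goal is to transform our system into this form.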
Let's solve the system above using matrices. We can represent this entire system with a 2 by 3 matrix, which looks like this:
$$
\left[\begin{array}{cc|c} 5 & 3 & 93 \\ -4 & -2 & -66 \end{array}\right]
$$
This is called an augmented matrix because we
combined 2 matrices (a matrix and a vector for this system). In this case, we
combined the 2 by 2 coefficient matrix which is made of the coefficients for our
unknowns and the 2 by 1 matrix from the right-hand side of the equations into
one 2 by 3 matrix. In other words, we put A to the left of the bar and
put b to the right of the bar. The application of an ERO to the augmented
matrix does not change the solution set of the linear system that the augmented
matrix represents because whatever you do to the left side of an equation, you
also do to the right side. Therefore, we will arrive at the same solution
whether we use augmented matrices or not, and augmented matrices are more
compact to write. Using matrix notation, our goal is to transform our system into one that looks like the following:
$$
\left[\begin{array}{cc|c} 1 & 0 & a \\ 0 & 1 & b \end{array}\right]
$$
In other words, we want the identity matrix to the left of the bar and the solution to the right of the bar.
Remark The bar is not a formal part of the matrix, so it is not necessary. It is placed there so that we can refer to the different parts of the augmented matrix and the linear system that it represents.
Let's use EROs to obtain a system of this form. It is a good idea to write notes to yourself about what you do in each step. This helps you locate and correct a mistake if you make one. It also helps you to explain your work. In this book, r1 represents row 1.
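The row operations for this system can be sketched in Python as follows. This is one possible sequence of EROs, chosen for this illustration and not necessarily the order used in the original worked steps. The helper names `scale` and `add_multiple` are hypothetical, and the code counts rows from 0, so r1 is row index 0.

```python
from fractions import Fraction

# Augmented matrix [A | b] for 5x1 + 3x2 = 93 and -4x1 - 2x2 = -66.
M = [[Fraction(5), Fraction(3), Fraction(93)],
     [Fraction(-4), Fraction(-2), Fraction(-66)]]

def scale(M, i, c):
    """ERO: multiply row i by the nonzero constant c."""
    M[i] = [c * v for v in M[i]]

def add_multiple(M, i, j, c):
    """ERO: replace row i with row i + c * row j."""
    M[i] = [vi + c * vj for vi, vj in zip(M[i], M[j])]

scale(M, 0, Fraction(1, 5))             # r1 <- (1/5) r1      gives [1, 3/5 | 93/5]
add_multiple(M, 1, 0, 4)                # r2 <- r2 + 4 r1     gives [0, 2/5 | 42/5]
scale(M, 1, Fraction(5, 2))             # r2 <- (5/2) r2      gives [0, 1 | 21]
add_multiple(M, 0, 1, Fraction(-3, 5))  # r1 <- r1 - (3/5) r2 gives [1, 0 | 6]

# The identity is now to the left of the bar, so the last column is the solution.
solution = [row[-1] for row in M]
print(solution)
```

Exact fractions are used so that the intermediate rows match what you would write by hand, with no rounding.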
When we convert this from augmented matrix notation back to the
algebraic notation for a system of equations, it looks like this:
$$
\begin{aligned}
1x_1 + 0x_2 &= 6 \\
0x_1 + 1x_2 &= 21
\end{aligned}
$$
This tells us that $x_1 = 6$ and $x_2 = 21$. Substitute this solution into the system to assure yourself that we are correct. If we systematically use elementary row operations to obtain the identity matrix to the left of the bar, we call this the Gauss-Jordan elimination method.
Now, let's solve the system
$$
\begin{aligned}
5x_1 + 3x_2 &= 70 \\
-4x_1 - 2x_2 &= -56
\end{aligned}
$$
using Gauss-Jordan elimination.
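A minimal Python sketch of Gauss-Jordan elimination for this system is given below. The function name `gauss_jordan` is an assumption for this illustration, and the routine assumes a square matrix that has an inverse. It first clears the entries below the diagonal and then the entries above it.

```python
from fractions import Fraction

def gauss_jordan(A, b):
    """Solve A x = b by reducing [A | b] to [I | x] with EROs.

    Assumes A is square and invertible; raises ValueError otherwise.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]

    # Forward pass: ones on the diagonal, zeros below it.
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution: no usable pivot")
        M[col], M[pivot] = M[pivot], M[col]      # interchange two rows
        p = M[col][col]
        M[col] = [v / p for v in M[col]]         # scale the pivot row
        for r in range(col + 1, n):
            factor = M[r][col]
            M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]

    # Backward pass: zeros above the diagonal.
    for col in range(n - 1, 0, -1):
        for r in range(col):
            factor = M[r][col]
            M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]

    return [row[-1] for row in M]

x = gauss_jordan([[5, 3], [-4, -2]], [70, -56])
print(x)
```

Calling the same function with the right-hand side `[93, -66]` repeats the same row operations on the earlier system.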
Look back at the two systems of equations that we solved. How
are they similar? We performed the same steps both times because the steps
involved in solving a system of equations depend only on the matrix that is to
the left of the bar. If we want to solve a system of equations with the same
matrix A for different b vectors that we will be given at a later
time, it would be nice if we did not have to do Gauss-Jordan elimination every
time.
Let's look at the scalar version of this equation, ax = b, to help us find a general method for matrices. We know that $x = a^{-1}b$ if a ≠ 0 because $a^{-1} = 1/a$, where $a^{-1}$ is called the multiplicative inverse or the reciprocal. There is something analogous to this with matrices. It is also called the inverse. With scalars, $a^{-1}a = aa^{-1} = 1$.
Definition The matrix $A^{-1}$ (called A inverse) is the inverse of A if $A^{-1}A = AA^{-1} = I$, where I is the identity matrix.
Once we find $A^{-1}$, Ax = b can be solved by matrix multiplication rather than Gauss-Jordan elimination. We follow the algebraic steps below to find an expression for x:
$$
\begin{aligned}
Ax &= b \\
A^{-1}Ax &= A^{-1}b \\
Ix &= A^{-1}b \\
x &= A^{-1}b
\end{aligned}
$$
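This means that if we find $A^{-1}$, we only need to multiply to solve systems with the same matrix A for different b vectors. Please remember that matrix multiplication is not commutative: $A^{-1}b$ is not the same as $bA^{-1}$ (which is not even defined when b is a column vector with more than one entry), so you must multiply in the correct order.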
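The idea of reusing one inverse for several right-hand sides can be sketched in Python as follows. The inverse `A_inv` of the matrix from the earlier examples is stated here without derivation, as an assumption that the code checks against the definition. The helper names `mat_mul` and `mat_vec` are hypothetical.

```python
from fractions import Fraction

A = [[5, 3],
     [-4, -2]]

# Assumed inverse of A, stated without derivation and checked below.
A_inv = [[Fraction(-1), Fraction(-3, 2)],
         [Fraction(2), Fraction(5, 2)]]

def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def mat_vec(M, v):
    """Multiply a matrix by a column vector stored as a flat list."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

identity = [[1, 0], [0, 1]]
# The definition requires both products to equal I.
assert mat_mul(A_inv, A) == identity
assert mat_mul(A, A_inv) == identity

# Each new right-hand side now costs only one multiplication.
x_first = mat_vec(A_inv, [93, -66])
x_second = mat_vec(A_inv, [70, -56])
print(x_first, x_second)
```

The elimination work is done once, when the inverse is found; every later b vector needs only a matrix-vector product.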
Remark If we have all the b vectors at the time when we wish to solve the system, we can simply augment all the b vectors together on the right side of the bar. Then the solution for each b vector will fall in the column that originally contained that b vector. For example, if we wished to solve Ax = b and Ax = c for the same A matrix, we could use the augmented matrix
$$
\left[\begin{array}{c|cc} A & b & c \end{array}\right]
$$
When the matrix to the left of the bar reaches the identity matrix by use of EROs, the solution to Ax = b will be in the first column to the right of the bar, and the solution to Ax = c will be in the second column to the right of the bar. Now you may wonder why we should ever need an inverse. If we do not have all the right-hand sides at the time when we solve the problem, we should find $A^{-1}$ and multiply as indicated earlier. This situation often occurs when the solution to one system is the right-hand side of the next system.
Let's find $A^{-1}$ for the same matrix that we have been using,
$$
A = \begin{bmatrix} 5 & 3 \\ -4 & -2 \end{bmatrix}.
$$
We can do this by solving the equation AX = I for the n by n matrix X. Because we know that $AA^{-1} = I$, we know that our solution, X, is the same as $A^{-1}$.
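The remark about placing several right-hand sides to the right of the bar can be sketched in Python as follows; solving AX = I is the same idea with the columns of I as the right-hand sides. Here the two right-hand sides from the earlier examples play the roles of b and c, which is a choice made for this illustration. The helper names are hypothetical.

```python
from fractions import Fraction

# [A | b c]: both right-hand sides share the coefficient matrix A.
M = [[Fraction(5), Fraction(3), Fraction(93), Fraction(70)],
     [Fraction(-4), Fraction(-2), Fraction(-66), Fraction(-56)]]

def scale(M, i, c):
    """ERO: multiply row i by the nonzero constant c."""
    M[i] = [c * v for v in M[i]]

def add_multiple(M, i, j, c):
    """ERO: replace row i with row i + c * row j."""
    M[i] = [vi + c * vj for vi, vj in zip(M[i], M[j])]

# The row operations depend only on A, so one pass handles both columns.
scale(M, 0, Fraction(1, 5))
add_multiple(M, 1, 0, 4)
scale(M, 1, Fraction(5, 2))
add_multiple(M, 0, 1, Fraction(-3, 5))

x_for_b = [row[2] for row in M]  # first column to the right of the bar
x_for_c = [row[3] for row in M]  # second column to the right of the bar
print(x_for_b, x_for_c)
```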
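Remark In computational mathematics, the inverse is very seldom found because other methods exist that serve the same purpose and require fewer steps. However, the inverse will serve our needs at this level and is important in the theory of matrices.
Using the Gauss-Jordan elimination method, let's find $A^{-1}$ where
$$
A = \begin{bmatrix} 5 & 3 \\ -4 & -2 \end{bmatrix}.
$$
We augment A with the identity matrix and apply EROs until the identity matrix appears to the left of the bar:
$$
\left[\begin{array}{cc|cc} 5 & 3 & 1 & 0 \\ -4 & -2 & 0 & 1 \end{array}\right]
\;\longrightarrow\;
\left[\begin{array}{cc|cc} 1 & 0 & -1 & -1.5 \\ 0 & 1 & 2 & 2.5 \end{array}\right]
$$
Therefore,
$$
A^{-1} = \begin{bmatrix} -1 & -1.5 \\ 2 & 2.5 \end{bmatrix}.
$$
Multiply $AA^{-1}$ and $A^{-1}A$ to convince yourself that both products equal I.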
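Finding the inverse by reducing [A | I] can be sketched in Python as follows. The function names `invert` and `mat_mul` are assumptions for this illustration. For brevity, this version clears each column above and below the pivot in one pass, which may differ from the order of steps in a hand calculation.

```python
from fractions import Fraction

def invert(A):
    """Return the inverse of a square matrix by reducing [A | I] to [I | X].

    Raises ValueError if no inverse exists.
    """
    n = len(A)
    # Build the augmented matrix [A | I] with exact fractions.
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix has no inverse")
        M[col], M[pivot] = M[pivot], M[col]      # interchange two rows
        p = M[col][col]
        M[col] = [v / p for v in M[col]]         # scale the pivot row
        for r in range(n):
            if r != col:                         # clear the rest of the column
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]                # columns to the right of the bar

def mat_mul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[5, 3], [-4, -2]]
A_inv = invert(A)
identity = [[1, 0], [0, 1]]
assert mat_mul(A, A_inv) == identity
assert mat_mul(A_inv, A) == identity
print(A_inv)
```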
Did you notice that there was a pattern to our elimination? Look at the example with a 3 by 3 matrix to see if you can find the pattern.
Notice that we obtain all the zeros below the main diagonal before we work to get any zeros above the main diagonal. Other books tell you to obtain all the zeros needed for a column above and below the diagonal before you move to the next column. That method makes the problem easier to code on a computer, but the method that we used often requires fewer calculations.
WARNING: We know that $a^{-1}$ is not defined when a = 0. It is also true that $A^{-1}$ is not always defined. Is it possible to find a unique solution to the system if the matrix A does not have an inverse? No, it is not. You will learn more about this in Chapter 6.
We know that we can use the Gauss-Jordan elimination method to solve a system
of equations using matrices, but we don't really have to do all that work if we
are only trying to solve a system of linear equations. It is true that it is
easy to solve a system if the identity matrix is to the left of the bar because
you can just read off the answer. However, it is also fairly easy if the matrix
to the left of the bar is upper triangular because you can read the last element
of the solution and substitute it into the previous equation to obtain another
element. Repeated use of substitution will yield the entire solution. Therefore,
there is a method called Gaussian elimination that
stops row operations after you have an upper triangular matrix to the left of
the bar. At that point, you use back-substitution to
find the remaining values of the solution. This is very similar to the way you
learned to solve systems of equations algebraically: once you find the value of
one unknown, you substitute it everywhere to decrease the size of your system. Let's go
back to our original 2 by 2 matrix example in this section.
In Gaussian elimination, we can stop performing row operations
now since we have an upper triangular matrix to the left of the bar. When we
translate from the augmented matrix into a system of equations, we get
$$
\begin{aligned}
x_1 + 0.6x_2 &= 18.6 \\
x_2 &= 21
\end{aligned}
$$
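Gaussian elimination followed by back-substitution can be sketched in Python as follows. The function names `gaussian_elimination` and `back_substitute` are assumptions for this illustration, and the routine assumes a square system with a unique solution. The pivot rows are scaled so that the diagonal entries are 1, which matches the triangular system above.

```python
from fractions import Fraction

def gaussian_elimination(A, b):
    """Reduce [A | b] to upper triangular form with ones on the diagonal.

    Assumes A is square with a unique solution; raises ValueError otherwise.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution: no usable pivot")
        M[col], M[pivot] = M[pivot], M[col]      # interchange two rows
        p = M[col][col]
        M[col] = [v / p for v in M[col]]         # scale the pivot row
        for r in range(col + 1, n):              # zeros below the diagonal only
            factor = M[r][col]
            M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return M

def back_substitute(M):
    """Solve an upper triangular augmented system whose diagonal entries are 1."""
    n = len(M)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        # Move the already-known unknowns to the right-hand side.
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = M[i][n] - known
    return x

U = gaussian_elimination([[5, 3], [-4, -2]], [93, -66])
x = back_substitute(U)
print(U, x)
```

The last row gives the last unknown directly, and each earlier row then has only one unknown left to solve for.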