Equations

Solving equations is an important part of mathematics. If we are working with more than one unknown at a time, we need to solve systems of equations. You may already know how to solve a system of linear equations, but matrices provide a more compact way to arrive at the solution. Matrices are also easier to manipulate on a computer or calculator. Both of these facts will become more important when you work with larger systems.

Let's look at a system of linear equations. The system

5x1 + 3x2   =   93
-4x1 - 2x2   =   -66

can be written in matrix form as AX = B where

A = [  5   3 ]     X = [ x1 ]     B = [  93 ]
    [ -4  -2 ],        [ x2 ],        [ -66 ].

However, you will usually see Ax = b rather than AX = B because most authors use small letters to represent vectors. You can multiply this out to convince yourself that AX = B does represent this system.

When you learned to solve systems of linear equations, you learned that

  1. you arrive at the same solution no matter which equation you write first;
  2. the solution doesn't change if you multiply an equation by a scalar other than zero; and
  3. you can replace an equation with the sum of that equation and another equation without changing the solution.
These may not be exactly the words you used when you were solving a system of linear equations, but you did all these things. Experiment with the system above to convince yourself that these statements are true.

We can also solve this system entirely in matrix form. We use the same rules, and we call them elementary row operations (EROs). The EROs tell us that we can

  1. interchange any two rows;
  2. multiply any row by a non-zero scalar; and
  3. replace any row by the sum of that row and any other row.
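
As a minimal sketch, the three EROs can be written as small Python functions. The names swap_rows, scale_row and add_row are illustrative choices, not notation from this text, and each equation of the system above is assumed to be stored as a row holding its two coefficients followed by its right-hand side.

```python
from fractions import Fraction

def swap_rows(m, i, j):
    """ERO 1: interchange rows i and j (0-based indices)."""
    m = [row[:] for row in m]
    m[i], m[j] = m[j], m[i]
    return m

def scale_row(m, i, c):
    """ERO 2: multiply row i by a non-zero scalar c."""
    if c == 0:
        raise ValueError("the scalar must be non-zero")
    m = [row[:] for row in m]
    m[i] = [c * v for v in m[i]]
    return m

def add_row(m, i, j):
    """ERO 3: replace row i by the sum of row i and row j."""
    m = [row[:] for row in m]
    m[i] = [a + b for a, b in zip(m[i], m[j])]
    return m

# Each row holds one equation: coefficient of x1, coefficient of x2, right-hand side.
system = [[Fraction(5), Fraction(3), Fraction(93)],
          [Fraction(-4), Fraction(-2), Fraction(-66)]]

swapped = swap_rows(system, 0, 1)              # the equations trade places
scaled = scale_row(system, 0, Fraction(1, 5))  # first equation divided by 5
summed = add_row(system, 1, 0)                 # second row becomes x1 + x2 = 27
```

Each function returns a new list of rows and leaves its input untouched, so the three operations can be tried independently on the same starting system.
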
Proper use of EROs will leave us with a system that has the same solution as our original system, but is much easier to solve. If you were presented the system

x1   =   a
x2   =   b


you would be able to "solve" it instantly because you only have to read off the solution. If this system were written using matrix notation, it would look like this:

[ 1  0 ] [ x1 ]   [ a ]
[ 0  1 ] [ x2 ] = [ b ]

The matrix

[ 1  0 ]
[ 0  1 ]

is the 2 by 2 identity matrix. Because you can just read off the solution when a system is in this form, our first goal is to transform our system into this form.

Let's solve the system above using matrices. We can represent this entire system with a 2 by 3 matrix which looks like this:

[  5   3 |  93 ]
[ -4  -2 | -66 ]

This is called an augmented matrix because we combined 2 matrices (a matrix and a vector for this system). In this case, we combined the 2 by 2 coefficient matrix, which is made of the coefficients for our unknowns, and the 2 by 1 matrix from the right-hand side of the equations into one 2 by 3 matrix. In other words, we put A to the left of the bar and put b to the right of the bar. The application of an ERO to the augmented matrix does not change the solution set of the linear system that the augmented matrix represents because whatever you do to the left side of an equation, you also do to the right side. Therefore, we will arrive at the same solution whether we use augmented matrices or not, and augmented matrices are more compact to write. Using matrix notation, our goal is to transform our system into one that looks like the following:

[ A | b ]   →   [ I | x ]

In other words, we want the identity matrix to the left of the bar and the solution to the right of the bar.

Remark The bar is not a formal part of the matrix, so it is not necessary. It is placed there so that we can refer to the different parts of the augmented matrix and the linear system that it represents.
Let's use EROs to obtain a system of this form. It is a good idea to write notes to yourself about what you do in each step. This helps you locate and correct your mistake if you make one. It also helps you to explain your work. In this book, r1 represents row 1.

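[  5   3 |  93 ]
[ -4  -2 | -66 ]

(1/5)r1 → r1

[  1  0.6 | 18.6 ]
[ -4  -2  | -66  ]

r2 + 4r1 → r2

[ 1  0.6 | 18.6 ]
[ 0  0.4 |  8.4 ]

(1/0.4)r2 → r2

[ 1  0.6 | 18.6 ]
[ 0   1  | 21   ]

r1 - 0.6r2 → r1

[ 1  0 |  6 ]
[ 0  1 | 21 ]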


When we convert this from augmented matrix notation back to the algebraic notation for a system of equations, it looks like this:

1x1 + 0x2   =   6
0x1 + 1x2   =   21

This tells us that x1 = 6 and x2 = 21. Substitute this solution into the system to assure yourself that we are correct. If we systematically use elementary row operations to obtain the identity matrix to the left of the bar, we call this the Gauss-Jordan elimination method.
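
The substitution check suggested above can also be done by multiplying A by the proposed solution and comparing the result with the right-hand side. The following is a small sketch; the helper name mat_vec is an illustrative choice and not part of this text.

```python
def mat_vec(a, x):
    """Multiply matrix a (a list of rows) by the vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

A = [[5, 3],
     [-4, -2]]
b = [93, -66]
x = [6, 21]   # the solution read off from the final augmented matrix

# 5(6) + 3(21) = 93 and -4(6) - 2(21) = -66, so the product matches b.
print(mat_vec(A, x) == b)  # True
```
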

Now, let's solve the system

5x1 + 3x2   =   70
-4x1 - 2x2   =   -56

using Gauss-Jordan elimination.

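[  5   3 |  70 ]
[ -4  -2 | -56 ]

(1/5)r1 → r1

[  1  0.6 |  14 ]
[ -4  -2  | -56 ]

r2 + 4r1 → r2

[ 1  0.6 | 14 ]
[ 0  0.4 |  0 ]

(1/0.4)r2 → r2

[ 1  0.6 | 14 ]
[ 0   1  |  0 ]

r1 - 0.6r2 → r1

[ 1  0 | 14 ]
[ 0  1 |  0 ]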


Look back at the two systems of equations that we solved. How are they similar? We performed the same steps both times because the steps involved in solving a system of equations depend only on the matrix that is to the left of the bar. If we want to solve a system of equations with the same matrix A for different b vectors that we will be given at a later time, it would be nice if we did not have to do Gauss-Jordan elimination every time.

Let's look at the scalar version of this equation, ax = b, to help us find a general method for matrices. We know that x = a^-1 b if a ≠ 0, where a^-1 = 1/a is called the multiplicative inverse or the reciprocal of a. There is something analogous to this with matrices. It is also called the inverse. With scalars, a^-1 a = a a^-1 = 1.

Definition The matrix A^-1 (called A inverse) is the inverse of A if A^-1 A = A A^-1 = I where I is the identity matrix.
Once we find A^-1, Ax = b can be solved by matrix multiplication rather than Gauss-Jordan elimination. We follow the algebraic steps below to find an expression for x:

Ax   =   b
A^-1 Ax   =   A^-1 b
Ix   =   A^-1 b
x   =   A^-1 b

This means that if we find A^-1, we only need to multiply to solve systems with the same matrix A for different b vectors. Please remember that A^-1 b ≠ b A^-1, so you must multiply in the correct order.

Remark If we have all the b vectors at the time when we wish to solve the system, we can simply augment all the b vectors together on the right side of the bar. Then the solution for each b vector will fall in the column that originally contained that b vector. For example, if we wished to solve Ax = b and Ax = c for the same A matrix, we could use the augmented matrix [ A | b c ]. When the matrix to the left of the bar reaches the identity matrix by use of EROs, the solution to Ax = b will be in the first column to the right of the bar, and the solution to Ax = c will be in the second column to the right of the bar.
Now you may wonder why we should ever need an inverse. If we do not have all the right-hand sides at the time when we solve the problem, we should find A^-1 and multiply as indicated earlier. This situation often occurs when the solution to one system is the right-hand side of the next system.
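
The idea of augmenting several right-hand sides together can be tried on the two systems solved earlier in this section. The sketch below assumes the same four row operations used in those examples; the helper names scale and add_multiple are illustrative only.

```python
from fractions import Fraction

# Augmented matrix with two right-hand sides: the columns to the right of
# the bar are (93, -66) and (70, -56).
m = [[Fraction(v) for v in row] for row in [[5, 3, 93, 70],
                                             [-4, -2, -66, -56]]]

def scale(row, c):
    """Return the row multiplied by the scalar c."""
    return [c * v for v in row]

def add_multiple(target, source, c):
    """Return target + c * source, element by element."""
    return [t + c * s for t, s in zip(target, source)]

m[0] = scale(m[0], 1 / m[0][0])             # make the first pivot 1
m[1] = add_multiple(m[1], m[0], -m[1][0])   # zero below the first pivot
m[1] = scale(m[1], 1 / m[1][1])             # make the second pivot 1
m[0] = add_multiple(m[0], m[1], -m[0][1])   # zero above the second pivot

solution_first = [row[2] for row in m]    # x1 = 6,  x2 = 21
solution_second = [row[3] for row in m]   # x1 = 14, x2 = 0
```

One pass of row operations handles both right-hand sides because every operation is chosen by looking only at the entries to the left of the bar.
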
Let's find A^-1 for the same matrix that we have been using,

A = [  5   3 ]
    [ -4  -2 ].

We can do this by solving the equation AX = I for the n by n matrix X. Because we know that AA^-1 = I, we know that our solution, X, is the same as A^-1.

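[  5   3 | 1  0 ]
[ -4  -2 | 0  1 ]

(1/5)r1 → r1

[  1  0.6 | 0.2  0 ]
[ -4  -2  |  0   1 ]

r2 + 4r1 → r2

[ 1  0.6 | 0.2  0 ]
[ 0  0.4 | 0.8  1 ]

(1/0.4)r2 → r2

[ 1  0.6 | 0.2   0  ]
[ 0   1  |  2   2.5 ]

r1 - 0.6r2 → r1

[ 1  0 | -1  -1.5 ]
[ 0  1 |  2   2.5 ]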


Notice that we used the exact same steps again. We now know that

A^-1 = [ -1  -1.5 ]
       [  2   2.5 ].

Remark In computational mathematics, the inverse is very seldom found because other methods exist that serve the same purpose and require fewer steps. However, the inverse will serve our needs at this level and is important in the theory of matrices.
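
As a sketch of how the inverse is put to work, the code below checks an inverse for the 2 by 2 matrix of this section and then solves both earlier systems by multiplication alone. The entries of A_inv were worked out for this example by reducing A augmented with the identity; the code confirms them by multiplying. The helper name mat_mul is an illustrative choice.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[5, 3], [-4, -2]]
A_inv = [[Fraction(-1), Fraction(-3, 2)],
         [Fraction(2), Fraction(5, 2)]]
identity = [[1, 0], [0, 1]]

# Both products must equal the identity for A_inv to be the inverse of A.
inverse_checks = (mat_mul(A, A_inv) == identity, mat_mul(A_inv, A) == identity)

# Right-hand sides are written as 2 by 1 matrices (column vectors).
x_first = mat_mul(A_inv, [[93], [-66]])    # x1 = 6,  x2 = 21
x_second = mat_mul(A_inv, [[70], [-56]])   # x1 = 14, x2 = 0
```

Once the inverse is known, each new right-hand side costs one matrix multiplication and no further row operations.
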
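Using the Gauss-Jordan elimination method, let's find A^-1 for a 3 by 3 matrix A.

[figure: the 3 by 3 matrix A and the row operations that reduce [ A | I ] to [ I | A^-1 ]]


Therefore, A^-1 is the matrix to the right of the bar in the last step. Multiply AA^-1 and A^-1A to convince yourself that they both multiply to I.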

Did you notice that there was a pattern to our elimination? Look at the example with a 3 by 3 matrix to see if you can find the pattern.

  1. Begin with the first row. Let i = 1.

  2. Check to see if the pivot for row i is zero. The pivot is the element of the main diagonal that is on the current row. For instance, if you are working with row i, then the pivot element is aii. If the pivot is zero, exchange that row with a row below it that does not contain a zero in column i. If this is not possible, then an inverse to that matrix does not exist.

  3. Divide every element of row i by the pivot.

  4. For every row below row i, replace that row with the sum of that row and a multiple of row i so that each new element in column i below row i is zero.

  5. Let i = i + 1. This means that you move to the next row and column. Repeat steps 2 through 5 until you have zeros for every element below the main diagonal. Now you have a matrix to the left of the bar that is called upper triangular because all the non-zero numbers fall in the triangle above and including the main diagonal.

  6. Now we work to get zeros above the main diagonal. The index i should be equal to the number of rows.

  7. For every row above row i, replace that row with the sum of that row and a multiple of row i so that each new element in column i above row i is zero. You will notice that the zeros below the main diagonal are still zeros.

  8. Let i = i - 1. This means that you move to the left one column and up a row. Repeat steps 7 and 8 until you have zeros for every element above the main diagonal. Since the zeros below the main diagonal did not change, you now have a diagonal matrix to the left of the bar because all the non-zero elements lie on the main diagonal. Since all the elements along the diagonal of this diagonal matrix are the number one, this matrix is the identity matrix. Therefore, the matrix to the right of the bar is our solution.

Notice that we obtain all the zeros below the main diagonal before we work to get any zeros above the main diagonal. Other books tell you to obtain all the zeros needed for a column above and below the diagonal before you move to the next column. That method makes the problem easier to code on a computer, but the method that we used often requires fewer calculations.
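
The eight steps above can be sketched in Python as follows. This is one possible reading of the procedure, not a definitive implementation: the function name gauss_jordan is an assumption, rows are numbered from 0 rather than 1, and exact fractions are used so that no rounding appears.

```python
from fractions import Fraction

def gauss_jordan(aug):
    """Reduce the augmented matrix aug (a list of rows) until the identity is
    to the left of the bar; return the columns to the right of the bar."""
    n = len(aug)
    m = [[Fraction(v) for v in row] for row in aug]

    # Steps 1-5: move down the diagonal, producing ones on the diagonal
    # and zeros below it.
    for i in range(n):
        if m[i][i] == 0:
            # Step 2: look below for a row with a non-zero entry in column i.
            for r in range(i + 1, n):
                if m[r][i] != 0:
                    m[i], m[r] = m[r], m[i]
                    break
            else:
                raise ValueError("the matrix has no inverse")
        pivot = m[i][i]
        m[i] = [v / pivot for v in m[i]]                  # step 3
        for r in range(i + 1, n):                         # step 4
            factor = m[r][i]
            m[r] = [v - factor * p for v, p in zip(m[r], m[i])]

    # Steps 6-8: move back up, producing zeros above the diagonal.
    for i in range(n - 1, 0, -1):
        for r in range(i):
            factor = m[r][i]
            m[r] = [v - factor * p for v, p in zip(m[r], m[i])]

    return [row[n:] for row in m]

solution = gauss_jordan([[5, 3, 93], [-4, -2, -66]])     # x1 = 6, x2 = 21
inverse = gauss_jordan([[5, 3, 1, 0], [-4, -2, 0, 1]])   # rows of the inverse
```

The same function serves both purposes: with a single column to the right of the bar it returns the solution, and with the identity to the right of the bar it returns the inverse.
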
WARNING: We know that a^-1 is not defined when a = 0. It is also true that A^-1 is not always defined. Is it possible to find a unique solution to the system if the matrix A does not have an inverse? No, it is not. You will learn more about this in Chapter 6.

We know that we can use the Gauss-Jordan elimination method to solve a system of equations using matrices, but we don't really have to do all that work if we are only trying to solve a system of linear equations. It is true that it is easy to solve a system if the identity matrix is to the left of the bar because you can just read off the answer. However, it is also fairly easy if the matrix to the left of the bar is upper triangular because you can read the last element of the solution and substitute it into the previous equation to obtain another element. Repeated use of substitution will yield the entire solution. Therefore there is a method called Gaussian elimination that stops row operations after you have an upper triangular matrix to the left of the bar. At that point, you use back-substitution to find the remaining values of the solution. This is very similar to the way you learned to solve systems of equations algebraically. Once you find a solution, you substitute it in everywhere to decrease the size of your system. Let's go back to our original 2 by 2 matrix example in this section.

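[  5   3 |  93 ]
[ -4  -2 | -66 ]

(1/5)r1 → r1

[  1  0.6 | 18.6 ]
[ -4  -2  | -66  ]

r2 + 4r1 → r2

[ 1  0.6 | 18.6 ]
[ 0  0.4 |  8.4 ]

(1/0.4)r2 → r2

[ 1  0.6 | 18.6 ]
[ 0   1  | 21   ]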


In Gaussian elimination, we can stop performing row operations now since we have an upper triangular matrix to the left of the bar. When we translate from the augmented matrix into a system of equations, we get

x1 + 0.6x2   =   18.6
    x2   =   21

We can read from the second equation that x2 = 21. We substitute 21 for x2 into the first equation to get x1 + 0.6(21) = 18.6, so x1 = 6. This is the same solution as before, and Gaussian elimination requires fewer operations than does Gauss-Jordan elimination. Try this with the 3 by 3 matrix to see that you get the same solution. You can see for a 3 by 3 or larger matrix that fewer steps are required. In fact, for an n by n system, Gaussian elimination requires approximately n^3/3 steps and Gauss-Jordan elimination requires approximately n^3/2 steps.
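
Gaussian elimination followed by back-substitution might be sketched as below. The function name gaussian_solve and the use of exact fractions are assumptions made for illustration; the forward pass stops as soon as the matrix to the left of the bar is upper triangular.

```python
from fractions import Fraction

def gaussian_solve(a, b):
    """Solve Ax = b by Gaussian elimination followed by back-substitution."""
    n = len(a)
    # Build the augmented matrix with A to the left of the bar and b to the right.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]

    # Forward elimination: ones on the diagonal, zeros below it.
    for i in range(n):
        if m[i][i] == 0:
            for r in range(i + 1, n):
                if m[r][i] != 0:
                    m[i], m[r] = m[r], m[i]
                    break
            else:
                raise ValueError("no unique solution")
        pivot = m[i][i]
        m[i] = [v / pivot for v in m[i]]
        for r in range(i + 1, n):
            factor = m[r][i]
            m[r] = [v - factor * p for v, p in zip(m[r], m[i])]

    # Back-substitution: read off the last unknown, then work upward.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        x[i] = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
    return x

x = gaussian_solve([[5, 3], [-4, -2]], [93, -66])   # x1 = 6, x2 = 21
```

For the 2 by 2 example, the back-substitution loop performs exactly the calculation shown above: it reads x2 = 21 from the last row and then computes x1 = 18.6 - 0.6(21) = 6.
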
