CS 662 Theory of Parallel Algorithms
Numerical Methods part 2
San Diego State University -- This page last updated April 30, 1996
Contents of Numerical Methods part 2 Lecture
- Linear Equations
Matrix Matrix Multiplication and Scan
Let A and B be n*n matrices
Compute C = A*B
We have
- $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$
Scan can compute $c_{ij}$ in log(n) time using n/2 processors
Since there are $n^2$ elements of C
Scan takes log(n) time with $n^3/2$ processors to compute C
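The log(n) bound can be illustrated with a small sequential simulation. This is only a sketch: the names `tree_sum` and `matmul_by_reduction` are made up for the illustration, and each pass of the `while` loop stands in for one parallel step in which every pair is added by its own processor.

```python
def tree_sum(values):
    """Sum values by pairwise combination and return (total, rounds).

    Each round models one parallel step: all pairs are added at once,
    so n values need about log2(n) rounds and at most n/2 processors.
    """
    level = list(values)
    rounds = 0
    while len(level) > 1:
        # Add neighbouring pairs; an unpaired last element is carried forward.
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            paired.append(level[-1])
        level = paired
        rounds += 1
    return level[0], rounds


def matmul_by_reduction(A, B):
    """Compute C = A*B, reducing the n products of each entry with tree_sum."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    max_rounds = 0
    for i in range(n):
        for j in range(n):
            # On a parallel machine all n*n entries would be reduced at once.
            products = [A[i][k] * B[k][j] for k in range(n)]
            C[i][j], rounds = tree_sum(products)
            max_rounds = max(max_rounds, rounds)
    return C, max_rounds
```

Because the $n^2$ reductions are independent, the number of rounds for the whole product equals the number of rounds for a single entry.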
Slowdown
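Let A, B, and C be n*n matrices such that
- C = A*B
Let $A_{ij}$, $B_{ij}$, and $C_{ij}$ be (n/2)*(n/2) matrices such that
- $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$, $B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$, and $C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}$
Then
- $C_{11} = A_{11}B_{11} + A_{12}B_{21}$
- $C_{12} = A_{11}B_{12} + A_{12}B_{22}$
- $C_{21} = A_{21}B_{11} + A_{22}B_{21}$
- $C_{22} = A_{21}B_{12} + A_{22}B_{22}$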
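The block decomposition can be applied recursively. The sketch below assumes n is a power of two; the helper names (`split`, `join`, `add`, `block_multiply`) are invented for this example and matrices are plain lists of rows.

```python
def add(X, Y):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Return the four (n/2)*(n/2) blocks M11, M12, M21, M22 of M."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def join(C11, C12, C21, C22):
    """Reassemble a matrix from its four blocks."""
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom


def block_multiply(A, B):
    """Multiply n*n matrices (n a power of two) using 8 half-size products."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    C11 = add(block_multiply(A11, B11), block_multiply(A12, B21))
    C12 = add(block_multiply(A11, B12), block_multiply(A12, B22))
    C21 = add(block_multiply(A21, B11), block_multiply(A22, B21))
    C22 = add(block_multiply(A21, B12), block_multiply(A22, B22))
    return join(C11, C12, C21, C22)
```

The eight half-size products are independent of one another, so they can be handed to separate groups of processors.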
Strassen's Method
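Let A, B, and C be 2*2 matrices such that
- C = A*B
- $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$, $B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}$, and $C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}$
Define the sums and products: [definitions not recovered from the original figure]
Then we have: [expressions for the entries of C not recovered from the original figure]
Using 7 multiplications and 15 additions
On n*n matrices the above method takes O($n^{\log_2 7}$), about O($n^{2.81}$), operations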
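The figure with the sums and products is not available here. A count of 7 multiplications and 15 additions matches Winograd's variant of Strassen's method, so the sketch below assumes that variant; the names `s`, `t`, `p`, `u` and the function name `winograd_2x2` are choices made for this illustration, not necessarily the notation of the lecture.

```python
def winograd_2x2(a, b):
    """Multiply two 2*2 matrices with 7 multiplications and 15 additions.

    Assumed scheme: Winograd's variant of Strassen's method.
    """
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    # 8 additions/subtractions on the inputs.
    s1 = a21 + a22
    s2 = s1 - a11
    s3 = a11 - a21
    s4 = a12 - s2
    t1 = b12 - b11
    t2 = b22 - t1
    t3 = b22 - b12
    t4 = t2 - b21

    # 7 multiplications (entries of a always on the left).
    p1 = a11 * b11
    p2 = a12 * b21
    p3 = s4 * b22
    p4 = a22 * t4
    p5 = s1 * t1
    p6 = s2 * t2
    p7 = s3 * t3

    # 7 additions/subtractions on the products.
    u1 = p1 + p2  # c11
    u2 = p1 + p6
    u3 = u2 + p7
    u4 = u2 + p5
    u5 = u4 + p3  # c12
    u6 = u3 - p4  # c21
    u7 = u3 + p5  # c22
    return [[u1, u5], [u6, u7]]
```

Every product keeps the A-term on the left and the B-term on the right, so the same formulas apply when the entries are themselves (n/2)*(n/2) blocks, which is what gives the recursive operation count.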
Linear Equations
Solving a system of linear equations like:
- $a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1$
- $a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2$
- $\vdots$
- $a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = b_n$
can be reduced to solving Ax = b for x where:
- $A = (a_{ij})$ is the n*n matrix of coefficients
- $x = (x_1, x_2, \ldots, x_n)^T$ and $b = (b_1, b_2, \ldots, b_n)^T$
Gauss-Jordan Method
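(a is the n*(n+1) augmented matrix, with a[h, n + 1] = b[h])
for j = 1 to n do
    for h = 1 to n do
        if (h != j) then
            m = a[h, j] / a[j, j]
            for k = j to n + 1 do
                a[h, k] = a[h, k] - m * a[j, k]
            end for
        end if
    end for
end for
for j = 1 to n do
    x[j] = a[j, n + 1] / a[j, j]
end for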
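A minimal runnable sketch of the sequential method follows. It assumes every pivot a[j][j] is nonzero (no row exchanges are done) and uses exact rational arithmetic so the result can be checked exactly; the function name `gauss_jordan` is chosen for this example.

```python
from fractions import Fraction


def gauss_jordan(A, b):
    """Solve Ax = b by Gauss-Jordan elimination without pivoting.

    Assumes every pivot a[j][j] is nonzero when it is used.
    """
    n = len(A)
    # Augmented matrix: columns 0..n-1 hold A, column n holds b.
    a = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    for j in range(n):
        for h in range(n):
            if h != j:
                # Save the multiplier first: a[h][j] becomes 0 at k == j.
                m = a[h][j] / a[j][j]
                for k in range(j, n + 1):
                    a[h][k] -= m * a[j][k]
    # The matrix is now diagonal, so each unknown is a single division.
    return [a[j][n] / a[j][j] for j in range(n)]
```

For example, `gauss_jordan([[2, 1], [1, 3]], [3, 5])` solves 2x + y = 3, x + 3y = 5.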
Gauss-Jordan Method - SIMD
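for j = 1 to n do
    for h = 1 to n do in parallel
        for k = j to n + 1 do in parallel
            if (h != j) then
                a[h, k] = a[h, k] - (a[h, j] / a[j, j]) * a[j, k]
            end if
        end for
    end for
end for
for j = 1 to n do in parallel
    x[j] = a[j, n + 1] / a[j, j]
end for
(within one parallel step, all processors read their operands before any processor writes)
T(n) = O(n)
P(n) = O($n^2$)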
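The synchronous behaviour of the SIMD version can be imitated sequentially by reading from a snapshot taken at the start of each step. This is a sketch under the same no-pivoting assumption (nonzero pivots); `gauss_jordan_simd` and the `steps` counter are illustrative names, and the two inner loops stand for work done by separate processors in one time step.

```python
from fractions import Fraction


def gauss_jordan_simd(A, b):
    """Simulate SIMD Gauss-Jordan; return (x, number of parallel steps)."""
    n = len(A)
    a = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    steps = 0
    for j in range(n):
        # Snapshot: every simulated processor reads pre-step values.
        old = [row[:] for row in a]
        for h in range(n):                 # "in parallel"
            for k in range(j, n + 1):      # "in parallel"
                if h != j:
                    a[h][k] = old[h][k] - (old[h][j] / old[j][j]) * old[j][k]
        steps += 1
    # One more parallel step: all n divisions at once.
    x = [a[j][n] / a[j][j] for j in range(n)]
    steps += 1
    return x, steps
```

The step count is n + 1, in line with T(n) = O(n), and step j uses at most n(n + 1) simulated processors.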
Another Approach
If Ax = b then x = $A^{-1}b$
So just compute $A^{-1}$
Special Case
Let A be a nonsingular lower triangular n*n matrix
Define $A_i$, for i = 1 to n, to be the identity matrix with column i replaced by column i of A
We have:
- 1) $A = A_1 A_2 \cdots A_n$
- 2) $A_i^{-1}$ is the identity matrix with column i replaced by $(0, \ldots, 0, 1/a_{ii}, -a_{i+1,i}/a_{ii}, \ldots, -a_{n,i}/a_{ii})^T$
- 3) if A = BC then $A^{-1} = C^{-1}B^{-1}$
- 4) $A^{-1} = A_n^{-1} \cdots A_2^{-1} A_1^{-1}$
So how long does it take to compute $A^{-1}$?
Note that A(BC) = (AB)C when A, B, C are n*n matrices
So we can use a generalized scan operator with matrix multiplication
Sequential matrix multiplication takes O($n^3$) time
So with n processors we can compute $A^{-1}$ in O($n^3$ log(n)) time
But using a scan within a scan to perform all matrix multiplications we can compute $A^{-1}$ in O($\log^2 n$) time using O($n^4$) processors
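The following sketch shows a scan-style reduction with matrix multiplication as the associative operator. It assumes the special case is a nonsingular lower triangular matrix written as a product of n elementary matrices, each differing from the identity in one column; that reading is an assumption, and the names `elementary_inverse`, `tree_product`, and `lower_triangular_inverse` are invented for the example.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain O(n^3) product of two n*n matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def elementary_inverse(A, i):
    """Inverse of the identity matrix with column i replaced by column i of A."""
    n = len(A)
    M = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    M[i][i] = 1 / Fraction(A[i][i])
    for r in range(i + 1, n):
        M[r][i] = -Fraction(A[r][i]) / Fraction(A[i][i])
    return M


def tree_product(mats):
    """Multiply matrices pairwise, keeping their order; return (product, rounds)."""
    level = list(mats)
    rounds = 0
    while len(level) > 1:
        # One round = one parallel step: all adjacent pairs are multiplied at once.
        paired = [matmul(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            paired.append(level[-1])
        level = paired
        rounds += 1
    return level[0], rounds


def lower_triangular_inverse(A):
    """Invert lower triangular A as A_n^{-1} ... A_2^{-1} A_1^{-1}."""
    n = len(A)
    factors = [elementary_inverse(A, i) for i in range(n - 1, -1, -1)]
    return tree_product(factors)
```

The number of rounds grows like log(n); if each matrix product inside a round is itself computed by a scan, every round takes log(n) time, which is where the $\log^2 n$ bound comes from.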
Theorem. L. Csanky 1976
- Let A be a nonsingular n*n matrix. Then $A^{-1}$ can be computed in parallel using O($n^4$) processors in O($\log^2 n$) time.
proof.
- Uses some cool math to do some fancy stuff
- Details available on demand
- Method is not practical due to the number of processors required and numerical problems
It is an open problem whether O($\log^2 n$) time is the best possible for computing the inverse of an n*n matrix
It is an open problem whether O($\log^2 n$) time is the best possible for triangular matrices
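x = $A^{-1}b$
General Case
Let $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$ where $A_{ij}$ are (n/2)*(n/2) matrices
We get:
- $A = \begin{pmatrix} I & 0 \\ A_{21}A_{11}^{-1} & I \end{pmatrix} \begin{pmatrix} A_{11} & 0 \\ 0 & S \end{pmatrix} \begin{pmatrix} I & A_{11}^{-1}A_{12} \\ 0 & I \end{pmatrix}$, where $S = A_{22} - A_{21}A_{11}^{-1}A_{12}$
So
- $A^{-1} = \begin{pmatrix} I & -A_{11}^{-1}A_{12} \\ 0 & I \end{pmatrix} \begin{pmatrix} A_{11}^{-1} & 0 \\ 0 & S^{-1} \end{pmatrix} \begin{pmatrix} I & 0 \\ -A_{21}A_{11}^{-1} & I \end{pmatrix}$
Compute $A_{11}^{-1}$ and $S^{-1}$ recursively
This requires O($n^x$) sequential time, where 2 < x < 2.5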
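A recursive inversion along these lines can be sketched as follows. It assumes the standard block factorization with the Schur complement $S = A_{22} - A_{21}A_{11}^{-1}A_{12}$, and that $A_{11}$ and $S$ are nonsingular at every level of the recursion (true, for instance, for symmetric positive definite matrices). The helper names are invented for the example, and plain matrix multiplication is used, so this sketch runs in O($n^3$) rather than O($n^x$).

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two (possibly rectangular) matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def sub(X, Y):
    """Entrywise difference X - Y."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def neg(X):
    """Entrywise negation."""
    return [[-x for x in row] for row in X]


def block_inverse(A):
    """Invert A recursively; assumes A11 and S are nonsingular at each level."""
    n = len(A)
    if n == 1:
        return [[1 / Fraction(A[0][0])]]
    h = n // 2
    A11 = [r[:h] for r in A[:h]]
    A12 = [r[h:] for r in A[:h]]
    A21 = [r[:h] for r in A[h:]]
    A22 = [r[h:] for r in A[h:]]

    A11_inv = block_inverse(A11)                  # first recursive call
    T = matmul(A11_inv, A12)                      # A11^{-1} A12
    U = matmul(A21, A11_inv)                      # A21 A11^{-1}
    S_inv = block_inverse(sub(A22, matmul(U, A12)))  # second recursive call

    # Multiply out the three factors of A^{-1} block by block.
    B21 = neg(matmul(S_inv, U))
    B12 = neg(matmul(T, S_inv))
    B11 = sub(A11_inv, matmul(T, B21))            # A11^{-1} + T S^{-1} U
    top = [left + right for left, right in zip(B11, B12)]
    bottom = [left + right for left, right in zip(B21, S_inv)]
    return top + bottom
```

Apart from the two recursive calls, the work consists of a constant number of half-size matrix products, so the cost of inversion is governed by the cost of matrix multiplication.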