This version extends the DYNPROG product method to GPU Fausts. DYNPROG is a dynamic programming method for computing a Faust product (or a Faust-matrix product). Based on the cost of each product, it chooses the order in which the matrix products composing the full Faust product (aka Faust.toarray) are evaluated, in order to speed up the whole computation. Find more information about this method here.
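To illustrate why the evaluation order matters, here is a minimal sketch of the classic textbook matrix-chain ordering algorithm in plain NumPy. It is not the FAµST implementation: the function names (`chain_order`, `multiply`, `l2r_cost`) are hypothetical, and the cost model assumes dense factors and simply counts scalar multiplications, whereas a real implementation would presumably also account for sparsity and storage format.

```python
import numpy as np

def chain_order(dims):
    """Factor i has shape (dims[i], dims[i + 1]).
    Returns the DP cost table and the table of best split points."""
    n = len(dims) - 1
    cost = [[0] * n for _ in range(n)]
    split = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # length of the sub-chain
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = float("inf")
            for k in range(i, j):           # try every split (i..k)(k+1..j)
                c = (cost[i][k] + cost[k + 1][j]
                     + dims[i] * dims[k + 1] * dims[j + 1])
                if c < cost[i][j]:
                    cost[i][j], split[i][j] = c, k
    return cost, split

def multiply(factors, split, i, j):
    """Multiply factors i..j following the split points found by the DP."""
    if i == j:
        return factors[i]
    k = split[i][j]
    return multiply(factors, split, i, k) @ multiply(factors, split, k + 1, j)

def l2r_cost(dims):
    """Scalar multiplications of a plain left-to-right evaluation."""
    return sum(dims[0] * dims[k] * dims[k + 1] for k in range(1, len(dims) - 1))

# Illustrative shapes only: 50x5, 5x100 and 100x10 dense factors.
dims = [50, 5, 100, 10]
rng = np.random.default_rng(0)
factors = [rng.standard_normal((dims[i], dims[i + 1])) for i in range(len(dims) - 1)]

cost, split = chain_order(dims)
n = len(factors)
product = multiply(factors, split, 0, n - 1)
print("left-to-right cost:", l2r_cost(dims))
print("dynamic programming cost:", cost[0][n - 1])
```

With these shapes, multiplying the last two factors first is ten times cheaper in scalar multiplications than going left to right, while the resulting matrix is the same.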
The table below compares the computation times obtained with the DYNPROG and DEFAULT_L2R methods (the latter essentially computes the product from left to right). Four Fausts are tested: they are dense, sparse (CSR or BSR matrices), or mixed (dense + sparse matrices).
The data for the table below was generated with this script (please note that the script is randomized, so the results won't be exactly the same from one run to another).
| Faust name | Product method | Device | Faust matrix type | Comp. time (100 products) | Speedup vs DEFAULT_L2R | Number of factors |
|---|---|---|---|---|---|---|
| F2 | DEFAULT_L2R | GPU GTX980 | sparse (CSR) | 43.4 | 1 | 32 |
| F2 | DYNPROG | GPU GTX980 | sparse (CSR) | 17.7 | 2.45 | 32 |
| F3 | DEFAULT_L2R | GPU GTX980 | sparse (BSR) | 2.64 | 1 | 5 |
| F3 | DYNPROG | GPU GTX980 | sparse (BSR) | 1.75 | 1.5 | 5 |
| F4 | DEFAULT_L2R | GPU GTX980 | mixed (BSR + dense) | 6.26 | 1 | 14 |
| F4 | DYNPROG | GPU GTX980 | mixed (BSR + dense) | 2.77 | 2.26 | 14 |
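
The speedup column appears to be the DEFAULT_L2R time divided by the DYNPROG time for the same Faust. The short check below recomputes that ratio from the times in the table above; the variable names are only for illustration, and the ratio agrees with the listed values up to rounding.

```python
# (DEFAULT_L2R time, DYNPROG time, listed speedup), copied from the table above
rows = {
    "F2": (43.4, 17.7, 2.45),
    "F3": (2.64, 1.75, 1.5),
    "F4": (6.26, 2.77, 2.26),
}

# Assumed definition: speedup = DEFAULT_L2R time / DYNPROG time
speedups = {name: t_l2r / t_dyn for name, (t_l2r, t_dyn, _) in rows.items()}

for name, s in speedups.items():
    print(f"{name}: {s:.2f}x (table lists {rows[name][2]})")
```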