On the debate about GPU #tensor units in #HPC - it looks like we are bound to repeat the history of SIMD units. Those were developed for a very specific use case (MMX for multimedia/graphics) and were then quickly re-purposed by the HPC community. Vendors reacted fast by adding FP64 support ...
... #HPL @top500supercomp numbers followed and compilers were adapted to "vectorize".

Now, tensor units are being introduced for #deeplearning and I don't doubt that they will find their uses in #HPC, especially if vendors react to the specifics of the field, such as sparsity and FP64...
... my worry is more the local optimum we get stuck in. #HPC always had better solutions than SIMD, the so-called "Cray Vectors", and it took more than a decade to get out of it with #SVE and #RISCV!

For tensor units, we have a more generic version with SSRs as a #RISCV extension ...
... https://arxiv.org/abs/1911.08356  and also a sparse variant https://arxiv.org/abs/2011.08070  that allows configurable, multi-dimensional vectorization for tensor computations (with @LucaBeniniZhFe and the @pulp_platform team).
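
To make the idea concrete, here is a minimal conceptual sketch in C (the stream names and configuration shown in the comments are illustrative assumptions, not the actual SSR ISA or intrinsics). The point of stream semantic registers is that affine, multi-dimensional address patterns are configured once in hardware, so the loop body no longer issues explicit loads:

/* Baseline dense matrix-vector product: every inner iteration issues
 * two explicit loads (A and x) plus one FMA. */
#include <stddef.h>

void gemv_baseline(size_t m, size_t n,
                   const double *A, const double *x, double *y) {
    for (size_t i = 0; i < m; ++i) {
        double acc = 0.0;
        for (size_t j = 0; j < n; ++j)
            acc += A[i * n + j] * x[j];
        y[i] = acc;
    }
}

/*
 * With SSR-style streaming (hypothetical configuration, for illustration):
 * the two access patterns are described up front as multi-dimensional
 * streams with bounds and strides, e.g.
 *
 *   stream 0 (A): 2-D, bounds {n, m}, strides {1 element, n elements}
 *   stream 1 (x): 2-D, bounds {n, m}, strides {1 element, 0}  -- x replayed per row
 *
 * and the inner loop body reduces to something like
 *   acc += pop(stream0) * pop(stream1);
 * with no explicit address computation or loads -- that is the configurable,
 * multi-dimensional vectorization referred to above.
 */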

I am now watching for history to be made by the vendors. There is no panacea!