128-Bit Multiplication Performance on Raspberry Pi 4 (ARMv8-A)
UMULH -- A64
The AArch64 processor (aka arm64), part 5: Multiplication and division - The Old New Thing
Cortex-A72 Software Optimization Guide
Using the GNU Compiler Collection (GCC): __int128
c++ - Getting the high part of 64 bit integer multiplication - Stack Overflow