Why is calling my asm function from Rust slower than calling it from C?

Summary:
- This article discusses rav1d, an open-source AV1 video decoder written in Rust.
- The author explains how they used assembly language to improve the decoder's performance.
- The article gives technical details on the specific optimizations, such as using SIMD instructions and reducing branch mispredictions, which it reports led to significant speed improvements.
