
Unfortunately, we can't change the past, and seemingly in the past it wasn't worth having a fast One True memcpy (and perhaps to a decent extent it still isn't). I'm still typing this on a Haswell CPU, which doesn't have FSRM (rep movsb of 16 bytes in a loop takes ~10ns = 36 cycles per iteration on average).
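For anyone who wants to reproduce that kind of number, here is a rough sketch of such a loop. It is not the exact code behind the figure above: the names copy_rep_movsb and ns_per_copy are made up for illustration, it assumes GCC or Clang on x86-64 (it falls back to memcpy elsewhere), and the result includes loop and call overhead.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: copy n bytes with `rep movsb` on x86-64
   (GCC/Clang inline asm); falls back to memcpy on other targets
   so the sketch still builds. Assumes the direction flag is clear,
   as the System V and Windows x64 ABIs guarantee at function entry. */
static void copy_rep_movsb(void *dst, const void *src, size_t n) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    memcpy(dst, src, n);
#endif
}

/* Hypothetical helper: average wall-clock nanoseconds per 16-byte
   copy over `iters` iterations (loop overhead included). */
static double ns_per_copy(long iters) {
    static unsigned char src[16] = "0123456789abcde";
    static unsigned char dst[16];
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
        copy_rep_movsb(dst, src, sizeof src);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Sanity check: the bytes really were copied. */
    assert(memcmp(dst, src, sizeof src) == 0);

    return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
            (double)(t1.tv_nsec - t0.tv_nsec)) / (double)iters;
}
```

Calling ns_per_copy(100000000L) from main and printing the result gives ns per iteration; multiply by the clock in GHz to get cycles (10ns = 36 cycles corresponds to about 3.6 GHz).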

But yeah, it does seem that the 128-byte figure I got from a quick search was wrong. (Though gcc & clang with '-march=alderlake' both never generate 'rep movsb' at '-O3'; at `-Os`, gcc starts emitting a rep movsb for ≥65B, while clang still never does.)


