Discussion (3 posts)
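This topic covers how to implement functions from the C standard library's string.h (strlen, strcpy, memcpy, and so on) in x86-64 assembly language. Learn techniques for writing heavily optimized code that takes advantage of the CPU's powerful string instructions.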
vpcmpestri xmm2, xmm3, BYTEWISE_CMP
test cx, 0x10 ; if(rcx != 16)
I see this test/cmp after the instruction all the time and I don't understand it. pcmpestri sets ZF if edx < 16 and SF if eax < 16, and CF tells you whether anything matched at all, so it is already giving you the status you need. Also, testing a sub-word of the larger register is very slow and is a pipeline hazard.
You've got this monster of an instruction and then people place all this paranoid slowness around it. Am I reading the x86 manual wrong?
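For anyone who wants to check the flag behavior described above without writing assembly, here is a minimal sketch in C using the SSE4.2 intrinsics. It assumes GCC or Clang on x86-64 with SSE4.2 available. `probe_chunk`, `cmpestr_result`, and `MODE` are made-up names for illustration, and `MODE` only stands in for the post's `BYTEWISE_CMP`, whose actual value isn't shown in the thread.

```c
#include <nmmintrin.h> /* SSE4.2 intrinsics: _mm_cmpestri, _mm_cmpestrc, ... */

/* ASSUMPTION: stands in for the post's BYTEWISE_CMP, whose value is not shown.
   Unsigned bytes, "equal any" matching, report the lowest matching index. */
#define MODE (_SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY | _SIDD_LEAST_SIGNIFICANT)

/* Hypothetical holder for everything a single PCMPESTRI reports. */
typedef struct {
    int index; /* ECX: index of the first match, 16 if there is none */
    int cf;    /* CF: 1 if any byte of the chunk matched */
    int zf;    /* ZF: 1 if chunk_len (EDX) < 16 */
    int sf;    /* SF: 1 if set_len (EAX) < 16 */
} cmpestr_result;

/* set and chunk must each point at 16 readable bytes. */
__attribute__((target("sse4.2")))
cmpestr_result probe_chunk(const char *set, int set_len,
                           const char *chunk, int chunk_len)
{
    __m128i a = _mm_loadu_si128((const __m128i *)set);
    __m128i b = _mm_loadu_si128((const __m128i *)chunk);
    cmpestr_result r;

    /* Same operands and mode each time, so an optimizing compiler can
       fold these into one PCMPESTRI and simply read ECX and the flags. */
    r.index = _mm_cmpestri(a, set_len, b, chunk_len, MODE);
    r.cf    = _mm_cmpestrc(a, set_len, b, chunk_len, MODE);
    r.zf    = _mm_cmpestrz(a, set_len, b, chunk_len, MODE);
    r.sf    = _mm_cmpestrs(a, set_len, b, chunk_len, MODE);
    return r;
}
```

In hand-written assembly the same information is consumed by a conditional jump placed directly after the instruction, for example `jc` when a match was found, or `ja` to keep looping while there is no match and the chunk was a full 16 bytes. That makes a separate `test` or `cmp` on `ecx` redundant for that decision.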
Not sure what Visual Studio has done over the years but I remember decompiling Gearbox's utilities .dll in James Bond 007 Nightfire (2002) and it appeared to have a bunch of string manipulation functions written using these instructions.