
The following table summarizes the latencies, in clock cycles, of MMX/iSSE instructions on the Intel Pentium III and Pentium 4 processors and on the AMD Athlon processor:

| Instruction | Pentium III | Pentium 4 | AMD Athlon |
| --- | --- | --- | --- |
| MOVD mm,r32 | 1 | 2 | 3 |
| MOVD r32,mm | 1 | 5 | 5 |
| MOVQ mm,mm | 1 | 6 | 2 |
| PACKSSWB / PACKSSDW / PACKUSWB mm,mm | 1 | 2 | 2 |
| PADDB / PADDW / PADDD mm,mm | 1 | 2 | 2 |
| PADDSB / PADDSW / PADDUSB / PADDUSW mm,mm | 1 | 2 | 2 |
| PAND / PANDN / POR / PXOR mm,mm | 1 | 2 | 2 |
| PCMPEQB / PCMPEQW / PCMPEQD mm,mm | 1 | 2 | 2 |
| PCMPGTB / PCMPGTW / PCMPGTD mm,mm | 1 | 2 | 2 |
| PMADDWD mm,mm | 3 | 8 | 3 |
| PMULHW / PMULLW / PMULHUW mm,mm | 3 | 8 | 3 |
| PSLLW / PSLLD / PSLLQ mm,mm/imm8 | 1 | 2 | 2 |
| PSRAW / PSRAD mm,mm/imm8 | 1 | 2 | 2 |
| PSUBB / PSUBW / PSUBD mm,mm | 1 | 2 | 2 |
| PSUBSB / PSUBSW / PSUBUSB / PSUBUSW mm,mm | 1 | 2 | 2 |
| PUNPCKHBW / PUNPCKHWD / PUNPCKHDQ mm,mm | 1 | 2 | 2 |
| PUNPCKLBW / PUNPCKLWD / PUNPCKLDQ mm,mm | 1 | 2 | 2 |
| EMMS | 6 | 12 | 2 |
| PAVGB / PAVGW mm,mm | 1 | 2 | 2 |
| PEXTRW r32,mm,imm8 | 2 | 7 | 7 |
| PINSRW mm,r32,imm8 | 4 | 4 | 5 |
| PMAX / PMIN mm,mm | 1 | 2 | 2 |
| PMOVMSKB r32,mm | 1 | 7 | 6 |
| PSADBW mm,mm | 5 | 4 | 3 |
| PSHUFW mm,mm,imm8 | 1 | 2 | 2 |

| Term | Definition |
| --- | --- |
| Latency | The number of clock cycles required to complete the execution of all of the µops that form an instruction. |
| Throughput | The number of clock cycles required to wait before the issue ports are free to accept the same instruction again. |
| Execution Unit | The names of the execution units in the execution core that are utilized to execute the µops for each instruction. |
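
As a rough illustration of how the latency figures can be used, the sketch below adds up the latencies of a fully dependent chain of instructions, in which each instruction consumes the result of the one before it. The three-instruction chain, the names `LATENCY`, `CPUS` and `chain_latency`, and the assumption that latencies simply add along the chain are all hypothetical simplifications. The result ignores memory access, decoding and issue-port contention, so it should be read as a lower-bound estimate rather than a measurement.

```python
# Latencies in clock cycles, copied from the table above.
# Tuple order: (Pentium III, Pentium 4, AMD Athlon)
LATENCY = {
    "PUNPCKLBW": (1, 2, 2),
    "PMADDWD": (3, 8, 3),
    "PADDD": (1, 2, 2),
}
CPUS = ("Pentium III", "Pentium 4", "AMD Athlon")


def chain_latency(chain):
    """Sum the per-instruction latencies of a fully dependent chain.

    Assumes (as a simplification) that each instruction must wait for
    the complete result of the previous one, so latencies just add up.
    """
    totals = [0] * len(CPUS)
    for mnemonic in chain:
        for i, cycles in enumerate(LATENCY[mnemonic]):
            totals[i] += cycles
    return dict(zip(CPUS, totals))


# Hypothetical chain: unpack bytes, multiply-add words, accumulate dwords.
example = chain_latency(["PUNPCKLBW", "PMADDWD", "PADDD"])
print(example)
# → {'Pentium III': 5, 'Pentium 4': 12, 'AMD Athlon': 7}
```

Under these assumptions the same chain takes more than twice as many cycles on the Pentium 4 as on the Pentium III, mostly because of the 8-cycle PMADDWD. Independent instructions can overlap, which is where throughput rather than latency becomes the limiting figure.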