Stefano Tommesani


MMX / iSSE latency


The following table summarizes the latencies of MMX/iSSE instructions on the Intel Pentium III and Pentium 4 processors, and on the AMD Athlon processor:
 

 
Latency in clock cycles:

| Instruction | Pentium III | Pentium 4 | AMD Athlon |
|---|---|---|---|
| MOVD mm,r32 | 1 | 2 | 3 |
| MOVD r32,mm | 1 | 5 | 5 |
| MOVQ mm,mm | 1 | 6 | 2 |
| PACKSSWB / PACKSSDW / PACKUSWB mm,mm | 1 | 2 | 2 |
| PADDB / PADDW / PADDD mm,mm | 1 | 2 | 2 |
| PADDSB / PADDSW / PADDUSB / PADDUSW mm,mm | 1 | 2 | 2 |
| PAND / PANDN / POR / PXOR mm,mm | 1 | 2 | 2 |
| PCMPEQB / PCMPEQW / PCMPEQD mm,mm | 1 | 2 | 2 |
| PCMPGTB / PCMPGTW / PCMPGTD mm,mm | 1 | 2 | 2 |
| PMADDWD mm,mm | 3 | 8 | 3 |
| PMULHW / PMULLW / PMULHUW mm,mm | 3 | 8 | 3 |
| PSLLW / PSLLD / PSLLQ mm,mm/imm8 | 1 | 2 | 2 |
| PSRAW / PSRAD mm,mm/imm8 | 1 | 2 | 2 |
| PSUBB / PSUBW / PSUBD mm,mm | 1 | 2 | 2 |
| PSUBSB / PSUBSW / PSUBUSB / PSUBUSW mm,mm | 1 | 2 | 2 |
| PUNPCKHBW / PUNPCKHWD / PUNPCKHDQ mm,mm | 1 | 2 | 2 |
| PUNPCKLBW / PUNPCKLWD / PUNPCKLDQ mm,mm | 1 | 2 | 2 |
| EMMS | 6 | 12 | 2 |
| PAVGB / PAVGW mm,mm | 1 | 2 | 2 |
| PEXTRW r32,mm,imm8 | 2 | 7 | 7 |
| PINSRW mm,r32,imm8 | 4 | 4 | 5 |
| PMAX / PMIN mm,mm | 1 | 2 | 2 |
| PMOVMSKB r32,mm | 1 | 7 | 6 |
| PSADBW mm,mm | 5 | 4 | 3 |
| PSHUFW mm,mm,imm8 | 1 | 2 | 2 |


 

Latency: the number of clock cycles that are required to complete the execution of all of the µops that form an instruction.
Throughput: the number of clock cycles required to wait before the issue ports are free to accept the same instruction again.
Execution Unit: the names of the execution units in the execution core that are utilized to execute the µops for each instruction.
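
As a rough illustration of how the latency figures can be used, the sketch below sums the table's latencies for a chain of instructions in which each one depends on the result of the previous one. The chain itself, the `LATENCY` dictionary and the `chain_latency` helper are made up for this example; only the cycle counts come from the table above. The sum is a simple estimate of the dependency chain's length, and it ignores throughput, memory access and every other effect of a real pipeline.

```python
# Latencies in clock cycles, copied from the table above,
# in the order (Pentium III, Pentium 4, AMD Athlon).
LATENCY = {
    "MOVD mm,r32": (1, 2, 3),
    "PMADDWD mm,mm": (3, 8, 3),
    "PADDD mm,mm": (1, 2, 2),
    "MOVD r32,mm": (1, 5, 5),
}
CPUS = ("Pentium III", "Pentium 4", "AMD Athlon")


def chain_latency(instructions):
    """Sum the per-CPU latencies of a chain of dependent instructions.

    Assumes each instruction has to wait for the result of the previous
    one, so the latencies simply add up.
    """
    totals = [0] * len(CPUS)
    for name in instructions:
        for i, cycles in enumerate(LATENCY[name]):
            totals[i] += cycles
    return dict(zip(CPUS, totals))


# Hypothetical chain: load a value, multiply-add, accumulate, store back.
chain = ["MOVD mm,r32", "PMADDWD mm,mm", "PADDD mm,mm", "MOVD r32,mm"]
result = chain_latency(chain)
print(result)
# {'Pentium III': 6, 'Pentium 4': 17, 'AMD Athlon': 13}
```

With these figures the same four dependent instructions add up to 6 cycles on the Pentium III, 17 on the Pentium 4 and 13 on the Athlon. Most of the gap comes from PMADDWD and from moving data back to an integer register.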
Tuesday, 25 April 2000
