  • Multi-thread

    Multi-thread loops with Intel TBB

    A new article about using Intel TBB is here. It contains examples using C++ lambdas and joining multi-threaded loops with SIMD code. In this article we will transform a plain C loop into a multi-threaded version using the Intel Threading Building Blocks (TBB) library. Here is the loop to transform: {CODE…

  • Software

    About Stefano Tommesani

    15 years of experience in the CCTV area, including: R&D; strategic planning and partnerships; pre-sales; HW / SW integration; QA. Broad software development experience, from flashy GUIs to down-to-the-metal assembly programming, and a performance-minded approach to development allow me to reach outstanding results in software products: Design and implementation of…

  • SIMD

    SSE2 Intrinsics

    Floating-Point Intrinsics: Arithmetic Operation Intrinsics

    | Intrinsic name | Corresponding instruction | Operation | R0 value | R1 value |
    |---|---|---|---|---|
    | _mm_add_sd | ADDSD | Adds | a0 [op] b0 | a1 |
    | _mm_add_pd | ADDPD | Adds | a0 [op] b0 | a1 [op] b1 |
    | _mm_div_sd | DIVSD | Divides | a0 [op] b0 | a1 |
    | _mm_div_pd | DIVPD | Divides | a0 [op] b0 | a1 [op] b1 |
    | _mm_max_sd | MAXSD | Computes maximum… |

  • SIMD

    SSE Intrinsics

    Packed Arithmetic Intrinsics

    | Intrinsic | Instruction | Operation | R0 | R1 | R2 | R3 |
    |---|---|---|---|---|---|---|
    | _mm_add_ss | ADDSS | Adds | a0 [op] b0 | a1 | a2 | a3 |
    | _mm_add_ps | ADDPS | Adds | a0 [op] b0 | a1 [op] b1 | a2 [op] b2 | a3 [op] b3 |
    | _mm_sub_ss | SUBSS | Subtracts | a0 [op] b0 | a1 | a2 | a3 |
    | _mm_sub_ps | SUBPS | Subtracts | a0 [op] b0 | a1… |

  • SIMD

    MMX Intrinsics

    General Support Intrinsics

    | Intrinsic name | Operation | Signed | Saturation | Assembly instruction |
    |---|---|---|---|---|
    | _mm_empty | Empties MM state | Not applicable | Not applicable | EMMS |
    | _mm_cvtsi32_si64 | Converts from int | Not applicable | Not applicable | MOVD |
    | _mm_cvtsi64_si32 | Converts to int | Not applicable | Not applicable | MOVD |
    | _mm_packs_pi16 | Packs | Yes | Yes | PACKSSWB |
    | _mm_packs_pi32 | Packs | Yes | Yes | PACKSSDW |
    | _mm_packs_pu16 | Packs | No… |

  • SIMD

    SSE Arithmetic

    ADDPS (parallel) and ADDSS (scalar) add the pair of operands. SUBPS (parallel) and SUBSS (scalar) subtract the pair of operands. MULPS (parallel) and MULSS (scalar) multiply the pair of operands. DIVPS (parallel) and DIVSS (scalar) divide the pair of operands. SQRTPS (parallel) and SQRTSS (scalar) return the square root of the…
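
    To make the parallel/scalar distinction concrete, here is a minimal C++ sketch using the `_mm_add_ps` (ADDPS) and `_mm_add_ss` (ADDSS) intrinsics from `<xmmintrin.h>`. The helper names `add_packed` and `add_scalar` are hypothetical and chosen only for this illustration; the sketch assumes an x86 target with SSE enabled.

```cpp
#include <array>
#include <xmmintrin.h>  // SSE intrinsics: _mm_add_ps, _mm_add_ss, unaligned load/store

// Hypothetical helper: packed add (ADDPS).
// All four lanes are added: r[i] = a[i] + b[i] for i = 0..3.
std::array<float, 4> add_packed(const std::array<float, 4>& a,
                                const std::array<float, 4>& b) {
    __m128 va = _mm_loadu_ps(a.data());  // unaligned load, no alignment assumed
    __m128 vb = _mm_loadu_ps(b.data());
    std::array<float, 4> r{};
    _mm_storeu_ps(r.data(), _mm_add_ps(va, vb));
    return r;
}

// Hypothetical helper: scalar add (ADDSS).
// Only lane 0 is added: r[0] = a[0] + b[0]; lanes 1..3 are copied from a.
std::array<float, 4> add_scalar(const std::array<float, 4>& a,
                                const std::array<float, 4>& b) {
    __m128 va = _mm_loadu_ps(a.data());
    __m128 vb = _mm_loadu_ps(b.data());
    std::array<float, 4> r{};
    _mm_storeu_ps(r.data(), _mm_add_ss(va, vb));
    return r;
}
```

    Calling both helpers with a = {1, 2, 3, 4} and b = {10, 20, 30, 40} gives {11, 22, 33, 44} for the packed version and {11, 2, 3, 4} for the scalar one. The SUB, MUL and DIV pairs follow the same lane pattern.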

  • SIMD

    SSE2 and MMX

    I’m quite sure that Intel would not like to see SSE2 named 128-bit MMX. In fact, MMX has a bad reputation: the Intel marketing hype pushed it as a universal solution to multimedia requirements, but at the same time the gaming industry switched from mostly 2D games to Virtual Reality-like…