llvm-gcc-4.2 generates incorrect code for certain SSE intrinsics (RADAR #11934110)

  • Subject: llvm-gcc-4.2 generates incorrect code for certain SSE intrinsics (RADAR #11934110)
  • From: Paul Russell <email@hidden>
  • Date: Mon, 23 Jul 2012 10:47:33 +0100

Just an FYI, and I'm wondering whether anyone else has seen this problem or anything similar: llvm-gcc-4.2 seems to generate incorrect code for certain SSE intrinsics. The following code demonstrates the problem:

#include <stdio.h>
#include <tmmintrin.h> // SSSE3
#include <Accelerate/Accelerate.h>

// Horizontal max: rotate by 1, 2, 4 and 8 bytes, taking the lane-wise max
// each time, so that every lane ends up holding the max of all 16 bytes.
vUInt8 _mm_hmax_epu8(const vUInt8 v)
{
    vUInt8 vmax = v;

    vmax = _mm_max_epu8(vmax, _mm_alignr_epi8(vmax, vmax, 1));
    vmax = _mm_max_epu8(vmax, _mm_alignr_epi8(vmax, vmax, 2));
    vmax = _mm_max_epu8(vmax, _mm_alignr_epi8(vmax, vmax, 4));
    vmax = _mm_max_epu8(vmax, _mm_alignr_epi8(vmax, vmax, 8));

    return vmax;
}

int main(void)
{
    vUInt8 v1 = _mm_setr_epi8(34, 201, 96, 11, 28, 149, 66, 87, 12, 56, 76, 84, 51, 175, 91, 45);
    vUInt8 v2;

    printf("v1 = %vu\n", v1);

    v2 = _mm_hmax_epu8(v1);

    printf("v2 = %vu\n", v2);

    return 0;
}

$ gcc -Wall -mssse3 _mm_hmax_epu8.c -framework Accelerate -o _mm_hmax_epu8 && ./_mm_hmax_epu8

gives:

v1 = 34 201 96 11 28 149 66 87 12 56 76 84 51 175 91 45
v2 = 201 201 201 175 201 201 201 175 201 201 201 175 201 201 201 175

Compiling this with regular gcc 4.2 gives the correct result:

v1 = 34 201 96 11 28 149 66 87 12 56 76 84 51 175 91 45
v2 = 201 201 201 201 201 201 201 201 201 201 201 201 201 201 201 201
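
As a sanity check on which output is right, here is a minimal scalar sketch of the same four steps in plain C, with no intrinsics involved. It assumes that _mm_alignr_epi8(v, v, n) with identical operands behaves as a byte rotate (out[i] = v[(i + n) % 16]). The names rotate_bytes, max_bytes and hmax_epu8_scalar are made up for this illustration and do not come from any header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 16

/* Scalar model of _mm_alignr_epi8(v, v, n): with both operands equal,
   the result is v rotated so that out[i] = v[(i + n) % 16]. */
static void rotate_bytes(uint8_t out[LANES], const uint8_t v[LANES], unsigned n)
{
    assert(n < LANES);
    for (unsigned i = 0; i < LANES; ++i)
        out[i] = v[(i + n) % LANES];
}

/* Scalar model of _mm_max_epu8: lane-wise unsigned maximum.
   Safe when out aliases a, since each lane is read before it is written. */
static void max_bytes(uint8_t out[LANES], const uint8_t a[LANES], const uint8_t b[LANES])
{
    for (unsigned i = 0; i < LANES; ++i)
        out[i] = a[i] > b[i] ? a[i] : b[i];
}

/* Hypothetical scalar equivalent of the _mm_hmax_epu8 routine above:
   rotate by 1, 2, 4, 8 and fold with max after each rotation. */
void hmax_epu8_scalar(uint8_t out[LANES], const uint8_t v[LANES])
{
    uint8_t vmax[LANES], rot[LANES];

    memcpy(vmax, v, LANES);
    for (unsigned n = 1; n < LANES; n *= 2) {
        rotate_bytes(rot, vmax, n);
        max_bytes(vmax, vmax, rot);
    }
    memcpy(out, vmax, LANES);
}
```

Feeding it the 16 bytes of v1 from the test program yields 201 in every lane, which agrees with the regular gcc 4.2 output rather than the llvm-gcc one.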

Looking at the generated code, it appears that llvm-gcc tries to convert _mm_alignr_epi8 to something other than PALIGNR for certain shift counts, and that the logic for this conversion is incorrect.

Paul


