RGBA support added to CV_Bayer (Patch #1923)
Description
Expanded CV_BayerBG2BGR functions to support BGRA/RGBA output.
Currently the plain C implementation works correctly.
The SIMD version nearly works, but I'm stuck on understanding the optimisation.
Patches are against the 2.4 branch.
History
Updated by Kevin Smith over 12 years ago
Hey, I've been doing some SIMD work and I think I have code that will make your optimized version work. I don't have a setup to test it against your patch, but it works in my own application, where I am doing BGRA support. I have switched the B and R values so this code should work, but I am not 100% sure (although it does switch the B and R values in my tests). This is the relevant portion of the code, after you have done the XORing to switch b0 and r0 if necessary. The code after that should look as follows in your bayer2RGBA function (sorry about the bad formatting):
b1 = _mm_unpackhi_epi8(b0, g0);
b0 = _mm_unpacklo_epi8(b0, g0);
r1 = _mm_unpackhi_epi8(r0, z);
r0 = _mm_unpacklo_epi8(r0, z);
g0 = _mm_unpacklo_epi16(b0, r0);
g1 = _mm_unpackhi_epi16(b0, r0);
r0 = _mm_unpacklo_epi16(b1, r1);
r1 = _mm_unpackhi_epi16(b1, r1);
b0 = _mm_unpacklo_epi32(g0, r0);
b1 = _mm_unpackhi_epi32(g0, r0);
//Store pixels 128 bits at a time into dst + offset
_mm_storeu_si128((__m128i*)(dst-1+0), b0);
_mm_storeu_si128((__m128i*)(dst-1+8*2), b1);
//Unpack lower 32 bits of g1 and r1 interleaved
g0 = _mm_unpacklo_epi32(g1, r1);
g1 = _mm_unpackhi_epi32(g1, r1);
//Write pixels 128 bits at a time to dst + offset
_mm_storeu_si128((__m128i*)(dst-1+8*4), g0);
//Only 14 pixels are good data so only write 2 pixels this time
_mm_storel_epi64((__m128i*)(dst-1+8*6), g1);
In this code I am assuming that we want the alpha channel set to whatever you put into the z variable. In my use case I set z to all zeros (as in the original code that you commented out when setting z) so that the alpha channel would be zero, but in your code it looks like you would have the alpha channel set to 0xff. You may also consider simplifying the line "__m128i z = _mm_set_epi32(0xffffffff,0xffffffff,0xffffffff,0xffffffff);" to "__m128i z = _mm_set1_epi32(0xffffffff);" so you don't have to repeat the 0xffffffff every time. I hope this helps fix the algorithm; let me know if you have any other questions.
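To illustrate the unpack idea on its own, here is a minimal, simplified sketch and not the patch's actual code. It assumes three separate planes of 16 B, G and R bytes in natural pixel order, whereas the Bayer routine works on differently arranged registers. The helper name interleave_bgra16 and its parameters are hypothetical. The alpha register plays the role of z above.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper (not part of OpenCV): interleave 16 B, G and R bytes
// with a constant alpha value into 16 BGRA pixels (64 output bytes).
static void interleave_bgra16(const uint8_t* b, const uint8_t* g,
                              const uint8_t* r, uint8_t alpha, uint8_t* dst)
{
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    const __m128i vg = _mm_loadu_si128(reinterpret_cast<const __m128i*>(g));
    const __m128i vr = _mm_loadu_si128(reinterpret_cast<const __m128i*>(r));
    // Every byte of this register is the alpha value (0x00 or 0xff in the thread).
    const __m128i va = _mm_set1_epi8(static_cast<char>(alpha));

    // 8-bit unpack: B0 G0 B1 G1 ... and R0 A R1 A ...
    const __m128i bg_lo = _mm_unpacklo_epi8(vb, vg);
    const __m128i bg_hi = _mm_unpackhi_epi8(vb, vg);
    const __m128i ra_lo = _mm_unpacklo_epi8(vr, va);
    const __m128i ra_hi = _mm_unpackhi_epi8(vr, va);

    // 16-bit unpack: B0 G0 R0 A B1 G1 R1 A ... (four pixels per register)
    const __m128i p0 = _mm_unpacklo_epi16(bg_lo, ra_lo);  // pixels 0..3
    const __m128i p1 = _mm_unpackhi_epi16(bg_lo, ra_lo);  // pixels 4..7
    const __m128i p2 = _mm_unpacklo_epi16(bg_hi, ra_hi);  // pixels 8..11
    const __m128i p3 = _mm_unpackhi_epi16(bg_hi, ra_hi);  // pixels 12..15

    // Unaligned stores, 16 bytes (four BGRA pixels) at a time.
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst +  0), p0);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + 16), p1);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + 32), p2);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + 48), p3);
}
```

Passing 0xff as alpha gives fully opaque pixels and passing 0 gives a zeroed alpha channel, which corresponds to the two choices for z discussed above.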
Updated by Vadim Pisarevsky over 12 years ago
- Target version deleted ()
- Assignee deleted (Vadim Pisarevsky)
Updated by Kirill Kornyakov over 12 years ago
- Target version set to Next Hackathon
Updated by Andrey Kamaev about 12 years ago
RGBA support is already pushed to master (if I'm not mistaken).
- Affected version set to branch 'master' (2.4.9)
- Target version changed from Next Hackathon to 3.0
- Status changed from Open to Done
- Assignee set to Vadim Pisarevsky