cvResize (linear, u8) loss of precision (Bug #836)


Added by Peter Collingbourne about 14 years ago. Updated almost 14 years ago.


Status: Done
Priority: High
Assignee: Vadim Pisarevsky
Category: imgproc, video
Target version: -
% Done: 0%
Start date:
Due date:
Affected version:
Operating System:
Difficulty:
HW Platform:
Pull request:

Description

The scalar variant of this code evaluates expressions of the
form (simplified to remove irrelevant saturation checks):

(((1536 × N0) + (512 × N0)) + 2097152) >> 22

whereas the SIMD variant evaluates expressions of the form:

(2 + (((1536 × (N0 >> 4)) >> 16) +
((512 × (N0 >> 4)) >> 16))) >> 2

N0 is a complex subexpression shared between the two expressions.

All intermediate values are 32 bits. The SIMD variant loses 11 bits
of precision through right shifts before the addition operation:
N0 >> 4 drops 4 bits, and multiplying by 512 = 2^9 and then shifting
right by 16 drops a further 7 (the 1536 = 3 × 2^9 term is truncated
in the same way). The scalar variant retains all precision until the
final right shift. This leads to differences whenever the lower 11
bits of N0 affect the upper 10 bits of the addition result.

Attached is a small example program which illustrates the differences
between the two implementations. The output is shown below:

ptrr0: 224 vs. 224
ptrr1: 176 vs. 176
ptrr2: 80 vs. 80
ptrr3: 32 vs. 32
ptrr4: 31 vs. 31
ptrr5: 23 vs. 23
ptrr6: 8 vs. 8
ptrr7: 0 vs. 0
ptrr8: 169 vs. 169
ptrr9: 136 vs. 136
ptrr10: 68 vs. 68
ptrr11: 32 vs. 32
ptrr12: 27 vs. 27
ptrr13: 18 vs. 18
ptrr14: 6 vs. 6
ptrr15: 0 vs. 0
ptrr16: 60 vs. 60
ptrr17: 55 vs. 55
ptrr18: 44 vs. 44
ptrr19: 32 vs. 32
ptrr20: 18 vs. 18
ptrr21: 9 vs. 9
ptrr22: 3 vs. 3
ptrr23: 0 vs. 0
ptrr24: 12 vs. 12
ptrr25: 27 vs. 27
ptrr26: 57 vs. 57
ptrr27: 61 vs. 61
ptrr28: 40 vs. 40
ptrr29: 22 vs. 22
ptrr30: 7 vs. 7
ptrr31: 0 vs. 0
ptrr32: 26 vs. 26
ptrr33: 53 vs. 53
ptrr34: 106 vs. 106
ptrr35: 119 vs. 119
ptrr36: 91 vs. 91
ptrr37: 58 vs. 58
ptrr38: 19 vs. 19
ptrr39: 0 vs. 0
ptrr40: 49 vs. 49
ptrr41: 74 vs. 73 ...NO
ptrr42: 123 vs. 122 ...NO
ptrr43: 135 vs. 135
ptrr44: 110 vs. 110
ptrr45: 74 vs. 74
ptrr46: 25 vs. 25
ptrr47: 0 vs. 0
ptrr48: 81 vs. 81
ptrr49: 90 vs. 89 ...NO
ptrr50: 107 vs. 106 ...NO
ptrr51: 109 vs. 109
ptrr52: 98 vs. 98
ptrr53: 69 vs. 69
ptrr54: 23 vs. 23
ptrr55: 0 vs. 0
ptrr56: 97 vs. 97
ptrr57: 98 vs. 97 ...NO
ptrr58: 99 vs. 98 ...NO
ptrr59: 97 vs. 96 ...NO
ptrr60: 92 vs. 92
ptrr61: 67 vs. 67
ptrr62: 22 vs. 22
ptrr63: 0 vs. 0
--
7 mismatches FOUND!


We are a team of researchers at Imperial College London who have
developed a technique for symbolically crosschecking a floating-point
program against its SIMD-vectorized version, as well as a tool,
KLEE-FP, which implements this technique. We found this bug by
applying KLEE-FP to a test benchmark which compares the symbolic
output of the scalar version of the algorithm against that of the
SIMD version. In this case, KLEE-FP reported a mismatch.

In this particular example, KLEE-FP also generated the input data
used by the test program.


resize-conc.cpp (1.2 kB) Peter Collingbourne, 2011-01-24 11:16 pm


Associated revisions

Revision fba72cb6
Added by Andrey Pavlenko almost 12 years ago

Merge pull request #836 from jet47:gpu-modules

Revision 416fb505
Added by Andrey Kamaev almost 12 years ago

Revert "Merge pull request #836 from jet47:gpu-modules"

This reverts commit fba72cb60d27905cce9f1390250adf13d5355c03, reversing
changes made to 02131ffb620f349617a65da334591eca5d3cb23b.

Revision 3eeaa918
Added by Vladislav Vinogradov almost 12 years ago

Revert "Revert "Merge pull request #836 from jet47:gpu-modules""

History

Updated by Vadim Pisarevsky almost 14 years ago

The accuracy of the SIMD variant is difficult to improve without going to 32-bit accumulators, which would degrade performance significantly. In all interpolation functions that operate on integers, a +/-1 difference between implementations is perfectly acceptable.

  • Status changed from Open to Done
  • (deleted custom field) set to wontfix
