cvDilate and cvErode - commutativity/associativity issue (Bug #824)
Description
The f32 variants of these functions are implemented in terms of min and
max operations. The SSE and scalar variants of the implementations
apply min and max to the same operands but in a different order.
Specifically, the scalar variant sometimes performs an optimisation to
avoid double computing results in places where the SSE variant does
not perform this optimisation. For example, the first inner loop in
the MorphRowFilter::operator() function contains this optimisation,
and the corresponding SSE function MorphRowFVec::operator() does not.
We cannot consider the two implementations to be equivalent because
the implementations of min and max used by the benchmarks are neither
associative nor commutative.
This means that if any element of the input matrix is NaN the scalar
version of this code may produce a different result to the SIMD
version. E.g., if x<y, then min(x, NaN, y)
- would be implemented in the SIMD version as min(min(x, NaN), y) and would
return y,
- and may be implemented in the scalar version as min(min(NaN, y), x) and
would return x.
To solve this problem we could try to ensure that min and max are
applied in the same order in both scalar and SIMD implementations.
However this seems difficult, and may slow down the SIMD code.
An alternative approach would be to re-arrange the min and max
operations so that the same result is arrived at in both cases. We can
do this by applying the min/max operations right associatively and
supplying an identity element as the rightmost value. For example,
the two expressions below will yield the same result:
min(a, min(b, min(c, +Inf))) and min(b, min(c, min(a, +Inf)))
We are a team of researchers at Imperial College London who have
developed a technique for symbolically crosschecking a floating-point
program against its SIMD-vectorized version, as well as a tool,
KLEE-FP, which implements this technique. We found this bug by
applying KLEE-FP to a test benchmark which compares the symbolic
output of the scalar version of the algorithm against that of the
SIMD version. In this case, KLEE-FP reported a mismatch.
Associated revisions
Merge pull request #824 from taka-no-me:reduce_c_api_deps
History
Updated by Vadim Pisarevsky almost 14 years ago
Since on finite numbers (and even +/-Infs) we get correct results, and NaNs do not occur in normal images people are dealing with, the unification will not likely happen.
- Status changed from Open to Done
- (deleted custom field) set to wontfix