Bug #10525
wsdiff output is not correct for a binary file
100%
Description
Sometimes there are 1-2 byte differences in the file usr/lib/spell/hlistb (or hlista) between builds. Though this difference is itself probably spurious, it seems that wsdiff is not correctly rendering the differences between these two files anymore.
First, a comparison of the two sample hlistb files with od and diff manually (all in the C locale):
$ export LANG=C LC_ALL=C
$ locale
LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=C
$ /usr/bin/cksum SAMPLE/*/hlistb
691265056 55780 SAMPLE/a/hlistb
1297749167 55780 SAMPLE/b/hlistb
$ /usr/bin/diff <(/usr/bin/od -c -tx4 SAMPLE/a/hlistb) <(/usr/bin/od -c -tx4 SAMPLE/b/hlistb)
831,832c831,832
< 0014760 327 361 032 223 032 330 ~ 350 230 237 337 251 | 336 211 261
< 931af1d7 e87ed81a a9df9f98 b189de7c
---
> 0014760 327 361 032 223 032 330 ~ 350 230 ? 317 251 | 336 211 261
> 931af1d7 e87ed81a a9cf3f98 b189de7c
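As a cross-check that does not depend on od, diff, or the locale, the two files can also be compared byte by byte. The following is a minimal sketch and not part of wsdiff; the function names `differing_offsets` and `compare_files` are made up for illustration.

```python
def differing_offsets(a: bytes, b: bytes):
    """Return (offset, byte_in_a, byte_in_b) for every position that differs.

    Only the common prefix length is compared; a length mismatch is
    left to the caller to report.
    """
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def compare_files(path_a, path_b):
    # Read in binary mode so no text decoding or newline translation happens.
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        data_a, data_b = fa.read(), fb.read()
    return len(data_a), len(data_b), differing_offsets(data_a, data_b)
```

Going by the diff output above, `compare_files("SAMPLE/a/hlistb", "SAMPLE/b/hlistb")` should report two differing bytes at offsets 6649 and 6650: the od row at octal offset 0014760 starts at byte 6640, and the tenth and eleventh bytes on that row differ (0x9f vs 0x3f and 0xdf vs 0xcf).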
Using a wsdiff from just prior to the integration of #9979 (commit 9b40c3052b9b0d91120c568df0c5211c131c8da1), using python2.7, and the same files:
$ /opt/onbld/bin/wsdiff-old -v -V -r SAMPLE/old.txt SAMPLE/a SAMPLE/b
SAMPLE/a/hlistb
$ cat SAMPLE/old.txt
# This file was produced by wsdiff
# 2019-03-10 at 18:19:06
Old proto area: SAMPLE/a/
New proto area: SAMPLE/b/
Results file: SAMPLE/old.txt
SAMPLE/a/hlistb
NOTE: ASCII difference detected.
831,832c831,832
< 0014760 327 361 032 223 032 330 ~ 350 230 237 337 251 | 336 211 261
< 931af1d7 e87ed81a a9df9f98 b189de7c
---
> 0014760 327 361 032 223 032 330 ~ 350 230 ? 317 251 | 336 211 261
> 931af1d7 e87ed81a a9cf3f98 b189de7c
Using a current master wsdiff plus the proposed patch for #10524 & #10448, also using python2.7, and the same files:
$ /opt/onbld/bin/wsdiff-fixed -v -V -r SAMPLE/new.txt SAMPLE/a SAMPLE/b
SAMPLE/a/hlistb
$ wc -l SAMPLE/new.txt
6291 SAMPLE/new.txt
$ head -20 SAMPLE/new.txt
# This file was produced by wsdiff
# 2019-03-10 at 18:20:20
Old proto area: SAMPLE/a/
New proto area: SAMPLE/b/
Results file: SAMPLE/new.txt
SAMPLE/a/hlistb
NOTE: ASCII difference detected.
523,3661c523,3661
< 0010120 032 032 ~ | e 022 V 005 ~ ~ s $ r s $ 6
< 7c7e1a1a 05561265 24737e7e 36247372
< 0010140 e u * 032 M 3 q $ 7 025 N | 002 P L 037
< 1a2a7565 2471334d 7c4e1537 1f4c5002
< 0010160 K { A S # Q r 1 - E ~ \0 \0 \0 \0 '
< 53417b4b 31725123 007e452d 27000000
< 0010200 " S B i * 020 K 030 _ g N # 006 \b 033
< 42532220 4b102a69 4e675f18 1b080623
< 0010220 ) c c w ] ^ 027 9 \ t { k Y Z \b q
< 77636329 39175e5d 6b7b745c 71085a59
$ tail SAMPLE/new.txt
> 78653c01 5a4c672d 3a657643 741f684b
> 0071060 P s \f 031 c h f Z \n < ( x o ] 9 002
> 190c7350 5a666863 78283c0a 02395d6f
> 0071100 Y 026 034 C * 025 - 6 X Y # C Q V O
> 431c1659 362d152a 23205958 4f565143
> 0071120 ' \0 \0 \0
> 00000027
> 0071124
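The od offsets and diff line numbers in these excerpts can be related with a little arithmetic. With `-c -tx4`, od prints two lines per 16 bytes of input, so, provided od has not collapsed repeated rows into a `*` line, the 1-based dump line at which a row starts is `2 * (offset // 16) + 1`. The sketch below is illustrative only; the helper names are invented here.

```python
def od_offset(text: str) -> int:
    """Convert an od offset column (octal, e.g. '0014760') to a byte count."""
    return int(text, 8)


def dump_line_for_offset(offset: int, lines_per_row: int = 2, row_bytes: int = 16) -> int:
    """1-based line in the dump where the row containing `offset` begins.

    Assumes od emitted every row (no '*' lines for repeated data).
    """
    return lines_per_row * (offset // row_bytes) + 1


print(od_offset("0014760"), dump_line_for_offset(od_offset("0014760")))  # 6640 831
print(od_offset("0010120"), dump_line_for_offset(od_offset("0010120")))  # 4176 523
print(od_offset("0071124"))  # 29268
```

Both results files agree with this mapping (line 831 for offset 0014760, line 523 for offset 0010120). The final offset in the new results, 0071124, corresponds to 29268 bytes, whereas cksum reports 55780 bytes for each file, so the data that was dumped in the new wsdiff run appears not to be the full file contents. That is an observation from the excerpts, not a diagnosis.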
Note that the results file is the same with and without the patches for #10524 and #10448, so those patches are not the cause.
It's not completely clear yet what produces the differences.
Updated by Joshua M. Clulow almost 2 years ago
I've attached a tar file with the SAMPLE directory files.
Updated by Joshua M. Clulow almost 2 years ago
- File 10525_corpus.tar.gz added
Updated by Joshua M. Clulow almost 2 years ago
- Related to Feature #9979: Support python3 for in-gate tools added
Updated by Andy Fiddaman almost 2 years ago
- Status changed from New to In Progress
- Assignee set to Andy Fiddaman
Updated by Electric Monk almost 2 years ago
- % Done changed from 0 to 100
- Status changed from In Progress to Closed
git commit 2f7dba3e6747cbaaf1deb86e6ca1e2a5c96332ac
commit 2f7dba3e6747cbaaf1deb86e6ca1e2a5c96332ac
Author: Andy Fiddaman <omnios@citrus-it.co.uk>
Date: 2019-03-14T20:01:47.000Z

    10524 wsdiff much slower after move from deprecated commands module
    10448 wsdiff explodes on encoding error
    10525 wsdiff output is not correct for a binary file
    10526 wsdiff tries to spawn 4.8 threads
    Reviewed by: Gergő Mihály Doma <domag02@gmail.com>
    Reviewed by: Sebastian Wiedenroth <sebastian.wiedenroth@skylime.net>
    Approved by: Rich Lowe <richlowe@richlowe.net>
Updated by Robert Mustacchi 2 months ago
The underlying cause of the differences in hlista and hlistb is described in #13333.
Updated by Robert Mustacchi 2 months ago
- Related to Bug #13333: spellcheck1 doesn't zero table memory added