Lister - Searching for hex string returns incorrect result

stifani · Post by *stifani » 2019-02-03, 16:01 UTC

Hello,

Using TC 9.21a on Win7 x86, if I search for a hex string in a file (for instance 8C), searching again for the same string also picks up the string 9C. Likewise, searching for 8C finds 9C even if no 8C is present in the file.

Image: https://i.ibb.co/8rGT0wr/wrongpick.png

Can anyone confirm?
Thanks

sqa_wizard · Post by *sqa_wizard » 2019-02-03, 16:56 UTC

In fact, TC does not really search for the byte 8C, but for the character represented by hex 8C.
This means it will find 8C as well as the lowercase form of that character, which is 9C.

With this knowledge you just have to enable option "Case sensitive" as well.
This will find 8C only.
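
The behaviour described above can be sketched in Python. This is only an illustration, not TC's actual implementation: it assumes the ANSI codepage is Windows-1252 (where 0x8C is Œ and 0x9C is œ), and the helper names `case_variants` and `find_all` are made up for this example.

```python
def case_variants(byte_value, codepage="cp1252"):
    """Return the byte values that match byte_value when case is ignored."""
    try:
        ch = bytes([byte_value]).decode(codepage)
    except UnicodeDecodeError:
        # Byte is undefined in this codepage: it can only match itself.
        return [byte_value]
    variants = {byte_value}
    for form in (ch.lower(), ch.upper()):
        try:
            encoded = form.encode(codepage)
        except UnicodeEncodeError:
            continue  # the other case does not exist in this codepage
        if len(encoded) == 1:
            variants.add(encoded[0])
    return sorted(variants)


def find_all(data, byte_value, case_sensitive, codepage="cp1252"):
    """Return the offsets in data that match a single-byte hex search."""
    if case_sensitive:
        targets = [byte_value]
    else:
        targets = case_variants(byte_value, codepage)
    return [i for i, b in enumerate(data) if b in targets]


data = bytes([0x41, 0x9C, 0x42, 0x8C])
print([hex(b) for b in case_variants(0x8C)])        # ['0x8c', '0x9c']
print(find_all(data, 0x8C, case_sensitive=False))   # [1, 3]
print(find_all(data, 0x8C, case_sensitive=True))    # [3]
```

With case folding applied, a search for 8C also reports the offset of 9C; with "Case sensitive" enabled, only the exact byte matches.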

stifani · Post by *stifani » 2019-02-03, 18:44 UTC

I thought that "Case sensitive" worked only for normal characters and not for hex string searches.
Indeed, with "Case sensitive" enabled only the exact string is matched.

Thanks

Usher · Post by *Usher » 2019-02-03, 19:38 UTC

TC by default uses "Hex or ANSI" search and you cannot turn off encoding for Hex search, you can only select other encoding(s).
It may be good for US-ASCII codes (less than 0x80), but it should not work that way for other codepages, otherwise it may give unpredictable results when searching in Unicode text.

Let's have a look at another code: 0xC0. In Windows-1250 (and ISO-8859-2) it's Ŕ, LATIN CAPITAL LETTER R WITH ACUTE. The small letter ŕ, LATIN SMALL LETTER R WITH ACUTE, has code 0xE0. In Unicode they have different codepoints, U+0154 and U+0155, and their UTF-8 representations are 0xC594 and 0xC595.

In Windows-1252 (and ISO-8859-1) 0xC0 stands for À, LATIN CAPITAL LETTER A WITH GRAVE, and 0xE0 is à, LATIN SMALL LETTER A WITH GRAVE. They have the same numbers as their Unicode codepoints, U+00C0 and U+00E0, because the ISO-8859-1 codes were reused for the first 256 Unicode codepoints.
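
To check these mappings yourself, here is a small Python sketch. It relies only on Python's built-in codecs; the helper name `describe` is made up for this example, and it says nothing about how TC handles encodings internally.

```python
def describe(byte_value, codepage):
    """Decode one byte in the given codepage and show its Unicode encodings."""
    ch = bytes([byte_value]).decode(codepage)
    return {
        "char": ch,
        "codepoint": f"U+{ord(ch):04X}",
        "utf8": ch.encode("utf-8").hex().upper(),
        "utf16le": ch.encode("utf-16-le").hex().upper(),
    }


for codepage in ("cp1250", "iso8859_2", "cp1252", "latin_1"):
    for byte_value in (0xC0, 0xE0):
        info = describe(byte_value, codepage)
        print(f"{codepage:10} 0x{byte_value:02X} -> {info}")
```

The same byte 0xC0 therefore corresponds to different byte sequences in a Unicode file depending on which ANSI codepage is assumed, so a hex search that goes through a character encoding cannot give a codepage-independent result.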

Now guess what TC finds in Unicode text? Not what you would expect. It finds the code for _underscore_, 0x5F, in both Unicode encodings, even if only one of the "Unicode UTF-16" and "UTF-8" options is checked.
