Sorting algorithm needs attention
From time to time I've noticed that TC has problems when sorting filenames, so I'm offering an illustration of where I see the issue in TC V6.03
Paste the following into a batch file and run it in an empty folder. TC sorts the files in the order listed below, but MBYR-ILO.TXT should not follow MB-YADAS.TXT. The Y comes after the - in the ASCII table, so MBYR-ILO.TXT should come after MZ-XXX.TXT.
Ditto for MCE-DREA.TXT
@echo off
type nul>MB-THESA.TXT
type nul>MB-WOODY.TXT
type nul>MB-YADAS.TXT
type nul>MBYR-ILO.TXT
type nul>MC-ITSGE.TXT
type nul>MC-MYFOO.TXT
type nul>MC-NACL.TXT
type nul>MC-SOMED.TXT
type nul>MC-WHENE.TXT
type nul>MCE-DREA.TXT
type nul>MD-LILIM.TXT
type nul>MD-SPANI.TXT
@foxidrive
You are right:
The dash or hyphen has got ASCII value 45, the letter Y has got the ASCII value 89. So "-" is smaller than "Y".
You are wrong with respect to the sorting order nevertheless.
"MBY" comes before "MC-", because sorting is done on a character-by-character basis: the first character that differs decides the order.
And everything starting with "MB" is "smaller" than anything starting with "MC".
So TC sorts the filenames in the correct way.
Kind regards,
Karl
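To make the character-by-character argument concrete, here is a minimal sketch in Python. It assumes nothing about TC's internals; it only relies on the fact that Python compares strings by code point, one character at a time, which is the comparison described above. The names are the ones from the batch file, shuffled on purpose.

```python
# File names from the batch file, deliberately out of order.
names = [
    "MD-SPANI.TXT", "MBYR-ILO.TXT", "MC-WHENE.TXT", "MB-YADAS.TXT",
    "MCE-DREA.TXT", "MB-THESA.TXT", "MC-NACL.TXT", "MD-LILIM.TXT",
    "MC-ITSGE.TXT", "MB-WOODY.TXT", "MC-SOMED.TXT", "MC-MYFOO.TXT",
]

# Plain string comparison: the first differing character decides.
by_code_point = sorted(names)

print(ord("-"), ord("Y"))  # 45 89
for name in by_code_point:
    print(name)
```

MBYR-ILO.TXT lands right after MB-YADAS.TXT: both start with "MB", and the third character decides ("-" is 45, "Y" is 89). It still precedes every "MC" name because the second character "B" is already smaller than "C".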
MX Linux 21.3 64-bit xfce, Total Commander 10.52 64-bit
The people of Alderaan keep on bravely fighting back the clone warriors sent out by the unscrupulous Sith Lord Palpatine.
The Prophet's Song
I'm not sure that I explained myself clearly,
TC V6.03 sorts like this:
Animals - Bovine.jpg
Animals- Horse.jpg
Animals - Snake.jpg
Whereas the sort utility sorts like this:
Animals- Horse.jpg
Animals - Bovine.jpg
Animals - Snake.jpg
If sorting by ASCII value, the space (32) comes before the hyphen (45), so sort doesn't conform to ASCII values and there is wiggle room. But TC shows the name with the hyphen in between the two names with a space, which as far as I can see is wrong. It should either be at the top, to match sort's output, or at the bottom, to follow the ASCII values.
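For reference, a strict code-point sort of these three names can be sketched in Python as below. This is only an illustration of the "follow the ASCII values" case; it does not reproduce what either TC or the sort utility does internally.

```python
names = ["Animals - Bovine.jpg", "Animals- Horse.jpg", "Animals - Snake.jpg"]

print(ord(" "), ord("-"))  # 32 45

# Strict code-point order: the eighth character decides, space before hyphen.
strict = sorted(names)
for name in strict:
    print(name)
```

Under strict ASCII ordering the "Animals- Horse.jpg" entry ends up last, because the two other names have a space (32) where it has a hyphen (45).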
- SanskritFritz
- Power Member
- Posts: 3693
- Joined: 2003-07-24, 09:25 UTC
- Location: Budapest, Hungary
D'oh! I did indeed. Have removed it now so I don't make similar blunders in future.
I found this quote from IGL which explains the behaviour.
"Normally the first TC example is standard sorting.
But with SortUpper=2 you want to have different behaviour.
Numbers are sorted in a more "humanoidal" way, e.g. 1,5,9,10,11,20 instead of standard sorting: 1,10,11,20,5,9.
Also spaces are put first in standard sorting (before letter a), but in SortUpper=2 mode spaces are ignored"
So it was the spaces being ignored, which puts H after B in Bovine, and before S in Snake, as TC displayed it.
Thanks for the solution.
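The two effects described in the quote (digit runs compared as numbers, spaces ignored) can be imitated with a small Python sort key. This is a hedged sketch only: `loose_key` is a hypothetical helper written for this illustration, and it is not claimed to be how TC implements SortUpper=2.

```python
import re

def loose_key(name):
    """Hypothetical key: ignore spaces and case, compare digit runs as numbers."""
    cleaned = name.replace(" ", "").casefold()
    # With a capturing group, re.split alternates text, digits, text, ...
    parts = re.split(r"([0-9]+)", cleaned)
    # Odd positions are always digit runs, so text never meets int in a comparison.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

numbers = ["1", "10", "11", "20", "5", "9"]
standard_numbers = sorted(numbers)
natural_numbers = sorted(numbers, key=loose_key)

names = ["Animals - Snake.jpg", "Animals- Horse.jpg", "Animals - Bovine.jpg"]
loose_names = sorted(names, key=loose_key)

print(standard_numbers)
print(natural_numbers)
print(loose_names)
```

With the spaces dropped, all three names start with "animals-", so the comparison falls through to B, H and S, which gives the Bovine, Horse, Snake order that TC displayed.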
Same with accents---
2Sheepdog
Hello Stefan !
• It's the same with all extended characters, I get:
Ancre21.bmp
Big21.bmp
ESPACE21.BMP
Et-commercial.bmp
O-trema-Maj.bmp
œ.bmp
À.bmp
Where À could be second instead of last…
and œ.bmp before O-trema-Maj.bmp, since the dash is ignored with SortUpper=2
* That doesn't seem easy to fix up, I hope it's possible…
Friendly,
Claude
Sheepdog wrote:
Also annoying is:
Amt
horse
postings
ärger
ärger should be sorted like aerger or at least as arger
sheepdog

vlado wrote:
Strange. I have tested it on standard US WinXP and order is as expected:
Amt
ärger
horse
postings
(Same on my slovak localized Win2000.)
Vlado

And you did set SortUpper=2?
I like this sorting except the strange 'Ä' handling.
sheepdog
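The difference between the two orders quoted above can be reproduced with a short Python sketch. It assumes a simple accent-stripping key built on the standard `unicodedata` module; `strip_accents_key` is a hypothetical helper for illustration, not a description of what TC or Windows does.

```python
import unicodedata

def strip_accents_key(name):
    """Hypothetical key: compare on base letters, ignoring accents and case."""
    decomposed = unicodedata.normalize("NFD", name)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()

words = ["Amt", "horse", "postings", "ärger"]

by_code_point = sorted(words)                   # ä (228) sorts after every ASCII letter
by_base_letter = sorted(words, key=strip_accents_key)

print(by_code_point)
print(by_base_letter)
```

This covers only the "sorted as arger" variant. Sorting ä as "ae" would need a language-specific replacement table, and characters such as œ have no Unicode decomposition, so they would need explicit handling as well.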
"A common mistake that people make when trying to design something
completely foolproof is to underestimate the ingenuity of complete fools."
Douglas Adams