9.0b9 x64 - wincmd.ini encoding issues

The behaviour described in the bug report is either by design, or would be far too complex/time-consuming to be changed

Moderators: Hacker, petermad, Stefan2, white

mag
Junior Member
Posts: 35
Joined: 2008-10-06, 08:35 UTC


Post by *mag »

I have the following in wincmd.ini:

[Configuration]
...
DrivesExportUpcase=1
DrivesShowUpcase=1
...

Since at least 9.0b9 this no longer works: shown/copied drive letters are always lowercase.

EDIT: Turned out to be caused by the wincmd.ini being UTF-8 encoded, see my posts below. If the wincmd.ini is UTF-16 LE encoded then it's possible to avoid the whole issue.

So the question is now regarding the wincmd.ini encoding - does it have to be UTF-16 LE if it contains some special national characters, or is it a bug in tcmd when it has issues with UTF-8 encoding?
Last edited by mag on 2016-08-11, 18:20 UTC, edited 1 time in total.
Lefteous
Power Member
Posts: 9537
Joined: 2003-02-09, 01:18 UTC
Location: Germany

Post by *Lefteous »

Not confirmed
Dalai
Power Member
Posts: 10021
Joined: 2005-01-28, 22:17 UTC
Location: Meiningen (Südthüringen)

Post by *Dalai »

Not confirmed either. Are you sure you checked the correct wincmd.ini used by TC?

Regards
Dalai
#101164 Personal licence
Ryzen 5 2600, 16 GiB RAM, ASUS Prime X370-A, Win7 x64

Plugins: Services2, Startups, CertificateInfo, SignatureInfo, LineBreakInfo - Download-Mirror
Horst.Epp
Power Member
Posts: 7012
Joined: 2003-02-06, 17:36 UTC
Location: Germany

Post by *Horst.Epp »

Not confirmed for TC 9.0b9 x64 and x86 under Windows 10
Windows 11 Home, Version 24H2 (OS Build 26100.4351)
TC 11.55 RC7 x64 / x86
Everything 1.5.0.1395a (x64), Everything Toolbar 1.5.5.0, Listary Pro 6.3.2.88
QAP 11.9.0.4 x64
mag
Junior Member
Posts: 35
Joined: 2008-10-06, 08:35 UTC

Post by *mag »

It behaves really strangely here.
I actually wanted to test:

ShowHiddenSystemOverlay
ShowHiddenDimmed

So I used "Configuration / Change Settings Files Directly" to edit the wincmd.ini and add these options there.
After relaunching tcmd I noticed that several display settings had been reset (for example, showing hidden files, which I had enabled before the change, was now disabled), which was strange. I re-enabled those settings.

However, it seems that besides the originally reported issue, those two options (ShowHiddenSystemOverlay and ShowHiddenDimmed) don't have any effect at all either, so there must be something more general wrong.

If I change some setting via GUI and then check the wincmd.ini via "Configuration / Change Settings Files Directly" the related configuration change is visible there, so it definitely seems to use that file. Also I haven't found another one anywhere on the system disk.

Note: I've got the ini file location set to "Application data" and its actual location is C:\Users\<username>\AppData\Roaming\GHISLER\wincmd.ini

After deleting the whole wincmd.ini and thus resetting the whole configuration it started to behave properly - all those options work if I add them there again. It seems that something that was already there was causing issues, I'll try to find out more.
mag
Junior Member
Posts: 35
Joined: 2008-10-06, 08:35 UTC

Post by *mag »

After thoroughly checking the wincmd.ini I found several duplicate sections and entries in it. This has actually happened to me before, and it was due to some special character being present somewhere... I guess the reason is similar this time as well, but so far I've been unable to find the exact cause.

So I manually cleaned up wincmd.ini and now everything works as expected.
mag
Junior Member
Posts: 35
Joined: 2008-10-06, 08:35 UTC

Post by *mag »

Alright I've found the culprit.

In wincmd.ini I've got the following text in Cyrillic:

[RenameSearchFind]
0=скан

The wincmd.ini is stored in UTF-8, and that breaks things whenever I change anything in the configuration (either via the GUI or via "Configuration / Change Settings Files Directly"). Tcmd chokes on the above text when the file is in UTF-8: it writes a second [Configuration] section to the wincmd.ini and starts to use that one (though not fully; some options still seem to be taken from the first [Configuration] section).

Note that tcmd will still keep the file in UTF-8 so even after adding the 2nd [Configuration] section the file will still contain the cause of the problem.

If I convert the wincmd.ini to UTF-16 LE then the problem doesn't occur.
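For anyone who wants to script the conversion mentioned above, here is a minimal Python sketch. It assumes the whole file decodes cleanly as UTF-8 (with or without a leading BOM); a file that also contains ANSI bytes above 0x7F will raise UnicodeDecodeError instead. The function name and the paths passed to it are hypothetical, and it is safest to run it on a copy of the ini with tcmd closed.

```python
import codecs
from pathlib import Path


def convert_ini_to_utf16le(src: Path, dst: Path) -> None:
    """Re-encode a UTF-8 ini file as UTF-16 LE with a leading BOM (FF FE)."""
    raw = src.read_bytes()
    # "utf-8-sig" strips a leading EF BB BF if the file has one.
    text = raw.decode("utf-8-sig")
    # Write the BOM explicitly so the result does not depend on the
    # byte order of the machine running the script.
    dst.write_bytes(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))
```

Working on bytes (read_bytes/write_bytes) keeps the original line endings untouched.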
Dalai
Power Member
Posts: 10021
Joined: 2005-01-28, 22:17 UTC
Location: Meiningen (Südthüringen)

Post by *Dalai »

TC uses WinAPI functions to read and write wincmd.ini. These API functions support ANSI and Unicode (UTF-16) only. UTF-8 is not supported! If required, TC encodes individual items within the sections separately, prefixing them with a BOM (Byte Order Mark).

There have been several reports of wincmd.ini suddenly becoming UTF-8 encoded. I don't know if the reason/cause for this has been detected yet.

Regards
Dalai
milo1012
Power Member
Posts: 1158
Joined: 2012-02-02, 19:23 UTC

Post by *milo1012 »

Dalai wrote:There have been several reports of wincmd.ini suddenly becoming UTF-8 encoded. I don't know if the reason/cause for this has been detected yet.
I can't remember any report about the ini being converted to UTF-8 automatically, i.e. w/o user interaction. All reports were due to users manually converting the ini to UTF-8.
My guess is:
Some text editors seem to detect the ini file as UTF-8 when opening it*; and when users save them after doing some changes, it might therefore be recoded in the wrong way.

So I think it's time to make some sticky thread in this forum, or an explicit warning in the TC help file, to inform users that they should not recode the file to UTF-8 under any circumstances.


* This is the case when the ini doesn't contain any standalone ANSI characters > 0x7f, but at least one non-codepage (Unicode) entry encoded as UTF-8 byte sequence with a prefixed BOM
TC plugins: PCREsearch and RegXtract
Dalai
Power Member
Posts: 10021
Joined: 2005-01-28, 22:17 UTC
Location: Meiningen (Südthüringen)

Post by *Dalai »

2milo1012
Ah, yes, that could be the case. Hadn't thought of the text editors.
mag
Junior Member
Posts: 35
Joined: 2008-10-06, 08:35 UTC

Post by *mag »

I've done some more tests:

- delete wincmd.ini, start tcmd, search (Alt+F7) for "скан" so that it's stored into the new wincmd.ini, check the result in wincmd.ini: UTF-8 without BOM and tcmd doesn't have any problem with that file

- edit it in Windows (10 Anniversary) Notepad (simply "Configuration / Change Settings Files Directly" will open it in Notepad by default), save it (if you use "Save As" you may verify that it saves it in UTF-8), check the result: UTF-8 with BOM (EF BB BF hex) and tcmd has those above mentioned issues with that file

- convert it to UTF-16 LE (either via Windows Notepad "Save As" with "Encoding: Unicode" selected, or any other way), check the result: UTF-16 LE with BOM (FF FE 5B 00 hex) and tcmd has no problem with that file

So:
- tcmd itself uses UTF-8, but without BOM

- if we edit it in an external editor that adds a BOM to the UTF-8 file, tcmd will choke on that

- Windows Notepad (which is used for "Configuration / Change Settings Files Directly" by default) does exactly that. The easiest workaround is to convert the file to UTF-16 LE (maybe BE would work as well, I haven't bothered with that), since then almost all text editors should save it properly: a BOM is usually used in UTF-16 encoded files but only rarely in UTF-8 ones.

Fixing tcmd so that it works with a wincmd.ini encoded in UTF-8 with BOM would be welcome, though I don't know how difficult that would be or whether it's possible at all.
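The BOM checks in the tests above can be reproduced without a hex editor. The following Python sketch only looks at the first bytes of a file and reports which of the byte order marks mentioned in this thread (EF BB BF, FF FE) is present; the function name is hypothetical, and the UTF-16 BE case is added only for completeness. Note that "no BOM" says nothing about the actual encoding of the rest of the file.

```python
def describe_bom(first_bytes: bytes) -> str:
    """Classify the byte order mark at the very start of a file, if any."""
    if first_bytes.startswith(b"\xef\xbb\xbf"):
        return "UTF-8 with BOM"
    if first_bytes.startswith(b"\xff\xfe"):
        return "UTF-16 LE with BOM"
    if first_bytes.startswith(b"\xfe\xff"):
        return "UTF-16 BE with BOM"
    return "no BOM"
```

It could be called as describe_bom(open(ini_path, "rb").read(4)), where ini_path stands for the location of your own wincmd.ini.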
milo1012
Power Member
Posts: 1158
Joined: 2012-02-02, 19:23 UTC

Post by *milo1012 »

2mag
No, it is just as Dalai explained: TC requests the ini keys from the Win API, and this will only support either ANSI or UTF-16 (LE). The API functions will detect if the file encoding is UTF-16, otherwise it will treat the file as ANSI - nothing else is possible.
This means that TC does not use UTF-8 at any place for the overall file encoding, but:
It will encode individual key values to UTF-8 with a prefixed BOM, because otherwise this information would be lost when dealing with an ANSI encoded ini file (plain ANSI / a code page is limited - you wouldn't be able to store characters outside your local code page when the ini file is ANSI). These byte sequences are the reason that text editors might detect the file as UTF-8 when not much else is in the ini, but this is not intended.
This explains all your findings.
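The per-value scheme described above can be modelled roughly as follows in Python. This is only a sketch of the behaviour as explained in this post, not TC's actual code: the function names are hypothetical and code page 1252 is assumed as the local code page.

```python
import codecs


def encode_ini_value(value: str, codepage: str = "cp1252") -> bytes:
    """Store a value as ANSI if the code page can represent it, else as BOM + UTF-8."""
    try:
        return value.encode(codepage)
    except UnicodeEncodeError:
        # Characters outside the local code page: fall back to UTF-8,
        # marked with a BOM so it can be recognised when reading back.
        return codecs.BOM_UTF8 + value.encode("utf-8")


def decode_ini_value(raw: bytes, codepage: str = "cp1252") -> str:
    """Reverse of encode_ini_value: a leading BOM marks a UTF-8 value."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    return raw.decode(codepage)
```

With these assumptions, a string of umlauts stays in single-byte ANSI form, while the Cyrillic example from this thread is stored as EF BB BF followed by its UTF-8 bytes.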

You can cross check it by doing the following:
Start a search (Alt+F7) with a string that consists only of characters from your local (system) code page with values above 127 (above the ASCII characters), and nothing else (no characters outside your code page). So e.g. on a system with code page 1252 (Western European) you could use some umlauts (öäü). TC will save this string to the ini file in ANSI byte encoding, not UTF-8.
Now do another search with a string consisting of characters outside your local code page, like your former example: TC will save that string to the ini file in UTF-8 byte encoding with a prefixed BOM, but only this string; the 1st string is untouched!
When you now open the ini file in a decent text editor it will detect it as ANSI, because the 1st (ANSI) string forms an invalid byte sequence for a UTF-8 encoding - just as intended.
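The cross check can also be simulated in Python. The sketch below assumes that an editor's detection amounts to "the whole file is valid UTF-8", which is a simplification of what real editors do; the section name [Example] and the key numbers are made up for illustration.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Crude stand-in for an editor's detection: does the file decode as UTF-8?"""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A value inside code page 1252, stored as plain ANSI bytes (F6 E4 FC).
ansi_line = "0=öäü".encode("cp1252")
# A value outside the code page, stored as BOM + UTF-8 bytes.
utf8_line = b"1=" + b"\xef\xbb\xbf" + "скан".encode("utf-8")

# File with only the UTF-8 value: every byte sequence is valid UTF-8.
only_utf8_value = b"[Example]\r\n" + utf8_line + b"\r\n"
# File with both values: the ANSI bytes are not valid UTF-8.
mixed = b"[Example]\r\n" + ansi_line + b"\r\n" + utf8_line + b"\r\n"
```

Here looks_like_utf8(only_utf8_value) is True and looks_like_utf8(mixed) is False, which matches the two situations described above.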
mag
Junior Member
Posts: 35
Joined: 2008-10-06, 08:35 UTC

Post by *mag »

Yes, it's more or less like that, but that's actually even worse. It means you can easily end up with a file that uses multiple character encodings at the same time, and probably no editor will handle that properly. In such a case tcmd should really try to enforce that the whole file is encoded in UTF-16 LE.
MVV
Power Member
Posts: 8711
Joined: 2008-08-03, 12:51 UTC
Location: Russian Federation

Post by *MVV »

It is correct that TC uses Windows API functions that only support ANSI and UTF-16, and that UTF-8 is not supported. However, TC may store some strings in UTF-8 with their own BOMs (these BOMs may tell editors that the file is UTF-8, but you will not see them because a BOM has no visible representation).

I can add that the second [Configuration] section may appear because of the BOM at the beginning of the file. The Windows API expects section names to be at the beginning of lines, but in a UTF-8 file with BOM the first line starts with the BOM, i.e. it is <BOM>[Configuration] instead of just [Configuration], so the API doesn't detect the section.
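To illustrate the effect, here is a deliberately simplified Python model of a line-based section scan. It is not the Windows API implementation, only a sketch under the assumption that a section header must start with "[" as the first byte of a line; the function name is hypothetical.

```python
def find_sections(raw: bytes) -> list[str]:
    """Return the section names a naive byte-wise, line-based scan would see."""
    sections = []
    for line in raw.splitlines():
        # A header only counts if "[" is the very first byte of the line.
        if line.startswith(b"[") and b"]" in line:
            sections.append(line[1:line.index(b"]")].decode("latin-1"))
    return sections


plain = b"[Configuration]\r\nDrivesShowUpcase=1\r\n"
with_bom = b"\xef\xbb\xbf" + plain
```

In this model find_sections(plain) finds "Configuration", while find_sections(with_bom) finds nothing because the first line begins with EF BB BF. A writer that does not find the section would append a new one, which is consistent with the duplicate [Configuration] sections reported earlier in the thread.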

I think it is a bad idea nowadays to open files in Windows Notepad by default for the reasons mentioned (it doesn't allow selecting the input encoding and adds a BOM at the beginning of the file if it detects its encoding as UTF-8). It seems that only editors that allow selecting the input encoding should be used for editing configuration files...
Horst.Epp
Power Member
Posts: 7012
Joined: 2003-02-06, 17:36 UTC
Location: Germany

Post by *Horst.Epp »

MVV wrote:I think it is a bad idea nowadays to open files in Windows Notepad by default for the reasons mentioned (it doesn't allow selecting the input encoding and adds a BOM at the beginning of the file if it detects its encoding as UTF-8). It seems that only editors that allow selecting the input encoding should be used for editing configuration files...
That's the reason why I use NotepadReplacer to get the Syn2 editor instead.
This works even for hardcoded Notepad calls.
https://www.binaryfortress.com/NotepadReplacer/