Problem using Utf-8 encoding for file comments

Bug reports will be moved here when the described bug has been fixed

Moderators: white, Hacker, petermad, Stefan2

thomasmolover
Member
Posts: 160
Joined: 2016-12-12, 01:32 UTC

Problem using Utf-8 encoding for file comments

Post by *thomasmolover »

When using UTF-8 encoding for file comments, comments do not work properly for some files, apparently because the names of these files contain certain special characters.
The files in the following archive cannot be commented normally. If you convert the UTF-8 file directly to UTF-16, the comment information appears.

https://drive.google.com/open?id=17Eb1Pus_kksq76lt9UBbPk-hQW5yekva

Tested on Windows XP and Windows 10.
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

I will check it, thanks.
Author of Total Commander
https://www.ghisler.com
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Problem using Utf-8 encoding for file comments

Post by *ghisler(Author) »

I couldn't reproduce the problem here. However, you used ANSI for the name in the ZIP, so it showed up with characters from the Western charset here. Therefore I used the name from the preview, maybe I got it wrong?
sogou_pinyin_80dбя.txt

Then, commenting it with UTF-8 option worked just fine. Could you please:
1. Set the following option: Configuration - Options - ZIP packer - Pack Unicode names - Store all names containing non-English in extra field
2. Copy sogou_pinyin_80dбя.txt or whatever it is called to an empty directory
3. Delete any descript.ion file from there
4. Set a comment with Ctrl+Z
5. Check whether it misbehaved

If it did, please pack the two files sogou_pinyin_80dбя.txt and descript.ion to ZIP and send them to me.
Thanks!
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Problem using Utf-8 encoding for file comments

Post by *ghisler(Author) »

Any news? I have tested now with Russian locale on Windows 7, where I got the correct name sogou_pinyin_80dбя.txt from the ZIP, no problems either.
thomasmolover
Member
Posts: 160
Joined: 2016-12-12, 01:32 UTC

Re: Problem using Utf-8 encoding for file comments

Post by *thomasmolover »

My friend and I both use a Simplified Chinese OS; I use Windows 10 and he uses Windows 2003. The file in the first archive cannot be commented correctly.

However, the strange thing is that the file name in my ZIP file should contain a ★, but in your reply the file name shows two Russian letters instead.

The character was entered with HZ China, but it changed into Russian letters in the archive.

I repacked it as a 7z file and re-checked the downloaded test file; the error can be reproduced with it.

https://drive.google.com/open?id=14qpLoA8oI1nKO0qXlc29OTzi1vLkDM_F

The correct file name should be as shown below

https://imgur.com/a/nQYy6sv
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Problem using Utf-8 encoding for file comments

Post by *ghisler(Author) »

Thanks for the file! ZIP doesn't store the encoding by default, so it depends on the user locale which characters appear.

Ctrl+Z works just fine here with Swiss German locale on your file. The descript.ion file you sent also works fine with my locale. I will test with Chinese locale, it may be a locale problem.
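
The locale dependence described here can be illustrated with a short Python sketch. It assumes the name was stored in the ZIP as raw GBK bytes (the usual ANSI code page on a Simplified Chinese system) and is later decoded with a different code page. `decode_zip_name` is a hypothetical helper, and code pages 866 and 437 are assumptions chosen for illustration; the thread does not say which code pages were actually involved.

```python
def decode_zip_name(raw: bytes, codepage: str) -> str:
    """Decode a stored file name the way a tool without encoding info would:
    by simply assuming some local code page (hypothetical helper)."""
    return raw.decode(codepage, errors="replace")


# Assumption: the archive holds the name as GBK bytes, without a Unicode flag.
raw_name = "sogou_pinyin_80d★.txt".encode("gbk")

# The same bytes read back under three different assumed code pages.
decoded = {cp: decode_zip_name(raw_name, cp) for cp in ("gbk", "cp866", "cp437")}
```

Under these assumptions, `decoded["gbk"]` gives back the original name with ★, while `decoded["cp866"]` is `sogou_pinyin_80dбя.txt`, the same two Cyrillic letters quoted earlier in the thread.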
thomasmolover
Member
Posts: 160
Joined: 2016-12-12, 01:32 UTC

Re: Problem using Utf-8 encoding for file comments

Post by *thomasmolover »

I found that TC does not recognize the file name correctly. Every time I enter a comment, it adds a new line for the file instead of treating it as the same file.

Here is an animation showing it:

https://imgur.com/a/qqKLNph
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Problem using Utf-8 encoding for file comments

Post by *ghisler(Author) »

I could reproduce the error with Chinese locale (language for non-Unicode programs) now. I hope to find a solution for RC2.
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Problem using Utf-8 encoding for file comments

Post by *ghisler(Author) »

This should work now in RC2, please test it! It was a problem with upper-/lowercase conversion having strange effects on that symbol.
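
The post does not say how the case conversion went wrong. As one possible illustration only (an assumption, not Total Commander's actual code), the Python sketch below compares two ways of building a case-insensitive key for a name stored as UTF-8: lowercasing the raw bytes as if each byte were one character of a single-byte code page (Latin-1 is used here as a stand-in), versus Unicode case folding of the decoded text. The helper names `bytewise_key` and `unicode_key` are hypothetical.

```python
def bytewise_key(name: str) -> bytes:
    """Lowercase the UTF-8 bytes as if every byte were one Latin-1 character.
    This mimics a code-page-based conversion applied to UTF-8 data."""
    raw = name.encode("utf-8")
    return raw.decode("latin-1").lower().encode("latin-1")


def unicode_key(name: str) -> str:
    """Case-insensitive key based on Unicode case folding of the real text."""
    return name.casefold()


upper, lower = "MÜLLER.txt", "müller.txt"

# Byte-wise conversion: the lead byte of "Ü" is changed, the trail byte is not,
# so the two spellings no longer compare equal.
same_bytewise = bytewise_key(upper) == bytewise_key(lower)

# Unicode case folding treats both spellings as the same name.
same_unicode = unicode_key(upper) == unicode_key(lower)
```

With these inputs `same_bytewise` is `False` and `same_unicode` is `True`, which is the kind of mismatch that would make a comment lookup miss an existing entry.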
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Problem using Utf-8 encoding for file comments

Post by *ghisler(Author) »

Here are some instructions on how to test it:
1. Change description file "preferred type" in Configuration - Options - Operation to UTF-8
2. Create a file with accents or umlauts, e.g. Olé.txt or MÜLLER.txt
3. Create a comment for this file with Ctrl+Z, e.g. test
4. Open the descript.ion file with F4
5. In descript.ion, change the accented character of the file name from upper- to lowercase or vice versa, e.g. OLÉ.TXT or Müller.txt
6. Check the comment with Ctrl+Z or Ctrl+Shift+F2

Results:
- TC 9.21 RC2: the comment is still shown
- all older versions: the comment is not shown due to the accent/umlaut
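
The test steps above can be mimicked outside Total Commander with a small Python sketch. It assumes a simplified descript.ion layout (one `name comment` entry per line, names with spaces in double quotes) and uses Unicode case folding for matching. The functions `split_entry`, `get_comment` and `set_comment` are hypothetical and do not reflect how Total Commander is implemented.

```python
from typing import List, Optional, Tuple


def split_entry(line: str) -> Tuple[str, str]:
    """Split one descript.ion line into (file name, comment).
    Simplified format: a name containing spaces is wrapped in double quotes."""
    if line.startswith('"'):
        name, _, rest = line[1:].partition('"')
        return name, rest.lstrip()
    name, _, comment = line.partition(" ")
    return name, comment


def get_comment(lines: List[str], filename: str) -> Optional[str]:
    """Find the comment for filename, ignoring case in a Unicode-aware way."""
    key = filename.casefold()
    for line in lines:
        name, comment = split_entry(line)
        if name.casefold() == key:
            return comment
    return None


def set_comment(lines: List[str], filename: str, comment: str) -> List[str]:
    """Replace the entry for filename, or append one if none matches."""
    key = filename.casefold()
    quoted = f'"{filename}"' if " " in filename else filename
    new_line = f"{quoted} {comment}"
    result, replaced = [], False
    for line in lines:
        name, _ = split_entry(line)
        if name.casefold() == key:
            if not replaced:  # keep a single entry per file
                result.append(new_line)
                replaced = True
        else:
            result.append(line)
    if not replaced:
        result.append(new_line)
    return result


# Step 5 of the test: the stored name differs from the real name only in case.
entries = ["OLÉ.TXT test"]
found = get_comment(entries, "Olé.txt")
updated = set_comment(entries, "Olé.txt", "new comment")
```

Here `found` is `"test"` and `updated` still holds a single line. If the name comparison fails for a character, `set_comment` falls through to the append branch, so each new comment adds another line for the same file.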
thomasmolover
Member
Posts: 160
Joined: 2016-12-12, 01:32 UTC

Re: Problem using Utf-8 encoding for file comments

Post by *thomasmolover »

I tested 9.21 RC2 on Windows 10 following the steps, and the result is correct:

The comment was still shown, and the file name in the descript.ion file changed after renaming the file with upper-/lowercase characters.

The comment for the file "sogou_pinyin_80d★.txt" was also shown correctly.

I have asked my friend to test it on another OS.

Thank you.
petermad
Power Member
Posts: 14739
Joined: 2003-02-05, 20:24 UTC
Location: Denmark

Re: Problem using Utf-8 encoding for file comments

Post by *petermad »

I have tested as described - and it works fine.
Also tested with a filename like Œš.txt to use characters that are not part of my codepage (1252) - also works fine when changed to œŠ.txt in the descript.ion file.
Also tested with Unicode UTF16 and Plain text (after deleting the descript.ion file after each change) - works fine too.
License #524 (1994)
Danish Total Commander Translator
TC 11.03 32+64bit on Win XP 32bit & Win 7, 8.1 & 10 (22H2) 64bit, 'Everything' 1.5.0.1371a
TC 3.50b4 on Android 6 & 13
Try: TC Extended Menus | TC Languagebar | TC Dark Help | PHSM-Calendar
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Problem using Utf-8 encoding for file comments

Post by *ghisler(Author) »

Thanks a lot for your tests!