Inconsistencies when saving descript.ion comments with Unix lines

The behaviour described in the bug report is either by design, or would be far too complex/time-consuming to be changed


DrShark
Power Member
Posts: 1872
Joined: 2006-11-03, 22:26 UTC
Location: Kyiv, 68/262

Inconsistencies when saving descript.ion comments with Unix lines

Post by *DrShark »

To reproduce:

Test case 1.

1. Save the following text as a text file with Unix linebreaks:

Code: Select all

2020-09-09 16:40:18 Low memory clear event!
2020-09-09 16:40:18 Low memory clear event!
2020-09-09 16:41:30 tcApplication: onCreate
2020-09-09 16:41:35 PlayIntentReceiver.onReceive: com.ghisler.PlayPause
2020-09-09 16:41:35 showPlayerNotificationIfNotVisible
2. Now that the file has Unix linebreaks, view it with Lister and copy all the text.
3. Open the "Edit comment" (Ctrl+Z) dialog for some file and paste the text.
4. Save the comment with F2:
the comment will be saved with Windows linebreaks.
5. Open the comment dialog again, select all, and paste the text again in place of the current comment (or paste it as a new comment for another file).
6. Don't save the comment yet; instead, use the Enter key to manually create linebreaks in the places where the Unix linebreaks should be (the Edit Comment dialog shows them as an invisible character, though you can discover them because moving the caret "|" past them with the cursor keys takes an additional key press).
7. Now save the comment.
The comment will now have 2 Windows linebreaks in the places where the original Unix linebreaks were.
8. Maybe a side bug, so it would be nice if someone could confirm it: if steps 5-7 are repeated many times for the same comment, TC sometimes saves a single linebreak for some lines, so the saved comment may look like this:

Code: Select all

2020-09-09 16:40:18 Low memory clear event!

2020-09-09 16:40:18 Low memory clear event!

2020-09-09 16:41:30 tcApplication: onCreate
2020-09-09 16:41:35 PlayIntentReceiver.onReceive: com.ghisler.PlayPause

2020-09-09 16:41:35 showPlayerNotificationIfNotVisible
Test case 2.

Do the same steps 1-7 for the following text saved with Unix linebreaks:

Code: Select all

2020-09-09 16:30:41 =====================================
2020-09-09 16:30:41 MediaPlayerActivity: onCreate
2020-09-09 16:30:41 Low memory clear event!
2020-09-09 16:30:41 MediaPlayerActivity:onResume
2020-09-09 16:30:41 MediaPlayerActivity: onWindowFocusChanged (hasFocus)
2020-09-09 16:30:41 MediaPlayerActivity: adjustPlayerWindow: w=0, h=0, windowh=0
2020-09-09 16:30:41 MediaPlayerActivity: onResizeListener
Results:
  • In step 4, the comment will be saved with visible "\n" text instead of any kind of new lines.
    The tooltip for the comment will also show the "\n" text instead of actual linebreaks.
  • In step 7, the comment will be saved with visible "\n\n" text instead of any kind of new lines.
    The tooltip for the comment will also show the "\n\n" text instead of actual linebreaks.
It's not clear which save method (replacing Unix linebreaks with Windows linebreaks, or with a visible "\n" sequence) is intentional here, but in both cases the saved comment doesn't look like the text that was just pasted into the Edit Comment field.
Obviously, if the invisible linebreak were autoconverted to either a Windows linebreak or a "\n" sequence when the text is pasted, there would be no need to create the lines in the Edit Comment dialog manually, which leads to unexpected and unpredictable results.
I reported this unexpected behavior of the Edit Comment edit box by email in September 2020 (at that time I also didn't know that comments with Unix lines could sometimes be saved with "\n", as in test case 2), but Christian just explained the reasons for the current behavior, which I still find unexpected:
Christian Ghisler by email wrote: This happens because the text file is in Unix/Linux format, and Notepad doesn't convert it to Windows line breaks (CR/LF) when copying to the clipboard. The regular edit box of the comment field does not support Unix line breaks.
P.S. Although TC 10 beta 1/1a changes how text with Unix lines is copied from Lister, it doesn't seem to have any impact on the results of the reproduction steps above.

Edit: My configuration:
Windows 7 32-bit SP1 Russian;
Total Commander settings for comments:
Comment type: descript.ion
Preferred type: "Plain text+UTF-16"
"DOS Charset" - tested both enabled and disabled; it doesn't seem to make any difference for either existing or new comments with the test text examples.
In Edit Comment dialog: "Use OEM (DOS) font" checkbox is unchecked.
Last edited by DrShark on 2021-03-15, 15:25 UTC, edited 2 times in total.
Donate for Ukraine to help stop Russian invasion!
Ukraine's National Bank special bank account:
UA843000010000000047330992708
gdpr deleted 6
Power Member
Posts: 872
Joined: 2013-09-04, 14:07 UTC

Re: Inconsistencies when saving comments with Unix lines

Post by *gdpr deleted 6 »

What file comment method do you have configured in TC? Do you use files.bbs? Or descript.ion, and if so, which variant?

Because using descript.ion with "Plaintext + UTF16" (and the "DOS charset" checkbox unchecked), trying your Test 1 behaves exactly like Test 2.
(Maybe i misunderstood your Test 1. What exactly are "Unix lines" and how do they differ from lines with Unix linebreaks?)

For descript.ion, representing linebreaks with \n (backslash character followed by lowercase n) is normal. Because each line in this file represents a comment record, linebreaks in comments cannot be stored unencoded (as this would be a real linebreak constituting the end of the comment record) and need to be stored in encoded form.
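
To illustrate the one-record-per-line point, here is a minimal Python sketch. It assumes the usual descript.ion layout of a file name, a space, and the comment on a single line; the helper names `encode_comment` and `make_record` are hypothetical and not taken from TC, and file names containing spaces are not handled.

```python
def encode_comment(text: str) -> str:
    """Replace real line breaks with the two visible characters backslash + n."""
    # Handle Windows breaks first so "\r\n" becomes one encoded sequence, not two.
    return text.replace("\r\n", "\\n").replace("\n", "\\n")


def make_record(filename: str, comment: str) -> str:
    """Build one record (assumed layout): file name, space, encoded comment."""
    return f"{filename} {encode_comment(comment)}"


comment = "first line\r\nsecond line"
raw = f"log.txt {comment}"                  # unencoded: the break splits the record
encoded = make_record("log.txt", comment)   # encoded: everything stays on one line

print(raw.splitlines())      # two lines, the second half looks like a new record
print(encoded.splitlines())  # one line holding the whole comment
```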

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *gdpr deleted 6 »

DrShark wrote: 2021-03-15, 15:03 UTC The tooltip for the comment will also show the "\n" text instead of actual linebreaks.
[...]
The tooltip for the comment will also show the "\n\n" text instead of actual linebreaks.
Normally TC does render the "\n" as linebreaks in a tooltip (and it does so for me when trying your tests; using your comment settings which coincide with mine).
Check in your descript.ion file whether the multi-line file comments have the 0x04 0xC2 bytes at the end before the Windows linebreak terminating the comment record. If those two bytes are missing, TC will not decode the "\n" as linebreak when showing the tooltip. I wonder if and how you managed to "lose" those two bytes...
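
One way to run that check without a hex editor is sketched below in Python. It is only a sketch under assumptions: it treats the file as 8-bit text with records terminated by CR/LF, and the function name `records_missing_marker` is made up for this example. A UTF-16 descript.ion would need to be decoded first, which this sketch does not attempt.

```python
MARKER = b"\x04\xc2"  # the two extension bytes named in this thread


def records_missing_marker(data: bytes) -> list[bytes]:
    """Return records that contain a literal \\n sequence but lack the marker."""
    suspicious = []
    for record in data.split(b"\r\n"):
        if b"\\n" in record and not record.endswith(MARKER):
            suspicious.append(record)
    return suspicious


# Made-up sample data, not taken from the attached files.
sample = (
    b"a.txt line 1\\nline 2\x04\xc2\r\n"   # multi-line comment with marker
    b"b.txt line 1\\nline 2\r\n"           # same comment, marker missing
    b"c.txt single line\r\n"               # no encoded break at all
)
print(records_missing_marker(sample))
```

For a real file, the same function could be called on the result of `open("descript.ion", "rb").read()`.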

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *DrShark »

elgonzo wrote: 2021-03-15, 15:17 UTC What file comment method do you have configured in TC? Do you use files.bbs? Or descript.ion, and if so, which variant?

Because using descript.ion with "Plaintext + UTF16" (and the "DOS charset" checkbox unchecked), trying your Test 1 behaves exactly like Test 2.
(Maybe i misunderstood your Test 1. What exactly are "Unix lines" and how do they differ from lines with Unix linebreaks?)
I edited the title and the text of the topic to add the details of my configuration.
Regarding your other question, there is no difference between "Unix lines" and "Unix line breaks" in the context of my post.
Originally, the text parts used for the test cases are from log files saved by Total Commander for Android, which saves them with Unix linebreaks.
But to reproduce this on Windows without needing to share the original logs, I did the following:
1. Copied the text from the code part of the start post.
2. Pasted it into Akelpad.
3. In Akelpad, set the linebreak type to Unix (status bar, double-click the area with the "Win" text to change it to "Mac" or "Unix"), then saved the file (didn't change the encoding, here it's Cyrillic Windows CP-1251).
4. Opened the file in Lister, selected all text and copied it.
5. Pasted it into the Edit Comment field.
6. With the Enter key, created linebreaks in the places where they were originally intended to be shown,
e.g. pressing Enter when the cursor/caret is after "!" or before "2" between the strings

Code: Select all

2020-09-09 16:40:18 Low memory clear event!
2020-09-09 16:40:18 Low memory clear event!
which, when the text is pasted into the Edit Comments field, look like a single line with an invisible character between them:

Code: Select all

2020-09-09 16:40:18 Low memory clear event!2020-09-09 16:40:18 Low memory clear event!

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *gdpr deleted 6 »

Still, for me, the result for both Test 1 and Test 2 are exactly the same.

I don't know how you would get Windows linebreaks in Test 1. You can't really, as this would break the descript.ion file. (As i said, Windows linebreaks in the descript.ion would terminate the comment record.) Whether you copy text with Unix or Windows linebreaks into the comment box, they will be translated into "\n" when written into the descript.ion file.

But then again, your descript.ion file also seems to be missing the 0x04 0xC2 extension bytes for multi-line comments (please check this), as indicated by your tooltips rendering "\n" not as linebreaks but as "\n" characters, so i wonder what's going on there on your machine with your TC setup...
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *ghisler(Author) »

The extension bytes tell TC to interpret \n as line breaks. If they are missing, \n will be displayed as is. This is done because descript.ion didn't support line breaks at all. I registered this extension with the developer of 4dos to officially support line breaks.
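
The rule above can be sketched in a few lines of Python. This is an illustration of the described rule only, not TC's code; the function name `comment_for_display` is hypothetical, and the marker is treated as two characters at the end of an already decoded comment string.

```python
MARKER = "\x04\xc2"  # the two extension bytes, seen here as characters


def comment_for_display(stored: str) -> str:
    """Turn a stored comment into what would be shown under the rule above."""
    if stored.endswith(MARKER):
        # Marker present: strip it and expand each literal \n into a real break.
        return stored[: -len(MARKER)].replace("\\n", "\n")
    # Marker missing: the backslash and the n stay ordinary characters.
    return stored


print(comment_for_display("one\\ntwo\x04\xc2"))  # shown on two lines
print(comment_for_display("one\\ntwo"))          # shown with a visible \n
```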
Author of Total Commander
https://www.ghisler.com

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *DrShark »

ghisler(Author) wrote: 2021-03-15, 18:24 UTC The extension bytes tell TC to interpret \n as line breaks. If they are missing, \n will be displayed as is. This is done because descript.ion didn't support line breaks at all. I registered this extension with the developer of 4dos to officially support line breaks.
Is TC supposed to add those extension bytes to the descript.ion (or to a particular comment in it?) when it saves a comment whose text has Unix linebreaks and was pasted into Edit Comment's edit box?

Here's* a ZIPX archive with sample text files and their comment descript.ion files. Both text files have Unix linebreaks. The comment of each file is the content of file itself copied from Lister, then pasted into Edit Comment's edit box, and the comments were saved just like that.

Now that the comments are saved, for me, for the file in the "Case1" dir, TC shows actual linebreaks both in the tooltip and in Edit Comment's edit box. For the file in the "Case2" dir, both the tooltip and Edit Comment's edit box show "\n" character sequences instead of actual linebreaks.

* If the file is downloaded using TC's Download dialog (Ctrl+N), you can set the extension (.zip or .zipx) and any name for the file in the next dialog, which asks for a local filename. If a dialog like this appears during the download:

Code: Select all

---------------------------
SSL
---------------------------
The presented server certificate seems to belong to a different server name!
Continue anyway?

Connected to: api.onedrive.com

SHA1 Fingerprint:
77:27:91:D8:E9:91:39:0B:F9:F9:5E:86:3E:37:D5:DC:9D:85:30:49

Validity: 13.10.2020 20:35:43 until 13.10.2021 20:35:43 (UTC)

Cert subject: 
Country:     	US, S=WA, 
Location:    	Redmond, 
Organisation:	Microsoft Corporation, 
Org. Unit:   	Microsoft Corporation, 
Common name: 	storage.live.com
Alternate names: 	l-df.live.net,l.live.net,api.live.com,api.live.net,docs.live.net,skyapi.live.net,api-df.live.com,api-df.live.net,docs-df.live.net,skyapi-df.live.net,*.ra.live.com,*.cobalt.df.storage.msn.com,*.cobalt.df.storage.live.com,*.cobalt.storage.msn.com,*.df.storage.l

Cert issuer: 
Country:     	US, 
Organisation:	Microsoft Corporation, 
Common name: 	Microsoft RSA TLS CA 02
---------------------------
Yes   No   
---------------------------
press "Yes", the download should go fine.

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *ghisler(Author) »

Yes, TC will add these bytes for Windows line breaks 0x0d 0x0a, but not for Unix style line breaks 0x0a.
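
As a rough model of that statement, the Python sketch below appends the marker only when the text contains a Windows line break. It is not TC's actual code: the name `save_comment_model` is hypothetical, and the assumption that a bare 0x0a is still written as a literal \n is taken from the result reported for test case 2.

```python
MARKER = "\x04\xc2"  # extension bytes, treated here as two characters


def save_comment_model(edit_text: str) -> str:
    """Model of the described save step; an assumption, not TC's implementation."""
    has_windows_break = "\r\n" in edit_text
    # Both kinds of break end up as the visible two-character sequence \n.
    stored = edit_text.replace("\r\n", "\\n").replace("\n", "\\n")
    # Per the statement above, only Windows breaks trigger the marker.
    return stored + MARKER if has_windows_break else stored


print(repr(save_comment_model("a\r\nb")))  # marker appended
print(repr(save_comment_model("a\nb")))    # no marker, \n would stay visible
```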

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *DrShark »

ghisler(Author) wrote: 2021-03-16, 20:53 UTC Yes, TC will add these bytes for Windows line breaks 0x0d 0x0a, but not for Unix style line breaks 0x0a.
OK, then why, for the text of "comment_with_lines.txt", which has Unix linebreaks and is shared in the archive linked in the post above, does TC save the comment with that marker, making it a comment with Windows linebreaks?
The video of the issue (encoded with WMV9, packed into ZIP with compression rate 10).

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *ghisler(Author) »

I will try converting all Unix line breaks received via the Ctrl+D dialog to Windows line breaks, maybe it helps.
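
A conversion of this kind can be sketched in Python as follows. This is only an illustration of the idea, not the code used in TC; the function name `unix_to_windows_breaks` is made up for the example.

```python
import re


def unix_to_windows_breaks(text: str) -> str:
    """Rewrite every bare LF as CR/LF."""
    # Only a "\n" that is not already preceded by "\r" is rewritten,
    # so existing Windows breaks are left alone and the call is repeatable.
    return re.sub(r"(?<!\r)\n", "\r\n", text)


mixed = "line 1\nline 2\r\nline 3"
print(repr(unix_to_windows_breaks(mixed)))
```

Using the lookbehind rather than a plain replace avoids turning an existing CR/LF into CR/CR/LF when the pasted text mixes both styles.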

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *DrShark »

history1000.txt wrote:18.03.21 Fixed: Ctrl+Z edit comment: If user pastes comment with Unix line breaks, convert them to Windows line breaks (32/64)
I can confirm the Ctrl+Z dialog now also saves the comment with the text from test case 2 with Windows line breaks.
However, for both test cases there seems to be an issue:
1. If a file doesn't have a comment, paste one with Unix line breaks and save it:
the comment will be saved fine with Windows linebreaks.
2. Now that the comment is there, edit it and paste the original text with Unix lines again:
The comment will be saved with Windows linebreaks, but some characters at the end of it will also be missing/replaced/added (it seems to be random each time the clipboard text with Unix lines is pasted in place of the existing comment text and then saved).
For example, for test case 2, after this the comment text may become like this:

Code: Select all

2020-09-09 16:30:41 =====================================
2020-09-09 16:30:41 MediaPlayerActivity: onCreate
2020-09-09 16:30:41 Low memory clear event!
2020-09-09 16:30:41 MediaPlayerActivity:onResume
2020-09-09 16:30:41 MediaPlayerActivity: onWindowFocusChanged (hasFocus)
2020-09-09 16:30:41 MediaPlayerActivity: adjustPlayerWindow: w=0, h=0, windowh=0
2020-09-09 16:30:41 MediaPlayerActivity: onResizeList桧ɴܗ慃瑰潩ٮ伂݋敄慦汵ॴ䴋摯污敒畳瑬Ă合扡牏敤ɲ܀䉔瑵潴౮慃据汥畂瑴湯吃条ꈃЏ敌瑦툃́潔ͰĠ圅摩桴䬂䠆楥桧ɴؗ慃据汥܉慃瑰潩ٮ䌆湡散୬潍慤剬獥汵ɴࠂ慔佢摲牥Ȃ
Sometimes there is nothing after the "onResizeList", so it seems TC just cuts "onResizeListener" word, but once I even got after "onResizeList" the text which TC shows after the comment in a tooltip (file type, size and modification date):

Code: Select all

2020-09-09 16:30:41 =====================================
2020-09-09 16:30:41 MediaPlayerActivity: onCreate
2020-09-09 16:30:41 Low memory clear event!
2020-09-09 16:30:41 MediaPlayerActivity:onResume
2020-09-09 16:30:41 MediaPlayerActivity: onWindowFocusChanged (hasFocus)
2020-09-09 16:30:41 MediaPlayerActivity: adjustPlayerWindow: w=0, h=0, windowh=0
2020-09-09 16:30:41 MediaPlayerActivity: onResizeListТип: Текстовый документ
Размер: 412 байт
Дата изменения: 15.03.2021 15:57
2. Currently the line breaks are converted from Unix type to Windows only after the comment is saved. When text with Unix linebreaks is just pasted into the Edit Comment edit box, the linebreaks are shown as invisible characters.
Could the linebreaks be converted to Windows type before the paste action (TC already intercepts the clipboard to convert it, to prevent the crash caused by pasting files copied from Everything)?
Or, alternatively, TC could indicate when there are incompatible characters (e.g. Unix linebreaks) in the Edit Comment box. For example, a short warning like ":!: Text contains unsupported characters (F1 for details)" could appear in the status area of the Edit Comment dialog, shown as a link that, when clicked, opens a help page describing the conversion applied to unsupported characters in comments. This would cover not only the conversion of Unix to Windows linebreaks but also other cases where conversion is used. E.g. since version 9.51RC6, TC, which uses a Unicode edit control in the Edit Comment dialog, uses WideCharToMultiByte when saving a comment with the OEM font and DOS charset options active, to convert text from an unsupported code page. This replaces unsupported characters with something suitable if possible, so when typing e.g. a German umlaut a with 2 dots in a Russian locale, it is converted to an "a" without any dots, and other unsupported characters are converted to an underscore (_). TC 7.04x used an ANSI edit control, which converted all unsupported characters to question marks ("?").
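
The three outcomes described above (closest plain letter, underscore, question mark) can be imitated in Python for illustration. This sketch does not call WideCharToMultiByte and does not reproduce its mapping tables; it swaps in Unicode NFKD decomposition as a stand-in, and `to_codepage`, the `cp866` default and the `_` fallback are assumptions chosen to match the Russian OEM example.

```python
import unicodedata


def to_codepage(text: str, codepage: str = "cp866", fallback: str = "_") -> bytes:
    """Encode text, degrading unsupported characters instead of failing."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(codepage)
            continue
        except UnicodeEncodeError:
            pass
        # Decompose and drop combining marks, e.g. "ä" becomes "a".
        base = "".join(
            c for c in unicodedata.normalize("NFKD", ch)
            if not unicodedata.combining(c)
        )
        try:
            out += base.encode(codepage) if base else fallback.encode(codepage)
        except UnicodeEncodeError:
            out += fallback.encode(codepage)
    return bytes(out)


print(to_codepage("ä"))   # umlaut reduced to a plain letter
print(to_codepage("€"))   # no close equivalent, so the fallback is used
# For comparison, a plain lossy encode gives the question-mark behaviour:
print("ä".encode("cp866", errors="replace"))
```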
Last edited by DrShark on 2021-03-20, 17:43 UTC, edited 1 time in total.

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *ghisler(Author) »

1. Confirmed, there seems to be a missing 0 character.
2. Unfortunately not, the paste is handled by the edit control itself.

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *DrShark »

ghisler(Author) wrote: 2021-03-19, 16:24 UTC 2. Unfortunately not, the paste is handled by the edit control itself.
So, to make it clear, the method to intercept and modify clipboard content used for this fix:
HISTORY.TXT wrote:09.03.20 Fixed: Intercept Ctrl+V and Shift+Insert to command line and current path edit when it doesn't contain plain text, to avoid crash due to a Windows bug (32/64)
can't be used for Edit Comment's edit box?

And even if not, when text with Unix linebreaks has already been pasted and the edit box shows those linebreaks as invisible characters, can TC detect their presence in the edit box? And autoreplace them with Windows linebreaks immediately? Or at least just detect them, so TC could show some kind of indication that some characters in the edit box are not supported and point to Help, so the user could find out what to expect when the comment is saved (see the alternative solution I mentioned in my previous post)?
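
At the string level, the detection asked about here is simple; the Python sketch below shows only that part. How the text would be read from the edit control is left out entirely, and the name `find_bare_breaks` is hypothetical.

```python
import re

# A bare LF (not preceded by CR) or a bare CR (not followed by LF).
BARE_BREAK = re.compile(r"(?<!\r)\n|\r(?!\n)")


def find_bare_breaks(text: str) -> list[int]:
    """Return the positions of line-break characters that are not part of CR/LF."""
    return [m.start() for m in BARE_BREAK.finditer(text)]


pasted = "a\nb\r\nc\rd"
positions = find_bare_breaks(pasted)
if positions:
    print(f"Text contains {len(positions)} unsupported line break(s) at {positions}")
```

A non-empty result could drive either an automatic replacement or the kind of warning suggested in the previous post.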

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *ghisler(Author) »

It's a lot of work with subclassing etc. so I want to avoid it.

Re: Inconsistencies when saving descript.ion comments with Unix lines

Post by *DrShark »

1. I see that Mac linebreaks are currently saved as \n sequences; maybe it's worth converting them to Windows linebreaks too?
2.
ghisler(Author) wrote: 2021-03-21, 11:53 UTC
DrShark wrote: 2021-03-20, 17:39 UTC So, to make it clear, the method to intercept and modify clipboard content ...can't be used for Edit Comment's edit box?

And even if not, when the text with Unix linebreaks is already pasted and editbox shows that linebreaks as invisible characters, can TC detect their presence in edit box? And autoreplace with Windows linebreaks immediately? Or at least just detect ...
It's a lot of work with subclassing etc. so I want to avoid it.
If even detecting unsupported characters requires subclassing, I think at least the behavior for unsupported characters when the comment is saved should be described on the help page "Dialog box - Edit comment". It's worth mentioning at least the following:
  • Unix linebreaks are saved as Windows linebreaks
  • Mac linebreaks are saved as visible \n sequences (if this is changed to match the behavior for Unix linebreaks, Mac linebreaks should just be added to the point above)
  • Characters with diacritics are saved as their low ASCII equivalents (Some characters? It probably depends on what kind of character mapping is used for conversion). I don't know the correct and short terminology for that. It happens at least for pure "Plain text" comments.
  • Other unsupported characters are saved as underscore ("_")
  • Separately, it's worth noting that characters or their parts (for combined characters) may be invisible while typing if the font (e.g. when set to OEM) doesn't support those characters but the comment type (e.g. some UTF) does, and in this case they are saved correctly.