Lister and UTF-8 option

Posted: 2006-11-08, 06:50 UTC
by menet
Hi,

TC does not automatically switch the Lister view to UTF-8 if the TXT file does not contain the UTF-8 signature (at the beginning of the file). :?

Today, to change the view to UTF-8 for a file in Lister, we have to use Options / UTF-8 or the 7 shortcut.

Can this shortcut be changed to 8 to be more mnemonic? :?: Or can a second shortcut be added: 8 for the UTF-8 view? :twisted:

Another wish, though not so easy to implement: can we have a new option in Lister to use UTF-8 as the default for text files?

Best regards. :wink:
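
For readers unfamiliar with the "UTF-8 signature": it is the three-byte byte order mark EF BB BF at the start of the file. A minimal Python sketch of the check follows; the function name is made up for illustration and this is not how Lister itself is implemented.

```python
import codecs


def has_utf8_signature(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 signature (EF BB BF)."""
    # codecs.BOM_UTF8 is the standard-library constant b"\xef\xbb\xbf".
    return data.startswith(codecs.BOM_UTF8)
```

A file saved as UTF-8 without the signature fails this check even though its content is valid UTF-8, which is the situation described in this post.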

Posted: 2006-11-08, 16:58 UTC
by ghisler(Author)
How should TC know that a file is in UTF8 format? If it contains only English text, it looks the same as ANSI text...
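
The point about English text can be illustrated with a small Python sketch. It assumes "ANSI" means the Western European code page Windows-1252 (`cp1252`); other systems use other code pages, and the sample strings are invented.

```python
english = "Hello, world"
french = "café déjà vu"

# Pure ASCII text produces exactly the same bytes in both encodings,
# so nothing in the file reveals which one was intended.
same_english = english.encode("utf-8") == english.encode("cp1252")

# Accented letters take two bytes in UTF-8 but one byte in Windows-1252.
same_french = french.encode("utf-8") == french.encode("cp1252")

# UTF-8 bytes shown through an ANSI (cp1252) view come out garbled.
misread = "café".encode("utf-8").decode("cp1252")
```

Here `same_english` is true and `same_french` is false, and `misread` shows the kind of garbled text a French user sees when a UTF-8 file is displayed as ANSI.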

Posted: 2006-11-08, 19:11 UTC
by menet
ghisler(Author) wrote:How should TC know that a file is in UTF8 format? If it contains only English text, it looks the same as ANSI text...
Hi Christian,
If the file does not have a UTF-8 signature, TC can't know that it should read it as UTF-8.

But I would like to have a special option to open text files (files that TC opens in "text only" format by default) in UTF-8 format by default. 8)
I am French and I use UTF-8 by default for all my text files (but without the UTF-8 signature) with the free PSPad text editor ( http://www.pspad.com/ ).
This changes nothing for English text, but it does for French text... :roll:

What about also adding 8 as a shortcut for the UTF-8 format in Lister?

Regards :wink:

Posted: 2006-11-25, 08:40 UTC
by menet
Hi Christian,

It is possible that I have not understood your reply well. :?

Is my request for a new option to read text files as UTF-8 by default a stupid one? :?:
Would it cause a problem in some other view?

Best Regards :wink:

Posted: 2006-11-25, 23:51 UTC
by gigaman
ghisler(Author) wrote:How should TC know that a file is in UTF8 format? If it contains only English text, it looks the same as ANSI text...
For English text, it looks the same, so it also doesn't matter whether the lister is switched to ANSI or UTF-8 ;)
For non-English text, however, it should be possible to "guess" the format even when there's no signature in the file (for example, verify that all bytes >= 0x80 fall into valid UTF-8 sequences... maybe it would be good enough?).
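
A rough Python sketch of this idea, relying on the standard strict UTF-8 decoder to verify that every byte >= 0x80 belongs to a valid sequence. The function name and the three-way result are assumptions made for illustration, not anything from TC.

```python
from typing import Optional


def guess_is_utf8(data: bytes) -> Optional[bool]:
    """Guess whether data is UTF-8 when there is no signature.

    Returns None for pure ASCII (both views show the same text),
    True if all bytes >= 0x80 form valid UTF-8 sequences, else False.
    """
    if all(b < 0x80 for b in data):
        return None  # nothing to decide, ANSI and UTF-8 look identical
    try:
        data.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        return False
    return True
```

For example, `guess_is_utf8("café".encode("utf-8"))` gives `True`, while the same word encoded as Windows-1252 gives `False`, because a lone 0xE9 byte is not a complete UTF-8 sequence.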

Posted: 2006-11-26, 13:49 UTC
by ghisler(Author)
Currently Lister doesn't scan the entire file when loading it, so such a check would take a long time with big files. On the other hand, scanning only a small part of the file could lead to incorrect results.

Posted: 2006-11-26, 20:46 UTC
by menet
Hi Christian, you have not replied to my request for a special option to use UTF-8 as the default for text files, without scanning the file. :roll:

Best Regards :wink:

Posted: 2006-11-26, 22:48 UTC
by gigaman
ghisler(Author) wrote:Currently lister doesn't scan the entire file when loading it, so such a check would take a long time with big files. On the other side, scanning only a small part of the file could lead to incorrect results.
Right, scanning of the whole file is not a good idea if the file is really big - but I think that a smaller block (32kB?) can give quite a reliable result (if the number of 0x80+ characters exceeds certain limit, of course; the format of UTF-8 sequences is quite special).
Maybe this "text format auto-detection" could be an optional feature (enabled/disabled in Lister options). Detecting Unicode files (without a BOM signature) should be possible in a similar way.
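
The block-based check could look roughly like the Python sketch below. The 32 kB block size comes from this post; the threshold of 4 high bytes and all names are hypothetical. The incremental decoder is used so that a multi-byte sequence cut off at the end of the block is not counted as an error.

```python
import codecs

SAMPLE_SIZE = 32 * 1024  # block size suggested in this post
MIN_HIGH_BYTES = 4       # hypothetical threshold, not a tested value


def sample_looks_like_utf8(sample: bytes, truncated: bool) -> bool:
    """Check a leading block; `truncated` means the file continues after it."""
    high = sum(1 for b in sample if b >= 0x80)
    if high < MIN_HIGH_BYTES:
        return False  # too little evidence, keep the default view
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates an incomplete sequence at the block boundary.
        decoder.decode(sample, final=not truncated)
    except UnicodeDecodeError:
        return False
    return True


def file_looks_like_utf8(path: str) -> bool:
    """Read at most one block (plus one byte to detect truncation)."""
    with open(path, "rb") as f:
        data = f.read(SAMPLE_SIZE + 1)
    return sample_looks_like_utf8(data[:SAMPLE_SIZE], len(data) > SAMPLE_SIZE)
```

Only the first block is read, so the cost does not grow with file size. The price is the risk already mentioned in the thread: a file whose first 32 kB are ASCII or valid UTF-8 can still contain other data further on.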

Posted: 2006-11-27, 07:45 UTC
by now
ghisler(Author) wrote:Currently lister doesn't scan the entire file when loading it, so such a check would take a long time with big files. On the other side, scanning only a small part of the file could lead to incorrect results.
Well, one could argue that not scanning even a small part never leads to the correct result. I think people would rather the Lister at least tried to make an educated guess, based on a small part of the file, than that it did nothing.

I can send you a program I wrote to determine the encoding of files at work (based on an algorithm found in the Unix utility "file"). It's written in Ruby, but it should be easy enough to follow even if you're not familiar with the language.

Posted: 2020-06-23, 21:15 UTC
by tommy0910
Is there any news on this?

I don't need any autodetection. I'd just like lister to always start in mode "7" instead of mode "1".

Posted: 2020-06-24, 07:57 UTC
by Horst.Epp
tommy0910 wrote: 2020-06-23, 21:15 UTC Is there any news on this?

I don't need any autodetection. I'd just like lister to always start in mode "7" instead of mode "1".
Install the CudaLister plugin.
You can set UTF-8 as default for opening files and it also has many advantages compared to pure Lister.
The options are reached via the context menu in any open file.
https://totalcmd.net/plugring/CudaLister.html

Posted: 2020-06-24, 09:07 UTC
by Hacker
tommy0910,
I'd just like lister to always start in mode "7" instead of mode "1".
Configuration - Options - Edit/View - External Viewer - Default:

%COMMANDER_EXE% /S=L:T7
HTH
Roman

Posted: 2020-06-24, 19:15 UTC
by tommy0910
Thx :) Great!

Posted: 2021-09-01, 09:55 UTC
by amesh
Hacker wrote: 2020-06-24, 09:07 UTC tommy0910,
I'd just like lister to always start in mode "7" instead of mode "1".
Configuration - Options - Edit/View - External Viewer - Default:

%COMMANDER_EXE% /S=L:T7
HTH
Roman
Where is "Default" in the External Viewer settings? I cannot find it...
Image: https://diogenesfest.com/temp/TC-External-Viewer-settings.png

Posted: 2021-09-01, 12:13 UTC
by Stefan2
amesh wrote: 2021-09-01, 09:55 UTC
Hacker wrote: 2020-06-24, 09:07 UTC tommy0910,
I'd just like lister to always start in mode "7" instead of mode "1".
Configuration - Options - Edit/View - External Viewer - Default:

%COMMANDER_EXE% /S=L:T7
HTH
Roman
Where should be "Default" in External Viewer settings? I can not find...
Image: https://diogenesfest.com/temp/TC-External-Viewer-settings.png


The text box (edit control) after the "Default:" label.