Is there a (64-bit) plugin, or another way to create a custom column, which detects the encoding of a file?
E.g. UTF-8, ANSI/Latin-1, KOI8-R or whatever encoding.
I would also like to know whether a (text) file is a DOS or Unix file.
File Encoding Detection
raytc wrote: ansi/latin1, KOI8-R or whatever encoding
I assume you're aware of the fact that there is no reliable way to detect the difference between different ANSI code pages and OEM/DOS text?
The existing methods rely on statistical guesswork and fail quite often.
The only detections that work fairly reliably are UTF-16 and UTF-8.
You may also use PCREsearch for these encodings, though it doesn't distinguish between OEM/ANSI for the mentioned reason.
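To illustrate why only the Unicode encodings can be detected with some confidence, here is a minimal Python sketch. It is not a Total Commander plugin and not how PCREsearch works; the function name `guess_unicode_encoding` is made up for this example. It only checks for a byte order mark and tests whether the bytes are valid UTF-8. UTF-16 without a BOM and UTF-32 are not handled, and anything else is reported as "unknown" because ANSI and OEM code pages cannot be told apart this way.

```python
import codecs


def guess_unicode_encoding(data: bytes) -> str:
    """Return 'utf-8-sig', 'utf-16', 'ascii', 'utf-8' or 'unknown'.

    Hypothetical helper for illustration only. It does not try to
    distinguish ANSI code pages from OEM/DOS text.
    """
    # A byte order mark is the most dependable hint.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"

    # Pure 7-bit ASCII is valid in almost every 8-bit code page.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass

    # Bytes above 0x7F rarely form valid UTF-8 sequences by accident.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"
```

A file would be read in binary mode first, for example `guess_unicode_encoding(open(path, "rb").read())`, where `path` is whatever file you want to check.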
raytc wrote: I would like to know also if a (text) file is a dos or unix file.
You mean the different line endings, a.k.a. CRLF / LF?
You could also use PCREsearch for this, by counting the occurrences of CRLF / LF.
Create two columns with these regular expressions:
1st: \r\n
2nd: \n
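Note that \n also matches the LF inside every CRLF, so the LF count can never be lower than the CRLF count.
- If the LF column shows a higher number than the CRLF column, the file is probably Unix text (or binary).
- If the LF and CRLF columns show the same number, the file is probably Windows text, since every LF is part of a CRLF.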
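Outside of Total Commander, the same counting idea could be sketched in Python like this. The function name `classify_line_endings` and its return labels are made up for this example. It adds two cases the column approach leaves implicit: files with no line breaks at all, and files that mix both styles.

```python
def classify_line_endings(data: bytes) -> str:
    """Classify raw file bytes as 'dos', 'unix', 'mixed' or 'none'.

    Hypothetical helper mirroring the two-column regex approach:
    count CRLF pairs and count all LF bytes, then compare.
    """
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n")  # includes the LF of every CRLF

    if lf == 0:
        return "none"   # no line breaks found
    if crlf == lf:
        return "dos"    # every LF is preceded by CR
    if crlf == 0:
        return "unix"   # only bare LFs
    return "mixed"      # some CRLF, some bare LF (or binary data)
```

As with the columns, a binary file can easily end up as "unix" or "mixed", so the result is only a hint.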