[Bug] Invalid filenames encoding in ZIP

Flint
Power Member
Posts: 3487
Joined: 2003-10-27, 09:25 UTC
Location: Antalya, Turkey

Post by *Flint »

I can tell you which characters are Russian in these two encodings, but I'm not sure that this is a good criterion...

In the 1251 codepage, the Russian characters are:
capital - 0xC0..0xDF and 0xA8
small - 0xE0..0xFF and 0xB8

In 866:
capital - 0x80..0x9F and 0xF0
small - 0xA0..0xAF, 0xE0..0xEF and 0xF1
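
For readers who want to check these ranges, here is a minimal Python sketch. It assumes Python's built-in `cp1251` and `cp866` codecs as the reference tables; the helper names (`byte_set`, `decoded`) are made up for illustration and are not part of Total Commander.

```python
def byte_set(*parts):
    """Build a set of byte values from single bytes and inclusive (lo, hi) ranges."""
    out = set()
    for part in parts:
        if isinstance(part, tuple):
            lo, hi = part
            out.update(range(lo, hi + 1))
        else:
            out.add(part)
    return frozenset(out)


# Byte ranges exactly as listed in the post above.
CP1251_CAPITAL = byte_set((0xC0, 0xDF), 0xA8)
CP1251_SMALL = byte_set((0xE0, 0xFF), 0xB8)
CP866_CAPITAL = byte_set((0x80, 0x9F), 0xF0)
CP866_SMALL = byte_set((0xA0, 0xAF), (0xE0, 0xEF), 0xF1)

# The 33-letter Russian alphabet by Unicode code point:
# U+0410..U+042F plus U+0401 (capitals), U+0430..U+044F plus U+0451 (smalls).
RUSSIAN_CAPITAL = {chr(c) for c in range(0x0410, 0x0430)} | {"\u0401"}
RUSSIAN_SMALL = {chr(c) for c in range(0x0430, 0x0450)} | {"\u0451"}


def decoded(byte_values, codec):
    """Decode each byte value on its own and return the set of characters."""
    return {bytes([b]).decode(codec) for b in byte_values}


if __name__ == "__main__":
    print(decoded(CP1251_CAPITAL, "cp1251") == RUSSIAN_CAPITAL)
    print(decoded(CP1251_SMALL, "cp1251") == RUSSIAN_SMALL)
    print(decoded(CP866_CAPITAL, "cp866") == RUSSIAN_CAPITAL)
    print(decoded(CP866_SMALL, "cp866") == RUSSIAN_SMALL)
```

Each comparison checks that a listed byte range decodes to exactly the Russian capitals or smalls in the given codepage.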

But there may be Ukrainian letters too, and I don't know them all. I'll ask on the Russian forum (there are many people from Ukraine there).

<Added>
Well, after some investigation I found these letters myself. :) The additional Ukrainian letters in the 1251 codepage are: 0xB3, 0xB4, 0xBA, 0xBF (small) and 0xB2, 0xA5, 0xAA, 0xAF (capital). In 866 there are only four of them (two capital and two small): 0xF2..0xF5
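
A quick way to see which letters these codes stand for is to decode them, sketched here with Python's built-in `cp1251` and `cp866` codecs (an assumption about the reference tables; the constant names are illustrative). For 866 the sketch uses 0xF2 through 0xF5, since four letters need four code points.

```python
# Additional Ukrainian letters, as byte values per codepage.
UKR_CP1251_SMALL = bytes([0xB3, 0xB4, 0xBA, 0xBF])
UKR_CP1251_CAPITAL = bytes([0xB2, 0xA5, 0xAA, 0xAF])
UKR_CP866 = bytes(range(0xF2, 0xF6))  # 0xF2..0xF5 inclusive

if __name__ == "__main__":
    for label, raw, codec in [
        ("1251 small", UKR_CP1251_SMALL, "cp1251"),
        ("1251 capital", UKR_CP1251_CAPITAL, "cp1251"),
        ("866", UKR_CP866, "cp866"),
    ]:
        text = raw.decode(codec)
        # Print each byte next to the character it decodes to.
        pairs = " ".join(f"0x{b:02X}={ch}" for b, ch in zip(raw, text))
        print(f"{label}: {pairs}")
```

The small and capital lists are in matching order, so upper-casing the decoded small letters should give the decoded capitals.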
Last edited by Flint on 2006-06-23, 10:22 UTC, edited 1 time in total.
Flint's Homepage: Full TC Russification Package, VirtualDisk, NTFS Links, NoClose Replacer, and other stuff!
 
Using TC 10.52 / Win10 x64
ghisler(Author)
Site Admin
Posts: 48097
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

So according to your list, my detection method should also work for Russian.
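
The detection method itself is not spelled out in this part of the thread. Purely as an illustration, and not as Total Commander's actual implementation, here is one hypothetical heuristic in Python: count how many high bytes of a filename fall into the Russian letter ranges of each codepage and pick the codepage with the higher count. The name `guess_codepage` and the scoring rule are assumptions.

```python
def _ranges(*spans):
    """Flatten inclusive (lo, hi) spans into a frozenset of byte values."""
    return frozenset(b for lo, hi in spans for b in range(lo, hi + 1))


# Russian letters (capital and small together), per the lists in this thread.
CP1251_LETTERS = _ranges((0xC0, 0xFF), (0xA8, 0xA8), (0xB8, 0xB8))
CP866_LETTERS = _ranges((0x80, 0xAF), (0xE0, 0xF1))


def guess_codepage(raw: bytes):
    """Return 'cp1251', 'cp866', or None when the letter counts tie.

    Hypothetical heuristic: only bytes >= 0x80 are scored, and each byte
    earns one point for every codepage in which it is a Russian letter.
    """
    high = [b for b in raw if b >= 0x80]
    score_1251 = sum(b in CP1251_LETTERS for b in high)
    score_866 = sum(b in CP866_LETTERS for b in high)
    if score_1251 > score_866:
        return "cp1251"
    if score_866 > score_1251:
        return "cp866"
    return None


if __name__ == "__main__":
    name = "\u041f\u0440\u0438\u0432\u0435\u0442.txt"  # "Привет.txt"
    print(guess_codepage(name.encode("cp1251")))
    print(guess_codepage(name.encode("cp866")))
```

Bytes 0xE0..0xF1 are letters in both codepages, so a name made up only of such bytes ties and the sketch returns `None`; a real detector would need an extra tie-breaker.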

Unfortunately I don't know Russian or Ukrainian, so I cannot say which of the letters are really used in these languages. For example, the code 0x81 in Windows-1251 (which looks to me like a Greek Gamma with a French accent aigu) is described as "Cyrillic capital letter GJE". Do you know/use this character? There are more such characters in the 0x80 row at
http://www.microsoft.com/globaldev/reference/sbcs/1251.mspx
Author of Total Commander
https://www.ghisler.com
XPEHOPE3KA
Power Member
Posts: 854
Joined: 2006-03-03, 18:23 UTC
Location: Saint-Petersburg, Russia

Post by *XPEHOPE3KA »

2Flint
A5 & B3 are misplaced in your post (check others too!), but I see no difference for detection: does it matter whether a letter is capital or not?
F6, Enter, Tab, F6, Enter, Tab, F6, Enter, Tab... - I like to move IT, move IT!..
Flint

Post by *Flint »

No, this letter isn't used in Russian. It's also not used in Ukrainian, Belarusian or Kazakh (other languages that use the 1251 codepage). However, there may still be other Cyrillic languages that use this letter... Of course, I don't know all of them.
I can only be sure that punctuation and other symbols like 0x82, 0x84..0x89, 0x8B are not letters. :)
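
The letter/non-letter split can also be checked mechanically. The sketch below assumes Python's `cp1251` codec and the standard `unicodedata` module; `describe` is an illustrative helper name. Unicode general categories starting with "L" mark letters.

```python
import unicodedata

# Bytes named above as punctuation or other symbols in codepage 1251.
NON_LETTER_BYTES = [0x82, *range(0x84, 0x8A), 0x8B]


def describe(byte_value, codec="cp1251"):
    """Return (character, Unicode general category, Unicode name) for one byte."""
    ch = bytes([byte_value]).decode(codec)
    return ch, unicodedata.category(ch), unicodedata.name(ch)


if __name__ == "__main__":
    for b in [0x81, *NON_LETTER_BYTES]:
        ch, category, name = describe(b)
        kind = "letter" if category.startswith("L") else "not a letter"
        print(f"0x{b:02X} {ch} {category} {name} ({kind})")
```

In this check 0x81 comes out as a letter (CYRILLIC CAPITAL LETTER GJE), while the listed symbol bytes do not.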

XPEHOPE3KA
I just mixed them up by accident. :) Capitals should be smalls and vice versa. I've already corrected it.
Of course, it does not affect anything, but just for convenience...