nsp's comment about the 260-byte buffer size seems to be spot on.
The string
Code:
H:\share\download\anime\BD\ソードアート・オンライン アリシゼーション War of Underworld\[アニメ BD] ソードアート・オンライン アリシゼーション War of Underworld 第01話 「北の地にて」 (1920x1080 x264 AAC+コメ).mp4
encoded in UTF-8 requires 264 bytes. With a buffer of only 260 bytes, the last 4 bytes are lost.
If one also considers that the string in the buffer is terminated with a null byte (in other words, the buffer can hold at most 259 bytes of text data), then in total the last 5 bytes of the original string go missing. Which is exactly what we are seeing here: ").mp4" is the missing 5-byte tail of the original string.
I think this topic should be moved into the bugs section.
Important note to ghisler (just in case you are not already aware of this):
If you are going to address and fix this issue, do
NOT mistakenly assume that a string encoded in UTF-8 always uses no more bytes than the same string encoded in UTF-16. For characters in the Unicode range U+0800–U+FFFF, a UTF-8-encoded character uses 3 octets/bytes, whereas in UTF-16 the same character uses only 2. As you might already suspect, many CJK characters fall into this range. It is thus quite possible that a file path encoded in UTF-8 requires more bytes than the same file path encoded in UTF-16. To deal with this, you could query the UTF-8-encoded byte size of the string and dynamically allocate/resize the buffer accordingly before writing the UTF-8-encoded string to it. If you prefer sticking with pre-allocated constant-size buffers, it should be safe to use a buffer size of
3 * 32 K = 96 KB for UTF-8 file paths.