Req: text search both normal and unicode


Sheepdog
Power Member
Posts: 5150
Joined: 2003-12-18, 21:44 UTC
Location: Berlin, Germany

Req: text search both normal and unicode

Post by *Sheepdog »

I would find it very handy if there were another checkbox (maybe 'ANSI') to let TC search normal text as well as Unicode.

I was looking for a particular text in an *.inf file. The search found several files, but not the one I was looking for. After a few tries I had the idea to check 'Unicode', and bingo: the file was found, while the other files were not.

I think it should be possible to search both Unicode and usual text in one step. How should I know whether a file is saved as Unicode or not, when the display doesn't show me any difference?

sheepdog
"A common mistake that people make when trying to design something
completely foolproof is to underestimate the ingenuity of complete fools."
Douglas Adams
nevidimka
Senior Member
Posts: 385
Joined: 2004-06-20, 21:38 UTC

Post by *nevidimka »

2Sheepdog
How should I know if the file is saved as unicode or not as the display doesn't show me any difference.
If you're using TC's internal Lister, you have to look in the Options menu.
The doorstep to the temple of wisdom is a knowledge of our own ignorance. Benjamin Franklin
Sheepdog
Power Member
Posts: 5150
Joined: 2003-12-18, 21:44 UTC
Location: Berlin, Germany

Post by *Sheepdog »

nevidimka wrote:2Sheepdog
How should I know if the file is saved as unicode or not as the display doesn't show me any difference.
If you're using TC internal lister you have to look menu/Option.
Thanks, I know how to distinguish between Unicode and normal text, but how should you know which format the wanted file uses before you have found it?

sheepdog
SanskritFritz
Power Member
Posts: 3693
Joined: 2003-07-24, 09:25 UTC
Location: Budapest, Hungary

Post by *SanskritFritz »

I support the idea. But is it possible to program? That is a crucial question.
I switched to Linux, bye and thanks for all the fish!
Clo
Moderator
Posts: 5731
Joined: 2003-12-02, 19:01 UTC
Location: Bordeaux, France

If possible---

Post by *Clo »

2Sheepdog
:) Hello Stefan !
but how should you know which format the wanted file uses before you have found it ?
:lol: Maybe using a crystal ball?
- I support your idea too, if it's possible, as SanskritFritz says…

:mrgreen: V G
Claude
Clo
#31505 Traducteur Français de TC French translator Aide en Français Tutoriels Français English Tutorials
Sheepdog
Power Member
Posts: 5150
Joined: 2003-12-18, 21:44 UTC
Location: Berlin, Germany

Re: If possible---

Post by *Sheepdog »

Clo wrote:- I support your idea too, whether it's possible, like SanskritFritz says…
I think TC should search internally 2 times but remember the result of the first search and present both.

sheepdog
Clo
Moderator
Posts: 5731
Joined: 2003-12-02, 19:01 UTC
Location: Bordeaux, France

UTF8 too ?

Post by *Clo »

2Sheepdog
:) Hi Stefan !
I think TC should search internally 2 times but remember the result of the first search and present both.
¤ I agree. I noticed that TC finds the text in UTF-8 files; it doesn't distinguish them from the usual *.txt (ANSI - ASCII).
We don't have a UTF-8 tick box (I don't know whether it would be useful?)
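
A small sketch of why a plain search can also hit UTF-8 files: for pure-ASCII text, the ANSI, DOS and UTF-8 byte sequences are identical, while UTF-16 (the "Unicode" checkbox) differs. The codec names `cp1252` and `cp437` are assumptions standing in for "ANSI" and "DOS/ASCII" on a Western system; this says nothing about how TC itself is implemented.

```python
# Compare how one search term is stored in different encodings.
term = "Tracing:"

ansi = term.encode("cp1252")      # assumed Western "ANSI" code page
oem = term.encode("cp437")        # assumed "DOS/ASCII" code page
utf8 = term.encode("utf-8")
utf16 = term.encode("utf-16-le")  # what the thread calls "Unicode"

# Pure-ASCII text: ANSI, DOS and UTF-8 produce the same bytes.
print(ansi == oem == utf8)

# UTF-16 stores every ASCII character followed by a zero byte.
print(utf16)

# As soon as a non-ASCII character is involved, ANSI and UTF-8 diverge.
accented = "café"
same_when_accented = accented.encode("cp1252") == accented.encode("utf-8")
print(same_when_accented)
```

So a separate UTF-8 search only matters for terms that contain non-ASCII characters.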

:mrgreen: VG
Claude
clo
Flint
Power Member
Posts: 3487
Joined: 2003-10-27, 09:25 UTC
Location: Antalya, Turkey

Post by *Flint »

I was going to create a new thread, but found this one... :)

At present TC supports searching for files containing text in the ANSI, ASCII, Unicode (UTF-16) and UTF-8 encodings, but these options are mutually exclusive. What is confusing is that the encoding options are designed as checkboxes, so the user gets the impression that several of them can be selected, but when he tries, he fails.

My suggestion is to implement one of the following two ideas:
1. (preferable, but more complex to implement) Make it possible to select several encodings at once, so that the user can quickly find the file even if he doesn't remember its encoding. (See the fake screenshot.)
2. (easier, but less preferable) Use radio buttons instead of checkboxes, so that users know at once that it's impossible to search for text in several encodings. (See the fake screenshot.)

Of course, the first variant has the problem that searching with several encodings selected takes longer than with only one. But if I need to search in different encodings, it will take a long time in any case, and in addition I have to run several searches with different parameters... Why not automate it?
Flint's Homepage: Full TC Russification Package, VirtualDisk, NTFS Links, NoClose Replacer, and other stuff!
 
Using TC 10.52 / Win10 x64
Sheepdog
Power Member
Posts: 5150
Joined: 2003-12-18, 21:44 UTC
Location: Berlin, Germany

Post by *Sheepdog »

Flint wrote:1. (more preferable, but more complex to implement) To make it possible selecting several different encodings, so that the user could quickly find the file even if he doesn't remember its encoding (See the fake screenshot.)
Very good idea.

100% support+++

sheepdog
ghisler(Author)
Site Admin
Posts: 48088
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

Well, I could add this, but checking 3 options would mean 3 separate searches then, which would mean a considerable slowdown...
Author of Total Commander
https://www.ghisler.com
Flint
Power Member
Posts: 3487
Joined: 2003-10-27, 09:25 UTC
Location: Antalya, Turkey

Post by *Flint »

ghisler(Author)
but checking 3 options would mean 3 separate searches then, which would mean a considerable slowdown...
Of course, we understand that. But it's much better than running 3 separate searches by hand. :) Thank you for considering this feature!
gigaman
Member
Posts: 131
Joined: 2003-02-14, 11:28 UTC

Post by *gigaman »

Besides, if those 3 searches aren't completely sequential (searching all files for ANSI text first, then searching all files for Unicode text, etc.) but rather somehow parallel (searching one file for ANSI, then again for Unicode, etc., then moving on to the next file, or possibly even working block by block?), the slowdown shouldn't be that bad for "bigger searches": file caching should help (compared to 3 separate searches), IMHO.
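
The per-file variant described above can be sketched roughly as follows. This is only an illustration of the idea, not TC's actual search code: the function names `encoded_needles`, `search_file` and `search_tree` and the encoding list are made up for the example, and each file is read fully into memory for simplicity (a real tool would work block by block).

```python
from pathlib import Path

# Encodings to try; Python codec names chosen for illustration only.
ENCODINGS = ("cp1252", "cp437", "utf-8", "utf-16-le")


def encoded_needles(text, encodings=ENCODINGS):
    """Map each distinct byte pattern of `text` to the encodings producing it."""
    needles = {}
    for enc in encodings:
        try:
            pattern = text.encode(enc)
        except UnicodeEncodeError:
            continue  # the text cannot be written in this encoding
        needles.setdefault(pattern, []).append(enc)
    return needles


def search_file(path, needles):
    """Read the file once and list every encoding whose pattern occurs in it."""
    data = Path(path).read_bytes()
    hits = []
    for pattern, encs in needles.items():
        if pattern in data:
            hits.extend(encs)
    return hits


def search_tree(root, text):
    """Yield (path, matching encodings) for every file under `root`."""
    needles = encoded_needles(text)
    for path in Path(root).rglob("*"):
        if path.is_file():
            hits = search_file(path, needles)
            if hits:
                yield path, hits
```

Because identical byte patterns are merged, a pure-ASCII term needs only two passes over the data (one for the 8-bit form, one for UTF-16), and the file is read from disk only once.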

I certainly vote for this feature! 8)
StatusQuo
Power Member
Posts: 1524
Joined: 2007-01-17, 21:36 UTC
Location: Germany

Post by *StatusQuo »

2ghisler(Author)
but checking 3 options
As this should be optional for the user, this sounds like a good way to implement it.

Although it would be 4 possible options from the current state:
- Unicode
- UTF8
- ANSI (which is standard now, not having a checkbox yet)
- DOS/ASCII

It should be optional because, when you know which encoding the searched file has, you don't care about "matching" files in another encoding. E.g. *.LNK files seem to be some kind of Unicode/UTF-16, while *.URL files are not (in my experience). Also, when searching for "CompanyName" you probably don't want every single office file to be listed...

I agree that this option would make searching more comfortable in other cases, so:

Support+
Who the hell is General Failure, and why is he reading my disk?
-- TC starter menu: Fast yet descriptive command access!
d
Member
Posts: 157
Joined: 2007-02-05, 14:54 UTC

Post by *d »

In Unicode, "Tracing:" looks like "T r a c i n g :" when viewed as ANSI/UTF-8. Both forms could be searched in parallel.
d
Member
Posts: 157
Joined: 2007-02-05, 14:54 UTC

Post by *d »

Sheepdog>I think it should be possible to search both unicode and ususal text in one step.
ghisler(Author)>Well, I could add this, but checking 3 options would mean 3 separate searches then, which would mean a considerable slowdown...
gigaman>won't be completely sequential (searching all files for ANSI text first, then searching all files for UNICODE text, etc.), but rather somehow parallel
d(I)>unicode Tracing: is T r a c i n g : in ansi/utf-8. they could be searched in parallel.

I mean a search that works like an "ignore case" search.
It is already available in the file search tool. How? Set "Find text", set "RegEx" (but don't set "UTF-8" or "Unicode"), and write a RegEx with the following meaning.
If I search for the Russian word "здраствуй" (hello),
I should write:
"здраствуй" OR "Р·РґСЂР°СЃС‚РІСѓР№" OR "74@0AB2C9",
where the first is ANSI (windows-1251), the second is UTF-8, and the third is Unicode.
What is the RegEx for that?
здраствуй|Р·РґСЂР°СЃС‚РІСѓР№|74@0AB2C9

Also, "case sensitive" is faster.

How can you find out how здраствуй looks in UTF-8? Save it with Notepad as UTF-8 and view it with Lister as ANSI.
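
The same trick can be sketched outside TC: encode the word in each encoding and join the byte patterns into one alternation, so a single pass matches any of them. The helper name `multi_encoding_regex` is made up for this example, and Python's `re` stands in for TC's RegEx engine, whose behaviour may differ. Note that the real UTF-16 pattern contains a `\x04` byte after each visible character, which is not visible in the "74@0AB2C9" form shown above.

```python
import re


def multi_encoding_regex(text, encodings=("cp1251", "utf-8", "utf-16-le")):
    """Build one bytes regex that matches `text` in any of the encodings."""
    patterns = []
    for enc in encodings:
        pattern = text.encode(enc)
        if pattern not in patterns:  # skip duplicates (e.g. pure-ASCII text)
            patterns.append(pattern)
    return re.compile(b"|".join(re.escape(p) for p in patterns))


word = "здраствуй"
rx = multi_encoding_regex(word)

# The single regex finds the word regardless of the file's encoding.
for enc in ("cp1251", "utf-8", "utf-16-le"):
    data = ("xx " + word + " yy").encode(enc)
    print(enc, bool(rx.search(data)))

# What Lister would show in ANSI (windows-1251) mode for the other two forms:
utf8_as_ansi = word.encode("utf-8").decode("cp1251")
utf16_as_ansi = word.encode("utf-16-le").decode("cp1251")
print(utf8_as_ansi)
print(repr(utf16_as_ansi))  # repr() makes the hidden \x04 bytes visible
```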