Plain text search fails but RegEx succeeds

Horst.Epp · Post by *Horst.Epp » 2019-01-14, 09:27 UTC

MarkFilipak wrote: 2019-01-13, 21:08 UTC As a topic for general discussion... (I think poking brains is fun)

In the world of GUI, why are we still dragging around '\n' & '\t'? Why can't we simply feed text -- any text -- into a text-box and click "Find"? By "any text" I include new-lines and tabs and control chars and... anything. The current search input methods are CLI relics that can be abandoned.

So, what would submit the search string? Not '\n' -- that's so 'CLI'. What would submit the search string would be [ Find ].

I don't agree with this general assumption.
That has nothing to do with CLI or any other environment conditions.
It's a major difference whether I search for text inside lines or across line boundaries.
Nevertheless it would be helpful to have a search mode regardless of any special chars.

Usher · Post by *Usher » 2019-01-14, 15:01 UTC

Horst.Epp wrote: 2019-01-14, 09:27 UTC Nevertheless it would be helpful to have a search mode regardless of any special chars.

In general you are right, it would be something more like a smart web search. For a start, I would like to see the following features:

fold all white space characters (replace multiple spaces, tabs and EOLs with a single space) to eliminate regex syntax;
ignore accents, umlauts, ogonki etc.;
ignore punctuation marks.
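
To make the three features above concrete, here is a minimal Python sketch, for illustration only. The names `fold` and `loose_contains` are made up for this example and say nothing about how Total Commander searches; the choice to turn punctuation into a space (rather than delete it) is also just an assumption.

```python
import re
import unicodedata


def fold(text: str) -> str:
    """Fold text for loose matching (hypothetical helper, not a TC feature)."""
    # Decompose accented letters into base letter + combining marks (NFD),
    # then drop the combining marks (Unicode category "Mn").
    decomposed = unicodedata.normalize("NFD", text)
    no_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Replace every punctuation character (categories "P*") with a space.
    no_punct = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in no_marks
    )
    # Collapse runs of spaces, tabs and EOLs into a single space.
    return re.sub(r"\s+", " ", no_punct).strip()


def loose_contains(haystack: str, needle: str) -> bool:
    """True if the folded needle occurs in the folded haystack."""
    return fold(needle) in fold(haystack)


if __name__ == "__main__":
    print(fold("Café,\tcrème\r\nbrûlée!"))
    print(loose_contains("Ich hätte gern\n\tzwei   Brötchen.",
                         "hatte gern zwei Brotchen"))
```

Note that this only removes marks that Unicode can decompose. A letter such as the Polish ł has no decomposition, so it survives the folding unchanged and would need its own mapping.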

There are many kinds of special chars, so they should be grouped somehow (by Unicode range?), and I have no idea which group should be ignored in the first place (emoji?).

Finally, it's almost impossible to provide really smart search - because of (backward) compatibility issues there may be problems with duplicates, homographs, some accented characters, other special characters etc. in Unicode, see Wikipedia articles:
https://en.wikipedia.org/wiki/Duplicate_characters_in_Unicode
https://en.wikipedia.org/wiki/Unicode_equivalence
https://en.wikipedia.org/wiki/Unicode_compatibility_characters
https://en.wikipedia.org/wiki/Homoglyph
https://en.wikipedia.org/wiki/IDN_homograph_attack
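
A small Python sketch of the equivalence problem described in those articles, using only the standard `unicodedata` module. The helper name `same_text` is hypothetical. It shows that normalization fixes some of these cases but not homoglyphs.

```python
import unicodedata


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after Unicode normalization (illustrative helper)."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent
ligature = "\ufb01"      # the "fi" ligature (a compatibility character)
cyrillic_a = "\u0430"    # Cyrillic "а", looks like Latin "a"

if __name__ == "__main__":
    print(composed == decomposed)             # raw comparison
    print(same_text(composed, decomposed))    # canonical equivalence (NFC)
    print(same_text(ligature, "fi", "NFKC"))  # compatibility equivalence
    print(same_text(cyrillic_a, "a", "NFKC")) # homoglyph, stays different
```

Homoglyphs are distinct characters by design, so no normalization form unifies them; a search that wanted to treat them as equal would need a separate confusables table.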

MarkFilipak · Post by *MarkFilipak » 2019-01-14, 17:36 UTC

Hi Horst! Thanks for participating.

Horst.Epp wrote: 2019-01-14, 09:27 UTC
MarkFilipak wrote: 2019-01-13, 21:08 UTC As a topic for general discussion... (I think poking brains is fun)

In the world of GUI, why are we still dragging around '\n' & '\t'? Why can't we simply feed text -- any text -- into a text-box and click "Find"? By "any text" I include new-lines and tabs and control chars and... anything. The current search input methods are CLI relics that can be abandoned.

So, what would submit the search string? Not '\n' -- that's so 'CLI'. What would submit the search string would be [ Find ].
I don't agree with this general assumption.
That has nothing to do with CLI or any other environment conditions. ...

Oh? Doesn't it? Well, let me ask: Why does '\n' even exist? I think it's because '\n' is the only way to include end-of-line in a search string. And that's only because, for a command line, an actual end-of-line ('Enter' key) terminates the command line! But what if we stop insisting that CLI commands consist solely of printable characters? What if we include non-printable characters?

What about this: Type in the first 'line' of target text but for end-of-line, press Ctrl+Enter; then continue typing in the next 'line' of target text, etc. When all the 'lines' of target text have been entered, then press Enter (without Ctrl) to terminate the command line. IMHO, that's the way Perl RegEx should have been from the get-go.

Next conceptual step: Type in the first 'line' of target text including Enter; then continue typing in the next 'line' of target text, etc. -- just as you would enter text into a text editor. When all the 'lines' of target text have been entered, then press the [Find] button to submit the search command.

'\n' to insert end-of-line into a search string is a relic of a bygone era.
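
As a rough sketch of what such a "paste anything and click Find" search could do behind the button, assuming it is built on a regex engine: the pasted text is taken literally, and each real line break in it is allowed to match any line-ending style in the file. The function name `find_literal` is invented for this example and is not an existing Total Commander or Perl feature.

```python
import re

# Pattern that matches one line break in any common style (CRLF, CR or LF).
ANY_EOL = r"(?:\r\n|\r|\n)"


def find_literal(haystack: str, needle: str) -> int:
    """Return the offset of needle in haystack, or -1 if absent.

    The needle is literal text that may contain real line breaks;
    no '\\n' escapes or regex syntax are interpreted.
    """
    # Split the pasted text at its line breaks, escape each piece so that
    # characters like "." or "(" have no special meaning, then rejoin the
    # pieces with a pattern that accepts any line-ending style.
    pieces = re.split(r"\r\n|\r|\n", needle)
    pattern = ANY_EOL.join(re.escape(piece) for piece in pieces)
    match = re.search(pattern, haystack)
    return match.start() if match else -1


if __name__ == "__main__":
    file_text = "alpha\r\nbeta (1.0)\r\ngamma"
    pasted = "beta (1.0)\ngamma"   # two lines, as typed into a text box
    print(find_literal(file_text, pasted))
```

The user never types `\n`; the escaping happens internally, which is roughly the division of labour the proposal asks for.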
