Find text with RegEx in non-ANSI files


LiborVojtek
Junior Member
Posts: 2
Joined: 2019-12-11, 05:33 UTC

Find text with RegEx in non-ANSI files

Post by *LiborVojtek »

Hi.

I have a problem: I need to search a lot of files for a specific piece of code (" *10000))/100 ").
The catch is that there may or may not be multiple spaces, tabs, or even comments between the tokens.

I have tried several approaches, but none of them worked (I know of a couple of files that do contain that string).
E.g.:
\*\s*10000\s*.*\s*\)\s*\)\s*/\s*100 (works in Notepad++)
\*\s*10000(.*)\)\s*\)\s*/\s*100
\*(.*)10000(.*)\)(.*)\)(.*)/(.*)100

Then I found out that the issue is probably the encoding: those files are apparently UCS-2 LE with BOM, and the search cannot even find the exact string in such a file unless I select Unicode UTF16.
But when I turn on RegEx, it automatically switches to the ANSI charset (Windows).

So my question is whether there is some way to identify files containing the string with RegEx if they are not in ANSI.

Thanks,
Libor
MVV
Power Member
Posts: 8711
Joined: 2008-08-03, 12:51 UTC
Location: Russian Federation

Re: Find text with RegEx in non-ANSI files

Post by *MVV »

I don't know which TC version you use, since it switches the encoding to ANSI, but recent TC versions allow simultaneous search in multiple encodings (ANSI, OEM, Unicode), and searching with regex works fine in a UTF-16 LE file with or without BOM (though I'm not sure if it supports searching for Unicode characters).
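
To illustrate why the encoding selection matters, here is a minimal Python sketch. It is not how TC searches internally, and the sample line and variable names are made up for the illustration. It shows that a byte-level pattern written for ANSI text does not match the same text stored as UTF-16 LE, because every ASCII character is followed by a zero byte, while the same pattern matches once the bytes are decoded.

```python
import re

# Hypothetical sample line; not taken from the files discussed in this thread.
sample = "x = (a * 10000))/100"

ansi_bytes = sample.encode("cp1252")      # one byte per character
utf16_bytes = sample.encode("utf-16-le")  # two bytes per character, no BOM

# Byte-level pattern, as a search in ANSI mode would effectively see it.
byte_pattern = rb"\*\s*10000\s*\)\s*\)\s*/\s*100"

found_in_ansi = re.search(byte_pattern, ansi_bytes) is not None
found_in_utf16_raw = re.search(byte_pattern, utf16_bytes) is not None

# Decoding the UTF-16 bytes first lets the same pattern run on real text.
text_pattern = byte_pattern.decode("ascii")
decoded = utf16_bytes.decode("utf-16-le")
found_in_utf16_decoded = re.search(text_pattern, decoded) is not None

print(found_in_ansi, found_in_utf16_raw, found_in_utf16_decoded)
```

The raw UTF-16 search fails only because of the interleaved zero bytes, which is consistent with the exact string being found only after selecting Unicode UTF16.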
LiborVojtek
Junior Member
Posts: 2
Joined: 2019-12-11, 05:33 UTC

Re: Find text with RegEx in non-ANSI files

Post by *LiborVojtek »

Thank you, MVV.

I was indeed running an older version of TC, and now it looks like it is happily searching in all encodings.
However, it is still unable to find those files with any of the regular expressions I can come up with.
But when I search for some other simple RegEx, it returns some results.
So now I presume my RegEx is not good.
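
One way to check the expression independently of TC is a small Python sketch like the one below. It rests on several assumptions that are not confirmed anywhere in this thread: that comments use the `/* ... */` form, that files without a UTF-16 BOM can be read as Windows-1252, and that gaps may span line breaks. The names `GAP`, `decode_bytes`, `contains_expression` and `matching_files` are hypothetical helpers, and Python's regex dialect may differ from the one TC uses.

```python
import re
from pathlib import Path

# Assumption: a gap is any mix of whitespace (including newlines)
# and /* ... */ block comments. Adjust for the real comment syntax.
GAP = r"(?:\s|/\*.*?\*/)*"

# The tokens of " *10000))/100 ", each allowed to be separated by a gap.
TOKENS = [r"\*", "10000", r"\)", r"\)", "/", "100"]
PATTERN = re.compile(GAP.join(TOKENS), re.DOTALL)


def decode_bytes(raw):
    """Decode UTF-16 when a BOM is present, otherwise fall back to cp1252."""
    if raw.startswith(b"\xff\xfe") or raw.startswith(b"\xfe\xff"):
        return raw.decode("utf-16")  # the codec consumes the BOM
    return raw.decode("cp1252", errors="replace")


def contains_expression(text):
    """Return True if the tolerant pattern occurs anywhere in the text."""
    return PATTERN.search(text) is not None


def matching_files(folder, glob="*"):
    """List files under folder whose decoded content matches the pattern."""
    return [
        path
        for path in Path(folder).rglob(glob)
        if path.is_file() and contains_expression(decode_bytes(path.read_bytes()))
    ]
```

A possible reason the earlier attempts fail is that `.` often does not match line breaks, so an expression split over several lines is missed; the sketch sidesteps this by letting `\s` and the comment alternative cover newlines. Whether that applies to these files is a guess.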