Find duplicate plugin fields: slow and no status

white
Power Member
Posts: 5988
Joined: 2003-11-19, 08:16 UTC
Location: Netherlands

Post by *white »

Tested TC 8.50b8 32bit.

Function Search/Advanced/Find duplicate files

Search a large number of files.

When searching for same name:
* TC seems to scan all folders
* TC shows "Comparison: 99" in the status bar
* Shortly afterwards, the results are displayed

When searching for plugin field [=tc.fullname]:
* TC seems to scan all folders
* The status bar keeps showing the last name found during the folder scan
* TC stops responding
* After a long time, the results are displayed

It seems that searching by plugin fields is implemented much less efficiently than searching for same name (or same size). Can it be improved?

I also suggest displaying "Comparing" in the status bar before TC stops responding.

Can TC be made to stay responsive while comparing, and show progress?
ghisler(Author)
Site Admin
Posts: 50873
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

What plugin fields did you try? If you only search for plugin fields, TC has to get the plugin data value for all files first, and then sort by this value. This can take a long while.
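
The "get the value for all files first, then sort by this value" approach can be sketched roughly as follows. This is only an illustration in Python, not Total Commander's actual code; `find_duplicates` and `field_of` are hypothetical names standing in for the search routine and for whatever call returns a plugin field value for a file.

```python
from __future__ import annotations

from itertools import groupby
from typing import Callable, Iterable


def find_duplicates(paths: Iterable[str],
                    field_of: Callable[[str], str]) -> list[list[str]]:
    """Group paths whose field value is identical (hypothetical helper)."""
    # Step 1: fetch the field value once per file.
    keyed = [(field_of(p), p) for p in paths]
    # Step 2: sort by that value so equal values end up next to each other.
    keyed.sort(key=lambda pair: pair[0])
    # Step 3: walk the sorted list and keep only runs with more than one file.
    groups = []
    for _, run in groupby(keyed, key=lambda pair: pair[0]):
        members = [path for _, path in run]
        if len(members) > 1:
            groups.append(members)
    return groups
```

With this layout, the cost of the whole search is dominated by step 1 (one field lookup per file) and by whichever sorting algorithm is used in step 2.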
Author of Total Commander
https://www.ghisler.com
white
Power Member
Posts: 5988
Joined: 2003-11-19, 08:16 UTC
Location: Netherlands

Post by *white »

ghisler(Author) wrote:
What plugin fields did you try? If you only search for plugin fields, TC has to get the plugin data value for all files first, and then sort by this value. This can take a long while.

I only searched for plugin field "tc.fullname".

Does it take so much longer to get tc.fullname than to get the file name directly?
ghisler(Author)
Site Admin
Posts: 50873
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

There is no need to access the hard disk, but there is still a lot of overhead for allocating the memory blocks to store the extra fields. It's also slower to compare two fields of any type with each other than to compare two hardcoded fields of known type. Therefore I don't think that I can improve much in this situation, but I will of course still have a look at it.
white
Power Member
Posts: 5988
Joined: 2003-11-19, 08:16 UTC
Location: Netherlands

Post by *white »

HISTORY.TXT wrote:
17.11.13 Fixed: Search for duplicate files via plugin fields: Much faster by using quick sort to sort by plugin fields (32/64)

Tested OK using TC 8.50b10 32bit.

The status bar now shows "Comparison:" and it's much, much faster.
ghisler(Author)
Site Admin
Posts: 50873
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

Thanks for trying it! I was using the simple bubble sort algorithm because I expected that plugin fields would be used only in combination with other options, so only a few files per group would need to be sorted. But bubble sort quickly becomes very slow when there are a lot of files to compare. I'm therefore now using the much faster quick sort algorithm.
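
To give a feel for the difference, here is a small comparison-counting sketch in Python. It is not the code used in TC; `bubble_sort` and `quick_sort` are textbook versions written only for illustration, and the quick sort uses a random pivot with three-way partitioning as one possible variant.

```python
import random


def bubble_sort(items):
    """Plain bubble sort; returns (sorted copy, number of comparisons)."""
    data = list(items)
    comparisons = 0
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, comparisons


def quick_sort(items):
    """Quick sort with a random pivot; returns (sorted copy, comparisons)."""
    comparisons = 0

    def sort(data):
        nonlocal comparisons
        if len(data) <= 1:
            return data
        pivot = data[random.randrange(len(data))]
        less, equal, greater = [], [], []
        for value in data:
            comparisons += 1  # counted as one three-way comparison per element
            if value < pivot:
                less.append(value)
            elif value > pivot:
                greater.append(value)
            else:
                equal.append(value)
        return sort(less) + equal + sort(greater)

    return sort(list(items)), comparisons
```

This bubble sort always performs n(n-1)/2 comparisons, which is about two million for 2000 items, while quick sort needs on the order of n log n comparisons on average, so the gap widens rapidly as the file count grows.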
meisl
Member
Posts: 171
Joined: 2013-12-17, 15:30 UTC

Post by *meisl »

Bubble Sort :shock:
ghisler(Author)
Site Admin
Posts: 50873
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Post by *ghisler(Author) »

Bubble sort is OK for a small number of items, just not for thousands...
meisl
Member
Posts: 171
Joined: 2013-12-17, 15:30 UTC

Post by *meisl »

Sure, even NP-hard problems are all manageable - with a small enough input size...

But no offense, really. It's just that I can't think of any reason why one would use Bubble Sort, except maybe ease of implementation. But then again, I wouldn't expect that you implemented sorting "by hand" in TC rather than using a library...?