Use bigger buffer/chunks when comparing files in "Synchronize directories"

kapela86
Junior Member
Posts: 22
Joined: 2013-08-12, 21:28 UTC
Location: Poland

Post by *kapela86 »

When you use "Synchronize directories" with "by content" checked, files are read really slowly (compared to, for example, generating MD5 hashes of the source files and verifying them against the destination files). After some testing I saw that files are read in 32 KB chunks; the chunks are compared, and if they differ, the files are marked as different. That's great, because it speeds up comparison in many situations. But it slows things down when you compare files on the same physical media (HDD/CD/DVD/BR). My proposal is to use bigger chunks in this situation, either by automatically detecting that source and destination are on the same physical media, or by adding a checkbox to the UI or an ini setting.
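For illustration, the chunk-by-chunk comparison described above can be sketched roughly as follows. This is a minimal Python sketch and not Total Commander's actual code; the names `streams_equal`, `files_equal`, and `chunk_size` are hypothetical, and the 32 KB default only mirrors the chunk size observed in the post.

```python
def streams_equal(a, b, chunk_size=32 * 1024):
    """Compare two binary streams chunk by chunk.

    Stops at the first pair of chunks that differ, so files that
    differ early are rejected without being read to the end.
    """
    while True:
        block_a = a.read(chunk_size)
        block_b = b.read(chunk_size)
        if block_a != block_b:
            return False  # first difference found (or lengths differ)
        if not block_a:
            return True   # both streams reached EOF with no difference


def files_equal(path_a, path_b, chunk_size=32 * 1024):
    """Open both files in binary mode and compare them in chunks.

    chunk_size is the (hypothetical) tunable this thread asks for.
    """
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        return streams_equal(a, b, chunk_size)
```

Calling `files_equal(src, dst, chunk_size=1024 * 1024)` would alternate between the two files in 1 MB steps instead of 32 KB steps, which is the behavior the proposal wants to make selectable.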
ghisler(Author)
Site Admin
Posts: 48021
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: Use bigger buffer/chunks when comparing files in "Synchronize directories"

Post by *ghisler(Author) »

When I implemented this, I tried with various block sizes from 32k to 1MB. Interestingly, the 32k method was the fastest, probably due to Windows read cache.
Author of Total Commander
https://www.ghisler.com
MVV
Power Member
Posts: 8702
Joined: 2008-08-03, 12:51 UTC
Location: Russian Federation

Re: Use bigger buffer/chunks when comparing files in "Synchronize directories"

Post by *MVV »

I agree that there should be an option, because reading files from HDDs in small chunks should be slower than reading them in large chunks.

Re: Use bigger buffer/chunks when comparing files in "Synchronize directories"

Post by *kapela86 »

ghisler(Author) wrote: 2019-06-27, 09:34 UTC When I implemented this, I tried with various block sizes from 32k to 1MB. Interestingly, the 32k method was the fastest, probably due to Windows read cache.
Well, I don't know how you got those results in your testing. How long ago did you do it? Let me say just this: if you are reading only one file, then reading it in chunks larger than 32 KB probably won't make any difference on a modern HDD. For example, here's an ATTO benchmark on my Seagate Barracuda Pro 10TB: https://i.imgur.com/mDJLHsR.png
But if you are reading two different files, the actuator has to move from one place to another, and the time it takes to move the actuator translates directly into slower reading (the same principle applies to fragmented files, which is why defragmentation is needed). If you read the files in larger chunks, the actuator doesn't have to move back and forth as often. If you need a test case, I suggest an extreme example: record some large files on a DVD and test synchronization on them. You will clearly see a difference with a larger chunk size.
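The seek argument above can be put into a rough back-of-the-envelope model. The sketch below is only an illustration under loud assumptions: every chunk read costs one full seek, and there is no OS read-ahead or drive cache (which, as noted earlier in the thread, may change real results considerably). The function name and the example figures (10 ms per seek, 200 MB/s sequential throughput) are hypothetical, not measurements.

```python
def estimated_compare_seconds(file_size, chunk_size, seek_seconds, bytes_per_second):
    """Estimate the time to compare two equal-sized files on one drive.

    Assumed model: reads alternate between the two files, so each
    chunk read costs one seek plus the time to transfer the data.
    """
    chunks_per_file = -(-file_size // chunk_size)  # ceiling division
    seeks = 2 * chunks_per_file                    # one seek per chunk, two files
    transfer = 2 * file_size / bytes_per_second    # both files are read in full
    return seeks * seek_seconds + transfer


# Assumed example figures, not measurements: 1 GiB files, 10 ms seek, 200 MB/s.
GIB = 1024 ** 3
small_chunks = estimated_compare_seconds(GIB, 32 * 1024, 0.010, 200e6)
large_chunks = estimated_compare_seconds(GIB, 1024 * 1024, 0.010, 200e6)
print(f"32 KB chunks: {small_chunks:.0f} s, 1 MB chunks: {large_chunks:.0f} s")
```

Under these assumptions the seek term dominates with small chunks and shrinks in proportion to the chunk size, which is the effect the post describes; whether a real drive behaves this way depends on caching and read-ahead.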
Hacker
Moderator
Posts: 13052
Joined: 2003-02-06, 14:56 UTC
Location: Bratislava, Slovakia

Re: Use bigger buffer/chunks when comparing files in "Synchronize directories"

Post by *Hacker »

I remember exactly that when Christian implemented a bigger buffer, it was me who was testing some burned CDs or DVDs by comparing them with the original data on the HDD, and the comparison was about 10 times slower than with the 32 KB buffers. When I reported this, Christian was very surprised, but after he reverted the change the read speeds went back to normal.
I wanted to check the beta forum to see if I could still find the thread and refresh my memory but the betaboard is offline now. I did not find any mention in the history.txt, either.
So I guess we could make the value configurable perhaps?

Roman
Suppose you press Ctrl+F, select the FTP connection (with a saved password), but instead of clicking Connect, you drop dead.
Usher
Power Member
Posts: 1675
Joined: 2011-03-11, 10:11 UTC

Re: Use bigger buffer/chunks when comparing files in "Synchronize directories"

Post by *Usher »

Hacker wrote: 2019-06-28, 21:09 UTC So I guess we could make the value configurable perhaps?
Sure, but… I suspect that the best value for HDD may be somehow related to the size of HDD cache, so there should be separate settings for every drive.
Andrzej P. Wozniak
Polish subforum moderator

Re: Use bigger buffer/chunks when comparing files in "Synchronize directories"

Post by *kapela86 »

Hacker wrote: 2019-06-28, 21:09 UTC it was me who was testing some burned CD's or DVD's by comparing them with the original data on the HDD and the comparison was about 10 times slower than with the 32 KB buffers
Interesting. I can't think of why this would happen, and my curiosity is "sparked".
Hacker wrote: 2019-06-28, 21:09 UTC So I guess we could make the value configurable perhaps?
I think reading in bigger chunks is only needed if you read files from the same physical location. Still, I would love to help test this. For now you could just implement it as a user-configurable value in a beta version and let me know, so I could test it in different situations.