Add BLAKE2 to checksum methods
Re: Add BLAKE2 to checksum methods
Out of curiosity I tried the --no-mmap switch and got very strange results:
After reboot:
1st run with b3sum v155 and v180:
~58 s
2nd and subsequent runs:
~93s
CPU: ~10%
Basically no MEM increase
At first I thought it was thermal throttling, but the SSD stayed at 50-55°C.
And caching should produce the opposite result.
No idea what's going on there.
Re: Add BLAKE2 to checksum methods
2ZoSTeR
The 1st run looks like the usual single-threaded BLAKE3; the 2nd... that I can't explain.
I'm curious what happens if you set --num-threads 2 and --num-threads 4: will there be any difference, since you double the thread count?
Re: Add BLAKE2 to checksum methods
b3sum_v180.exe --num-threads 2
reboot
1: 135,22 s
2: 135,63 s
3: 138,09 s
b3sum_v180.exe --num-threads 4
reboot
1: 88,58 s
2: 92,47 s
3: 92,54 s
The SSD "Active Time" goes up on the 2nd run, but the transfer speed goes down...
Even if b3sum.exe is not optimal, I'm wondering what is causing this: the hardware or Windows?
Neither a pause of a few minutes nor turning off Defender real-time protection helps; only a reboot does.
Two runs with TC are exactly the same, performance-counter-wise.
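A run-to-run comparison like the one above could be collected with a small script. This is only a minimal sketch: the helper names time_runs and format_timings are made up for illustration, the file path in the usage comment is a placeholder, and the script does not reboot or flush any cache between runs, so that part stays manual.

```python
import subprocess
import time


def time_runs(command, runs=3):
    """Run `command` (a list of arguments) `runs` times; return wall-clock seconds per run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the tool's output; only the elapsed time matters here.
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def format_timings(timings):
    """Format the timings as one numbered line per run."""
    return [f"{i}: {t:.2f} s" for i, t in enumerate(timings, start=1)]


# Usage (not executed here; the file path is a placeholder):
# for line in format_timings(
#         time_runs(["b3sum_v180.exe", "--num-threads", "4", r"D:\big.file"])):
#     print(line)
```

Timing the whole process this way includes process start-up, which is negligible for runs that take a minute or more.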
Re: Add BLAKE2 to checksum methods
2ZoSTeR
I can only suggest a rather non-standard approach.
Download this:
https://github.com/Cyan4973/xxHash/releases/download/v0.8.3/xxhsum_win64_v0_8_3.zip
and run xxhsum.exe -H2 file
Because this is about the fastest any kind of hashing can go, it should reach 6 GiB/s; at least that is what your NVMe is rated for.
Maybe the drive isn't performing at its peak?
Re: Add BLAKE2 to checksum methods
Xxhsum is at least consistent between multiple runs and a reboot (2.1 GB/s).
So this has to be a b3sum.exe issue and maybe we shouldn't spam the thread with this any further.
CrystalDiskMark:
[Read]
SEQ 1MiB (Q= 8, T= 1): 7453.595 MB/s [ 7108.3 IOPS] < 1125.00 us>
SEQ 128KiB (Q= 32, T= 1): 7440.779 MB/s [ 56768.6 IOPS] < 563.11 us>
RND 4KiB (Q= 32, T=16): 5733.378 MB/s [1399750.5 IOPS] < 365.23 us>
RND 4KiB (Q= 1, T= 1): 94.838 MB/s [ 23153.8 IOPS] < 43.10 us>
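As a sanity check on the units, the MB/s column in these results is simply IOPS multiplied by the block size, in decimal megabytes. The sketch below recomputes it from the figures quoted above and also converts to MiB/s and GiB/s for comparison with the "6 GiB/s" rating mentioned earlier; the sequential 1MiB result works out to roughly 6.9 GiB/s. The function names are illustrative only.

```python
# (label, block size in bytes, reported MB/s, reported IOPS),
# copied from the CrystalDiskMark [Read] lines above.
RESULTS = [
    ("SEQ 1MiB Q8T1", 1024 * 1024, 7453.595, 7108.3),
    ("SEQ 128KiB Q32T1", 128 * 1024, 7440.779, 56768.6),
    ("RND 4KiB Q32T16", 4 * 1024, 5733.378, 1399750.5),
    ("RND 4KiB Q1T1", 4 * 1024, 94.838, 23153.8),
]


def mb_per_s(iops, block_bytes):
    """Throughput in decimal megabytes per second (1 MB = 10**6 bytes)."""
    return iops * block_bytes / 1_000_000


def mb_to_mib(mb):
    """Convert decimal MB/s to binary MiB/s (1 MiB = 2**20 bytes)."""
    return mb * 1_000_000 / 2**20


def mb_to_gib(mb):
    """Convert decimal MB/s to binary GiB/s (1 GiB = 2**30 bytes)."""
    return mb * 1_000_000 / 2**30


for label, block, reported, iops in RESULTS:
    computed = mb_per_s(iops, block)
    print(f"{label}: reported {reported:.3f} MB/s, IOPS x block = {computed:.3f} MB/s, "
          f"{mb_to_mib(reported):.1f} MiB/s, {mb_to_gib(reported):.2f} GiB/s")
```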
Re: Add BLAKE2 to checksum methods
2ZoSTeR
Right now I don't know of any software or working sample of multi-threaded (via TBB) BLAKE3 C code to offer for comparison.
We've already got single-threaded BLAKE3 C code as BLAKEX64.DLL, so we'll go along with it until ghisler(Author) figures out how to deal with TBB, or whether he wants it at all. Anyway, I don't see the newer version performing slower than the older one, and that's good.
All I've done is draw Christian's attention to the possibility of BLAKE3 becoming multi-threaded, but it's not ready-to-use code, as everybody understands now.
We shouldn't anyway. The truth is that I can't tell why your results are so inconsistent.
- ghisler(Author)
- Site Admin
- Posts: 50817
- Joined: 2003-02-04, 09:46 UTC
- Location: Switzerland
- Contact:
Re: Add BLAKE2 to checksum methods
I have now added Blake3 multi-threading to Total Commander 11.55 RC1 64-bit. The multi-threading library doesn't seem to work with 32-bit and requires a lot of memory, so there is only a 64-bit version. Currently I'm calculating the hash in 1 GByte chunks (mapped into memory), which seems to work well; larger buffers don't seem to provide any benefit. Only chunks larger than 1 MByte are put through the multi-threading function.
This can be configured in wincmd.ini, section [Configuration], with the option
CrcBlake3BlockSize
which sets the block size for the Blake3 hash in MBytes. Special values:
0: 256kBytes, no memory mapping (like TC 11.51 or older)
1: 1MByte, no multi-threading
>=2: multi-threading enabled
2048: Maximum supported value (2GB)
Please test it!
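To make the special values easier to scan, here is a minimal sketch that turns a CrcBlake3BlockSize value into the behaviour described in the list above and prints the matching wincmd.ini lines. The function names are invented for this example, and rejecting values outside 0 to 2048 is an assumption: the post does not say how TC itself handles them.

```python
def describe_blake3_block_size(value):
    """Describe a CrcBlake3BlockSize value (in MBytes) as listed in the post."""
    if not 0 <= value <= 2048:
        # Assumption: the post does not say how TC treats out-of-range values.
        raise ValueError("expected a value from 0 to 2048")
    if value == 0:
        return "256 kByte blocks, no memory mapping (like TC 11.51 or older)"
    if value == 1:
        return "1 MByte blocks, no multi-threading"
    return f"{value} MByte blocks, multi-threading enabled"


def wincmd_ini_fragment(value):
    """Return the wincmd.ini lines that would set the option to `value`."""
    describe_blake3_block_size(value)  # raises ValueError if out of range
    return f"[Configuration]\nCrcBlake3BlockSize={value}"


print(describe_blake3_block_size(1024))
print(wincmd_ini_fragment(1024))
```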
Author of Total Commander
https://www.ghisler.com
Re: Add BLAKE2 to checksum methods
2ghisler(Author)
Christian, it seems that the multi-threaded BLAKE3 C code is affected by this issue, which is closed but not solved:
https://github.com/BLAKE3-team/BLAKE3/issues/390#issue-2231626034
If you hash a big file (4-6 GB) stored on a mechanical HDD with multi-threading enabled, the speed drops to 60 MB/s instead of 140-180 MB/s or whatever the HDD can deliver. Here the main developer, Jack O'Connor, explains why this happens:
https://github.com/BLAKE3-team/BLAKE3/issues/390#issuecomment-2299661934
If a device has only flash-based drives (SSDs) as storage, it's not affected. The problem affects any mechanical HDD when hashing big files.
For people with combined SSD + HDD storage, it might be a good idea to use the suggestion from the comment linked above:
CrcBlake3BlockSize=1 (equivalent to --num-threads=1 for b3sum), or
CrcBlake3BlockSize=0 (equivalent to --no-mmap for b3sum), which also disables multi-threading.
I checked that both options mentioned above work.
- ghisler(Author)
Re: Add BLAKE2 to checksum methods
You can disable multi-threading by setting
CrcBlake3BlockSize=1
as described above, but this brings you back to TC 11.51 speed.
It would be nice to find out programmatically whether the current drive is an SSD or a HDD. Any ideas?
Author of Total Commander
https://www.ghisler.com
Re: Add BLAKE2 to checksum methods
2ghisler(Author)
Yes, this is true, but if a device only has SATA-III drives as storage, one thread of BLAKE3 with AVX2 delivers up to 2000 MiB/s, and that's enough, I think.
Unfortunately, the BLAKE3 team currently has no idea how to tell SSDs and HDDs apart in order to switch threading automatically, and neither do I.
- ghisler(Author)
Re: Add BLAKE2 to checksum methods
I found some hints on StackOverflow, one checking for the presence of the TRIM command, and one checking in WMI:
https://stackoverflow.com/questions/23363115/detecting-ssd-in-windows
Hopefully one of them works without admin rights.
Author of Total Commander
https://www.ghisler.com
Re: Add BLAKE2 to checksum methods
2ghisler(Author)
There's a PowerShell cmdlet:
https://learn.microsoft.com/en-us/powershell/module/storage/get-physicaldisk?view=winserver2012-ps
It shows "MediaType" as HDD or SSD; I think it uses the corresponding property from here:
https://learn.microsoft.com/en-us/windows-hardware/drivers/storage/msft-physicaldisk
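For a quick manual check of what the cmdlet reports, something like the following sketch could be used. It is not meant as a suggestion for how TC should do it: it shells out to PowerShell, does not map a drive letter to its physical disk, and whether it works without admin rights is not verified here. The numeric codes are the MediaType values documented for MSFT_PhysicalDisk; the parser accepts both the numeric and the string form because it is an assumption which one ConvertTo-Json emits on a given system.

```python
import json
import subprocess

# MediaType codes documented for the MSFT_PhysicalDisk class.
MEDIA_TYPE_NAMES = {0: "Unspecified", 3: "HDD", 4: "SSD", 5: "SCM"}

PS_COMMAND = "Get-PhysicalDisk | Select-Object FriendlyName, MediaType | ConvertTo-Json"


def parse_physical_disks(json_text):
    """Map each disk's FriendlyName to a media type name from ConvertTo-Json output."""
    if not json_text.strip():
        return {}
    data = json.loads(json_text)
    # ConvertTo-Json emits a single object instead of a list when there is one disk.
    if isinstance(data, dict):
        data = [data]
    disks = {}
    for disk in data:
        media = disk.get("MediaType")
        if isinstance(media, int):
            media = MEDIA_TYPE_NAMES.get(media, "Unknown")
        disks[disk.get("FriendlyName")] = media
    return disks


def query_physical_disks():
    """Run the cmdlet via PowerShell (Windows only) and parse its output."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", PS_COMMAND],
        capture_output=True, text=True, check=True,
    )
    return parse_physical_disks(result.stdout)


# Usage on Windows (not executed here):
# print(query_physical_disks())
```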
Re: Add BLAKE2 to checksum methods
I tested with my SATA-III SSD (Kingston KC600 512 GB) and an i7-2600K, rebooting the PC after each test:
TC 11.51 = 410 MiB/s (normal)
CrcBlake3BlockSize=1 = 190 MiB/s
CrcBlake3BlockSize=0 = 410 MiB/s
CrcBlake3BlockSize=1024-2047 = 410 MiB/s
With CrcBlake3BlockSize=2048, TC says something like "File doesn't exist".
And yes, as ZoSTeR wrote, CPU load is significantly higher with multi-threading: up to 92%, as opposed to only 32% with a single thread.
- ghisler(Author)
Re: Add BLAKE2 to checksum methods
These are strange results; with my Samsung M.2 SSD, CrcBlake3BlockSize=1024 is about 2-3 times faster than TC 11.51.
Author of Total Commander
https://www.ghisler.com
Re: Add BLAKE2 to checksum methods
2ghisler(Author)
It's a SATA-III SSD, so 410 MiB/s is the maximum for it. When reading from the OS cache, CrcBlake3BlockSize=1024 gives a clear advantage of at least about 2,7 times.