
SFV sort, detecting duplicates

Posted: 2013-06-27, 19:02 UTC
by isidro
It could be VERY useful to change the SFV format so that the checksum comes FIRST, [then optionally the file size], and finally the file path and name. The file could then be sorted with a text editor (TextPad) to quickly find duplicates. This could be made configurable (filename first / CRC first). Verify should easily handle either format.

Posted: 2013-06-27, 20:18 UTC
by HolgerK
This may result in files which can't be read by other software.
http://en.wikipedia.org/wiki/Simple_file_verification wrote:SFV uses a plain text file containing one line for each file and its checksum in the format
FILENAME<whitespaces>CHECKSUM.
Any line starting with a semicolon ';' is considered to be a comment and is ignored for the purposes of file verification. The delimiter between the filename and checksum is always one or several spaces; tabs are never used.
Regards
Holger

Posted: 2013-06-29, 04:57 UTC
by isidro
That is why I suggested making this option configurable. The default should stay as usual, which would keep other programs happy, but the other behaviour would be very useful for anyone who needs it.
Maybe it could be another of the checksum options: SFV, MD5, SHA1 and a new SVF (inverse SFV).

It would also be nice to show the processing speed in kB/s, as in other functions.

Posted: 2013-06-29, 07:44 UTC
by MVV
isidro,
You can easily get the desired format using a regexp replace of ^(.*) +([0-9a-f]+)$ with \2\t\1 (this swaps the columns) in any text editor that supports regexps, and then sort. E.g. EmEditor supports regexp replace from the command line, so you can create a TC button that opens such a file and does the regexp replace at once.
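
For anyone without a regexp-capable editor, the same column swap can be sketched in a few lines of Python. This is only an illustration: the function names and sample lines are made up, and two things go beyond the pattern above, namely case-insensitive matching (many SFV writers emit upper-case hex) and stripping trailing spaces from the name.

```python
import re
from collections import defaultdict

# Same pattern as in the post; IGNORECASE is an added assumption because
# many SFV writers emit upper-case hex digits.
SFV_LINE = re.compile(r"^(.*) +([0-9a-f]+)$", re.IGNORECASE)

def swap_columns(lines):
    """Return sorted 'checksum<TAB>name' lines, skipping ';' comments."""
    swapped = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith(";"):
            continue  # blank line or SFV comment
        match = SFV_LINE.match(line)
        if match:
            # The greedy (.*) keeps extra padding spaces, so strip them.
            name = match.group(1).rstrip()
            swapped.append(f"{match.group(2).lower()}\t{name}")
    return sorted(swapped)

def find_duplicates(swapped):
    """Group names by checksum and keep checksums seen more than once."""
    groups = defaultdict(list)
    for entry in swapped:
        checksum, name = entry.split("\t", 1)
        groups[checksum].append(name)
    return {c: names for c, names in groups.items() if len(names) > 1}

# Made-up sample lines, not real checksums.
sample = [
    "; generated example, not real data",
    "a.txt 1A2B3C4D",
    "dir\\b.txt   1a2b3c4d",
    "c.txt DEADBEEF",
]
rows = swap_columns(sample)
print("\n".join(rows))
print(find_duplicates(rows))
```

Equal CRC32 values only suggest that files are identical, so any group found this way is a candidate list to check, not a proof.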

Also, you can simply use MD5/SHA1 checksums for comparing files. On modern computers they are as fast as CRC32, but they make accidental collisions far less likely and already give you the desired output file format.

Posted: 2013-07-01, 08:50 UTC
by ghisler(Author)
Just use MD5 or SHA1 as MVV has suggested - the specification defines that the checksum comes in front of the name for these formats.
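
Since the checksum already comes first in these files, duplicates can be found without swapping anything. A minimal sketch, assuming the common md5sum-style layout (hex digest, a space, then `*` or a second space, then the name); the function name is hypothetical and the exact layout your tool writes may differ:

```python
from collections import defaultdict

def duplicates_from_hash_lines(lines):
    """Group file names by leading checksum; return groups with 2+ names."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        checksum, _, name = line.partition(" ")
        # Assumed layout: one marker character ('*' or ' ') before the name.
        if name[:1] in ("*", " "):
            name = name[1:]
        groups[checksum.lower()].append(name)
    return {c: names for c, names in groups.items() if len(names) > 1}

# Sample lines for illustration only; the file names are made up.
sample = [
    "d41d8cd98f00b204e9800998ecf8427e *empty1.txt",
    "0cc175b9c0f1b6a831c399e269772661 *a.txt",
    "D41D8CD98F00B204E9800998ECF8427E *backup\\empty2.txt",
]
dupes = duplicates_from_hash_lines(sample)
print(dupes)
```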

Posted: 2013-07-14, 00:23 UTC
by isidro
ghisler(Author) wrote:Just use MD5 or SHA1 as MVV has suggested - the specification defines that the checksum comes in front of the name for these formats.
Thanks! I didn't know that.