-
Task
-
Resolution: Done
-
Normal
Apologies if this extractor already exists. I could not find one. Digests are obviously used for verified transfer of files over space and time. In addition, they can be used to identify duplicate files in a larger collection.
I'd recommend that we build an extractor that computes all of the commonly used digest algorithms in one pass through the data.
MD5 - very commonly used to prevent bit rot and verify transfers
SHA256 - efficient replacement for MD5, now very common
SHA512 - increasingly used for de-duplication purposes, less chance of collision
Are there others that we should generate while we are already traversing the file? Any need for an old fashioned checksum?
Note that these do not generally need to be currently strong cryptographic hashes that resist malicious attack. In this case they are used only for fingerprinting the data.
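
To make the "one pass" idea concrete, here is a minimal sketch in Python using only the standard `hashlib` module. It is not an existing extractor: the function name `compute_digests`, the `DEFAULT_ALGORITHMS` tuple, and the 1 MiB chunk size are assumptions chosen for illustration. Each chunk is read once and fed to every hasher, so the file is traversed a single time no matter how many digests are requested.

```python
import hashlib
import io
from typing import BinaryIO, Dict, Iterable

# Assumed defaults: the three algorithms listed in this issue.
DEFAULT_ALGORITHMS = ("md5", "sha256", "sha512")


def compute_digests(
    stream: BinaryIO,
    algorithms: Iterable[str] = DEFAULT_ALGORITHMS,
    chunk_size: int = 1024 * 1024,  # hypothetical default, tune as needed
) -> Dict[str, str]:
    """Return {algorithm name: hex digest} after one pass over the stream."""
    # One hasher per requested algorithm; hashlib.new raises ValueError
    # if a name is not supported by the local build.
    hashers = {name: hashlib.new(name) for name in algorithms}

    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream
        # Feed the same chunk to every hasher so the data is read only once.
        for hasher in hashers.values():
            hasher.update(chunk)

    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


if __name__ == "__main__":
    # In-memory stand-in for a file opened with open(path, "rb").
    for name, value in compute_digests(io.BytesIO(b"hello world")).items():
        print(name, value)
```

Because the algorithm list is a parameter, further digests that `hashlib` supports (for example `sha1`) could be added without another read of the file. An old-fashioned checksum such as CRC32 is not a `hashlib` algorithm and would need separate handling in the same loop, for instance with `zlib.crc32`.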