3wolf 16 hours ago

They're using perceptual hashing, not cryptographic hashing of the raw pixels, so it's robust to variable bitrate, compression, etc.

hnlmorg 8 hours ago | parent [-]

How does perceptual hashing work?

Have you got any recommendations for further reading on this topic?

b_mc2 27 minutes ago | parent | next [-]

These are two articles I liked that are referenced by the Python ImageHash library on PyPI; the second article is a follow-up to the first.

Here are the paraphrased steps (and result) from the first article for hashing an image; a rough code sketch follows the list:

1. Reduce size. The fastest way to remove high frequencies and detail is to shrink the image. In this case, shrink it to 8x8 so that there are 64 total pixels.

2. Reduce color. The tiny 8x8 picture is converted to a grayscale. This changes the hash from 64 pixels (64 red, 64 green, and 64 blue) to 64 total colors.

3. Average the colors. Compute the mean value of the 64 colors.

4. Compute the bits. Each bit is simply set based on whether the color value is above or below the mean.

5. Construct the hash. Set the 64 bits into a 64-bit integer. The order does not matter, just as long as you are consistent.

The resulting hash won't change if the image is scaled or the aspect ratio changes. Increasing or decreasing the brightness or contrast, or even altering the colors won't dramatically change the hash value.
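For concreteness, here's a minimal Python sketch of those five steps (the average-hash or "aHash" approach), using Pillow. The 8x8 size and the helper names are just illustrative choices, and the real ImageHash library does more than this:

    # Minimal average-hash (aHash) sketch, following the five steps above.
    # Illustrative only; not the ImageHash library's implementation.
    from PIL import Image

    def average_hash(path, hash_size=8):
        # 1. Reduce size: shrink to 8x8 (64 pixels).
        # 2. Reduce color: convert to grayscale ("L" mode).
        img = Image.open(path).resize((hash_size, hash_size), Image.LANCZOS).convert("L")
        pixels = list(img.getdata())            # 64 grayscale values
        # 3. Average the colors.
        mean = sum(pixels) / len(pixels)
        # 4-5. One bit per pixel (above/below the mean), packed into a 64-bit int.
        bits = 0
        for p in pixels:
            bits = (bits << 1) | (1 if p > mean else 0)
        return bits

    def hamming_distance(a, b):
        # Visually similar images should differ in only a few of the 64 bits.
        return bin(a ^ b).count("1")

Comparing two hashes is then just a Hamming distance: a handful of differing bits usually means "same image, re-encoded", while ~32 differing bits means unrelated images.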

https://www.hackerfactor.com/blog/index.php?/archives/432-Lo...

https://www.hackerfactor.com/blog/index.php?/archives/529-Ki...

tasty_freeze an hour ago | parent | prev | next [-]

It works in the same way that Shazam can identify songs even when the audio source is terrible: captured over a phone, mixed with background noise. Shazam doesn't record the audio as a WAV and then scan its database for an exact matching WAV segment.

I'm sure it is far more complex than this, but Shazam does some kind of short windowed FFT and distills each window down to its few dominant frequencies. It can then find "rhythms" in these frequency patterns, all boiled down to a time series of signature data. A database lets it look up these fingerprints. Any single fingerprint might match multiple songs, but since there are dozens of fingerprints spread across time, if most of them point to the same recording, that's the song that gets identified.
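To make that idea concrete, here's a toy Python sketch of landmark-style fingerprinting: windowed FFTs, keep the dominant peak per window, and pair peaks over time into hashable fingerprints. This is not Shazam's actual algorithm; the window size, fan-out, and fingerprint format are made up for illustration:

    # Toy landmark fingerprinting sketch (not Shazam's real algorithm).
    import numpy as np

    def fingerprints(samples, window=4096, hop=2048, fan_out=5):
        peaks = []  # (window_index, dominant_frequency_bin)
        for t, start in enumerate(range(0, len(samples) - window, hop)):
            frame = samples[start:start + window] * np.hanning(window)
            spectrum = np.abs(np.fft.rfft(frame))
            peaks.append((t, int(np.argmax(spectrum))))
        # Pair each peak with a few later peaks; (f1, f2, dt) is the fingerprint key,
        # stored with the time offset t1 so matches can be checked for consistent timing.
        fps = []
        for i, (t1, f1) in enumerate(peaks):
            for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
                fps.append(((f1, f2, t2 - t1), t1))
        return fps

A query clip produces the same kind of fingerprints; looking them up in a database and checking that many hits line up with a consistent time offset is what identifies the song.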

Someone 5 hours ago | parent | prev | next [-]

https://en.wikipedia.org/wiki/Perceptual_hashing

gertrunde 8 hours ago | parent | prev [-]

Possibly one of the better-known (and widely used?) implementations is Microsoft's PhotoDNA, which may be a suitable starting point.