VTimofeenko 17 hours ago

The most likely case is that the TV computes a hash locally and sends only the hash. Judging by my dnstap logs, the Roku TV maintains a steady ~0.1/second heartbeat to `scribe.logs.roku.com` with occasional pings to `captive.roku.com`. The rest are stragglers that are blocked by the `*.roku.com` DNS blackhole. Another one is `api.rokutime.com`, but as of writing it's a CNAME to one of the `roku.com` subdomains.

The block rate seems to correlate with watch time, rising to ~1/second, so it's definitely trying to phone home with something. Too bad it can't, since all its traffic leaving the LAN is dropped with prejudice.

If your network lets you see stuff like that, look into what the PS5 is trying to do.

godelski 8 hours ago | parent | next [-]

  > Most likely ... sending the hash
If you're tracking packets can't you tell by the data size? A 4k image is a lot more data than a hash.
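
For a rough sense of scale, here is a back-of-the-envelope comparison. It assumes an uncompressed 3840x2160 frame at 3 bytes per pixel and a 64-bit hash; both are assumptions, since a real upload would be compressed and the TV's actual hash size isn't known here.

```python
# Assumed: uncompressed 4K UHD frame in 24-bit RGB versus a 64-bit hash.
WIDTH, HEIGHT, BYTES_PER_PIXEL = 3840, 2160, 3
HASH_BYTES = 8  # 64 bits, an assumed hash size

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
ratio = frame_bytes // HASH_BYTES

print(f"raw frame: {frame_bytes:,} bytes")  # 24,883,200 bytes
print(f"hash:      {HASH_BYTES} bytes")
print(f"ratio:     {ratio:,}x")             # 3,110,400x
```

Even with heavy image compression, the gap should be several orders of magnitude, so packet sizes alone ought to hint at which one is being sent.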

I do suspect you're right, since they would want to reduce bandwidth, especially since residential upload speeds are slow, but this is pretty close to verifiable, right?

Also just curious, what happens if you block those requests? I can say Samsung TVs really don't like it... but they will be fine if you take them fully offline.

VTimofeenko 8 hours ago | parent [-]

> If you're tracking packets can't you tell by the data size? A 4k image is a lot more data than a hash.

I admit, I've not gotten around to properly dumping that traffic. For anyone wanting to do this, there's also a spike of DNS requests every hour on the hour, even if the TV is off (well, asleep). Would be interesting to see those too. Might be a fun NY holiday project right there. Even without decrypting the (hopefully) encrypted traffic, it should be verifiable.

> Also just curious, what happens if you block those requests?

With the `*.roku.com` DNS blackhole in place, the Roku showed no ads, but things like Netflix and YouTube using the standard Roku apps ("channels") worked fine. I've since moved on to playing content through an Nvidia Shield and blocking outside traffic completely. The only odd thing is that the TV occasionally keeps blinking and complains about the lack of network if I misclick and start something other than the HDMI input.

CursedSilicon 16 hours ago | parent | prev | next [-]

Hashing might not work since the stream itself would be variable bitrate, meaning the individual pixels would differ and therefore so would the computed hash.

3wolf 16 hours ago | parent [-]

They're using perceptual hashing, not cryptographic hashing of raw pixels. So it's largely robust to variable bitrate, compression, etc.
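
A small sketch of the difference, assuming two made-up 8x8 grayscale frames that differ by one pixel value (standing in for a compression artifact). The `coarse_bits` function is a hypothetical stand-in for a perceptual hash, not what any TV vendor is known to use.

```python
import hashlib


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integers."""
    return bin(a ^ b).count("1")


def coarse_bits(pixels: bytes) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the frame's mean."""
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (p > mean)
    return bits


# Two hypothetical frames: identical except that one pixel is off by 1.
frame_a = bytes(i * 4 for i in range(64))
tweaked = bytearray(frame_a)
tweaked[10] += 1
frame_b = bytes(tweaked)

sha_a = int.from_bytes(hashlib.sha256(frame_a).digest(), "big")
sha_b = int.from_bytes(hashlib.sha256(frame_b).digest(), "big")

print("SHA-256 bits changed:", hamming(sha_a, sha_b), "of 256")
print("coarse bits changed: ", hamming(coarse_bits(frame_a), coarse_bits(frame_b)), "of 64")
```

The cryptographic digest changes in many bit positions after a one-pixel tweak, while the coarse signature stays the same. Matching is then done by Hamming distance against a threshold rather than by exact equality.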

hnlmorg 8 hours ago | parent [-]

How does perceptual hashing work?

Have you got any recommendations for further reading on this topic?

b_mc2 22 minutes ago | parent | next [-]

These are two articles I liked that are referenced in the Python ImageHash library on PyPI; the second article is a follow-up to the first.

Here are the paraphrased steps/result from the first article for hashing an image:

1. Reduce size. The fastest way to remove high frequencies and detail is to shrink the image. In this case, shrink it to 8x8 so that there are 64 total pixels.

2. Reduce color. The tiny 8x8 picture is converted to a grayscale. This changes the hash from 64 pixels (64 red, 64 green, and 64 blue) to 64 total colors.

3. Average the colors. Compute the mean value of the 64 colors.

4. Compute the bits. Each bit is simply set based on whether the color value is above or below the mean.

5. Construct the hash. Set the 64 bits into a 64-bit integer. The order does not matter, just as long as you are consistent.

The resulting hash won't change if the image is scaled or the aspect ratio changes. Increasing or decreasing the brightness or contrast, or even altering the colors won't dramatically change the hash value.
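
The five steps can be sketched in plain Python. This is a minimal sketch, not the ImageHash library's implementation: the image is assumed to be a list of rows of (r, g, b) tuples, shrinking uses simple block averaging, grayscale is a plain channel mean, and the function names are made up for illustration.

```python
def shrink_to_gray_8x8(image):
    """Steps 1-2: shrink to 8x8 by block averaging and convert to grayscale.

    `image` is a list of rows of (r, g, b) tuples. Width and height are
    assumed to be multiples of 8 to keep the sketch short.
    """
    h, w = len(image), len(image[0])
    bh, bw = h // 8, w // 8
    small = []
    for by in range(8):
        for bx in range(8):
            total = 0.0
            for y in range(by * bh, (by + 1) * bh):
                for x in range(bx * bw, (bx + 1) * bw):
                    r, g, b = image[y][x]
                    total += (r + g + b) / 3  # plain channel mean as grayscale
            small.append(total / (bh * bw))
    return small  # 64 grayscale values, row-major


def average_hash(image):
    """Steps 3-5: compare each value to the mean and pack the 64 bits."""
    gray = shrink_to_gray_8x8(image)
    mean = sum(gray) / len(gray)
    bits = 0
    for value in gray:
        bits = (bits << 1) | (value > mean)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Hypothetical test image: a 16x16 horizontal gradient, plus a brightened copy.
img = [[(x * 16, x * 16, x * 16) for x in range(16)] for _ in range(16)]
brighter = [[(min(r + 10, 255),) * 3 for (r, _, _) in row] for row in img]

print(f"{average_hash(img):016x}")                          # 0f0f0f0f0f0f0f0f
print(hamming(average_hash(img), average_hash(brighter)))   # 0
```

Two images are then treated as a match when the Hamming distance between their hashes is below some small threshold; the threshold is a tuning choice.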

https://www.hackerfactor.com/blog/index.php?/archives/432-Lo...

https://www.hackerfactor.com/blog/index.php?/archives/529-Ki...

tasty_freeze an hour ago | parent | prev | next [-]

In the same way that Shazam can identify songs despite the audio source being terrible over a phone, mixed with background noise. It doesn't capture the audio as a WAV and then scan its database for an exact matching WAV segment.

I'm sure it is way more complex than this, but Shazam does some kind of small windowed FFT and distills it to the dominant few frequencies. It can then find "rhythms" of these frequency patterns, all boiled down to a time stream of signature data. There is some database which can look up these fingerprints. One given fingerprint might match multiple songs, but since they have dozens of fingerprints spread across time, if most of them point to the same musical source, that is what gets ID'd.
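
A toy sketch of that idea, not Shazam's actual algorithm: it assumes a window of 64 samples, keeps only the single strongest frequency bin per window (via a naive DFT), pairs consecutive bins into fingerprints, and votes across a made-up two-song database. All names and values are hypothetical.

```python
import cmath
import math
from collections import Counter

WINDOW = 64  # samples per window (assumed)


def dominant_bin(window):
    """Index of the strongest frequency bin in one window (naive DFT)."""
    n = len(window)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):  # skip DC, positive frequencies only
        acc = sum(window[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin


def fingerprints(samples):
    """Pairs of dominant bins from consecutive windows: a toy 'rhythm' signature."""
    peaks = [dominant_bin(samples[i:i + WINDOW])
             for i in range(0, len(samples) - WINDOW + 1, WINDOW)]
    return list(zip(peaks, peaks[1:]))


def tone_sequence(bins):
    """Synthesize one window of a pure tone for each entry in `bins`."""
    return [math.sin(2 * math.pi * k * t / WINDOW) for k in bins for t in range(WINDOW)]


# Hypothetical database: fingerprint -> set of song titles containing it.
songs = {"song_a": [3, 7, 5, 9, 4], "song_b": [3, 7, 12, 6, 10]}
database = {}
for title, bins in songs.items():
    for fp in fingerprints(tone_sequence(bins)):
        database.setdefault(fp, set()).add(title)


def identify(samples):
    """Vote for every song that shares a fingerprint with the clip."""
    votes = Counter()
    for fp in fingerprints(samples):
        for title in database.get(fp, ()):
            votes[title] += 1
    return votes


# A clip from the start of song_a with a weak extra tone mixed in as "noise".
clip = [s + 0.2 * math.sin(2 * math.pi * 11 * i / WINDOW)
        for i, s in enumerate(tone_sequence([3, 7, 5, 9]))]
votes = identify(clip)
print(votes.most_common())
```

The first fingerprint of the clip matches both songs, but the remaining ones only point to `song_a`, so it wins the vote.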

Someone 5 hours ago | parent | prev | next [-]

https://en.wikipedia.org/wiki/Perceptual_hashing

gertrunde 8 hours ago | parent | prev [-]

Possibly one of the better known (and widely used?) implementations is Microsoft's PhotoDNA; that may be a suitable starting point.

clbrmbr 15 hours ago | parent | prev | next [-]

What system do you use to get that level of visibility?

VTimofeenko 14 hours ago | parent | next [-]

The main data comes from unbound[1]; I use vector[2] to ship and transform logs. The dnstap[3] log format IME works better than the standard logs, especially when it comes to more complex queries and replies. Undesired queries get 0.0.0.0 as a response, which I track.
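
The blackhole logic itself is simple. Here is a rough Python sketch of the matching, purely for illustration: it is not unbound's config or code, the blocklist and function names are made up, and unlike a strict `*.roku.com` wildcard it also blocks the bare domain.

```python
from typing import Optional

BLOCKED_SUFFIXES = ("roku.com",)  # assumed blocklist, mirroring `*.roku.com`
SINKHOLE = "0.0.0.0"


def is_blocked(qname: str) -> bool:
    """True if qname is a blocked domain or any subdomain of one."""
    name = qname.rstrip(".").lower()
    return any(name == s or name.endswith("." + s) for s in BLOCKED_SUFFIXES)


def answer(qname: str, upstream: dict) -> Optional[str]:
    """Sinkhole blocked names; otherwise consult a stubbed upstream table."""
    if is_blocked(qname):
        return SINKHOLE
    return upstream.get(qname.rstrip(".").lower())


# 192.0.2.1 is a documentation-only address used as a stand-in upstream answer.
upstream = {"example.org": "192.0.2.1"}
for name in ("scribe.logs.roku.com", "captive.roku.com.", "notroku.com", "example.org"):
    print(name, "->", answer(name, upstream))
```

Matching on a leading dot keeps look-alike names such as `notroku.com` from being caught by the suffix check.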

Firewall is based on hand-rolled nftables rules.

[1]: https://www.nlnetlabs.nl/projects/unbound/about/
[2]: https://vector.dev
[3]: https://dnstap.info/Examples/

varenc 13 hours ago | parent | prev | next [-]

Besides what others have said, another dead simple option is to use NextDNS: https://nextdns.io

Doesn't require running anything locally and supports various block rules and lists and allows you to enable full log retention if you want. I recommend it to non-techies as the easiest way to get something like pi-hole/dnscrypt-proxy. (but of course not being self-hosted has downsides)

edit: For Roku, DNS blocking like this only works if Roku doesn't use its own resolver. If it's like some Google devices it'll use 8.8.8.8 for DNS resolution ignoring your gateway/DHCP provided DNS server.

ImPostingOnHN 11 hours ago | parent [-]

Seems like you could have a router or firewall mitm queries to e.g. 8.8.8.8 and potentially redirect/rewrite/respond

darkwater 8 hours ago | parent | next [-]

I would not be surprised if Google TV devices start using DoH to 8.8.8.8 sooner rather than later.

godelski 8 hours ago | parent | prev [-]

I'm a noob at this, but can you do that when it is DoT or DoH? Like, I thought the point of them is that you can't forge the DNS response. Even harder with ODoH, right? So does that really get around them?

nwellinghoff 11 hours ago | parent | prev | next [-]

pfSense firewall. There is a week-long learning curve and it’s best to put it on dedicated hardware.

mschuster91 15 hours ago | parent | prev [-]

Replace your router's DNS with something like pi-hole or a bog standard dnsmasq, turn up the logging, that's it. Ubiquiti devices I think also offer detailed DNS logging but not sure.

jakeydus 14 hours ago | parent [-]

I believe UniFi offers aggregated DNS logs ootb, but you could always set up more detailed ones on the gateway itself.

NuclearPM 12 hours ago | parent | prev [-]

I don’t know why you quoted the addresses.

__MatrixMan__ an hour ago | parent | next [-]

It's polite to give parsers (human or otherwise) hints that they're about to encounter text which is now intended for a different kind of parser.

I recently forgot to surround my code in ``` and Gemini refused to help with it (I think I tripped a safety guardrail, it thought I was targeting it with an injection attack). Amusingly, the two ways to work around it were to fence off my code with backticks or to just respond to:

> I can't help you with that

With

> Why not?

After which it was then willing to help with the unquoted code. Presumably it then perceived it as some kind of philosophical puzzle rather than an attack.

RicoElectrico 11 hours ago | parent | prev | next [-]

Markdown habit.

alias_neo 5 hours ago | parent | prev [-]

Tell me you don't Markdown, without telling me you don't Markdown.

It's a developer thing, using backticks means the enclosed text is emphasised when rendered from Markdown.

jameshart 3 hours ago | parent | next [-]

Backticks mark fixed width inline code, not emphasis.

alias_neo 3 hours ago | parent [-]

I know what they do, it doesn't change the fact that we use them for emphasis.

adastra22 an hour ago | parent | prev | next [-]

Backticks long predate markdown.

freedomben 32 minutes ago | parent | prev [-]

How dare someone not be a developer!