The Smart TV in Your Living Room Is a Node in the AI Scraping Economy (blog.includesecurity.com)
144 points by nikcub 7 hours ago | 49 comments
xg15 4 hours ago | parent | next [-]

> After config fetch, the SDK opens a persistent WebSocket to:

wss://proxyjs.brdtnet.com:443

This hostname resolves to AWS Global Accelerator IPs

There is some irony that both the scrapers and the websites being scraped are probably hosted on AWS, while playing an elaborate cat-and-mouse game pretending that they weren't.

BLKNSLVR 37 minutes ago | parent | next [-]

Adding to DNS block list immediately.

xg15 17 minutes ago | parent [-]

Don't forget the config endpoint before as well.

> On every launch the SDK calls:

GET https://clientsdk.bright-sdk.com/sdk_config_ios.json?appid=<bundle>&ver=<sdk-version>&uuid=sdk-ios-<32hex>
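For anyone who wants both hostnames in a blocklist, here is a minimal sketch that prints hosts-file entries for them. The two hostnames are the ones quoted above; the function name `hosts_entries` and the `0.0.0.0` sink address are illustrative choices, not anything from the article.

```python
# Hostnames quoted from the article in this thread. The SDK may well use
# others that are not listed here.
BLOCKED_HOSTS = [
    "proxyjs.brdtnet.com",
    "clientsdk.bright-sdk.com",
]


def hosts_entries(hostnames, sink="0.0.0.0"):
    """Return hosts-file lines that point each hostname at a sink address."""
    # De-duplicate and sort so the output is stable between runs.
    return [f"{sink} {name}" for name in sorted(set(hostnames))]


if __name__ == "__main__":
    print("\n".join(hosts_entries(BLOCKED_HOSTS)))
```

A hosts file only matches exact names, so this does nothing for other subdomains; a resolver-level blocklist that supports whole-domain rules would be the place to cover those.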

cyanydeez 3 hours ago | parent | prev [-]

Kind of how the American government needs commercial businesses, which it poorly regulates, so those businesses provide privacy invasions as a legal means for it to wash its hands.

rootsudo an hour ago | parent [-]

Same for arms dealing, and every other industry.

cobbzilla 4 hours ago | parent | prev | next [-]

I never connect any “smart” device to wifi. If it doesn’t work without connectivity, I don’t want it. I use my TVs as display devices. They have HDMI-in and that’s it.

graypegg an hour ago | parent | next [-]

I have a smart TV that's never spoken to the internet after exiting the factory, but it's a pretty tenuous state of affairs. I have this fear that someone staying over is going to see the "Services unavailable, press [menu] to troubleshoot" toast that shows up over top of the HDMI feed for a few seconds and think they're helping me by connecting it. 4-5 years' worth of firmware updates all at once... half a decade of watch data somehow extracted from the HDMI feed and stored for precisely this moment... ads everywhere. Even if it doesn't happen instantly, I can only assume there's some flag deep in the OS called makeEverythingWorse just waiting to be flipped on the femtosecond The Beast catches a whiff of a slightly-higher patch number; now content in its doomed state after having fulfilled its one true purpose of telling someone at Samsung my favourite show is HDMI2.

I have had to back my mother down from that precipice on her own TV so I know it's worth worrying about. The siren call of an entirely empty TV homescreen beckoning us with a struck-out radio tower icon. "We have Disney+ and CraveTV too... press [menu]... pay no attention to the sticky note your son put on the coffee table"

dylan604 an hour ago | parent | next [-]

> I have this fear that someone staying over is going to

This happened to me. After they left, I tried a factory reset, but I don't have confidence there's not some code to remember previously saved wifi connections because my tinfoil hat is firmly in place. However, as you've said I only use the TV as an HDMI receiver. None of the TV's apps are used again. So I'm not sure how much they can detect from just the use of the HDMI port as the only thing being used. The games we play to get the subsidized pricing.

Eisenstein 25 minutes ago | parent [-]

HDMI is heavily used for ACR (automatic content recognition) in smart TVs:

"Our findings indicate that (1) ACR operates even when it is used as a “dumb” display via HDMI"

"For both LG (a) and Samsung (b) TVs, the scenarios with the highest ACR traffic are Linear and HDMI."

* https://dl.acm.org/doi/epdf/10.1145/3646547.3689013

archerx 36 minutes ago | parent | prev [-]

Find the TV’s MAC address and block it on your router. My brother's home network had this system where your MAC address had to be whitelisted on the router to communicate with the network; as the days go by, I see how in hindsight this might be for the best.

onesociety2022 19 minutes ago | parent [-]

I’m paranoid that actually blocking internet access to the TV will result in filling up the TV’s disk with all of this intrusive data they have collected waiting to be uploaded, until it eventually runs out of space and bricks the TV. This could be just bad software, or actually malicious, where they intentionally break something if it loses connectivity for too long and they can see you using it with other connected devices.

We really need normies to care enough about this that manufacturers feel they need to advertise their TVs as privacy-friendly, with no data collection, as a selling point. Until then, they don’t really care. I just wish someone like Apple made a TV with their Apple TV functionality baked in that I could trust.

jon-wood 37 minutes ago | parent | prev | next [-]

Frustratingly, I do want some of the functionality that comes with connecting my TV to the network - specifically the ability to control things like turning it on and choosing which input it's set to via an API it exposes. That's manageable by putting it on a VLAN which isn't allowed access to the outside world, but it's also really annoying to me that I have to do that.

lelandfe 4 hours ago | parent | prev [-]

On my TCL TV, you have to connect it to read the Google policies you are agreeing to. If you don't, you agree to policies unread.

Thankfully, the blast radius of this is nothing without connectivity.

drhike 3 hours ago | parent | next [-]

If it has an Ethernet port I would use that then unplug it. It still gets to phone home once but you don't have to worry about it maliciously saving your Wi-Fi password for later

tamimio an hour ago | parent [-]

You can create a guest wifi with a temporary password. I do that when I need to connect devices that might store the password, like a Kindle or such.

idiotsecant 3 hours ago | parent | prev [-]

But it lets you continue without reading them? There's a lot of questionable terms of service rules but this one has to be unenforceable.

lelandfe 2 hours ago | parent [-]

You must check a checkbox in agreement to continue. To read the policies one agrees to, an internet connection is required. You may check the checkbox without reading.

As far as I have found from a lot of menu spelunking, this agreement is irrevocable. If I ever go online, it will be used.

calcifer 4 hours ago | parent | prev | next [-]

> The SDK’s config ships a flag “use_netifs”: true. That flag triggers code in the SDK binary that constructs its NWConnection with a specific required interface: en0 (WiFi) or pdp_ip0 (cellular), rather than using the system default route.

> On iOS, this bypasses any configured VPN’s tun0 interface entirely. The peer tunnel does not cross a user-configured VPN, even when the rest of the app’s HTTPS traffic does.

What's a legitimate use case for this API? When/why should an app be allowed to bypass a user-configured VPN?

chmod775 3 hours ago | parent | next [-]

> What's a legitimate use case for this API?

When you're the application providing the VPN or when you're any app built to communicate with something on a local-ish network, not something actually reachable globally.

picofarad 3 hours ago | parent | prev [-]

> When/why should an app be allowed to bypass a user-configured VPN?

temporarily if full tunnelling isn't working, one can split tunnel to route around issues due to VPN

But imo an app should never bypass something like a network boundary.

kotaKat an hour ago | parent [-]

Look at how far TikTok can go if you try blocking DNS. The hardcoded IPs, self-DNS-resolution and cat-and-mouse game of blocking is quite... interesting.

vsgherzi an hour ago | parent [-]

Is there anywhere I could read more about this ?

kotaKat an hour ago | parent [-]

https://github.com/M4jx/TikTokBlocklist

I think they may have scaled back from this, but they were running a 100% malware-style playbook to hit the TikTok servers like it was some kinda sketchy C2 package. Lots of attempts to use their own DoH (and DoT!) and normal DNS servers to try to get into the TikTok network.

yodon 3 hours ago | parent | prev | next [-]

Naive question: what would I search for to find a tutorial on how to detect this on my devices, which are mostly iOS, or in my home network?

I'd love to find and remove any apps from my devices that have this SDK active.

tisdadd 3 hours ago | parent [-]

There could be better ones, but this looked reasonable at first glance if you also have a Mac.

https://www.thequantizer.com/tutorials/wireshark-iphone-traf...

It has been a while since I personally did such traces, but Wireshark was very simple to use and once the network is exposed, it has lots of information available online if you need more.
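On the home-network side, one rough approach is to check your resolver's query log for the domains named in this thread. The sketch below assumes you have already parsed that log into `(client, hostname)` pairs, since the log format depends on your resolver. `WATCHLIST`, `matches_watchlist` and `flag_clients` are hypothetical names, and the watchlist only contains domains quoted in the article excerpts here.

```python
# Parent domains taken from hostnames quoted in this thread; treat the list
# as incomplete.
WATCHLIST = {"brdtnet.com", "bright-sdk.com", "luminatinet.com"}


def matches_watchlist(hostname, watchlist=WATCHLIST):
    """True if hostname equals a watched domain or is a subdomain of one."""
    name = hostname.lower().rstrip(".")
    return any(name == d or name.endswith("." + d) for d in watchlist)


def flag_clients(queries, watchlist=WATCHLIST):
    """Map each client to the sorted watched hostnames it looked up.

    `queries` is an iterable of (client, hostname) pairs, for example
    parsed from a DNS resolver's query log.
    """
    hits = {}
    for client, hostname in queries:
        if matches_watchlist(hostname, watchlist):
            hits.setdefault(client, set()).add(hostname.lower().rstrip("."))
    return {client: sorted(names) for client, names in hits.items()}


if __name__ == "__main__":
    sample = [
        ("192.168.1.20", "proxyjs.brdtnet.com"),
        ("192.168.1.20", "example.com"),
    ]
    print(flag_clients(sample))
```

This only sees lookups that go through your own resolver. An app that uses hard-coded IPs or its own DoH would not show up, so a packet capture is still the more thorough check.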

I found bypassing your VPN particularly appalling, as is the whole thing. Personally, it would be amazing if there were a limit on how much can be in Terms of Service, as no one wants to read that much anymore.

skinwill 4 hours ago | parent | prev | next [-]

Not if my firewall blocks it from accessing the outside world. (But allows HomeAssistant to control it)

blakesterz 3 hours ago | parent | prev | next [-]

Are there any defenses I can put in front of my websites that are good for stopping these things? The amount of traffic I see from residential proxies is just killing me. In particular defense against residential proxies.

jappgar an hour ago | parent | next [-]

The bots used by these proxies are detectable in a few ways. Remember the bot itself doesn't run on the proxy...

There is discernible lag from proxy to c&c node. The individual bots don't have access to a lot of compute, and are sometimes restricted wrt feature set (e.g. proprietary video codecs).

There are a few other techniques. It's a cat and mouse game though. And the bot owners are usually more motivated than you are.

bakugo an hour ago | parent | prev [-]

Add a captcha or proof-of-work challenge in front of your website. Those are pretty much your only options.
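For the proof-of-work option, here is a minimal hashcash-style sketch, not any particular product's scheme. The server hands out a random challenge, the client must find a nonce whose SHA-256 hash has a given number of leading zero bits, and the server verifies it with a single hash. The function names and the `challenge:nonce` encoding are assumptions made for illustration.

```python
import hashlib
import secrets


def make_challenge():
    """Server side: issue a random, single-use challenge string."""
    return secrets.token_hex(16)


def is_valid(challenge, nonce, difficulty_bits):
    """Server side: check that the hash has `difficulty_bits` leading zero bits."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    value = int.from_bytes(digest, "big")
    return value >> (256 - difficulty_bits) == 0


def solve(challenge, difficulty_bits):
    """Client side: brute-force a nonce that satisfies the challenge."""
    nonce = 0
    while not is_valid(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, 16)
    print(challenge, nonce, is_valid(challenge, nonce, 16))
```

Each extra difficulty bit roughly doubles the expected client work while verification stays one hash. This raises the cost per request rather than blocking anything outright.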

NewCzech 4 hours ago | parent | prev | next [-]

One of the problems I can see here is the problem that running a Tor exit node has: badly behaved users are going to be using it to hide their location.

Imagine having the police show up at your door because they've figured out that you're trafficking child porn, when the actual culprit is someone that is using your TV as a proxy to trade child porn.

iugtmkbdfil834 3 hours ago | parent [-]

I genuinely dislike how user hostile everything has become. I effectively have to become an expert in near everything and track all news on the off chance something major upends previous assumptions. And if I miss it somehow and complain about it, defenders will come out of the woodwork to defend, deflect or derail the conversation.

If there is any good news about this, it is that the fatigue seems to be hitting normal people. Buddy from work complained to me how he is now forced to be a full blown wifi/internet admin so that his kids' restrictions/limits are appropriately enforced.

I am just venting, because I am not entirely certain what an appropriate solution here is.

amelius 2 hours ago | parent [-]

Solution is more regulation, stronger consumer organizations, and privacy watchdogs with actual teeth.

hackrmn 3 hours ago | parent | prev | next [-]

If the kind of proxying isn't illegal, in my opinion it should be -- saying it's bordering on circumvention of fundamental assumptions about Internet routing and IP address leasing (and ownership), would be a sorry understatement compared to what Bright Data has managed to package into a product payment:

> you are allowing Bright Data to occasionally use your device’s free resources and _IP address to download public web data from the internet_. (emphasis mine)

I think the misleading part -- to the end-user -- is the "download public web data" part. If the data is public why can't Bright Data download it themselves? Well, because the other end doesn't want them to, apparently. The product is to make you help Bright Data circumvent the undesired properties of the "public" data providers, on behalf of someone who happens to have the cash but as of yet is at the short end of the Internet stick (for all the right reasons, I'd say).

This is absolutely deplorable, but knowing the directions this is heading, I am neither surprised nor concerned, frankly. People have long voted with their wallet -- it's not the privacy-conscious Joe the Hacker that is being proxied through here, it's our parents and millions of people who just want entertainment at the end of the working day, including _parents_ of small children.

Day by day the dark Internet theory sounds more plausible, and frankly I am all there for it. The Internet will collapse into a feudal internetwork where any routing will need a hop-by-hop key, so real people (and agents, frankly) can maintain a measure of trust that right now is being actively circumvented.

rdtsc 36 minutes ago | parent | prev | next [-]

> The TLS certificate is CN=*.luminatinet.com — the domain for Luminati Networks, Bright Data’s pre-2018 corporate name

Ah yes. The big privacy scraping company called themselves The Luminati. It’s like they are side-investing in tin foil hats or something.

ddxv 2 hours ago | parent | prev | next [-]

I found some 60 iOS apps that have the SDK mentioned in the article: https://appgoblin.info/sdks/brdsdk.framework (sorry this requires a free login due to heavy scraping, feel free to contact me for list)

I was unable to find related Android SDKs. I tried looking at the various apps on AppGoblin to find the android versions, then looking through their unmapped SDK parts but didn't see anything.

https://github.com/BrightSDK/bright-sdk-gradle-plugin-docs

This looks like it should just be "com.brightdata" but I did not find anything. With 60 iOS apps there must be apps with Android SDK, but I'm not sure why I am not finding any.

If anyone knows, or would like to chat feel free to connect. I'm happy to share data.

metalman 43 minutes ago | parent | prev | next [-]

Having never owned a television because of how much I disliked advertising when TV was the primary delivery method, the feeling of having avoided a life sentence of being lashed to the tube is weird. I know that people might catch me looking all too intently into their eyes, trying to see if they are really in there.

tamimio an hour ago | parent | prev | next [-]

Years ago I had a smart TV, and while I never used anything “smart”, one day I connected it to the network to update it and forgot about it. Two days later I was checking my DNS, and 80% of the traffic and blocked queries in the past two days were from one device. After tracking it down, it was the TV!

So what I have now is a pre-smart TV I found at the thrift, still very good picture that’s more than enough for the few times I use it.

There should be a way to disable the “smart” garbage in new TVs, or an option to buy normal ones at least.

everybodyknows an hour ago | parent | prev | next [-]

FTA:

> MDM, mobile EDR

Anyone care to ELI5 these?

boilerupnc 44 minutes ago | parent [-]

MDM: Mobile Device Management. Software that helps ops folks control a fleet of mobile devices like tablets, phones, etc…

Mobile EDR: Endpoint detection and response. This is cybersecurity software to monitor and deal with network activity happening in mobile devices like tablets, phones, etc…

trumpdong 4 hours ago | parent | prev | next [-]

I find Cloudflare to be more unethical than Bright Data.

xg15 4 hours ago | parent [-]

Both are causing a dynamic that will lock down the internet evermore for everything straying slightly from the corporate-approved line.

If the divide was data center vs residential IPs, fine, but thanks to Bright Data and friends, residential IPs are getting suspicious as well, so I guess the next step is full-on client verification then...

clvx 4 hours ago | parent | next [-]

I wish federal or state laws could force transparency, because asking for privacy is a dead end at this point. Just force products and providers that run in my home to disclose where they phone home to. Then I can decide what to do with that, whether I send them to a black hole or let them pass.

trumpdong 2 hours ago | parent | prev [-]

These are legitimate client devices. Good luck with that.

skywhopper 4 hours ago | parent | prev | next [-]

Not the one in my living room.

ErroneousBosh 3 hours ago | parent | prev [-]

So wait a second then, it connects out using a websocket to its bot C&C server, right?

Which presumably passes it a URL to scrape and waits for it to return the data.

What happens if I write my own tool that connects to that C&C server, waits for a URL to scrape, and returns gigabytes of freshly brewed hot horseshit?

woffoor 3 hours ago | parent [-]

Most scraped websites have https, so you need to perform a MITM attack. Scrapers will probably notice that.

voakbasda 2 hours ago | parent | next [-]

No, you just need to stand up your own website and feed the scraper a URL to it.

ErroneousBosh 5 minutes ago | parent [-]

I would just generate scads of Markov chain output and make it look like a plausible web page.
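A word-level Markov chain is only a few lines. This is a rough sketch of the idea with hypothetical `build_chain` and `generate` helpers. It is trained on whatever text you feed it, and it says nothing about how the C&C protocol itself works.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, length, seed=None):
    """Walk the chain to produce `length` words of plausible-looking filler."""
    if not chain or length <= 0:
        return ""
    rng = random.Random(seed)
    starts = sorted(chain)  # sorted so a fixed seed gives repeatable output
    word = rng.choice(starts)
    output = [word]
    while len(output) < length:
        followers = chain.get(word)
        # Restart from a random word if we hit one with no known follower.
        word = rng.choice(followers) if followers else rng.choice(starts)
        output.append(word)
    return " ".join(output)


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat ran off with the hat"
    print(generate(build_chain(corpus), 30))
```

Wrapping the output in some HTML boilerplate would be the "plausible web page" part; a larger training corpus makes the filler less obviously repetitive.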

ErroneousBosh 28 minutes ago | parent | prev [-]

How would https affect it?

If they're making a request to my machine to go and curl a page, how do they even know whether or not it was https?