John23832 2 days ago

You don't actually need any of those things until you no longer have a "project", but a business which will allow you to pay for the things you require.

You'd be amazed by how far you can get with a home linux box and cloudflare tunnels.

koito17 2 days ago | parent | next [-]

On this site, I've seen these kinds of takes repeatedly over the past few years, so I went ahead and built a little forum that consists of a single Rust binary and SQLite. The binary runs on a Mac Mini in my bedroom behind Cloudflare tunnels. I get continuous backups with Litestream, and testing backups is as trivial as running `litestream restore` on my development machine and then running the binary.

Despite some pages issuing up to 8 database queries, I haven't seen responses take more than about 4-5 ms to generate. Since I have 16 GB of RAM to spare, I just let SQLite mmap the whole database and store temp tables in RAM. I could further optimize the backend by e.g. replacing Tera with Askama and tuning the SQL queries, but the easiest win for latency would be to run the binary on a VPS close to my users. However, the current setup works so well that I see no point in changing what little "infrastructure" I've built. The other cool thing is that the backend + Litestream use at most ~64 MB of RAM. Plenty of compute and RAM to spare.
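For reference, the mmap and temp-table settings described above boil down to a couple of PRAGMAs. A minimal sketch using Python's stdlib `sqlite3` (the setup above is a Rust binary; the 16 GiB figure is illustrative, and SQLite silently caps `mmap_size` at its compile-time limit):

```python
import sqlite3

def open_db(path: str) -> sqlite3.Connection:
    """Open a SQLite database with mmap reads and in-RAM temp tables."""
    conn = sqlite3.connect(path)
    # Serve reads via memory-mapped I/O instead of read() syscalls;
    # pages come straight from the OS page cache.
    conn.execute("PRAGMA mmap_size = 17179869184")  # request 16 GiB
    # Keep temporary tables and indices in RAM.
    conn.execute("PRAGMA temp_store = MEMORY")
    # WAL journaling is also what Litestream tails for replication.
    conn.execute("PRAGMA journal_mode = WAL")
    return conn
```

WAL mode matters here beyond performance: Litestream works by continuously shipping the WAL segments off-machine.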

It's also neat being able to allocate a few cores on the same machine to run self-hosted GitHub actions, so you can have the same machine doing CI checks, rebuilding the binary, and restarting the service. Turns out the base model M4 is really fast at compiling code compared to just about every single cloud computer I've ever used at previous jobs.

a day ago | parent | next [-]
[deleted]
busterarm 2 days ago | parent | prev [-]

Just one of the couple dozen databases we run for our product in the dev environment alone is over 12 TB.

How could I not use the cloud?

maccard a day ago | parent | next [-]

12TB is $960/month in gp3 storage alone. You can buy 12TB of NVMe storage for less than $960, and it will be orders of magnitude faster than AWS.

Your use case is the _worst_ use case for the cloud.
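That $960 figure is straight rate arithmetic; a sketch, assuming AWS's commonly cited us-east-1 gp3 list price of $0.08/GB-month (before any provisioned IOPS or throughput add-ons):

```python
# gp3 storage is billed per GB-month. 12 TB at the assumed
# $0.08/GB-month list price, storage cost only:
tb = 12
gp3_per_gb_month = 0.08  # us-east-1 list price, assumption
monthly_storage = tb * 1000 * gp3_per_gb_month
print(monthly_storage)  # 960.0
```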

pnutjam a day ago | parent [-]

The most consistent misunderstanding I see about the cloud is disk I/O. Nobody understands how slow a standard cloud disk is under load. They see good performance and assume that will always be the case. They don't realize that most cloud disks use a form of credit tracking, where you build up I/O credits over time; if you have bursts or sustained high I/O load, you will very quickly find that your disk speeds are garbage.

For some reason people more easily understand the limits of CPU and memory, but overlook disk constantly.
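The credit mechanism described above is essentially a token bucket. A toy simulation (the numbers are made up, not any provider's actual limits) shows the performance cliff:

```python
def burst_bucket(baseline_iops, max_credits, demand_iops, seconds):
    """Toy model of burst-credit IOPS accounting: credits accrue at the
    baseline rate, each I/O spends one, and once the bucket is empty
    you are throttled down to the baseline."""
    credits = max_credits
    served = []
    for _ in range(seconds):
        credits = min(max_credits, credits + baseline_iops)
        grant = min(demand_iops, credits)
        credits -= grant
        served.append(grant)
    return served

# Sustained demand of 500 IOPS against a 100 IOPS baseline:
# full speed while credits last, then the floor drops out.
print(burst_bucket(100, 1000, 500, 6))
# → [500, 500, 200, 100, 100, 100]
```

A quick benchmark run while the bucket is full sees the 500; production under sustained load sees the 100.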

maccard a day ago | parent | next [-]

Even without that, you are still, at heart, accessing storage over a SAN-like interface with some sort of local cache. Get an actual local drive on AWS and the performance difference is night and day.

pnutjam a day ago | parent [-]

Sure, you can work around it; but it blows up the savings a lot of people expect when they don't include this in their math.

Also, a SAN is often faster than local disk if you have a local SAN.

chrisandchris an hour ago | parent [-]

How is a SAN faster than a local disk? Any references / recommendations?

anktor a day ago | parent | prev | next [-]

What could I read to inform myself better on this topic? It is true I had not seen this angle before

pnutjam a day ago | parent [-]

This looks pretty informative. The terminology can be hard to follow. https://medium.com/@bounouh.fedi/understanding-iops-in-aws-w...

I/O is hard to benchmark so it's often ignored since you can just scale up your disks. It's a common gotcha in the cloud. It's not a show stopper, but it blows up the savings you might be expecting.

immibis a day ago | parent | prev [-]

At one time I had a project to run a cryptocurrency node for BSC (basically a fork of Ethereum with all the performance settings cranked up to 11, and blocks centrally issued instead of being mined). It's very sensitive to random-access disk throughput and latency. At the time I had a few tiny VPSes on AWS and a spinning drive at home, so I evaluated running it there. Even setting aside the price, you simply cannot run it on AWS EBS, because the disk is too slow to validate each block before the next one arrives. I spent a few hundred dollars on an NVMe SSD for my home computer instead.

sgarland a day ago | parent | prev | next [-]

First of all, if you have a dev DB that’s 12 TB, I can practically guarantee that it is tremendously unoptimized.

But also, that’s extremely easily handled with physical servers - there are NVMe drives that are 10x as large.

fx1994 a day ago | parent | next [-]

That's what I always tell my devs: why is our DB 1 TB when only 20 users are working in our app? They collect all kinds of garbage and save it to the DB. Poor development skills, I would say. Our old app did the same thing, and after 15 years it was barely 100 GB with tens of users. Devs today are SELECT *. If it doesn't work, they say we need more resources. That's why I hate cloud.

xpe 14 hours ago | parent [-]

Nothing like piling transactions, analytics, and logs to the same database. /s

John23832 12 hours ago | parent | prev [-]

Eh, please find me a 120 TB NVMe.

otherjason 11 hours ago | parent [-]

https://www.solidigm.com/products/data-center/d5/p5336.html

mootothemax a day ago | parent | prev | next [-]

> Just one of the couple dozen databases we run for our product in the dev environment alone is over 12 TB.

> How could I not use the cloud?

Funnily enough, one of my side projects has its (processed) primary source of truth at that exact size. Updates itself automatically every night adding a further ~18-25 million rows. Big but not _big_ data, right?

Anyway, that's running happily with instant access times (yay solid DB background) on a dedicated OVH server that's somewhere around £600/mo (+VAT) and shared with a few other projects. OVH's virtual rack tech is pretty amazing too; replicating that kind of size over the internal network is trivial.

wheybags 2 days ago | parent | prev | next [-]

https://www.seagate.com/products/enterprise-drives/exos/exos...

selcuka 2 days ago | parent [-]

> one of the couple dozen databases

I guess this is one of those use cases that justify the cloud. It's hard to host that reliably at home.

c0balt a day ago | parent [-]

Not to push the point too hard, but a "dev environment" for a product implies a business (not an individual consumer). Having a server (rack) in an office is not that hard, but the cloud might still be better here for ease of administration.

mcny a day ago | parent | next [-]

My understanding is that AWS exists because we can't get any purchase approved in under three months.

darkwater a day ago | parent [-]

I don't think so. An organization so big and bureaucratic that it needs 3 months to authorize a server purchase will surely need a few weeks of paperwork to authorize a new AWS account, will track spending per OU, and will cut budget and usage if they think you deserve it.

wongarsu a day ago | parent | prev [-]

And plenty of datacenters will be happy to give you some space in one of their racks.

Not wanting to deal with backups or HA are decent reasons to put a database in the cloud (as long as you are aware how much you are overpaying). Not having a good place to put the server is not a good reason

immibis a day ago | parent [-]

If anyone's curious about the ballpark cost: a carrier-owned (?) DC near me that publishes prices (most don't) advertises a full rack for €650 per month, including internet (20 TB/month at 1 Gbps) and 1 kW of power.

Though both of those are probably less than you'd need if you actually filled a full rack, which I assume is part of the reason pricing is almost always "contact us". I didn't bother getting a quote just for the purpose of this comment. But another thing people need to be less afraid of, when they're looking to actually spend a few digits of money and not just comment about it, is asking for quotes.

koito17 2 days ago | parent | prev | next [-]

12 TB fits entirely into the RAM of a 2U server (cf. Dell PowerEdge R840).

However, I think there's an implicit point in TFA; namely, that your personal and side projects are not scaling to a 12 TB database.

With that said, I do manage approximately 14 TB of storage in a RAIDZ2 at my home, for "Linux ISOs". The I/O performance is "good enough" for streaming video and BitTorrent seeding.

However, I am not sure what your latency requirements and access patterns are. If you are mostly reading from the 12 TB database and don't have strict latency requirements on writes, I don't see why the cloud is a hard requirement. On the contrary, most cloud providers offer remarkably low IOPS in their block storage products. Here is an example of Oracle Cloud's block storage for 12 TB:

  Max Throughput: 480 MB/s
  Max IOPS: 25,000
https://docs.oracle.com/en-us/iaas/Content/Block/Concepts/bl...

Those are the kind of numbers I would expect of a budget SATA SSD, not "NVMe-based storage infrastructure". Additionally, the cost for 12 TB in this storage class is ~$500/mo. That's roughly the cost of two 14 TB hard drives in a mirror vdev on ZFS (not that this is a good idea btw).

This leads me to guess most people will prefer a managed database offering rather than deploying their own database on top of a cloud provider's block storage. But 12 TB of data in the gp3 storage class of RDS costs about $1,400/mo. That is already triple the cost of the NAS in my bedroom.

Lastly, backing up 12 TB to Backblaze B2 is about $180/mo. Given that this database is for your dev environment, I am assuming that backup requirements are simple (i.e. 1 off-site backup).

The key point, however, is that most people's side projects are unlikely to scale to a 12 TB dev environment database.

Once you're at that scale, sure, consider the cloud. But even at the largest company I worked at, a 14 TB hard drive was enough storage (and IOPS) for on-prem installs of the product. The product was an NLP-based application that automated due diligence for M&As. The storage costs were mostly full-text search indices on collections of tens of thousands of legal documents, each document could span hundreds to thousands of pages. The backups were as simple as having a second 14 TB hard drive around and periodically checking the data isn't corrupt.

busterarm 2 days ago | parent [-]

Still missing the point. This is just one server, and just in the dev environment.

How many pets do you want to be tending to? I have 10^5 servers I'm responsible for...

The quantity and methods the cloud affords me allow me to operate the same infrastructure with 1/10th as much labor.

At the extreme ends of scale this isn't a benefit, but for large companies in the middle this is the only move that makes any sense.

99% of the posts I read about how easy and cheap it is to be in the datacenter involve a single-digit number of racks worth of stuff. Often far less.

We operate physical datacenters as well. We spend multiple millions in the cloud per month. We just moved another full datacenter into the cloud, and the difference in cost between the two is less than $50k/year. Running in physical DCs is really inefficient for us, for a lot of annoying and insurmountable reasons. And we no longer have to deal with procurement and vendor management. My engineers can focus their energy on more valuable things.

rowanG077 a day ago | parent | next [-]

What is this ridiculous bait and switch? First you talk about a 12 TB dev database and "How could I not use the cloud?". And when you rightfully get challenged on that, suddenly it's about the number of servers you have to manage and not having the energy to do that with your team. Those two have nothing to do with each other.

2 days ago | parent | prev | next [-]
[deleted]
CyberDildonics a day ago | parent | prev | next [-]

Why do people think it takes "labor" to have a server up and running?

Multiple millions in the cloud per month?

You could build a room full of giant servers and pay multiple people for a year just on your monthly server bill.

jamesnorden a day ago | parent | prev [-]

Found the "AWS certified cloud engineer".

Aeolun a day ago | parent | prev | next [-]

Buy a pretty basic HDD? These days 12 TB isn’t all that much?

dublinben a day ago | parent | prev | next [-]

12 TB is easy. https://yourdatafitsinram.net/

n3t 2 days ago | parent | prev | next [-]

What's your cloud bill?

dragonelite a day ago | parent | prev | next [-]

Sounds more like your use case is like the 1~2% of the cases a simple server and sqlite is maybe not the correct answer.

cultofmetatron a day ago | parent | prev | next [-]

What are you doing that you have 12 TB in dev??? My startup isn't even using a TB in production, and we handle multiple millions of dollars in transactions every month.

esseph a day ago | parent | prev | next [-]

A high end laptop now can come with double that amount of storage.

cess11 a day ago | parent | prev | next [-]

My friends serve hundreds of TBs onto the Internet for hobby and pleasure reasons.

It's not all HA, NVMe, web-scale stuff, but a few hundred TBs is not a huge undertaking even for individual nerds with a bit of money to spend, or connections at corporations that routinely decommission hardware and are happy not to have to spend resources getting rid of it.

This summer I bought a used server for 200 euros from an acquaintance, I plan on shoving 140 TB in it and expect some of my future databases to exceed 10 TB in size.

a day ago | parent | prev [-]
[deleted]
bambax a day ago | parent | prev | next [-]

Exactly! I've been self hosting for about two years now, on a NAS with Cloudflare in front of it. I need the NAS anyway, and Cloudflare is free, so the marginal cost is zero. (And even if the CDN weren't free it probably wouldn't cost much.)

I had two projects reach the front page of HN last year, everything worked like a charm.

It's unlikely I'll ever go back to professional hosting, "cloud" or not.

John23832 a day ago | parent [-]

If you have explosive growth, sure cloud.

The vast majority of us that are actually technically capable are better served self hosting.

Especially with tools like cloudflare tunnels and Tailscale.

fragmede 2 days ago | parent | prev [-]

You can get quite far without that box, even, and just use Cloudflare R2 as free static hosting.

selcuka 2 days ago | parent [-]

CloudFlare Pages is even easier for static hosting with automatic GitHub pulls.

jen729w 2 days ago | parent [-]

Happy Netlify customer here, same deal. $0.

(LOL 'customer'. But the point is, when the day comes, I'll be happy to give them money.)

0cf8612b2e1e a day ago | parent | next [-]

Careful what you wish for. Netlify sent a guy a $104k bill on the free plan. Thankfully a social media outcry saved him.

https://news.ycombinator.com/item?id=39520776

ascorbic a day ago | parent [-]

Netlify changed their pricing after that so that free accounts are always free.

franciscop a day ago | parent [-]

Could you give a reference please? I was literally going to recommend Netlify at work, but didn't after I saw that story.

selcuka 15 hours ago | parent | next [-]

https://www.netlify.com/pricing/#faq

> The free plan is always free, with hard monthly limits that cannot be exceeded or incur any costs.

immibis 15 hours ago | parent | prev [-]

Is $0.55/GB not enough reason to avoid them? I guess not if your business is making more than that (bandwidth expense for a shopping site shouldn't be a problem when customers spend $100 for every 0.1 GB), but that price should realistically be closer to $0.01/GB or even $0.002/GB. Sounds like they're passing AWS's extremely excessive bandwidth pricing straight through to you.

selcuka 15 hours ago | parent [-]

> Is $0.55/GB not enough reason to avoid them?

Where did you read that? The pricing page says 10 credits per GB, and extra credits can be purchased at $10 per 1500 credit. So it's more like $0.067/GB.
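Spelling out that arithmetic from the pricing page (10 credits per GB of bandwidth, $10 per pack of 1,500 extra credits):

```python
# Netlify-style credit pricing, per the figures quoted above.
credits_per_gb = 10
dollars_per_pack = 10
credits_per_pack = 1500
cost_per_gb = credits_per_gb * (dollars_per_pack / credits_per_pack)
print(round(cost_per_gb, 3))  # 0.067
```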

fvdessen a day ago | parent | prev [-]

FYI I just migrated from Netlify to Cloudflare pages and Cloudflare is massively faster across all metrics.