▲ | retinaros 7 days ago
I've always struggled to understand how you get a company to adopt a platform like Databricks to "manage data". Isn't managing data a minefield, with plenty of open source pieces of software that serve different purposes? Who is the typical Databricks customer?
▲ | benrutter 7 days ago
I think that's the main offering of Databricks: you get a "data platform in a box", and navigating the forest of piecemeal solutions is replaced with telling your data science and analytics teams to "use Databricks". It's easy to look on, knowing lots about data tools, and say "this could be done better with open source tools for a fraction of the cost", but if you're not a big tech company, hiring a team to manage your data platform for 5 analysts is probably a lot more expensive than just buying Databricks.
▲ | dahcryn 6 days ago
You kill off all the open source pieces; in turn compliance is happy, and the CTO is happy because he has a maintenance contract and can blame other people if stuff goes wrong. It's a way to get those pesky Python people to shut up. Oh, and a CTO is always valued more if he manages a 5 million Databricks budget, where he can prove his worth by showing a 5% discount he negotiated very well, than a 1 million whatever-else budget that would be best in class. Everybody wins.
▲ | semi-extrinsic 6 days ago
> who is the typical databricks customer?
The CTO of a "traditional" company who is responsible for "implementing digital transition".
▲ | kwillets 6 days ago
My company is doing the dbx thing, and the best I can tell my manager is that I'm neutral on it. My working theory is that the UI, a low-grade web-based SQL editor and catalog browser, is more integrated than the hodgepodge of tools that we were using before, and people may gain something from that. I've seen similar with in-house tools that collect ad-hoc/reporting/ETL into one app, and one should never underestimate the value that people give to the UI. But we give up price-performance; the only way it can work is if we shrink the workload. So it's a cleanup of stale pipelines combined with a migration. Chaos, in other words.
▲ | naijaboiler 6 days ago
We have Databricks at my company (50m ARR, 150 employees, still growing at 15% YoY), with 0 full-time data engineers (1 data scientist + 1 DB admin manage everything on there as part-time jobs). We are able to have data from like 100 transactional database tables, Zendesk, all our logs of every API call, every single event from every user in our mobile and web applications, banking data, calendar data, Google Play Store data, Apple store data, all in 1 place. We are a 2-sided marketplace; we can easily get 360-degree data on our B2B customers and B2C customers, and measure employee productivity across all departments. My team of 3 data scientists is able to support a culture of experimentation and data-informed decision making across the entire org. And we do all that on a 30k annual spend on Databricks. That's less than 1/5 the cost of 1 software engineer. Excellent value for money if you ask me. I really struggle to imagine being able to do that any cheaper. How else can we engineer a hub for all of our data, manage appropriate access, run complex calculations in seconds, and join data from so many disparate sources, at a total cost (tool + labor) of <80k/yr? I double dare you to suggest or find me a cheaper option for our use case.
▲ | poisonwomb 6 days ago
I think the governance stuff might push it over the top for a lot of organisations; it's pretty well integrated with IAM providers, not only for structured/modelled data but also for workspaces for the data sciencey stuff. Pretty much everything has permissions associated with it. When you have a big data engineering/science push off the back of the AI hype, I think it appeals to the cheque writers to have something centralised and controlled. Aside from that, I do get the feeling that most small and medium sized companies have been oversold on it: they don't really have enough data to leverage a lot of the features, and a lot of the time they don't really have the skill to avoid shooting themselves in the foot. It's possible for a reporting analyst who is upskilling to learn enough programming to not create a tangled web of christmas lights, but not probable in most situations. There seems to be a whole cottage industry of consultancies now that purport to get you up and running, with limited actual success. At least it's an incentive for companies to get their data in order and standardise on one place and a set of processes. In terms of actual development, the notebook IDE feels like a big old turd to use tho, and it feels slow in general if you're at all used to local dev. People do kinda like these web based tools tho. Can't trust people all the time! There are VS Code and PyCharm extensions, but my team works mainly with notebooks at the moment, for good or ill, and the experience there is absolute flaky dogshit. I think it's possible to make some good stuff with it, and it's paying my bills at the moment, but I think a lot of the adoption may be doomed to failure lol