speedgoose 11 hours ago

Accenture managed to build a data platform for my company with Elasticsearch as the primary database. I raised concerns early in the process, but their software architect told me they had never had any issues. I assume he didn't lie. I was only a user, so I didn't fight it and decided not to make my work depend on theirs.

kubi07 9 hours ago | parent | next [-]

I worked at a company that used Elasticsearch as its main DB. It worked, and the company made a lot of money from that project. It was the wrong decision, but it helped us complete the project very fast. We needed search capability and a DB, and ES did both.

Problems we faced using Elasticsearch: high load and high RAM usage. The DB goes down and more RAM is needed. Luckily we had ES experts on the infra team who helped us a lot (this was an e-commerce company).

To read right after a write, you need to refresh the index or wait for a refresh. More inserts mean more index refreshes, which ES is not designed for, so inserts become slow. You need to find a way to insert in bulk.
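A minimal sketch of the bulk approach, assuming the standard `_bulk` NDJSON format (one action line followed by one source line per document). The index name `products`, the helper `build_bulk_body`, and the convention that every document carries an `id` key are made up for illustration. Sending the body with `refresh=wait_for` is one way to have the call return only once the documents are searchable, instead of forcing a refresh per insert.

```python
import json
from typing import Iterable, Mapping


def build_bulk_body(index: str, docs: Iterable[Mapping]) -> str:
    """Build an NDJSON payload for the _bulk endpoint.

    Each document must carry an "id" key (an assumption of this sketch);
    it becomes the document _id and is left out of the stored source.
    """
    lines = []
    for doc in docs:
        source = {k: v for k, v in doc.items() if k != "id"}
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc["id"])}}))
        lines.append(json.dumps(source))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body("products", [
    {"id": 1, "name": "kettle"},
    {"id": 2, "name": "toaster"},
])
# POST this body to /_bulk?refresh=wait_for with
# Content-Type: application/x-ndjson
```

One request now carries many documents, so the cost of making them searchable is paid once per batch rather than once per insert.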

The API starts and cannot find the ES alias because of a connection issue, so it creates a new alias (our code did that whenever it couldn't find the alias; bad idea). Oops, all the data behind the alias is gone.
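A sketch of a safer startup check, under the assumption that the client can tell "alias not found" apart from "could not reach the cluster". The `client` wrapper, its `alias_exists()` and `create_alias()` methods, and the `ClusterUnreachable` error are all hypothetical names for illustration, not the real Elasticsearch client API.

```python
class ClusterUnreachable(Exception):
    """Hypothetical error: the lookup could not reach the cluster."""


def ensure_alias(client, alias: str, index: str) -> str:
    """Create the alias only when the cluster definitively says it is missing.

    `client` is a hypothetical wrapper with alias_exists() and create_alias();
    alias_exists() is assumed to raise ClusterUnreachable on connection errors.
    """
    try:
        exists = client.alias_exists(alias)
    except ClusterUnreachable as exc:
        # Unknown state: fail startup instead of guessing the alias is gone.
        raise RuntimeError(
            f"cannot verify alias {alias!r}; refusing to create it"
        ) from exc
    if exists:
        return "kept"
    client.create_alias(alias, index)
    return "created"
```

The point is only that a failed lookup and a negative lookup take different paths; a connection problem stops the service rather than triggering a create.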

The most important thing when using ES as the main DB is to use the "keyword" type for every field you don't run text search on.
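As a rough illustration, an explicit mapping along those lines could be generated like this. The field names are hypothetical, and the sketch only covers string fields; numeric and date fields would get their own types.

```python
import json


def build_mapping(all_fields, full_text_fields):
    """Return an index mapping body: `text` for searchable prose, `keyword` otherwise."""
    full_text = set(full_text_fields)
    properties = {
        name: {"type": "text" if name in full_text else "keyword"}
        for name in all_fields
    }
    return {"mappings": {"properties": properties}}


mapping = build_mapping(
    all_fields=["sku", "status", "category", "description"],
    full_text_fields=["description"],
)
print(json.dumps(mapping, indent=2))
```

The resulting dictionary is the kind of body you would pass when creating the index, so that exact-match fields are not analyzed as free text.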

No transactions: if the second insert fails, you need to delete the first insert by hand. It makes the code look ugly.
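The "delete by hand" pattern looks roughly like the sketch below. The `store` object with `insert(doc) -> id` and `delete(id)` is a hypothetical client, and this is best-effort compensation rather than a transaction: a crash between the two steps still leaves the first document behind.

```python
def insert_pair(store, first, second):
    """Insert two documents; undo the first if the second fails.

    `store` is a hypothetical client exposing insert(doc) -> id and delete(id).
    """
    first_id = store.insert(first)
    try:
        second_id = store.insert(second)
    except Exception:
        store.delete(first_id)  # compensating action for the first insert
        raise
    return first_id, second_id
```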

Advantages: you can search, every field is indexed, reads are super fast, development is fast, and it is easy to learn. We never faced data loss, even when the DB crashed.

rectang 9 hours ago | parent | next [-]

Databases and search engines have different engineering priorities, and data integrity is not a top tier priority for search engine developers because a search engine is assumed not to be the primary data store. Search engines are designed to build an index which augments a data store and which can be regenerated when needed.

Anyone in engineering who recommends using a search engine as a primary data store is taking on risk of data loss for their organization that most non-engineering people do not understand.

In one org I worked for, we put the search engine in front of the database for retrieval, but we also made sure that the data was going to Postgres.

9rx 8 hours ago | parent [-]

> Anyone in engineering who recommends using a search engine as a primary data store is taking on risk of data loss for their organization.

It is true that Elasticsearch was not designed for it, but there is no reason why another "search engine" designed for that purpose couldn't fit that role.

ananthakumaran 3 hours ago | parent | prev | next [-]

ES should be thought of as a JSON key-value store and a search engine. The JSON key-value store is fully consistent and supports read-after-write semantics; a refresh is needed for the search API. In some cases it does make sense to treat it as a database, provided key-value store semantics are enough.
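A toy in-memory model of the behaviour described above, purely for illustration: `ToyIndex` is not how Elasticsearch is implemented, it only mimics the split between get-by-id (sees writes immediately) and search (sees only what was present at the last refresh).

```python
class ToyIndex:
    """Toy model: get-by-id is read-after-write, search lags until refresh."""

    def __init__(self):
        self._docs = {}        # latest version of every document
        self._searchable = {}  # snapshot taken at the last refresh

    def index(self, doc_id, doc):
        self._docs[doc_id] = doc

    def get(self, doc_id):
        return self._docs.get(doc_id)

    def refresh(self):
        self._searchable = dict(self._docs)

    def search(self, field, value):
        return [d for d in self._searchable.values() if d.get(field) == value]


idx = ToyIndex()
idx.index("1", {"status": "new"})
by_id = idx.get("1")                           # visible right away
before_refresh = idx.search("status", "new")   # not searchable yet
idx.refresh()
after_refresh = idx.search("status", "new")    # searchable now
```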

I used it about 7 years ago. Text search was not that heavily used, but we utilized the keyword filter heavily. It's like having a database where you can throw any query at it and it would return a response in reasonable time, because you are just creating an index on all fields.

thisisananth 9 hours ago | parent | prev | next [-]

Agree with the comment. We use ES quite extensively as a database with huge documents and, touch wood, we haven't had any data loss. We take hourly backups, and restoring is simple. You have to get used to eventual consistency: if you want to read after writing, even by ID, you have to wait for indexing to complete (around 1 second).

You have to design the documents so that you never need to join the data with anything else, so make sure all the data you need is inside the document. In an SQL DB you would normalize the data and then join; here, assume you have only one table and put all the data inside the doc. But as we evolved and added more and more fields, the documents have grown a lot (to megabytes) and are hitting limits such as the maximum number of searchable fields (1000; it can be increased, but that is not recommended) and the search buffer limit (100 MB).
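A small sketch of the "one table, everything inside the doc" idea, with a rough way to watch field growth. The order, customer, and product shapes are invented for illustration. The path count is only a proxy for the 1000-field default mentioned above (the `index.mapping.total_fields.limit` setting), since the real count also includes object mappings and multi-fields.

```python
def denormalize_order(order, customers, products):
    """Embed the customer and product data an order needs into one document."""
    return {
        "order_id": order["id"],
        "customer": dict(customers[order["customer_id"]]),
        "items": [
            {**products[item["product_id"]], "quantity": item["quantity"]}
            for item in order["items"]
        ],
    }


def leaf_field_paths(doc, prefix=""):
    """Collect distinct dotted paths of leaf values in a nested document."""
    paths = set()
    if isinstance(doc, dict):
        for key, value in doc.items():
            paths |= leaf_field_paths(value, f"{prefix}{key}.")
    elif isinstance(doc, list):
        for value in doc:
            paths |= leaf_field_paths(value, prefix)
    else:
        paths.add(prefix[:-1])  # drop the trailing dot
    return paths


customers = {"c1": {"name": "Ada", "city": "Oslo"}}
products = {"p1": {"sku": "K-100", "title": "kettle"}}
order = {"id": 7, "customer_id": "c1", "items": [{"product_id": "p1", "quantity": 2}]}

doc = denormalize_order(order, customers, products)
paths = leaf_field_paths(doc)
```

Tracking the size of `paths` over time gives an early warning before a growing document shape runs into the field limit.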

My take is that ES is good for exploration and faster development, but if you're using it as the main DB you should switch to SQL as soon as the product is successful.

simianwords 8 hours ago | parent [-]

good ideas but sorry i simply don't understand why i would ever do a join at read time. one of the worst ideas!

aPoCoMiLogin 5 hours ago | parent | prev [-]

Most of these are more a lack of experience than the DB's fault. Most systems have their quirks, so you have to get used to them.

Andys 9 hours ago | parent | prev | next [-]

This is made possible because Elastic gained a write-ahead log that actually syncs to disk after each write, like Postgres.

victor106 10 hours ago | parent | prev | next [-]

> Accenture

They messed up a $30 million project big time at a previous company. My CTO swore never to recommend them.

bigfatkitten 8 hours ago | parent | next [-]

How are they still in business?

I’ve either been involved with or adjacent to dozens of Accenture projects at 5 companies over the last 20 years, and not a single one had a satisfactory outcome.

I’ve never heard a single story of “Accenture came in, and we got what we wanted, on time and on budget.” Cases of “we got a minimum viable solution for $100m instead of $30m, and it was four years late” seem more typical.

stackskipton 2 hours ago | parent | next [-]

Just like IBM, they are big enough that no one ever got fired for buying them.

I've also found they do a good job of cultivating a cadre of executives who float between companies and hire them at each new one, while getting wined and dined.

almosthere 6 hours ago | parent | prev [-]

It's just that they're only seeing money to build and a place to make excuses on being late.

If you hire your own people you can make them feel how well the business is doing and get features out the door tomorrow and build to the larger thing over time.

9rx 8 hours ago | parent | prev [-]

I've seen some mess-ups in my life, but they started sticking out like a sore thumb long, long, long, long before anywhere close to $30 million was spent on it.

What does a $30 million mess-up look like?

rawgabbit 5 hours ago | parent | next [-]

Teams of consultants on site, some remote, and many offshore. Tons of documents are created, and many environments and DevOps pipelines are stood up. The first code release is when the people who push buttons touch the system for the first time. It is crap. Several more code releases attempt to make the system usable. Eventually another consultant or two are brought in to evaluate the project, and they say it violated every best practice and common-sense rule. Most egregiously, the internal stakeholders who voiced serious concerns at the beginning of the project were dismissed or forced out.

9rx 3 hours ago | parent [-]

So much the same as what I've seen before, except instead of abandoning ship when the mess was clear and present, doubling down to see how far of a hole one can dig? Sunk cost must be one hell of a drug for the aforementioned CTO.

nwallin 6 hours ago | parent | prev [-]

I am not OP and am not speaking for them.

"A $30 million mess-up" can look like (at least) two things. It can be $30 million was spent on a project that earned $0 revenue and was ultimately canceled, or it can look like $x was spent on a project to win a $30 million contract but a competitor won the contract instead.

CuriouslyC 10 hours ago | parent | prev [-]

Elastic feels about as much like a primary data store as Mongo, FWIW.