Show HN: Visually manipulate and clean large datasets; local and remote (cocoalemana.com)
1 points by ryanmelehan 15 hours ago

Hi there! I'm working on an application which allows data scientists, analysts (and really whoever) to visually manipulate & clean datasets of any size, without writing code.

This stems from my own experience over the past 6 years: maximal frustration trying to do data science and being forced to do lots of engineering to get anything done. (BTW, I am a software engineer first, and yes, this still annoys me.)

A lot (up to 80%) of a data scientist's time goes to working with data manually – think cleaning, validation, merging, visualization, data movement, dependency installation and configuration, etc.

The tools are also super brittle, meaning that everything is ad hoc AND unstable. Uff.

Coco Alemana is my attempt to build an IDE that abstracts away the engineering layer from data science, since 85%+ of data scientists come from the hard sciences rather than software engineering.

You can load massive datasets from local formats like Parquet, CSV, and JSON Lines, or remotely via Amazon Athena (with more sources to come soon). You can then move data around as if Excel & Figma had a forbidden child. Joins can be done just by dragging one column into another frame, etc.

We have a bunch of cool stuff like automatic warning identification, union consolidation, easy column value mergers and renames, re-ordering, sorting, filtering, group by, and the list goes on.

You can download the application and load a massive file in less than 3 minutes. It's free for everyone here. I'll keep an eye out for emails and will extend trials accordingly.

I'd love to hear what you all think :) Happy to receive any type of feedback, or roast ;)