Show HN: Visually manipulate and clean large datasets; local and remote (cocoalemana.com)
1 point by ryanmelehan 15 hours ago
Hi there! I'm working on an application that lets data scientists, analysts (and really anyone) visually manipulate and clean datasets of any size, without writing code. It stems from my own experience over the past 6 years of maximal frustration trying to do data science and being forced to do lots of engineering to get anything done. (BTW, I am a software engineer first, and yes, this still annoys me.)

A lot of a data scientist's time (up to 80%) goes to working with data manually: think cleaning, validation, merging, visualization, data movement, dependency installation/configuration, etc. The tools are also super brittle, meaning that everything is ad hoc AND unstable. Uff.

Coco Alemana is my attempt to build an IDE that abstracts the engineering layer away from data science, since 85%+ of data scientists come from the hard sciences rather than software engineering.

You can load massive datasets from formats like Parquet, CSV, and JSON Lines, or remotely via Amazon Athena (with more to come soon). You can then move data around as if Excel and Figma had a forbidden child. Joins can be done just by dragging one column into another frame, etc. We have a bunch of cool stuff like auto-warning identification, union consolidation, easy column value merges and renames, reordering, sorting, filtering, group by, and the list goes on.

You can download the application and load a massive file in under 3 minutes. It's free for everyone here. I'll keep an eye out for emails and will extend trials accordingly.

I'd love to hear what you all think :) Happy to receive any type of feedback, or roast ;)