Expand.ai (YC S24) Is Hiring a Founding Engineer to Turn the Web into an API
6 days ago
Imagine if the internet were a database. What would you build?

---

We’re building out the early engineering team of expand.ai, where we’re (a) solving a problem the world urgently needs solved, which is (b) one of the most challenging problems in technology, (c) alongside a team of exceptional engineers, (d) offering you the chance to have a massive impact, (e) while having fun (f) and being rewarded accordingly.

(a) Problem

While LLMs are democratizing intelligence, access to data remains a huge bottleneck. We’re changing that by building state-of-the-art web extraction agents that structure millions of websites, one at a time.

Scraping is not new. Folks have written XPaths, manual Beautiful Soup scripts, etc. However, two things have changed:

1. Scraping is needed even more than before. Almost any AI application needs access to some data from the internet.
2. We’re in the post-LLM era. While LLMs are too expensive to run over the internet, they enable previously impossible capabilities.

That’s why we’re on a mission to build a reliable data layer for the web.

When we soft launched in September, we got an insane amount of interest, with over 170 demos booked in 24 hours. This confirms how much of a problem this is for folks.

Luckily, we’re not alone on this mission and have backing from the best: Y Combinator, Guillermo Rauch (CEO, Vercel), Swyx (Founder, smol ai), Sarah Guo (Conviction Embed), Alana Goyal (basecase capital), Max Claussen (system.one), Ellen Chisa, Charly Poly, … and many more!

(b) Tech challenges

We have collectively coded for over 20 years and have never faced a technical challenge this difficult. We’re building a highly dynamic system that needs to deal with the nondeterministic nature of the internet, at scale. Some of the challenges we’re tackling:

1. Building a fair system across multiple tenants (noisy neighbors)
2. Quickly scaling web agent and AI infra up and down depending on demand
3. Coordinating thousands of web agents that concurrently try to extract data
4. Building high-quality data pipelines to make sure our clients get only correct data. We care a lot about reliability and correctness.
5. Scaling our system to millions of websites
6. Enabling our agents to perform actions on websites, since not all of them directly show the information we need
7. Building the tooling we need ourselves. As this is such new territory, a lot of it doesn’t exist yet.

What is our answer to these? How will we win?

In short: good old software engineering.

Longer answer:

- We’re making heavy use of [Effect](https://effect.website/), which gives us as much control as is possible in such a nondeterministic environment. It will also enable us to represent complex agentic workflows while keeping a deep level of observability.
- We’re making use of the latest advances in AI, including training our own models and applying the latest insights from research.
- It took us a while to arrive at the current system design, and we certainly don’t have all the answers yet, but we have iterated on our main architecture multiple times, and the current design allows us to represent very complex workflows.

(c) Team

We’re a nimble team of 2 right now.

1. [Tim Suchanek](https://github.com/timsuchanek) (Founder)
   1. Tim was the first engineer at Prisma, where he led the development of several tools, including the GraphQL Playground and Prisma ORM, which are now used by millions of developers.
   2. Tim then founded Stellate, a GraphQL API management company that handled billions of monthly requests from customers like NASA, Puma, and Priceline. Stellate was acquired by Shopify in 2024.
   3. Tim does Brazilian Jiu-Jitsu (BJJ) as a hobby and holds a blue belt.
2. [Tylor Steinberger](https://github.com/TylorS) (Founding Engineer)
   1. Tylor joined the Cycle.js project early on, building out major parts of the system.
   2. After a career in startup tech, Tylor most recently led frontend development at Seasoned, where he implemented one of the [first functional full-stack frameworks](https://github.com/TylorS/typed), which is still used today.
   3. As a hobby, Tylor implements his own programming language and is always thinking about how to push the boundaries of programming.
3. You
   1. What kind of engineering profile are we looking for? You’re a good fit if you
      1. Are backend-focused and have worked on a difficult project before
      2. Have a machine-learning background and have built data pipelines before
      3. Have worked on scalable infrastructure
      4. Have a distributed systems background
      5. If none of the above matches, but you think you’re excellent, apply nevertheless!

(d) Impact

While there are established use cases for web scraping, such as e-commerce cataloging, extracting accurate data from websites is becoming a crucial building block for AI agents and for supporting better decision-making. Transforming the internet into a queryable database will give rise to a whole new economy, enabling more informed choices across various sectors. Our mission is to power that economy and facilitate improved decision-making processes. The demand for our system is undeniably strong, with thousands of people reaching out to gain access.

(e) Fun

As you can probably already tell, we try not to take ourselves too seriously. We’re working very hard at [expand.ai](http://expand.ai), and it’s just super important that we all enjoy our work and each other’s company. Yes, part of that includes edgy (but respectful) jokes.

One thing that makes it much more fun is being together in person! We have a nice, cozy office in Dogpatch, San Francisco. Eating together, doing whiteboard sessions together, hacking together: it’s just so much more fun in person! That’s why:

**This is an in-person position only. Please don’t apply if you can’t move to SF.**

(f) Compensation

We’re putting a lot of love and energy into this project, so while having a massive impact, we also want you and us to be able to capture some of the value we create. We have a generous equity package, especially for the first few folks!

If you want to grow and push your limits, and are excited about solving hard technical problems, please reach out to foundingengineer@expand.ai.