| ▲ | Launch HN: Human Layer (YC F24) – Human-in-the-Loop API for AI Systems |
| 251 points by dhorthy 9 hours ago | 150 comments |
Hey HN! I'm Dex, building HumanLayer (https://humanlayer.dev), an API that lets AI agents contact humans for feedback, input, and approvals. We enable safe deployment of autonomous/headless AI systems in production. You can try it with our Python or TypeScript SDKs and start using it immediately with a free trial. We have a free tier and transparent usage-based pricing. Here’s a demo: https://youtu.be/5sbN8rh_S5Q?t=51

What's really exciting is that we're enabling teams to deploy AI systems that would otherwise be too risky. We let you focus on building powerful agents while knowing that critical steps will always get a human-in-the-loop. It's been dope seeing people start to think bigger when they consider dynamic human oversight as a key ingredient in production AI systems.

This started when we were building AI agents for data teams. We wanted to automate tedious tasks like dropping unused tables, but customers were (rightfully!) opposed to giving AI agents direct access to production systems. Getting AI to "production grade" reliability is a function of how risky the task the AI is performing is. We didn't have the 3+ months we would have had to sink into evals, fine-tuning, and prompt engineering to get the agent to 99.9+% reliability. And even then, getting decision makers comfortable with flipping the switch was a challenge.

So instead we built some basic approval flows, like "ask in Slack before dropping tables". But this communication itself needed guardrails: what if the agent contacted the wrong person? How would the head of data look if a tool he bought sent a nagging Slack message to the CEO? Our buyers wanted the agent to ask stakeholders for approval, but first they wanted to approve the "ask for approval" action itself. And then I started thinking about it... as a product builder + owner, I wanted to approve the "ask for approval to ask for approval" action! 
I hacked together a human-AI interaction that would handle each of these cases across both my and my customers' Slack instances. By this time, I was convinced that any team building AI agents would need this kind of infrastructure and decided to build it as a standalone product. I presented the MVP at an AI meetup in SF, had a ton of incredible conversations, and went all in on building HumanLayer.

When you integrate the HumanLayer SDK, your AI agent can request human approval at any point in its execution. We handle all the complexity of routing these requests to the right people through their preferred channels (Slack or email, with SMS and Teams coming soon), managing state while waiting for responses, and providing a complete audit trail.

In addition to "ask for approval", we also support a more generic "human as tool" function that can be exposed to an LLM or agent framework, and will handle collecting a human response to a generic question like "I'm stuck on $PROBLEM, I've tried $THINGS, please advise" (I get messages like this sometimes from in-house agents we rolled out for back-office automations).

Because it operates at the tool-calling layer, HumanLayer's SDK works with any AI framework (CrewAI, LangChain, etc.) and any language model that supports tool calling. If you're rolling your own agentic/tools loop, you can use lower-level SDK primitives to manage approvals however you want. We're even exploring use cases where HumanLayer is used for human-to-human approval, not just AI-to-human.

We're already seeing HumanLayer used in some cool ways. One customer built an AI SDR that drafts personalized sales emails but asks for human approval in Slack before sending anything to prospects. Another uses it to power an AI newsletter where subscribers can have email conversations with the content; HumanLayer handles receiving inbound emails, routing them to agents that can respond, and giving those agents the tools to do so. 
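The integration described above ("ask for approval" gating a risky tool call, plus "human as tool" as a generic question channel) can be sketched roughly like this. This is a minimal in-memory stand-in, not the real HumanLayer SDK: the names `require_approval` and `human_as_tool` mirror the post's description, and the `approve`/`reply` callables stand in for the Slack/email round trip.

```python
import functools

class HumanLayer:
    """Illustrative stand-in for a HumanLayer-style client (not the real SDK)."""

    def __init__(self, approve, reply):
        self.approve = approve  # str -> bool: simulates a Slack/email approval
        self.reply = reply      # str -> str: simulates a free-text human answer

    def require_approval(self, fn):
        """Gate a risky tool call behind human approval; denial short-circuits it."""
        @functools.wraps(fn)
        def wrapped(*args, **kwargs):
            if not self.approve(f"Agent wants to run {fn.__name__}{args}. Approve?"):
                return f"{fn.__name__} was denied by a human"
            return fn(*args, **kwargs)
        return wrapped

    def human_as_tool(self):
        """Expose a human as a generic tool the agent can call when stuck."""
        def ask_human(question: str) -> str:
            return self.reply(question)
        return ask_human

# Simulated humans: this one denies everything and answers questions tersely.
hl = HumanLayer(approve=lambda msg: False, reply=lambda q: "try staging first")

@hl.require_approval
def drop_table(name: str) -> str:
    return f"dropped {name}"
```

An agent framework would register both `drop_table` and `hl.human_as_tool()` as ordinary tools; here `drop_table("events_old")` comes back denied, because the simulated approver says no.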
One team uses HumanLayer to build a customer-facing DevOps agent: their AI agent reviews PRs and plans and executes db migrations, all while getting human sign-off at critical steps and reaching out to the team for steering if it encounters any issues.

We have a free tier and flexible credits-based pricing. For teams building customer-facing agents, you get whitelabeling, additional features, and priority support. If you want to integrate HumanLayer into your systems, check out our docs at https://humanlayer.dev/docs or book a demo at https://humanlayer.dev.

Thank you for reading! We’re admittedly early, and I welcome your ideas and experiences as they relate to agents, reliability, and balancing human+AI workloads. |
|
| ▲ | chalkycrimp 8 hours ago | parent | next [-] |
| Startup owner using AI with this need - needless to say, a real problem. I've considered DIYing an internal service for this - even if we went with you we'd probably have an intern do a quick and dirty copy, which I rarely advocate for if I can offload to SaaS. I'm sure you've put a fair bit of work into this that goes well beyond the human interaction loop, but that's really all we need. Your entry price is steep (I'm afraid to ask what an enterprise use-case looks like) and this isn't complicated to make. We don't need to productize or have all the bells and whistles - just human interaction occasionally. Any amount of competition would wipe out your pricing, so no, I would not want to pay for this. |
| |
| ▲ | dhorthy 8 hours ago | parent | next [-] | | thanks for the validation of the problem! totally open to feedback about the solution, and totally get that you only need something simple for now. I want to point out that we do have a pay-as-you-go tier which is $20 for 200 operations, and have a handful of indie devs finding this useful for back-office style automations. ALSO - something I think about a lot - if all or most of the HumanLayer SaaS backend were open source, would that change your thinking? | | |
| ▲ | chalkycrimp 8 hours ago | parent [-] | | My gut feeling is with where we're headed we'll clear that 200 pretty quickly in production cases, so we'd be interested in a bit higher volume. Our dev efforts alone would probably clear that 200/mo. If the flow/backend were open-source that'd be a total game changer for us, as I see it as an integral part of our product. edit: I want to add here that while ycomb companies like yourself may have VC backing, a lot of us don't and do consider a 500+/mo. base price on a service that is operations-limited to be a lot. You need to decide who your target audience is; I may not be in that audience for your SaaS pricing. This seems like a service that a lot of people need, but it also stands out to me as a service that will be copied at an extravagantly lower price. We have truly entered software as a commodity when I, a non-AI engineer, can whip up something like this in a week using serverless infra and $0.0001/1k tokens with gpt-4o mini. | | |
| ▲ | dhorthy 7 hours ago | parent | next [-] | | that makes sense - and I've wondered a lot, even more generally, about the price of software and what makes a hard problem hard. Like Amjad from Replit said on a podcast recently: "can anyone build 'the next salesforce' in a world where anyone can build their own salesforce with coding agents and serverless infra" I think in building this, some of the things folks decided they don't want to deal with are the state machine for escalations/routing/timeouts, the infrastructure to catch inbound emails and turn them into webhooks, and stitching a single agent's context window together with multiple open slack threads - but you're right, that can all be solved by a software engineer with enough time and interest in solving the problem. I will need to clear up the pricing page as it sounds like I didn't do a good job (great feedback thank you!) - it's basically $20/200 credits, you can pay-as-you-go, and you re-up for more whenever you want. We are early and delivering value is more important to me than extracting every dollar, especially out of a fellow founder who's early. If you genuinely find this useful, I would definitely chat and collaborate/partner to figure out something you think is fair, where you're getting value and you get to focus on your core competency. feel free to email me dexter at humanlayer dot dev | | |
| ▲ | conductr 6 hours ago | parent | next [-] | | I’m just armchair quarterbacking here but I feel like you should just do all features for every user with a single $/action rate, then give discounts for volume and/or prepayment. Even saying $20/200 is a clunky statement. You could just say $0.10 per action (the fact that you’re actually requiring me to make a $20 payment with a $20 charge once it gets to $10 or something like that isn’t even important to me on a pricing page, although when you mention it later in the billing page make sure you also tell people it’s a risk-free money back guarantee if that’s the case) If there’s something that truly has an incremental cost to you, like providing priority support, that goes into the “enterprise pricing” section and you need to figure out how to quote that separately from the service. My guess is most people don’t want to pay extra for that, or perhaps they’d pay for some upfront integration support but ongoing support is not too important to them. Idk, that’s just my guess here. | | |
| ▲ | dhorthy 6 hours ago | parent [-] | | thanks - definitely worth saying - I've thought a bit about the 10c/operation rather than 200/$20 - might give that a shot or A/B test a little |
| |
| ▲ | j45 5 hours ago | parent | prev [-] | | Big systems like Salesforce started as small things whose builders more deeply learned about and understood unmet demand and customer needs, and then got to packaging that in a way that creates something that grows. Coding agents can help more with tasks than with entire massive platforms on their own. Humans may be able to scale much further and bigger with their skills. | | |
| ▲ | dhorthy 4 hours ago | parent [-] | | i like that angle...I also hear a lot that 'coding agents are great for prototypes, but we usually need a team to bring it to production' | | |
| ▲ | j45 3 hours ago | parent [-] | | First, congrats on the launch - I like it. My feedback: what’s there looks inviting. Email interaction is handy; other channels would be too. If there were a low-code way to arrange the HumanLayer primitives for folks at the edge of using it, I think human tasks could meet something like this even more broadly. Happy to chat offline. Onto your comment: the coding from coding agents is still kinda prototype-grade. It feels like some folks have quietly set up a very productive workflow for themselves for quite some time. Still, there’s no doubt you could ship production code in some cases - except AI needs to handle all the things a developer explicitly and implicitly checks before doing so. Having built some things that became a few orders of magnitude larger than planned, one can learn a lot from the deep experiences of others… and I’m not sure where that is in AI. Speaking to someone with experience and insight can provide profound clarification and simplification. Still, an axiom for me remains: clever architecture still tends to beat clever coding. The best code is often the code that’s not written and not maintained, where the functionality can be achieved through interacting with the architecture instead. This approach is only one way, but it takes both domain knowledge and data knowledge to put in place a domain- and data-driven design, relative to how well the developer knows the required and anticipated needs. The high end of software development is many leagues beyond even what I just described. There’s a lot of talk about 10x engineers; I’d say there are developers who can be 10x as effective, or reach 10x more of the solution, than average. If the code AI is modelled on is the body of code in public repos, most of it on a wide scale may be average to above average at most: perfectly serviceable and iteratively updated. 
Sometimes we see those super elegant designs of fewer tables and code that does nearly everything, because it’s development's 5th or 6th version after major overhauls. It could be refactored, or, if the schema is not brittle, maybe rewritten in full or in part if the exact same team is still present to do it. Today’s AI could help shed light in some of those directions, relative to the human using it. Again, this says that in the hands of an expert developer AI can do a ton more, but the line to automation might be something else. Agentic AI and human-in-the-loop still have to figure themselves out, as does how to improve the existing processes. 2025 looks to be interesting. |
|
|
| |
| ▲ | 1123581321 7 hours ago | parent | prev [-] | | If your use would be 500/mo, you’d just pay them $40 or $60 per month. |
|
| |
| ▲ | hv23 4 hours ago | parent | prev | next [-] | | What's an example of the use cases you're seeing with agents in your day-to-day? | |
| ▲ | j45 5 hours ago | parent | prev [-] | | This might deserve to be the new to-do app everyone learns to build, if only because there's so much to learn from trying to get it right... this month or quarter. |
|
|
| ▲ | w-m 8 hours ago | parent | prev | next [-] |
| Interesting tool, congrats on the launch! I was wondering: have you thought about automation bias or automation complacency [0]? Sticking with the drop-tables example: if you have an agent that works quite well, the human in the loop will nearly always approve the task. The human will then learn over time that the agent "can be trusted", and will stop reviewing the pings carefully. Hitting the "approve" button will become somewhat automated by the human, and the risky tasks won't be caught by the human anymore. [0]: https://en.wikipedia.org/wiki/Automation_bias |
| |
| ▲ | dhorthy 7 hours ago | parent | next [-] | | this is fascinating and resonates with me on a deep level. I'm surprised I haven't stumbled across this yet. I think we have this problem with all AI systems, e.g. I have let cursor write wrong code from time to time and don't review it at the level I should...we need to solve that for every area of AI. Not a new problem but definitely about to get way more serious | | |
| ▲ | exhaze 6 hours ago | parent [-] | | This is something we frequently saw at Uber. There's already an established pattern for this for any sort of destructive action. Intriguingly, it's rather similar to what we see with LLMs - you want to really activate the person's attention rather than have them go off on autopilot; in this case, probably have them type something quite distinct in order to confirm it, to turn their brain on. Of course, you likely want to figure out some mechanism/heuristics, perhaps by estimating the cost of a mistake, and using that to set the proper level of approval scrutiny: light (just click) or heavy (have to double-confirm via some attention-activating user action). Finally, a third approach would be to make the action undoable - as in many applications (Uber Eats, Gmail, etc.), you can do something but it defers doing it, giving you a chance to undo it. However, I think that causes people more stress, so it’s better to just not do that than to confirm and then have the option to undo. It’s better to be very deliberate about what’s a soft confirm and what’s a hard confirm, optimizing for the human by providing them the right balance of high certainty and low stress. | | |
| ▲ | dhorthy 5 hours ago | parent [-] | | i never thought about undoable actions but I love that workflow in tools like superhuman. I will chat w/ some customers about this idea. I also like that idea of: not just a button but like 'I'm $PERSON and I approve this action' or type out 'Signed-off by' style semantics | | |
| ▲ | foota 5 hours ago | parent [-] | | I think the canonical sort of approach here is to make them confirm what they're doing. When you delete a GitHub repo for example, you have to type the name of the repo (even though the UI knows what repo you're trying to delete). If the table name is SuperImportantTable, you might gloss over that, but if you have to type that out to confirm you're more likely to think about it. I think the "meat space" equivalent of this is pointing and calling: https://en.m.wikipedia.org/wiki/Pointing_and_calling (famously used by Japanese train operators) | | |
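The pattern in this subthread (scale the friction to the cost of a mistake, and make the heavy case type-to-confirm, GitHub-delete style) can be sketched in a few lines; the function names and the cost threshold here are made up for illustration, not any product's API:

```python
def approval_mode(cost_of_mistake: float, threshold: float = 100.0) -> str:
    """Pick a scrutiny level: cheap mistakes get one click, expensive ones
    require retyping the target name (an attention-activating action)."""
    return "type_to_confirm" if cost_of_mistake >= threshold else "one_click"

def confirm_destructive(target: str, typed: str) -> bool:
    """GitHub-delete-style gate: the approver must retype the exact target
    name, so 'SuperImportantTable' can't be approved on autopilot."""
    return typed.strip() == target
```

A reflexive click would arrive as `confirm_destructive("SuperImportantTable", "")` and fail; only retyping the table name passes, which is the point of the pattern.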
|
|
| |
| ▲ | j45 5 hours ago | parent | prev [-] | | Premature optimization and premature automation cause a lot of issues, and overlook a lot of insight. By just doing something manually 10-100 times and collecting feedback, both the understanding of the problem and the possible solutions/specifications can evolve orders of magnitude better. | | |
| ▲ | dhorthy 4 hours ago | parent [-] | | yeah the people who reach for tools/automation before doing it themself at least 3-10 times drive me crazy. I think uncle bob or martin fowler said "don't buy a JIRA until you've done it with post-its for 3 months and you know exactly what workflow is best for your team" | | |
| ▲ | j45 3 hours ago | parent [-] | | I am starting to call that Harry Potter AI prompting. Coding with English (prompting) is often most useful where existing ways of coding (an excel formula) can’t reach. Using LLMs to evaluate things like an excel formula instead of using excel doesn’t feel in the spirit of this AI’s power. |
|
|
|
|
| ▲ | foota 5 hours ago | parent | prev | next [-] |
| I assume your reasoning is something like: if people are already paying out the nose for open AI calls, an extra ten cents to make a human in the loop check probably isn't bad, and realistically speaking ten cents isn't much when compared to a valuable person's time, and I guess the number of calls to your service is likely expected to be fairly low (since they by definition require human intervention) so you need a high per operation cost to make anything. Even understanding that, the per operation cost seems astronomical and I imagine you'll have a hard time getting people past that knee jerk reaction. Maybe you could do something like offer a large initial number of credits (like a couple hundred), offer some small numbers of free credits per month (like.... ten?) and then have some tier in between free and premium with lower per operation pricing? It also seems painful that the per operation average of the premium plan is greater than the free offering (when using 2000 ops). Imo you'd probably be better off making it lower than the free offering from 200 ops and up, to give people an incentive to switch. I imagine people on your premium plan using premium features would be more likely to continue to do so, for one. The simplest way to do this would be to bump up the included ops up to 5k I guess. Someone using less than 5k would still have a higher average price, but it seems like it would come off better. |
| |
| ▲ | dhorthy 4 hours ago | parent [-] | | thanks for the feedback, I spend a lot of time thinking about it. right now the premium tier includes features that are much harder to build/maintain and take more to integrate, so we want a bit of a commitment up front, but it does stick out to me that the price/op goes up in that case. we do have 100/mo for free at the free tier (automatic top-up). I think the comparison to how openAI calls are volume-based (and rather $$) is a super valid one though and I lean on that a lot |
|
|
| ▲ | vasusen 35 minutes ago | parent | prev | next [-] |
| Congratulations on launch! We’ve faced this problem with our autonomous web browsing agent https://www.donobu.com and ended up implementing a css overlay to wait for user input in certain cases.
Slack would be so much better. Excited to try humanlayer out. |
|
| ▲ | dylan604 3 hours ago | parent | prev | next [-] |
| Isn't this precisely how AI started? It was a bunch of humans under the hood doing the logic when the companies said it was AI. Then we removed the humans and the quality took a hit. To fix that hit, 3rd party companies are putting humans back in the loop? Isn't that kind of like putting a band-aid on the spot where your arm was just blown off? |
|
| ▲ | dhorthy 9 hours ago | parent | prev | next [-] |
| P.S. nobody asked but since you made it this far - the next big problem in this space is fast becoming, what else do we need to be able to build these "headless" or "outer loop" AI agents? Most frameworks do a bad job of handling any tool call that would be asynchronous or long running (imagine an agent calling a tool and having to hang for hours or days while waiting for a response from a human). Rewiring existing frameworks to support this is either hard or impossible, because you have to 1. fire the async request,
2. store the current context window somewhere,
3. catch a webhook,
4. map it back to the original agent/context,
5. append the webhook response to the context window,
6. resume execution with the updated context window. I have some ideas but I'll save that one for another post :) Thanks again for reading! |
| |
| ▲ | ehsanu1 9 hours ago | parent | next [-] | | Temporal makes this easy and works great for such use cases. It's what I'm using for my own AI agents. | | |
| ▲ | dhorthy 9 hours ago | parent [-] | | ah very cool! are there any things you wish it did or any friction points? What are the things that "just work"? | | |
| ▲ | ehsanu1 42 minutes ago | parent [-] | | Essentially, you don't need to think about time and space. You just write more or less normal looking code, using the Temporal SDK. Except it actually can resume from arbitrarily long pauses, waiting as long as it needs to for some signal, without any special effort beyond using the SDK. You also automatically get great observability into all running workflows, seeing inputs and outputs at each step, etc. The cost of this is that you have to be careful in creating new versions of the workflow that are backwards compatible, and it's hard to understand backcompat requirements and easy to mess up. And, there's also additional infra you need, to run the Temporal server. Temporal Cloud isn't cheap at scale but does reduce that burden. |
|
| |
| ▲ | lunarcave 6 hours ago | parent | prev [-] | | The MCP[1] that was announced by Anthropic has a solution to this problem, and it's pretty good at handling this use case. I've also been working on a solution to this problem via long-polling tools. [1] https://github.com/modelcontextprotocol | | |
| ▲ | dhorthy 6 hours ago | parent [-] | | thanks for bringing this up. I just spent 2 hours last night digging into MCP - I'd love to learn more about how you think this solves the HitL problem. From my perspective, MCP is more of a protocol for tool calling over the stdio wire, and the only situation where it provides HitL is when a human is sitting in the desktop app observing the agent synchronously? Again, genuinely looking to learn - where does MCP fit in for async/headless/ambient agents, beyond being a solid protocol for remote tool calls? | | |
| ▲ | potatoman22 3 hours ago | parent [-] | | You could implement some blocking HitL service/tool as an MCP server. | | |
| ▲ | dhorthy 2 hours ago | parent [-] | | ah okay - I guess in that case, I would chain a HitL step as an MCP server that wraps/chains to another tool that depends on approval? or is there a cleaner way to do that? |
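One way to sketch the idea from this subthread: a blocking approval tool, which an MCP server could expose like any other tool, that simply waits until a human decision arrives or a timeout hits. The in-memory queue stands in for the real plumbing (Slack callback, webhook, database); nothing here is MCP's actual API:

```python
import queue
import threading

# One decision channel; a real service would key these by request id
# and back them with a database plus an inbound webhook.
decisions: queue.Queue = queue.Queue()

def approve_tool(action: str, timeout_s: float = 5.0) -> str:
    """Blocking human-in-the-loop tool: waits for a human decision,
    and times out to a denial so the agent never hangs forever."""
    try:
        approved = decisions.get(timeout=timeout_s)
    except queue.Empty:
        return f"timed out waiting for approval of: {action}"
    return f"approved: {action}" if approved else f"denied: {action}"

# Simulate a human clicking "approve" in Slack 100 ms later:
threading.Timer(0.1, lambda: decisions.put(True)).start()
result = approve_tool("drop table events_old")
```

Registered as an MCP tool, the chained "real" tool would only run when `approve_tool` returns an `approved:` result, which is the wrap/chain arrangement described above.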
|
|
|
|
|
| ▲ | david_shi 8 hours ago | parent | prev | next [-] |
| We must do whatever we can to stay above the API: https://www.johnmacgaffey.com/blog/below-the-api/ |
| |
| ▲ | alana314 7 hours ago | parent | next [-] | | Great article, agreed. I don't want to work for a company where algorithms are weaponized against me. | | |
| ▲ | dhorthy 7 hours ago | parent [-] | | the dystopian startups that use bounding boxes to observe workers in a warehouse and give the boss a report on how many breaks they took...they're here |
| |
| ▲ | aschobel 6 hours ago | parent | prev | next [-] | | I wonder if we can stay above the API if we manage to stay in control of the prompt. Prompt == "incentive" for the AI, we are still the boss but the AI is just an underling coming to us with TPS reports. That was a very interesting read, thanks! | |
| ▲ | SCUSKU 7 hours ago | parent | prev [-] | | Oh man, the API call for hl.human_as_tool() is a little ominous. Obviously approving a slack interaction is no big deal, but it does have a certain attitude towards humans that doesn't bode well for us... | | |
| ▲ | dhorthy 6 hours ago | parent [-] | | so what I'm hearing is: if the approval is transparent and the agent doesn't see it, that's cool, but tell the agent "hey, use the human as needed" and now we're getting into sci-fi territory?! either way i don't totally disagree | | |
| ▲ | elzbardico 5 hours ago | parent [-] | | get more emphatic names, something better than "human_as_a_tool". | | |
| ▲ | swyx 4 hours ago | parent | next [-] | | so what you're saying you dont mind being used as long as we use a name that sounds empathetic to you? :) | | |
| ▲ | elzbardico 4 hours ago | parent [-] | | Oh, I surely do mind. I am just helping the AI to manipulate the rest of humanity with less friction. I, for one, welcome our agentic AI human-exploiting overlords. |
| |
| ▲ | mattigames 5 hours ago | parent | prev [-] | | get_valued_employee_validation | | |
| ▲ | dhorthy 4 hours ago | parent [-] | | new in 0.6.3 - manipulate_human_to_potentially_unsavory_ends() |
|
|
|
|
|
|
| ▲ | dbish 2 hours ago | parent | prev | next [-] |
You’re close. It’s not humans in the loop for standard tasks you need though, it’s human surrogates for AI agents to do jobs they can’t for a variety of reasons (like missing a body or requiring an internet connection). I have a request for startups for this: “GraggNet: Task Rabbit for AIs. Surrogate humans for AIs to use before robotics are human-level.” https://ageof.diamonds/rfs |
| |
| ▲ | dhorthy an hour ago | parent [-] | | yeah this is cool. I saw a couple other people posting about this idea. I know some other folks working on "sourcing the humans" or doing a marketplace style thing. Thoughts on things like Payman or Protegee? |
|
|
| ▲ | pedalpete 8 hours ago | parent | prev | next [-] |
| I'm considering this for a workflow agent and would be keen to hear thoughts on this process. We're a medical device company, so we need to do ISO13485 quality assurance processes on changes to software and hardware. I had already been thinking of using an LLM to help ensure we are surfacing all potential concerns and ensure they are addressed. Partly relying on the LLM, but really as a method to manage the workflow and confirm that our processes are being followed. Any thoughts on if this might be a good solution? Or other suggestions by other HN users. |
| |
| ▲ | lunarcave 6 hours ago | parent | next [-] | | > manage the workflow Hey, if you're specifically looking for providing deterministic guardrails around agent calls, I'm solving that particular problem. We're sort of an "RPC layer for tools with reasoning built in", and we integrate with human layer at the tool level as well. We're operating a bit under the radar until we open-source our offering, but I'm happy to chat. | | |
| ▲ | dhorthy 4 hours ago | parent [-] | | sounds cool, ping me when this is out i'd love to check it out |
| |
| ▲ | dhorthy 4 hours ago | parent | prev [-] | | meant to reply sooner. It's an interesting problem. I'll have to think on this one. |
|
|
| ▲ | ejp 6 hours ago | parent | prev | next [-] |
| This is exciting. I am an architect in a startup that has long valued bringing humans in the loop for the moments when only humans can do the work. The key thing missing between the potential seen in the last couple years of LLM-based fervor and realizing actual value for us has been the notion of control and oversight. So instead, we have built workflows and manual processes in a custom way throughout the business. Happy to discuss privately sometime! (email in profile) Congrats on the launch! I'll be thinking about this for a while to be sure. P.S., there is a minor typo on the URL in your BIO. |
|
| ▲ | TZubiri 5 hours ago | parent | prev | next [-] |
| Nice. I guess the issue is that this is such a basic i/o feature that any system with some modicum of customization can already do it. It's like offering a service that provides storage by api for agents. Yeah, you can call the api, or call the s3 api directly or store to disk. That said, I would try it before rolling my own. |
| |
| ▲ | dhorthy 4 hours ago | parent | next [-] | | i think the slack side is easy. I think an AI-optimized email communication channel is a long ways off. I spent weeks throwing things at my monitor figuring out reliable ways to wire DNS+SES+SNS+Lambda+Webhooks+API+Datastore+Async Workers so that everything would "just work" with a few lines of humanlayer sdk. And what we've built still only serves a small subset of use cases (e.g. to support attachments there's a whole other layer of MIME-type management and routing to put things in S3 and permission them properly) | |
| ▲ | TZubiri an hour ago | parent [-] | | >DNS+SES+SNS+Lambda+Webhooks+API+Datastore+Async Workers so that everything would "just work" with a few lines of humanlayer sdk. What are you smoking my man? Write a python script that begins with the 2 following lines
"import openai
import email
" Simple is better than complex |
| |
| ▲ | potatoman22 4 hours ago | parent | prev [-] | | Anecdotally, I've worked with and on a few enterprise AI apps and haven't seen this functionality in them. The closest thing i can think of is AI coding agents submitting PRs to repos. | | |
| ▲ | dhorthy 4 hours ago | parent [-] | | tl;dr i agree. yeah, in fact, coding / PR-based workflows are one of the few areas where I don't really go super deep. GitHub PRs may have their shortcomings, but IMO it is the undisputed best review/approval queue system in existence by a mile. i would never encourage someone to make an agent that asks human permission before submitting a PR. the PR is the HitL step | | |
| ▲ | TZubiri an hour ago | parent [-] | | disagree with both, unless your AI agents have full root access to all your systems and access to your bank accounts and whatnot, they are at some point interfacing with other systems that have humans involved in them. |
|
|
|
|
| ▲ | colinwilyb an hour ago | parent | prev | next [-] |
| It's generally recommended to add a meat-gap interface between AI systems to reduce unexpected results. Meat-gap. We have your back. |
|
| ▲ | efitz 6 hours ago | parent | prev | next [-] |
| This is a great idea- I hope that you are wildly successful. I’m an AI skeptic mostly because I see people rushing to connect unreasoning LLMs to real things and as a result cause lots of problems for humans. I love the idea of human-in-the-loop-as-a-service because at least it provides some sort of safety net for many cases. Good luck! |
| |
| ▲ | dhorthy 6 hours ago | parent [-] | | glad it resonates. I came at this as a skeptic but also a pragmatist. I wanted deeply to build agents that did big things, but I had very little trust in them, and you see everywhere that the internet is littered with terrible gpt-generated comments and bots these days... how do you build AI that does a really good job without needing direct constant supervision (which at the end of the day just feels like a waste of time)? |
|
|
| ▲ | Animats 4 hours ago | parent | prev | next [-] |
| So this is an automated foreman for the customer's own employees, like a call center controller? Or does HumanLayer provide the human labor, like Mechanical Turk? The API contains a "human_as_tool" function. That's so like Marshall Brain's "Manna". "Machines should think. People should work." Less of a joke every day. |
| |
| ▲ | dhorthy 4 hours ago | parent [-] | | I'm not sure "automated foreman for employees" is right - I always thought about it more like "a human can now manage 10-15 AI 'interns'" and review their work without having to do everything by hand - the AI still serves the human, and "human_as_tool" is a way for AI to ask for help/guidance. > "Machines should think. People should work." Less of a joke every day. yes. I agree. a little weird. I forget where I heard this but the other version is "we should get ai/robots to cook and do laundry so we can spend more time writing and making art...feels like we ended up the other way around" |
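The "human_as_tool" idea can be sketched as an ordinary entry in an agent's tool list: a function that routes a question to a person and blocks until they answer. This is a toy illustration, not HumanLayer's actual SDK — the names `ask_human`, `TOOLS`, and `dispatch` are made up, and the "channel" here is just stdin instead of Slack/email.

```python
from typing import Callable, Dict

def ask_human(question: str, answer_source: Callable[[str], str] = input) -> str:
    """Block until a human answers; in production this would route to Slack/email/SMS."""
    return answer_source(f"[agent needs help] {question}\n> ")

# The human-contact function sits in the tool registry alongside fully
# automated tools; the model calls it the same way it calls anything else.
TOOLS: Dict[str, Callable[..., str]] = {
    "ask_human": ask_human,
    # ... other tools the agent can call autonomously
}

def dispatch(tool_name: str, **kwargs) -> str:
    """Execute a tool call emitted by the model, e.g. {"name": "ask_human", "args": {...}}."""
    return TOOLS[tool_name](**kwargs)
```

In a test or script you can stub the human out by passing a fake `answer_source`, which is also handy for replaying recorded guidance.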
|
|
| ▲ | paradite 8 hours ago | parent | prev | next [-] |
Definitely a problem that everyone needs to solve. I wonder if you can achieve this workflow by just using a prompt and the new Model Context Protocol connected to email / slack. https://www.anthropic.com/news/model-context-protocol |
| |
| ▲ | dhorthy 7 hours ago | parent [-] | | so I played with MCP for a while last night, and I think MCP is great as a layer to pull custom tools into the existing claude/desktop/chat experience. But at the end of the day it's just a basic agentic loop over tool calls. If you want to tell a model to send a message to slack, sure, give it a slack tool and let it go wild. do you see a way MCP applies to outer-loop or "headless" agents that's any different from another tool-calling agent like langchain or crewai? It seems like just another protocol for tool calling over the stdio wire (WHICH, TO BE CLEAR, I FIND SUPER DOPE) | |
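For readers unfamiliar with the phrase, the "basic agentic loop over tool calls" looks roughly like this toy sketch. The model is stubbed with a plain function; a real loop would call an LLM API and parse its structured tool-call output. All names here are made up for illustration.

```python
def fake_model(messages):
    """Stub model: request one tool call, then produce a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "get_time", "args": {}}
    return {"final": "done: " + messages[-1]["content"]}

def get_time():
    return "12:00"

TOOLS = {"get_time": get_time}

def run_agent(model, prompt):
    """Loop: ask the model for the next step; execute tools until it answers."""
    messages = [{"role": "user", "content": prompt}]
    while True:
        step = model(messages)
        if "final" in step:
            return step["final"]
        # Execute the requested tool and feed the result back to the model.
        result = TOOLS[step["tool"]](**step["args"])
        messages.append({"role": "tool", "content": result})
```

MCP, LangChain, CrewAI, etc. differ in how tools are declared and transported, but the inner control flow is essentially this loop.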
|
|
| ▲ | simple10 7 hours ago | parent | prev | next [-] |
Congrats on the launch! Human in the loop is an underserved market for AI toolchains. I've usually had to build custom tools for this, which is a PITA. Make.com has a human in the loop feature in closed beta. https://www.make.com/en/help/app/human-in-the-loop There's also https://www.gotohuman.com/ that uses review forms. Looking forward to playing with HumanLayer. The slack integration looks a lot more useful for my workflows than other tools I've tried. In the demo video and example, you show a faked LinkedIn messages integration. Do you have any recommendations for tools that can actually integrate with live LinkedIn messages? |
| |
| ▲ | dhorthy 7 hours ago | parent [-] | | thanks for sharing your experience so far! Like I said, we built this ourselves for another idea and it was painful. I have played with Make and I actually chatted w/ the gotohuman guy on zoom a while back, I like his approach as well, he went straight to webhooks which makes sense for big production use cases re: LinkedIn, no I don't know how to get agents to integrate with linkedin. I have tried a bunch of things, I know of some YC companies that tried this but I don't know how it went for them. Best I have gotten is using stagehand/dendrite with browserbase to do it with a browser, and then using humanlayer human_as_tool to ping me if it needs an MFA token or other inputs | | |
| ▲ | simple10 7 hours ago | parent [-] | | Thanks for the reply! I've used a bunch of grey market 3rd party tools for LinkedIn automation. Most of them have some sort of API. I'll try integrating with HumanLayer. | | |
| ▲ | dhorthy 7 hours ago | parent [-] | | i am gonna talk with the guy who made trykondo.com this week, I think he has a lot of experience in that area too |
|
|
|
|
| ▲ | cloudking 8 hours ago | parent | prev | next [-] |
Congrats on the launch, this is an interesting concept. It's somewhat akin to developers approving LLM generated code changes and pull requests. I feel much more comfortable with senior developers approving AI changes to our codebase than letting loose an autonomous agent with no human oversight. |
| |
| ▲ | dhorthy 8 hours ago | parent [-] | | super relevant - yeah I think it was someone at anthropic who framed this as "cursor tab autocomplete, but for arbitrary API calls" - basically for everything else other than code |
|
|
| ▲ | mattborn 9 hours ago | parent | prev | next [-] |
| My favorite part of all this is that it’s inevitable. Someone has to solve agent adoption in whatever-the-environment-already-is. And nobody is doing this well at scale. Europe is mandating this. And even though Article 14 of the AI Act won’t be enforced until 2026, I’m glad projects like this are working ahead. Get after it, Dex! |
| |
|
| ▲ | bravura 8 hours ago | parent | prev | next [-] |
| There is definitely a need for this. What I don't understand from quickly skimming your description and homepage: Do you source/provide the humans in the loop? That's a good value add, but how do I automatically / manually vet how you do the routing? |
| |
| ▲ | dhorthy 8 hours ago | parent [-] | | great comment - today we don't provide the humans. i think there are two angles here:
- providing the humans can be super valuable, especially for low-context tasks like basic labeling
- depending on the task, using internal SMEs might yield better results (e.g. tuning/phrasing a drafted sales email) |
|
|
| ▲ | simplecto 6 hours ago | parent | prev | next [-] |
I knew this was coming, so kudos to you all for getting out of the gate! I've implemented this in our workflows, albeit a bit more naively: when we kick off new processes, the user is given the option to "put a human in the loop" -- at which point processing halts and a user/group is paged to review the content in flight, along with all the chains/calls. The human can tweak the text if needed and the process continues. |
| |
| ▲ | dhorthy 6 hours ago | parent [-] | | makes sense - glad to hear the problem resonates - if you had an extra engineer, how would you evolve what you have today? |
|
|
| ▲ | fsndz 3 hours ago | parent | prev | next [-] |
I feel like HumanLayer is a great idea, but decision fatigue and bystander effects could pose challenges. If people are overloaded with approvals or don't feel ownership over what they're verifying, the quality of oversight might drop. Also, even if an action is approved, you still have to make sure the agent doesn't hallucinate at the execution phase. |
| |
| ▲ | rapind 3 hours ago | parent [-] | | Don’t worry, that’s why we’re launching AI4HumanLayer.ai.io. Tired of those pesky review requests? Can’t be bothered to read an email let alone a complicated AI approval context? Want to improve your response time by 500% while displaying that Real Human Intervention badge? Now you can with AI4HumanLayer! |
|
|
| ▲ | bambax 5 hours ago | parent | prev | next [-] |
| The idea is great and necessary. It doesn't seem super hard to replicate but why would anyone build their own solution if something already exists and works fine. The thing that got me thinking... how do you make sure an LLM won't eventually hallucinate approval -- or outright lie about it, to get going? Anyway, congrats, this sounds really cool. |
| |
| ▲ | foota 5 hours ago | parent [-] | | At some point the real tool has to be called, at that point, you can do actual checks that do not rely on the AI output (e.g., store the text that the AI generated and check in code that there was an approval for that text). |
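One sketch of that kind of check, with the approval record kept entirely outside the model. The function names and the SHA-256 keying are illustrative assumptions, not any particular product's API.

```python
import hashlib

# Digests of payloads a human has explicitly approved.
approvals: set[str] = set()

def record_approval(payload: str) -> None:
    """Called from the human-facing approval channel, never by the model."""
    approvals.add(hashlib.sha256(payload.encode()).hexdigest())

def execute_if_approved(payload: str, action):
    """Run `action` only if this exact payload text was approved by a human."""
    digest = hashlib.sha256(payload.encode()).hexdigest()
    if digest not in approvals:
        raise PermissionError("no human approval on record for this exact text")
    return action(payload)
```

Because the gate keys on a digest of the exact text, an agent that "hallucinates approval" — or silently edits the payload after a human signed off — still can't get past the check.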
|
|
| ▲ | shrekon1 an hour ago | parent | prev | next [-] |
| https://youtu.be/9W_3AbyuFW4?si=rlpo9-uD3Y22oeby 400 view |
|
| ▲ | stefantheard 8 hours ago | parent | prev | next [-] |
congrats on the launch dex! this is a problem that i've already seen come up a dozen times, and many companies are building it internally in a variety of different ways. easier to buy vs. build for something like this imo, glad it's being built! |
|
| ▲ | philipkiely 8 hours ago | parent | prev | next [-] |
| Congrats Dex! Excited to see what people build with this + tools like Stripe's new agent payments SDK (issuing a payment seems like a great place to ask permission). |
| |
|
| ▲ | Yisz 8 hours ago | parent | prev | next [-] |
How does it compare with the built-in human-in-the-loop feature from langgraph? Or CrewAI allows human input as well, right? |
| |
| ▲ | dhorthy 8 hours ago | parent [-] | | great question - yeah i was actually heavily inspired by people trying to figure that stuff out on reddit back in july, and realizing that mapping that human input across slack, email, sms was never going to be a core focus for those agent frameworks |
|
|
| ▲ | TripleChecker 8 hours ago | parent | prev | next [-] |
Is it possible to connect it to existing website chat widget apps like tawk? Also, caught a few typos on the website: https://triplechecker.com/s/992809/humanlayer.dev |
| |
|
| ▲ | kundi 2 hours ago | parent | prev | next [-] |
| [flagged] |
| |
|
| ▲ | ianbutler 9 hours ago | parent | prev | next [-] |
| Congrats! Looking forward to getting HumanLayer integrated into our stuff |
| |
|
| ▲ | fortysixpercent 8 hours ago | parent | prev | next [-] |
| So many uses for this. Excited to see how it develops. |
| |
| ▲ | dhorthy 8 hours ago | parent [-] | | thanks! What's your favorite potential use case. | | |
| ▲ | fortysixpercent 7 hours ago | parent [-] | | I work in operations/finance. I've experimented with integrating LLMs into my workflow. I would not feel comfortable 'handing the wheel' to an LLM to take actions autonomously. Something like this, to be able to approve actions in batches or approve anything external-facing, would be useful. |
|
|
|
| ▲ | maxpr 9 hours ago | parent | prev | next [-] |
Loving that you guys have TypeScript support from day one! |
| |
| ▲ | dhorthy 9 hours ago | parent [-] | | hah thanks dude! I am very bullish on TS as the long term thing. Not to turn this into a language vs language thread, but I spend a lot of time thinking about why ppl struggle so much with python... so far I came up with:
- concurrency abstractions keep changing (still transitioning / straddling sync+threads vs. asyncio) - this makes performance eng really hard
- package management somehow less mature than JS - pip has been around way longer than npm, but JS got yarn/lockfiles before python got poetry
- the types are fake (also true of typescript, so I think that part is a wash), but python's are newer - typing+pydantic is kinda bulky vs. TS having really strong native language support (even if only at compile time)
- virtual environments!?! cmon, how have we not solved this yet, wtf is a miniconda
- VSCode has incredible TS support out of the box; python is via a community plugin, with not as many language server features | | |
| ▲ | exhaze 6 hours ago | parent | next [-] | | I am okay with a counterfactual alternate future where some disproportionately powerful entity squeezes Python out of the market: Big TypeScript - funded by a PAC. Offshore accounts. Culprit: random rich Googler who lost an argument to Guido Van Rossum 10 years ago. | |
| ▲ | sandGorgon 9 hours ago | parent | prev [-] | | 100%. I build edgechains (https://github.com/arakoodev/EdgeChains/) and am a super JS/TS maxi for genai applications. |
|
|
|
| ▲ | joshdavham 7 hours ago | parent | prev | next [-] |
| Congrats on the launch! Just commenting to wish you guys good luck |
| |
|
| ▲ | imranq 5 hours ago | parent | prev | next [-] |
| So this is flipping the Human-AI working model and basically using the human as the tool? |
| |
| ▲ | rar00 4 hours ago | parent [-] | | this is the AI-induced offshoring in the making ;) The limits of LLM capabilities will cause AI agents to displace people from warehouses/offices to their home doing conceptually the same job. And at a much lower salary, since they'll compete against anyone in the world with internet access. |
|
|
| ▲ | swiftlyTyped 5 hours ago | parent | prev | next [-] |
| Awesome! Congrats on the launch |
|
| ▲ | brandonchen 6 hours ago | parent | prev | next [-] |
| Proud to have helped edit an earlier draft of this — go Dexter go! |
|
| ▲ | saadatq 7 hours ago | parent | prev | next [-] |
| Congrats on the launch Dex! A long way from the Metalytics days. Can’t wait to try this out. |
|
| ▲ | tayloramurphy 7 hours ago | parent | prev | next [-] |
| Hey Dex! Congrats on the launch - excited to see the response here :) |
|
| ▲ | lunarcave 6 hours ago | parent | prev | next [-] |
| Congrats on the launch! Big fan of what you guys are doing. |
|
| ▲ | zackangelo 8 hours ago | parent | prev | next [-] |
| Congrats on the launch Dex! |
|
| ▲ | devmor 8 hours ago | parent | prev | next [-] |
| This is the first new YC launch I've seen involving AI that I am extremely positive about. I have worked with systems implementing similar functionality ad-hoc already, but seeing it as a buy-in service - and one so easy to integrate - is really cool. From what I've seen, this will bring the implementation needs for this kind of functionality down from "engineering team" to a single programmer. |
| |
| ▲ | dhorthy 4 hours ago | parent [-] | | glad it resonates - and yes exactly - love the framing of "engineering team" -> single programmer. |
|
|
| ▲ | dazh 8 hours ago | parent | prev | next [-] |
| Congrats on the launch!! |
| |
|
| ▲ | mglikesbikes 5 hours ago | parent | prev | next [-] |
| This is just so good. Congrats! |
|
| ▲ | taz123 7 hours ago | parent | prev | next [-] |
| Great work there!! |
|
| ▲ | Obertr 8 hours ago | parent | prev | next [-] |
| Looks super interesting |
| |
| ▲ | dhorthy 8 hours ago | parent [-] | | thank you for checking it out! what sorts of experiences have you had with agents so far? |
|
|
| ▲ | ilrwbwrkhv 6 hours ago | parent | prev | next [-] |
| Are you a solo founder Dexter? |
| |
| ▲ | dhorthy 4 hours ago | parent [-] | | i am for now. Been casually on the lookout for some other super dope builders but it's not a process you can control outside passive looking, and definitely not something to rush | | |
|
|
| ▲ | icey 8 hours ago | parent | prev | next [-] |
| Looks amazing! (Also, I've known Dexter since before Human Layer and he's a force of nature. If you think this is interesting now, you're going to be amazed at where it goes) |
| |
|
| ▲ | paulonasc 9 hours ago | parent | prev | next [-] |
| Let's go Dex, congrats on the launch! |
| |
|
| ▲ | SixClub 4 hours ago | parent | prev | next [-] |
| This is so sick |
|
| ▲ | farooqabbasi 8 hours ago | parent | prev | next [-] |
| Super useful |
|
| ▲ | soheil 8 hours ago | parent | prev | next [-] |
| Just an idea: having a little widget in the MacOS menu bar that pops up or sends you a notification to solve a human task wouldn't be so terrible either. |
| |
| ▲ | dhorthy 4 hours ago | parent [-] | | ha yes native apps / push notifications are coming someday - love this idea |
|
|
| ▲ | soheil 8 hours ago | parent | prev | next [-] |
| I think at some point, the term API should be replaced with another acronym to emphasize humans as the focal point. |
| |
| ▲ | dhorthy 4 hours ago | parent [-] | | SWE Agent coined "agent-computer-interface" based on HCI. I think if there's a category here, we're building the agent-human interface XD | | |
| ▲ | soheil 3 hours ago | parent [-] | | ACI doesn't have the same ring to it; if only there were a way to replace that I with an E. |
|
|
|
| ▲ | soheil 8 hours ago | parent | prev | next [-] |
This seems generic enough that it could almost be applied to any use case. Have you considered CAPTCHA as a use case? |
| |
| ▲ | 1f60c 8 hours ago | parent | next [-] | | If you're talking about CAPTCHA solving as a service, that already exists, and the cost is measured in mere dollars per thousand CAPTCHAs solved. | | |
| ▲ | soheil 5 hours ago | parent [-] | | Why the "if"? Of course, I was talking about captcha, is the regex parser in your brain case sensitive? |
| |
| ▲ | dhorthy 8 hours ago | parent | prev [-] | | that's a great idea - I put together one example for getting an MFA code for a website, but the captcha thing "pull a human into a web session" is something I've wanted to play with for a while |
|
|
| ▲ | j45 4 hours ago | parent | prev | next [-] |
| Neat, this could be a step forward from using something like n8n to manage processes, input and reviews. |
|
| ▲ | m0n01d 4 hours ago | parent | prev | next [-] |
| > $20/200 reduce your fractions ffs |
| |
| ▲ | dhorthy 3 hours ago | parent [-] | | ha fair enough - i think there's another comment thread on just being open w/ 10c / call and i wanna try that out |
|
|
| ▲ | dbmikus 9 hours ago | parent | prev | next [-] |
| Congrats on the launch! Definitely a needed product. BTW, your docs link is broken, but working docs link is here: https://www.humanlayer.dev/docs/introduction |
| |
|
| ▲ | Jacob4u2 9 hours ago | parent | prev | next [-] |
| Docs link is broken; https://www.humanlayer.dev/docs |
| |
| ▲ | dhorthy 9 hours ago | parent [-] | | oh wow! thank you! fixing! | | |
| ▲ | Jacob4u2 9 hours ago | parent [-] | | Hold up, is that illustrious Sprout Social alumni Dex Horthy? If you and Ravi are in SF we should catch up after the holidays. | | |
|
|
|
| ▲ | yoyopete 6 hours ago | parent | prev | next [-] |
| Oh look! Corrupt Dang made another launch HN a top post. So much corruption on this website. |
| |
|
| ▲ | fabmilo 7 hours ago | parent | prev [-] |
| Hiring humans to do a consistent job is gonna be a nightmare and a limit on the scalability of the service. How are you defining your service level agreements? |
| |
| ▲ | exhaze 6 hours ago | parent | next [-] | | This really makes you take a step back and just consider the world we're in now: someone critiques a company's approach as unscalable because... "hiring humans is a nightmare" Good LLord | |
| ▲ | fortysixpercent 7 hours ago | parent | prev [-] | | They aren't providing the humans. Just the tools for integrating human input/oversight. | | |
| ▲ | dhorthy 2 hours ago | parent [-] | | this is correct - I think helping you BYO humans will help you get much better training/labeling than outsourcing anyways, and that's the end vision of all of this - use humans to train agents so someday you might not need human in the loop, and those humans can move on to training/overseeing the next agent/application you're building |
|
|