imiric 2 days ago

> The application layer should not inject error codes into the transport layer which is what HTTP is in this case.

Huh? HTTP is an application layer protocol. It's perfectly acceptable for the application to return a non-200 status code when the request is invalid and can't be processed. There's a widely accepted status code for that exact scenario: 400 Bad Request. It informs the client that there was something wrong with their request, and in well-designed APIs, reading the response body would tell them the reason why. It would be wasteful for the client to always read the response and parse structured data to decide whether the request was successful (at the application level) or not. Status codes allow us to do that.

That said, I've seen arguments for and against this practice, as sibling comments mention, and ultimately consistency and documentation are more important than semantics.

The reason this line is blurry nowadays is that in the beginning web servers didn't contain complex logic. The web server was the application. Then came CGI scripts and application servers, and suddenly the application itself was making protocol-level decisions. Large applications typically handle this with protocol-level abstractions that translate app-level errors into HTTP errors. But in small applications it's acceptable, though unsightly, to have HTTP logic mixed with business logic.
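
A rough sketch of that kind of translation layer, in Python. The exception classes and the `to_http_status` helper are hypothetical names for illustration; only `http.HTTPStatus` comes from the standard library.

```python
from http import HTTPStatus


class AppError(Exception):
    """Base class for errors raised by the business logic (hypothetical)."""


class InvalidInput(AppError):
    """The caller supplied data the application cannot accept."""


class NotFound(AppError):
    """The requested entity does not exist."""


# The only place that knows about HTTP: app-level error type -> HTTP status.
STATUS_BY_ERROR = {
    InvalidInput: HTTPStatus.BAD_REQUEST,
    NotFound: HTTPStatus.NOT_FOUND,
}


def to_http_status(error: Exception) -> HTTPStatus:
    """Translate an application error into an HTTP status code."""
    for error_type, status in STATUS_BY_ERROR.items():
        if isinstance(error, error_type):
            return status
    # Anything unrecognised is treated as a server-side failure.
    return HTTPStatus.INTERNAL_SERVER_ERROR
```

The business logic raises `InvalidInput` or `NotFound` without importing anything HTTP-related, and only the outermost layer calls `to_http_status`.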

> Do you also think that Apache/Nginx should be injecting codes into IP packets?

Web servers do speak TCP/IP, so I'm not sure what your point is. Usually this is not something regular web apps need to be concerned with, but it's possible and sometimes desirable to introduce logic at the TCP or IP layer. There are proxy tools that work at both layer 4 and layer 7.

> If your application injects codes into the HTTP layer, how on earth does the client know whether the error originated at the application or at the reverse proxy/webserver?

By the status code, error message, and headers. An application would typically never return 502 Bad Gateway, a 301/302 redirect, or set headers like Cache-Control. By that same token, a reverse proxy/webserver would typically never override a 404 with a 200, or inject JSON error messages in the payload.

The application ultimately decides the Content-Type of the response, which Content-Types it supports, and which headers it expects, so why shouldn't it also decide which status codes to return and which response headers to set? A gateway between it and the user can change or enhance this protocol, and specific gateways could be extracted to handle common things like authn/authz and load balancing, but the frontend gateway shouldn't override the message the application is sending (in typical circumstances). Both things can coexist with different responsibilities while speaking the same protocol. HTTP is flexible enough to support that.

I'm curious, though: if you treat HTTP as the transport layer, what protocol does your application speak to the gateway? Is there some translation gateway that translates application-level semantics into HTTP ones?

lelanthran 2 days ago

>> The application layer should not inject error codes into the transport layer which is what HTTP is in this case.

> Huh? HTTP is an application layer protocol.

I want to emphasise that "in this case" bit.

HTTP is an application layer protocol when the application in question is a webserver and nothing else.

In the case of REST, HTTP is simply a transport protocol. It is not necessary to use HTTP as the transport for RESTful applications. It's common, convention even, but not required.

> if you treat HTTP as the transport layer, what protocol does your application speak to the gateway?

WSGI, maybe? Sure, you can emit status codes there too, but it will be a different protocol you are talking over, not HTTP.
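
For reference, a minimal sketch of what that looks like using only the Python standard library. A WSGI application reports its status as a string passed to `start_response`, through a plain function call rather than over a socket. The names `app` and `call` and the query-string rule are made up for illustration.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A WSGI application: the gateway invokes it as a Python callable."""
    if environ.get("QUERY_STRING"):
        status, body = "200 OK", b"ok\n"
    else:
        status, body = "400 Bad Request", b"missing query string\n"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]


def call(application, query=""):
    """Invoke the app the way a WSGI server would, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["QUERY_STRING"] = query
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

Here `call(app)` yields the `400 Bad Request` status string and `call(app, "q=1")` yields `200 OK`; turning those strings into an actual HTTP response is left to whichever server hosts the application.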

I've seen gRPC gateways for HTTP REST endpoints too.

> Is there some translation gateway that translates application-level semantics into HTTP ones?

I don't think we should be translating application status codes into HTTP status codes. I mean, sure, I've done it myself plenty of times, but it is a mixing of layers and a mixing of concerns.

The fact is, HTTP semantics are defined for (and in the context of) a webserver, not an application server. That our application server is chatty with HTTP does not place it in the running context of a webserver.

The semantics of HTTP status codes make absolutely no sense when emitted by an application.

You might argue that one of them (or maybe two, if we're being generous), such as "400 Bad Request", should be emitted by the application if (for example) a parameter is missing, but even in that case it makes more sense for the application to send an error-code/error-message so that more information can be given (such as which parameter is missing/invalid, etc).

If you're sending "400" status code for a missing parameter, how will the client know whether the HTTP request was malformed or whether the application input was mangled?

dogma1138 2 days ago

I suggest you read the actual RFC. HTTP status codes are intended to represent the state of your application. HTTP is part of the application layer; it is not a transport layer protocol.

imiric 2 days ago

I get where you're coming from, but I think you're placing too much emphasis on theoretical definitions rather than real world usage.

> HTTP is an application layer protocol when the application in question is a webserver and nothing else.

I haven't heard that definition before, and don't really agree with it.

HTTP is the protocol web servers use to communicate with web clients. Whether the server is serving static files or dynamic content based on complex logic doesn't change this.

> In the case of REST, HTTP is simply a transport protocol. It is not necessary to use HTTP as the transport for RESTful applications. It's common, convention even, but not required.

That's true, but I don't see any practical benefit of this distinction. REST concepts map cleanly to HTTP semantics, and practically all REST deployments use HTTP.

> WSGI, maybe?

I guess so, but WSGI is an abstraction useful for interpreted languages and Python specifically. It was a solution to standardize the deployment of a growing number of web frameworks, and to address the lack of a production-ready HTTP server in Python itself. Other languages and ecosystems don't need this abstraction. It would be like trying to make Java servlets universal. Some approaches are a good fit for some ecosystems, but not for others.

As I mentioned in my previous post, the way this is typically handled in, say, a Go web application, is by having an HTTP layer that acts as an intermediary between the protocol and the application. This way your business logic can remain free from HTTP-specific tasks like serialization, parsing, validation, etc. But if the application is only ever meant to be exposed via HTTP, then there's no harm in avoiding the abstraction, and having it speak HTTP directly. This might not be a good idea for testing and maintainability, but it's fine for small applications.

> I've seen gRPC gateways for HTTP REST endpoints too.

That's different. gRPC builds on top of HTTP, and uses a fundamentally different payload and request mechanism. It requires supported clients to even use it, which is why gateways are useful. But REST over HTTP is still plain HTTP. Clients don't need to be aware that they're talking to a REST endpoint, and REST serves as usage documentation more than anything else.

> The semantics of HTTP status codes make absolutely no sense when emitted by an application.

That depends on the application. If an HTTP endpoint wraps an application call to create a user, and the caller doesn't provide a user name, the application can return an error, which the HTTP endpoint can translate to a 400 status code, including the error message in the payload. OR the HTTP endpoint can do some validation upfront, and immediately return a 400.
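
Both options can be sketched in a few lines of Python. Everything here (`create_user`, `MissingUserName`, `create_user_endpoint`, the `validate_upfront` switch, the JSON shape) is a hypothetical stand-in, not any framework's API.

```python
import json
from http import HTTPStatus


class MissingUserName(Exception):
    """Raised by the application when no user name is supplied."""


def create_user(name):
    """Business logic: knows nothing about HTTP."""
    if not name:
        raise MissingUserName("user name is required")
    return {"name": name}


def create_user_endpoint(raw_body: bytes, validate_upfront: bool = False):
    """HTTP-facing wrapper returning a (status, JSON body) pair."""
    try:
        payload = json.loads(raw_body or b"{}")
    except ValueError:
        return HTTPStatus.BAD_REQUEST, json.dumps({"error": "body is not valid JSON"})
    if not isinstance(payload, dict):
        return HTTPStatus.BAD_REQUEST, json.dumps({"error": "body must be a JSON object"})

    # Option 2: the endpoint validates before calling the application.
    if validate_upfront and not payload.get("name"):
        return HTTPStatus.BAD_REQUEST, json.dumps({"error": "missing parameter: name"})

    # Option 1: the application reports the error, the endpoint translates it.
    try:
        user = create_user(payload.get("name"))
    except MissingUserName as exc:
        return HTTPStatus.BAD_REQUEST, json.dumps({"error": str(exc)})
    return HTTPStatus.CREATED, json.dumps(user)
```

Either way the caller sees a 400 with the reason in the payload, and `create_user` itself never mentions a status code.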

I agree with you that it wouldn't make sense for the application code to return HTTP status codes, but not because it's wrong semantically. I think it's wrong from a design standpoint (separation of concerns). HTTP semantics are limited and can't describe every application concept, but the ones that are there map pretty cleanly, especially when REST is used.

> If you're sending "400" status code for a missing parameter, how will the client know whether the HTTP request was malformed or whether the application input was mangled?

Again, by reading the response body. Just because HTTP status codes don't describe all application errors doesn't mean that it's a good idea to abandon them entirely and always return 200. If the client receives a 400 response, then they can immediately know that something went wrong with the request, and they should inspect the response body for details. Nothing stops the application from also returning custom error codes in the body that uniquely identify the actual reason for the failure, if the clients find this useful.
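
From the client's side, that flow might look like the sketch below. The `code` field and the `missing_parameter` value are invented for illustration; a real API would document its own error format.

```python
import json


def interpret_response(status: int, body: bytes):
    """Decide what happened from the status code first, the body second."""
    if status < 400:
        # 1xx-3xx are not errors, so there is no need to parse the body here.
        return ("ok", None)

    # Only on failure do we parse the structured error details.
    try:
        details = json.loads(body)
    except ValueError:
        details = {}
    if not isinstance(details, dict):
        details = {}
    code = details.get("code", "unknown")

    if status < 500:
        return ("client_error", code)
    return ("server_error", code)
```

A 400 with `{"code": "missing_parameter"}` comes back as `("client_error", "missing_parameter")`, while a proxy-generated 502 with an HTML body falls through to `("server_error", "unknown")`.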

If the request was malformed, then a 400 response would make sense. If the application input was mangled, then the status code will depend on what happened. Was the mangled data part of the request? Then it's still a 400. Was the data mangled during endpoint or application processing? Then a 5xx response would be more suitable.
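
One way to picture that split is by the stage at which things went wrong. This is a deliberately simplified sketch: `handle` and `process` are hypothetical names, and it assumes every failure inside `process` is the server's fault, which a real application would refine.

```python
import json
from http import HTTPStatus


def handle(raw_body: bytes, process):
    """Return 400 when the request itself is bad, 500 when processing fails."""
    # Stage 1: anything wrong here arrived with the request.
    try:
        data = json.loads(raw_body)
    except ValueError:
        return HTTPStatus.BAD_REQUEST

    # Stage 2: the request was fine, so a failure from here on is ours.
    try:
        process(data)
    except Exception:
        return HTTPStatus.INTERNAL_SERVER_ERROR
    return HTTPStatus.OK
```

A truncated JSON body yields 400, while a well-formed body whose processing raises yields 500.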

There are no hard rules for this, and many, many APIs are poorly implemented. But this doesn't mean that applications shouldn't take advantage of the full breadth of HTTP concepts to implement user and computer-friendly interfaces.