camel-cdr a day ago

Disallowing /[^\t]\t/ and /^ / is a good start.
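A minimal sketch of that check in Python, assuming it is applied one line at a time with the trailing newline already stripped. The function name `check_line` and the message strings are made up for illustration.

```python
import re

# A tab that follows any non-tab character, i.e. a tab used anywhere
# other than in the leading run of indentation.
TAB_AFTER_NON_TAB = re.compile(r"[^\t]\t")
# A line that starts with a space, i.e. spaces used for indentation.
LEADING_SPACE = re.compile(r"^ ")

def check_line(line):
    """Return a list of rule violations for one line (empty if clean)."""
    problems = []
    if TAB_AFTER_NON_TAB.search(line):
        problems.append("tab after non-tab")
    if LEADING_SPACE.search(line):
        problems.append("leading space")
    return problems
```

Under these two rules, a line made of leading tabs followed by spaces passes, whatever the number of tabs.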

ytpete a day ago | parent [-]

These regexes don't solve what I think is one of the major common problems/complaints though: using extra tabs past the logical indent point as a shortcut to avoid typing so many spaces for alignment purposes.

To take this example from a sibling post:

  if (foo) {
  »   frobnicate(bar,
  »   ...........baz);
  }

Many people will wind up doing this:

  if (foo) {
  »   frobnicate(bar,
  »   »   »   ...baz);
  }

And then your alignment is all messed up if you have a different tab-width setting.

Checking for that requires something more like a linter with a detailed understanding of the syntax parse tree.
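To make the breakage concrete, here is a small Python sketch that expands both versions at a few tab widths and compares where `baz` lands with the column just past the opening parenthesis. It assumes `»` above stands for a tab and each `.` for a space; the helper names are hypothetical.

```python
def indent_width(line, tabsize):
    """Display column where the first non-whitespace character lands."""
    expanded = line.expandtabs(tabsize)
    return len(expanded) - len(expanded.lstrip(" "))

def open_paren_target(line, tabsize):
    """Display column just past the first '(' on the line."""
    return line.expandtabs(tabsize).index("(") + 1

CALL = "\tfrobnicate(bar,"
ALIGNED = "\t" + " " * 11 + "baz);"      # one tab to indent, spaces to align
SHORTCUT = "\t\t\t" + " " * 3 + "baz);"  # extra tabs standing in for spaces

def report(tabsize):
    """(target column, column of aligned baz, column of shortcut baz)."""
    return (
        open_paren_target(CALL, tabsize),
        indent_width(ALIGNED, tabsize),
        indent_width(SHORTCUT, tabsize),
    )

for tabsize in (2, 4, 8):
    print(tabsize, report(tabsize))
```

At a tab width of 4 all three columns agree. At 2 or 8 the first version still matches the target, and only the shortcut version drifts.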

mananaysiempre a day ago | parent | next [-]

There are degrees of “detailed”.

For example, I considered writing a small Awk program to check C code in response to another poster’s complaint about lack of tooling, but then quickly came to the conclusion that, with C’s insistence that (say) /??/<newline>* is a valid comment starter, getting this exactly correct probably does need an actual lexer that would go character by character. That sounded like it wouldn’t fit in an HN comment, so I stopped there.

(That said, that’s as far as you’d need to go in the majority of cases. A dishonorable mention is warranted for languages that use the same character as an operator and a paired delimiter simultaneously, that is C++, Java, C#, TypeScript, and Rust with their abuse of the less-than and greater-than symbols, because that would in fact require a parser. In C++ especially, you’ll need full semantic analysis with template expansion, name resolution, and consteval evaluation. Because C++.)

Yet you probably don’t actually need to be that accurate, do you? The majority of syntax highlighters aren’t, and they are still useful. You can usually afford to say that code that perverse deserves to lose, and in return I expect you should be able to gain a fair amount of language independence, which could be worth the tradeoff.

So instead of checking if things are aligned with what they should be, you would just check they are aligned with something, like a left word boundary preceded by a delimiter, and so on. I can already see unpleasant corner cases after thinking about it for a few minutes, but it doesn’t look hellish yet; it looks like something you could experiment with over a weekend to see if it was viable.
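A rough Python sketch of what the "aligned with something" check might look like, under heavy assumptions: it only compares a line with the one directly above it, it treats a fixed set of characters as delimiters, and it ignores strings and comments entirely. The names `alignment_ok` and `DELIMITERS` are invented for this example.

```python
import re

LEADING = re.compile(r"^(\t*)( *)")
DELIMITERS = set("([{,=:")  # assumed set; a real tool would make this configurable

def alignment_ok(prev, line):
    """True if `line` has no alignment spaces after its tabs, or if those
    spaces line up with a token on `prev` that follows a delimiter."""
    tabs, spaces = LEADING.match(line).groups()
    if not spaces:
        return True  # pure tab indentation, nothing to align
    prev_tabs = LEADING.match(prev).group(1)
    if len(prev_tabs) != len(tabs):
        # Different tab prefixes: the alignment depends on the tab width.
        return False
    # Same tab prefix, so character offsets compare directly.
    col = len(tabs) + len(spaces)
    if col >= len(prev) or prev[col].isspace():
        return False
    before = prev[:col].rstrip(" ")
    return bool(before) and before[-1] in DELIMITERS
```

With the `frobnicate` example, the one-tab-plus-spaces continuation passes and the three-tab shortcut fails because its tab prefix differs from the line above. Continuations that span more than two lines are one of the corner cases this does not handle.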

g-b-r a day ago | parent | prev [-]

At some point you just have to flog people ;)

Anyhow, if code reviewers always use tools that highlight the tabs during the reviews, there's a good chance of catching these things.

Maybe you could also have the tab width set randomly at every review, to make these horrors stand out
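As a sketch of the random-width idea in Python: pick a tab width at random and expand the text under review with it. The candidate widths and the name `render_for_review` are arbitrary choices for illustration, not taken from any review tool.

```python
import random

TAB_WIDTHS = (2, 3, 4, 8)  # arbitrary candidates

def render_for_review(text, rng=random):
    """Expand tabs at a randomly chosen width; returns (width, rendered_text)."""
    width = rng.choice(TAB_WIDTHS)
    rendered = "\n".join(line.expandtabs(width) for line in text.split("\n"))
    return width, rendered
```

Code that indents with tabs and aligns with spaces looks the same at every width, so only the tab-for-alignment lines would jump around between reviews.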