zimpenfish 4 days ago

> Nothing?

It breaks. Which is weird because you can create a string which isn't valid UTF-8 (eg "\xbd\xb2\x3d\xbc\x20\xe2\x8c\x98") and print it out with no trouble; you just can't pass it to e.g. `os.Create` or `os.Open`.

(Bash and a variety of other utils will also complain about it not being valid UTF-8; neovim won't save a file under that name; etc.)

yencabulator 3 days ago | parent | next [-]

That sounds like your kernel refusing to create that file, nothing to do with Go.

  $ cat main.go
  package main

  import (
   "log"
   "os"
  )

  func main() {
   f, err := os.Create("\xbd\xb2\x3d\xbc\x20\xe2\x8c\x98")
   if err != nil {
    log.Fatalf("create: %v", err)
   }
   _ = f
  }
  $ go run .
  $ ls -1
  ''$'\275\262''='$'\274'' ⌘'
  go.mod
  main.go

kragen 3 days ago | parent | next [-]

I've posted a longer explanation in https://news.ycombinator.com/item?id=44991638. I'm interested to hear which kernel and which filesystem zimpenfish is using that has this problem.

yencabulator 2 days ago | parent [-]

I believe macOS forces UTF-8 filenames and normalizes them to something near-but-not-quite Unicode NFD.

Windows doing something similar wouldn't surprise me at all. I believe NTFS internally stores filenames as UTF-16, so enforcing UTF-8 at the API boundary sounds likely.

kragen 2 days ago | parent [-]

That sounds right. Fortunately, it's not my problem that they're using a buggy piece of shit for an OS.

commandersaki 3 days ago | parent | prev | next [-]

I'm confused: is Go restricted to UTF-8-only filenames? It can read and write arbitrary byte sequences (which is what a string can hold), which should be sufficient for dealing with other encodings.

yencabulator 3 days ago | parent [-]

Go is not restricted, since strings are UTF-8 only by convention; nothing enforces it.
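
To make that concrete, here is a minimal sketch using the same byte sequence quoted earlier in the thread. The `describe` helper is a made-up name for illustration; `utf8.ValidString` is from the standard library. It only shows that the string holds the raw bytes unchanged and that Go reports them as invalid UTF-8 when asked, without complaining otherwise.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper: it reports the byte length of s
// and whether those bytes form valid UTF-8.
func describe(s string) (int, bool) {
	return len(s), utf8.ValidString(s)
}

func main() {
	// Same bytes as the filename discussed above; the leading 0xbd is a
	// stray continuation byte, so the string is not valid UTF-8.
	name := "\xbd\xb2\x3d\xbc\x20\xe2\x8c\x98"

	n, ok := describe(name)
	fmt.Printf("%d bytes, valid UTF-8: %v\n", n, ok)

	// %q escapes the bytes that are not valid UTF-8, so nothing is lost
	// or silently replaced when printing for humans.
	fmt.Printf("%q\n", name)
}
```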

commandersaki 3 days ago | parent [-]

Then I'm having a hard time understanding the issue in the post; it seems pretty vague. Is there any idea what specific issue is happening? Is it how they've used Go, or does Go have an inherent implementation issue? Specifically these lines:

> If you stuff random binary data into a string, Go just steams along, as described in this post.

> Over the decades I have lost data to tools skipping non-UTF-8 filenames. I should not be blamed for having files that were named before UTF-8 existed.

yencabulator 3 days ago | parent | next [-]

Let me translate: "I have decided to not like something so now I associate miscellaneous previous negative experiences with it"

kragen 3 days ago | parent | prev | next [-]

The post is wrong on this point, although it's mostly correct otherwise. Just steaming along when you have random binary data in a string, as Golang does, is how you avoid losing data to tools that skip non-UTF-8 filenames, or crash on them.
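
A rough sketch of what "steaming along" can look like in a directory-listing tool, under my own assumptions: `displayName` is a hypothetical helper, and the choice to quote invalid names with `strconv.Quote` is just one way to show them. The point is that every entry is still visited, and the original name string stays untouched for any later `os.Open`.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"strconv"
	"unicode/utf8"
)

// displayName is a hypothetical helper: it returns name unchanged when it
// is valid UTF-8, and a quoted, escaped form otherwise. It is only for
// display; callers keep using the original name for file operations.
func displayName(name string) string {
	if utf8.ValidString(name) {
		return name
	}
	return strconv.Quote(name)
}

func main() {
	entries, err := os.ReadDir(".")
	if err != nil {
		log.Fatalf("readdir: %v", err)
	}
	for _, e := range entries {
		// No entry is skipped, whatever bytes its name contains.
		fmt.Println(displayName(e.Name()))
	}
}
```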

comex 3 days ago | parent | prev [-]

Yeah, the complaint is pretty bizarre, or at least unclear.

zimpenfish 3 days ago | parent | prev [-]

> That sounds like your kernel refusing to create that file

Yes, that was my assumption when bash et al also had problems with it.

kragen 3 days ago | parent | prev [-]

It sounds like you found a bug in your filesystem, not in Golang's API, because you totally can pass that string to those functions and open the file successfully.