Show HN: Han – A Korean programming language written in Rust(github.com)
109 points by xodn348 5 hours ago | 66 comments

A few weeks ago I saw a post about someone converting an entire C++ codebase to Rust using AI in under two weeks.

That inspired me — if AI can rewrite a whole language stack that fast, I wanted to try building a programming language from scratch with AI assistance.

I've also been noticing growing global interest in Korean language and culture, and I wondered: what would a programming language look like if every keyword was in Hangul (the Korean writing system)?

Han is the result. It's a statically-typed language written in Rust with a full compiler pipeline (lexer → parser → AST → interpreter + LLVM IR codegen).

It supports arrays, structs with impl blocks, closures, pattern matching, try/catch, file I/O, module imports, a REPL, and a basic LSP server.

This is a side project, not a "you should use this instead of Python" pitch. Feedback on language design, compiler architecture, or the Korean keyword choices is very welcome.

https://github.com/xodn348/han

dwg 21 minutes ago | parent | next [-]

@apt-apt-apt-apt pointed out in a separate comment that: >A simple translation of keywords seems straightforward, I wonder why it's not standard.

I replied that for Japanese at least, probably due to symbol input being too tedious. However I think it's worth mentioning a potential mitigation, and maybe even an advantage.

As a mitigation, full-width symbols could be used instead, which are typically the default in Japanese input. Japanese itself is also fixed-width so if done across the board the language itself becomes fixed-width, not just by virtue of a font selection.

As an advantage, some logical symbols, Greek letters, and other rare characters are easy to input in Japanese mode, and this could lend itself to a more symbol-heavy language design. I already take advantage of this myself, using Δ, φ, and τ selectively in a few projects. Symbols with easy entry may differ by OS, but here are a few other examples that could be useful:

≠, ≡, ∞, ∴, λ, θ, α, β, ・, °, ※
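One way the full-width mitigation above could work in practice is for a lexer to fold full-width symbols back to ASCII before tokenizing. This is only a sketch, assuming Unicode NFKC normalization is an acceptable preprocessing step; the helper name `normalize_source` is made up for illustration, and a real lexer would probably want to leave string literals untouched.

```python
import unicodedata


def normalize_source(src: str) -> str:
    """Fold full-width compatibility characters to their ASCII forms.

    Hypothetical preprocessing step, not taken from any real compiler.
    NFKC maps full-width punctuation, digits, Latin letters, and the
    ideographic space to their ordinary counterparts.
    """
    return unicodedata.normalize("NFKC", src)


# Full-width input, as typed in a Japanese input mode.
line = "ｘ　＝　（１　＋　２）；"
print(normalize_source(line))  # x = (1 + 2);
```

Kana and kanji pass through unchanged, so identifiers written in Japanese would survive this step.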

m-hodges 2 hours ago | parent | prev | next [-]

When I was studying Computer Science in college, I once remarked how lucky we, English speakers, are that programming languages use English nouns and verbs. A ton of my classmates were here on a student visa, and English was not their first language. I always thought that programming in English put me at an advantage on the learning curve. I also always thought it was silly when someone would quip that programming should count for “foreign language” credit. Anyway, always cool to see non-English programming languages.

localuser13 an hour ago | parent | next [-]

At the risk of going against the hivemind, I disagree.

I self-taught programming quite early in my life, way before I had a good command of the English language. I've read books in my native language, talked on programming forums in my native language. In the end the "english" in programming languages is just a handful of keywords, and it didn't hinder me one bit that I had no idea "int" stands for "integer".

Of course, I started by writing code like "bool es_primo(int numero)" (in my language), but there's nothing in C that says identifiers must be English, that's just convention. The standard library and packages would be a problem nowadays, but back then standard libraries were thin, and a name like "strcpy" is obscure anyway. The real hard part was always learning how to program and design properly.

And for more advanced topics, documentation and learning materials that exist only in English are a HUGE problem for ESL speakers, because one has to actually read and understand them. But this is not something a programming language can help with.

a57721 an hour ago | parent | next [-]

I have a similar experience, I learned English much later than my first programming languages, and picking up some keywords and basic APIs was never an issue (it was BASIC and C/C++ at the time). Maybe I would occasionally look up in a dictionary what is 'needle' and 'haystack' in a code snippet, and I was puzzled by the ubiquitous "foo, bar, baz", which to my relief turned out to be equally cryptic for the native speakers. I still don't feel about code as a kind of English prose, it occupies a separate part of my brain, compared to the natural languages.

xodn348 21 minutes ago | parent | prev [-]

I agree with your opinion, and I was wondering how Korean could be used in a world full of English. Thanks for your feedback!

kccqzy 7 minutes ago | parent | prev | next [-]

That’s the least of their problems. The best computer science textbooks are published first and foremost in English and only translated belatedly. The research papers are in English and not often translated. Even the manuals of both commercial and FOSS programming tools tend not to be translated. A few keywords is what, half an hour of rote memorization.

deepsun an hour ago | parent | prev | next [-]

Naah, my non-english-speaking friends say that the keywords are less than 1% complexity of a programmer's job, so it really doesn't matter.

Also, in most languages you already can name variables/classes/members in any Unicode letters. So only "if/for/while" keywords and stdlib classes remain English. It makes little sense to translate those.

thisislife2 2 hours ago | parent | prev | next [-]

True. English is a major reason why India is the IT back-office for most of the western world. I too have personally observed how my fellow classmates, who had done their schooling in their regional language, struggled with the coursework in college because it was solely in English. And some of them were state rankers - it felt bad to realise that they had to put in twice the effort needed to keep up their grades. I think there's a lot of potential wasted in India because of this kind of hardship / struggle - a lot of intelligent people are held back just because they lack an aptitude for multilingualism.

xodn348 2 hours ago | parent | prev [-]

Thank you for your empathy. English has been one of the most widely used languages around the globe, though, so it is reasonable to use English in many coding projects.

cubefox an hour ago | parent [-]

It may also be reasonable to make localized translations for a programming language. This is rarely done in reality, for obvious reasons. One exception is Excel's function names. People who don't know English, or hardly know it, appreciate it.

danparsonson 3 hours ago | parent | prev | next [-]

Wonderful! What a cool idea. For anyone interested, you can learn the whole of Hangul in an afternoon; it's cleverly designed to be very logical and has some handy mnemonics: https://korean.stackexchange.com/a/213

bryanhogan an hour ago | parent | next [-]

These are really cool! Will also add a version of these mnemonics to the Korean guide I have been writing: https://tolearnkorean.com/

Learning the Korean alphabet (Hangul) can be done quite quickly, it's only about as many "letters" as the English alphabet!

Remembering the words is a bit more difficult though, especially if you don't know a similar language. Have been using Anki and my own app for that: https://game.tolearnkorean.com/

xodn348 3 hours ago | parent | prev | next [-]

That is deep knowledge that even native Korean speakers would not know. I will add this site as a reference on GitHub. I am glad to have you as a supporter!

zdragnar 29 minutes ago | parent [-]

Really? That's how it was taught to me by Korean teachers at university. Even if it isn't a daily-useful bit of info, it's such a fundamental component of the written form that I would have expected it to be common knowledge.

xodn348 3 hours ago | parent | prev [-]

Just added that link to the README — it fits perfectly in the "Beauty of Hangul" section.

parksb 31 minutes ago | parent | prev | next [-]

Great work :) If you're interested in Korean programming languages, there's a functional one called 'Nuri': https://github.com/suhdonghwi/nuri/

Rather than just translating keywords, it lets you write code that actually uses Korean grammar. For example, "10을 5로 나누고 출력하다" (literally "10 by 5 divide and print") outputs "2".

You might already know this, but there's also a Korean programming language called 'Yaksok'. Here's a 2048 written entirely in Korean: https://github.com/yaksok/yaksok/blob/master/code_examples/2...

xodn348 23 minutes ago | parent [-]

That is fair feedback. I know those languages, and they are reasonable, well-designed languages. But I wanted to focus first on a Rust-like syntax that is familiar to English speakers, which could give this language a bigger user base. Thanks for your feedback!

apt-apt-apt-apt 3 hours ago | parent | prev | next [-]

A simple translation of keywords seems straightforward, I wonder why it's not standard.

    # def two_sum(arr: list[int], target: int) -> list[int]:
    펀크 투섬(아래이: 목록[정수], 타개트: 정수) -> 목록[정수]:
    # n = len(arr)
    ㄴ = 길이(아래이)

    # start, end = 0, n - 1
    시작, 끝 = 0, ㄴ - 1
    # while start < end:
    동안 시작 < 끝:

Code would be more compact, allowing things like more descriptive keywords e.g. AbstractVerifiedIdentityAccountFactory vs 실명인증계정생성, but we'd lose out on the nice upper/lowercase distinction.

I hear that information processing speed is nearly the same across all languages though regardless of density, so in terms of processing speed, may not make much difference.

csande17 2 hours ago | parent | next [-]

It's been tried with Chinese Python back in the early 2000s: http://reganmian.net/blog/2008/11/21/chinese-python-translat...

It never really took off. I think because computers already require users to read and type Latin letters in lots of other situations, and it's not that hard to learn what a few keywords mean, so you might as well stick with the English keywords everyone else is using.

UltraSane 2 hours ago | parent [-]

It might not be hard to learn the keywords in an English programming language but it seems hard to learn something like Spring Boot

xodn348 3 hours ago | parent | prev | next [-]

Good point about compactness — 실명인증계정생성 vs AbstractVerifiedIdentityAccountFactory is a real example where Korean shines.

One distinction though: Han uses actual Korean words, not transliterations. 함수 means "function" in Korean, 만약 means "if" — they're real words Korean speakers already know.

Your example uses transliterations like 펀크 and 아래이 which would look odd to a Korean reader. That difference matters for readability.
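For readers wondering what a keyword table built on real Korean words might look like at the lexer level, here is a rough sketch. It is written in Python for brevity rather than Han's Rust, it uses only the two keywords named above (함수 and 만약), and the token names and the `tokenize` function are invented for illustration, so it should not be read as Han's actual lexer.

```python
import re

# Only the two keywords mentioned in the comment above; token names are hypothetical.
KEYWORDS = {"함수": "FUNCTION", "만약": "IF"}

# A number, a word (Python's \w matches Hangul syllables), or any single symbol.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\w+)|(.))")


def tokenize(src: str) -> list[tuple[str, str]]:
    """Split source text into (kind, text) pairs."""
    tokens = []
    for number, word, symbol in TOKEN_RE.findall(src):
        if number:
            tokens.append(("NUMBER", number))
        elif word:
            # Words found in the keyword table become keywords, the rest identifiers.
            tokens.append((KEYWORDS.get(word, "IDENT"), word))
        elif symbol.strip():
            tokens.append(("SYMBOL", symbol))
    return tokens


print(tokenize("만약 값 > 0"))
```

Because the keyword check is a plain dictionary lookup on whole words, a transliteration like 펀크 would simply come out as an identifier here.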

WillAdams 2 hours ago | parent [-]

Using actual Korean words rather than transliterations greatly aids readability --- I can still remember stumbling over the transliteration of "Walker Hotel" when taking Korean at the Presidio of Monterey, and pretty much everyone else had the same problem.

dwg 30 minutes ago | parent | prev | next [-]

I can't speak to Korean, but thinking about Japanese, one probable reason it wouldn't catch on is how tedious and inefficient it would be to constantly switch between input modes. Japanese input mode is designed for prose, and isn't well-suited to efficiently entering the symbols commonly used in programming. Even spaces. It results in needing a lot of extra keystrokes.

sheept 2 hours ago | parent | prev | next [-]

Scratch supports Korean, but Scratch benefits from using JSON instead of bytes or code points to serialize programs, which allows the user to change their display language (similar to how hard tabs let users set indentation size).
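The idea of storing a program in a language-neutral form and picking the display language later can be sketched roughly as follows. The opcodes, JSON layout, and label strings here are simplified stand-ins made up for illustration, not Scratch's real file format or its official translations.

```python
import json

# Hypothetical, simplified program: language-neutral opcodes plus arguments.
program = json.loads(
    '[{"op": "repeat", "args": [10]}, {"op": "move", "args": [5]}]'
)

# Per-language display labels (illustrative strings only).
LABELS = {
    "en": {"repeat": "repeat {} times", "move": "move {} steps"},
    "ko": {"repeat": "{}번 반복하기", "move": "{}만큼 움직이기"},
}


def render(blocks: list[dict], lang: str) -> list[str]:
    """Render the same stored program in the chosen display language."""
    return [LABELS[lang][b["op"]].format(*b["args"]) for b in blocks]


print(render(program, "en"))
print(render(program, "ko"))
```

The stored program never changes; only the label table does, which is the property a text-based language loses once keywords are baked into the source file.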

There's probably a lot of reasons why non English programmers stick with English keywords, beyond just language/tooling support. Learning new keywords is already part of learning a programming language, and much of the documentation and resources available for languages and libraries are only in English. ASCII-only strings are still ubiquitous in software, like URLs and usernames. And in international teams, English is the go-to lingua franca.

Could this change with LLMs? Maybe, but most code in its training data is in English, so LLMs likely work most effectively in English.

xodn348 3 hours ago | parent | prev [-]

funny examples, though.

zellyn 2 hours ago | parent | prev | next [-]

I love this. Nice work!

It’s fun to look at your code samples, have absolutely no clue what any of it means, and think about just how many non-English-speaking programmers must have felt that way looking at our all-English programming languages.

Except lisp: that’s just inscrutable symbols like cond and cons and car and cadr and a bunch of parens! :-)

woctordho 2 hours ago | parent | next [-]

The real barrier is not just the language keywords but the large amount of documentation and discussion in English. I'm not sure whether there is a solution to this.

xodn348 2 hours ago | parent [-]

These tons of English documents and content cannot be translated all at once, but this project is an attempt at using another language going forward. Thanks for the comment, though.

xodn348 2 hours ago | parent | prev [-]

Haha, using English has been reasonable for decades. Thank you very much for your comments.

ovciokko 2 hours ago | parent | prev | next [-]

This is indeed a cool project! Happy to see experiments on non-English programming languages. I have one question (not trying to be offensive or doubting, just out of curiosity): does Han make use of the unique properties of Hangul (or Korean in general)? Like, I remember seeing a Turkish programming language on HN the other day, and I might be wrong, but my impression was that it makes use of some syntax unique to Turkish, and I wonder if Han has similar features. Or, asking it differently: if I replace only the lexer with another lexer recognizing a different script, will it still work?

xodn348 2 hours ago | parent [-]

Honest answer: right now it's mostly a keyword translation with English-like syntax order. Korean is SOV (subject-object-verb) but Han follows English SVO order — 목록.추가(값) reads like "list.add(value)" not the Korean natural order. Changing that would require a fundamentally different syntax design, which is an interesting challenge for the future.

That said, a few things do lean into Korean specifically:

- Method names are real Korean verbs: .추가() (add), .삭제() (delete), .분리() (split)

- Error messages are in Korean

- The REPL prompt is 한> and exit command is 나가기 (literally "go out")

Good question — it pushed me to think about what makes this more than just s/function/함수/g.

ovciokko an hour ago | parent [-]

Thank you for replying! As a non-English speaker too, I always love to see people trying out new things different from the English mindset. Hangul is a very cool writing system and I'd love to see Han live and evolving.

clark1013 14 minutes ago | parent | prev | next [-]

The name reminds me of the character Han from Broken Sisters.

all2 32 minutes ago | parent | prev | next [-]

So... iirc Korean words are constructed out of symbols, would it be possible to mutate the meaning of keywords by giving the symbols meaning and constructing new blocks of symbols?

xodn348 18 minutes ago | parent [-]

That's a fascinating idea! Hangul is indeed compositional — 한 = ㅎ + ㅏ + ㄴ — so in theory you could assign meaning to individual jamo components.
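The composition 한 = ㅎ + ㅏ + ㄴ can be recovered with plain arithmetic, because precomposed Hangul syllables are laid out in Unicode as 19 initials × 21 medials × 28 finals starting at U+AC00. A minimal sketch, with `decompose` being a name chosen here for illustration:

```python
# Compatibility jamo in Unicode syllable order.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"        # 19 initial consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"     # 21 vowels
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28, incl. none

SYLLABLE_BASE = 0xAC00   # code point of 가
SYLLABLE_COUNT = 19 * 21 * 28


def decompose(syllable: str) -> tuple[str, str, str]:
    """Split one precomposed Hangul syllable into (initial, medial, final)."""
    index = ord(syllable) - SYLLABLE_BASE
    if not 0 <= index < SYLLABLE_COUNT:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    initial, rest = divmod(index, 21 * 28)
    medial, final = divmod(rest, 28)
    return INITIALS[initial], MEDIALS[medial], FINALS[final]


print(decompose("한"))  # ('ㅎ', 'ㅏ', 'ㄴ')
```

A syllable without a final consonant, such as 수, comes back with an empty string in the third slot.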

But in practice, breaking syllables into jamo would make keywords less readable, which goes against Hangul's design goal. And considering how AI-assisted coding works today, fully named descriptive keywords actually reduce errors — LLMs perform better with explicit, unambiguous tokens than with cryptic symbol compositions.

So Han leans toward more descriptive Korean keywords rather than shorter symbolic ones. Readability over brevity.

Interesting direction to think about though — thanks for the question.

lgessler an hour ago | parent | prev | next [-]

I know this is mostly about keyword substitution but it still tickles me that you still write f(x) in this language and not (x)f given that Korean is SOV but I guess that's just how you notate that no matter what cultural context you're in. Hadn't ever considered that the convention of writing a function before its arguments might have been a contingency of this notation being developed by speakers of SVO languages.

localuser13 an hour ago | parent | next [-]

I think this notation is superior because of syntax completion: get_name(user.id) can be completed by an IDE, (user.id)get_name can't. Just like "SELECT id, name FROM users" would be better off as "FROM users SELECT id, name" (LINQ in C# fixed this mistake, and most modern query languages do too).

borski an hour ago | parent [-]

…if you’re typing from left to right. :)

cubefox an hour ago | parent | prev [-]

Object oriented programming languages also use object.method rather than method(object), so I don't think prefix/suffix notation has much to do with language.

xukeek an hour ago | parent | prev | next [-]

Curious about the name — if the language uses Korean (Hangul) keywords, why call it “Han”? In Latin letters that usually reads like the Chinese 汉 (Han) / pinyin rather than Korean. Is there a specific reason for that choice?

water_badger 3 hours ago | parent | prev | next [-]

fun fact, you can easily write c in any language you want through the power of macros

https://github.com/farant/rhubarb/blob/main/include/latina.h

edit: oh, maybe you can’t do full unicode. that’s too bad!

xodn348 3 hours ago | parent [-]

Ha, neat trick. But macro substitution and a purpose-built language are very different — Han has a full pipeline (lexer → parser → AST → interpreter + LLVM codegen) designed around Korean from the ground up.

Error messages, REPL, LSP hover docs are all in Korean. You can't get that from #define 만약 if.

water_badger 3 hours ago | parent [-]

yeah, making a whole language is way more impressive!

anecdotally it is also interesting to use with ai because apparently it is "harder to be on autopilot" based on a huge pre-existing corpus of code when you write it in a different language. could activate different reasoning regions somehow.

(i just appreciate what can be trivially accomplished in c even if it's kind of janky after spending way too much time in the JS preprocessor mines...)

raaspazasu 4 hours ago | parent | prev | next [-]

I don't know Korean at all, but this looks cool and a fun project. I'm curious if this reduces typing or has any benefits being in Hangul vs Latin?

xodn348 4 hours ago | parent | next [-]

Thanks! One thing that motivated me was curiosity about prompt efficiency in the AI era. Hangul is beautifully dense — a single syllable block packs initial consonant + vowel + final consonant into one character. I wondered if Korean-keyword code might produce shorter prompts for LLMs.

I actually tested this with GPT-4o's tokenizer, and the result was the opposite — Korean keywords average 2-3 tokens vs 1 for English. A fibonacci program in Han takes 88 tokens vs 54 in Python.

The reason comes down to how LLM tokenizers work. They use BPE (Byte Pair Encoding), which starts with raw bytes and repeatedly merges the most frequent pairs into single tokens. Since training data is predominantly English, words like `function` and `return` appear billions of times and get merged into single tokens.

Korean text appears far less frequently, so the tokenizer doesn't learn to merge Hangul syllables — it falls back to splitting each character into 2-3 byte-level tokens instead.

It's a tokenizer training bias, not a property of Hangul itself. If a tokenizer were trained on a Korean-heavy corpus, `함수` could absolutely become a single token too.
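The training-bias effect described above can be reproduced with a toy byte-level BPE. This is a deliberately tiny sketch, not GPT-4o's tokenizer: the corpora, the merge budget of 5, and the function names are all made up for illustration, and real tokenizers add pre-tokenization and far larger vocabularies.

```python
from collections import Counter

Word = tuple[bytes, ...]


def merge_pair(word: Word, pair: tuple[bytes, bytes]) -> Word:
    """Replace every adjacent occurrence of `pair` in `word` with one merged token."""
    out, i = [], 0
    while i < len(word):
        if i + 1 < len(word) and (word[i], word[i + 1]) == pair:
            out.append(word[i] + word[i + 1])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return tuple(out)


def to_bytes(text: str) -> Word:
    """Start from raw UTF-8 bytes, one token per byte."""
    return tuple(bytes([b]) for b in text.encode("utf-8"))


def train_bpe(corpus: list[str], num_merges: int) -> list[tuple[bytes, bytes]]:
    """Learn merges by repeatedly fusing the most frequent adjacent pair."""
    words = [to_bytes(w) for w in corpus]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for w in words:
            pairs.update(zip(w, w[1:]))
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        words = [merge_pair(w, best) for w in words]
    return merges


def encode(text: str, merges: list[tuple[bytes, bytes]]) -> Word:
    """Tokenize text by applying the learned merges in training order."""
    word = to_bytes(text)
    for pair in merges:
        word = merge_pair(word, pair)
    return word


# English-heavy toy corpus: the small merge budget is spent entirely on "return".
english_heavy = train_bpe(["return"] * 50 + ["함수"], num_merges=5)
print(len(encode("return", english_heavy)), len(encode("함수", english_heavy)))  # 1 6

# Korean-heavy toy corpus: the same algorithm now fuses 함수 into one token.
korean_heavy = train_bpe(["함수"] * 50 + ["return"], num_merges=5)
print(len(encode("return", korean_heavy)), len(encode("함수", korean_heavy)))  # 6 1
```

Each Hangul syllable is three bytes in UTF-8, so with no learned merges 함수 stays at six byte tokens, which is the fallback behavior the comment describes.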

So no efficiency benefit today. But it was a fun exploration, and Korean speakers can read the code like natural language. It could also be a fun way for people learning Korean to practice reading Hangul in a different context — every keyword is a real Korean word with meaning.

ralferoo 3 hours ago | parent | next [-]

I don't know how to read Hangul (I know the general idea about how the character is composed). To me just looking at the examples, it doesn't seem as obvious what the structure of the code is, compared to Latin letters and punctuation. Actually, most punctuation looked OK, but the first couple of examples used arrays and [ and ] seemed to just blend in with the identifiers wherever they appeared. I'm not sure how distinct they look with familiarity with Hangul characters. I'm sure it's also nothing that colour syntax highlighting wouldn't make easier.

xodn348 3 hours ago | parent [-]

Fair point that [ ] can blend in.

For Korean readers the character systems look quite different, but I can see how it's hard to parse visually without familiarity.

As you said, syntax highlighting helps a lot — there's a colored screenshot at the top of the README showing how it looks in practice.

topce 4 hours ago | parent | prev [-]

Very Interesting...

I had a similar idea: train an LLM in Serbian, and even create a new encoding, https://github.com/topce/YUTF-8, inspired by YUSCII. I did not have the time or money ;-) Great that you succeeded. The idea is that if you train on Serbian text encoded in YUTF-8 (not UTF-8), prompts in Serbian will need fewer tokens than prompts in English. Also, Serbian Cyrillic characters are 1 byte in YUTF-8 instead of 2 in UTF-8. The Serbian language is phonetic, so we never ask how you spell something. It has both Latin and Cyrillic letters.

xodn348 4 hours ago | parent [-]

Really interesting approach — attacking token efficiency at the encoding level is more fundamental than what I did.

Even without retraining BPE from scratch, starting with YUTF-8 and measuring how existing tokenizers handle it would already be a worthwhile experiment.

Hope you find the time to build it, good luck!

bbrodriguez 4 hours ago | parent | prev [-]

Korean doesn’t reduce typing compared to English, in my experience. What looks like a “character” is actually a syllable block called “eumjeol” that’s made up of consonants (moeum) and vowels (jaeum). You can’t have a vowel-only syllable either, so you always have to pair it with a null consonant no matter what (which kinda looks like a zero: ㅇ), and while nouns can be much more concise compared to English, verbs can get verbose.

The main benefit of Korean actually comes from the fact that the language itself fits perfectly into a standard 27 alphabet keys and laid out in such a way that lets you type ridiculously fast. The consonant letters are always situated in the left half and the vowels are in the right half of the keyboard. This means it is extremely easy to train muscle memory because you’re mostly alternating keystrokes on your left hand and right hand.

Anecdotally, I feel like when I’m typing in English, each half of my brain needs to coordinate more. When I’m typing in Korean, the right brain only needs to remember the consonant positions for my left hand and my left brain only needs to remember the vowel positions.

lunarboy 2 hours ago | parent | next [-]

It's fascinating and lucky how well Korean fits into a keyboard designed for English. Japanese and Chinese are notoriously inefficient to type.

xodn348 3 hours ago | parent | prev [-]

만나서 반가워요!

What you said is mostly right, and I did not know that about typing in Korean, with the left-hand side and right-hand side. Btw, it's consonant (jaeum) and vowel (moeum).

Experience-wise, what you described sounds accurate.

WillAdams 2 hours ago | parent | prev | next [-]

Have you looked into whether there are any Hanja (Chinese characters) which would be sufficiently expressive to warrant supporting as an alternative way to represent keywords?

Perhaps look to APL for efficient ways to represent math concepts/structures?

xodn348 2 hours ago | parent [-]

I personally don't know Hanja at all, and I think that's common for most younger Koreans. Korean did borrow from Chinese characters historically, but it's similar to how English was influenced by Dutch, German, French, and Latin — each language developed independently.

Korean has its own pure Korean words (순우리말) as its foundation, and borrowed some Chinese-origin vocabulary on top of that.

Hangul was specifically created so people wouldn't need to learn Chinese characters.

So Han's keywords use native Korean words where possible — it fits the spirit of Hangul itself.

anesxvito 3 hours ago | parent | prev | next [-]

Really cool to see more developer tools built in Rust. I've been using Rust for a desktop app backend (Tauri v2) and the performance difference vs Electron is night and day — native memory usage, instant startup. Curious what the compile times look like for Han compared to rustc itself.

anigbrowl an hour ago | parent | prev | next [-]

I wonder why there are not more programming languages not only with non-English keywords, but with different grammars. For example, if X then Y and many other constructions closely follow English grammar, but when you study other languages you quickly become aware of many other possible constructions.

dmurray an hour ago | parent [-]

I don't think programming languages follow English grammar or syntax closely.

If X then Y, sure. While X, do Y? Maybe. For X equals Y, X equals Z, X is incremented, do A? Hardly. Match X case Y1 Z1 case Y2 Z2? Definitely not

Native English speakers have a small leg up understanding the vocabulary, not the syntax.

bouncycastle 2 hours ago | parent | prev | next [-]

I'd imagine that a transpiler for Perl with this idea would make one-liners even more potent.

technol0gic 3 hours ago | parent | prev | next [-]

i only code in this when no ones around. one might say I...han solo

xodn348 3 hours ago | parent [-]

Force for good be with you

marysminefnuf 3 hours ago | parent | prev | next [-]

My dream is to one day make a chaldean programming language for my kids. Stuff like this is inspiring

xodn348 3 hours ago | parent [-]

The fact that you're already thinking about it means you can do it. Go for it!

tw1984 an hour ago | parent | prev | next [-]

The name is interesting, it is just like some aussies creating a new language calling it Anglo.

sudo_cowsay 2 hours ago | parent | prev | next [-]

Imagine if you had to work with a Korean company using this

Amazing though!

AndrewKemendo 2 hours ago | parent | prev | next [-]

I’ve always wondered why there weren’t more non-english charactered programming languages but I can only assume it was just inertia

This seems like a reasonably good security measure too
