Show HN: Latex-wc – Word count and word frequency for LaTeX projects (github.com)
10 points by sethbarrettAU 3 days ago | 7 comments

I was revising my proposal defense and kept feeling like I was repeating the same term. In a typical LaTeX project split across many .tex files, it’s awkward to get a quick, clean word-frequency view without gluing everything together or counting LaTeX commands/math as “words”.

So I built latex-wc, a small Python CLI that:

- extracts tokens from LaTeX while ignoring common LaTeX “noise” (commands, comments, math, refs/cites, etc.)

- can take a single .tex file or a directory and recursively scan all *.tex files

- prints a combined report once (total words, unique words, top-N frequencies)

Fastest way to try it is `uvx latex-wc [path]` (file or directory). Feedback welcome, especially on edge cases where you think the heuristic filters are too aggressive or not aggressive enough.
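For anyone curious what this kind of heuristic filtering can look like, here is a rough sketch in Python. It is not the actual latex-wc implementation: the regexes, the list of argument-dropping commands, and the names `tokens` and `count_path` are all assumptions made for illustration, and it knows nothing about environments like `equation` or `verbatim`.

```python
import re
from collections import Counter
from pathlib import Path

# All patterns below are illustrative assumptions, not latex-wc's real rules.
COMMENT = re.compile(r"(?<!\\)%.*")  # unescaped % to end of line
MATH = re.compile(
    r"(?<!\\)\$\$.*?(?<!\\)\$\$|(?<!\\)\$.*?(?<!\\)\$|\\\[.*?\\\]|\\\(.*?\\\)",
    re.DOTALL,
)
# Commands whose braced argument is a key or name rather than prose.
ARG_COMMAND = re.compile(
    r"\\(?:ref|eqref|cite[a-z]*|label|begin|end)\*?(?:\[[^\]]*\])*\{[^}]*\}"
)
COMMAND = re.compile(r"\\[A-Za-z]+\*?|\\.")  # remaining control sequences
WORD = re.compile(r"[A-Za-z]+(?:['’-][A-Za-z]+)*")


def tokens(text):
    """Return lowercase word tokens after stripping common LaTeX noise."""
    text = COMMENT.sub("", text)
    text = MATH.sub(" ", text)
    text = ARG_COMMAND.sub(" ", text)
    text = COMMAND.sub(" ", text)  # keeps the argument text, e.g. of \textbf
    return [w.lower() for w in WORD.findall(text)]


def count_path(path):
    """Count words in one .tex file, or in every *.tex file under a directory."""
    path = Path(path)
    files = [path] if path.is_file() else sorted(path.rglob("*.tex"))
    counts = Counter()
    for f in files:
        counts.update(tokens(f.read_text(encoding="utf-8", errors="ignore")))
    return counts
```

With this sketch, `count_path("thesis/").most_common(20)` would give a combined top-20 list for a hypothetical `thesis/` directory, and `sum(counts.values())` and `len(counts)` give the total and unique word counts. The order of the substitutions matters: comments and math are removed before commands so that a `\cite` inside a comment or a macro inside math is never inspected.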

gucci-on-fleek a day ago | parent | next [-]

Are you aware of the "texcount" program [0] that's distributed with TeX Live by default?

[0]: https://ctan.org/pkg/texcount?lang=en

mci a day ago | parent | prev | next [-]

  detex "$@" | wc
  detex "$@" | tr -cs '[:alnum:]' '\n' | grep . | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn

dang 3 days ago | parent | prev [-]

We need a link!

jxmesth 3 days ago | parent [-]

I think it's this - https://www.piwheels.org/project/latex-wc/

dang 2 days ago | parent [-]

Added above. Thanks!

elashri 2 days ago | parent [-]

I think a link to the source code repository would be better:

https://github.com/sethbarrett50/LaTeX-wc

dang a day ago | parent [-]

Changed to that. Thanks!