Indeed, it is exactly that process. They cut up words, then categorize them based on some metric of nearness (not at random), then link them up. Obviously the real process is much more sophisticated than what I have described here.
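To make the three steps concrete, here is a toy sketch in Python. Everything in it is my own assumption for illustration: the function names (`cut_up`, `nearness`, `link_up`) are hypothetical, and the nearness metric (Jaccard overlap of character bigrams) is just a simple stand-in, not what any real system uses.

```python
def cut_up(text):
    """Step 1: cut the text into unique lowercase word pieces (toy tokenizer)."""
    words = {w.strip(".,;:!?") for w in text.lower().split()}
    return sorted(words - {""})


def bigrams(piece):
    """Set of adjacent character pairs in a piece."""
    return {piece[i:i + 2] for i in range(len(piece) - 1)}


def nearness(a, b):
    """Step 2: a stand-in nearness metric, Jaccard overlap of bigrams."""
    ga, gb = bigrams(a), bigrams(b)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0


def link_up(pieces):
    """Step 3: link each piece to the other piece it is nearest to."""
    links = {}
    for p in pieces:
        others = [q for q in pieces if q != p]
        if others:
            links[p] = max(others, key=lambda q: nearness(p, q))
    return links


pieces = cut_up("walk walked walking talk talked")
links = link_up(pieces)
for piece, neighbour in links.items():
    print(piece, "->", neighbour)
```

The point is only that the grouping comes from a measured similarity rather than chance; a real system would presumably use a far richer notion of nearness than shared letter pairs.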