marginalia_nu an hour ago
You can actually identify clusters of websites based on the cosine similarity of their outbound links. Pretty useful for identifying content farms spanning multiple websites. Have a lil' data explorer for this: https://explore2.marginalia.nu/ Quite a lot of dead links in the dataset, but it's still useful.
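
A rough sketch of the idea, assuming each site is reduced to the set of domains it links out to, so every site is a binary vector and the cosine similarity is just the overlap divided by the geometric mean of the set sizes. The domain names, the `similar_pairs` helper and the 0.5 threshold are all made up for illustration; this is not necessarily how the explorer computes it.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(a: set[str], b: set[str]) -> float:
    """Cosine similarity of two binary link vectors stored as sets."""
    if not a or not b:
        return 0.0  # a site with no outbound links is similar to nothing
    # Dot product of binary vectors = size of the intersection;
    # the norm of a binary vector = sqrt(number of set entries).
    return len(a & b) / sqrt(len(a) * len(b))


def similar_pairs(outlinks: dict[str, set[str]], threshold: float = 0.5):
    """Return (site_x, site_y, score) for every pair scoring >= threshold.

    Brute-force over all pairs, so only suitable for small inputs.
    """
    pairs = []
    for x, y in combinations(sorted(outlinks), 2):
        score = cosine_similarity(outlinks[x], outlinks[y])
        if score >= threshold:
            pairs.append((x, y, score))
    return pairs


# Hypothetical data: two sites sharing most of their outbound links,
# plus one unrelated site.
outlinks = {
    "farm-a.example": {"shop.example", "ads.example", "tracker.example"},
    "farm-b.example": {"shop.example", "ads.example", "tracker.example", "cdn.example"},
    "blog.example": {"wikipedia.org", "github.com"},
}

for x, y, score in similar_pairs(outlinks):
    print(f"{x} <-> {y}: {score:.3f}")
```

Sites that end up strongly connected to each other this way are candidates for belonging to the same cluster. The all-pairs loop is quadratic, so anything web-scale would need an inverted index or similar to avoid comparing sites that share no links at all.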