1vuio0pswjnm7 6 days ago
The proxy log contains the timestamps but not the titles. For the titles, I could extract them from pcaps; I also have a running tcpdump capture that logs to a (daemontools) multilog directory.

The URL consumption might be different, and difficult to compare, for a number of reasons, e.g.:

I do not use a browser that sends automatic HTTP requests for resources like images, CSS files, JavaScript files, etc.

I do not use a browser that runs JavaScript, so there are no XHR or other JavaScript-triggered requests.

I do not use remote DNS; I use "curated" DNS data, so the URLs are only for resources at domains I specifically request.

I use HTTP/1.1 pipelining, so I have large numbers of URLs that are for resources from a single domain, for example DoH (I do not include these in the URL database).

Generally the proxy log is rather clean and excludes garbage requests that are being sent automatically; IME, use of a "modern" browser will fill a log with such garbage.

The proxy's self-signed certificate blocks many potential requests from hardware with pre-installed software from so-called "tech" companies, e.g., Google, Apple, Microsoft, because the TLS connections fail. These attempted connections to the mothership are incessant; they would fill a proxy log with garbage URLs if they were accepted.

All this makes it easier for me to keep a URL database; storing all those garbage URLs would make the database less useful.
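
To make the "keep DoH out of the URL database" step concrete, here is a rough Python sketch. It is not the setup described above: the log format (one "timestamp URL" pair per line), the host name doh.example.net, the table layout, and the function names are all assumptions made up for illustration.

```python
import sqlite3
from urllib.parse import urlsplit

# Hypothetical hosts whose URLs are kept out of the database (e.g. DoH endpoints).
EXCLUDED_HOSTS = {"doh.example.net"}


def parse_line(line):
    """Split an assumed 'timestamp URL' log line; return None if malformed."""
    parts = line.split(maxsplit=1)
    if len(parts) != 2:
        return None
    timestamp, url = parts[0], parts[1].strip()
    host = urlsplit(url).hostname
    if host is None:
        return None
    return timestamp, url, host


def load_urls(lines, excluded_hosts=EXCLUDED_HOSTS):
    """Insert non-excluded URLs into an in-memory SQLite table; return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE urls (ts TEXT, url TEXT, host TEXT)")
    for line in lines:
        parsed = parse_line(line)
        # Skip malformed lines and anything addressed to an excluded host.
        if parsed is None or parsed[2] in excluded_hosts:
            continue
        conn.execute("INSERT INTO urls VALUES (?, ?, ?)", parsed)
    conn.commit()
    return conn
```

Feeding it a list of log lines returns a connection that can be queried with ordinary SQL, for example `SELECT host, COUNT(*) FROM urls GROUP BY host`. Titles pulled from pcaps could later be added as a separate table keyed on URL.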