Funding could help, but it still requires PyPI/Warehouse to ship and operate a new public search interface that is safe at internet scale.

▲

coldtea 5 hours ago | parent | next [-]

They already operate a public package hosting interface. How is a search interface any harder?

▲

miketheman 5 hours ago | parent [-]

PyPI responses are served from cache 99% of the time or more, which means less infrastructure to run.

Search is an unbounded context and does not lend itself to caching very well, as every search can contain anything.

▲

bastawhiz 4 hours ago | parent [-]

PyPI has fewer than one million projects. The searchable content for each package is what? 300 bytes? That's a 200 MB index. You don't even need fancy full-text search; you could literally split the query by word and do a grep over a text file. No need for Elasticsearch or anything fancy.

And anyway, cache hit rates are going to be pretty good. You're not taking arbitrary queries; the domain is pretty narrow. Half the queries are going to be for requests, pytorch, numpy, httpx, and the other usual suspects.
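
To make the "split the query by word and grep a text file" idea concrete, here is a minimal sketch. It assumes a hypothetical flat index with one `name<TAB>summary` line per project; the `search` function, the `INDEX` list, and the sample summaries are all illustrative and not anything PyPI actually ships.

```python
def search(query: str, lines: list[str]) -> list[str]:
    """Return the lines that contain every whitespace-separated query word.

    Matching is a case-insensitive substring test, roughly what chaining
    `grep -i` once per word would do.
    """
    words = query.lower().split()
    if not words:
        return []
    return [line for line in lines if all(w in line.lower() for w in words)]


# Hypothetical index contents: "name<TAB>summary", one project per line.
INDEX = [
    "requests\tPython HTTP for Humans.",
    "httpx\tThe next generation HTTP client.",
    "numpy\tFundamental package for array computing in Python.",
]

matches = search("http client", INDEX)
for line in matches:
    name, summary = line.split("\t", 1)
    print(f"{name}: {summary}")
```

This is a linear scan with no ranking, so it says nothing about relevance ordering or abuse protection; it only shows that matching itself is cheap at this index size.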

▲

woodruffw an hour ago | parent | next [-]

The searchable context for a distribution on PyPI is unbounded in the general case, assuming the goal is to allow search over READMEs, distribution metadata, etc.

(Which isn’t to say I disagree with you about scale not being the main issue, just to offer some nuance. Another piece of nuance is the fact that distributions are the source of metadata but users think in terms of projects/releases.)

▲

froh 2 hours ago | parent | prev [-]

I wonder how a PyPI search index could be statically served and locally evaluated on `pip search`?

▲

firesteelrain 2 hours ago | parent [-]

PyPI servers would have to be constantly rebuilding a central index and making it available for download. Seems inefficient.

▲

bastawhiz 4 hours ago | parent | prev [-]

PyPI has a search interface on their public website, though?