Supporting Our AI Overlords: Redesigning Data Systems to Be Agent-First (arxiv.org)
67 points by derekhecksher 3 days ago | 22 comments
mark_l_watson 3 days ago
Interesting paper, and something I have been thinking about. I am retired, so I am a lighter user of AI than most people here, but I still have Gemini and ChatGPT do a half dozen deep research studies for me a week. It is sobering to see how many web sites are speculatively searched. I mostly find the results useful, and I prefer this new process to manual web search. After deep research, asking for 'the best' reference link usually produces something else worth reading in addition to the research report. Someone else here recommended sites maintaining their own CLAUDE.md file: good idea, but too vendor specific. Ten months ago someone online was recommending the name llms.txt as a generic markdown file for agent use, and I added one: https://markwatson.com/llms.txt I stopped collecting web page visit statistics, however, so I have no idea how often that file is discovered.
cs702 3 days ago
The paper's title is too clickbaitish for my taste, but its subject is important: How should we rethink query interfaces, query processing techniques, long-term data stores, and short-term data stores to handle the greater volume of agentic queries we will likely see in coming years, whether we want it or not, if people and organizations continue to adopt AI systems for more and more tasks? The authors study the characteristics of agentic queries they identify (scale, heterogeneity, redundancy, and steerability) and outline several research opportunities for a new agent-first data systems architecture, ranging from new query interfaces, to new query processing techniques, to new agentic memory stores.
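To make the "redundancy" characteristic concrete, here is a minimal sketch of one way a data system could exploit repeated agent queries: cache results under a normalized form of the SQL text. This is not taken from the paper; the `DedupingExecutor` class and its normalization rule are hypothetical, and the sketch assumes read-only queries over data that does not change (there is no cache invalidation).

```python
import sqlite3


class DedupingExecutor:
    """Hypothetical helper: serve repeated read-only queries from a cache."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.cache = {}
        self.hits = 0

    @staticmethod
    def normalize(sql: str) -> str:
        # Collapse whitespace, drop a trailing semicolon, and lowercase.
        # Lowercasing is crude: it would also alter string literals.
        return " ".join(sql.split()).rstrip(";").strip().lower()

    def query(self, sql: str):
        key = self.normalize(sql)
        if key in self.cache:
            self.hits += 1  # a redundant query, answered without touching the store
            return self.cache[key]
        rows = self.conn.execute(sql).fetchall()
        self.cache[key] = rows
        return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (2,)])

executor = DedupingExecutor(conn)
first = executor.query("SELECT  COUNT(*) FROM t;")
second = executor.query("select count(*)\nfrom t")
print(first, second, executor.hits)
```

Two differently formatted copies of the same query reach the database only once. A real system would need semantic rather than textual matching, plus invalidation when the underlying data changes.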
frenchmajesty 3 days ago
The proposed design in this paper is bad, but the core of the idea is very interesting. At a high level, 90% of the complexity of their data retrieval system can be deleted by simply attaching to every data store a `CLAUDE.md` file that is automatically kept up to date and that the agents can read. High-throughput queries by an agent don't feel much different from the high-throughput querying that large-scale systems like Instagram and YouTube need to service on a daily basis. Whatever works for 10M active users per second on IG would also work for 50 agents making 1M queries per second. I can still see a need for innovation in data stores. My little startup probably can't afford the same AWS bill as Meta, but the tide would lift all boats, not just AI-specific use cases.
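As a rough illustration of the "automatically kept up to date" description file suggested above, the sketch below regenerates a markdown summary of a data store's schema on demand. It assumes a SQLite database; the `describe_store` function and the layout of the generated text are hypothetical, not something the comment or the paper specifies.

```python
import sqlite3


def describe_store(conn: sqlite3.Connection) -> str:
    """Render the tables and columns of a SQLite database as markdown."""
    lines = ["# Data store guide (auto-generated)", ""]
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        (row_count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
        lines.append(f"## {table} ({row_count} rows)")
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, col_type, _notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            suffix = " (primary key)" if pk else ""
            lines.append(f"- `{name}`: {col_type or 'untyped'}{suffix}")
        lines.append("")
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")

guide = describe_store(conn)
print(guide)
```

The returned string could be written next to the store as `CLAUDE.md` (or a vendor-neutral name) after each schema migration, so an agent reads the current schema instead of probing the database with speculative queries.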
lyu07282 3 days ago
Is there an appendix with prompts separately somewhere?
apwell23 3 days ago
15 ppl worked on this glorified blogpost?
Towaway69 3 days ago
Is the title to be taken seriously, or has “AI Overlords” become some type of well-meaning indication of the positivity of having overlords? I thought AI can do anything, so why do I have to help it if it’s so smart and powerful and intelligent and useful? Is it really just a complex computer program that is actually trained to do very narrowly defined activities?
pevcr 3 days ago
[flagged]
pevcr 3 days ago
[flagged]
unisyncd 3 days ago
This is a debate about who the main visitors of Internet services are. A few decades ago, people reached each other over the IP protocol; it was people themselves who collected news, read information, and published new data. After that, browsers visited each site over HTTP; it was browsers that collected data, translated pages, and interacted with the user. Nowadays it is highly likely that AI WILL become part of our daily lives, and the above can be rewritten as: AIs request each <what> using a <new> protocol; it is AIs that <do a lot of things> and interact with the user. Information never becomes unavailable, but the main method for retrieving it does change. We could of course have kept using command-line utilities instead of browsers when browsers became popular, and we can of course keep searching and clicking around in a browser instead of using AI-enhanced search now that AI is hot. However, the trend is that AI will bring us a new evolution in the fast-paced information era. Users are the ones who sit behind the screen; they never change, but their methods/agents/proxies change over time.