LLM-OS: Will LLMs Become Operating Systems?




A new research paper proposes an intriguing idea: treating large language models (LLMs) like computer operating systems. Under this approach the LLM manages its own memory, which makes its context appear effectively unlimited. It's an interesting concept that could change how we interact with AI.


First, a quick primer on operating systems. An OS manages a computer's memory, which is split into primary (RAM) and secondary (hard drive) storage. It swaps data between the two so everything runs smoothly. The result is virtual memory: programs see one large memory space, even though only part of it sits in RAM at any moment.
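To make the swapping idea concrete, here is a minimal toy sketch in Python. It is not how a real OS implements paging; the class name `ToyVirtualMemory`, the least-recently-used eviction rule, and the slot count are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class ToyVirtualMemory:
    """Toy model (illustrative only): a small RAM backed by a larger disk."""

    def __init__(self, ram_slots):
        self.ram_slots = ram_slots
        self.ram = OrderedDict()  # page id -> data, least recently used first
        self.disk = {}            # page id -> data that was swapped out
        self.page_faults = 0

    def _make_room(self, incoming):
        # Swap the least recently used pages out to disk until one slot is free.
        if incoming in self.ram:
            return
        while len(self.ram) >= self.ram_slots:
            victim, data = self.ram.popitem(last=False)
            self.disk[victim] = data

    def write(self, page, data):
        self._make_room(page)
        self.ram[page] = data
        self.ram.move_to_end(page)   # mark as most recently used
        self.disk.pop(page, None)    # the RAM copy is now the current one

    def read(self, page):
        if page not in self.ram:
            # Page fault: the data lives on disk, so swap it back into RAM.
            self.page_faults += 1
            data = self.disk.pop(page)  # raises KeyError if never written
            self._make_room(page)
            self.ram[page] = data
        self.ram.move_to_end(page)
        return self.ram[page]
```

With two RAM slots, writing three pages pushes the oldest one to disk, and reading it later swaps it back in at the cost of one page fault. The program using `read` and `write` never needs to know where a page currently lives.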


The paper "MemGPT: Towards LLMs as Operating Systems" explores doing the same thing with LLMs. The LLM has a main context (like RAM) and an external context (like a hard drive). Together they form a virtual context that appears far larger than the model's fixed context window.


How it works:

- The LLM has a main context window (e.g., 4,000 tokens)

- There is also an external context with recall and archival storage

- The main context is actively used for conversation

- The external context stores conversation history

- The LLM swaps info between them as needed
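The steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the paper's implementation: the `VirtualContext` class and its method names are hypothetical, tokens are approximated by whitespace-separated words, recall search is a plain substring match, and archival storage is left out entirely.

```python
from collections import deque


def count_tokens(text):
    # Assumption: word count stands in for a real tokenizer.
    return len(text.split())


class VirtualContext:
    """Hypothetical sketch of a main context backed by recall storage."""

    def __init__(self, max_tokens=4000):
        self.max_tokens = max_tokens
        self.main = deque()  # messages currently in the context window
        self.recall = []     # full conversation history, kept externally

    def _evict(self):
        # Drop the oldest messages from main context once over budget.
        # They remain available in recall storage.
        while len(self.main) > 1 and sum(map(count_tokens, self.main)) > self.max_tokens:
            self.main.popleft()

    def add(self, message):
        self.recall.append(message)
        self.main.append(message)
        self._evict()

    def search_recall(self, query):
        # Assumption: naive case-insensitive substring search.
        q = query.lower()
        return [m for m in self.recall if q in m.lower()]

    def page_in(self, query):
        # Bring matching past messages back into the main context.
        hits = [m for m in self.search_recall(query) if m not in self.main]
        for m in hits:
            self.main.append(m)
        self._evict()
        return hits
```

In a real system the model itself would decide when to search and what to page in, typically by emitting function calls. Here the caller does it by hand, which keeps the sketch short but hides the most interesting part of the design.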


This architecture allows perpetual conversations without losing context. The LLM can pull from past interactions stored externally. It also enables features like updating user preferences on the fly.


The researchers built a prototype system called MemGPT. It uses OpenAI's GPT models through the API and an open source vector database for storage. Initial experiments look promising for extended dialog and "deep memory retrieval".


Of course, many challenges remain. The prototype relies on OpenAI's GPT models for now, which have their own limitations. But the core concept opens up exciting possibilities. The code is public for anyone to explore.


Treating LLMs as operating systems could be a game changer. It may seem crazy today, but big advances often start with crazy ideas. This line of research is one to keep an eye on as LLMs continue to evolve. The future could see AI assistants with effectively unlimited memory and more human-like conversation abilities.

