Broke Man’s LLM Wiki? An Attempt at Frugality

A few weeks ago, I read Andrej Karpathy’s post about his “LLM Wiki”. The idea was very cool, and I could immediately see its benefits (which is more than I can say for most “AI-powered” apps). TL;DR of that post: it is essentially an outline for an automated knowledge compiler and synthesiser, using an LLM to do the heavy lifting of the constant, dreadful maintenance work required to keep such an evolving archive functional, accurate and useful.

I loved the whole concept. But alas, as with most convenient things, it came with a hidden price. The standard approach I saw online for this sort of continuous compilation, bookkeeping, cross-referencing, and synthesis was just… spamming API calls. I can see why this is convenient: the LLM handles everything for you, and it’s flexible too. There is one problem for me, though (and, I imagine, for countless others): I do not spend money on LLM subscriptions for API access. That stuff is expensive, especially for a broke college student :(

I tried finding lighter alternatives to the standard LLM Wiki which wouldn’t completely decimate whatever free LLM API credits I have, but I did not find many promising ones. Additionally (and quite significantly, since it led to the creation of this whole thing), I was bored :P

So I began pondering, with my thoughts centred around a single question: “How do I make this sort of wiki while remaining broke and subscription-less?”

Lessons from Military Strategy

Let us take a lesson from military history to motivate the point of this entire project. Consider the land battles of the Napoleonic Wars, and in particular the cavalry regiments and their tactical usage. Against a routing or disorganised enemy, they were devastating and lethal. However, it was imperative that they be used sensibly. An order (such as the disastrous one given by Marshal Ney to the legendary Cuirassier regiments at Waterloo) for cavalry to charge straight into solidly-formed, bayonet-bristling infantry squares would be extremely expensive and ultimately suicidal! It would be considered absolute tactical tomfoolery!

If we put this into our current context: does it make sense to call a multi-billion or multi-trillion parameter model to do something that some simple Python or Bash scripts could do 80% as well? Add a human into the loop with the scripts, et voilà! One has achieved ~95% of the desired result for less than 5% of the cost. A far more tactically sound use of our “LLM cavalry”, if I do say so myself. With the power of such “combined arms” working in tandem, much can be achieved with less. I will expand a little on this later; for now let us proceed with the actual system.

What I immediately noticed was that a lot of this indexing and linting and cross-referencing could easily be replaced with scripts that would do a decent enough job. The only significant “AI” part of the process is the initial content generation and the more high-level, sort of “semantic”, cross-referencing. If I could delegate as much work as possible to scripts, I would be left with only the actually required LLM calls.

The rest (proofreading, editing, and so on) I could do myself. Of course, this would require more effort than the spray-API-calls-and-pray approach that many seem to be taking. I suppose one must pay either way: with currency or with time. It do be like that sometimes…

What resulted after an hour or so of brainstorming and subsequent Antigravity-assisted programming is Archivum Intellectus (Latin for “Archive of Understanding”). It’s a minimalist, lightweight version of the LLM Wiki driven mostly by scripts and some human effort, designed for the ~~broke~~ people without expensive LLM API subscriptions (cough cough me cough cough).

To give a dignified summary of the entire architecture, if you will:

Scripts and humans do the structural work; the LLM is called surgically for synthesis only. The LLM is never run on a schedule. Every LLM-summoning ~~ritual~~ API call is a deliberate choice by the user.

(For a rather undignified summary, one could put it as my friend did: “So you’ve stripped a Tesla down to a bicycle.”)

De-LLM-ing the LLM Wiki

To make this work, I had to cut out the “AI stack” bloat.

1. No Vector DBs or Embeddings: I used SQLite FTS5 for full-text search over all wiki pages. It returns ranked results with snippets, it’s blazingly fast, and most importantly, it’s entirely free. Embeddings are overkill for a personal wiki under a few hundred pages. There is, however, one drawback: FTS5 matches keywords rather than meaning, so semantic search is hard :(
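
As a rough illustration of what FTS5-backed search can look like (a sketch only, not the project's actual code), here is a minimal example using Python's built-in sqlite3 module. The table name `pages`, its columns, and the sample rows are all made up for the example, and it assumes the local SQLite build has FTS5 enabled, which is typical but not guaranteed.

```python
import sqlite3

# Hypothetical schema: one row per wiki page, with the body indexed by FTS5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(path UNINDEXED, body)")
conn.executemany(
    "INSERT INTO pages (path, body) VALUES (?, ?)",
    [
        ("wiki/cavalry.md", "Cavalry charges against infantry squares were costly."),
        ("wiki/sqlite.md", "SQLite FTS5 gives ranked search without embeddings."),
        ("wiki/inbox.md", "Rough notes wait in the inbox until the weekly flush."),
    ],
)


def search(query, limit=5):
    """Return (path, snippet) pairs for a keyword query, best match first."""
    sql = """
        SELECT path, snippet(pages, 1, '[', ']', '...', 8)
        FROM pages
        WHERE pages MATCH ?
        ORDER BY rank
        LIMIT ?
    """
    return conn.execute(sql, (query, limit)).fetchall()


results = search("cavalry")   # keyword hit on wiki/cavalry.md
missed = search("horsemen")   # same concept, different word: no hit
print(results, missed)
```

The second query shows the semantic gap mentioned above: the page about cavalry is not found because the word "horsemen" never appears in it. Queries containing punctuation (hyphens, quotes) would need escaping before being passed to MATCH.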

2. Immutable Raw Files & Hashed Ingestion: My directory layout is strictly separated. I drop source files (papers, blog posts, notes) into a raw/ folder, which is treated as immutable. A pure-Python digest.py script tracks the SHA-256 hashes of these files. When I run my ingestion pipeline, the system only looks at new or changed files. If I just fix a typo in a raw file, it computes a unified diff and sends only that diff to the LLM for a targeted update.
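
The hash-tracking idea can be sketched with the standard library alone. This is not the real digest.py: the function names, the JSON manifest, and the way the previous version of a file is obtained for diffing are all assumptions made for illustration (the diff helper simply takes the old and new text as arguments).

```python
import difflib
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def scan(raw_dir: Path) -> dict[str, str]:
    """Map each file under raw_dir (relative POSIX path) to its digest."""
    return {
        p.relative_to(raw_dir).as_posix(): sha256_of(p)
        for p in sorted(raw_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(raw_dir: Path, manifest_path: Path) -> list[str]:
    """Return relative paths that are new or whose digest differs from the manifest."""
    known = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    return [name for name, digest in scan(raw_dir).items() if known.get(name) != digest]


def save_manifest(raw_dir: Path, manifest_path: Path) -> None:
    """Record current digests so the next run can skip unchanged files.

    The manifest is assumed to live outside raw_dir.
    """
    manifest_path.write_text(json.dumps(scan(raw_dir), indent=2))


def make_diff(old_text: str, new_text: str, name: str) -> str:
    """Unified diff between two versions of one source file."""
    lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(lines)


# A one-character typo fix yields a diff of a few lines, not the whole file.
print(make_diff("teh cavalry charged\n", "the cavalry charged\n", "notes.md"))
```

With something like this, only the names returned by `changed_files` need any LLM attention, and for an edited file only the output of `make_diff` has to go into the prompt.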

3. Scripted Linting: Checking for broken links is not what I want to spend my precious tokens on :/ I wrote a lint.py script that runs locally (and automatically on push via GitHub Actions) to catch orphan pages, broken [[wikilinks]], missing frontmatter, and stale sources without spending a single token.
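
To give a feel for how little machinery such checks need, here is a small sketch (not the actual lint.py). It covers three of the four checks: broken wikilinks, orphan pages, and missing frontmatter. The assumptions are mine: pages are identified by file name without extension, frontmatter starts with a `---` line, and a page called `index` is the root and so is never counted as an orphan.

```python
import re
from pathlib import Path

# Captures the target of [[Target]], [[Target|alias]] and [[Target#section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def load_pages(wiki_dir):
    """Read every Markdown file under wiki_dir into a {page name: text} dict."""
    return {p.stem: p.read_text(encoding="utf-8") for p in Path(wiki_dir).rglob("*.md")}


def lint(pages, root="index"):
    """Return broken links, orphan pages and pages without frontmatter."""
    broken, linked, no_frontmatter = [], set(), []
    for name, text in pages.items():
        if not text.startswith("---\n"):
            no_frontmatter.append(name)
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target not in pages:
                broken.append((name, target))
            elif target != name:  # a self-link does not count as an inbound link
                linked.add(target)
    orphans = sorted(set(pages) - linked - {root})
    return {"broken": broken, "orphans": orphans, "no_frontmatter": no_frontmatter}


pages = {
    "index": "---\ntitle: Index\n---\nSee [[cavalry]] and [[sqlite|SQLite notes]].",
    "cavalry": "---\ntitle: Cavalry\n---\nRelated: [[infantry]].",
    "sqlite": "No frontmatter here.",
    "lonely": "---\ntitle: Lonely\n---\nNobody links to me.",
}
report = lint(pages)
print(report)
```

In CI, a script along these lines could simply exit with a non-zero status whenever any of the three lists is non-empty.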

4. A Regular “Inbox Pass”: For dumping thoughts or notes, I have inbox.md, an append-only scratch file. I spend a few days or so dumping rough notes, URLs, pastes, and random thoughts in there. Once it reaches a certain size (or at the end of a week, whichever comes first), I run inbox_flush.py. In one LLM call, the model classifies each chunk of content, deciding which existing wiki page it belongs to or whether it warrants a new page, and applies the changes. Maybe I’ll change this later to thrust the human deeper into this loop as well to retain full control, but that’s a future topic.
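
The scripted half of the inbox pass (deciding when to flush, and packing everything into a single prompt) might look roughly like the sketch below. It is not the real inbox_flush.py: the thresholds, the blank-line chunking rule, and the prompt wording are placeholders I picked for the example, and applying the model's answer to the wiki is left out entirely.

```python
import re


def should_flush(inbox_text, days_since_last_flush, max_chars=4000, max_days=7):
    """Flush when the inbox is big enough or old enough, whichever comes first."""
    return len(inbox_text) >= max_chars or days_since_last_flush >= max_days


def split_chunks(inbox_text):
    """Split the inbox into chunks, assuming entries are separated by blank lines."""
    return [c.strip() for c in re.split(r"\n\s*\n", inbox_text) if c.strip()]


def build_prompt(chunks, page_titles):
    """Pack all chunks into one prompt so the whole flush costs one LLM call."""
    numbered = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, 1))
    titles = "\n".join(f"- {title}" for title in page_titles)
    return (
        "For each numbered chunk, answer with an existing page title, "
        "or with NEW: <proposed title> if no page fits.\n\n"
        f"Existing pages:\n{titles}\n\nChunks:\n{numbered}\n"
    )


inbox = "FTS5 has a snippet() function\n\nhttps://example.com/napoleon\n\n\nidea: weekly review"
chunks = split_chunks(inbox)
prompt = build_prompt(chunks, ["SQLite", "Napoleonic Wars"])
print(prompt)
```

Batching every chunk into one request is what keeps the cost at a single call per flush, however many notes have piled up.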

The not-an-API “Interactive” API Provider

This is one of the more interesting and funnier bits, I must admit.

Through a student offer, I gained access to Google One’s Pro tier (GitHub Copilot offers something similar), which gives me generous token limits inside my IDE’s chat interface, including access to models like Claude Opus. However, these IDE-integrated models do not expose a traditional API. But we can still make full use of this opportunity.

Enter the “Interactive Provider”: essentially a tool that lets you use your free Antigravity or Copilot tokens for the LLM calls, with only a wee bit of hassle.

When a tool in my archive needs the LLM to synthesise something, it doesn’t make a network request. Instead:

1. The script writes the full prompt to _pending/prompt.md
2. The terminal prints the file path and pauses
3. I open the prompt in my IDE (usually Antigravity) and pass it to the LLM
4. I paste the model’s reply into _pending/response.md
5. I press Enter in the terminal; the script reads the response and continues its automated pipeline
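
The steps above can be sketched in a few lines of Python. This is an illustration, not the project's actual provider: the function name `interactive_complete`, the injectable `wait` parameter (there so the pause can be faked in tests), and the retry-on-empty behaviour are my own assumptions; only the two file paths come from the description.

```python
from pathlib import Path


def interactive_complete(prompt, pending_dir=Path("_pending"), wait=input):
    """Hand a prompt to a human-operated chat window and return the pasted reply."""
    pending_dir.mkdir(parents=True, exist_ok=True)
    prompt_path = pending_dir / "prompt.md"
    response_path = pending_dir / "response.md"

    prompt_path.write_text(prompt, encoding="utf-8")
    response_path.write_text("", encoding="utf-8")  # clear any stale reply

    while True:
        # Blocks here until the user presses Enter (or the injected wait returns).
        wait(
            f"Prompt written to {prompt_path}. "
            f"Paste the reply into {response_path}, then press Enter: "
        )
        reply = response_path.read_text(encoding="utf-8").strip()
        if reply:
            return reply
        print(f"{response_path} is still empty; paste the reply and try again.")


# Usage (blocks on the terminal, so shown as a comment):
# summary = interactive_complete("Summarise raw/some-paper.md in five bullet points.")
```

Because the function has the same shape as an ordinary "prompt in, text out" call, the rest of a pipeline would not need to know whether the reply came from a network request or from copy and paste.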

It’s economical, slightly hacky, and effective. I like it.

Committing to the Bit

I also ended up making dedicated Antigravity workflows, a skill (my first one :D) and an AGENTS.md file to integrate the “Interactive” provider better.

Some Thoughts

That concludes most of the major points I want to discuss about this project. Now for the point brought up before: “much with less” and “combined arms”.

Much with Less: 80% of the work for 20% of the cost
- I would say I achieved about 80% of the optimal result I would want with about 20% of the total cost (both labour and monetary) I would need for an ideal version of this system
- Getting the remaining 20%, however, would probably take a lot more
- Somewhere along the line, it will reach a point where solving problem after problem is no longer fun or interesting enough to justify the extra functionality it buys

Combined Arms: Thinking strategically in the age of LLM-spam
- Combined arms was during the Napoleonic Wars, and remains today, an approach to warfare that seeks to integrate all branches of the armed forces, each acting in its specialised capacity to achieve a common goal
- Using this analogy, we arrived at a preliminary solution that is functional with as little LLM-spam as possible
- To me, that is a great way to use an LLM! No unnecessary calls, no slop, just used as intended (to ingest, generalise, and answer questions)

This would normally also lead me towards a rambling paragraph on LLMs and accessibility (which was pretty much the motivation for this project), but I believe I shall collect my thoughts and make a separate blog post later.

Conclusion

Archivum Intellectus is just files, and Markdown files at that. One could just use Obsidian to view them. If I lose my local copy, I can just clone my repo again and run the digest script to rebuild the local SQLite index from the Markdown files. Simple, and I think quite elegant. I will keep tinkering and fidgeting with this entire system over time, but for now, this is cool!

Did I end up re-inventing the wheel and implementing something someone else has already thought of? Probably. Was it fun? Yes. Is it good? Eh, it works. Did I learn something? Absolutely. Am I satisfied? For now, yes :D

If you want to try out this system, or if you just want to see how the scripts are put together, you can check out the full project on GitHub.