Arcology is my site generator project, currently written in FastAPI and backed by my org-mode wiki, which is set up on top of org-roam. The goal for the project is to be able to publish subsections of my notes and code and knowledge across multiple websites, and to be able to publish a feed of new and updated pages within those subsections over open protocols like RSS and ActivityPub.
What's in a Name?
Arcology is an architectural and cultural school of design, a portmanteau of Ecology and Architecture, a philosophy of design principles for densely populated, ecologically low-impact human habitats. In Sci-Fi they're closed-system environments where human needs are met without harming the wider environment, through technical, social, and automated means. A solar plant and a vertical farm feeding thousands of residents, with countless acres of land left to wild outside the Arcology. In the real world there's a prototype desert commune in northern Arizona.
My Arcology is a technical and cultural school of design, of building systems which mesh cleanly with those outside it and the user(s) within it. It's a self-sustaining system choosing to remain unreliant on proprietary and liberty-hostile systems. This is a Privacy-focused self-hosted document publishing platform, a Hypermedia application exposing the inner working of my knowledge archive, link archive, and inner thinking. It's part of A Modern User Agent built for myself for now, and for the family and community I will have in a decade.
Vision and Goals
I am working towards a Minimum Viable Product for this tool as a system to publish any of my org-documents to the web for sharing among friends and family and my technical peers. It's a dynamic site written in Python, leveraging the metadata of my org-roam Knowledge Base and the Knowledge Base itself to build my websites.
I intend to slowly improve the org-mode caching system through my Arroyo System Cache project; I am finding it to be a compelling system badly in need of polish and presentation. org-roam provides a sqlite3 database, Arroyo gives it a re-normalized schema, and my Arcology server process does nothing but read this database and the org-mode documents on the file-system to provide a dynamically updating website and eventually social features like RSS and ActivityPub.
I am trying to fully develop this as a Literate Programming application, and so far the results have been mixed. See Arcology FastAPI Introduction.
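The "does nothing but read" relationship between the server and the cache can be sketched with Python's standard `sqlite3` module. This is only an illustration under assumptions, not the Arcology's actual code: `open_cache_readonly`, `demo`, and the `files` table are hypothetical names, and the writer connection stands in for Emacs and org-roam.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def open_cache_readonly(db_path: str) -> sqlite3.Connection:
    """Open an existing SQLite file so this process can only read it."""
    # mode=ro is standard SQLite URI syntax; any write on this connection fails.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def demo() -> tuple[list, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "cache.db")

        # Stand-in for the Emacs side, which owns all writes to the cache.
        writer = sqlite3.connect(db_path)
        writer.execute("CREATE TABLE files (path TEXT, title TEXT)")  # hypothetical schema
        writer.execute("INSERT INTO files VALUES ('arcology.org', 'The Arcology Project')")
        writer.commit()
        writer.close()

        # Stand-in for the web server, which only ever reads.
        reader = open_cache_readonly(db_path)
        rows = reader.execute("SELECT path, title FROM files").fetchall()
        try:
            reader.execute("INSERT INTO files VALUES ('x.org', 'X')")
            write_blocked = False
        except sqlite3.OperationalError:
            write_blocked = True
        reader.close()
        return rows, write_blocked
```

Opening the file this way means a bug in the web process cannot corrupt the cache that Emacs maintains.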
Project History
Arcology Static Site Generator
Arcology started as an Emacs Lisp library which produced a static site where individual org-mode headings would be exported using a custom templating system to provide a site that was built around Microformat metadata specifications and the informal protocols around the Indieweb community.
This is the software which generate[s/d/ing] https://notes.whatthefuck.computer and it worked well enough, but it failed to scale with my current system. Especially with EXWM, having a long synchronous process in my Emacs process was a nonstarter. I stopped publishing my site because of how costly it was, and that must change.
I was also butting up against the organisational bad habits that I had picked up with regard to how my org-mode documents were organized and how I handled my personal Knowledge Base. I explored and began to realize that what I wanted was less of an outline-style hierarchical organization of thought, and I was more interested in non-hierarchical structures like graphs and links.
Arcology Rails
Arcology Rails was the first attempt at a "dynamic thinking environment", a set of CRUD applications that could help me think, and remember. It was an experiment in building a simple graph structure without namespacing, and building intelligence on top of that.
I had visions of building dynamic applications on top of three components:
- a content addressed store of files
- a graph-database
- a lens-based programming environment on top of them
I am very far from being interested in building this sort of plumbing in the style and systems I am investing in. If those exist some day in a decentralized, equitable, tactile option, I would love to see them.
A brief exploration of TiddlyWiki
TiddlyWiki is a very cool programmable notebook, and I spent some time working with it in January and February 2020. Very promising thing, and it led me to the value of embedded interlinks over metadata. But I wanted a lot of things that it couldn't give me, and so I went back to the drawing board on a static site generator built on top of org-mode.
Arcology Gen
Arcology Gen was another attempt at a smarter, more streamlined static site generator, but this time outside the Emacs process. This is where I started to realize that I need a cache database, something queryable, and something persistent. I constructed some very simple org-mode parsers, and that was interesting for publishing a directory, but I needed metadata extraction to power indices and feeds.
Arcology Elixir
In early 2020, org-roam gave me that database, and with some extensions, I could build the website. Around this time, Tor and I started working on a Home Mesh project in Elixir, and co-hosting that with my web-site seemed like a reasonable cross-over. I wanted to use Ecto to build the site, and it may some day provide static HTML, but for now it's a dynamic Elixir application, a mashup of Phoenix and Ecto and some custom code to glue it to an Emacs environment.
Now in early 2021 I have a working minimum viable prototype of the Arcology at https://dev.arcology.garden and I must evaluate the next steps. Jethro of org-roam is currently evaluating a 2.0 refactor and simplification and the vision of it doesn't quite line up with what I want even if it's for the good of the software and the ecosystem. Alas, I'll have to modify the Arcology Link Routers to handle these opaque ID links, and I think I can, but this major version bump lays bare the cost of my dependencies.
The Literate Programming Elixir Phoenix project is worth evaluating in its own right, still: Is developing entire services as an org-mode meta application viable or for that matter fun? How does the interwoven format aid long-term maintenance, refactoring and development? I do intend to explore this at some point (open threads)
- An honest evaluation of the Arcology MVP
- An honest evaluation of Literate Programming the Arcology MVP
As of [2021-09-25] I haven't written those things, and have struggled to find the energy and will to work on the Elixir version of arcology. Work has pushed me towards picking up FastAPI and now I am using that, hoping that I can get a small custom web app put together as I found that I wasn't using the dynamic high-scale, interesting things which Phoenix and Elixir provide.
Arcology FastAPI
The FastAPI version is going to be a lightweight implementation of the ideas I fleshed out in the Elixir implementation. It's still literate-ly programmed. Flask would probably be good enough, but having the database stuff integrated with SQLModel is going to be nice.
At this point I have boiled the "Arcology as dynamic site" down to a set of essential functions and ideas which could be implemented in any language, and I intend to explore at least a Rust implementation of it next. However, this system is lithe and fun to work with.
Arcology Rust
I have lately been working on arcology-rust-extractor which starts to peel the Emacs org-roam parser (Arroyo System Cache + Generators) out of the process. The last time I did some spelunking for org-mode parsers I found a decent/useful org parsing crate in Rust called orgize and started to build a little metadata extractor around that. I wrote about this on [2023-06-14] in I am starting to experiment with a rust rewrite of the Arroyo Arcology Generator.
The Arcology continues to be revised, always looking for improvements and slowly changing the design and architecture of the system over the course of years or more. The Arcology is part of the Concept Operating System and the Concept Operating System evolves in fits and starts, but maybe trending toward something others could use.
INPROGRESS Implement org-roam cache and Arroyo Arcology Generator in Rust, including higher-level table stuff
The extractor can currently output an object which extracts the page metadata necessary to render the Arcology based on the database schema present in Arroyo Arcology Generator, but does not actually put them in a database yet. Once it does, the Arroyo generator will be modified to expect the database schema that I land on with Rust – at the very least I won't have to "dequote" everything that comes out of the DB like I do with the current emacsqlite version.
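For context on the "dequote" step: emacsql stores values as printed Lisp objects, so a string column read from another process arrives wrapped in double quotes with backslash escapes. A minimal sketch of the kind of helper this implies, assuming only that encoding (the name `dequote` here is illustrative, not the Arcology's actual function):

```python
import re

# Matches a backslash-escaped double quote or backslash inside a printed Lisp string.
_LISP_ESCAPE = re.compile(r'\\(["\\])')


def dequote(value):
    """Strip Lisp string quoting from a value read out of an emacsql database."""
    if not isinstance(value, str):
        return value  # integers, None, etc. pass through untouched
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        inner = value[1:-1]
        # Undo the escaping in a single left-to-right pass.
        return _LISP_ESCAPE.sub(r"\1", inner)
    return value  # not a printed Lisp string; leave it alone
```

A schema written by a Rust extractor could store plain text directly, which is what would make this helper unnecessary.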
NEXT integrate ORM, persist DB
WAITING Reimplementing arroyo-arcology-update-file isn't gonna be fun.
NEXT implement Syncthing client for file update tracking
This was a great idea in I am starting to experiment with a rust rewrite of the Arroyo Arcology Generator, binding to the Syncthing API instead of inotify to know which files need to be reindexed.
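A rough sketch of what polling Syncthing for changed files could look like, written in Python for brevity even though the plan is a Rust client. It assumes Syncthing's REST event API (`/rest/events` with a `since` parameter and an `X-API-Key` header) and assumes the indexer runs on a machine that receives files from another device, so `ItemFinished` events are the interesting ones; locally made edits are reported through other event types. The function names are hypothetical.

```python
import urllib.request


def build_events_request(base_url: str, api_key: str, since: int) -> urllib.request.Request:
    """Build a long-poll request for Syncthing events newer than `since`."""
    url = f"{base_url}/rest/events?since={since}&events=ItemFinished"
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def org_files_to_reindex(events: list[dict]) -> tuple[list[str], int]:
    """Return updated .org paths and the highest event id seen.

    `events` is the decoded JSON list returned by the events endpoint.
    """
    paths: list[str] = []
    last_id = 0
    for event in events:
        last_id = max(last_id, event.get("id", 0))
        data = event.get("data") or {}
        if event.get("type") != "ItemFinished" or data.get("error"):
            continue  # skip other event types and failed syncs
        item = data.get("item", "")
        if data.get("action") == "update" and item.endswith(".org") and item not in paths:
            paths.append(item)
    return paths, last_id
```

The returned id would be passed back as `since` on the next poll, so each change is seen once without watching the filesystem directly.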
NEXT Implement the HTML exporting in orgize
Orgize also supports customized HTML exporting which could eliminate my other process dependency, Pandoc, and the regular expression soup I have to do for Rewriting and Hydrating the Pandoc HTML by implementing an exporter that can just query links and what not straight from the DB, generating an attachment cache, etc that I find difficult to do in Python or Pandoc. Don't get me started on the excitement I feel by perhaps being able to implement a feed generator without needing to write lua.
NEXT Implement HTML templating and request router
NEXT Port the other Arroyo generators
It's possible and perhaps likely that Arroyo Systems Management will slowly be ported in to this Rust toolkit if the Arcology implementation works out. The S-expression library I am using in the Rust has specific dialect support for Emacs Lisp so in theory I'll be able to have the Rust spit out an expression which can be parsed by org-babel.
NEXT ActivityPub and Microblog
There has been a parallel project tumbling around in my head, the Mastodon-powered second brain, which could be implemented in the rust toolkit, too, since it can be used to generate org-mode docs too; I tried an elixir ActivityPub library but it was not ready for action yet; there is another one developed by the Lemmy folks which may be more reusable. I found another weird option too called Dialtone which could be a neat little Rust thing to integrate… Idk. Anyways, a little microblog that supported multi-actor posting and thread coalescing/search/export could be neat as heck. If the Bonfire ActivityPub library was ready for business I would probably be implementing all the site stuff in Elixir, but this seems like a more likely win; maybe if I take long enough the Elixir lib will be ready… :)
WAITING consider Literate Programming
I haven't thought too hard about the implications of designing the arcology in something so unsuited for Literate Programming, but I should – open threads – but I am pretty excited about the implications of a fully Rust version of the Arcology or Arroyo.
WAITING what does offline posting look like?
And yet, what does it look like to post while offline with this? Do I have to fall back to an org notebook if my server isn't available? What if I am offline only with my phone?
Data Architecture
org-mode documents are the canonical datastore
I value plain-text formats and org-mode provides enough structurally significant syntax that I can extract any information that I need from the text, and re-encode that without a huge loss of fidelity.
org-roam and Arroyo System Cache provide a SQLite3 metadata cache
Parsing org-mode outside of Emacs is a fool's game, frankly. See also what alphapapa calls Greenspun's eleventh rule.
With my org-roam: #+KEYWORD caching in org-roam-db extensions, I can cache arbitrary keywords and then query them in other processes, or establish higher-level tables by querying the (file, key, value) store. Everyone makes big warning sounds and scary noises when people start talking about using sqlite3 in production, but this is the perfect usecase for it: I only have read concerns outside of org-roam. SQLite is not a toy database.
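To illustrate "higher-level tables by querying the (file, key, value) store", here is a small sketch using Python's standard `sqlite3` module. The table name `keywords`, its columns, and the `PAGE_KEY` and `TITLE` keywords are placeholders chosen for the example, not the real org-roam or Arroyo schema.

```python
import sqlite3

# Pivot the flat keyword store into one row per publishable page.
PAGES_VIEW = """
CREATE TEMP VIEW pages AS
SELECT k.file AS file, k.value AS page_key, t.value AS title
FROM keywords AS k
JOIN keywords AS t ON t.file = k.file AND t.keyword = 'TITLE'
WHERE k.keyword = 'PAGE_KEY'
"""


def build_demo_cache() -> sqlite3.Connection:
    """Create an in-memory stand-in for the keyword cache."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE keywords (file TEXT, keyword TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO keywords VALUES (?, ?, ?)",
        [
            ("arcology.org", "TITLE", "The Arcology Project"),
            ("arcology.org", "PAGE_KEY", "arcology/index"),
            ("garden.org", "TITLE", "The Garden"),
            ("garden.org", "PAGE_KEY", "garden/index"),
            ("journal.org", "TITLE", "Private Journal"),  # no PAGE_KEY: not published
        ],
    )
    conn.execute(PAGES_VIEW)
    return conn


def list_pages(conn: sqlite3.Connection) -> list[tuple[str, str, str]]:
    """Read the higher-level view the way a web process would."""
    return conn.execute(
        "SELECT file, page_key, title FROM pages ORDER BY file"
    ).fetchall()
```

Files without the publishing keyword simply never appear in the view, so the web process only sees what was explicitly marked.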
Arroyo System Cache currently is implemented in Elisp and there is a Python interface that maps to it:
- SQLModel provides an interface to the metadata cache
- FastAPI provides some nice quality of life and a high-performance HTTP server
Arcology FastAPI Task Work
NEXT [#A] HEAD support with opengraph and the social embed shit, rss feeds for site and page…
NEXT [#A] Work on Arcology once a week
DONE [#A] sitemap web like Arcology Elixir had
DONE [#A] page metadata for dynamic robots.txt
Disallow / by default, Allow tagged pages
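A sketch of that default-deny policy, assuming the list of allowed paths has already been pulled from the page metadata (the function name and example paths are made up for illustration):

```python
def render_robots_txt(allowed_paths: list[str]) -> str:
    """Render a robots.txt that blocks everything except the given paths."""
    lines = ["User-agent: *"]
    # Allow rules come first so parsers that stop at the first match behave too.
    lines += [f"Allow: {path}" for path in sorted(set(allowed_paths))]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"
```

For example, `render_robots_txt(["/garden/", "/arcology/"])` lists the two `Allow:` lines in sorted order followed by `Disallow: /`.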
NEXT [#A] 404 page with stats for missing links clicked
NEXT [#B] update Data Architecture above to serve as a "tour of the arcology" type of thing
DONE Base feature set
NEXT test coverage
DONE [#B] feed generator
WAITING [#B] webmention server
NEXT [#B] iNaturalist observations PESOS importer
A page for each plant, a heading for each observation.
similar PESOS My Instagram posts in to blog
dogsheep's iNaturalist to Sqlite is half the work
NEXT [#B] [2020-10-27 Tue 22:31] Arcology should make sure that pages in the Archive are saved in the Internet Archive
NEXT [#B] make Pandoc HTML generation able to optionally bundle media
it'd be nice to resize the images and strip exif data and whatnot, but maybe i just need to do that on "my" side…
WAITING [#C] ActivityPub server
using tsileo/little-boxes
WAITING [#C] calendar generator
WAITING Trailblazer Mode
INPROGRESS [#A] feed2toot
pulls feed list from a JSON API endpoint dynamically rather than needing a deploy
dynamically add feeds for event pages and have them go to Fediverse automagically