---
One thing that’s very important to me is maintaining a healthy media diet. While I’m not averse to recommendation algorithms, I find that they optimize for engagement and on-platform time, as is required to produce unending growth in ad revenue. As a result, I find that such algorithms sacrifice quality for quantity, or outrage impact, or speed. At its extreme, in the case of the conservative control of mainstream media, this disregard for quality manifests as disregard for truth. Basically, whatever the reasoning, the thing that we lose is quality. So - as someone who deeply values quality, I find that I seek ways to make my own algorithm. This post is about two potential ways to automate the way that I consume written journalistic media - and I plan to eventually implement one of them. Think of this as an architecture document for this component of my home network.
What I’d like to be able to do is:
I plan to use Ollama to serve models. It supports a broad variety of models, and is compatible with LangChain (which we’ll get into later). To start, I plan to use openhermes:v2.5 (based on Mistral-7b) as the main model, but using Ollama allows me to change the model fairly painlessly. I’ll deploy Ollama using this docker-compose template - in testing, using a GTX 1070 on fairly ancient hardware, I was able to get 20-30 tokens per second of output, which should be adequate for this use case. Faster processors and better memory bandwidth will also help.
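As a rough sketch of what "changing the model painlessly" looks like in practice, here is how a non-streaming call to Ollama's `/api/generate` endpoint might be built using only the Python standard library. The host and port (Ollama's default, `localhost:11434`), the helper names, and the `num_ctx` value are assumptions for illustration, not settings from my deployment.

```python
import json
import urllib.request

# Assumption: Ollama is reachable on its default port on the local machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "openhermes:v2.5",
                           num_ctx: int = 8192) -> urllib.request.Request:
    """Build (but do not send) a non-streaming request to Ollama's generate endpoint."""
    payload = {
        "model": model,                   # swapping models is just a different string here
        "prompt": prompt,
        "stream": False,                  # ask for a single JSON object instead of a stream
        "options": {"num_ctx": num_ctx},  # context window size; 8192 is an illustrative value
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, **kwargs) -> str:
    """Send the request and return the generated text from the 'response' field."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Summarize: ...", model="some-other-model")` would be the whole migration, assuming the other model has already been pulled into Ollama.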
I plan to write an API to encapsulate common LLM transactions (such as summarize or search) using LangChain and FastAPI. Namely, I’ll implement some basic retrieval-augmented generation (RAG) use cases.
The summarize endpoint will take a URL, retrieve its contents, and use the Ollama endpoint to summarize it. Experimentally, openhermes:v2.5 does well enough with larger contexts, which you can set when you make the call to Ollama.
The search endpoint will take a search query, retrieve the top ten results using my local SearxNG instance (a meta-search engine that looks at numerous different search engines), and yield a summary of the snippets.
In time, I may extend or write a variant of the search endpoint that fans out to call the summarize endpoint on each search result for increased summary depth. I would then use a map-reduce pattern to summarize those summaries and return links that provided novel or important information to the summaries.
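The map-reduce idea can be sketched independently of any LLM library. In the minimal version below, `summarize` is a hypothetical callable standing in for whatever ends up calling the model (for example, the summarize endpoint), and `batch_size` is an assumed knob for how many summaries get merged per reduce step. It does not attempt the "which links were novel" part.

```python
from typing import Callable, List, Sequence


def map_reduce_summarize(documents: Sequence[str],
                         summarize: Callable[[str], str],
                         batch_size: int = 4) -> str:
    """Summarize each document, then merge the summaries in batches until one remains."""
    if batch_size < 2:
        # A batch of one would never shrink the list, so the reduce loop could not finish.
        raise ValueError("batch_size must be at least 2")

    # Map step: one summary per document (this is the fan-out).
    summaries: List[str] = [summarize(doc) for doc in documents]
    if not summaries:
        return ""

    # Reduce step: join neighbouring summaries and summarize the result,
    # repeating until a single summary is left.
    while len(summaries) > 1:
        summaries = [
            summarize("\n\n".join(summaries[i:i + batch_size]))
            for i in range(0, len(summaries), batch_size)
        ]
    return summaries[0]
```

Batching the reduce step keeps each call within the model's context window, at the cost of extra round trips when there are many results.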
I’ll probably use uvicorn as the ASGI server for this, since it’s the server the FastAPI documentation recommends, and deploy the API using docker-compose.
I’ll also need to figure out how to measure the quality of a given webpage. The reasoning here is that even high-quality sources sometimes produce low-quality content. The Verge, for instance, often has posts highlighting discounts that they monetize using affiliate links. I’m not philosophically opposed to this as a business practice, as long as editorial remains independent and the products that they highlight are actually of high quality. Slippery slope, though…
A few approaches I might try (likely, it’ll be more than one of these):
Blocklists from pihole and pfblocker-ng (which I use, personally) - if too many links are blockable, consider it low quality.
I have a non-federated Matrix homeserver that I use for consuming event streams. It runs dendrite, a Matrix homeserver implementation written in Go.
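Assuming the summaries end up being delivered as plain-text messages to a room on that homeserver, the Matrix client-server API call involved is small. The sketch below only builds the request. The homeserver URL, room ID, and access token are placeholders, the helper name is made up for illustration, and I'm assuming the homeserver accepts the `v3` client API paths.

```python
import json
import urllib.parse
import urllib.request
import uuid
from typing import Optional


def build_matrix_message(homeserver: str, room_id: str, access_token: str,
                         text: str, txn_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a PUT request that posts an m.text message to a room."""
    # The transaction ID lets the server de-duplicate retries of the same message.
    txn_id = txn_id or uuid.uuid4().hex
    path = "/_matrix/client/v3/rooms/{}/send/m.room.message/{}".format(
        urllib.parse.quote(room_id, safe=""),  # room IDs contain '!' and ':'
        urllib.parse.quote(txn_id, safe=""),
    )
    body = {"msgtype": "m.text", "body": text}
    return urllib.request.Request(
        homeserver.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
        method="PUT",
    )
```

Passing the resulting request to `urllib.request.urlopen` would send it; a bot framework or a Matrix SDK would be the more comfortable choice for anything beyond one-way delivery.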
<digression>
One of the things that I am lucky to be good at is one-shot information retention.
As a result, consuming information by reading/text chat has always worked very well for me, and I rarely find benefit in taking notes. I am very interested in exploring the contours of this - in particular, the nature of text chat as an event stream that is compatible with retention means it ought to be possible to accumulate a personal knowledge base.
I may experiment with some kind of indexed storage for chats that have been delivered - the Matrix homeserver database is not the ideal place to access this kind of stuff.
</digression>
One thing my homenet has never had is a centralized, standardized way to schedule and monitor workflows. I’ve never really had a pressing need, as my existing RSS workflows are done using RSS-specific applications - formerly freshrss and currently maubot-rss. I also have the ability to run workflows on a cron schedule (or webhook) using the drone ci platform.
However, I think I’d rather not use drone for this - I’m wary of abusing CI compute beyond its design scope (aka the jenkins-server-that-does-everything approach).
So, I think the two approaches I’d like to consider are…
I’m quite familiar with (and fond of) Dagster because I use it (specifically Dagster Cloud Serverless) at work, and I think that I could fit this workflow into Dagster concepts:
If we’re committed to having maximum fun, we could also release the Matrix resource to the Dagster community; surely someone out there has a need for it…
The other very common pattern for LLM applications is to use a no-code workflow management system such as Zapier to handle the business logic of collecting feed items. There are numerous “entrepreneurs” out there who will sell you a course on how to create and manage these automations. Sort of a shady business, but I can’t deny the concept is sound. It would require a platform that supports one or more of the following components:
Of the two options I’ve identified, ActivePieces is newer and pretty slick-looking, whereas n8n is “low-code automation for technical people” (it me), and seems to have a very strong and usable abstraction on top of ANY HTTP request.
First off, it’s become clear that I do NOT have a technology in mind for indexable storage. I should look into that. This TODO alone justifies writing this piece to me… I didn’t realize I didn’t know that.
Be it resolved - I’ll try a No-Code Framework first, and see how it goes. I am reasonably confident that I could make Dagster work.
I’m thinking n8n…