Co-op Cloud Spike

As part of the Multi-Site Wikipedia project, this spike is a technical experiment to see whether switching to Co-op Cloud will work for us.

Context

Opportunity

Unknowns

We have the following unknowns in using Co-op Cloud:

  1. The Co-op Cloud documentation says it does not support ARM processors (and thus the Raspberry Pi). In practice this is apparently an issue with individual app recipes rather than the platform itself, and I've been told some people are running it on ARM. Does this work? What difficulties will we encounter? Will we be able to tell which apps will work and which will not?
  2. If our LoRes Apps are Co-op Cloud recipes installed by abra, how do we provide additional capabilities, such as P2Panda comms?
  3. How do we incorporate Co-op Cloud into the LoRes Node design? Do we call abra from our user interface? Or does LoRes Node become an app installed by abra?

Experiments

  1. ✅ Install an existing Co-op Cloud app with abra
  2. ✅ Build a Kiwix recipe for Co-op Cloud
  3. ✅ Run the Co-op Cloud Kiwix recipe on a Raspberry Pi
  4. ✅ Deploy a dummy app that can list other installed apps

Experiment 1: Try using abra

Learned so far:

Overall, using abra was a very nice experience. The manual commands needed to go from a fresh VPS to a working abra setup are so few that they don't really justify automating them with Ansible or similar.

I'm leaning towards abra being run on the development machine, not on the server. That would mean Merri-bek Tech Node Stewards (Operators, in Co-op Cloud speak) could share a git repository and store the abra config there. I'd like to test this approach.
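For reference, the server-to-abra path is roughly the following handful of commands (a sketch from memory rather than the authoritative tutorial; the domains are placeholders and the Co-op Cloud operators documentation has the exact steps):

# On the fresh VPS: install Docker, start a single-node swarm, create the proxy network
curl -fsSL https://get.docker.com | bash
docker swarm init
docker network create -d overlay proxy

# On the development machine: install abra, register the server, then set up traefik
curl https://install.abra.coopcloud.tech | bash
abra server add swarm.example.com
abra app new traefik
abra app deploy traefik.swarm.example.com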

Experiment 2: Build Kiwix Recipe

This was pretty straightforward to build. I used abra locally, which seems essential for local development of a recipe, and it handled things well. The app is up and running (for the moment) at https://kiwix.spike2.merri-bek.tech/
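Besides the compose file (shown under Experiment 3), the recipe carries an .env.sample that abra copies into the app's env file when the app is created. For a recipe like this it is roughly the following (values are placeholders; DOMAIN and LETS_ENCRYPT_ENV are the variables the compose file references):

TYPE=kiwix
DOMAIN=kiwix.example.com
LETS_ENCRYPT_ENV=production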

Experiment 3: Run Kiwix on Raspberry Pi

There seems to be no fundamental obstacle to running Kiwix on a Raspberry Pi, provided you pick a container image built for the Pi. My recipe uses the image provided by Offline Internet: offlineinternet/kiwix.

The recipe I built is a draft at this stage. It has the following compose file:

---
services:
  kiwix-serve:
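    # image published by Offline Internet; it has ARM builds, so it runs on the Pi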
    image: offlineinternet/kiwix:latest
    networks:
      - proxy
    volumes:
      - kiwix_data:/data
    deploy:
      restart_policy:
        condition: on-failure
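      # Traefik reads these deploy labels to route ${DOMAIN} to the container and obtain a TLS certificate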
      labels:
        - "traefik.enable=true"
        - "traefik.http.services.${STACK_NAME}.loadbalancer.server.port=8080"
        - "traefik.http.routers.${STACK_NAME}.rule=Host(`${DOMAIN}`${EXTRA_DOMAINS})"
        - "traefik.http.routers.${STACK_NAME}.entrypoints=web-secure"
        - "traefik.http.routers.${STACK_NAME}.tls.certresolver=${LETS_ENCRYPT_ENV}"
        - "coop-cloud.${STACK_NAME}.version="
networks:
  proxy:
    external: true
volumes:
  kiwix_data:
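Deploying the draft recipe to the Pi then looks like any other app, once abra can see the recipe locally (indicative commands; the domain is a placeholder, and the ZIM files still need to be copied into the kiwix_data volume separately):

abra app new kiwix
abra app deploy kiwix.pi.example.com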

Here are some things I learned:

Experiment 4: Ensure that an app can get access to the running stacks

I conducted this by building a dummy recipe, which I called spikespy: its compose.yml connects a plain nginx container to a docker socket proxy.
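The compose.yml isn't reproduced here, but its shape is roughly the following (a sketch only: the tecnativa/docker-socket-proxy image and its CONTAINERS/EXEC settings are assumptions, chosen to match the behaviour described below):

---
services:
  app:
    image: nginx:latest
    networks:
      - proxy
      - socket
  socket-proxy:
    image: tecnativa/docker-socket-proxy:latest
    environment:
      - CONTAINERS=1 # allow listing containers, so docker ps works
      - EXEC=0       # forbid docker exec, matching the forbidden error below
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - socket
networks:
  proxy:
    external: true
  socket: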

When I log into the debian/nginx container that is connected to the socket proxy, I can apt update && apt install docker-cli. The docker CLI is then available, and I can use it without the normal docker socket mounted, by specifying the host.

For example, it works if I do:

docker -H tcp://socket-proxy:2375 ps

But the proxy is configured to not allow docker exec, so if I do:

docker -H tcp://socket-proxy:2375 exec -it effe902f3e66 sh

I get a forbidden error.

Notes for Co-op Cloud

  1. When using abra on the server (obviously not the golden path, so this may be unimportant), the default server is not given a domain (it's just default), so each app needs to have that domain entered on creation; otherwise it'll use traefik.default as the domain. This is fine but a little awkward. It'd be nice if the server domain were set in one place.
  2. The New Operators tutorial is light on how to provision servers, which is fine, but I think it should give some advice on whether to SSH in as root or as a non-root user. I'm inferring that the usual approach is a non-root user, maybe with sudo access, though abra doesn't use that access anyway. Does this user need sudo? Do people often use one user per operator (for traceability), or a single abra user?
  3. The New Operators tutorial could perhaps help people through edge cases of server setup a bit more. For example, a fresh Ubuntu Noble VPS on DigitalOcean hits a problem with swarm init: it detects two IP addresses and requires you to pass --advertise-addr (see the example after this list).
  4. The New Operators tutorial suggests the command groups | grep docker is there to check whether the docker group exists. Isn't that command actually checking whether the current user is already in the docker group? Especially in the context of the next command, groupadd docker, which then invariably reports that the group already exists.
  5. The reason I first tried using abra on the server was actually just that the documentation for installing abra is a bit subtle about switching to your own machine, after all the previous sections had been running commands on the server. It is written there, in the first sentence ("Now we can install abra locally on your machine"), but I'd make it stronger. Something like: "Let's now switch to your development machine for the next step. You should have a computer with these dependencies installed: xxx. If you don't, there's another option to run abra on the server, which is covered in the handbook here…"
  6. It might be nice if the documentation mentioned that on deploying a new app, you'll need to refresh the page a few times to cause the SSL certificate to get fetched.
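For the DigitalOcean case in note 3, the fix is to tell swarm which address to advertise, for example:

docker swarm init --advertise-addr <public-ip-of-the-server>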