Mirroring Wikipedia

Getting a snapshot

The simplest way to mirror Wikipedia is to use the Kiwix tools with a ZIM-format snapshot.

Downloading one of these snapshots is fairly simple: the Kiwix site maintains pages with links to ZIM files for various facets of Wikipedia. They're broken down by language (e.g. en in the filename), by topic (e.g. Chemistry) and by other variations (options like mini, maxi and nopic let us further control how much data we're willing to download).

So to download that Chemistry snapshot, we could use a command like this:

curl -L https://download.kiwix.org/zim/wikipedia/wikipedia_en_chemistry_mini_2024-04.zim -o wikipedia_en_chemistry_mini_2024-04.zim

(The '-L' flag tells curl to follow redirects, which the Kiwix download links use, and '-o' lets us specify the filename for the downloaded file.)
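
As a rough sketch of how the filename parts and the curl flags above fit together, the helpers below build a download URL and fetch it. The function names zim_url and fetch_zim are hypothetical, and the wikipedia_<lang>_<topic>_<flavour>_<date>.zim pattern is only inferred from the example filename, so some snapshots may not follow it. The extra '-f' and '-C -' curl options are additions not used in the command above.

```shell
#!/bin/sh
# Hypothetical helper: assemble a Kiwix download URL from the filename parts
# (language, topic, flavour, date). Assumes the pattern seen in the example
# filename; check the Kiwix download pages for the real name.
zim_url() {
  printf 'https://download.kiwix.org/zim/wikipedia/wikipedia_%s_%s_%s_%s.zim\n' \
    "$1" "$2" "$3" "$4"
}

# Hypothetical helper: download a snapshot into the current directory.
#   -f    fail on HTTP errors instead of saving an error page as the .zim
#   -L    follow redirects
#   -C -  continue a partial download if one is already on disk
#   -o    write to the named file (here, the last component of the URL)
fetch_zim() {
  url=$1
  curl -f -L -C - -o "$(basename "$url")" "$url"
}

# Usage (left commented out, since snapshots can be large):
#   fetch_zim "$(zim_url en chemistry mini 2024-04)"
```

Resuming with '-C -' is mostly useful for the larger snapshots, where an interrupted transfer would otherwise have to start again from the beginning.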

Serving the snapshot

On a Debian-based system (like Ubuntu or Raspberry Pi OS) you can install the Kiwix Tools (including their server) like this:

sudo apt-get install kiwix-tools

…and then run it with our snapshot like this:

kiwix-serve -p 8080 wikipedia_en_chemistry_mini_2024-04.zim

…which will print out a URL (something like http://192.168.1.223:8080/) where we can view our site.

(The '-p' flag sets the port our site is served on. If we omit it, kiwix-serve will try to use the normal HTTP port, 80, which will fail unless we also run the command as root, for example by prepending sudo to it.)
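
The port behaviour described above can be wrapped in a small script. This is only a sketch: needs_root and serve_zim are hypothetical names, 8080 is simply the port used in the example, and the 1024 cutoff assumes the default Linux setting for privileged ports.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the port is in the privileged range.
# Assumes the Linux default, where ports below 1024 need root to bind.
needs_root() {
  [ "$1" -lt 1024 ]
}

# Hypothetical wrapper: serve a ZIM file, defaulting to port 8080 and
# refusing early when a privileged port is requested without root.
serve_zim() {
  zim=$1
  port=${2:-8080}
  if needs_root "$port" && [ "$(id -u)" -ne 0 ]; then
    echo "port $port is privileged; use sudo or pick a port >= 1024" >&2
    return 1
  fi
  kiwix-serve -p "$port" "$zim"
}

# Usage:
#   serve_zim wikipedia_en_chemistry_mini_2024-04.zim        # port 8080
#   serve_zim wikipedia_en_chemistry_mini_2024-04.zim 9000   # custom port
```

Defaulting to an unprivileged port keeps the server out of root's hands, which is usually preferable to running it with sudo just to get port 80.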
