====== Mirroring Wikipedia ======

The simplest way to mirror Wikipedia is to use the [[https://kiwix.org/|Kiwix tools]] and a [[https://en.wikipedia.org/wiki/ZIM_(file_format)|ZIM]] format snapshot.

===== Getting a snapshot =====

Downloading one of these snapshots is fairly simple: the Kiwix site [[https://download.kiwix.org/zim/wikipedia/|maintains pages with links]] to ''zim'' files for various facets of Wikipedia. They're broken down by language (''en'' in the filename), by topic (e.g. [[https://download.kiwix.org/zim/wikipedia/wikipedia_en_chemistry_mini_2024-04.zim|Chemistry]]) and by other variations (options like ''mini'', ''maxi'' and ''nopic'' let us further control how much data we're willing to download).

So to download that Chemistry snapshot, we could use a command like this:

''curl -L https://download.kiwix.org/zim/wikipedia/wikipedia_en_chemistry_mini_2024-04.zim -o wikipedia_en_chemistry_mini_2024-04.zim''

(The ''-L'' flag tells curl to follow redirects, which the Kiwix links are, and the ''-o'' flag lets us specify the filename for the downloaded file.)

===== Serving the snapshot =====

On a Debian-based system (like Ubuntu or Raspberry Pi OS) you can install the Kiwix tools (including their server) like this:

''sudo apt-get install kiwix-tools''

...and then run the serving app with our snapshot like this:

''kiwix-serve -p 8080 wikipedia_en_chemistry_mini_2024-04.zim''

...which will print out a URL (something like ''http://192.168.1.223:8080/'') where we can view our site.

(The ''-p'' flag lets us specify the port our site is served on. If we omit this option, ''kiwix-serve'' will try to run on the normal HTTP port (80), which will fail unless we also run the command as root by prepending ''sudo'' to it.)
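
The download step above can be wrapped in a small shell helper, so that choosing a different language, topic or variation doesn't mean retyping the whole URL. This is only a sketch. The function names (''zim_filename'', ''zim_url'', ''fetch_zim'') are made up for illustration, and the filename pattern ''wikipedia_<language>_<topic>_<variation>_<date>.zim'' is an assumption inferred from the single Chemistry example, so check the download page for the exact name of the snapshot you want. The ''-f'' flag is an addition to the command shown earlier: it makes curl report a failure on an HTTP error instead of saving the error page.

```shell
#!/bin/sh
# Directory listing mentioned in the text above.
ZIM_BASE_URL="https://download.kiwix.org/zim/wikipedia"

# Build a snapshot filename from language, topic, variation and date.
# ASSUMPTION: the naming pattern is inferred from
# wikipedia_en_chemistry_mini_2024-04.zim; other snapshots may differ.
zim_filename() {
    printf 'wikipedia_%s_%s_%s_%s.zim\n' "$1" "$2" "$3" "$4"
}

# Build the full download URL for the same four arguments.
zim_url() {
    printf '%s/%s\n' "$ZIM_BASE_URL" "$(zim_filename "$@")"
}

# Download the snapshot into the current directory.
# -f: fail on HTTP errors, -L: follow redirects, -o: output filename.
fetch_zim() {
    file=$(zim_filename "$@")
    curl -fL "$(zim_url "$@")" -o "$file"
}
```

After sourcing the script, ''fetch_zim en chemistry mini 2024-04'' should fetch the same file as the curl command shown earlier, and ''kiwix-serve -p 8080 "$(zim_filename en chemistry mini 2024-04)"'' would then serve it.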