Keeping your software up-to-date is of key importance when it comes to security and efficiency. This is especially true in the case of software exposed to Internet, and that is the most common scenario of NextCloud deployments. In this post, we show how to upgrade the NextCloud platform in a secure way with the minimum downtime.
This writing was inspired by one of our clients, who, after a security audit, discovered that he was using NextCloud version 17 (we are in 2021). “Why to touch it when it works” – that was his answer! But you only need to look at the release plan of the NextCloud versions to find out that version 17 has been unsupported for a long time, and therefore, in particular, no security vulnerability fixes are issued for it.
So, we have covered this case to help others looking to upgrade their dockerized NextCloud easily and safely.
Rule no.1 – do not test in production!
Some people prefer to update the production environment ad-hoc, and the deployments at a cloud service provider are great for it – just take a snapshot of the NextCloud hosting machine, perform the update process, and when it fails, restore that snapshot. This approach is also acceptable, but it is worth remembering to switch the entire NextCloud environment to maintenance mode after taking a snapshot. The point is that in the course of quite lengthy upgrading works, users do not perform operations that would be lost in case of need of restoring.
However, we prefer the second approach – that is, to clone the production environment and perform the upgrade test on it. This allows reducing the time the service is unavailable (downtime) to end-users.
So we start out exactly the same – by taking a snapshot, but this time we’re recreating it as a clone of the original machine.
In the second step, it will be easiest to configure the firewall of that cloned, test machine (preferably a cloud-provided firewall, not an operating system one) that only allows network traffic from and to the host from which we will carry out the upgrade process (at least SSH and HTTP ports).
We need to know that updating the remote version is basically replacing the version of the NextCloud docker image used by our environment. We, in this case, use a docker-compose ecosystem, so the section describing the NextCloud App container is looking like below:
app: image: nextcloud:17.0-fpm-alpine restart: always volumes: - nextcloud:/var/www/html environment: - POSTGRES_HOST=db - POSTGRES_DB=nc - REDIS_HOST=redis env_file: - db.env depends_on: - db - redis
However, when upgrading dockerized NextCloud with several major versions (e.g. from 17 to 22), it is important to perform a sequential update going through each major revision, because this is how the philosophy of approach to database (PostgreSQL) structure migration has been implemented by NextCloud developers. So in our case, the subsequent values of the image attribute would have to be: nextcloud: 17-fpm-alpine -> nextcloud: 18-fpm-alpine -> nextcloud: 19-fpm-alpine -> nextcloud: 20-fpm-alpine -> nextcloud: 21-fpm-alpine -> nextcloud: 22-fpm-alpine
So now, let us start with upgrading – we change nextcloud: 17-fpm-alpine -> nextcloud: 18-fpm-alpine, and issue docker-compose up to process changes. We can observe in the log file that NextCloud is switched into maintenance mode, and then the database migration is conducted. Below the small sample of a log is given:
app_1 | Nextcloud or one of the apps require upgrade - only a limited number of commands are available app_1 | You may use your browser or the occ upgrade command to do the upgrade app_1 | Setting log level to debug app_1 | Turned on maintenance mode app_1 | Updating database schema app_1 | Updated database app_1 | Updating <federation> ... app_1 | Updated <federation> to 1.10.1 app_1 | Updating <lookup_server_connector> ... app_1 | Updated <lookup_server_connector> to 1.8.0 app_1 | Updating <oauth2> ... app_1 | Updated <oauth2> to 1.8.0 app_1 | Updating <password_policy> ... app_1 | Updated <password_policy> to 1.10.1 app_1 | Updating <files> ... app_1 | Updated <files> to 1.15.0 app_1 | Updating <activity> ... app_1 | Updated <activity> to 2.13.4 app_1 | Updating <cloud_federation_api> ... app_1 | Updated <cloud_federation_api> to 1.3.0 app_1 | Updating <dav> ... app_1 | Fix broken values of calendar objects app_1 | app_1 | Starting ... app_1 | 0/0 [>---------------------------] 0% app_1 | Updated <dav> to 1.16.2 app_1 | Updating <files_external> ...
After migrating to the latest version we have to do some maintenance activity manually.
Now, let’s reconfigure our NextCloud clone so that it is accessible from a web browser because so far, it seems to him that it is available at the domain address, where our production server is located. So let’s make it available from a web browser under the IP address (not a domain name) and accessible via the HTTP protocol (i.e. not secured by a certificate), because right after setting up the clone you get the
HTTP 503 Service Temporarily Unavailable
First, we have to adjust the “web” section of the docker-compose.yml configuration file – more precisely the VIRTUAL_HOST key to point to our new IP address:
web: build: ./web restart: always volumes: - nextcloud:/var/www/html:ro environment: - VIRTUAL_HOST=188.8.131.52 - LETSENCRYPT_HOST=... - LETSENCRYPT_EMAIL=... depends_on: - app networks: - proxy-tier - default
After rereading the new configuration (e.g.
docker-compose up command) the logo of NextCloud is displayed in the web browser but followed by the message:
Access through untrusted domain Please contact your administrator. If you are an administrator, edit the "trusted_domains" setting in config/config.php like the example in config.sample.php. Further information how to configure this can be found in the documentation.
All we have to do is follow this command, but the configuration file indicated in the message is part of the docker container – so we need to log in to it:
sudo docker exec -it nexcloud_app_1 sh
and, when we get into that container:
apk add nano nano config/config.php
In the nano editor, we have to replace the old value of ‘trusted_domains’ entry with the new IP address, ‘overwrite.cli.url’ with the new IP address prefixed with HTTP://, and ‘overwriteprotocol’ with ‘http’ string.
Now, we can log in as an administrator into the web console, and in the menu administration->overview, we see some warnings about missing database indexes, primary keys, optional columns, etc. This is because the default upgrade is being processed in a way to be as fast as possible, leaving some steps to be done in a more controlled way.
Being still logged to the container we issue the following commands:
apk add su-exec su-exec www-data ./occ db:add-missing-indices su-exec www-data ./occ db:add-missing-primary-keys su-exec www-data ./occ db:add-missing-columns su-exec www-data ./occ db:convert-filecache-bigint
That is all! Now, you are ready to roll forward your upgrade on the production environment, being sure that everything will go right!
Another view angles
- There are several variations to a containerized NextCloud deployment. In this post, we show how to upgrade NextCloud implemented using the method described in the separate article published on our blog.
- Note that professional NextCloud implementations do not make it a lonely island – usually the whole system is a distributed system, at least by connecting external storage. Take this into account, lest you be surprised if you still see side effects after restoring your NextCloud host from a snapshot.
- In the above article, in order to verify the correctness of the upgrade, we recommend logging into the administrator’s account on the web console. Note that we use an unencrypted http connection there, and the password for the cloned environment is the same as it was on the production one. So you should either change the password before executing the clone, or use the command-line tool to change the password.