Just sharing my experiences hoping you gain more insight.
My hosting environment: 100% Digital Ocean (different sized instances for different setups).
Droplet 1: $5 (smallest instance) - 5 EE sites, all low traffic
Droplet 2: $5 (smallest instance) - 1 EE site, medium traffic
Droplet 3: $20 (medium instance) - 3 EE sites, all heavy traffic
Droplet 1 NEVER reboots properly. Any reboot results in sites that do not come back online and containers that do not start properly. Droplet 2 ALWAYS reboots properly, every single time. Droplet 3 ALWAYS reboots properly, every single time.
Looking at the logs for Droplet 1, it quickly becomes apparent that the proper startup order is not achieved: sites often attempt to run before either the global database or the nginx-proxy has been properly initialized. Once that happens, the proper labels are never added to the conf files, and no amount of restarting the sites or the services will fix the issue. I have also seen log entries where the database simply runs out of memory on those smallest droplets and never recovers, but that case mostly just returns a database connection error.
Droplets 2 and 3 always seem to restart in the proper order. I assume that on Droplet 2, with only a single site, there is no stack of commands running and everything comes up in an orderly fashion. I also assume that Droplet 3, having considerably more RAM and CPU, simply starts everything fast enough to keep up with site loading.
My resolution, which works every time:
First, disable each site (run once per site):
sudo ee site disable sitename.com
Then cd to the services directory, /opt/easyengine/services, and restart the services:
docker-compose down && docker-compose up -d
Finally, enable each site again (run once per site):
sudo ee site enable sitename.com
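For anyone who wants to script this, here is a rough sketch of the same sequence as a shell function. It only uses the commands shown above. The function name restart_stack, the run helper, and the DRY_RUN and SERVICES_DIR variables are my own inventions for illustration, not anything EasyEngine provides. Site names are passed as arguments rather than discovered automatically, so check the list yourself before running it.

```shell
#!/usr/bin/env bash
# Sketch only: disable sites, restart the services stack, re-enable sites.
# restart_stack, run, DRY_RUN and SERVICES_DIR are hypothetical names.
set -eu

# Services directory from the post; override via the environment if yours differs.
SERVICES_DIR="${SERVICES_DIR:-/opt/easyengine/services}"
# Set DRY_RUN=1 to print the commands instead of executing them.
DRY_RUN="${DRY_RUN:-0}"

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Usage: restart_stack site1.com site2.com ...
restart_stack() {
  local site

  # 1. Disable every site first.
  for site in "$@"; do
    run sudo ee site disable "$site"
  done

  # 2. Restart the services from the services directory.
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ cd %s\n' "$SERVICES_DIR"
  else
    cd "$SERVICES_DIR"
  fi
  run docker-compose down
  run docker-compose up -d

  # 3. Enable every site again, in the same order.
  for site in "$@"; do
    run sudo ee site enable "$site"
  done
}

# Only act when site names were given on the command line.
if [ "$#" -gt 0 ]; then
  restart_stack "$@"
fi
```

Saved as, say, restart-stack.sh (a made-up file name), a trial run would be DRY_RUN=1 ./restart-stack.sh site1.com site2.com, which only prints the commands in order. Drop DRY_RUN=1 to actually execute them. Because of set -eu, the script stops at the first failing command instead of re-enabling sites on top of a stack that did not come up.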
I have read of users who repeatedly reboot their servers until they get lucky and everything comes back in the proper order and starts. That seems a bit like pulling the lever on a slot machine and hoping for triple 7’s. Finding a programmatic way is always more efficient.
I have been able to duplicate this problem on Linode and Vultr accounts using their smallest instances and the same 5 sites as on my Droplet 1. This rules it out as a Digital Ocean-specific problem. I conclude that this is a resource issue and that the MySQL server is probably the culprit, being the one service that uses the most resources on any server.
I have also noted that there are existing GitHub issues addressing both the configuration of MySQL on small-resource servers and ensuring that the depends_on directives are properly followed during startup. I know docker-compose well enough to tinker with different settings, but the reality is that I only reboot the server every few months at most, so I just use the resolution explained above to get sites back up and running.
Perhaps the answer is a feature request for something like sudo ee site reinitialize (or similar nomenclature), which would verify that all the proxy and database conf files match the containers and restart everything if needed?