Site down after updating to v4.0.12

Mr_Anonymous · April 10, 2019, 9:13pm

After running ee cli update to upgrade from version 4.0.10 to 4.0.12 the site is no longer accessible. No response from the server to either the url https://dev.brightray.com/ or the direct IP address. Both the URL and IP address respond to ping but the web service doesn’t seem to be responding.

There were no visible errors during the upgrade; it reported that the upgrade was successful. After a reboot, the server came back up without issue, but there is still no response to web requests. I can telnet into the system, and I have tried restarting the services and watching the logs. No messages show up in the logs when I try to access either the site or the direct IP address.
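For anyone wanting to confirm the same symptom (host answers ping, web ports do not answer), here is a minimal sketch. The function names probe_http and port_is_listening are made up for illustration, and it assumes curl, ss, and awk are installed on the machine.

```shell
#!/usr/bin/env bash

# Print the HTTP status code for a URL, or "no-response" if nothing answers.
probe_http() {
  local url="$1" code
  # -k skips certificate checks; --max-time stops curl hanging on a dead port.
  code=$(curl -k -s -o /dev/null --max-time 5 -w '%{http_code}' "$url") || code=000
  if [ "$code" = "000" ]; then
    echo "no-response"
  else
    echo "$code"
  fi
}

# Succeed if some TCP socket is listening on the given port.
port_is_listening() {
  local port="$1"
  # Column 4 of `ss -tln` is the local address and port.
  ss -tln | awk -v p=":$port" '$4 ~ p"$" {found=1} END {exit !found}'
}
```

Calling probe_http https://dev.brightray.com/ from another machine and port_is_listening 80 (or 443) on the server itself separates "nothing is bound to the port" from "something is bound but the traffic is blocked or dropped".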

System Information

ee cli info

+-------------------+----------------------------------------------------------+
| OS                | Linux 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44: |
|                   | 52 UTC 2019 x86_64                                       |
| Shell             | /bin/bash                                                |
| PHP binary        | /usr/bin/php7.2                                          |
| PHP version       | 7.2.17-1+ubuntu18.04.1+deb.sury.org+3                    |
| php.ini used      | /etc/php/7.2/cli/php.ini                                 |
| EE root dir       | phar://ee.phar                                           |
| EE vendor dir     | phar://ee.phar/vendor                                    |
| EE phar path      | /root                                                    |
| EE packages dir   |                                                          |
| EE global config  | /opt/easyengine/config/config.yml                        |
| EE project config |                                                          |
| EE version        | 4.0.12                                                   |
+-------------------+----------------------------------------------------------+

lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.2 LTS
Release:        18.04
Codename:       bionic

docker version

Client:
 Version:           18.09.4
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        d14af54266
 Built:             Wed Mar 27 18:35:44 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.4
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       d14af54
  Built:            Wed Mar 27 18:01:48 2019
  OS/Arch:          linux/amd64
  Experimental:     false

docker-compose version

docker-compose version 1.23.2, build 1110ad01
docker-py version: 3.6.0
CPython version: 3.6.7
OpenSSL version: OpenSSL 1.1.0f  25 May 2017

madbradjohnson · April 11, 2019, 10:51am

You may need to restart your Docker containers.

Mr_Anonymous · April 11, 2019, 12:58pm

I mentioned in the original post that I already tried that.

madbradjohnson · April 11, 2019, 1:21pm

No, you said you rebooted the server and restarted the services. Have you tried restarting just the nginx proxy or just the site?

ee site restart example.com

ee service restart nginx-proxy

Mr_Anonymous · April 11, 2019, 1:27pm

I have also attempted it again without success.

madbradjohnson · April 11, 2019, 1:36pm

Well, that truly sucks ass… It's one of the original reasons why I did not want to move over to EE v4, as I had a similar issue. I ended up staying on v3. Now I am on v4, but I will not restart my servers just yet.

Are you able to create a new site and gain access to that?

Try disabling UFW or any other firewalls and see if that helps.

Also try the ee site down command and its opposite, ee site up.

You can also try this, quoted from another thread: "@michacassola Can you try ee service enable db --force and check the output of docker ps -a and check if any of the site or service containers are restarting or exited?"
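As a rough sketch of that last check, docker ps can filter for exactly those two states so they are not lost among the healthy containers. The function name unhealthy_containers is just an illustrative label.

```shell
#!/usr/bin/env bash

# Print the name and status of every container that is restarting or has exited.
# Repeating --filter with the same key matches either value.
unhealthy_containers() {
  docker ps -a \
    --filter status=restarting \
    --filter status=exited \
    --format '{{.Names}}\t{{.Status}}'
}
```

If unhealthy_containers prints nothing, every container is in some other state (usually running), and the problem is more likely in the configuration or the firewall.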

There is an existing GitHub issue here: https://github.com/EasyEngine/easyengine/issues/1251

And this problem is mentioned here on the forum too: EasyEngine services not running on server reboot

Mr_Anonymous · April 11, 2019, 1:47pm

Tried all of the info you gave me without success. I am reading over the GitHub issue to see if I can find a fix there. Thank you for all your help. Hopefully, I can figure this out. Thankfully I did it on the dev site and not the live one. lol

Like I said in the original post it doesn’t even look like I can access the base nginx placeholder. Normally if it is an issue with the DB or the site I can still see the 503 message when attempting to access the server directly via the IP address but I don’t even get that as a response.

madbradjohnson · April 11, 2019, 1:48pm

Shit man, really sorry about this. Sorry I cannot be of more help at this point in time! But good luck and I will follow this thread for further issues and fixes.

lotusjeff · April 11, 2019, 1:49pm

Here is my quick and dirty V4 checklist for problems.

Are the dockers all running?

docker ps

For a single site, there should be 7 running dockers.

Multiple sites will have additional site-specific dockers.

Are all the docker networks running?

docker network ls

There should be 5.

Are the docker networks set up correctly?

docker network inspect <network name>

What you are looking for here is the correct containers assigned to the various networks. Below is a network diagram of the ee setup. It shows the containers and the docker networks. Make sure each container is set up correctly in each network.

Once you have validated this info, then you have to start checking in each container for the flow of traffic.
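To avoid inspecting the networks one at a time, the network checks can be sketched as a single loop. This is only an illustration: network_members is a made-up function name, and a container that is stopped or stuck restarting may not be listed as attached at the moment you run it.

```shell
#!/usr/bin/env bash

# For each Docker network, print its name followed by the containers attached to it.
network_members() {
  local net
  docker network ls --format '{{.Name}}' | while read -r net; do
    printf '%s: ' "$net"
    # .Containers maps container IDs to endpoint details; only the names are printed.
    docker network inspect -f '{{range .Containers}}{{.Name}} {{end}}' "$net"
  done
}
```

Running network_members prints one line per network, which can then be compared against the containers you expect on each one.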

Mr_Anonymous · April 11, 2019, 1:57pm

This is my output from running

docker ps

CONTAINER ID        IMAGE                           COMMAND                  CREATED             STATUS                          PORTS               NAMES
28bd738086fa        easyengine/postfix:v4.0.0       "postfix start-fg"       10 minutes ago      Up 10 minutes                   25/tcp              devbrightraycom_postfix_1
1b4090158073        easyengine/nginx:v4.0.0         "/usr/bin/openresty …"   10 minutes ago      Up 10 minutes                   80/tcp              devbrightraycom_nginx_1
d3e236452b6a        easyengine/php:v4.0.2           "docker-entrypoint.s…"   10 minutes ago      Up 10 minutes                   9000/tcp            devbrightraycom_php_1
1247db8ddd7e        easyengine/nginx-proxy:v4.0.2   "/app/docker-entrypo…"   17 hours ago        Restarting (0) 21 seconds ago                       services_global-nginx-proxy_1
89164495b3d1        easyengine/cron:v4.0.0          "/usr/bin/ofelia dae…"   4 weeks ago         Up About an hour                                    ee-cron-scheduler
2d294b8f854c        easyengine/mariadb:v4.0.0       "docker-entrypoint.s…"   4 weeks ago         Up About an hour                3306/tcp            services_global-db_1
978426dcaf17        easyengine/redis:v4.0.0         "docker-entrypoint.s…"   4 weeks ago         Up About an hour                6379/tcp            services_global-redis_1

My nginx-proxy seems to be restarting outside of my force restarts and it seems to lack any port information. I am assuming that could be the problem. Now to figure out how to troubleshoot nginx-proxy.

These are the results from

docker network ls

NETWORK ID          NAME                         DRIVER              SCOPE
29c594ba9ff0        bridge                       bridge              local
36d6b4abac34        dev.brightray.com            bridge              local
2d271239c2a1        ee-global-backend-network    bridge              local
bd02fc5c1efc        ee-global-frontend-network   bridge              local
0b0fe6efc4c5        host                         host                local
dc2e556e4518        none                         null                local

I am not sure what I am looking for when I try to inspect the different networks nor how to check the different containers for a flow of traffic.

madbradjohnson · April 11, 2019, 2:09pm

Maybe stop all dockers. Then start only the nginx proxy and bring the rest up. Maybe there is a specific order in which to do things. It is quite possible.

Mr_Anonymous · April 11, 2019, 4:10pm

Anyone know how I might see the reasons why the easyengine/nginx-proxy:v4.0.2 container keeps restarting?

lotusjeff · April 12, 2019, 3:38am

You will need to run the following command from the shell

docker logs --tail=10 -f services_global-nginx-proxy_1

This will show the logs associated with the nginx proxy. Change the tail number to increase the number of lines shown. This will help you debug why the nginx-proxy is restarting. Most likely it is a bad config file.

This will keep showing you the log files for the service until you enter a Ctrl+c.

The ee log show --global command does not show logs for any of the service-related docker containers. I have entered an issue for this. https://github.com/EasyEngine/log-command/issues/5
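Besides tailing the log, docker inspect can report how often the container has restarted and with which exit code. A minimal sketch follows; the function names restart_summary and recent_errors are made up for illustration, and the 10-minute window and the emerg/error pattern are arbitrary choices.

```shell
#!/usr/bin/env bash

# Print restart count, last exit code, and current state for one container.
restart_summary() {
  local name="$1"
  docker inspect \
    -f 'restarts={{.RestartCount}} exit={{.State.ExitCode}} status={{.State.Status}}' \
    "$name"
}

# Print only the recent log lines that look like errors, with timestamps.
recent_errors() {
  local name="$1"
  # docker logs replays the container's stderr on stderr, so merge both streams.
  docker logs --since 10m --timestamps "$name" 2>&1 | grep -iE 'emerg|error' || true
}
```

For example: restart_summary services_global-nginx-proxy_1, followed by recent_errors services_global-nginx-proxy_1.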

Mr_Anonymous · April 12, 2019, 4:22am

Thank you for the info. Below is what I got from the log.

forego     | sending SIGTERM to nginx.1
forego     | sending SIGTERM to dockergen.1
Custom dhparam.pem file found, generation skipped
forego     | starting dockergen.1 on port 5000
forego     | starting nginx.1 on port 5100
nginx.1    | 2019/04/12 04:20:55 [emerg] 22#22: PEM_read_bio_DHparams("/etc/nginx/dhparam/dhparam.pem") failed (SSL: error:0906D06C:PEM routines:PEM_read_bio:no start line:Expecting: DH PARAMETERS)
nginx.1    | nginx: [emerg] PEM_read_bio_DHparams("/etc/nginx/dhparam/dhparam.pem") failed (SSL: error:0906D06C:PEM routines:PEM_read_bio:no start line:Expecting: DH PARAMETERS)
forego     | starting nginx.1 on port 5200
forego     | sending SIGTERM to dockergen.1
forego     | sending SIGTERM to nginx.1
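The "no start line: Expecting: DH PARAMETERS" message suggests that the file does not begin with the PEM header nginx expects. A small sketch for checking that on a copy of the file is below. The function name has_dh_header is made up for illustration, and the sketch only looks at the first line, so it does not prove the parameters themselves are valid.

```shell
#!/usr/bin/env bash

# Succeed if the first line of the file is the PEM header for DH parameters.
# An empty or truncated file fails this check.
has_dh_header() {
  head -n 1 "$1" | grep -q -- '-----BEGIN DH PARAMETERS-----'
}
```

One way to get a copy is docker cp services_global-nginx-proxy_1:/etc/nginx/dhparam/dhparam.pem ./dhparam.pem, which works even while the container is not running, followed by has_dh_header ./dhparam.pem. If the header is present, openssl dhparam -in ./dhparam.pem -check -noout gives a fuller check. Where the file lives on the host in an EasyEngine install is not something this thread establishes.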

Mr_Anonymous · April 12, 2019, 3:06pm

I was looking online for the error I posted above and came across an issue posted to the nginx-proxy repository: "Issue with recent container update and SSL" (#1226). It looks like the issue still isn’t resolved and I am not sure how to get around it. I recommend avoiding EE v4.0.12 until they resolve the issue.

lotusjeff · April 12, 2019, 3:27pm

To do a temp fix, you will need to locate the following file:

dockerfiles/nginx-proxy/Dockerfile

The first line reads:

FROM jwilder/nginx-proxy:latest

This needs to be replaced with:

FROM jwilder/nginx-proxy:0.5.0

This will revert to the older proxy version and remove the error. I do not know where this file is located within the EasyEngine folder structure. I have looked for it but cannot find it. Any help would be appreciated. Here is the source code from GitHub:

Mr_Anonymous · April 12, 2019, 7:40pm

Anyone have any idea where this file might be located in the EE structure?
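A filesystem search is one way to look for it. The sketch below assumes the EasyEngine files live under /opt/easyengine (the directory of the global config shown in the ee cli info output above); find_proxy_dockerfile is a made-up function name. If the proxy image is pulled prebuilt rather than built on the server, the search may simply return nothing.

```shell
#!/usr/bin/env bash

# Search a directory tree for a Dockerfile whose path mentions nginx-proxy.
# The default root is an assumption; pass another directory (or /) to widen the search.
find_proxy_dockerfile() {
  local root="${1:-/opt/easyengine}"
  find "$root" -type f -name Dockerfile -path '*nginx-proxy*' 2>/dev/null
}
```

Run find_proxy_dockerfile for the default location, or find_proxy_dockerfile / to search the whole system.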

Paleo · April 25, 2019, 5:19am

For me, this worked:

wget -qO ee rt.cx/ee4 && sudo bash ee
sudo ee site restart example.com
sudo ee service restart nginx-proxy

Mr_Anonymous · April 25, 2019, 1:46pm

Why did you run the install command again?

Paleo · April 26, 2019, 3:12pm

As it wasn't working, I tried reinstalling it, hoping things would go fine.