Hitting a limit with the tuning, or am I?


#1

So far I have followed a lot of your tutorials and ended up with:

  • MySQL - tuned
  • nginx + php5-fpm - tuned, using the ondemand process manager + APC + WP object caching to APC
  • APC + fastcgi caching to tmpfs (a rough sketch of that part of the config is below)
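For context, the fastcgi caching part of my nginx config looks roughly like this (a minimal sketch, not my exact config; the zone name, sizes and the /var/run/nginx-cache path are placeholders, assuming that path sits on a tmpfs mount):

    # cache zone on a tmpfs-backed path; names and sizes are placeholders
    fastcgi_cache_path /var/run/nginx-cache levels=1:2 keys_zone=WORDPRESS:100m inactive=60m;
    fastcgi_cache_key "$scheme$request_method$host$request_uri";

    server {
        server_name pacura.ru;
        root /var/www/pacura.ru/web;

        location ~ \.php$ {
            fastcgi_pass unix:/var/run/php5-fpm.sock;   # assuming the default Debian socket path
            include fastcgi_params;
            fastcgi_cache WORDPRESS;
            fastcgi_cache_valid 200 60m;
            add_header X-Cache $upstream_cache_status;
        }
    }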

So, to admire my work, I tried running an ab test from another server:

ab -n 2000 -c 200 http://pacura.ru/  
This is ApacheBench, Version 2.3   
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/  
Licensed to The Apache Software Foundation, http://www.apache.org/  

Benchmarking pacura.ru (be patient)  
Completed 200 requests  
Completed 400 requests  
Completed 600 requests  
Completed 800 requests  
Completed 1000 requests  
**apr_socket_recv: Connection reset by peer (104)**  
Total of 1087 requests completed  

and it ends with: apr_socket_recv: Connection reset by peer (104)

So I googled that error, then checked the error logs on the tuned server and found LOADS of these:

2013/02/08 20:44:39 [crit] 22700#0: *536 open() "/var/www/pacura.ru/web/" failed (13: Permission denied), client: 108.162.231.27, server: pacura.ru, request: "GET / HTTP/1.1", host: "pacura.ru"  

So what's the story here? Clearly ab should be served from the fastcgi_cache, so what is failing me?
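In case it helps anyone reading along, this is roughly how I would check whether that permission error is a genuine filesystem problem (just a sketch; the path is copied from the log line above):

    # which user do the nginx workers actually run as?
    ps -o user,comm -C nginx

    # can that user traverse every directory down to the docroot?
    namei -l /var/www/pacura.ru/web/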


#2

You have hit socket connection limits! It's not a caching issue, I guess.

To check the cache status, follow the steps in this article - http://rtcamp.com/tutorials/checklist-for-perfect-wordpress-nginx-setup/ (the cache should be tested by turning off PHP :wink: )
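Roughly, the test looks like this (just a sketch, using the URL from your post):

    # stop PHP completely, then request a page:
    # a cached page still comes back with 200, an uncached one turns into a 502
    service php5-fpm stop
    curl -I http://pacura.ru/
    service php5-fpm start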

Now, for the socket limit, there are 2 solutions. Since they require some details, I have posted them here - http://rtcamp.com/tutorials/nginx-php-fpm-socket-tcp-ip-sysctl-tweaking/
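In short, the two options are the sysctl/limits tweaks for the Unix socket, or switching php5-fpm over to TCP. The TCP switch looks roughly like this (a sketch; the pool file path and port are the usual Debian defaults, adjust to your setup):

    ; /etc/php5/fpm/pool.d/www.conf
    listen = 127.0.0.1:9000

    # matching change in the nginx vhost
    fastcgi_pass 127.0.0.1:9000;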


#3

Cache is fine. And I think I am right in assuming that ab should be served from the cache, so how come I am running into socket problems? The requests shouldn't even reach php5-fpm.

I stayed with the Unix socket connection and didn't switch to TCP/IP.

Anyway, I had already done some sysctl tweaking; here are my changed/added parameters so far:

    sysctl -p
    net.ipv4.conf.all.rp_filter = 1
    net.ipv4.ip_forward = 1
    net.ipv4.conf.default.send_redirects = 1
    net.ipv4.conf.all.send_redirects = 0
    net.ipv4.icmp_echo_ignore_broadcasts = 1
    net.ipv4.conf.default.forwarding = 1
    net.ipv4.conf.default.proxy_arp = 0
    kernel.sysrq = 1
    net.ipv4.conf.eth0.proxy_arp = 1
    net.core.somaxconn = 4096
    net.ipv4.tcp_keepalive_intvl = 30
    net.ipv4.tcp_keepalive_probes = 5
    net.ipv4.tcp_tw_reuse = 1

I'll check your link and see which one of those I can add.

I've done the following: edited /etc/sysctl.conf and added:

    fs.file-max=1048576
    kernel.shmmax=8589934592
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.core.rmem_default = 1048576
    net.core.wmem_default = 1048576
    net.core.netdev_max_backlog=16384
    net.core.somaxconn=32768
    net.core.optmem_max = 25165824
    net.ipv4.tcp_rmem = 4096 1048576 16777216
    net.ipv4.tcp_wmem = 4096 1048576 16777216
    net.ipv4.tcp_max_syn_backlog=32768
    vm.max_map_count=131060

edited /etc/security/limits.conf and, because my nginx runs under the user www-data, added:

    www-data soft nofile 15000
    www-data hard nofile 30000

Since Debian ignores /etc/security/limits.conf unless pam_limits is enabled, one needs to edit /etc/pam.d/common-session and add the following line:

session required pam_limits.so
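A quick way to check that those limits actually stick for www-data (just a sketch; the -s /bin/bash is needed because www-data normally has a non-login shell):

    su - www-data -s /bin/bash -c 'ulimit -Sn; ulimit -Hn'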

edited /etc/nginx/nginx.conf and added the following, to set the open fd limit to 30000:

    worker_rlimit_nofile 30000;

Followed by:

    service nginx restart
    service php5-fpm restart
    sysctl -p
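After the restart, the effective limit of a running nginx worker can be double-checked like this (a sketch; assumes the workers show up as "nginx: worker process" in the process list):

    grep "Max open files" /proc/$(pgrep -f "nginx: worker" | head -n 1)/limits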

Then, from another server, I did a benchmark again:

    ab -n 20000 -c 200 http://pacura.ru/blog/shoot-for-a-fitness-manual/
    This is ApacheBench, Version 2.3
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/

    Benchmarking pacura.ru (be patient)
    apr_socket_recv: Connection reset by peer (104)
    Total of 1790 requests completed

On the server I was hammering, I briefly saw the CPU spike:

    1 [|||||||||||||||||||||||||||||100.0%]   Tasks: 256 total, 26 running
    2 [|||||||||||||||||||||||||||||100.0%]   Load average: 7.03 1.68 0.55
    3 [|||||||||||||||||||||||||||||100.0%]   Uptime: 1 day, 08:27:01
    4 [|||||||||||||||||||||||||||||100.0%]
    5 [|||||||||||||||||||||||||||||100.0%]
    6 [|||||||||||||||||||||||||||||100.0%]
    7 [|||||||||||||||||||||||||||||100.0%]
    8 [|||||||||||||||||||||||||||||100.0%]
    Mem[|||||||||||||                           2852/16028MB]

then it quickly fell off; I assumed that was the initial shock ;-) but still not good...

Btw, can you run this benchmark against your server? Just curious to know if it holds up: ab -n 20000 -c 200 http://yourdomain.tld

Will read up some more and see if we can figure this one out!


#4

Problem kinda solved :-| I got those errors because ddos-deflate stopped me cold whenever I went over 200 parallel connections - not always at exactly the same point, since it relies on cron to do the blocking...
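For anyone curious, the detection is basically a per-IP connection count run from cron, along these lines (a rough sketch of the idea, not the actual ddos-deflate script):

    # count current connections per remote IP, highest first -
    # a single ab run with -c 200 from one box easily trips a threshold like 150
    netstat -ntu | awk 'NR>2 {print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr | head

The usual workaround would be whitelisting the benchmarking box's IP or raising the connection threshold in its config.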


#5

Glad to know it's fixed. The sysctl tweaks will be useful when you need more connections in the future. :-)

