Cache urls which use only "javascript-related queries"


#1

By javascript related queries I mean things like these:

  • Google Analytics:

http://www.mydomain.com/?utm_campaign=CampaignName&utm_medium=Facebook

  • Kiss Metrics:

http://www.mydomain.com/?kmi=customer@email.com&kme=SignUp

  • Google Adwords:

http://www.mydomain.com/?gclid=XXXXXXXX (this is the trickiest, because X is different string for each visit)

In all cases, the HTML generated by WP is exactly the same as for http://www.mydomain.com/, so it makes sense to serve them that cached page.

I've never seen a configuration that solves this, and most of my visits come from AdWords or campaigns (with utm parameters), so there's no point in caching if the cache isn't served to 90% of my visits.

This is my standard configuration to avoid cache when the url contains a query:

if ($query_string != "") { set $cache_uri 'no cache'; }

As I am quite new to nginx, could somebody provide an example of how this should be configured to serve "non-query cached files" for "javascript-related query urls"?

Thanks in advance! Luis.


Ignore query parameters with fastcgi cache
#2

You can turn on cache when query_string is present. On this site we serve cached content even if a query string is present. We ignore anything after "?", i.e. the query string.

If you want fine-grained control, you can add if conditions that check the $arg_* variables.

For example, for query string variable gclid, you can have:

if ($arg_gclid != "") { set $cache_uri 'cache'; } # use cached version

You may need to add one or more checks like above.


#3

Lovely.

With this line, set $cache_uri 'cache';, am I getting the "non-queried url cache" (http://www.mydomain.com), or is the system caching the "url with the query" (http://www.mydomain.com?something)?

If it's the first case, then I think this should work, shouldn't it?

 # POST requests and urls with a query string should always go to PHP  

  if ($request_method = POST) {  
      set $cache_uri 'no cache';  
  }  
  if ($query_string != "") {  
      set $cache_uri 'no cache';  
  }  
  # Use cached version if gclid or utm_campaign args are present
  if ( ($arg_gclid != "")  || ($arg_utm_campaign != "") ) { set $cache_uri 'cache'; }   

A newbie question:

How do I know the page I'm looking at has been served from Memcachier and not from WP PHP? With WP-Supercache and W3TC you can look for the tags, but when you use Memcachier or batcache I don't know what I should look at.


#4

I don't think following line will work:

   if ( ($arg_gclid != "")  || ($arg_utm_campaign != "") ) { set $cache_uri 'cache'; }   

Nginx's if directive doesn't support combining conditions with || (or &&), so you may need to write it as:

   if ( $arg_gclid != "" ) { set $cache_uri 'cache'; }   
   if ( $arg_utm_campaign != "" ) { set $cache_uri 'cache'; }   

Regarding the cache check:

  1. Open a page in a browser in which you are not logged in (assuming you are not caching pages for logged-in users).
  2. Next, open a shell on the server and shut down PHP and/or MySQL.
  3. Try refreshing the page you opened in step 1. If you can still see the page, the cache is already working fine! ;-)

More details: http://rtcamp.com/tutorials/checklist-for-perfect-wordpress-nginx-setup/
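
If the pages are cached by nginx itself (fastcgi_cache), a less disruptive check is to expose nginx's built-in $upstream_cache_status variable in a response header. A minimal sketch; the header name X-Cache-Status is an arbitrary choice, not something from this thread:

```nginx
# Place next to fastcgi_pass, in the location that hands requests to PHP.
# $upstream_cache_status is a built-in nginx variable; the header name is arbitrary.
add_header X-Cache-Status $upstream_cache_status;
```

Loading the page twice and looking at the response headers in the browser's developer tools should then show values such as MISS on the first request and HIT on the second. This only reports on nginx's own cache; it says nothing about object caches that run inside PHP.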


#5

Great. Thanks Rahul!!

I'm going to do some tests. I will let you know. I'm using heroku so I have to figure out a way to break it! Maybe changing the mysql env variable which feeds the wp-config. Locally I'm using apache (MAMP PRO) because it's easier to set up so I can't try it.

I still wonder why nobody in this big world cares about caching the adwords landing pages! There isn't any information out there!


#6

Unfortunately... it doesn't work.

I've been doing some tests and this is what I found:

  • If I visit www.mydomain.com and turn off the database, www.mydomain.com is still available, so it's cached. Fine.
  • If I then try to visit www.mydomain.com/?gclid=XXXX it doesn't work, so it's not using the www.mydomain.com cache.

  • If I visit www.mydomain.com/?gclid=XXXX and turn off the database, www.mydomain.com/?gclid=XXXX is still available, so it's cached.

  • If I then try to visit www.mydomain.com/?gclid=YYYY it doesn't work, so it is definitely caching the complete url.

The problem here is that gclid is different each time, so there's no point in caching each complete url. It's actually a waste of memory.

Is there a way to tell nginx to use the www.mydomain.com cache instead? That would be perfect.


#7

I'm not quite good at regexp but I've managed to create this:

Maybe this can be combined with the $query_string param of FastCgi?

The problem here is that it has to load the cache of www.mydomain.com but the url must be the original with the gclid parameter, so the javascript codes get executed.


#8

The problem here is that gclid is different each time, so there's no point in caching each complete url. It's actually a waste of memory.

I can understand what is going on behind the scenes. I am not sure of the side effects, but you can try the following:

Replace line:

fastcgi_cache_key "$scheme$request_method$host$request_uri";  

With line:

fastcgi_cache_key "$scheme$request_method$host$uri";  

The difference is that for the cache key we are using $uri rather than $request_uri.

In layman's terms: $request_uri - $query_string = $uri.

Please try this and let me know if it works for you.
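
To make the difference concrete, here is a small annotated sketch of what these variables would hold for one of the AdWords URLs from this thread. The values in the comments describe the request as it first arrives:

```nginx
# Illustration for the request  GET http://www.mydomain.com/?gclid=XXXXXXXX
# (values as seen before any internal redirect):
#   $request_uri  is  /?gclid=XXXXXXXX   original URI, query string included
#   $uri          is  /                  normalized path, no query string
#   $args         is  gclid=XXXXXXXX     same value as $query_string
#
# Goes next to the other fastcgi_cache_* directives:
fastcgi_cache_key "$scheme$request_method$host$uri";
```

One possible side effect to watch for: $uri can change during request processing, for example when try_files falls back to /index.php, so depending on where the key is evaluated, different pages might end up sharing one key. It is worth testing with more than one page.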


#9

Thanks!!

This has been a huge step. There's only one problem left, which I can't solve right now.

This is what I did: first, I remove the query parameters I want to be ignored. If anything is left, which means there are other, non-javascript-related args, don't use the cache.

  # remove GET parameters
  if ($args ~ (.*)utm_source=[^&]*(.*)) { set $args $1$2; }
  if ($args ~ (.*)utm_medium=[^&]*(.*)) { set $args $1$2; }
  if ($args ~ (.*)utm_campaign=[^&]*(.*)) { set $args $1$2; }
  if ($args ~ (.*)gclid=[^&]*(.*)) { set $args $1$2; }
  # cleanup any repeated & introduced
  if ($args ~ (.*)&&+(.*)) { set $args $1&$2; }
  # cleanup leading &
  if ($args ~ ^&(.*)) { set $args $1; }
  # cleanup ending &
  if ($args ~ (.*)&$) { set $args $1; }

  # if there are any arg left, don't use cache
  if ( $args != "" ) { set $cache_uri 'no cache'; }

Then, add your line to location / {}

fastcgi_cache_key "$scheme$request_method$host$uri";


I have created a custom page which populates an input field with the javascript queries. It has two fields: gclid and koko. Right now, this is the behavior:

----- If I visit /cache-test/ and then turn off the database, I still can load properly /cache-test/?gclid=XXXX and the javascript input field gets populated correctly. WORKS!

----- If I visit /cache-test/ and then turn off the database, I can't load properly /cache-test/?koko=XXXX. WORKS!

----- If I visit /cache-test/ and then turn off the database, I can't load properly /cache-test/?koko=XXXX&gclid=XXXX or /cache-test/?gclid=XXXX&koko=XXXX. WORKS!

----- If I visit /cache-test/?koko=XXXX and then turn off the database, I can load properly /cache-test/?koko=XXXX which means it's caching it! DOESN'T WORK!!

----- If I visit /cache-test/?koko=XXXX and then turn off the database, I can't load properly /cache-test/?koko=YYYY which means it's caching the exact query.

Any ideas on what could be wrong?


#10

I'm back with this.

I've discovered that my initial configuration was wrong.

This line doesn't seem to work: if ($query_string != "") { set $cache_uri 'no cache'; }

And neither does this: if ( $args != "" ) { set $cache_uri 'no cache'; }

because it is always caching urls with queries.

This one seems to work: if ($request_uri ~* "(\/wp-admin\/|\/xmlrpc.php|\/wp-(app|cron|login|register|mail).php|wp-.*.php|index.php|wp-comments-popup.php|wp-links-opml.php|wp-locations.php)") { set $cache_uri "no cache"; }

so I think the set $cache_uri "no cache"; works.

Maybe there's something wrong with $query_string and $args.


#11

Do not overuse $query_string and $args.

In most of my configs, by default, we set cache ON and then check some conditions which, when met, set cache to OFF.

Logically, try to find the list of query string parameters for which you want to use the cache, and the other list for which you do not.

I recommend starting with the list of query string parameters for which you want to force cache ON.

So as per our flow, we first turn OFF caching for all query string parameters:

  if ($query_string != "") {  
      set $cache_uri 'no cache';  
  }  

Then, for every query parameter for which you want to turn caching ON, add an if-condition as follows:

   if ( $arg_gclid != "" ) { set $cache_uri 'cache'; }   
   if ( $arg_utm_campaign != "" ) { set $cache_uri 'cache'; }   
   if ( $arg_XXX != "" ) { set $cache_uri 'cache'; }   

Do not use regex and set $args combos as you have tried above in Feb 15 reply.

Regex-based logic doesn't play nice with query-string parameter styles.

Don't forget to use:

fastcgi_cache_key "$scheme$request_method$host$uri";  

By the way, you can play more with fastcgi_cache_key.

Also, if you do not wish to cache error pages, tweak the fastcgi_cache_use_stale line.
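
One thing worth checking: a variable set with `set` only influences caching if some cache directive actually refers to it. A minimal sketch of the flow above wired into fastcgi cache, assuming a flag named $skip_cache (a hypothetical name, not taken from this thread) and that fastcgi_cache is already configured in the PHP location:

```nginx
# Server level: start with caching ON, then switch it OFF and back ON.
set $skip_cache 0;

# POST requests should always go to PHP.
if ($request_method = POST) { set $skip_cache 1; }

# By default, do not use the cache when a query string is present...
if ($query_string != "") { set $skip_cache 1; }

# ...except when one of the JavaScript-only parameters is present.
if ($arg_gclid != "") { set $skip_cache 0; }
if ($arg_utm_campaign != "") { set $skip_cache 0; }

# In the location that contains fastcgi_pass and fastcgi_cache:
fastcgi_cache_bypass $skip_cache;   # skip the cache lookup when the value is not "" or "0"
fastcgi_no_cache     $skip_cache;   # do not store the response in that case
```

If the existing config relies on $cache_uri for something else, such as building a file path for try_files, the two fastcgi_* lines at the end are the part that would connect a flag to the fastcgi cache.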


#12

Hi Rahul,

The set $cache_uri doesn't work at all in my setup. I'm not sure why. I've finally managed to get it working the way I like, but I had to modify advanced-cache.php.

This is what I did:

  • Javascript-only queries: These are the ones I want to be ignored when it serves cache, but I want it to be passed so the js scripts can do their work.
    • For example: www.mydomain.com?javascript_tag=123456 should serve the cache of www.mydomain.com
    • These are typically tags of AdWords (gclid), Google Analytics (utm_campaign, utm_source...)
    • I removed them from the nginx $args variable, like this: if ($args ~ ((.*)(\?|&)javascript_tag=\w+$|(.*)&?javascript_tag=\w+&?)(.*)) { set $args $2$4$5; }
  • Non-cache queries: These are the ones I don't want to be cached at all. If they exist, cache won't be served.
    • For example: www.mydomain.com?no-cache=123456 should always be loaded dynamically.
    • These are typically queries used by php to modify the html or database in some way.
    • I added them in advanced-cache.php, after $batcache = new batcache($batcache); if ($_GET['no-cache'] != '') { return; }
    • I suppose that if I didn't want to cache anything with query strings by default, I could use something like: if ($_SERVER['QUERY_STRING'] != '') { return; } but I haven't tested it myself.
  • Cache queries: These are the ones I want to be cached.
    • For example: www.mydomain.com?cache-me=123456 should serve www.mydomain.com?cache-me=123456
    • I didn't do anything, this was the default behaviour.

I hope this can help somebody else.


#13

For 2 different URLs:

http://www.mydomain.com?javascript_tag=123456
http://www.mydomain.com

Nginx's $request_uri will contain different values, but $uri will have the same value.

That was the purpose of using $uri in line:

fastcgi_cache_key "$scheme$request_method$host$uri";  

Based on other examples you have given, I think you can use a custom variable in fastcgi_cache_key, like this example:

fastcgi_cache_key "$scheme$request_method$host$rt_uri";  

Then, initialize $rt_uri with value of $request_uri (and by default set cache to be ON)

set $rt_uri $request_uri;  

Then tell nginx NOT to cache any query_strings.

  if ($query_string != "") {  
      set $cache_uri 'no cache';  
  }  

This will take care of http://www.mydomain.com?no-cache=123456 without any additional rule.

Then tell nginx to CACHE some queries,

# $arg_cache-me would be read as $arg_cache followed by "-me", so match on $args instead
if ( $args ~ "(^|&)cache-me=[^&]" ) { set $cache_uri 'cache'; }

Next, you can set $rt_uri to $uri and force caching when you want http://www.mydomain.com?javascript_tag=123456 to show cached content for http://www.mydomain.com

  if ($arg_javascript_tag != "") {
      set $cache_uri 'cache';  
      set $rt_uri $uri;  
  }  

The above flow will not require any changes in PHP, but I am not sure if it will solve your problem.
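
As a possible alternative, $rt_uri could be computed with map instead of set and if. This is a swapped-in technique, not the flow described above: it derives the path from $request_uri, which keeps its original value even if the request is internally redirected (for example by try_files). The names $rt_path and $rt_p are hypothetical, and javascript_tag is the example parameter used in this thread. A rough sketch:

```nginx
# http level (map is only allowed inside the http block).

# Path part of the original request, without the query string.
map $request_uri $rt_path {
    "~^(?<rt_p>[^?]*)"  $rt_p;
}

# Path-only key when the JavaScript-only parameter is present,
# otherwise the full original URI (query string included).
map $arg_javascript_tag $rt_uri {
    default  $rt_path;
    ""       $request_uri;
}

# Next to the other fastcgi_cache_* directives:
fastcgi_cache_key "$scheme$request_method$host$rt_uri";
```

With this variant the set $rt_uri lines would no longer be needed. Whether a request uses the cache at all would still be decided separately.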


#14

Hi Rahul,

As I said, set $cache_uri is not working at all in my setup. The only things which make a difference are advanced-cache.php and $args.

The following lines: if ($query_string != "") { set $cache_uri 'no cache'; } just don't work. Queries are cached anyway.

I'm not sure why, but I didn't create the nginx conf and I'm not an expert.

Anyway, thanks for your help. I don't think I could have achieved this without it!


#15