Checklist for debugging a Drupal site that has performance problems.
Checklist
By working through this checklist, I was able to take the frontpage of a site from a 9 second load time (HTML only!) down to 1.5 seconds; see Isuma/OptimisationNotes for historical notes.
This is in accordance with PratiquesOptimisation.
Identify requirements
The first step is to identify what is "slow". When will the site's performance be acceptable? What are the criteria for success, and which pages should be tested?
Metrics will vary from site to site, but reasonable expectations would be:
- frontpage: <= ~1s (especially for anonymous users)
- AJAX queries: << 1s (~50-200ms)
- search queries: ~2-10s acceptable
Define your own acceptance criteria, and which pages to test, before starting. Run through the tests and the fixes one page at a time.
See also PratiquesOptimisation for a more general approach.
Benchmark the site
You can use Firebug to get quick timing information for a page: hit F12 and enable the Network pane. In Chromium this is built in: hit F12, click on the Network pane, then reload the page. You should see how long the page takes to load, with one line for each resource.
Online version: http://tools.pingdom.com/
If the first item (the HTML from the Drupal site itself) takes most of the time, then you need to tune Drupal. If not, then you need to tweak the theme, for example to avoid loading too many images. This varies from site to site, but the general idea is to always attack the biggest time chunk first.
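If you just want to time the raw HTML without a browser, curl can report it directly; a quick sketch (the URL is a placeholder):
curl -o /dev/null -s -w 'total: %{time_total}s first byte: %{time_starttransfer}s\n' http://www.example.com/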
You should also use ApacheBench to get a more reliable idea of the site's performance. Performance often varies enormously, especially on production sites, depending on the load. ApacheBench (or Siege) lets you average out performance over multiple requests, even concurrent requests.
To run a single request, use:
ab http://www.example.com/
Then, as you improve the site, you can run multiple samples to get a proper idea. For example, here's how to run 100 requests on the site:
ab -n 100 http://www.example.com/
We do not enable concurrency at first, especially if the site is really slow, as it would make the benchmark awfully slow. Once the site is better tuned, we can start adding some load (say 10 concurrent users) and raise the number of requests:
ab -n 1000 -c 10 http://www.example.com/
Start with a very small sample size (e.g. a single request, as in the first example) to avoid hammering the server. Then, as you gauge the site's performance, increase the sample size. 10 requests is too small to be statistically significant; aim for 100 or 1000, although picking the right sample size is a whole topic in statistics.
Here's an example benchmark of 10 requests:
anarcat@desktop006:~$ ab -n 10 http://www.isuma.tv/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking www.isuma.tv (be patient).....done

Server Software:        Apache/2.2.9
Server Hostname:        www.isuma.tv
Server Port:            80

Document Path:          /
Document Length:        29192 bytes

Concurrency Level:      1
Time taken for tests:   98.285 seconds
Complete requests:      10
Failed requests:        9
   (Connect: 0, Receive: 0, Length: 9, Exceptions: 0)
Write errors:           0
Total transferred:      297369 bytes
HTML transferred:       290739 bytes
Requests per second:    0.10 [#/sec] (mean)
Time per request:       9828.491 [ms] (mean)
Time per request:       9828.491 [ms] (mean, across all concurrent requests)
Transfer rate:          2.95 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       21   28   3.3     28      32
Processing:  6841 9801 2124.9  10854   12436
Waiting:     6126 9644 2232.1  10759   12342
Total:       6862 9828 2125.9  10884   12464

Percentage of the requests served within a certain time (ms)
  50%  10884
  66%  11051
  75%  11542
  80%  12322
  90%  12464
  95%  12464
  98%  12464
  99%  12464
 100%  12464 (longest request)
Some notes:
- The fastest time (min above) is really important because it tells us how fast the site can load without interference from other processes or visitors on the server.
- Look at the median time (less is better) and the requests per second (more is better). The median time is what you want to bring down to the target you defined at the beginning; the sketch after these notes shows a quick way to extract these numbers.
- A high standard deviation means the load on the server varies constantly.
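For example, a minimal shell sketch to rerun the same benchmark after each fix and keep only those key numbers (the URL is a placeholder; assumes GNU grep):
URL=http://www.example.com/
ab -n 100 "$URL" 2>/dev/null | grep -E 'Requests per second|^Total:|  50%'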
Profiling
Once you have identified a problematic page, you want to profile it to determine what exactly is taking all that time.
The most basic way of doing this is to enable the page timer in devel.module to get separate timing information for MySQL and PHP. This tells us whether we need to optimize SQL queries or PHP code.
MySQL with the devel module
Enable the devel module, go to the devel settings (D6: admin/settings/devel, D7: ?), enable the query info, log, page timer and memory usage checkboxes. This can also be done with drush:
drush dl devel
drush en devel
drush vset dev-mem 1
drush vset dev-timer 1
drush vset dev-query 1
drush vset devel-query-display 1
Reload the page and compare the total page execution time with the MySQL execution time (total time includes MySQL time). If MySQL takes more than 75% of the load time, start hunting down the slowest queries, or the queries that are repeated, and try to eliminate them. Many of the techniques for this are detailed below, but the specifics vary from install to install, depending on the problem found, the modules installed, the data size of the site, the traffic, etc.
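Once devel points at a suspicious query, EXPLAIN tells you whether MySQL can use an index for it. A hedged sketch (databasename and the query itself are placeholders; substitute whatever devel reported):
mysql databasename -e "EXPLAIN SELECT nid, title FROM node WHERE type = 'story'\G"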
The rest with xhprof
XHprof allows you to diagnose what's taking so long *outside* of MySQL.
To install xhprof, you need to use PECL:
apt-get install php5-dev make
pear upgrade PEAR
pecl install xhprof
source - there is no Debian package yet; see Debian bug #698972 for more information.
In Drupal 6, you need the devel module to enable profiling; it provides links at the bottom of pages to the profiling information. You also have to set up a special Alias (or a whole VirtualHost) directive to serve the xhprof output. To wrap it up:
drush vset devel_xhprof_enabled 1
drush vset devel_xhprof_directory /usr/share/php/
drush vset devel_xhprof_url http://www.example.com/xhprof/
The alias, for example in /var/aegir/config/server_master/apache/vhost.d/www.example.com:
Alias /xhprof/ /usr/share/php/xhprof_html/
In Drupal 7, it's enough to just install the xhprof module and activate profiling in its settings page, or run the following drush command:
drush vset xhprof_enabled 1
To be detailed! Follow Mark's excellent presentation for now.
Common problems and solutions
Enable views caching
ENABLE VIEWS CACHING! Too often a lot of time is wasted constantly rebuilding the same content in views. There are two caches in views: the "results cache" and the "content cache". The results cache keeps the results of the SQL queries in cache and the content cache keeps the rendered HTML in cache.
Enable both. (<- not sure about this!)
For blocks, you also have to enable the cache in the lower part (block settings): use "cache once for everything" unless your block is user-specific or changes depending on the page it appears on.
Huge tables
Drupal can create huge tables over time, especially the accesslog, sessions and watchdog tables.
If you have a really big table (I had the pleasure of dealing with an 11 million row table), do not try DELETE FROM watchdog WHERE ... - just TRUNCATE watchdog, as a huge DELETE is much slower and can lock the table for a very long time.
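With drush, for example (a one-liner sketch, assuming a stock table name):
drush sql-query "TRUNCATE watchdog"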
To find out the size of your tables, you can use a variant of this SQL command, which returns the five biggest tables and their sizes in megabytes:
SELECT TABLE_NAME, SUM(DATA_LENGTH + INDEX_LENGTH)/1024/1024 AS mb
  FROM information_schema.TABLES
  WHERE TABLE_SCHEMA = 'databasename'
  GROUP BY TABLE_NAME
  ORDER BY mb DESC
  LIMIT 5;
Simply replace databasename with the name of your database. drush core-status can provide you with this info.
Access log
This table should be empty. If it is not, and especially if it is big, it needs to be truncated, and the statistics module disabled, or at least its access logging disabled under Reports -> Access log settings (that menu is absent if the module is disabled). The node counter table is separate and will still show the number of node views if you need it.
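For example, with drush (a sketch; statistics_enable_access_log is the variable behind that settings page in D6/D7):
drush sql-query "TRUNCATE accesslog"
drush vset statistics_enable_access_log 0
# or disable the whole module: drush dis -y statistics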
Sessions table
This one can get huge, especially if you are on Debian using the default PHP settings rather than the Debian package for Drupal (which was fixed for this).
A good setting is this:
session.gc_probability = 1
session.gc_divisor = 100
session.gc_maxlifetime = 604800 ; one week
This means the garbage collector runs on one out of every 100 requests, and 604800 (seconds, i.e. one week) is passed to the garbage collector, so sessions (i.e. logins) will expire after one week in all applications.
This can also be changed in the settings.php:
ini_set('session.gc_probability', 1);
ini_set('session.gc_divisor', 100);
ini_set('session.gc_maxlifetime', 200000);
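If the table has already ballooned, you can clear out stale sessions by hand while waiting for the garbage collector to catch up (a sketch; 604800 matches the one-week lifetime above):
drush sql-query "DELETE FROM sessions WHERE timestamp < UNIX_TIMESTAMP() - 604800"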
Watchdog rotation
Limit the watchdog size to 1000 entries or less to avoid growing the table too big. Also make sure the watchdog is emptied regularly, because it can become a performance hit, as mentioned above.
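In D6/D7 this limit is the dblog_row_limit variable, and the actual trimming happens during cron runs (a sketch):
drush vset dblog_row_limit 1000
drush cron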
Inspect the watchdog
Look into the "recent log entries" for problems. Sometimes there are PHP errors that just flood the watchdog. Those should be fixed or disabled.
Disable PHP notices
One way of doing this is to change the error_reporting setting (error_reporting() on php.net). It is common for Debian servers to have this setting in php.ini:
error_reporting = E_ALL & ~E_DEPRECATED
Change this to:
error_reporting = E_ALL & ~E_DEPRECATED & ~E_NOTICE
This is done by default by the koumbit::service::web puppet class.
For heavy logging sites: syslog logging
If you have access to the system logs, use the syslog module instead of dblog. This is done through the modules page and saves an INSERT query for every logged event on the site, which can add up to a huge amount. It also takes care of rotation automatically if the server is properly configured.
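With drush the swap looks like this (both modules are in core in D6/D7):
drush en -y syslog
drush dis -y dblog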
404s
Some sites are really busy serving 404 pages for missing content, like missing CSS files. Just create an empty file in place of the missing one if necessary, or better: fix the originating URL, which is documented in the watchdog entry.
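For instance, if the watchdog shows repeated 404s on a stylesheet, a hypothetical stopgap (the path is made up):
touch sites/all/themes/example/print.css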
Other
- missing indexes (see the EXPLAIN example in the profiling section above)
- l() and drupal_lookup_path
- cache functions provided by heavy modules
- panels: enable simple_cache
- Drupal's cache (admin/settings/performance), but do not enable it if CiviCRM is installed
- MySQL
- check that the sessions table is properly emptied
- optimize the tables every week?
- On HAG
- Add "FileETag None" to the .htaccess (a variant is included in boost; choose "none").
Create a vhost as is done on Aegir, with the .htaccess content in a "<Directory>" block and "AllowOverride None". See cabm.net in the koumbit_hacks. But beware: creating a vhost/koumbit_hack means it has to be updated in Puppet every time Drupal core is updated. So avoid it!
That said, several sources report a performance gain of around 6-8%, which is interesting in itself for a big site, and combined with NFS lag one can imagine it being much more. doso-cpu-day.png -- see this Munin graph from shortly after the vhost was activated on CABM (which had CSS aggregation disabled, so was making far too many hits for nothing). But it could just be a coincidence.
- Debugging a slow cron
In includes/module.inc, in the module_invoke_all() function, add print $function . ' ' . time() . "<br>\n"; right after the foreach. This shows how much time is spent in each module and can sometimes help find a kludge. WARNING: if the site is in production, this affects every page. Visit http://example.org/cron.php to test, or time it from the shell as shown below.
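A quick way to time the whole cron run from the shell (example.org as above):
time curl -s http://example.org/cron.php > /dev/null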
Known performance bugs
Drupal 6
There are some fairly common performance problems in Drupal for which patches exist and which could save you a lot of time!
menu_rebuild (patch)
module_implements() optimization (patch) - useful for sites with many modules
Drupal 7
To be detailed!
Theming / frontend
- enable JavaScript/CSS compression (admin/settings/performance)
- disable "rebuild theme registry"
- firebug/yslow (to be documented in more detail)
Images
If the site has images that take a lot of space, optimizing them can save bandwidth and therefore reduce page load time.
JPEG: libjpeg 8d ships a binary called "jpegtran" which can sometimes losslessly shrink a JPEG file by up to 90%:
call it with: jpegtran -optimize -outfile new_image.jpg image.jpeg
available in Debian Squeeze in the package named libjpeg-progs; a batch loop is sketched below
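To batch-optimize a whole files directory in place, something like this sketch can work (assumes libjpeg-progs; -copy none also strips EXIF metadata, so test on a copy first):
find sites/default/files -iname '*.jpg' -o -iname '*.jpeg' | while read -r f; do
  jpegtran -optimize -copy none -outfile "$f.tmp" "$f" && mv "$f.tmp" "$f"
done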
If there are a lot of small images, however, the above will not change much. You should then consider turning some images into sprites.
Sysadmin level
General
- Make sure Munin or another performance monitoring tool is installed. Check for iowait (slow disks), sufficient RAM, no swapping, and the number of visits.
- Install a script that records the state of Apache requests, netstat, uptime, etc. when the system load gets high (handy for diagnosing an occasional crash tied to a high system load).
- Mounting the filesystems with the noatime,nodiratime options can also make a difference; see the example after this list.
- Consider caching files locally if you are on an NFS client.
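For the noatime item above, the options can be tested live with a remount before editing /etc/fstab (a sketch; /var is a placeholder for whatever partition holds the site files):
mount -o remount,noatime,nodiratime /var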
Apache
- on a very, very big site, putting the .htaccess rules in the static Apache configuration with "AllowOverride None" can avoid a pile of stat() calls; this is done by Aegir by default
- consider removing open_basedir if possible, see https://bugs.php.net/bug.php?id=52312
- tune the realpath_cache_size and realpath_cache_ttl caching variables (a quick check is shown below)
- tune MaxClients and friends
main article: OptimisationApache
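To see the current realpath cache values before tuning them (a quick check):
php -i | grep -E 'realpath_cache_(size|ttl)'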
MySQL
- Enable the MySQL slow query log.
- Disable the binary log if it is not used (i.e. no replication).
- Enable the query cache.
See OptimisationMysql for the details, and the sketch below.
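A sketch of those three checks from the shell (assumes MySQL credentials in ~/.my.cnf and MySQL 5.1+ for the dynamic slow-log switch):
mysql -e "SET GLOBAL slow_query_log = 1; SET GLOBAL long_query_time = 2;"
mysql -e "SHOW VARIABLES LIKE 'log_bin'; SHOW VARIABLES LIKE 'query_cache%';"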
Cache everything before Drupal
Integration with caches and reverse proxies
See also CachingService for background information about caches and VarnishGuide for an extended guide on how to setup varnish caching on a drupal site.
Look at the headers; this is an example of a site that will not be cacheable:
anarcat@desktop006:~$ curl -I http://www.isuma.tv/
HTTP/1.1 200 OK
Cache-Control: store, no-cache, must-revalidate
Cache-Control: post-check=0, pre-check=0
Content-Type: text/html; charset=utf-8
Date: Fri, 13 Jan 2012 15:59:45 GMT
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Fri, 13 Jan 2012 15:59:45 GMT
Server: Apache/2.2.9 (Debian) PHP/5.2.6-1+lenny9 with Suhosin-Patch
Set-Cookie: SESS308117942df32c6634fe1e122ade1db0=65f468ed9af8ca08d1e498bd1fb3741d; expires=Sun, 05 Feb 2012 19:33:05 GMT; path=/; domain=.isuma.tv
Set-Cookie: localserver=deleted; expires=Thu, 13-Jan-2011 15:59:46 GMT; path=/; domain=isuma.tv
Vary: Accept-Encoding
X-Powered-By: PHP/5.2.6-1+lenny9
Connection: keep-alive
In the above, we see that the Cache-Control headers very explicitly specify no-cache, which means the page will not be cached by the browser or by intermediate proxies.
The Expires header also sets the page to expire immediately.
Finally, a cookie is set, which will keep the page from being cached in Varnish.
This is a proper set of headers:
anarcat@desktop006:~$ curl -I -x aegir.koumbit.net:80 http://anarcat.koumbit.org/
HTTP/1.1 200 OK
Date: Wed, 28 Mar 2012 18:14:39 GMT
Server: Apache/2.2.16 (Debian)
X-Powered-By: PHP/5.3.3-7+squeeze8
Cache-Control: public, max-age=43200
Last-Modified: Wed, 28 Mar 2012 18:14:39 +0000
Expires: Sun, 11 Mar 1984 12:00:00 GMT
Vary: Cookie,Accept-Encoding
ETag: "1332958479"
Connection: close
Content-Type: text/html; charset=utf-8
This is done by running the site on Pressflow and, in the admin/settings/performance page, configuring the cache as external with a minimum/maximum lifetime. See VarnishGuide for more information on how to hook it up with Varnish.
Boost
Boost can really improve performance; it should be considered in shared hosting environments where Varnish or Nginx cannot be used. See boost.
References
See also PratiquesOptimisation for the general process.
To read
the book Pro Drupal Development, 2nd edition, chapter 22 "Optimizing Drupal", p. 527
Drupal performance modules
xhprof: identify slow functions, see also http://pecl.php.net/xhprof
devel: identify slow queries and pages with too many queries
boost: file cache for anonymous users; see DrupalBoost for notes specific to HAG and Aegir; strongly recommended for sites with lots of anonymous hits
xdebug: PHP extension to install on a development server. Profiling tool: http://www.xdebug.org/docs/profiler
authcache - another caching system, file-based, works for logged-in users (untested on Aegir, being tested on HAG with CDHAL)
cacherouter: APC, memcache, file-based caching systems (untested on Aegir, tested on HAG with CDHAL, but see at least one issue)
cache_browser: cache diagnostics (untested on HAG/Aegir)
dtools: performance graphs for hooks.
no_anon: disable sessions for anonymous users (in core as of D7); Boost is suggested rather than no_anon.