*** Disclaimer: this is not a how-to, this is just my personal experience ***
I host several websites on a web server at home on a DSL line: zeRezo.com.
Since the beginning, I have been careful about bandwidth saturation, since even today I only have 1 Mbit/s of upload bandwidth.
So I try to keep my websites "light", or at least I avoid hosting big files such as videos, big software packages, etc.
When I need big files, I use external hosting. It is not a big issue, since all my standard web pages are still hosted at home, so I still have a good idea of how many visitors I get.
After some years of such hosting, and despite some hardware upgrades (CPU, RAM, disks), my server has started to get overloaded.
The number of visitors is not that large, but it is higher than before (great news), some websites have gained features, and with years of historical data the databases keep growing...
Anyway, the fact is that today my server is sometimes very slow, almost unresponsive even to local requests, so it was really time to investigate!
After some googling, I found this nice post which gives basic steps to identify the "weak" point in your installation.
The article focuses on the 3 main points of congestion for an overloaded server:
- RAM usage
- disk activity
- CPU usage
Looking at RAM first, I had 5 SpamAssassin processes running on my system (the default option), and it looks like they do not share their database in memory!
Since my system does not handle that much email, I changed the default Debian option in /etc/default/spamassassin to have only one process:
# NOTE: version 3.0.x has switched to a "preforking" model, so you
# need to make sure --max-children is not set to anything higher than
# 5, unless you know what you're doing.
OPTIONS="--create-prefs --max-children 1 --helper-home-dir"
Apart from this small detail, both RAM and disk usage are OK on my server.
So the problem is CPU, and it is not hard to see: at certain times of the day, the top command shows a very high load (well above 10), with 100% CPU usage of course.
So now we know what to do: optimize the dynamic web pages, with faster PHP code and better MySQL queries.
The other issue is how to identify which pages are responsible for the CPU usage.
Apache has nice logging options for this: the %T (seconds) and %D (microseconds) log format options record "the time taken to serve the request".
I still run the old 1.3 version at home, so I only have the %T option, and the result was not so helpful in my case: I sometimes see very high times for static pages, which is strange...
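To make the %T idea concrete, here is a minimal sketch that adds up the logged seconds per URL. It assumes %T is written as the last field of each access-log line, for example with LogFormat "%h %l %u %t \"%r\" %>s %b %T" timed. The "timed" nickname and the slowest_urls() name are made up for the illustration and do not come from my setup.

```php
<?php
/**
 * Add up the %T values per URL and return the largest totals first.
 * Assumes each access-log line ends with the %T field (whole seconds).
 * The function name is illustrative only.
 */
function slowest_urls(array $lines, $limit = 10)
{
    $totals = array();
    foreach ($lines as $line) {
        // Capture the URL from the quoted request and the trailing number.
        if (!preg_match('/"[A-Z]+ (\S+)[^"]*" \d{3} \S+ (\d+)\s*$/', $line, $m)) {
            continue; // skip lines that do not match the assumed format
        }
        $url = $m[1];
        if (!isset($totals[$url])) {
            $totals[$url] = 0;
        }
        $totals[$url] += (int)$m[2];
    }
    arsort($totals); // largest total first, keys preserved
    return array_slice($totals, 0, $limit, true);
}
```

It could be fed with something like slowest_urls(file('/path/to/access.log')), where the path is a placeholder. With whole seconds only, fast pages all show up as 0, so this mostly helps to spot the really slow ones.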
But I already know which pages are the bad ones on my server, since on many websites I include a small footer showing the page's computation time. Something like this:
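/* current time in seconds, as a float with microsecond precision */
function getmicrotime()
{
    /* microtime() returns a string like "0.12345600 1223646936" */
    list($usec,$sec)=explode(" ",microtime());
    return ((float)$usec+(float)$sec);
}
$time_start=getmicrotime();
/* the ugly code here */
$time_end=getmicrotime();
$time=$time_end-$time_start;
printf('Page generated in %f seconds',$time);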
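On PHP 5 or later the helper can shrink, because microtime(true) returns a float directly. A minimal sketch under that assumption (page_footer() is a made-up name for the illustration):

```php
<?php
// Assumes PHP 5 or later, where microtime(true) returns a float directly.
function getmicrotime()
{
    return microtime(true);
}

// Hypothetical helper: build the footer text for a page started at $time_start.
function page_footer($time_start)
{
    return sprintf('Page generated in %.3f seconds', getmicrotime() - $time_start);
}
```

The page would store $time_start = getmicrotime(); at the top and print page_footer($time_start) at the bottom.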
So I often see in my footers that I host ugly slow code :)
To begin, I focused on MySQL queries, since they seemed to take the largest share of the time on my slow web pages.
Here again, MySQL has a nice logging option.
In my configuration file, I switched on the log-slow-queries option:
# Here you can see queries with especially long duration
log-slow-queries = /var/log/mysql/mysql-slow.log
When this option is present, all queries that take more than long_query_time seconds (10 by default) to complete are stored in the log file.
It is then easy to locate the slowest queries.
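When the log grows, reading it by eye gets tedious. Here is a rough sketch that pulls out the slowest statements, assuming the usual layout where each statement is preceded by a "# Query_time: N ..." header line. It needs PHP 5.3 or later for the closure, and slowest_queries() is a made-up name.

```php
<?php
/**
 * Extract (seconds, statement) pairs from slow-query-log text and
 * return them slowest first. Assumes a "# Query_time: N ..." header
 * line before each statement; the function name is illustrative.
 */
function slowest_queries($log_text, $limit = 5)
{
    $entries = array();
    $current = -1; // index of the entry being filled, -1 before the first one
    foreach (preg_split('/\r?\n/', $log_text) as $line) {
        if (preg_match('/^# Query_time:\s*([\d.]+)/', $line, $m)) {
            $entries[] = array('seconds' => (float)$m[1], 'sql' => '');
            $current = count($entries) - 1;
        } elseif ($line === '' || $line[0] === '#') {
            continue; // blank lines and other header lines
        } elseif ($current >= 0 && !preg_match('/^(SET timestamp=|use )/i', $line)) {
            // Join multi-line statements into a single line.
            $entries[$current]['sql'] = trim($entries[$current]['sql'] . ' ' . $line);
        }
    }
    usort($entries, function ($a, $b) {
        if ($a['seconds'] == $b['seconds']) {
            return 0;
        }
        return ($a['seconds'] < $b['seconds']) ? 1 : -1; // descending
    });
    return array_slice($entries, 0, $limit);
}
```

For example, slowest_queries(file_get_contents('/var/log/mysql/mysql-slow.log')) would return the five slowest entries. It only sorts what MySQL already logged, so queries below the threshold never appear.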
Also, in order to profile the SQL queries on a specific page, I use the following quick & dirty technique: I replace all calls to mysql_query() with a custom _mysql_query().
It would be nicer to just override the built-in mysql_query() function, but I don't think I can do this with my PHP version.
So here is what it looks like:
function _mysql_query($string)
{
    global $REMOTE_HOST;
    /* global timer to store total time spent in SQL queries */
    global $time_mysql;
    /* start the timer */
    $t1=getmicrotime();
    /* do the query and save the result to return it to the caller */
    $r=mysql_query($string);
    /* stop the timer */
    $t2=getmicrotime();
    /* only show the trace (time in ms) for the developer;
       the query is escaped so quotes cannot break the title attribute */
    if ($REMOTE_HOST=='my_computer_name')
        print '<div class="mysql" title="'.htmlspecialchars($string).'">'.round(1000*($t2-$t1)).'</div>';
    /* increase total SQL time */
    $time_mysql+=$t2-$t1;
    /* return the results to caller just like mysql_query would */
    return $r;
}
/* ... */
$query = 'SELECT * FROM foo';
$result = _mysql_query($query);
The time I am talking about here is "real" (wall-clock) time, not the CPU time spent in this specific process, so it is not very accurate, but still helpful.
I apply a style to these trace <div> elements so they look like small boxes. The "title" attribute on them lets me view the SQL query just by hovering the mouse over a box.
On this basic website, since the boxes are written at the moment the query is executed, it is easy to understand when and why a specific query was run, just by looking at where its box appears in the page layout.
The grey boxes are another trace I use, to check the time spent in PHP.
To compute this time, I also use timers, but I subtract the SQL time to really focus on the PHP code time:
function _trace($string)
{
    global $REMOTE_HOST;
    /* global timer started at the beginning of the page */
    global $time_start;
    /* total time spent in SQL queries */
    global $time_mysql;
    /* only show the trace for the developer: time in ms since the
       start of the page, minus the time spent in SQL */
    if ($REMOTE_HOST=='my_computer_name')
        print '<div class="php" title="'.htmlspecialchars($string).'">'.
            round(1000*(getmicrotime()-$time_start-$time_mysql)).'</div>';
}
/* ... */
_trace('start big command');
very_slow_procedure();
_trace('stop big command');
Again, this allowed me to find very nasty things in my websites.
For example, instead of using a static array of smileys, I used a PHP loop to scan the local smiley filenames (so if I add a smiley I don't have to update the code).
OK, this is not very nice, but I did not expect it to be so slow ;-)
Now this is fixed!
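A static array is the simplest fix. If the comfort of dropping a new file into the directory matters, one possible middle ground is to scan the directory only once per request and keep the result. The sketch below assumes the cost came from rescanning on every call, and a layout of one "<code>.gif" file per smiley. Both are assumptions, and smiley_map() is a made-up name (PHP 5.3 or later for the ?: shortcut).

```php
<?php
/**
 * Return smiley codes mapped to file names, scanning the directory
 * only once per request. The directory layout (one "<code>.gif" per
 * smiley) and the function name are assumptions for this sketch.
 */
function smiley_map($dir)
{
    static $cache = array(); // results keyed by directory
    if (!isset($cache[$dir])) {
        $map = array();
        // glob() may return false on error, so fall back to an empty list.
        foreach (glob($dir . '/*.gif') ?: array() as $path) {
            $file = basename($path);
            $map[basename($file, '.gif')] = $file;
        }
        $cache[$dir] = $map;
    }
    return $cache[$dir];
}
```

The cache only lives for the duration of one request, so every page still pays for one directory scan.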
One last tool which can be useful: ab.
This is the Apache HTTP server benchmarking tool, which lets you send many requests at a time to your server.
ab -c 10 -n 100 http://google.com/
For example, this command will send 100 requests to http://google.com/, with 10 requests in parallel.
Be careful with this command: it can hang your server if you use numbers that are too big!
I used it in combination with a custom script to monitor LAMP activity:
- Number of Apache and MySQL processes
- Memory usage for Apache and MySQL
- CPU usage for Apache and MySQL
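The custom script itself is not shown here. As a rough idea of what such a monitor could look like, the sketch below summarises the three figures from the text printed by ps -eo comm,rss,pcpu. The command names apache and mysqld are assumptions (they differ between distributions), and lamp_usage() is a made-up name.

```php
<?php
/**
 * Summarise process count, resident memory and CPU share for the given
 * command names, from the text printed by: ps -eo comm,rss,pcpu
 * The default command names are assumptions; adjust them to your system.
 */
function lamp_usage($ps_output, array $names = array('apache', 'mysqld'))
{
    $stats = array();
    foreach ($names as $name) {
        $stats[$name] = array('processes' => 0, 'rss_kb' => 0, 'cpu_percent' => 0.0);
    }
    foreach (preg_split('/\r?\n/', trim($ps_output)) as $line) {
        // Columns: command name, resident set size in KB, CPU percentage.
        if (!preg_match('/^(\S+)\s+(\d+)\s+([\d.]+)\s*$/', $line, $m)) {
            continue; // header line or unexpected format
        }
        if (isset($stats[$m[1]])) {
            $stats[$m[1]]['processes']++;
            $stats[$m[1]]['rss_kb'] += (int)$m[2];
            $stats[$m[1]]['cpu_percent'] += (float)$m[3];
        }
    }
    return $stats;
}
```

Calling print_r(lamp_usage(shell_exec('ps -eo comm,rss,pcpu'))) every few seconds while ab is running gives a crude picture of the load. Note that summing RSS counts memory shared between the Apache processes several times, so the memory total is an upper bound.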
The next step could be another hardware upgrade... or to switch to a smarter solution like professional hosting.