|
Predicting Performance improvements from memory increase |
|
|
|
|
Written by Hemanshu
|
|
Friday, 22 August 2008 |
Predicting Performance improvements from memory increase
One common question I guess is how much should I see performance
improved in case I increase memory say from 16GB to 32GB. The
benefit indeed can be very application dependent - if you have
working set of say 30GB with uniform data access raising memory
from 16GB to 32GB can improve performance order of magnitude by
converting very IO bound load to CPU bound, it is well possible to
see limited gains - if your working set already fits in 16GB you
may not see any significant gains upgrading memory to 32GB.
Interesting enough similar can happen for very large working set -
for example if your main queries do full table scan of 100GB table
it does not matter if you have 16GB or 32GB the load is going to be
way too much IO bound anyway.
Interesting enough because of MySQL scaling issues it is also
possible to see performance to go down as you increase buffer pool
size. Some threads which would be safely sleeping waiting on IO
completion are now finding their data in buffer pool so they start
to compete on hot latches and performance go down. Now back to
original question - how do we predict the benefit from increasing
the memory and so cache sizes ? I typically start by looking at the
type of load we’re dealing with. If it is CPU bound and there
is little IO wait we typically do not expect to gain much by
increasing the memory. This however have to be watched carefully.
Performance does not always stays the same and the goal may not be
optimizing average performance. It may be heavy data processing
batch job which is IO bound and which runs too slow (and affects
other transaction) and may be increasing memory is helpful to solve
this problem. If it is IO bound (high IO system utilization, low
CPU usage) one should think about how much CPU capacity is
available. If your CPU is 25% busy you will unlikely get more than
4x even if you eliminate all IO completely (unlikely because there
is increased CPU overhead going IO path as well), so account for
that. Besides pure CPU based computation you should account for
locking. Consider for example bunch of transactions updating single
row in the table. Having such workload you would likely see no IO
and a lot of CPU idle and not because of internal Innodb limits but
because your application logical serialization problems. So what if
we have very IO bound application without serialization issues (say
reporting slave) which is very IO bound showing 100% IO subsystem
utilization on 5% of CPU usage ? This is the true challenge becomes
because MySQL has no tools to analyze working set (we have per
query working set in our patches but it is not enough). We have
couple of ideas how to do global working set profiling but it
should wait for now. At this point I typically use my intuition to
try to guess how much data application to get some ballpark figure
and often it is enough. If you would like to be more scientific
there are couple of other things you can do. First - you can test
by scaling down the data. If you have data for say 500.000 users on
the 16GB server get it down to the half of that and you will often
be close to seeing performance 32GB server would have. You however
have to be careful and understand how data is used in your
application. If you say have data for 10 years and load data for 5
years only to compare performance you may get misleading results if
reports are typically done for last few months. Basically in such
exercise your goal is load data so the working set would be half of
original so it would have cache fit similar to one on the larger
system you’re trying to compare to. Using this approach you
also should be careful with your estimations and take IO subsystem
into account - even with same cache hit ratio more data and more
load means there are higher demands for IO subsystem performance.
By far the best method is trying, if you can afford it - just get
memory upgrade and see how it affects performance. With many
vendors you can get the memory upgrade or the whole system to try
and return it back if it does not fit your needs. This approach
especially works well if you have many slaves (or many shards) in
which case you can see performance or capacity improvements quite
easily.
Artcile originaly published at <a
href:"http://www.mysqlperformanceblog.com/">http://www.mysqlperformanceblog.com</a>
|