Eli

Memcached

Free & open source, high-performance, distributed memory object caching system

Tuesday, January 17 2012

Memcached 1.4.11 Released

Release notes from official site :

Overview

This release fixes race conditions and crashes introduced in 1.4.10. They should be rare, but users are strongly encouraged to upgrade.
Adds the ability to rebalance and reassign slab memory.

Fixes

  • Don't compute incorrect argc for timedrun
  • Fix 'age' stat for stats items
  • Binary deletes were not ticking stats counters
  • Fix a race condition from 1.4.10 on item_remove
  • Close some idiotic race conditions
  • Initial slab automover
  • Slab reassignment
  • Clean do_item_get logic a bit. fix race.
  • Clean up the do_item_alloc logic
  • Shorten lock for item allocation more
  • Fix to build with cyrus sasl 2.1.25


New Features

Slab Reassign

Long running instances of memcached may run into an issue where all available memory has been assigned to a specific slab class (say items of roughly size 100 bytes). Later the application starts storing more of its data into a different slab class (items around 200 bytes). Memcached could not use the 100 byte chunks to satisfy the 200 byte requests, and thus you would be able to store very few 200 byte items.

1.4.11 introduces the ability to reassign slab pages. This is a beta feature and the commands may change for the next few releases, so please keep this in mind. When the commands are finalized they will be noted in the release notes.

Enable slab reassign on startup:

$ memcached -o slab_reassign

Once all memory has been assigned and used by items, you may use a command to reassign memory.

$ echo "slabs reassign 1 4" | nc localhost 11211

That will return an error code indicating success, or a need to retry later. Success does not mean that the slab was moved, but that a background thread will attempt to move the memory as quickly as it can.
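
A minimal Python sketch of sending the same command without nc. The helper names (build_reassign_command, slabs_reassign) are hypothetical, and the sketch assumes that a reply of exactly OK means the request was accepted; any other reply is returned verbatim so the caller can decide whether to retry later.

```python
import socket
from typing import Tuple


def build_reassign_command(src_class: int, dst_class: int) -> bytes:
    """Build the ASCII command asking to move a page from src_class to dst_class."""
    return f"slabs reassign {src_class} {dst_class}\r\n".encode("ascii")


def slabs_reassign(src_class: int, dst_class: int,
                   host: str = "localhost", port: int = 11211,
                   timeout: float = 2.0) -> Tuple[bool, str]:
    """Send the command and return (accepted, raw_reply).

    Assumption: "OK" means the background thread accepted the request.
    It does not mean the page has already been moved.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_reassign_command(src_class, dst_class))
        reply = conn.recv(1024).decode("ascii", errors="replace").strip()
    return reply == "OK", reply
```

Calling slabs_reassign(1, 4) against an instance started with -o slab_reassign mirrors the nc example above.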

Slab Automove

While slab reassign is a manual feature, there is also the start of an automatic memory reassignment algorithm.

$ memcached -o slab_reassign,slab_automove

The above enables it on startup, and it may also be enabled or disabled at runtime:

$ echo "slabs automove 0" | nc localhost 11211

The algorithm is slow and conservative. If a slab class is seen as having the highest eviction count 3 times 10 seconds apart, it will take a page from a slab class which has had zero evictions in the last 30 seconds and move the memory.
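
The rule described above can be sketched in Python as follows. This is an illustration only, not the code shipped in memcached or in scripts/mc_slab_mover: the class name AutomoveSketch is hypothetical, and it assumes "highest eviction count" means the largest increase in evictions between two samples taken 10 seconds apart.

```python
from typing import Dict, Optional, Tuple


class AutomoveSketch:
    """Feed observe() one sample of cumulative eviction totals every 10 seconds."""

    WINS_NEEDED = 3    # same class has the most evictions in 3 samples in a row
    QUIET_SAMPLES = 3  # 3 samples x 10 s = 30 s with zero evictions

    def __init__(self) -> None:
        self.prev: Optional[Dict[int, int]] = None
        self.quiet: Dict[int, int] = {}
        self.winner: Optional[int] = None
        self.wins = 0

    def observe(self, evicted: Dict[int, int]) -> Optional[Tuple[int, int]]:
        """Return (source_class, dest_class) when a page should move, else None."""
        if self.prev is None:
            self.prev = dict(evicted)  # first sample is only a baseline
            return None
        deltas = {c: n - self.prev.get(c, 0) for c, n in evicted.items()}
        self.prev = dict(evicted)

        # Count consecutive samples without evictions for each class.
        for c, d in deltas.items():
            self.quiet[c] = self.quiet.get(c, 0) + 1 if d == 0 else 0

        top = max(deltas, key=lambda c: deltas[c], default=None)
        if top is None or deltas[top] <= 0:
            self.winner, self.wins = None, 0
            return None
        self.wins = self.wins + 1 if top == self.winner else 1
        self.winner = top
        if self.wins < self.WINS_NEEDED:
            return None

        donors = sorted(c for c, q in self.quiet.items()
                        if q >= self.QUIET_SAMPLES and c != top)
        if not donors:
            return None
        self.wins = 0
        return donors[0], top


mover = AutomoveSketch()
for sample in ({1: 0, 4: 0}, {1: 0, 4: 5}, {1: 0, 4: 9}, {1: 0, 4: 20}):
    decision = mover.observe(sample)
print(decision)  # (1, 4): take a page from class 1 and give it to class 4
```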

There are lots of cases where this will not be sufficient, and we invite the community to help improve upon the algorithm. Included in the source directory is scripts/mc_slab_mover. See perldoc for more information:

$ perldoc ./scripts/mc_slab_mover

It implements the same algorithm as built into memcached, and you may modify it to better suit your needs and improve on the script or port it to other languages. Please provide patches!

Slab Reassign Implementation

Slab page reassignment requires some tradeoffs:

  • All items larger than 500k (even if they're under 730k) take 1MB of space
  • When memory is reassigned, all items that were in the 1MB page are evicted
  • When slab reassign is enabled, an extra background thread is used

The first item will be improved in later releases, and is avoided if you start memcached without the -o slab_reassign option.

New Stats

STAT slab_reassign_running 0
STAT slabs_moved 0

slab_reassign_running indicates if the slab thread is attempting to move a page. It may need to wait for some memory to free up, so it could take several seconds.

slabs_moved is simply a count of how many pages have been successfully moved.
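
A small Python sketch for reading these counters from the text returned by the stats command. The function name parse_stats is hypothetical; the input is assumed to be the usual "STAT name value" lines terminated by END.

```python
def parse_stats(raw: str) -> dict:
    """Turn 'STAT name value' lines into a {name: value} dict of strings."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


sample = "STAT slab_reassign_running 0\r\nSTAT slabs_moved 0\r\nEND\r\n"
stats = parse_stats(sample)
# A page move is in progress while slab_reassign_running is non-zero.
print(stats["slab_reassign_running"] != "0", int(stats["slabs_moved"]))
```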


You can download it from : http://memcached.googlecode.com/files/memcached-1.4.11.tar.gz




Thursday, November 17 2011

Memcached 1.4.10 Released

Release notes from official site :

Overview

This release is focused on thread scalability and performance improvements. This release should be able to feed data back faster than any network card can support as of this writing.

Fixes

  • Disable issue 140's test.
  • Push cache_lock deeper into item_alloc
  • Use item partitioned lock for as much as possible
  • Remove the depth search from item_alloc
  • Move hash calls outside of cache_lock
  • Use spinlocks for main cache lock
  • Remove uncommon branch from asciiprot hot path
  • Allow all tests to run as root


New Features

Performance

For more details, read the commit messages from git.
Each change was carefully researched to not increase memory requirements and to be safe from deadlocks.
Each change was individually tested via mc-crusher (http://github.com/dormando/mc-crusher) to ensure benefits.

Tested improvements in speed between 3 and 6 worker threads (-t 3 to -t 6). More than -t 6 reduced speed.

In my tests, set was raised from 300k/s to around 930k/s. Key fetches/sec (multigets) from 1.6 million/s to around 3.7 million/s for a quadcore box. A machine with more cores was able to pull 6 million keys per second. Incr/Decr performance increased similar to set performance. Non-bulk tests were limited by the packet rate of localhost or the network card.

Multiple NUMA nodes reduce performance (but not enough to really matter). If you want the absolute highest speed, as of this release you can run one instance per NUMA node (where n is your core count):

numactl --cpunodebind=0 memcached -m 4000 -t n

Older versions of memcached are plenty fast for just about all users. This changeset is to allow more flexibility in future feature additions, as well as improve memcached's overall latency on busy systems.

Keep an eye on your hitrate and performance numbers. Please let us know immediately if you experience any regression from these changes. We have tried to be as thorough as possible in testing, but you never know.


You can download it from : http://memcached.googlecode.com/files/memcached-1.4.10.tar.gz




Tuesday, October 18 2011

Memcached 1.4.9 Released

Release notes from official site :

Overview

Small bugfix release. Mainly fixing a critical issue where using -c to increase the connection limit was broken in 1.4.8. If you are on 1.4.8, an upgrade is highly recommended.

Fixes

  • Add a systemd service file
  • Fix some minor typos in the protocol doc
  • Issue 224 check retval of main event loop
  • Fix -c so maxconns can be raised above default.


New Features

No new features in this version.

1.4.9 is *not* what 1.4.9-beta1 was.
1.4.10 will be the performance release


You can download it from : http://memcached.googlecode.com/files/memcached-1.4.9.tar.gz




Sunday, October 16 2011

Memcached server profiling with mk-query-digest

Introduction

Monitoring Memcached through hit rate and evictions is sometimes not enough to find out what is really happening.
Maatkit is a Perl toolkit for MySQL and PostgreSQL, but one tool in this toolkit can also profile Memcached traffic.
This article focuses on mk-query-digest used as a Memcached traffic profiler.

Dumping tcp traffic with tcpdump

Before running tcpdump, be aware that here it takes about 35% of the CPU on a single-vCPU server, and the output file grows at about 300 MB per minute. This will vary with your cached data and request rate.

You can run tcpdump directly on the cache server to capture every request issued to that server, or on an Apache server (in which case you only capture the memcached requests issued by that particular server).

Installation

To start profiling, we first have to capture memcached network traffic with tcpdump. If you don't have it installed, you can grab it from www.tcpdump.org or install it using your system's preferred method.

Using tcpdump

To start tcpdump, use the command below :

/usr/sbin/tcpdump -s 65535 -x -n -q -tttt -i eth0 port 11211 > /tmp/tcpdump.log

For more information about tcpdump filters, take a look at the tcpdump filters explanation listed under Resources.
If everything is fine, tcpdump will respond :

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

To stop tcpdump, hit CTRL + C. The result will look like this :

599513 packets captured
607766 packets received by filter
8244 packets dropped by kernel

And we're done with tcpdump. Let's profile !

Memcached profiling

Although mk-query-digest is designed to be as efficient as possible, it can use a lot of CPU and memory, so I prefer to move the capture file to our log analysis server. Once the log file is moved, we can install mk-query-digest.

Download the latest release of mk-query-digest

Simply execute

wget http://www.maatkit.org/get/mk-query-digest

or

wget  http://maatkit.googlecode.com/svn/trunk/mk-query-digest/mk-query-digest

We also need to make this Perl script executable (0755 is enough, there is no need to make it world-writable)

chmod 0755 mk-query-digest


Profiling with mk-query-digest

Then we can start profiling with the help of http://www.maatkit.org/doc/mk-query-digest.html

During profiling, mk-query-digest reports its progress so we know it is still running (a nice addition in a recent release)

dump_merlie.log:  28% 01:14 remain
dump_merlie.log:  57% 00:45 remain
dump_merlie.log:  85% 00:14 remain

Note that if you hit CTRL + C before the end, mk-query-digest will still display the results for the data it has already processed.

Results by Memcached server response code

The first command shows all memcached response codes found in the tcpdump capture

./mk-query-digest --type memcached --group-by res --limit 10 /tmp/tcpdump.log

After some time, the result will be displayed.

mk-query-digest_global.png

What it tells us :

General & time stats

mk-query-digest_global_overall.png
The first line shows that the 331.83k memcached queries produced 4 unique results for --group-by res (VALUE, INTERRUPTED, NOT_FOUND, STORED), and that over the time interval 780.87 queries per second were issued with a concurrency of 1.42x.
The lines below show the time taken to respond to these commands (min, max, avg, 95%, ...).

Commands, miss & errors

mk-query-digest_global_commands.png

Finally, we see which commands were issued to memcached and which errors took place (that is why the % sums are above 100%), but we have to distinguish two sets of data :

Memc_get tells us that 96% of the 331K commands were get commands, 3% were set commands, and 4 commands were replace commands; together these add up to 100%.
Memc_miss counts any command that resulted in NOT_FOUND; in this case, 3% of all commands tried to access a nonexistent key.
Memc_error refers to retrieval (get) commands that were INTERRUPTED; 56 of the 308K gets were in this case.

Memcached results

mk-query-digest_global_results.png
The columns should be fairly self-explanatory: rank, query ID (not in use here), response time sum and percentage of total, number of calls and response time per call, and results (requested by our --group-by res).

First observation

As we can deduce from the screen above, this server seems to have a pretty good hit ratio (only 3% misses), every NOT_FOUND seems to be followed by a STORED (a miss on a get followed by a set), and everything is fast. But we found 56 INTERRUPTED responses (about 0.017% of the 331.83k queries) that alone account for 24.8% of the time taken to execute 270K commands. Next we have to find out what is wrong with this tiny fraction of commands that end up INTERRUPTED.
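
As a quick check of how small the INTERRUPTED share is, here is the arithmetic in Python, using the counts reported in the summary (56 INTERRUPTED responses out of 331.83k queries). The variable names are only for illustration.

```python
total_queries = 331_830  # 331.83k queries reported in the summary
interrupted = 56         # responses with result INTERRUPTED

share = interrupted / total_queries * 100
print(f"INTERRUPTED share: {share:.3f}%")  # INTERRUPTED share: 0.017%
```

So roughly 2 commands in 10,000 are responsible for about a quarter of the total response time.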

Following results : VALUE result

After this summary, every result is detailed, starting with the VALUE result

mk-query-digest_value_global.png

Let's see what each part says :

General & time stats

mk-query-digest_value_overall.png
This part is similar to the general time & stats screen. Here you get time stats (total, max, avg, 95%, ...), the hosts that asked for values with their percentage share, the size of the returned values (total, max, avg, ...), and the distribution of commands (I think the two set commands are a misread by mk-query-digest).

Query time distribution

mk-query-digest_value_query_time.png
This is a particularly interesting logarithmic chart of time clustering; it shows the distribution of request execution times. As we can see, most of them are under 100us, and a small part of them are a bit slow.
You should check this chart: high response times must be avoided.

Looking at the overall stats, it seems one request took 200s to execute. This can be caused by a lot of things: a client closing the connection, a network problem, and so on. In my case, 314s total - 200s max = 114s to execute and send 308K get requests seems pretty good, so I will not investigate this further, but I will watch it regularly. I will run another profiling session to see what's happening after explaining INTERRUPTED.

INTERRUPTED

mk-query-digest_interrupted_global.png
The time chart shows that INTERRUPTED requests take a lot of time and that one object is 1023.86 kBytes (the limit is 1024), but the key print can't tell us whether one particular key is the cause.
So we have to change our mk-query-digest parameters to investigate further.

Memcached result INTERRUPTED, grouped by keys

What I want is to see which key retrievals were interrupted, to find out whether a particular key is broken or something similar.

Building the mk-query-digest query

mk-query-digest supports Perl regular expressions and gives access to EventAttributes (see the mk-query-digest documentation).
Do not hesitate to test your own commands; if something is wrong, mk-query-digest will tell you when you launch it.
So with some tweaks, we now have a command that only shows INTERRUPTED events, grouped by key, to find out what is messed up.

./mk-query-digest --type memcached --group-by key --limit 10 --filter \
                  '($event->{res}) =~ m/INTERRUPTED/' /tmp/tcpdump.log
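
To make the effect of --filter combined with --group-by key more concrete, here is a rough Python equivalent of the logic. It is not how mk-query-digest is implemented: the function name is hypothetical, and the event layout (dicts with res, key and query_time fields) is an assumption made for this sketch.

```python
import re
from collections import defaultdict


def group_interrupted_by_key(events):
    """Keep events whose result matches INTERRUPTED, then total them per key."""
    pattern = re.compile(r"INTERRUPTED")
    totals = defaultdict(lambda: {"count": 0, "time": 0.0})
    for event in events:
        if not pattern.search(event["res"]):
            continue  # same role as the --filter expression
        bucket = totals[event["key"]]  # same role as --group-by key
        bucket["count"] += 1
        bucket["time"] += event["query_time"]
    # Slowest keys first, like the ranking in the report.
    return sorted(totals.items(), key=lambda kv: kv[1]["time"], reverse=True)


# Made-up sample events, only to show the shape of the result.
sample_events = [
    {"res": "INTERRUPTED", "key": "aee2ff", "query_time": 116.0},
    {"res": "VALUE", "key": "aee2ff", "query_time": 0.001},
    {"res": "INTERRUPTED", "key": "b01c9d", "query_time": 0.5},
]
for key, total in group_interrupted_by_key(sample_events):
    print(key, total["count"], total["time"])
```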


Result

mk-query-digest_interrupted_bykey.png
We now see that one particular key takes 116 sec, while the others are below 1 sec.
Using phpMemcachedAdmin to issue get requests for these keys, nothing seemed wrong and values were returned very quickly, so my first guess about a closed connection seems right.

Another thing I noted is that using a hash for the memcached key is a good idea, but our developers need to put some identifier in front of it, like projectA_aee2ff ...., because looking at these keys, I can't tell what they refer to.

Looking for big return values

We can also look at the response time of big values, or find values that are about to exceed the 1 MByte Memcached limit.

./mk-query-digest --type memcached --group-by fingerprint --limit 10 --filter \
                  '($event->{bytes}) >= 1_000_000' /tmp/tcpdump.log

Results

I only get one result for this particular request

mk-query-digest_bigvalues.png

The value stored under this key is right at the Memcached maximum object size, and must be watched and/or reduced in size.

Useful Options

--save-results
Save results to the specified file.

--timeline
Show a timeline of events.

./mk-query-digest --type memcached --timeline --group-by res /tmp/tcpdump.log

mk-query-digest_timeline.png

--watch-server
This option tells mk-query-digest which server IP address and port (like "10.0.0.1:3306") to watch when parsing tcpdump output. For the capture above, the port would be 11211.

Resources

Tcpdump homepage : www.tcpdump.org
Tcpdump filters explanation : http://www.cs.ucr.edu/~marios/ethereal-tcpdump.pdf
Maatkit homepage : http://www.maatkit.org/
Mk-query-digest homepage : http://www.maatkit.org/doc/mk-query-digest.html




Tuesday, October 4 2011

Memcached 1.4.8 Released

Release notes from official site :

Overview

Feature and bugfix release. New Touch commands, counters, and a change to connection limit functionality. Included is an important bugfix for binary protocol users. The binary get command was not activating the LRU algorithm. Fetching an item would not prevent it from getting expired early.

Fixes

  • Fix to write correct pid from start-memcached
  • Fix to enable LRU when using binary protocol
  • Upgrade stats items counters to 64bit
  • Add new stats expired_unfetched, evicted_unfetched
  • Allow setting initial size of the hash table
  • Expose stats for the internal hash table
  • Issue 220 : incr would sometimes return the previous item's CAS
  • Fixed bug on multi get processing
  • Experimental maxconns_fast option
  • Add an ASCII touch command
  • Add binary GATK/GATKQ
  • Backport binary TOUCH/GAT/GATQ commands
  • Issue 221 : Increment treats leading spaces as 0
  • Fix compile error on OS X


New Features

Touch Commands

Binary Touch/GAT commands were backported from 1.6. New GATK/GATKQ commands were added for completeness. Finally, an Ascii protocol touch command was also added.

Touch commands are used to update the expiration time of an existing item without fetching it. Say you have a counter set to expire in five minutes, but you may want to push back the expiration time by five more minutes, or change it to 15 minutes. With touch, you can do that.

The binary protocol also adds GAT commands (Get And Touch), which allow you to fetch an item and simultaneously update its expiration time.
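
A minimal Python sketch of the ASCII touch command. The helper names are hypothetical, and the sketch assumes the server replies TOUCHED when the item exists; any other reply (for example when the key is not found) is treated as a failure.

```python
import socket


def build_touch_command(key: str, exptime: int) -> bytes:
    """Build the ASCII command that resets the expiration time of key."""
    if not key or len(key) > 250 or any(ch.isspace() for ch in key):
        raise ValueError("keys must be 1-250 characters without whitespace")
    return f"touch {key} {exptime}\r\n".encode("ascii")


def touch(key: str, exptime: int,
          host: str = "localhost", port: int = 11211,
          timeout: float = 2.0) -> bool:
    """Return True if the server reports the item was touched."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_touch_command(key, exptime))
        reply = conn.recv(1024).decode("ascii", errors="replace").strip()
    return reply == "TOUCHED"  # assumption: any other reply means "not updated"
```

For the counter example above, touch("counter", 300) would push the expiration back to five minutes from now, and touch("counter", 900) would change it to 15 minutes.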

Fast Connection Limit Handling

A new option, -o, has appeared! With -o, new, experimental, or highly specific options are given full names. The first of these is maxconns_fast

$ memcached -o maxconns_fast

This option changes the way the maximum connection limit is handled. By default, when memcached runs out of file descriptors, it stops listening for new connections. When this happens, connections will sit in the listen backlog (defaulting to 1024, and adjustable with the -b option). Once some connections close off, memcached will start accepting new connections again and they will be served.

This is undesirable as it can cause clients to delay or time out for a long period of time. Long enough that it may be quicker to treat the items as a cache miss.

When a client connects and memcached is configured with maxconns_fast, it writes an error to the client and immediately closes the connection. This is similar to how MySQL operates, whereas the default is similar to Apache.

It is experimental as it is unknown how clients will handle this change. Please help test and report any issues to upstream client maintainers!

Internal Hash Table

STAT hash_power_level 16
STAT hash_bytes 524288
STAT hash_is_expanding 0

Now it's possible to see how much memory the hash table itself uses. This can be useful for deciding on RAM limits for very large instances.

There is also a new option for setting the size of the hash table on startup:

$ memcached -o hashpower=20

If you run instances with many millions of items, and items are added very rapidly on a restart, it may be desirable to presize the hash table. Normally memcached will dynamically grow the hash table as needed, and this operation is generally very low overhead. If you put decals on your '96 Mazda grapefruit shootermobile, you may like this option.

Just examine the hash_power_level before restarting your instances, and adjust the startup command.
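
The relationship between hash_power_level and hash_bytes can be checked with a little arithmetic. This sketch assumes the table holds 2^hash_power_level buckets of one 8-byte pointer each (a 64-bit build), which is consistent with the sample stats above: 2^16 x 8 = 524288.

```python
POINTER_BYTES = 8  # assumption: 64-bit build, one pointer per hash bucket


def hash_table_bytes(hash_power_level: int) -> int:
    """Estimate the memory used by the hash table itself."""
    return (1 << hash_power_level) * POINTER_BYTES


print(hash_table_bytes(16))  # 524288, the hash_bytes value shown above
print(hash_table_bytes(20))  # 8388608, i.e. 8 MB for -o hashpower=20
```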

expired_unfetched, evicted_unfetched

The two stats count, respectively, items that expired and had their memory reused, and valid items that were evicted, where the item was never touched by get/incr/append/etc. operations in the meantime. They are useful for seeing how many wasted items are being set and then rolling out through the bottom of the LRUs.

If these counters are high, you may consider auditing what is being put into the cache.

You can download it from : http://memcached.googlecode.com/files/memcached-1.4.8.tar.gz



