OP here

The solution is very simple: as mentioned in the article, just use a newer kernel and always set memory limits for containers. The blog post is based on an older kernel (2.6.32) that quite a few people irresponsibly still use in containerized environments, mostly because EL6 is so popular among enterprises.

In newer kernels, allocations from object pools are tied to the limits of the memory cgroup that requested them from userspace, if any. You wouldn't run into this specific issue; the container would simply be unable to use more than X MB of dcache entries (although there are probably other, minor issues, for example related to sharing global kernel mutexes and such).
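
If you want to check from inside a container whether a memory limit is actually in effect, something like the following rough sketch could work. The two paths are the conventional cgroup v2 and v1 mount points and are an assumption here (your host may mount them elsewhere), and the "unlimited" threshold is just a heuristic I picked for v1's very large sentinel value.

```python
from pathlib import Path

# Conventional locations of the memory limit as seen from inside a container.
# Assumption: default cgroup mount points; a given host may differ.
CANDIDATES = (
    Path("/sys/fs/cgroup/memory.max"),                    # cgroup v2
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),  # cgroup v1
)

# cgroup v1 reports "no limit" as a huge number rather than a keyword,
# so anything at or above this heuristic threshold is treated as unlimited.
UNLIMITED_THRESHOLD = 1 << 60


def parse_limit(raw):
    """Return the limit in bytes, or None when no limit is set."""
    raw = raw.strip()
    if raw == "max":  # cgroup v2 spelling of "unlimited"
        return None
    value = int(raw)
    return None if value >= UNLIMITED_THRESHOLD else value


def current_memory_limit():
    """Return the limit from the first readable file, or None."""
    for path in CANDIDATES:
        try:
            return parse_limit(path.read_text())
        except (OSError, ValueError):
            continue  # file missing or unparsable: try the next candidate
    return None


if __name__ == "__main__":
    limit = current_memory_limit()
    print("no memory limit found" if limit is None else f"limit: {limit} bytes")
```

A result of "no memory limit found" means either no limit is set or neither file was readable, so treat it as a hint rather than proof.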



I couldn't understand two things from the article:

1. If one of the two containers caused the issue, then why did you need both containers to reproduce it? Why was running just the offending one not enough?

My guess is that the "worker" container requested those non-existent files from a volume mounted by the other container. Is that right?

2. Kernel hash table implementation. The whole point of a hash table is that its size is O(N), where N is the number of elements it holds.

Capping the hash table size at some constant and putting all the excess elements into its linked lists makes it perform like a linked list divided by that constant, which is no surprise. So it sounds like there's a bug in the dentry hash table implementation: it should either grow according to the element count, or stop accepting new entries (or evict old ones).
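
To make the "linked list divided by a constant" point concrete, here is a toy sketch. It is not the kernel's dentry hash, just a generic chained hash table with a fixed bucket count; the class and method names are made up for illustration.

```python
class FixedBucketTable:
    """Chained hash table whose bucket array never grows (toy model)."""

    def __init__(self, num_buckets):
        self.buckets = [[] for _ in range(num_buckets)]

    def _chain(self, key):
        # Pick the chain for this key; the bucket count is fixed forever.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key):
        self._chain(key).append(key)

    def lookup_cost(self, key):
        """Number of comparisons needed to find key, or to conclude it is absent."""
        chain = self._chain(key)
        for position, candidate in enumerate(chain, start=1):
            if candidate == key:
                return position
        return len(chain)  # a miss has to walk the whole chain


if __name__ == "__main__":
    for entries in (80, 800, 8000):
        table = FixedBucketTable(num_buckets=8)
        for key in range(entries):
            table.insert(key)
        # Look up a key that was never inserted (a miss).
        print(entries, "entries ->", table.lookup_cost(10**6), "comparisons per miss")
```

With 8 buckets, a miss costs roughly N/8 comparisons, so the lookup cost grows linearly with the number of entries, which is the behavior I am describing above.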


> 1. If one of the two containers caused the issue, then why did you need both containers to reproduce it? Why was running just the offending one not enough?

Running just the offending one would clearly have been enough, since its effects would have caused the same increased latency for every other process in the system (including itself). However, using a second container to observe the performance degradation proves the point that one container is able to affect another, which is the gist of the article, since too many people think containers provide much more isolation than they actually do.

> My guess is that the "worker" container requested those non-existent files from a volume mounted by the other container. Is that right?

No, the containers didn't share any volume. The dentry cache is effectively a singleton within the kernel, so even if the sets of volumes don't overlap, all processes in the system will see a performance degradation, regardless of where the files being accessed reside.
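
One way to see the system-wide nature of the cache is to read /proc/sys/fs/dentry-state, which reports global counters rather than per-container ones. A minimal sketch follows; it only interprets the first two fields (total dentries and unused dentries), since the meaning of the remaining fields varies between kernel versions, and the helper name is my own.

```python
from pathlib import Path

DENTRY_STATE = Path("/proc/sys/fs/dentry-state")


def parse_dentry_state(raw):
    """Extract the first two counters from the contents of dentry-state.

    Only nr_dentry and nr_unused are interpreted here; later fields are
    ignored because their meaning depends on the kernel version.
    """
    fields = [int(token) for token in raw.split()]
    return {"nr_dentry": fields[0], "nr_unused": fields[1]}


if __name__ == "__main__":
    try:
        print(parse_dentry_state(DENTRY_STATE.read_text()))
    except OSError as exc:
        print(f"could not read {DENTRY_STATE}: {exc}")
```

Sampling these counters before and after a burst of file lookups gives a rough view of how much the cache has grown.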

> 2. Kernel hash table implementation. The whole point of a hash table is that its size is O(N), where N is the number of elements it holds.

Your speculation is correct. However, there are sound reasons for doing this in the kernel (and for not allowing the main array of the hash table to expand or shrink dynamically), so I wouldn't consider it a bug per se. I'll refer you to this excellent comment: https://news.ycombinator.com/item?id=14660954


Thank you. Very good article, thank you for writing it!


It's not irresponsible to use a perfectly fine OS.

What is irresponsible is for Docker to purposefully avoid mentioning that it has endless issues on these widely used OSes.

The 2.6.x kernel is used in CentOS/RHEL 6, which is the standard in numerous enterprises.

It is not a 2.6 kernel by the way; Red Hat is backporting tons of stuff from the 3 and 4 branches.


> It's not irresponsible to use a perfectly fine OS.

The first problem with this statement is the idea that there's such a thing as a "perfectly fine OS". We don't even need to consider containers: the longer an OS has been in the wild, the more time there has been for its vulnerabilities to be found and exploited.

Windows XP is a perfectly fine OS; using it nowadays is irresponsible.

> What is irresponsible is for Docker to purposefully avoid mentioning that it has endless issues on these widely used OSes.

That responsibility doesn't and should never fall on the developers of an application. The extent of one's responsibility as a developer is to define the recommendations for its use. Anything beyond that is entirely on the user.

One would go insane if one had to worry about every single operating system someone decided to run one's application on.

> It is not a 2.6 kernel by the way; Red Hat is backporting tons of stuff from the 3 and 4 branches.

"Backporting stuff" doesn't make it not the 2.6 kernel; it very much is.


> We don't even need to consider containers: the longer an OS has been in the wild, the more time there has been for its vulnerabilities to be found and exploited.

I challenge you to find exploitable bugs in its kernel. Windows XP is not supported anymore, while RHEL 6 is.


> ES6 is so popular among enterprises

I had to re-read this a few times-- I think you meant EL6, right?


Updated, thanks! I am working with ElasticSearch (ES) more than EL these days and my muscle memory tricked me ;)


I ran into a similar issue with kernel memory caching behavior.

While it's nice to just say "LOL, upgrade, you fool", most of us are stuck with the environment we're given.

You can adjust kernel-level memory behavior; in particular, vfs_cache_pressure can be set very high to force the kernel to reclaim dentries more aggressively.

https://www.kernel.org/doc/Documentation/sysctl/vm.txt
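
For reference, a small sketch of reading and raising the setting through /proc/sys/vm/vfs_cache_pressure. The helper names and the example value of 1000 are my own choices, not a recommendation; writing requires root, and the change does not persist across reboots.

```python
from pathlib import Path

VFS_CACHE_PRESSURE = Path("/proc/sys/vm/vfs_cache_pressure")


def parse_pressure(raw):
    """Parse the current value as read from the sysctl file."""
    return int(raw.strip())


def render_pressure(value):
    """Validate a new value and format it for writing to the sysctl file."""
    if not isinstance(value, int) or value < 0:
        raise ValueError("vfs_cache_pressure must be a non-negative integer")
    return f"{value}\n"


def set_pressure(value, path=VFS_CACHE_PRESSURE):
    """Write a new value and return the previous one (needs root)."""
    previous = parse_pressure(path.read_text())
    path.write_text(render_pressure(value))
    return previous


if __name__ == "__main__":
    try:
        # 1000 is an arbitrary "very high" example value; the default is 100.
        old = set_pressure(1000)
        print(f"vfs_cache_pressure: {old} -> 1000")
    except OSError as exc:
        print(f"could not update {VFS_CACHE_PRESSURE}: {exc}")
```

The vm.txt document linked above describes how values above 100 make the kernel prefer reclaiming dentries and inodes.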



