Tuesday, February 19, 2008

JVM Performance Tuning

Last week was JBoss World, and it was exciting to be a part of it. I gave a presentation on performance tuning our Enterprise Application Platform, or EAP, and it was packed. In fact, people were sitting on the floor in pretty much all available space. What struck me about the presentation, and many of the discussions I had with individuals afterwards, is that JVM tuning is a big topic. So, I thought I would share some of what I learned over the past couple of months as I was preparing for my presentation.

In preparing for my presentation, I wrote an EJB 3 application, wrote a load test for it, and applied optimizations to various configuration parameters within the EAP, the JVM, and the operating system. In particular, one combination of JVM and OS settings made a huge difference in throughput, and it's something that I wanted to share here.

When using a 64-bit OS, in my case Fedora 8 and RHEL 5.1, I wanted to investigate large page memory support, or HugeTLB as it's referred to within the Linux kernel. What I found was that the documentation around using this is very scarce, and what does exist is too incomplete to actually make it work. I also found that it makes a huge difference in the overall throughput and response times of an application when using heap sizes above 2GB.

So, without further ado, let's dive into how to set this up. These instructions are for Linux, specifically for Fedora 8 and RHEL 5.1, but the results should be generally applicable to any 64-bit OS and 64-bit JVM that supports large page memory (which all the proprietary UNIXes do, and I found an MSDN article describing how to use this on 64-bit Windows).

You must have root access for these settings. First, you need to set the kernel parameter for shared memory to be at least as big as the amount of memory you want to set aside for the JVM to use as large page memory. Personally, I like to just set it to the total amount of memory in the server, so I can play with different heap sizes without having to adjust this every time. You set this by putting the following entry into /etc/sysctl.conf:

kernel.shmmax = n

where n is the number of bytes. So, if you have a server with 8GB of RAM, then you would set it to 8589934592, or 1024*1024*1024*8, which is 8GB.
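If you'd rather not do the multiplication by hand, the same calculation can be done with shell arithmetic (the 8GB figure is just my example server's RAM; substitute your own):

```shell
# Compute kernel.shmmax in bytes for a server with 8GB of RAM
SHMMAX=$((1024 * 1024 * 1024 * 8))
echo "kernel.shmmax = $SHMMAX"   # -> kernel.shmmax = 8589934592
```
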

Second, you need to set a virtual memory kernel parameter to tell the OS how many large memory pages you want to set aside. You set this by putting the following entry into /etc/sysctl.conf:

vm.nr_hugepages = n

where n is the number of pages, based on the page size listed in /proc/meminfo. If you cat /proc/meminfo, you will see the large page size of your particular system. This varies depending on the architecture of the system. Mine is an old Opteron system, and it has a large page size of 2048 KB, as shown by the following line in /proc/meminfo:

Hugepagesize: 2048 kB

I wanted to set aside 6GB, so I set the parameter to 3072, which is (1024*1024*1024*6)/(1024*1024*2), or 6GB divided by 2MB (since 2048 KB is 2MB).
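The same arithmetic in shell form, assuming the 2048 KB Hugepagesize from my system (check your own /proc/meminfo value first):

```shell
# Huge page size in KB, as reported by /proc/meminfo (2048 on my Opteron box)
HUGEPAGESIZE_KB=2048
# Desired large page pool: 6GB, expressed in KB
POOL_KB=$((6 * 1024 * 1024))
# Number of huge pages to request via vm.nr_hugepages
echo "vm.nr_hugepages = $((POOL_KB / HUGEPAGESIZE_KB))"   # -> vm.nr_hugepages = 3072
```
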

After this, you need to set another virtual memory parameter to give your process permission to access the shared memory segment. In /etc/group, I created a new group called hugetlb (you can call it whatever you like, as long as it doesn't collide with any other group name). It was assigned a gid of 501 on my system; this will vary depending on whether you use the GUI tool, as I did, or the command line, and on what groups you already have defined. You put that group id in /etc/sysctl.conf as follows:

vm.hugetlb_shm_group = gid

where gid, in my case, was 501. You also add that group to whichever user the JVM will be running as. In my case, this was a user called jboss.
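On the command line, the equivalent of what I did in the GUI would look something like this (the hugetlb group name and jboss user are just my values; run as root):

```shell
# Create the group and add the jboss user to it
groupadd hugetlb
usermod -a -G hugetlb jboss
# Look up the gid that was assigned, for use as vm.hugetlb_shm_group
getent group hugetlb | cut -d: -f3
```
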

That concludes the kernel parameter setup, but there is still one more OS setting, which raises the user's memlock limit so the process can lock the shared memory segment into RAM. Large page shared memory is locked into memory and cannot be swapped to disk, which is another major advantage of using large page memory: having your heap space swapped to disk can be catastrophic for an application. So, you set this parameter in /etc/security/limits.conf as follows:

jboss soft memlock n
jboss hard memlock n

where n is equal to the number of huge pages set in vm.nr_hugepages, times the page size in KB from /proc/meminfo (the memlock limit is expressed in KB), which in my example is 3072*2048 = 6291456. This concludes the OS setup, and now we can actually configure the JVM.
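Again in shell form, using the 3072 pages and 2048 KB page size from my example:

```shell
# memlock limit (in KB) = number of huge pages * huge page size in KB
PAGES=3072
PAGESIZE_KB=2048
echo "jboss soft memlock $((PAGES * PAGESIZE_KB))"   # -> jboss soft memlock 6291456
echo "jboss hard memlock $((PAGES * PAGESIZE_KB))"   # -> jboss hard memlock 6291456
```
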

The JVM parameter for the Sun JVM is -XX:+UseLargePages (for BEA JRockit it's -XXlargePages, and for IBM's JVM it's -Xlp). If you have everything set up correctly, then after starting the JVM you should be able to look at /proc/meminfo and see that the large pages are being used.
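For example, on Linux the huge page counters can be checked like this; if HugePages_Free drops after the JVM starts, the pages are in use (the specific numbers will of course differ per system):

```shell
# Show the huge page counters; compare before and after JVM startup
grep '^HugePages' /proc/meminfo
# Typical fields: HugePages_Total, HugePages_Free, HugePages_Rsvd
```
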

A couple of additional caveats and warnings. First, you can have the kernel settings take effect dynamically by running sysctl -p. However, if the server has been running for almost any length of time, you may not get all the pages you requested, because large pages require contiguous physical memory; you may have to reboot for the settings to take full effect. Second, when you allocate this memory, it is removed from the general memory pool and is not accessible to applications that don't have explicit support for large page memory and aren't configured to use it. So, what kind of results can you expect?

Well, in my case, I was able to achieve an over 3x improvement in throughput for my EJB 3 application, of which fully 60 to 70% was due to using large page memory with a 3.5GB heap. Notably, a 3.5GB heap without large memory pages didn't provide any benefit over smaller heaps without large pages. Besides the throughput improvements, I also noticed that GC frequency was cut by two-thirds, and total GC time was cut by a similar percentage (each individual GC event was much shorter in duration). Of course, your mileage will vary, but this one optimization is worth looking at for any high throughput application.

Good luck!

Monday, February 04, 2008

Yahoo and Microsoft; Mixing Oil and Water

Ever since Microsoft announced its $44.6 billion offer for Yahoo, there have been many articles flying around about the potential merger. What I find most interesting is the lack of coverage of the technology issues around such an integration.

I have seen only two articles that have mentioned technology differences between the two companies as an integration challenge. I think this is a huge oversight in the coverage of the acquisition.

From what I know of Microsoft, and what I have heard of Yahoo's technology, you simply cannot downplay the challenge of putting these two companies together. They are polar opposites where engineering is concerned, and Microsoft is living in a dream world if they think they are going to get any synergy from combining the two engineering teams.

Good software developers tend to be pretty picky about the technologies they work with, and are probably with the company they are with, in large part, because of the technologies employed.

In the case of Microsoft, there is no speculation about what technologies will be employed. They will be Microsoft technologies, period. This is illustrated by Microsoft's acquisition of HotMail. HotMail was deployed on an open source infrastructure, and I believe they were using BSD as the operating system. When Microsoft acquired them, the first thing Microsoft wanted to do was move HotMail to a Microsoft platform. Of course, this failed at first, but I believe they eventually succeeded in moving HotMail to a Windows platform. Given the difficulties of moving just that one application, you have to consider moving the entire Yahoo portfolio over to a new platform an insurmountable task.

From everything I've heard about Yahoo's technology platform, it is largely based on open source, just like HotMail was. If I were a Yahoo software developer and I was asked to move my work to a Microsoft platform, I would simply quit. Now, Microsoft will have retention packages that attempt to keep people at the company, but I really don't see those keeping the most talented folks around. Microsoft could decide to let the Yahoo platform be the one that stays, but that is so totally against the Microsoft culture that I don't see it happening. It would also pose a lot of problems for all their existing technology and their other acquisitions. Would they truly be willing to throw all that other technology away, or have those engineers move their technology to open source and onto the Yahoo infrastructure? Again, I don't see that happening, and they would also risk losing those existing engineers for the same reason that Yahoo engineers would leave.

If this merger isn't akin to mixing oil and water, I don't know what is!

Friday, February 01, 2008

The State of JBoss and the Upcoming JBoss World

Lately, it seems like the world views JBoss as a failed acquisition by Red Hat. Failed by a couple of measures, in fact: the first being sales of our subscriptions, and the second being that we are no longer innovating. I wanted to take the time to address both of those items.

Where sales are concerned, I believe this was the biggest problem with the integration of JBoss into Red Hat. For a time, sales really did lag, and things weren't looking very good. We had lots of experienced middleware sales people leave as a result of the integration into the larger Red Hat sales organization. I also believe that Red Hat didn't truly understand that the sales process and cycle for middleware are different than they are for RHEL. Having said that, I now see that Red Hat truly does understand the differences (some people just have to learn the hard way), and while the loss of good, experienced middleware sales folks probably still hurts some, we are seeing a major turnaround in this area. While I cannot publicly talk about the actual sales figures (we are a public company, after all), I can say that sales of JBoss subscriptions are growing, and have been growing for quite a while now. Demand for training and consulting is also strong. So, the picture is looking bright, and brighter all the time. I believe the company has learned a hard lesson, and the resulting actions from that lesson are now paying off, and will continue to pay off in the future.

Now, let's turn our focus to innovation. Has innovation really stopped at JBoss, or at least slowed to a crawl? I really take issue with that. Since the acquisition, we have changed our product model somewhat, which certainly slowed some other things down, but when you look at what we have accomplished, it's actually quite amazing.

First, we released our first two products under our new product model: our Enterprise Application Platform (EAP for short) 4.2, in July of last year, and the very recent EAP 4.3. Our EAP 4.3 release contains the very latest in Java Web Services, with fully supported JAX-WS and JSR-181 annotation support, as well as an entirely new JMS implementation based on our JBoss Messaging technology.

With JBoss Messaging, you now have a JMS provider that can be horizontally scaled in a cluster, with fully transparent load balancing and fail-over. Performance is also substantially improved over the old (and now retired) JBossMQ. Those of you who have experience with JBossMQ in a cluster, with its band-aid approach to clustering and fail-over, will definitely appreciate JBoss Messaging. This is a world-class messaging system, and certainly a shining example of innovation from JBoss.

Second, we have also released a new version of our Portal platform, JBoss Portal 2.6, and it includes much better usability, manageability, and capabilities to support newer technologies, like Google Gadgets.

Third, we have released our JBoss Communications Platform, based on our Mobicents open source project, providing the only enterprise class JSLEE implementation in the world. With ongoing enhancements, the next platform release will also support the very latest SIP servlets specification as well, so you can start out with SIP servlets, and move up to full JSLEE as you need. This is a revolutionary platform for telecommunications, and another shining example of the innovation coming from JBoss.

Fourth, we have released our JBoss ESB into the community, and as we speak we are working hard on delivering our SOA Platform product based on JBoss ESB. This product will have a very large impact on the ESB/SOA marketplace, as the first truly enterprise class open source product in the market (and no, I don't count MuleSource and ServiceMix, because they don't have the kind of support organization that we have). This will be a game changer!

Last but not least, we also delivered JBoss Developer Studio. I can't say enough about this accomplishment. The developers deserve all the credit for getting this to market. It fills a huge hole in our product portfolio, and makes it that much easier for IT managers to move to JBoss technology.

And that's not all. There are many other things in the works at the JBoss division of Red Hat, and to see them, you should consider coming to JBoss World. We are holding JBoss World in Orlando, starting on February 13th, and I believe there is one more week to register. I would encourage everyone who can to come and check out all the exciting things happening with JBoss. I think you will be convinced that we are still innovating, and that there are lots of reasons to consider JBoss technology for your projects.