Improving quality with 100 Hudson test servers

We recently installed 100 custom-built Hudson test servers at our colo facility. They are maxed out on RAM, have the fastest quad-core processors available, and use the second-generation Intel SSDs that Linus Torvalds recommended to us. We calculated the cost of doing this in the cloud, but it was a lot more economical to run the servers at our own colo facility because they require so much horsepower and run continuously on every SVN commit.

So why did we need so many test servers? Our EE builds are each certified with roughly 10,000 tests per version. Each test must be run on all of our supported combinations of application servers, databases, and operating systems. The time required to run a test varies from seconds to hours depending on the test itself and the environment it runs in (e.g. deploying a portlet to WebLogic or WebSphere takes a lot longer than deploying to Tomcat).
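To make the scale concrete, here is a minimal sketch of how such a support matrix multiplies the test count. The platform lists are illustrative assumptions (the post names Tomcat, WebLogic, WebSphere, and Oracle, but not the full supported set), and `build_matrix` and `total_test_runs` are hypothetical helper names, not part of our build tooling.

```python
from itertools import product

# Assumed, incomplete platform lists for illustration only.
APP_SERVERS = ["Tomcat", "WebLogic", "WebSphere"]
DATABASES = ["MySQL", "Oracle"]            # MySQL is an assumption
OPERATING_SYSTEMS = ["Linux", "Windows"]   # both are assumptions
TESTS_PER_VERSION = 10_000                 # "roughly 10,000 tests per version"


def build_matrix(app_servers, databases, operating_systems):
    """Return every (app server, database, OS) combination to certify."""
    return list(product(app_servers, databases, operating_systems))


def total_test_runs(matrix, tests_per_version=TESTS_PER_VERSION):
    """Each test runs once per combination in the matrix."""
    return len(matrix) * tests_per_version


matrix = build_matrix(APP_SERVERS, DATABASES, OPERATING_SYSTEMS)
print(f"{len(matrix)} combinations, {total_test_runs(matrix)} test runs per version")
```

Even this small hypothetical matrix turns 10,000 tests into six figures of test runs per version, which is why the work has to be spread across many machines.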

These servers are another milestone in helping us reach the quality that our enterprise clients depend on.

Here's the front view of our cage. It's quite massive. It stacks up to way over 7 feet.

We aren't just fanatics about our code quality; we're also fanatics about how we tie the cables for our test servers.

That's Jeff testing the network.

And that's Louis goofing off by cutting the cable that Jeff was testing. All in good fun.

Just playing with ya Louis. Our IT and QA staff worked countless hours planning and building this to ensure that we can ship out the best product possible. We hope that you guys get to enjoy our labor of love.

Hi Brian,

Great post, and great photos too! I'm really interested: can you tell me more about the different tests you are running? What kind of technologies are you using to test?

Thanks!
Hi Brian, could you tell me whether you perform any tests for portlets developed in the plugins environment (World of Liferay portlets, etc.)? As far as I could find in the Liferay public repo, there are no tests for such portlets. Is that true?

This is important information for me because we are aiming to develop portlets in the plugins environment as much as possible.
Good to see that you are taking testing seriously. I'm using Robot Framework (http://code.google.com/p/robotframework/) for my Liferay testing. It is a framework that uses Selenium and produces nice reports.

And for quality, here is one idea I would love to see for EE versions.

It would be good if you purchased penetration testing for your EE build on, for example, a Tomcat environment. Then you could provide the report to your EE customers, who could use it as a base for their own penetration testing. You could go even further and get some sort of certification to guarantee Liferay EE security.
Hey Sampsa,

That's exactly the direction we're already heading. We provide testing certificates (covering both manual and automated tests) along with every release. Without QA's approval, the product doesn't ship. It's much more advanced than our early one-man-team days 10 years ago.
Hi Brian, but in the longer run don't you think the cloud can actually help, since energy is a major concern driving organizations these days?
Also, could you write a post on how the various servers performed during your testing? For example, you mentioned Tomcat took less time than WebSphere. It would be quite useful for clients.
No way. In the LONG run, it's more expensive on the cloud. We calculated that it would cost roughly 10-12x more to do it on the cloud than to house it internally, even when you factor in all the other costs such as maintenance, electricity, bandwidth, etc.

Cloud is only useful when you don't know your demand. If you don't know whether you need 10 or 100 or 1000 servers, and you don't know how often they will actually be used, then cloud is cheaper because you have lower up-front costs. But cloud is definitely more expensive per unit (Amazon has to make money somehow). Since we know our usage (close to 90% because they're build servers), it's a LOT cheaper for us to do it in house.

You're also paying for expertise when doing it on the cloud, but that's something we already have in our team.
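The reasoning above can be sketched as a simple break-even calculation. This is only an illustration: the prices, the overhead figure, and the function names below are made-up assumptions, not the numbers behind our 10-12x estimate. The point is the shape of the comparison: cloud cost scales with utilization, while in-house cost is mostly fixed.

```python
HOURS_PER_MONTH = 730  # average hours in a month


def cloud_cost(servers, utilization, hourly_rate, months):
    """Cloud: pay only for the hours the servers are actually used."""
    return servers * utilization * HOURS_PER_MONTH * hourly_rate * months


def in_house_cost(servers, purchase_price, monthly_overhead, months):
    """In house: pay for hardware up front plus a fixed monthly overhead
    (maintenance, electricity, bandwidth), regardless of usage."""
    return servers * (purchase_price + monthly_overhead * months)


def break_even_utilization(purchase_price, monthly_overhead, hourly_rate, months):
    """Utilization above which owning the server becomes cheaper than renting."""
    owned = purchase_price + monthly_overhead * months
    rented_full_time = HOURS_PER_MONTH * hourly_rate * months
    return owned / rented_full_time


# Hypothetical figures: $3,000 per server, $100/month overhead,
# $1.00/hour for a comparable cloud instance, over 36 months.
threshold = break_even_utilization(3000, 100, 1.0, 36)
busy = cloud_cost(100, 0.9, 1.0, 36)    # build servers at ~90% utilization
idle = cloud_cost(100, 0.1, 1.0, 36)    # the same fleet mostly idle
owned = in_house_cost(100, 3000, 100, 36)
print(f"break-even utilization: {threshold:.0%}")
print(f"cloud at 90%: {busy:,.0f}  cloud at 10%: {idle:,.0f}  in house: {owned:,.0f}")
```

With these assumed figures, renting wins when the servers sit mostly idle and owning wins once utilization passes the break-even point, which matches the argument for build servers that run almost all the time.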
This is amazing! Is this a single Hudson cluster? Are these nodes virtualized?
We're running them as 100 separate Hudson servers for now but with smart partitioning. We had to do it that way because the configurations differ so much from server to server (e.g. some of them will have WebSphere, some Oracle, etc.).

Btw, thanks for making Hudson, it's an awesome product - the best continuous integration server out there by far.
Hi Brian

This is my first comment on your blog, and I decided to leave it on your latest post. I am very new to the Liferay ecosystem, and I would like to know whether I can contact you via email, the Liferay chat, GTalk, Live Messenger, etc., because I have an idea I would like to discuss with you. I hope you can reply. I will check back from time to time to see what you think.
[...] JUC This is the first JUC and will be my first time meeting so many Jenkins users! I'm pretty excited to talk about Liferay's monster Jenkins setup. This past week, I got to talk with our CSA Brian... [...]
Can we update this? Don't we have 250 servers now?
[...] On one side we have Hudson, the continuous integration platform that needs no introduction, developed by Kohsuke Kawaguchi, who left Oracle in early April to found his own company, InfraDNA.... [...]