<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.igorminar.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:blogger="http://schemas.google.com/blogger/2008" xmlns:georss="http://www.georss.org/georss" xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" gd:etag="W/&quot;AkUFQH86eip7ImA9WhBbFU4.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950</id><updated>2013-05-14T06:30:11.112-07:00</updated><category term="DotSunEngineering" /><category term="Software Engineering" /><category term="SunWikis" /><category term="Rails" /><category term="Zoom23" /><category term="MacOS" /><category term="MacBook Pro" /><category term="Gadgets" /><category term="Security" /><category term="Fun" /><category term="Apple" /><category term="Java" /><category term="JavaOne2007" /><category term="JavaFX" /><category term="OpenSolaris" /><category term="Life" /><category term="Sun" /><category term="JRuby" /><category term="Slovakia" /><category term="Ruby" /><category term="Apps" /><category term="Other Ramblings" /><category term="Projects" /><category term="Apple Problems" /><category term="Solaris" /><category term="grizzly-sendfile" /><category term="Glassfish" /><category term="Mediacast" /><title>Igor Minar's Blog</title><subtitle type="html">A Sudden Burst of Ideas...</subtitle><link rel="http://schemas.google.com/g/2005#feed" type="application/atom+xml" href="http://blog.igorminar.com/feeds/posts/default" /><link rel="alternate" type="text/html" href="http://blog.igorminar.com/" /><link rel="next" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default?start-index=26&amp;max-results=25&amp;redirect=false&amp;v=2" /><author><name>Igor 
Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><generator version="7.00" uri="http://www.blogger.com">Blogger</generator><openSearch:totalResults>113</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.igorminar.com/IgorMinarsBlog" /><feedburner:info uri="igorminarsblog" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry gd:etag="W/&quot;AkYNRXo5eSp7ImA9Wx5UGUg.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-1802384511302293009</id><published>2010-08-08T23:10:00.000-07:00</published><updated>2010-10-24T15:09:54.421-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T15:09:54.421-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Life" /><title>Change of Status</title><content type="html">&lt;pre&gt;
$ sqlplus -s
SQL&gt; connect hr@oracle.com/hr
SQL&gt; UPDATE employees SET current = false WHERE email = 'Igor.Minar@oracle.com';
SQL&gt; COMMIT;
SQL&gt; disconnect
SQL&gt; exit
&lt;/pre&gt;

&lt;pre&gt;
$ curl -X POST -H "Content-Type: application/json" \
   -d '{ "firstName":"Igor", "lastName":"Minar"}' \
   http://google.com/employee/
&lt;/pre&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/1802384511302293009/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=1802384511302293009" title="8 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/1802384511302293009?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/1802384511302293009?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/7LVK5aSPfM4/change-of-status.html" title="Change of Status" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>8</thr:total><feedburner:origLink>http://blog.igorminar.com/2010/08/change-of-status.html</feedburner:origLink></entry><entry gd:etag="W/&quot;AkUNRH86fSp7ImA9Wx5UGUg.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-2698273426037599243</id><published>2010-08-03T15:48:00.000-07:00</published><updated>2010-10-24T15:11:35.115-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T15:11:35.115-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Life" /><category scheme="http://www.blogger.com/atom/ns#" term="Sun" /><category scheme="http://www.blogger.com/atom/ns#" term="SunWikis" /><title>Thanks for All the Fish</title><content type="html">&lt;p&gt;Hi guys,&lt;/p&gt;

&lt;p&gt;Most of you don't know, but today is the 3rd birthday of wikis.sun.com. In 2007 a bunch of us decided that it was worth it to boldly go where no man has gone before and on August 3, 2007 we launched wikis.sun.com.&lt;/p&gt;

&lt;p&gt;At that time very few corporations were actively using some kind of wiki internally and there was no known significant public wiki deployment run by a corporation. We were astonished to see the uptake and user interest and watched the project grow from a few power users and a few dozen wiki pages to tens of thousands of users and tens of thousands of wiki pages.&lt;/p&gt;

&lt;p&gt;Thank you all for contributing, providing feedback and helping us to make the project successful.&lt;/p&gt;

&lt;p&gt;Despite being a small team (did you know that officially there wasn't a single person working on wikis full-time?), we managed to get a lot done and I'm very proud of our accomplishments.&lt;/p&gt;

&lt;p&gt;I do believe that there is still room for improvement, but I made a decision that these improvements will have to be implemented by someone else. I have found a new challenge that I'm going to pursue and unfortunately it's time for me to hand the wikis project over to a new group that will oversee the operations and development of the site. I did my best to make this transition as smooth as possible and I'm hopeful that the site will be in good hands.&lt;/p&gt;

&lt;p&gt;Just by chance my last day at Oracle coincides with wikis' 3rd birthday. I'll take that as a good sign. I wish you all the best and I hope to see you around. The Internet is a small place.&lt;/p&gt;

&lt;p&gt;Good luck to you all.&lt;/p&gt;

&lt;p&gt;Cheers,&lt;br/&gt;
Igor&lt;/p&gt;

&lt;p&gt;PS: If you want to stay in touch, you can find me on LinkedIn at: &lt;a href="http://www.linkedin.com/in/igorminar"&gt;http://www.linkedin.com/in/igorminar&lt;/a&gt;&lt;/p&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/2698273426037599243/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=2698273426037599243" title="9 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2698273426037599243?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2698273426037599243?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/8kSsp87Cels/thanks-for-all-fish.html" title="Thanks for All the Fish" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>9</thr:total><feedburner:origLink>http://blog.igorminar.com/2010/08/thanks-for-all-fish.html</feedburner:origLink></entry><entry gd:etag="W/&quot;AkEBRXo_eyp7ImA9Wx5UGUg.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-8850925192559891135</id><published>2010-08-03T01:53:00.000-07:00</published><updated>2010-10-24T15:17:34.443-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T15:17:34.443-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Sun" /><category scheme="http://www.blogger.com/atom/ns#" term="SunWikis" /><title>DGC VI: Wiki 
Organization and Working with the Community</title><content type="html">&lt;p&gt;This blog post is part of the &lt;a href="http://blog.igorminar.com/2010/07/devops-guide-to-confluence-dgc.html"&gt;DevOps Guide to Confluence&lt;/a&gt; series. In this chapter of the guide, we’ll have a look at wiki organization and working with the user community. This post is going to be more subjective than the others, because the recommendations I'm going to make apply to a wiki site with goals and a purpose similar to ours. I'm just going to share our experience and hopefully some of it will be useful for others.&lt;/p&gt;

&lt;h2&gt;The Purpose&lt;/h2&gt;
&lt;p&gt;The first thing that should be clear to you when building a wiki site is the purpose it's going to serve. Confluence has been successfully used for many purposes, ranging from team collaboration and documentation writing to a website CMS, just to mention a few. When our team set out to build a wiki site, the goal was to create a wiki platform that could be used by anyone in our company to publicly collaborate with external parties without having to deploy and maintain their own wiki.&lt;/p&gt;

&lt;p&gt;It was a pleasant surprise when one of the first groups of users who joined our pilot three years ago were technical writers eager to drop their heavy-weight tools with lots of fancy features in exchange for a lightweight and, more importantly, inclusive collaboration tool. The main issue they were facing was that their processes and tools were very exclusive, and it was next to impossible for a non-writer to quickly join in order to make small edits. This resulted in lots of proxying of engineering feedback, and inevitable delays. With a wiki, the barrier to entry is very low for almost everyone. There is nothing to install or configure; a browser is all one needs. A wiki allowed a relatively small and overloaded team of technical writers to more efficiently gather and, more importantly, incorporate feedback from subject matter experts into the documentation. Of course there were trade-offs, mainly in the area of post-processing the content for printable documentation (i.e. generating PDFs), but I'm hopeful that as the wiki system matures, more attention will be paid to making this area stronger (Atlassian: hint hint).&lt;/p&gt;

&lt;p&gt;Anyway, with the tech writers on board, the purpose, goals and evolution of our site were heavily influenced by their feedback. In exchange we received a lot of high quality content that attracted new users who started using the wiki. This kind of bootstrapping of the site greatly helped to speed up the viral adoption across our thirty-thousand-employee company.&lt;/p&gt;

&lt;h2&gt;Wiki Organization&lt;/h2&gt;
&lt;p&gt;When we launched our site three years ago, there were no other big corporations with a public facing wiki site (many corporations didn't even have an internal wiki yet; boy, has that all changed since then). This put us into a position where we had to be the first explorers in search of best practices as well as things that didn't work at all.&lt;/p&gt;

&lt;p&gt;Fortunately, since our team successfully pioneered the area of corporate blogging before the wikis launch, we had some experience with building communities that we could leverage.&lt;/p&gt;

&lt;p&gt;Some of the main principles that we reused from our blogs site were:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Make the rules and policies as simple as possible&lt;/li&gt;
&lt;li&gt;It is a goal shared by all employees to create a good image of the company and make the company successful. We should trust their judgement and empower them to be able to do the right thing.&lt;/li&gt;
&lt;li&gt;The team running the site is small, so the employees should be able to do as much as possible on their own (self-provisioning FTW!)&lt;/li&gt;
&lt;li&gt;Since we trust our employees, we should delegate as much decision making and as many responsibilities as possible, and let them delegate some to others, otherwise we won't be able to scale.&lt;/li&gt;
&lt;li&gt;There should be very little (close to none) policing or content organization done by the core team. We don't have the man-power for that. Besides, the Internet is not being policed by anyone and things tend to just work out. The popular, well organized and valuable content bubbles up, in one way or another.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Implemented Actions&lt;/h2&gt;
&lt;p&gt;With our principles laid out, we took these actions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We integrated Confluence with our single sign-on and user provisioning system, which made it super easy for employees and external users to log in using their existing accounts.&lt;/li&gt;
&lt;li&gt;Based on the information in our identity systems, we enrolled accounts of all of our employees into an employee-specific Confluence group, which we utilized when setting up global permissions.&lt;/li&gt;
&lt;li&gt;The global permissions were set up so that employees (and only employees) could create new wiki spaces on their own, whenever they had the need, for whatever purpose.&lt;/li&gt;
&lt;li&gt;We also opened up all wiki spaces to be viewable by anyone on the Internet, but we left it up to the space admins to restrict permissions if they felt like it was necessary.&lt;/li&gt;
&lt;li&gt;In order to mitigate spam issues, we made it impossible for anonymous users to obtain any write permissions either on the global or space level.&lt;/li&gt;
&lt;li&gt;We created a single Confluence theme that was applied to all wiki spaces by default and disabled all the other themes. This was done mainly for technical reasons: the Confluence UI has been changing dramatically over the last few years, and these changes often required modifying a custom theme to keep it compatible. If we allowed anyone to create their own theme, we'd never be able to upgrade for fear of breaking someone's theme, or we would have to coordinate our updates with all the maintainers of custom themes.&lt;/li&gt;
&lt;li&gt;We created an internal mailing list where space admins and other wiki users could share their experience, ask questions, and report issues.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Things We Need to Work on&lt;/h2&gt;
&lt;p&gt;Nobody's perfect and neither are we, so let's look at what we could improve.&lt;/p&gt;

&lt;p&gt;I know I just said that popular content always bubbles up, but considering how hideous the default Confluence front page is, I'd much prefer to utilize that real estate better and highlight popular or interesting content there.&lt;/p&gt;

&lt;p&gt;I also think that we could do a better job at highlighting hardworking community members. There were &lt;a href="http://blogs.sun.com/peterreiser/entry/community_equity_specification"&gt;some elaborate attempts&lt;/a&gt; to do this, but in my opinion a more lightweight approach could be more suitable for most of the sites.&lt;/p&gt;

&lt;p&gt;Lastly, I think that staying in touch with our community is very important, and we could have done a better job at it if we had e.g. quarterly internal mini-conferences on various topics during which we could better gather their feedback. Also some better organized training sessions for our novice users could help boost our growth even further.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The recommendations and practices that worked for us might not be suitable for all Confluence deployments, but in our case things have worked out. There are still many areas where we could have done a better job, but I guess it's good to always have some room for improvement.&lt;/p&gt;

&lt;p&gt;In the next chapter of my guide, we'll discuss issues and solutions that are specific for Internet-facing Confluence deployments.&lt;/p&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/8850925192559891135/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=8850925192559891135" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/8850925192559891135?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/8850925192559891135?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/qvdjhE_hw-U/dgc-vi-wiki-organization-and-working.html" title="DGC VI: Wiki Organization and Working with the Community" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.igorminar.com/2010/08/dgc-vi-wiki-organization-and-working.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A0YGR3w6fyp7ImA9Wx5UGUg.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-291132960710016721</id><published>2010-08-02T07:29:00.000-07:00</published><updated>2010-10-24T15:25:26.217-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T15:25:26.217-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Sun" /><category scheme="http://www.blogger.com/atom/ns#" 
term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="SunWikis" /><title>DGC V: Customizing and Patching Confluence</title><content type="html">&lt;p&gt;This blog post is part of the &lt;a href="http://blog.igorminar.com/2010/07/devops-guide-to-confluence-dgc.html"&gt;DevOps Guide to Confluence&lt;/a&gt; series. In this chapter of the guide, we’ll have a look at how to customize and patch Confluence.&lt;/p&gt;

&lt;h2&gt;Customizing Confluence&lt;/h2&gt;
&lt;p&gt;Before we talk about any customization at all, I need to warn you. Any kind of customization of Confluence (or any other software) comes with a maintenance and support cost. The problems usually arise during or after a Confluence upgrade, and if they catch you unprepared, you might get yourself in a lot of trouble. Keep this in mind and before you customize anything, justify your intent.&lt;/p&gt;

&lt;p&gt;There are several ways to customize Confluence. For some the maintenance and support cost is low; others give you lots of flexibility at a higher cost. So depending on your needs and requirements you can pick one of the following.&lt;/p&gt;

&lt;h3&gt;Confluence User Macros&lt;/h3&gt;
&lt;p&gt;I already mentioned these in the &lt;a href="http://blog.igorminar.com/2010/07/dgc-iii-confluence-configuration-and.html"&gt;Confluence Configuration&lt;/a&gt; chapter — they are easy to create and usually don't break during upgrades, but they are a nightmare to maintain. Avoid them.&lt;/p&gt;

&lt;h3&gt;Confluence Themes, HTML Headers and Footers&lt;/h3&gt;
&lt;p&gt;You can easily inject HTML code in the header and footer by editing the appropriate sections of the Admin UI (described in the config chapter). If this HTML code contains visual elements, then it's possible that your code will break during upgrades. In general I would avoid editing headers and footers in this way as much as I could unless I was doing something very simple.&lt;/p&gt;

&lt;p&gt;Confluence themes are the way to go. You can either pick a theme that was already built and published by someone else, or you can build your own. Building your own theme will give you the most flexibility, but the cost of maintaining and supporting it will be the highest. You can do some things to cut corners, but be prepared to do some Confluence plugin development (a Confluence theme is really just a type of Confluence plugin).&lt;/p&gt;

&lt;p&gt;What worked well for me and our heavily customized theme was to create the theme as a patch for the Confluence default theme. I simply symlink all the relevant files from the Confluence source code into a directory structure that can be built as a Confluence theme/plugin, add my &lt;tt&gt;atlassian-plugin.xml&lt;/tt&gt;, and patch the files with the changes I need, no matter how complex they are. The advantage of this approach is that my theme will always be compatible with my Confluence version (after rebase) and I get all the new features introduced in the new version. The downside is that I often need to rebase my patches during Confluence upgrades, but with a good patch management solution (see below) this headache can be greatly minimized.&lt;/p&gt;

&lt;p&gt;Lastly there is &lt;a href="https://www.adaptavist.com/display/ADAPTAVIST/Builder"&gt;Theme Builder&lt;/a&gt; from Adaptavist. I haven't personally used this Confluence plugin because it was not popular when we initially created our theme and it was not desirable for us to depend on yet another (unknown at that time) vendor during our Confluence upgrades. If I were starting a theme from scratch, I would compare it with my patching method and see which gives me the most benefits. My main concern with Theme Builder is my ability to version control the theme; if that is not easily possible, it might be a deal breaker for me and many others.&lt;/p&gt;

&lt;h3&gt;Confluence Plugins&lt;/h3&gt;
&lt;p&gt;I mentioned Confluence Plugins already in the &lt;a href="http://blog.igorminar.com/2010/07/dgc-iii-confluence-configuration-and.html"&gt;Confluence Configuration&lt;/a&gt; chapter, so I'm not going to repeat myself here.&lt;/p&gt;

&lt;p&gt;What I'm going to add is that you really can extend and customize Confluence in crazy ways via the plugins. You can either discover the existing plugins at &lt;a href="https://plugins.atlassian.com/"&gt;Atlassian Plugin Exchange&lt;/a&gt; or you can build your own with Maven (or the &lt;a href="http://confluence.atlassian.com/display/DEVNET/How+to+Build+an+Atlassian+Plugin"&gt;Plugin SDK&lt;/a&gt;), Java (or another Java compatible language) and &lt;a href="http://confluence.atlassian.com/display/PLUGINFRAMEWORK/Plugin+Framework+Developer+Documentation"&gt;Atlassian Plugin Framework&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The nice thing about plugins is that they are encapsulated pieces of code that interact with the rest of Confluence via a public API and, additionally, they are hot-pluggable. This means that in theory they should work after a Confluence upgrade and that you can install and uninstall them on the fly without a restart. While the latter is true in practice, the former is not always the case. Confluence's public APIs sometimes change, plugins rely on behavior that was not considered to be part of the public API, and the UI changes all the time, so any CSS/JavaScript code that relies on absolute or relative positioning or a fixed DOM structure will need occasional fixes during upgrades.&lt;/p&gt;

&lt;h3&gt;Patching Confluence (source) files&lt;/h3&gt;
&lt;p&gt;Lastly I'm going to mention that one can modify Confluence's behavior by modifying the Confluence core files. This is a large topic and deserves its own section. ;-)&lt;/p&gt;

&lt;h2&gt;Patching Confluence&lt;/h2&gt;
&lt;p&gt;Patching Confluence is definitely the most advanced way to customize Confluence, especially if you start changing the Java source code, recompiling and creating your own war files. On the other hand, this way you get the most flexibility and will be able to change anything you want, even those things that plugins can't, all at your own risk.&lt;/p&gt;

&lt;h3&gt;Issues to Be Aware of&lt;/h3&gt;
&lt;p&gt;There are several potential issues that you should be aware of before you head down this route:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;you might break something unintentionally; you can mitigate this with testing&lt;/li&gt;
&lt;li&gt;you might have a hard time preserving your changes during an upgrade; you can minimize the problems by using a good patch management strategy (discussed below)&lt;/li&gt;
&lt;li&gt;you might have a problem getting support; this was never a problem for me, mainly because most of my changes have been very isolated, so I could quickly tell whether an issue was caused by my patch or by a bug in Confluence.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Reasons for Patching&lt;/h3&gt;
&lt;p&gt;The reasons for which you might want to patch Confluence fall generally into these four categories:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;config change - as I mentioned in the &lt;a href="http://blog.igorminar.com/2010/07/dgc-iii-confluence-configuration-and.html"&gt;Confluence Configuration&lt;/a&gt; chapter, some of the configuration is done by modifying files that are part of the Confluence standalone or war distribution (usually those in the &lt;tt&gt;WEB-INF/classes&lt;/tt&gt; directory). This is quite unfortunate because it adds a significant overhead to upgrades. I would much prefer if this configuration could be done via files in the Confluence Home directory, but until that happens, the best way to manage these changes is by treating them as patches.&lt;/li&gt;
&lt;li&gt;security fix - Atlassian often releases patches that fix security vulnerabilities in older versions of Confluence. What they actually release is a binary or textual file that represents the fixed version of the affected code. This file can then be just dropped into the appropriate location in (typically) the &lt;tt&gt;WEB-INF/classes&lt;/tt&gt; directory and the issue is fixed. This is a nice quick hack, but if your site is bigger or you plan to be on an older version for an extended period, your situation will be a lot more maintainable if you transform the fix into a patch against your version of Confluence.&lt;/li&gt;
&lt;li&gt;temporary bug fix - occasionally Atlassian releases a temporary fix, in a form similar to a security fix, for an issue that will be properly fixed in a later release. In the meantime, the temporary fix can be used to work around the problem. Again, for a bigger site things will be a lot more maintainable if you manage this change as a patch.&lt;/li&gt;
&lt;li&gt;a UI/behavior change - and lastly, if you run a bigger site with lots of requirements coming from different groups of users, you might need to add a feature or disable an existing feature, add or remove a UI element, or change some behavior of Confluence in a way that is not possible via a plugin or a theme. In this case you definitely want to maintain every single such change as an isolated patch. If you don't, you'll be in big trouble when the time to upgrade Confluence comes.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Patching Methods&lt;/h3&gt;
&lt;p&gt;Now that we know why we would be interested in patching Confluence, let's look at how to do it. Again, there are several ways, depending on what you need to patch.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;patch the non-compilable files - these typically include config files as well as JavaScript, CSS and template files. If you are patching only these types of files, then you can just create patches against the Confluence standalone or war distributions (since no compilation is needed for these). More likely than not, once you get down the patching path, you'll want or need to do a lot more though.&lt;/li&gt;
&lt;li&gt;patch files that require re-compilation - the Java source code. Soon after I realized that I needed to patch Confluence, I ended up modifying the Java source code in order to fix a bug or modify behavior. Atlassian makes this relatively easy to do, because along with the binary releases they also offer source code releases which can be used to build Confluence from sources on your own. This is a huge benefit for their customers, especially those who are willing to get their hands dirty to get the most out of Confluence. Once you have access to buildable source code, you can patch it and create your own builds with relatively small effort. The benefit of creating patches against the source release is that you can patch anything and everything (though I'm not saying that you should), starting from config files, JS, CSS and template files all the way to core Java class files; and all of that in a consistent and reliable way.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Patch Management&lt;/h3&gt;
&lt;p&gt;As I mentioned already, whenever you modify the source code you want to create an isolated patch that is a logical grouping of changes needed for one bug fix, config change or feature. Once you have many smaller patches like these you can apply or omit them in a build or update them one at a time when needed.&lt;/p&gt;

&lt;p&gt;If you were to use the standard command line tools like &lt;tt&gt;diff&lt;/tt&gt; and &lt;tt&gt;patch&lt;/tt&gt; to work with these patches, you would probably go nuts quickly. There are far better, higher level solutions that can be used. Distributed source code management tools that are becoming an unstoppable force in the SCM arena of software development offer features that make patch management a piece of cake.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://git-scm.com/"&gt;Git&lt;/a&gt;, for example, offer's a feature called &lt;a href="http://www.kernel.org/pub/software/scm/git/docs/git-stash.html"&gt;Stash&lt;/a&gt; which allows you to create and maintain patches against your git repository. I don't have a personal experience with git-stash, but from the docs it looks like it should do what we want.&lt;/p&gt;

&lt;p&gt;The solution that I've been using and loving for the past 3 years is &lt;a href="http://mercurial.selenic.com/"&gt;Mercurial&lt;/a&gt; and its core plugin &lt;a href="http://mercurial.selenic.com/wiki/MqExtension"&gt;Mercurial Queues&lt;/a&gt;. Working with this plugin is also well documented &lt;a href="http://hgbook.red-bean.com/read/managing-change-with-mercurial-queues.html"&gt;here&lt;/a&gt; and &lt;a href="http://hgbook.red-bean.com/read/advanced-uses-of-mercurial-queues.html"&gt;here&lt;/a&gt;.&lt;/p&gt;
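
&lt;p&gt;One setup detail worth adding here: as far as I know, Mercurial Queues ships with Mercurial but is not enabled by default, so the &lt;tt&gt;hg q*&lt;/tt&gt; commands are not available until the extension is switched on. A minimal sketch, assuming a stock Mercurial install and a per-user &lt;tt&gt;~/.hgrc&lt;/tt&gt; (the file location is my assumption; your setup may keep this in a repository-level &lt;tt&gt;.hg/hgrc&lt;/tt&gt; instead):&lt;/p&gt;
&lt;pre&gt;[extensions]
# assumption: the MQ extension bundled with Mercurial; an empty value loads the bundled copy
mq =&lt;/pre&gt;
&lt;p&gt;Once this is in place, &lt;tt&gt;hg help mq&lt;/tt&gt; should list the queue commands.&lt;/p&gt;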

&lt;p&gt;Here are some main points for how I do my patching:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I store Confluence sources in my main Mercurial repository. I simply grab the source zip from Atlassian's website, unzip it, rename the root directory to "confluence" and put it to my repository.&lt;/li&gt;
&lt;li&gt;When a new version of Confluence is released, I delete the &lt;tt&gt;confluence&lt;/tt&gt; directory in the working copy of my repo and replace it with the files from the new zip file and commit the files with &lt;tt&gt;--addremove&lt;/tt&gt; flag, which will automatically add all the new files and remove all the deleted files to/from the repository. This allows me to track diffs between Confluence versions, which is very handy when I'm debugging a new issue and want to find out in which release it was introduced.&lt;/li&gt;&lt;li&gt;In addition to this main repository I have a versioned (stored as a real hg repo) Mercurial Queue associated with it. The patch repo is very easy to create, just by issuing &lt;tt&gt;hg qinit -c&lt;/tt&gt; command.&lt;/li&gt;
&lt;li&gt;Now every time I want to change something, I create a new patch with &lt;tt&gt;hg qnew mypatchname.patch&lt;/tt&gt;, modify the Confluence source and then just do &lt;tt&gt;hg qrefresh&lt;/tt&gt; to move my changes to &lt;tt&gt;mypatchname.patch&lt;/tt&gt;. This doesn't commit the changes in the patch repo; you have to do that explicitly via &lt;tt&gt;hg qcommit&lt;/tt&gt; or by changing your current directory to &lt;tt&gt;.hg/patches&lt;/tt&gt; and issuing a regular &lt;tt&gt;hg commit&lt;/tt&gt; there.&lt;/li&gt;&lt;li&gt;Once you have patches in your queue, you can easily apply and unapply them with commands like &lt;tt&gt;hg qpush&lt;/tt&gt;, &lt;tt&gt;hg qpop&lt;/tt&gt; and &lt;tt&gt;hg qgoto&lt;/tt&gt;.&lt;/li&gt;
&lt;li&gt;Additionally you can set "guards" on patches, so you can create collections of patches that should be applied only for certain builds. For example, if you have some patches that should be applied only in development environment, you can set guards on them via &lt;tt&gt;hg qguard&lt;/tt&gt; and then switch between these collections via &lt;tt&gt;hg qselect&lt;/tt&gt; followed by &lt;tt&gt;hg qpop -a&lt;/tt&gt; and &lt;tt&gt;hg qpush -a&lt;/tt&gt;.&lt;/li&gt;
&lt;li&gt;If you need to modify an existing patch, just &lt;tt&gt;hg qgoto&lt;/tt&gt; to it, modify the Confluence source code, run &lt;tt&gt;hg qrefresh&lt;/tt&gt; and finally &lt;tt&gt;hg qcommit&lt;/tt&gt;.&lt;/li&gt;
&lt;li&gt;In order to store binary files in your patches (e.g. images), you'll need to tell MQ to use &lt;a href="http://mercurial.selenic.com/wiki/GitExtendedDiffFormat"&gt;Git's extended diff format&lt;/a&gt;. This is done by adding the following lines into your &lt;tt&gt;~/.hgrc&lt;/tt&gt;:&lt;pre&gt;[defaults]
diff = --git
qrefresh = --git
email = --git&lt;/pre&gt;&lt;/li&gt;&lt;li&gt;When upgrading Confluence, you first need to pop all the patches with &lt;tt&gt;hg qpop -a&lt;/tt&gt;, replace and commit the new Confluence sources as discussed above and then reapply your patches with &lt;tt&gt;hg qpush -a&lt;/tt&gt;. If you are lucky, all changes will apply smoothly. Sooner or later, however, especially once your patch collection grows to a decent size, you'll need to rebase your patches. The conflicts are resolved via a &lt;a href="http://en.wikipedia.org/wiki/Merge_(revision_control)#Three-way_merge"&gt;3-way merge&lt;/a&gt;, which can be done either by hand or with a fancy &lt;a href="http://stackoverflow.com/questions/572237/whats-the-best-three-way-merge-tool"&gt;3-way merge tool&lt;/a&gt;. Once all the patches are applied, be sure to test that everything still works as originally intended.&lt;/li&gt;
&lt;li&gt;To make conflict resolution less frequent, I strive to create patches that do as little as possible to get things done. I avoid any major refactorings and API changes, and I "forget" about some best practices, especially in those cases when I know that Atlassian won't be interested in accepting my patch upstream. For these patches the main focus should be on getting things done, robustness and maintainability.&lt;/li&gt;
&lt;li&gt;Lastly, if there is a patch that is generally useful for all Confluence users, I usually attach it to the relevant RFE/bug report on Confluence's bug tracker. The fewer patches I have to maintain the better. When a patch is accepted upstream, I simply remove it with &lt;tt&gt;hg qrm&lt;/tt&gt;.&lt;/li&gt;
&lt;/ul&gt;
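
&lt;p&gt;To make the workflow above a bit more concrete, here is a rough sketch of a typical session. The patch name, guard name and commit messages are made up for illustration, and the commands assume that the MQ extension is enabled in your &lt;tt&gt;~/.hgrc&lt;/tt&gt;:&lt;/p&gt;
&lt;pre&gt;# one-time setup: create a versioned patch queue for the repository
hg qinit -c

# start a new patch, edit the sources, then fold the edits into the patch
hg qnew favicon.patch            # hypothetical patch name
# ... modify files under confluence/ ...
hg qrefresh
hg qcommit -m "Add favicon patch"

# guard a development-only patch, then rebuild the stack with that guard active
hg qguard favicon.patch +dev     # "dev" is a hypothetical guard name
hg qselect dev
hg qpop -a
hg qpush -a

# upgrade: unapply everything, import the new sources, reapply the patches
hg qpop -a
# ... replace confluence/ with the contents of the new source zip ...
hg commit --addremove -m "Import new Confluence sources"
hg qpush -a&lt;/pre&gt;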

&lt;p&gt;I could go on and on about what a life-saver Mercurial Queues are, but the best way to get to know them is to do some experimentation on your own. I strongly encourage you to do that; it's a good tool to have in your toolbox.&lt;/p&gt;

&lt;p&gt;Just to give you some inspiration, here is a list of some of the patches that I created for our build:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;bundle jdbc driver with the war (by specifying new maven dependency in confluence's pom file)&lt;/li&gt;
&lt;li&gt;configure Seraph login/security framework&lt;/li&gt;
&lt;li&gt;replace Confluence's favicon with ours&lt;/li&gt;
&lt;li&gt;modify the default log4j config&lt;/li&gt;&lt;li&gt;customize error pages&lt;/li&gt;
&lt;li&gt;turn Australian English language pack into US English&lt;/li&gt;
&lt;li&gt;remove all &lt;tt&gt;lower()&lt;/tt&gt; function calls from Hibernate mapping files and Java classes to get major boost in db performance (see &lt;a href="http://jira.atlassian.com/browse/CONF-10030?focusedCommentId=203023&amp;amp;page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_203023"&gt;CONF-10030&lt;/a&gt; and &lt;a href="http://confluence.atlassian.com/display/DOC/Creating+a+Lowercase+Page+Title+Index"&gt;this doc&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;disable mail archiving UI&lt;/li&gt;
&lt;li&gt;enforce that our custom theme is the default and only available theme&lt;/li&gt;
&lt;li&gt;remote api security enhancements&lt;/li&gt;&lt;li&gt;allow only members of our employee group to become space admins&lt;/li&gt;
&lt;li&gt;as I already mentioned, our whole theme plugin is implemented as a bunch of patches against the default Confluence theme&lt;/li&gt;&lt;li&gt;and the list goes on and on...&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In this chapter we went through several possible ways to customize Confluence. Plugins and themes are definitely the safest and most manageable way to go; however, patching, if done right, will give you the most flexibility. If you use the right tools for patch management (like Mercurial Queues), you'll be able to manage big collections of patches with very little maintenance overhead.&lt;/p&gt;

&lt;p&gt;Next time we'll have a look at a &lt;a href="http://blog.igorminar.com/2010/08/dgc-vi-wiki-organization-and-working.html"&gt;non-technical aspect of running a large Confluence wiki site&lt;/a&gt;.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/4ymkUAg9Mec" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/291132960710016721/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=291132960710016721" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/291132960710016721?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/291132960710016721?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/4ymkUAg9Mec/dgc-v-customizing-and-patching.html" title="DGC V: Customizing and Patching Confluence" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.igorminar.com/2010/08/dgc-v-customizing-and-patching.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A0QARnwyfyp7ImA9Wx5UGUg.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-5223176422800976910</id><published>2010-07-30T21:40:00.000-07:00</published><updated>2010-10-24T15:29:07.297-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T15:29:07.297-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Sun" /><category 
scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="SunWikis" /><title>DGC IV: Confluence Upgrades</title><content type="html">&lt;p&gt;This blog post is part of the &lt;a href="http://blog.igorminar.com/2010/07/devops-guide-to-confluence-dgc.html"&gt;DevOps Guide to Confluence&lt;/a&gt; series. In this chapter of the guide, we’ll have a look at Confluence upgrades.&lt;/p&gt;

&lt;h2&gt;Confluence Release History and Track Record&lt;/h2&gt;
&lt;p&gt;I started using Confluence at around version 2.4.4 (released March 2007). A lot has changed since then, mostly for the better. In my early days, Atlassian was spitting out one release after another (typically 3 weeks or less apart), followed by a major release every 3 months. You can check out the &lt;a href="http://confluence.atlassian.com/display/DOC/Release+Notes"&gt;full release history&lt;/a&gt; on their wiki.&lt;/p&gt;

&lt;p&gt;This changed later on, and recently there have been fewer minor releases and bigger major releases delivered every 3.5-4 months. Depending on your point of view this is good or bad. It now takes longer to get awaited features and fixes, but on the other hand the releases are more solid and better tested.&lt;/p&gt;

&lt;p&gt;For major releases, Atlassian now usually offers an &lt;a href="http://confluence.atlassian.com/display/DEVNET/Early+Access+Programs"&gt;Early Access Program&lt;/a&gt;, which gives you access to milestone builds so that you can see and mold the new stuff before it ships.&lt;/p&gt;

&lt;p&gt;Contrary to the past, the minor versions have been very stable lately and have contained only bugfixes, so it is generally safe to upgrade without a lot of hesitation.&lt;/p&gt;

&lt;p&gt;The same can't be said about major releases. Even though the stability of x.y.0 releases has been dramatically improving lately, I still consider it risky for a big site to upgrade soon after a major release is announced. Wait for the first bugfix release (x.y.1), monitor the &lt;a href="http://jira.atlassian.com/browse/CONF"&gt;bug tracker&lt;/a&gt;, &lt;a href="http://confluence.atlassian.com/display/CONFKB/Confluence+3.3+Known+Issues"&gt;knowledge base&lt;/a&gt; and &lt;a href="http://forums.atlassian.com/"&gt;forums&lt;/a&gt;, and then consider the upgrade.&lt;/p&gt;

&lt;p&gt;Having gone through many upgrades myself, I think that it is a good practice to stay up to date with your Confluence site. We have usually been at most one major version behind and frequently on the latest version, although, as I mentioned, we avoid the x.y.0 releases. This has been working well for us.&lt;/p&gt;

&lt;h2&gt;Staying in Touch and Getting Support&lt;/h2&gt;
&lt;p&gt;In order to know what's going on with Confluence releases, it is a good idea to subscribe to the &lt;a href="http://forums.atlassian.com/forum.jspa?forumID=99"&gt;Confluence Announcements&lt;/a&gt; mailing list. This is a very low traffic mailing list used for release and security announcements only.&lt;/p&gt;

&lt;p&gt;Atlassian's tech writers usually do a good job at creating informative release notes, upgrade notes and security advisories, so be sure to read those for each release (even if you are skipping some).&lt;/p&gt;

&lt;p&gt;There are several other channels through which people working on Confluence (plugin) development can communicate and support each other. These include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;official &lt;a href="http://forums.atlassian.com/forum.jspa?forumID=99"&gt;Confluence Development forum&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;official &lt;a href="http://confluence.atlassian.com/display/DEVNET/IRC+Chat+Transcripts"&gt;Atlassian Development IRC&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;community &lt;a href="http://www.skype.com/go/joinpublicchat?skypename=me3x%2esk&amp;amp;topic=Confluence%20Dev&amp;amp;blob=LEOYDIRM1D03S5epWlcMXAeupASJTIY7SCjJhOuvGIUpocILzWTBUYcwo9QE3CZIUbIaWXh-qjuTt9bf2p9kxuJHZ0r0BTVViCczotlNh6_lWKig_711tqhglR-J9xBUeZ5YiAKOrSSHlZ59Y0wSEFWCaec3HoxbYssG505bHAxaziZ6z1Hw7nss7-85biWq8MCU9NSaPkY9bO7gXNGSJPuIZOo"&gt;Confluence Development Skype chat&lt;/a&gt; - a place where some of us hang out and discuss issues or share Confluence related news.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Despite Atlassian's claims about their legendary support, I found the official &lt;a href="https://support.atlassian.com/"&gt;support channel&lt;/a&gt; rarely useful. Being a DIY guy and having a reasonable knowledge of Confluence internals, I usually found myself in need of more qualified support than what the support channel was created for. For this reason my occasional support tickets usually ended up being escalated to the development team instead of being handled by the support team.&lt;/p&gt;

&lt;p&gt;On the other hand, the &lt;a href="http://jira.atlassian.com/browse/CONF"&gt;public issue tracker&lt;/a&gt; has been an invaluable source of information and a great communication tool. I wish that more of my bug reports had been addressed, but for the most part I have been receiving a reasonable amount of attention, even though sometimes I had to request escalation to have someone look at and fix issues that were critical for us.&lt;/p&gt;

&lt;p&gt;The biggest hurdle I've been experiencing with bug fixes and support is that sites of our size are not the main focus for Atlassian, and they are quite open about it. I often shake my head when I see features of little value (for us, that is, because they target small deployments and have little to do with core wiki functionality) being implemented and promoted, while major architectural issues, bugs and highly anticipated features go without attention for years. Just browse the issue tracker and you'll get the idea.&lt;/p&gt;

&lt;h2&gt;Confluence Upgrades&lt;/h2&gt;
&lt;p&gt;The core of the upgrade procedure will depend on the build distribution type you use (standalone, war, building from source), but fundamentally in all cases, you need to shut down your Confluence, replace your app (standalone or war) with the new version and then start it again. An automated upgrade process will take care of updating the database schema, rebuilding the search index and other tasks required for a successful upgrade.&lt;/p&gt;

&lt;p&gt;That was the good news. The bad news is that there is a lot more work to be done in order to successfully upgrade a site with as little downtime as possible.&lt;/p&gt;

&lt;h3&gt;Dev and Test Deployments and Testing&lt;/h3&gt;
&lt;p&gt;Before you upgrade the real thing, you should first get familiar with the release by upgrading your dev and test environments.&lt;/p&gt;

&lt;p&gt;It's often handy to invite your users to do a brief UAT (user acceptance testing) on your test instance as they might catch something that you or your automated tests haven't.&lt;/p&gt;

&lt;h3&gt;Picking the Outage Window&lt;/h3&gt;
&lt;p&gt;Based on your users' usage patterns (as easily identified by web analytics solutions like Google Analytics), you should pick a time when the usage is low. For our global site this has been early mornings at around 4:30 or 5am PT.&lt;/p&gt;

&lt;p&gt;When it comes to picking a day, we usually stuck with Tuesdays, Wednesdays or Thursdays. Nobody wants to be dealing with an issue during a weekend when internal (infrastructure) or external (Atlassian) support is harder to get hold of.&lt;/p&gt;

&lt;p&gt;You also want to communicate the planned outage to your users in advance, so that they are not caught by surprise by an outage on a day when they are releasing important documents on the wiki.&lt;/p&gt;

&lt;p&gt;As far as outage duration goes, we usually plan for a 30-minute outage within a 1-hour window, and most of the time we have been able to bring the site back online in 30 minutes or less.&lt;/p&gt;

&lt;h3&gt;Ready, Set, Go!&lt;/h3&gt;
&lt;p&gt;The actual deployment consists of several steps, which in our case are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;disabling load balancing for both nodes (which automatically triggers redirection of all requests to a maintenance page hosted elsewhere)&lt;/li&gt;
&lt;li&gt;shutting down both nodes&lt;/li&gt;
&lt;li&gt;disabling MySQL replication between the master and slave db&lt;/li&gt;
&lt;li&gt;taking ZFS snapshot of the Confluence Home directory&lt;/li&gt;
&lt;li&gt;taking ZFS snapshot of the MySQL db filesystem on the master&lt;/li&gt;
&lt;li&gt;deploying the new war file&lt;/li&gt;&lt;li&gt;starting one node (while the load balancer still ignores it)&lt;/li&gt;
&lt;li&gt;watching container and Confluence logs for any signs of problems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At this point, we have one of our nodes up and running (hopefully :-)). We can log in with an admin account and check if everything works as expected. The next tasks include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;upgrading installed plugins&lt;/li&gt;&lt;li&gt;upgrading custom theme (if there is one)&lt;/li&gt;
&lt;li&gt;running a bunch of automated or manual tests, just to verify that everything is ok&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If things are looking good, we can allow the load balancer to start sending requests to our upgraded node. Continue watching logs and eventually deploy the war on the second node and re-enable the MySQL replication.&lt;/p&gt;

&lt;p&gt;If any issues occur during the deployment, we can simply:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;shut down the upgraded node&lt;/li&gt;
&lt;li&gt;revert to the latest Confluence Home snapshot&lt;/li&gt;
&lt;li&gt;revert to the latest MySQL db snapshot&lt;/li&gt;
&lt;li&gt;redeploy the older version of the war file&lt;/li&gt;
&lt;li&gt;either retry the deployment or re-enable the load balancer and work on resolving the issues outside of the production environment&lt;/li&gt;
&lt;/ul&gt;
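
&lt;p&gt;As a rough sketch, the snapshot and rollback steps above could look like this. The ZFS dataset names and the snapshot name are hypothetical, and the load balancer and war deployment steps are left out because they depend entirely on your setup. Stopping MySQL before rolling back its filesystem is my assumption of what is safe rather than something spelled out above:&lt;/p&gt;
&lt;pre&gt;# before the upgrade, with both Confluence nodes shut down
mysql -e "STOP SLAVE;"                          # run on the slave db host
zfs snapshot tank/confluence-home@pre-upgrade   # hypothetical dataset holding Confluence Home
zfs snapshot tank/mysql@pre-upgrade             # hypothetical dataset on the master db host

# rollback, with the upgraded node (and, as an assumption, MySQL on the master) shut down
zfs rollback tank/confluence-home@pre-upgrade
zfs rollback tank/mysql@pre-upgrade

# once the upgrade (or the rollback) is finished and verified
mysql -e "START SLAVE;"                         # run on the slave db host&lt;/pre&gt;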

&lt;p&gt;In my experience from all the dev, test and prod deployments, we've had to roll back and redo an upgrade from scratch only once or twice. It's very unlikely that you'll have to do it, but it's better to be ready than sorry.&lt;/p&gt;

&lt;p&gt;If you are building Confluence from patched sources and deploy your own builds frequently, then you might want to consider automating your deployments with tools like &lt;a href="http://www.capify.org/"&gt;Capistrano&lt;/a&gt;. This will save you a lot of time and make the deployments more reliable and consistent.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;If you do your homework, Confluence is quite easy to upgrade. It's unfortunate that the entire cluster must be shut down for an upgrade even between minor releases, but if you plan your deployment well, you will be able to minimize the downtime to just a few minutes outside of peak hours.&lt;/p&gt;

&lt;p&gt;In the next chapter of this guide, we'll take a look at &lt;a href="http://blog.igorminar.com/2010/08/dgc-v-customizing-and-patching.html"&gt;customizing and patching Confluence&lt;/a&gt;.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/emWlZKfWEfI" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/5223176422800976910/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=5223176422800976910" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/5223176422800976910?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/5223176422800976910?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/emWlZKfWEfI/dgc-iv-confluence-upgrades.html" title="DGC IV: Confluence Upgrades" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total><feedburner:origLink>http://blog.igorminar.com/2010/07/dgc-iv-confluence-upgrades.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A0ACQH8-eyp7ImA9Wx5UGUg.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-2866384952197166924</id><published>2010-07-30T08:12:00.000-07:00</published><updated>2010-10-24T15:36:01.153-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T15:36:01.153-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Sun" /><category 
scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="SunWikis" /><title>DGC III: Confluence Configuration and Tuning</title><content type="html">&lt;p&gt;This blog post is part of the &lt;a href="http://blog.igorminar.com/2010/07/devops-guide-to-confluence-dgc.html"&gt;DevOps Guide to Confluence&lt;/a&gt; series. In this chapter of the guide, we’ll have a look at Confluence configuration and tuning.&lt;/p&gt;

&lt;p&gt;There are four ways to modify Confluence's runtime behavior:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Config Files in Confluence Home directory&lt;/li&gt;
&lt;li&gt;Config Files in WEB-INF/classes&lt;/li&gt;
&lt;li&gt;JVM Options&lt;/li&gt;&lt;li&gt;Admin UI&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Config Files in Confluence Home directory&lt;/h2&gt;
&lt;p&gt;The &lt;a href="http://confluence.atlassian.com/display/DOC/Confluence+home+directory+contents"&gt;Confluence Home directory&lt;/a&gt; contains one or more config files that control the runtime behavior of Confluence. The most important file is &lt;tt&gt;confluence.cfg.xml&lt;/tt&gt;, which must be present in order for Confluence to start. This file can be modified by hand while Confluence is shut down, but it also gets modified by Confluence occasionally (mostly during upgrades). Your changes will be preserved, as long as you made them while Confluence was offline.&lt;/p&gt;

&lt;p&gt;Another relevant file is &lt;tt&gt;tangosol-coherence-override.xml&lt;/tt&gt; which must unfortunately be used to override Confluence’s lame multicast configuration needed for cluster configuration (see below).&lt;/p&gt;

&lt;p&gt;Lastly there is &lt;tt&gt;config/confluence-coherence-cache-config-clustered.xml&lt;/tt&gt; which contains configuration of the Confluence cache. Generally you don't want to modify this file by hand. I’ll come back to talk about cache configuration later in the Admin UI section of this chapter.&lt;/p&gt;

&lt;p&gt;In general it is advisable to be very consistent about your environment, so that you can then just have a single version of these files that you can distribute on all servers when needed. This includes the directory layout, network interface names, and so on.&lt;/p&gt;

&lt;p&gt;A combination of the first two files will allow you to configure the following:&lt;/p&gt;

&lt;h3&gt;Clustering&lt;/h3&gt;
&lt;p&gt;As I mentioned, this configuration is split between two config files. &lt;tt&gt;confluence.cfg.xml&lt;/tt&gt; contains &lt;tt&gt;confluence.cluster.*&lt;/tt&gt; properties, which allow you to set multicast IP, interface and TTL, but not the port. Only &lt;tt&gt;tangosol-coherence-override.xml&lt;/tt&gt; can do that.&lt;/p&gt;

&lt;p&gt;The cluster IP is by default derived from a "cluster name" specified via the Admin UI or installation wizard. For some reason Atlassian believes that in an enterprise environment one can just let software pick a random IP and port to run multicast on. I don’t know of any serious datacenter where things work this way. You’ll likely want to explicitly set the IP, port, interface name and TTL, and the only way to do that is by modifying these files by hand and ignoring the "cluster name" setting in the UI. Make sure that the settings are consistent in both files.&lt;/p&gt;

&lt;h3&gt;DB Connection Pool&lt;/h3&gt;
&lt;p&gt;Confluence comes with an embedded connection pool. I believe that you can use your own too (if it comes with your servlet container), but I’d suggest sticking with the embedded one since it is widely used and Atlassian runs their tests with it as well. The pool is configured via &lt;tt&gt;confluence.cfg.xml&lt;/tt&gt; and its &lt;tt&gt;hibernate.c3p0.*&lt;/tt&gt; properties. The most important property is the pool's &lt;tt&gt;max_size&lt;/tt&gt;, which prevents the pool from opening more than a defined number of connections at a time. You want this number to be higher than your typical peak concurrent request count (are you monitoring that?), but not higher than what your db can handle. We have ours set to 300, which is double our occasional peaks. Don’t forget that in order to take advantage of these connections, you’ll likely need to also increase the worker thread count in your servlet container.&lt;/p&gt;
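
&lt;p&gt;For illustration, the relevant entry in &lt;tt&gt;confluence.cfg.xml&lt;/tt&gt; would look roughly like this. This is a fragment only, and 300 is simply the value we use, not a general recommendation:&lt;/p&gt;
&lt;pre&gt;&amp;lt;property name="hibernate.c3p0.max_size"&amp;gt;300&amp;lt;/property&amp;gt;&lt;/pre&gt;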

&lt;h3&gt;DB Connection&lt;/h3&gt;
&lt;p&gt;The connection is configured via &lt;tt&gt;hibernate.connection.*&lt;/tt&gt; properties in &lt;tt&gt;confluence.cfg.xml&lt;/tt&gt;. Depending on your db, you might need to specify several settings for the connection to work well and grok UTF-8. For our MySQL db, we need to set the connection url to something like&lt;pre&gt;jdbc:mysql://server:3306/wikisdb?autoReconnect=true&amp;amp;useUnicode=true&amp;amp;characterEncoding=utf8&lt;/pre&gt;
Note that if you are editing this file by hand, you must escape characters that are illegal in XML, such as the ampersands in the URL above. More info about the db connection can be found in the &lt;a href="http://confluence.atlassian.com/display/DOC/Configuring+Database+Character+Encoding"&gt;Confluence documentation&lt;/a&gt;.&lt;/p&gt;
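
&lt;p&gt;To illustrate the escaping, here is roughly what the corresponding entry looks like inside &lt;tt&gt;confluence.cfg.xml&lt;/tt&gt; itself. It is a fragment only, using the same example server and database name as above:&lt;/p&gt;
&lt;pre&gt;&amp;lt;property name="hibernate.connection.url"&amp;gt;jdbc:mysql://server:3306/wikisdb?autoReconnect=true&amp;amp;amp;useUnicode=true&amp;amp;amp;characterEncoding=utf8&amp;lt;/property&amp;gt;&lt;/pre&gt;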

&lt;h2&gt;Config Files in &lt;tt&gt;WEB-INF/classes&lt;/tt&gt;&lt;/h2&gt;
&lt;p&gt;Just a side note: if you are building confluence from source then these files can be found at &lt;tt&gt;confluence/confluence-project/conf-webapp/src/main/resources/&lt;/tt&gt;.&lt;/p&gt;

&lt;p&gt;These files are the most cumbersome to work with because you need to reapply your changes to them after each upgrade. I'll describe how we use our automated patching machinery to do this in a future chapter of this guide. For now let's just go over the available config files and what you can change here.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;atlassian-user.xml&lt;/tt&gt; - used to configure user provisioning, e.g. LDAP. For more info read the &lt;a href="http://confluence.atlassian.com/display/DOC/Customising+atlassian-user.xml"&gt;docs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;confluence-init.properties&lt;/tt&gt; - this file allows you to specify the path to Confluence Home directory. There is a better way to set this; see the JVM Options section below.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;log4j.properties&lt;/tt&gt; - modifies logging preferences. This can also be done via the UI, but AFAIK the changes are not preserved after a restart or upgrade.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;seraph-config.xml&lt;/tt&gt; - controls authentication framework. You'll likely need to modify this file if you have a custom authenticator and login page.&lt;/p&gt;

&lt;p&gt;I should note that there are many other (usually xml) configuration files bundled with individual jars in &lt;tt&gt;WEB-INF/lib&lt;/tt&gt;, but those rarely need to be modified.&lt;/p&gt;

&lt;h2&gt;JVM Options&lt;/h2&gt;
&lt;p&gt;Another way to configure certain settings is via  JVM options. From the &lt;a href="http://confluence.atlassian.com/display/DOC/Recognised+System+Properties"&gt;complete list of recognized options&lt;/a&gt; these are the ones we use:&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;-Dcom.atlassian.user.experimentalMapping=true&lt;/tt&gt; - this is a critically important setting for us with 180k users. Without it, our cluster panics due to data overload (&lt;a href="http://jira.atlassian.com/browse/CONF-12319"&gt;CONF-12319&lt;/a&gt;). Unfortunately, despite Atlassian’s claims that this experimental feature is production ready, it got broken soon after release, and then again &lt;a href="http://jira.atlassian.com/browse/USER-258"&gt;recently&lt;/a&gt;, so you’ll have to &lt;a href="http://jira.atlassian.com/browse/USER-258?focusedCommentId=197479&amp;amp;page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_197479"&gt;patch&lt;/a&gt; the atlassian-user module to get it to work.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;-Dconfluence.disable.peopledirectory.anonymous=true&lt;/tt&gt; - for big public deployments the people directory is a privacy risk and generally useless for anonymous users, so we have it disabled for them.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;-Dconfluence.disable.mailpolling=true&lt;/tt&gt; - early on we decided that we don’t want people to build up mail archives on our site. While the feature is useful for small internal wikis, it’s too much of a risk with little reward to provide it on a public wiki. Unfortunately, this option only disables mail fetching. The UI for setting up mail archives will still be present in the wiki; you'll have to patch Confluence to remove it.&lt;/p&gt;

&lt;p&gt;I didn't learn about &lt;tt&gt;-Dconfluence.home&lt;/tt&gt; until recently. I would much prefer to use it than to mess with &lt;tt&gt;confluence-init.properties&lt;/tt&gt; file in &lt;tt&gt;WEB-INF/classes&lt;/tt&gt;.&lt;/p&gt;
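
&lt;p&gt;How these options get passed to the JVM depends on your servlet container. As a sketch, assuming a container whose startup script honors the &lt;tt&gt;JAVA_OPTS&lt;/tt&gt; environment variable (Tomcat, for example) and a made-up home directory path:&lt;/p&gt;
&lt;pre&gt;# /data/confluence-home is a hypothetical path; use your own Confluence Home
JAVA_OPTS="$JAVA_OPTS -Dconfluence.home=/data/confluence-home"
JAVA_OPTS="$JAVA_OPTS -Dcom.atlassian.user.experimentalMapping=true"
JAVA_OPTS="$JAVA_OPTS -Dconfluence.disable.peopledirectory.anonymous=true"
JAVA_OPTS="$JAVA_OPTS -Dconfluence.disable.mailpolling=true"
export JAVA_OPTS&lt;/pre&gt;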

&lt;h2&gt;Admin UI&lt;/h2&gt;
&lt;p&gt;&lt;a href="http://confluence.atlassian.com/display/DOC/Configuring+Confluence"&gt;Most of the Confluence settings&lt;/a&gt; can be configured via the Confluence admin interface. The downside is that the configuration is not versioned, and there is no easy way to see diffs or to roll back unless you want to hack the db and replace data from backups. With that in mind, let's look at the most important settings.&lt;/p&gt;

&lt;h3&gt;General Configuration&lt;/h3&gt;
&lt;p&gt;&lt;tt&gt;Server Base Url&lt;/tt&gt; - make sure this is set up correctly, otherwise confluence and its plugins won’t work properly.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;Users see Rich Text Editor by default&lt;/tt&gt; - we have this set to off. In the past many RTE bugs were causing headaches to our writers especially those who did lots of editing. In Confluence 3.2 and 3.3 the editor has improved a lot and it might be the time for us to reconsider this decision.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;CamelCase Links&lt;/tt&gt; - this used to be one of THE wiki features in general a few years ago, but as wikis have matured and people started creating more and more content, the automatic linking started to cause more problems than help. We have it off.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;Threaded Comments&lt;/tt&gt; - very useful; make sure it’s on.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;Remote API (XML-RPC &amp;amp; SOAP)&lt;/tt&gt; - we have ours on, but I &lt;a href="http://jira.atlassian.com/browse/CONF-15160"&gt;patched the remote api code&lt;/a&gt; to restrict access to it.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;Compress HTTP Responses&lt;/tt&gt; - OMG please turn this on if it isn't already. It’s a major performance booster. Alternatively you might want to do the compression in your webserver as Tim pointed out in comments below.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;JavaScript served in header&lt;/tt&gt; - we have this on, but for better performance it should be off. Unfortunately that breaks many plugins and legacy code that uses obtrusive javascript. Since this option has been around for a while, it might be worth it to just set it to off and deal with the remaining broken things as they are identified.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;User email visibility&lt;/tt&gt; - we have this set to visible to admins only, but our power users found it to be a collaboration barrier, so I patched the code and made emails visible to our global employees group in addition to the admin group. It would be nice if Confluence allowed such a configuration out of the box.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;Anonymous Access to Remote API&lt;/tt&gt; - No sane person will leave this on. If I were in charge, I would go as far as removing it from Confluence product.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;Anti XSS Mode&lt;/tt&gt; - This is a very handy feature. Not 100% bulletproof, but it helped to significantly decrease the number of XSS exploits in Confluence since its introduction.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;Attachment Maximum Size (B)&lt;/tt&gt; - I mentioned this one already in the first chapter when discussing the db configuration. If you are running a cluster (or think that you will eventually run it), set this to some low value. Ours is 5MB.&lt;/p&gt;

&lt;p&gt;&lt;tt&gt;Connection Timeouts&lt;/tt&gt; - these options are pretty handy when you have lots of feed macros, gadgets and other plugins that pull content from remote sites. In order to prevent worker thread pileup in your servlet container, don’t go beyond the default 10sec (which is already pretty high).&lt;/p&gt;

&lt;h3&gt;Daily Backup Administration&lt;/h3&gt;
&lt;p&gt;As I previously mentioned, this backup feature is useless for anything but tiny sites. Disable it.&lt;/p&gt;

&lt;h3&gt;Manage Referrers&lt;/h3&gt;
&lt;p&gt;Collecting referrers is ok, but don’t display them publicly if you run a site on the Internet. Otherwise you run a risk of exposing some internal only URIs that might contain confidential information.&lt;/p&gt;

&lt;h3&gt;Languages&lt;/h3&gt;
&lt;p&gt;Most of our documentation and content is written in American English, but unfortunately Atlassian doesn’t provide such a language pack. I just patch the default Australian English pack to get a US English pack. It works great and is almost no hassle to maintain.&lt;/p&gt;

&lt;h3&gt;User macros&lt;/h3&gt;
&lt;p&gt;I discourage their use in an enterprise environment. The lack of versioning, automated testing and documentation makes them a nightmare to maintain. Just create Confluence plugins for everything you need.&lt;/p&gt;

&lt;h3&gt;PDF Export Language Support&lt;/h3&gt;
&lt;p&gt;This is a tricky one. It took us quite a while to find the right single font that could be used to generate PDFs in almost all languages. Finally we found &lt;tt&gt;soui_zhs.ttf&lt;/tt&gt;, which is distributed with OpenOffice. It’s a huge file, but it works like a charm for all kinds of non-Western languages.&lt;/p&gt;

&lt;h3&gt;Themes&lt;/h3&gt;
&lt;p&gt;For reasons I’ll discuss later, we disabled all the themes except for our custom one, which is the global and default space theme. To disable a theme you have to go to plugins view and disable the appropriate theme plugins.&lt;/p&gt;

&lt;h3&gt;Cache Statistics&lt;/h3&gt;
&lt;p&gt;The name of this section in the UI is misleading, because not only can you view cache statistics here, but more importantly you can fully control the cache size via the UI. And in this case, I’m really glad that there is a UI to manage the cache config xml file, which due to its size is really hard to work with by hand. The changes you make via the UI are persisted in the Confluence Home directory and propagated throughout the cluster.&lt;/p&gt;

&lt;p&gt;Out of all the things you can tune via the admin UI, the cache tuning will have the biggest impact on your site’s performance. Confluence ships with cache settings optimized for smaller sites, so increasing the cache size is unavoidable for larger deployments.&lt;/p&gt;

&lt;p&gt;Tuning the cache settings is a time-consuming process because you need to balance the memory consumption with performance improvements. Usually I revisit the cache stats once a month and look for caches that are performing badly because the number of objects allowed in that particular cache is low. Confluence caching system is composed of many caches that are controlled via this UI.&lt;/p&gt;

&lt;p&gt;The best indicator of an overflowing cache is when the "Effectiveness" value is low (under 70-80%) AND “Percent Used” value is high (over 80%) AND usually the “Expired” value will be relatively high compared to “Hit” value in the same cell. This means that Confluence needs to go to the DB too often, even though it could cache the data in memory if the cache was bigger.&lt;/p&gt;
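&lt;p&gt;As a rough sketch, the first two conditions can be turned into a quick check. The function name and the exact cut-offs (75 and 80) are illustrative picks from the ranges above, not anything Confluence provides, and the “Expired” vs. “Hit” comparison is still left to eyeballing:&lt;/p&gt;

```shell
# Hypothetical helper, not part of Confluence: flags a cache as a candidate
# for a size increase, based on two whole-number percentages read off the
# Cache Statistics page.
# usage: cache_needs_more_room EFFECTIVENESS_PCT PERCENT_USED_PCT
cache_needs_more_room() {
  effectiveness=$1
  percent_used=$2
  # low effectiveness (under 75) combined with a nearly full cache (over 80)
  if [ "$effectiveness" -lt 75 ]; then
    if [ "$percent_used" -gt 80 ]; then
      return 0
    fi
  fi
  return 1
}

# Example: 62% effective and 95% full
if cache_needs_more_room 62 95; then
  echo "candidate for a bigger cache"
fi
```

&lt;p&gt;A cache that is only 40% full is not flagged even if its effectiveness is poor, because making it bigger would not help.&lt;/p&gt;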

&lt;p&gt;If you don’t understand what all the cache names and numbers mean, don’t worry about that too much. As long as you don’t make any dramatic changes too quickly and you monitor your JVM heap usage, you can’t break anything.&lt;/p&gt;

&lt;p&gt;As you increase the cache sizes, you’ll eventually start running out of heap space. That’s why you need to monitor the JVM and increase the &lt;tt&gt;-Xmx&lt;/tt&gt; value as needed. If the number of concurrent users increases, you might also need to slightly increase the &lt;tt&gt;-Xmn&lt;/tt&gt; value (see the &lt;a href="http://blog.igorminar.com/2010/07/dgc-ii-jvm-tuning.html"&gt;JVM Tuning chapter&lt;/a&gt; for more info).&lt;/p&gt;

&lt;p&gt;I wish Atlassian would provide better descriptions for all the available caches, because unless you know Confluence internals well, you won’t know what you are doing and that doesn’t feel good. Additionally, I’d like to see a way to limit memory usage, not the number of objects, because their size varies. Ideally, I'd really like to be able to just say "Use 3GB of memory for cache and distribute it in the most efficient way. Oh and let me know if you need more or less memory to work effectively". It would be better if Atlassian moved away from an in-process cache which in my opinion is not a good fit for Confluence. Maybe we'll get there one day.&lt;/p&gt;

&lt;h3&gt;Plugins&lt;/h3&gt;
&lt;p&gt;This section of the Admin UI is where you can install, uninstall, enable and disable plugins and their modules. There is also a Plugin Repository which additionally allows you to install plugins from Atlassian’s remote servers or user specified URIs. The recently released &lt;a href="https://plugins.atlassian.com/plugin/details/23915"&gt;Atlassian Universal Plugin Manager&lt;/a&gt; will eventually replace the latter one (or both?); I’m glad to see that happening.&lt;/p&gt;

&lt;p&gt;I suggest that you disable plugins that you don’t use or don’t want your users to use as soon as possible. We disabled all the bundled themes because we wanted to provide users with only one custom theme developed and maintained by us (I’ll explain the reasoning in a future chapter). For security reasons the &lt;tt&gt;html&lt;/tt&gt; and &lt;tt&gt;html-include&lt;/tt&gt; macros should in my opinion be disabled on all but family Confluence deployments. And for performance reasons the &lt;tt&gt;Confluence Usage Stats&lt;/tt&gt; plugin is &lt;a href="http://confluence.atlassian.com/display/CONFKB/Poor+Performance+due+to+Confluence+Usage+Plugin+-+Space+Activity"&gt;not suitable for any bigger deployments&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Plugin installation is very easy to do. That’s both good and bad. The plugin framework provided by Confluence is a &lt;em&gt;very sophisticated&lt;/em&gt; piece of software which allows you to install and uninstall plugins on the fly without any need to restart the server. Need to quickly install a fixed version of a buggy plugin without disturbing hundreds or thousands of users that are currently using your site? Done. That’s how easy it is.&lt;/p&gt;

&lt;p&gt;On the other hand, it is tempting to install plugins just because they have cool names or promise great features. You can do that in your dev or test environment, but in production you should only install plugins that you picked after some serious consideration.&lt;/p&gt;

&lt;p&gt;This is what I look for when deciding whether to install a plugin or not:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;was the functionality provided by the plugin requested by larger group of users or is the plugin needed for site administration purposes?&lt;/li&gt;
&lt;li&gt;was the plugin developed and tested in-house? if not, is it supported by Atlassian? if not, can we or some respectable Atlassian partner support it should there be some problems?&lt;/li&gt;
&lt;li&gt;is the plugin compatible with our Confluence version? does it have a track record of being compatible or was it made compatible with new Confluence versions as they were released?&lt;/li&gt;
&lt;li&gt;are there no major unresolved bugs in the areas of performance, scalability, data integrity and security?&lt;/li&gt;&lt;li&gt;does the plugin have an automated test suite with good test coverage?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you answer “yes” to all of these questions, then you may go ahead and do a trial before installing the plugin in production. Otherwise, you might provide your feedback to the plugin authors and wait until the pending issues get resolved before proceeding.&lt;/p&gt;

&lt;p&gt;I don’t want to be harsh, but especially 2-3 years ago most of the plugins created for Confluence were crap. But as the platform matures, and Atlassian partners get involved more, the quality of available plugins has been slowly increasing. The main issue that I see is that the existing plugins are not developed and tested with large scale deployments in mind. Hopefully things will change as more and more deployments grow beyond small and medium sites. It’s unfortunate that even some commercial plugins suffer from the very same issues that plague plugins created by a bunch of volunteers and enthusiasts. So pick your plugins carefully, do a trial, check for unresolved bugs and existing user complaints, and then decide.&lt;/p&gt;

&lt;p&gt;I've been reasonably active in the Atlassian development community and from these interactions, I'd like to highlight the work done by Dan Hardiker (&lt;a href="http://www.adaptavist.com/"&gt;Adaptavist&lt;/a&gt;) and Roberto Dominguez (&lt;a href="http://comalatech.com/"&gt;Comalatech&lt;/a&gt;). And though I haven't worked with guys from &lt;a href="http://www.customware.net/"&gt;CustomWare&lt;/a&gt;, they are also considered to be pretty sharp.&lt;/p&gt;

&lt;p&gt;Be especially careful with plugins that provide new macros for the wiki content. Once you install such a plugin you won't be able to uninstall it without breaking wiki pages until all the references to that macro are removed (with tens of thousands of pages and no ability to track the references this might be a big challenge).&lt;/p&gt;

&lt;p&gt;In general however, try to keep the number of plugins low. It’s better for performance and you won’t get in trouble as often when you need to upgrade Confluence but some of the plugins you use are not compatible with the new Confluence version.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;You should now have a good idea about how to configure Confluence and where this configuration is done. In the next chapters we'll look at &lt;a href="http://blog.igorminar.com/2010/07/dgc-iv-confluence-upgrades.html"&gt;upgrading Confluence&lt;/a&gt;, patching and more.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/MKylgvRW0Cw" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/2866384952197166924/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=2866384952197166924" title="9 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2866384952197166924?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2866384952197166924?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/MKylgvRW0Cw/dgc-iii-confluence-configuration-and.html" title="DGC III: Confluence Configuration and Tuning" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>9</thr:total><feedburner:origLink>http://blog.igorminar.com/2010/07/dgc-iii-confluence-configuration-and.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkUDQnkzcSp7ImA9Wx5UGUs.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-2202784255489904385</id><published>2010-07-27T00:13:00.000-07:00</published><updated>2010-10-24T15:44:33.789-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T15:44:33.789-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" 
term="Java" /><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Sun" /><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="SunWikis" /><title>DGC II: The JVM Tuning</title><content type="html">This blog post is part of the &lt;a href="http://blog.igorminar.com/2010/07/devops-guide-to-confluence-dgc.html"&gt;DevOps Guide to Confluence series&lt;/a&gt;. In this chapter of the guide, I’ll be focusing on JVM tuning with the aim to make our Confluence perform well and operate reliably.&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;JDK Version&lt;/h2&gt;
First things first: use a recent JDK. Java 5 (1.5) &lt;a href="http://java.sun.com/j2se/1.5/index.jsp"&gt;was EOLed 1.5 years ago&lt;/a&gt;, so there is absolutely no reason for you to use it with Confluence. As &lt;a href="http://www.atlassian.com/summit/2010/presentations/development-speed/performance-tuning-application-development.jsp"&gt;George pointed out&lt;/a&gt; in his presentation, there are some significant performance gains to be made just by switching to Java 6 and you can get another performance boost if you upgrade from an older JDK 6 release to a recent one. JDK 6u21 is currently the latest release and that’s what I would pick if I were to set up a production Confluence server today.&lt;br/&gt;&lt;br/&gt;

If you are wondering about which Java VM to use, I suggest that you stick with Sun’s HotSpot (also known as Sun JDK). It’s the only VM supported by Atlassian and I really don’t see any point in using anything else at the moment.&lt;br/&gt;&lt;br/&gt;

Lastly, it goes without saying that you should use the &lt;tt&gt;-server&lt;/tt&gt; JVM option to enable the server VM. This usually happens automatically on server grade hardware, but it's safer to set it explicitly.&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;VM Observability&lt;/h2&gt;
For me using JDK 6 is not just about performance, but also about &lt;a href="http://www.sun.com/bigadmin/features/articles/java_se6_observability.jsp"&gt;observability of the VM&lt;/a&gt;. Java 6 contains many enhancements in the monitoring, debugging and probing arena that make JDK 5 and its VM look like an obsolete black box. &lt;br/&gt;&lt;br/&gt;

Just to mention some enhancements, the amount of interesting VM telemetry data exposed via JMX is amazing. Just point &lt;a href="https://visualvm.dev.java.net/"&gt;VisualVM&lt;/a&gt; at a local Java VM to see for yourself (no restart or configuration needed). Be sure to install the &lt;a href="http://java.sun.com/performance/jvmstat/visualgc.html"&gt;VisualGC&lt;/a&gt; plugin for VisualVM. In order to allow remote connections you’ll need to start the JVM with these flags:&lt;pre&gt;-Dcom.sun.management.jmxremote.port=some_port
-Dcom.sun.management.jmxremote.password.file=/path/to/jmx_pw_file
-Djavax.net.ssl.keyStore=/path/to/your/keystore
-Djavax.net.ssl.keyStorePassword=your_pw&lt;/pre&gt;
&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_p64VvtgZDp4/TE59F1HW4cI/AAAAAAAAAeI/BGIdnPWgnfY/s1600/visualvm.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 290px;" src="http://2.bp.blogspot.com/_p64VvtgZDp4/TE59F1HW4cI/AAAAAAAAAeI/BGIdnPWgnfY/s400/visualvm.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5498469734176711106" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;

Unless you make the port available only on some special admin-only network, you should password protect the JMX endpoint as well as use SSL. The JMX interface is very powerful and in the wrong hands could result in security issues or outages caused by inappropriate actions.&lt;br/&gt;&lt;br/&gt;
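As a sketch of what that looks like in practice, the options below spell out the authentication and SSL switches explicitly (both already default to on once a JMX port is set, but being explicit guards against someone flipping them off by accident). The &lt;tt&gt;JMX_OPTS&lt;/tt&gt; variable name, the port number and the paths are placeholders; append the result to whatever variable your startup script passes to the JVM:

```shell
# Sketch only: the variable name, port number and file paths are placeholders.
JMX_OPTS="-Dcom.sun.management.jmxremote.port=9010"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=true"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.ssl=true"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.password.file=/path/to/jmx_pw_file"
JMX_OPTS="$JMX_OPTS -Djavax.net.ssl.keyStore=/path/to/your/keystore"
JMX_OPTS="$JMX_OPTS -Djavax.net.ssl.keyStorePassword=your_pw"

# The JMX agent refuses to start if the password file is readable by anyone
# but its owner, so restrict it first, e.g.:
#   chmod 600 /path/to/jmx_pw_file
```

&lt;br/&gt;&lt;br/&gt;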

For more info about all the options available read &lt;a href="http://download.oracle.com/docs/cd/E17476_01/javase/1.5.0/docs/guide/management/agent.html"&gt;this document&lt;/a&gt;.&lt;br/&gt;&lt;br/&gt;

In addition to JMX, on some platforms there is also good &lt;a href="http://en.wikipedia.org/wiki/DTrace"&gt;DTrace&lt;/a&gt; integration which helped me troubleshoot some Confluence issues in production without disrupting our users.&lt;br/&gt;

And lastly there is &lt;a href="http://kenai.com/projects/btrace/pages/Home"&gt;BTrace&lt;/a&gt; that allowed me to troubleshoot &lt;a href="/2008/06/btrace-dtrace-for-java.html"&gt;a nasty hibernate issue&lt;/a&gt; once. It's a very handy tool that, as opposed to DTrace, works on all OSes.&lt;br/&gt;&lt;br/&gt;

I can’t stress enough how important continuous monitoring of your Confluence JVMs is. Only if you know how your JVMs and app are doing can you tell if your tuning has any effect. George Barnett also has a &lt;a href="http://confluence.atlassian.com/display/DOC/Performance+Testing+Scripts"&gt;set of automated performance tests&lt;/a&gt; which are handy to load test your test instance and compare results before and after you make some tweaks.&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;Heap and Garbage Collection Must Haves&lt;/h2&gt;
After upgrading the JDK version, the next best thing you can do is to give Confluence lots of memory. In &lt;a href="http://blog.igorminar.com/2010/07/dgc-i-infrastructure.html"&gt;the infrastructure chapter of the guide&lt;/a&gt;, I mentioned that you should prepare your HW for this, so let’s put this memory to use.&lt;br/&gt;&lt;br/&gt;

Before we set the heap size, we should decide between 32-bit JVM and 64-bit JVM. 64-bit VM is theoretically a bit slower, but allows you to create huge heaps. 32-bit JVM has heap size limited by the available 32-bit address space and other factors. 32bit OSes will allow you to create heaps up to only 1.6-2.0 GB. 64bit Solaris will allow you to create 32bit JVMs with up to 4GB heap (&lt;a href="http://java.sun.com/docs/hotspot/HotSpotFAQ.html#gc_heap_32bit"&gt;more info&lt;/a&gt;). For anything bigger than that you have to go 64bit. It’s not a big deal, if your OS is 64bit already. The option to start the VM in 64bit mode is &lt;tt&gt;-d64&lt;/tt&gt;. On almost all platforms the default is &lt;tt&gt;-d32&lt;/tt&gt;.&lt;br/&gt;&lt;br/&gt;

Before I go into any detail, I should explain the main objectives of heap and garbage collection tuning for Confluence. The objectives are:
&lt;ul&gt;
&lt;li&gt;heap size - we need to tell JVM how much memory to use&lt;/li&gt;
&lt;li&gt;garbage collector latency - garbage collection often requires that the JVM stops your application; these GC pauses are often invisible, but with large heaps and under certain conditions they might become very significant (30-60+ seconds)&lt;/li&gt;
&lt;/ul&gt;&lt;br/&gt;&lt;br/&gt;

Additionally we should also know a thing or two about how Confluence uses the heap. The main points are:
&lt;ul&gt;
&lt;li&gt;Objects created by Confluence and stored on the heap generally fall into three categories:
  &lt;ul&gt;
  &lt;li&gt;short-lived objects - life-cycle of these is bound to a http request&lt;/li&gt;
  &lt;li&gt;medium-lived objects - usually represent cache entries with shorter TTL&lt;/li&gt;
  &lt;li&gt;long-lived objects - represent cache entries with big TTL, settings and infrastructure objects (plugin framework, rendering engine, etc), cache entries taking most of the space.&lt;/li&gt;
  &lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Confluence creates lots of short-lived objects per request&lt;/li&gt;
&lt;li&gt;Half or more of the heap will be used by long-lived cache objects&lt;/li&gt;
&lt;/ul&gt;&lt;br/&gt;&lt;br/&gt;

By combining our objectives with our knowledge of Confluence’s heap profile, our tuning should focus on providing enough heap space for the application to have space for the cache, short-lived objects, as well as some extra buffer. Given that long-lived objects will (eventually) reside in the old generation of the heap, we want to avoid promoting short-lived objects there, because otherwise we’ll need to do massive garbage collections of the old generation unnecessarily. Instead we should try to limit the promotion from the young generation to only those objects that will likely belong to the long-lived category.&lt;br/&gt;&lt;br/&gt;

We’ll also need to figure out how much heap you need to use. Unfortunately there isn’t an easy way to find this out, except for some educated guessing and trial &amp; error. You can also read this &lt;a href="http://confluence.atlassian.com/display/DOC/Server+Hardware+Requirements+Guide"&gt;HW Requirements document&lt;/a&gt; from Atlassian that can give you an idea about some starting points. I believe we started at 1GB, but over time went through 2GB, 3GB, 3.5GB, 4GB, 5GB all the way to 6GB.&lt;br/&gt;&lt;br/&gt;

The Confluence heap size depends on the number of concurrent users and the amount of content you have. This is mainly because Confluence uses a massive (well, in our case it is) in-process cache that is stored on the heap. We’ll get to Confluence and cache tuning in a later chapter of this guide.&lt;br/&gt;&lt;br/&gt;

So let’s set the max heap size. This is done via &lt;tt&gt;-Xmx&lt;/tt&gt; JVM option:&lt;pre&gt;-Xmx6144m 
-Xms6144m&lt;/pre&gt;
The additional &lt;tt&gt;-Xms&lt;/tt&gt; parameter says that the JVM should reserve all 6GB at startup — this is to avoid heap resizing which can be slow, especially when dealing with large heaps.&lt;br/&gt;&lt;br/&gt;

The rest of the heap settings in this post are based on a 6GB heap size; you might need to adjust them for your total heap size.&lt;br/&gt;&lt;br/&gt;

The next JVM option is &lt;tt&gt;-Xmn&lt;/tt&gt;, which specifies how much of the heap should be dedicated to the young generation (you should read up on &lt;a href="http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#generations"&gt;generational gc&lt;/a&gt; if you don’t know what I’m talking about). The default is something like 25% or 33%, but I set the young generation to ~45% of the entire heap:&lt;pre&gt; -Xmn2818m&lt;/pre&gt;&lt;br/&gt;
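To adapt this to a different heap size, the arithmetic is simple enough to script. A minimal sketch, where &lt;tt&gt;heap_mb&lt;/tt&gt; and &lt;tt&gt;young_pct&lt;/tt&gt; are example inputs (note that 2818m on a 6144m heap is actually a bit closer to 46% than to 45%, so the result differs slightly from the value above):

```shell
# Young generation size as a percentage of the total heap (both in MB).
# heap_mb and young_pct are example inputs; adjust them to your own heap.
heap_mb=6144
young_pct=45
young_mb=$(( heap_mb * young_pct / 100 ))   # integer arithmetic, rounds down
echo "-Xmn${young_mb}m"                     # prints -Xmn2764m
```

&lt;br/&gt;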

Increasing the permanent generation size is also usually required given the number of classes that Confluence loads. This is done via &lt;tt&gt;-XX:MaxPermSize&lt;/tt&gt; option:&lt;pre&gt;-XX:MaxPermSize=512m&lt;/pre&gt;&lt;br/&gt;

Given that determining the right heap size for your environment is a non-trivial task for larger instances, especially if occasional memory leaks start consuming the precious memory, you always want to have as much data as possible to debug memory exhaustion issues. Aside from good monitoring (which I mentioned in the previous chapter) you should also configure your JVM to dump the heap when an OutOfMemoryError occurs. You can then analyze this heap dump for potential memory leaks. &lt;br/&gt;&lt;br/&gt;

Since we are dealing with relatively big heaps, make sure you have enough space on the disk (heap dumps for a 6GB heap usually take 2-4GB). I’ve had &lt;a href="http://blog.igorminar.com/2009/03/identifying-threadlocal-memory-leaks-in.html"&gt;a very good experience&lt;/a&gt; using &lt;a href="http://www.eclipse.org/mat/"&gt;Eclipse Memory Analyzer&lt;/a&gt; to analyze these large heaps (VisualVM or jhat are not up for analyzing heaps of this size). The relevant JVM options are:
&lt;pre&gt;
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/some/relative/or/absolute/dir/path
&lt;/pre&gt;&lt;br/&gt;

Minimizing gc latency, so that users don’t have to wait several seconds for a stop-the-world (STW) gc to finish before their pages render, is a commendable goal in itself, but the main reason why you want to do this is to avoid Confluence cluster panics.&lt;br/&gt;&lt;br/&gt;

Confluence has this “wonderful” cluster safety mechanism that is sensitive to any latency bigger than a few tens of seconds. In case a major STW gc occurs, the cluster safety code might announce cluster panic and shut down all the nodes (that’s right, all the nodes, not just the one that is misbehaving).&lt;br/&gt;&lt;br/&gt;

In order to be informed of any latencies caused by gc, you need to turn on gc logging. This is the magic combination of switches that works well for me:
&lt;pre&gt;
-Xloggc:/some/relative/or/absolute/path/wikis-gc.log 
-XX:+PrintGCDetails 
-XX:+PrintGCTimeStamps 
-XX:+PrintGCDateStamps 
-XX:+PrintTenuringDistribution
&lt;/pre&gt;&lt;br/&gt;

Unfortunately the file specified via &lt;tt&gt;-Xloggc&lt;/tt&gt; will get overwritten during a jvm restart, so make sure you preserve it either manually before a restart or automatically via some restart script. Additionally reading the gc log is a tough job that requires some practice and since the format varies a lot depending on your JDK version and garbage collector, I’m not going to describe it here.&lt;br/&gt;&lt;br/&gt;
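A restart script can take care of preserving the log. The helper below is only a sketch: the function name is made up and the timestamp suffix is just one possible naming scheme. Call it right before starting the JVM:

```shell
# Hypothetical helper for a restart script: moves the current gc log aside
# under a timestamped name so that the next JVM start does not overwrite it.
# usage: preserve_gc_log /some/relative/or/absolute/path/wikis-gc.log
preserve_gc_log() {
  gc_log=$1
  # nothing to do on the very first start, when no log exists yet
  if [ -f "$gc_log" ]; then
    mv "$gc_log" "$gc_log.$(date +%Y%m%d-%H%M%S)"
  fi
}
```

The old logs pile up next to the live one, so you will probably want to prune or compress them from time to time.&lt;br/&gt;&lt;br/&gt;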

&lt;h2&gt;Performance tweaks&lt;/h2&gt;
The first performance boosting JVM option I'd like to mention is &lt;tt&gt;-XX:+AggressiveOpts&lt;/tt&gt;, which will turn on performance enhancements that are expected to be on by default in the future JVM versions (&lt;a href="http://java.sun.com/performance/reference/whitepapers/tuning.html#section4.2.4"&gt;more info&lt;/a&gt;).&lt;br/&gt;&lt;br/&gt;

If you are using 64bit JVM then &lt;tt&gt;-XX:+UseCompressedOops&lt;/tt&gt; will make a big difference and will virtually eliminate the performance penalty you pay for switching from 32bit to 64bit JVM.&lt;br/&gt;&lt;br/&gt;

And lastly there is &lt;tt&gt;-XX:+DoEscapeAnalysis&lt;/tt&gt; which will boost the performance by another few percents.&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;Optional Heap and GC tweaks&lt;/h2&gt;
To slow down object promotion into the old generation, you might want to tune the size of the survivor spaces (regions within the young generation). To achieve this, we want the survivor spaces to be slightly bigger than the default. Additionally I also want to keep the promotion rate down (objects that survive a specific number of collections in the survivor space will be promoted to the older generation), so I use these options:
&lt;pre&gt;
-XX:SurvivorRatio=6
-XX:TargetSurvivorRatio=90
&lt;/pre&gt;&lt;br/&gt;
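For a feel of what these numbers mean, here is a back-of-the-envelope calculation. &lt;tt&gt;SurvivorRatio&lt;/tt&gt; is the ratio of eden to a single survivor space, and the young generation consists of eden plus two survivor spaces. The variable names are illustrative and the JVM rounds the real sizes to its own alignment, so treat the result as approximate:

```shell
# Approximate eden and survivor sizes (in MB) for a given young generation.
young_mb=2818        # the -Xmn value used earlier in this post
survivor_ratio=6     # -XX:SurvivorRatio (eden : one survivor space)
survivor_mb=$(( young_mb / (survivor_ratio + 2) ))
eden_mb=$(( young_mb - 2 * survivor_mb ))
echo "eden=${eden_mb}m survivor=${survivor_mb}m each"   # prints eden=2114m survivor=352m each
```

With &lt;tt&gt;-XX:TargetSurvivorRatio=90&lt;/tt&gt; the JVM aims to keep a survivor space up to 90% full after a young collection, which is roughly 316MB of each 352MB space.&lt;br/&gt;&lt;br/&gt;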

I also found that by using parallel gc for the young generation and concurrent mark and sweep gc for the older generation I can practically eliminate any significant STW gc pauses. Your mileage might vary on this one, so do some testing before you use it in production. These are the settings I use:
&lt;pre&gt;
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=68
-XX:MaxTenuringThreshold=31
-XX:+CMSParallelRemarkEnabled
&lt;/pre&gt;&lt;br/&gt;

&lt;h2&gt;Resources&lt;/h2&gt;
The information above was gathered from years of experience as well as various sources, including the following:
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html"&gt;Java SE 6 HotSpot Virtual Machine Garbage Collection Tuning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://java.sun.com/performance/reference/whitepapers/tuning.html"&gt;Java Tuning White Paper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blogs.sun.com/jonthecollector/entry/the_fault_with_defaults"&gt;The Fault with Defaults&lt;/a&gt; (blog post)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://weblogs.java.net/blog/sdo/archive/2007/12/a_glassfish_tun.html"&gt;Scott Oaks - A Glassfish Tuning Primer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp"&gt;Java HotSpot VM Options&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.ibm.com/developerworks/java/library/j-jtp11253/"&gt;Garbage collection in the HotSpot JVM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blogs.sun.com/watt/resource/jvm-options-list.html"&gt;A Collection of JVM Options&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.md.pp.ru/~eu/jdk6options.html"&gt;The most complete list of -XX options for Java 6 JVM&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;h2&gt;Running Multiple Web Apps in one VM&lt;/h2&gt;
Don't do that. Really. Don't. Bad things will happen if you do (OOME, classloading issues etc).&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;
Your JVM should now be in a good shape to host Confluence and serve your clients. In the next chapter of this guide I'll write about &lt;a href="http://blog.igorminar.com/2010/07/dgc-iii-confluence-configuration-and.html"&gt;Confluence configuration, tuning&lt;/a&gt;, upgrades and more.&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/tYUbERi4t2M" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/2202784255489904385/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=2202784255489904385" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2202784255489904385?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2202784255489904385?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/tYUbERi4t2M/dgc-ii-jvm-tuning.html" title="DGC II: The JVM Tuning" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://2.bp.blogspot.com/_p64VvtgZDp4/TE59F1HW4cI/AAAAAAAAAeI/BGIdnPWgnfY/s72-c/visualvm.png" height="72" width="72" /><thr:total>3</thr:total><feedburner:origLink>http://blog.igorminar.com/2010/07/dgc-ii-jvm-tuning.html</feedburner:origLink></entry><entry gd:etag="W/&quot;Ck4ESHY5eip7ImA9Wx5UGUs.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-4054319096510216382</id><published>2010-07-25T15:00:00.000-07:00</published><updated>2010-10-24T15:55:09.822-07:00</updated><app:edited 
xmlns:app="http://www.w3.org/2007/app">2010-10-24T15:55:09.822-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Sun" /><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="SunWikis" /><title>DGC I: The Infrastructure</title><content type="html">In &lt;a href="http://blog.igorminar.com/2010/07/devops-guide-to-confluence-dgc.html"&gt;the introductory post&lt;/a&gt;, I mentioned that a Confluence cluster is the way to go big. Let's go through some of the main things to consider when you start preparing your infrastructure.&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;Confluence cluster&lt;/h2&gt;
To build a Confluence site, you need Confluence :-). Well, make it two... as in a two-node cluster license. I recommend this for any bigger site with relatively high uptime expectations, even if you know that your amount of traffic won't require load balancing between two nodes. I often find myself in need of a restart (e.g. during a patch deployment) and with a cluster, you can restart one node at a time and your users won't even know about it.&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;Network&lt;/h2&gt;
My team operates other big sites, and from all of them we expect some level of redundancy. Typically we split everything between "odd" (composed of hosts with hostnames ending with an odd number) and "even" strings, and this applies to Confluence nodes as well (that's why you need a two-node license). Each string is composed of a border firewall, load balancer, switches and the actual servers (web/application/database/whathaveyou) and both strings can either share the load or work as primary&amp;standby depending on your application needs and network configuration.&lt;br/&gt;&lt;br/&gt;

This kind of split allows us to take half of our datacenter offline for maintenance when needed, and to absorb the failure of any hardware or software within one string without any perceivable interruption of service.&lt;br/&gt;&lt;br/&gt;

Sure, you can make things even more redundant by adding a third or fourth string, but none of our apps requires that level of redundancy, and the cost and complexity of getting there are therefore hard to justify.&lt;br/&gt;&lt;br/&gt;

There are two important things that matter when it comes to setting up the network, and both can make or break your Confluence clustering.
&lt;ol&gt;
&lt;li&gt;The latency between the two nodes should be minimal. Ideally they should be just one hop apart and on a fast network (1GBit). There will be a lot of communication going on between your Confluence nodes, and you want it to happen as quickly as possible, otherwise the cluster synchronization will drag down your overall cluster performance. Don't even think about putting the two nodes into different datacenters, let alone on different continents. Confluence clustering was not built for that type of scenario.&lt;/li&gt;
&lt;li&gt;Make absolutely sure that your network (mainly switches, OS, firewall) supports multicast.&lt;/li&gt;
&lt;/ol&gt;&lt;br/&gt;

The best way to check that multicast works reliably is to use the multicast test tool that is bundled with &lt;a href="http://www.oracle.com/technology/products/coherence/index.html"&gt;Coherence&lt;/a&gt; (a library that is bundled with Confluence). Run the following command on all nodes and check that all packets are being delivered and that no duplicates are present:
&lt;pre&gt;
java -cp $CONFLUENCE_PATH/WEB-INF/lib/coherence-x.y.jar:$CONFLUENCE_PATH/WEB-INF/lib/tangosol-x.y.jar \
com.tangosol.net.MulticastTest \
-group $YOUR_MULTICAST_IP:$YOUR_MULTICAST_PORT \
-ttl 1 \
-local $NODE_IP
&lt;/pre&gt;&lt;br/&gt;

In our environment, it took us months of waiting for the right patch from our network gear vendor and some OS patching to make things totally stable. Fortunately, our ops guys eventually found the magic combination of patches and settings, and then we were good to go.&lt;br/&gt;&lt;br/&gt;

Our site uses both http and https protocols for content delivery, and since we already had an SSL accelerator available in our datacenter, we utilized it for Confluence. That said, I don't think hardware acceleration is very important with current hardware.&lt;br/&gt;&lt;br/&gt;

Another noteworthy suggestion I have for your network is the load balancer configuration. We started off with session-affinity-based load balancing, but at one point people started to notice that they sometimes saw different content than their colleagues. This was due to a delay in the propagation of changes throughout the cluster. Usually the delay is unnoticeable, but for some reason that's not always the case. I haven't investigated this issue further and just switched to primary/slave load balancing, which has been working great for us since. This of course will work only if each of your nodes can handle all the traffic on its own, but you can trust me that it solves all the issues with users that don't believe in eventual consistency :-).&lt;br/&gt;&lt;br/&gt;

Hopefully your load balancer will perform healthchecks against your nodes. The &lt;tt&gt;/errors.jsp&lt;/tt&gt; path is the ideal target for these healthchecks, because it returns HTTP 200 only if everything is ok with the node.&lt;br/&gt;&lt;br/&gt;
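
If you want to run the same check by hand or from a script, a minimal Python sketch might look like the one below. Only the &lt;tt&gt;/errors.jsp&lt;/tt&gt; path and the "HTTP 200 means healthy" rule come from the paragraph above; the function name, the 5 second timeout and the example host name are assumptions made for illustration.&lt;br/&gt;&lt;br/&gt;
&lt;pre&gt;
```python
import urllib.error
import urllib.request


def node_is_healthy(base_url, path="/errors.jsp", timeout=5.0):
    """Return True only if the node answers the health check with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + path, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # HTTP error statuses, refused connections and timeouts
        # all count as an unhealthy node.
        return False
```
&lt;/pre&gt;
A call such as &lt;tt&gt;node_is_healthy("http://wiki-node1.example.com:8080")&lt;/tt&gt; (a made-up host) returns &lt;tt&gt;True&lt;/tt&gt; only when the request ends in an HTTP 200 response (redirects are followed); error statuses, timeouts and refused connections all count as unhealthy.&lt;br/&gt;&lt;br/&gt;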

When it comes to firewall rules (you have a firewall, right?), you shouldn't allow incoming connections from public networks directly to your servers; all public traffic should go through the load balancer only. As for outbound connections, you should allow your servers to connect to any public server on ports 80 (HTTP) and 443 (HTTPS); these connections are needed for feed retrieval, OpenSocial gadgets and plugin installation.&lt;br/&gt;&lt;br/&gt;


&lt;h2&gt;Hardware (cpu, memory, disk)&lt;/h2&gt;
&lt;em&gt;Update&lt;/em&gt;: I came across this &lt;a href="http://confluence.atlassian.com/display/DOC/Server+Hardware+Requirements+Guide"&gt;HW requirements&lt;/a&gt; document from Atlassian, which is helpful especially for smaller instances.&lt;br/&gt;&lt;br/&gt;

When you are making your hardware choices, I suggest you stick with a server that is relatively recent and has decent single-threaded performance, yet offers multicore parallelism. Confluence does a relatively large amount of number crunching per http request, so both single-threaded and multi-threaded horsepower are needed to get good results. Additionally, Confluence's boot process is not the best one, so with poor single-threaded throughput you'll end up waiting minutes for the app to start (at one point I did!).&lt;br/&gt;&lt;br/&gt;

Confluence loves memory! So don't be stingy. RAM is cheap these days, so get a few gigs that will be dedicated just to Confluence. My instance uses a 6 GB JVM heap, and with additional non-heap memory consumption, OS overhead and an extra buffer, I allocated 10 GB of RAM for each Confluence node. You will likely start with much lower memory requirements, but as your instance grows, so will the memory requirements - keep that in mind.&lt;br/&gt;&lt;br/&gt;

When it comes to disk and disk space, you have to realize three things.
&lt;ol&gt;
&lt;li&gt;Confluence stores all of its persistent data in a (hopefully remote) database.&lt;/li&gt;
&lt;li&gt;Confluence relies on fetching data from its Lucene index, stored on the local file system (each node has its own copy). This index is built from the db contents and can be rebuilt at any time.&lt;/li&gt;
&lt;li&gt;Attachments, which can represent a huge chunk of your persistent data, will be stored in the database. Confluence won't let you use e.g. a shared filesystem when you are running a cluster.&lt;/li&gt;
&lt;/ol&gt;&lt;br/&gt;

All of this means that you will need a few (dozen) gigabytes of local disk space that can be accessed reasonably quickly. An SSD will likely not buy you much here; use it for your DB instead! Server-grade hard drives configured in redundant software or hardware RAID should be sufficient for your web/application server (you can skip the RAID if you can rebuild the server really quickly after a disk failure).&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;OS &amp; Filesystem&lt;/h2&gt;
The choice of OS is often a religious one. But I think that it's more important that you are comfortable administering your OS than anything else. We use Solaris 10 or, more recently, OpenSolaris everywhere. OpenSolaris in particular is superior to most (all?) of the OSes out there (heh, now I'm being religious), but it will be worth cat's pee to you if you have no clue about how to work with it and don’t have the time or willingness to learn a lot of cool stuff about the OS. In general, I'd say that any &lt;em&gt;64-bit&lt;/em&gt; *nix OS should be suitable as long as you know how to use it. You'll want a 64-bit OS so that you can load it with loads of RAM and create a big JVM heap once you need it.&lt;br/&gt;&lt;br/&gt;

One nice thing that comes with Solaris and OpenSolaris (and BSD) is the ZFS file system. If you don't know much about it, I suggest that you &lt;a href="http://en.wikipedia.org/wiki/ZFS"&gt;read a bit about it&lt;/a&gt;. ZFS can make your backup strategy &lt;em&gt;a lot&lt;/em&gt; simpler and allows you to revert from a failed upgrade in a matter of seconds. I'm not exaggerating, it happened to me several times. Hopefully &lt;a href="http://en.wikipedia.org/wiki/Btrfs"&gt;Btrfs&lt;/a&gt; will soon be production ready for Linux distros and will offer comparable conveniences. If you can't use either of these, you'll have to suck it up and deal with it. I don't envy you...&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;Virtualization&lt;/h2&gt;
During the last 3 years, we tried several combinations of deployment configurations for our Confluence site. These include Solaris 10 servers shared by several apps, Solaris 10 Zones (one zone per app) and OpenSolaris with Xen virtualization. Xen and OpenSolaris is what we currently use. It works well, but if I were to make a decision today, I would probably go with OpenSolaris and Zones. This combination gives you the best stability, performance, resource virtualization and application isolation.&lt;br/&gt;&lt;br/&gt;

In any case, many people ask what the performance penalty for going virtualized is. My answer is that it depends on your application, but for a webapp, more likely than not, it isn't going to be the main cause of your performance problems. Decent hardware is going to make the virtualization penalty almost invisible and at the same time will give you flexibility when allocating resources in your data center. Just to give you a rough idea, the overhead for Xen is 10-30%; for Solaris Zones it's a lot less.&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;Web Container&lt;/h2&gt;
Atlassian recommends using Tomcat as the web container for Confluence. We could again spend a lot of time fighting a religious battle here, but I'm going to avoid that. If Tomcat works for you and you don't find it lacking features that make enterprise deployments and operation easier, then good for you. You will most likely want to front it with the Apache web server or something similar, though.&lt;br/&gt;&lt;br/&gt;

I've been using Sun Web Server 7 in my production environment and was quite happy with it. Another excellent choice is GlassFish v2.1 or v3, which I've been using for Confluence on my Mac. Unfortunately, Confluence doesn't adhere to the Servlet spec in some places, so you'll have to &lt;a href="http://blog.igorminar.com/2009/12/running-confluence-on-glassfish-v3.html"&gt;patch it to get it to run with GF v3&lt;/a&gt;. GlassFish v2.1 is not affected, but suffers from clashing Xalan classes; to fix that you need to put Confluence's &lt;tt&gt;xalan-x.y.z.jar&lt;/tt&gt; into &lt;tt&gt;$GLASSFISH_HOME/domains/$YOURDOMAIN/lib/&lt;/tt&gt;. Otherwise everything works as expected.&lt;br/&gt;&lt;br/&gt;

For a bigger site, you'll likely need to increase the worker thread count in your servlet container. Check your container's documentation to see what the default is and how to increase it. You should also know what your peak number of concurrent requests is (monitor it!) and, in combination with your infrastructure capabilities (load test it!), choose the right value for you. Ours is 256, which is higher than our usual peak traffic, but lower than what we could handle if we had to.&lt;br/&gt;&lt;br/&gt;
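
One way to estimate that peak is to replay request start times and durations (for example taken from your access logs) and count how many requests overlapped at any moment. The Python sketch below is a hypothetical helper, not something that ships with Confluence or any container, and it assumes that each request is described by a start time and a duration, both in seconds.&lt;br/&gt;&lt;br/&gt;
&lt;pre&gt;
```python
def peak_concurrency(requests):
    """Return the largest number of requests that were in flight at once.

    requests is an iterable of (start_seconds, duration_seconds) pairs.
    """
    events = []
    for start, duration in requests:
        events.append((start, 1))              # request begins
        events.append((start + duration, -1))  # request ends
    # Tuples sort by time first; at equal times the -1 (end) sorts
    # before the +1 (start), so back-to-back requests do not overlap.
    events.sort()
    current = 0
    peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```
&lt;/pre&gt;
Run over a busy day's traffic, the result gives you a number to compare against your container's thread limit; the limit should sit comfortably above it.&lt;br/&gt;&lt;br/&gt;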

&lt;h2&gt;Logs &amp; Monitoring&lt;/h2&gt;
Paraphrasing my friend's daughter: "More data, more better!". Log as much as you can and archive logs. You never know when you'll need to search for an exception log and confirm that it started to appear 7 months ago, right after that particular Confluence upgrade.&lt;br/&gt;&lt;br/&gt;

What helped me on several occasions was having detailed access logs. I use the &lt;a href="http://httpd.apache.org/docs/1.3/logs.html#combined"&gt;Apache combined log format&lt;/a&gt; with an extra attribute: request duration in microseconds. This format will not only give you a good idea about your app's performance, but will also help you track various issues by logging the HTTP referer and user-agent headers. This can often be invaluable info!&lt;br/&gt;&lt;br/&gt;
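
To make use of the extra field, a log line can be pulled apart with a few lines of Python. This is only a sketch: it assumes that the duration is appended as the last field of each line (which is what Apache's &lt;tt&gt;%D&lt;/tt&gt; directive would produce at the end of a &lt;tt&gt;LogFormat&lt;/tt&gt;), and the function and field names are made up for this example. It does not handle quotes escaped inside the quoted fields.&lt;br/&gt;&lt;br/&gt;
&lt;pre&gt;
```python
import re

# Apache combined log format followed by one extra field:
# the request duration in microseconds (assumed to come last).
LOG_LINE = re.compile(
    r'(\S+) \S+ (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)" (\d+)$'
)


def parse_access_log_line(line):
    """Return a dict describing one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    host, user, timestamp, request, status, size, referer, agent, micros = match.groups()
    return {
        "host": host,
        "user": user,
        "timestamp": timestamp,
        "request": request,
        "status": int(status),
        "size": size,  # kept as text because Apache logs "-" for empty bodies
        "referer": referer,
        "user_agent": agent,
        "duration_ms": int(micros) / 1000.0,
    }
```
&lt;/pre&gt;
Feeding every line of a log file through &lt;tt&gt;parse_access_log_line&lt;/tt&gt; and skipping the &lt;tt&gt;None&lt;/tt&gt; results gives you a list of records that you can sort by &lt;tt&gt;duration_ms&lt;/tt&gt; to find your slowest requests.&lt;br/&gt;&lt;br/&gt;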

Here is a list of different types of logs you should be gathering: confluence log, web container log, jvm gc log and http access log.&lt;br/&gt;&lt;br/&gt;

To get in-depth information about your visitors, usage patterns and content, I suggest that you integrate your Confluence with a web analytics service like Google Analytics or Omniture. From their reports you can learn more about how, when and from where your users use your site.&lt;br/&gt;&lt;br/&gt;

When it comes to monitoring, your strategy should most certainly include JVM and JMX monitoring. Confluence as well as the JVM expose quite a few interesting metrics via JMX. You should know what these values look like throughout the day or week. Only then will you be able to efficiently troubleshoot issues when they occur (and they will occur!). The bare minimum includes: heap space usage, cpu usage, requests per 10 seconds, errors per 10 seconds and average request duration.&lt;br/&gt;&lt;br/&gt;
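
The request-based numbers in that list can be derived from request data alone. The following Python sketch groups requests into fixed 10 second windows; it is an illustration rather than a description of any real monitoring tool, and treating HTTP status 500 and above as "errors" is an assumption you may want to change.&lt;br/&gt;&lt;br/&gt;
&lt;pre&gt;
```python
from collections import defaultdict


def summarize_requests(requests, window_seconds=10):
    """Summarize (epoch_seconds, http_status, duration_ms) tuples per window.

    Returns a dict that maps the start of each window to its request
    count, error count and average request duration.
    """
    windows = defaultdict(lambda: {"requests": 0, "errors": 0, "total_ms": 0.0})
    for epoch_seconds, status, duration_ms in requests:
        start = int(epoch_seconds // window_seconds) * window_seconds
        bucket = windows[start]
        bucket["requests"] += 1
        if status >= 500:  # assumption: only server-side failures count as errors
            bucket["errors"] += 1
        bucket["total_ms"] += duration_ms
    return {
        start: {
            "requests": bucket["requests"],
            "errors": bucket["errors"],
            "avg_duration_ms": bucket["total_ms"] / bucket["requests"],
        }
        for start, bucket in sorted(windows.items())
    }
```
&lt;/pre&gt;
Charting these three values over a normal week gives you the baseline to compare against when something starts to look off.&lt;br/&gt;&lt;br/&gt;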

We have a custom monitoring app that allows us to gather, archive and analyze these JVM/JMX metrics, but there are also some open source tools of varying quality available (e.g. &lt;a href="http://munin-monitoring.org/"&gt;Munin&lt;/a&gt; looks promising).&lt;br/&gt;&lt;br/&gt;

The second part of our monitoring strategy is implemented as a local agent (we use &lt;a href="http://github.com/igorminar/satan"&gt;Satan&lt;/a&gt;) that closely monitors the JVM process and the app itself by checking that it's not running out of heap space, as well as by performing http health checks. If multiple failures are registered, the agent restarts the app and emails out an alert with a description of the failure. This allows us to sleep through the night without worrying that a pesky memory leak is going to take down our site. Fortunately, we haven’t seen any stability issues for a while now, but things were different in the past.&lt;br/&gt;&lt;br/&gt;
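
The "multiple failures before a restart" rule is the part that keeps a single slow response from bouncing the app. The Python sketch below shows the idea only; it is not Satan's actual code, and the class name and the threshold of three consecutive failures are hypothetical.&lt;br/&gt;&lt;br/&gt;
&lt;pre&gt;
```python
class FailureTracker:
    """Counts consecutive failed checks and signals when a restart is due."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.consecutive_failures = 0

    def record(self, check_passed):
        """Record one check result; return True when the app should be restarted."""
        if check_passed:
            # Any successful check counts as a recovery and resets the streak.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.max_failures:
            # Start counting afresh once a restart has been requested.
            self.consecutive_failures = 0
            return True
        return False
```
&lt;/pre&gt;
An agent would call &lt;tt&gt;record()&lt;/tt&gt; after every heap or http check and trigger the restart and the alert email only when it returns &lt;tt&gt;True&lt;/tt&gt;.&lt;br/&gt;&lt;br/&gt;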

The last part of our monitoring strategy is implemented as remote http agents. These periodically perform http health checks from various locations on the Internet and send out alerts when an issue is detected. This gives us good visibility into potential networking issues that wouldn't be caught by a local agent. There are several third-party solutions that you could use, or you can build your own (and host it across the globe on EC2).&lt;br/&gt;&lt;br/&gt;


&lt;h2&gt;DB&lt;/h2&gt;
The choice is up to you. Pick something supported by Atlassian or else you'll likely regret it. We use MySQL5 and for the most part we've been quite happy with it. Our db currently takes ~26GB, so be sure to account for gigabytes of db files and several times that for db backups. The biggest space suckers are attachments. Since a Confluence cluster can currently store attachments only in the database, you have to limit the attachment size, or else you'll likely end up with performance problems due to an overloaded db.&lt;br/&gt;&lt;br/&gt;

We limit attachment size to 5MB. There are several users that are not happy about that, but on the other hand, it helps people to realize that often a simple wiki page is a much better distribution medium than an OpenOffice document attached to a blank wiki page. I'd bet that our users would stick huge ISO images into our db if we allowed them to. My suggestion is to start with a low limit and increase it if there is a business justification for it. Maybe one day Confluence will support S3 or Google storage as the backend for attachments; until then, keep the size limit low.&lt;br/&gt;&lt;br/&gt;

The db should be hosted on a dedicated server with lots of RAM. I'm fortunate enough to have DBAs that take care of running the DB for me, so I don't have to worry about that part. A good DBA, MANY FAST disks (possibly SSD) and lots of RAM are the key ingredients of a well-performing db. Of course, make sure the latency between both Confluence nodes and the db server is minimal. Don't settle for anything slower than a 1GBit network, and locate the db within the same datacenter.&lt;br/&gt;&lt;br/&gt;

I mentioned ZFS before and I'll mention it again. If you put the db files that contain your Confluence database on a dedicated ZFS dataset (think volume), you'll be able to take snapshots of your db during upgrades or on the fly (you'll have to momentarily lock the db to do that) and then revert from these snapshots instantly when you need it. This is just awesome. :-)&lt;br/&gt;&lt;br/&gt;

If you are using MySQL5, your minimal &lt;tt&gt;my.cnf&lt;/tt&gt; should look like this:
&lt;pre&gt;
[mysqld]
default-storage-engine=innodb
default-table-type=innodb
default-character-set=utf8
default-collation=utf8_general_ci
max_allowed_packet=32M
&lt;/pre&gt;
The last setting will allow you to upload files (attachments, plugins, etc.) of up to 32MB into the db.&lt;br/&gt;&lt;br/&gt;
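
Because &lt;tt&gt;max_allowed_packet&lt;/tt&gt; and your Confluence attachment size limit have to agree, it can be worth checking one against the other. Here is a hedged Python sketch that does so for a simple &lt;tt&gt;my.cnf&lt;/tt&gt; like the one above; the function names are invented for this example, and the parser will not cope with directives such as &lt;tt&gt;!includedir&lt;/tt&gt;.&lt;br/&gt;&lt;br/&gt;
&lt;pre&gt;
```python
import configparser

# Size suffixes that MySQL accepts for options such as max_allowed_packet.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def size_to_bytes(value):
    """Convert a MySQL size such as '32M' or '1048576' to bytes."""
    value = value.strip().upper()
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def packet_limit_allows(my_cnf_text, attachment_limit_bytes):
    """Return True if max_allowed_packet exceeds the attachment size limit."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(my_cnf_text)
    packet_bytes = size_to_bytes(parser["mysqld"]["max_allowed_packet"])
    return packet_bytes > attachment_limit_bytes
```
&lt;/pre&gt;
With the configuration above and a 5MB attachment limit, &lt;tt&gt;packet_limit_allows(text, 5 * 1024 * 1024)&lt;/tt&gt; returns &lt;tt&gt;True&lt;/tt&gt;, which leaves room for the packet overhead on top of the file itself.&lt;br/&gt;&lt;br/&gt;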

&lt;h2&gt;Backups&lt;/h2&gt;
Your users will hate you if you lose any of their precious data, so don’t do it! The best way to avoid any data loss is to have a backup strategy in place. Ours is composed of several parts.&lt;br/&gt;&lt;br/&gt;

Config files are stored in our version control system, which is, surprise surprise, being backed up.&lt;br/&gt;&lt;br/&gt;

The Confluence home directory on our Confluence nodes is backed up only just before an upgrade, via a ZFS snapshot. All the files in there (except for the config files) can be rebuilt from the database, so I don’t worry about them.&lt;br/&gt;&lt;br/&gt;

The database is backed up nightly via a SQL dump, which is then backed up to tape. Additionally, just before an upgrade, we take a ZFS snapshot of the filesystem the db files reside on. This allows us to do instant rollbacks in case the upgrade fails. I once experienced a situation where it took us hours to roll back from a SQL dump. It’s slooow. Since then we've switched to ZFS snapshots.&lt;br/&gt;&lt;br/&gt;
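
The snapshot and rollback steps map onto two &lt;tt&gt;zfs&lt;/tt&gt; subcommands. The Python sketch below only builds the command lines and does not run anything; the helper names are made up, and the dataset name &lt;tt&gt;tank/mysql&lt;/tt&gt; and the snapshot label used in the note that follows are hypothetical.&lt;br/&gt;&lt;br/&gt;
&lt;pre&gt;
```python
def snapshot_name(dataset, label):
    """Build the dataset@label name that zfs uses to identify a snapshot."""
    return "%s@%s" % (dataset, label)


def snapshot_command(dataset, label):
    """Command line that takes a snapshot, e.g. right before an upgrade."""
    return ["zfs", "snapshot", snapshot_name(dataset, label)]


def rollback_command(dataset, label):
    """Command line that reverts the dataset to the given snapshot."""
    return ["zfs", "rollback", snapshot_name(dataset, label)]
```
&lt;/pre&gt;
For example, &lt;tt&gt;snapshot_command("tank/mysql", "pre-upgrade")&lt;/tt&gt; yields &lt;tt&gt;zfs snapshot tank/mysql@pre-upgrade&lt;/tt&gt;, which you could hand to &lt;tt&gt;subprocess.run&lt;/tt&gt; on a host with ZFS. Keep in mind that a plain &lt;tt&gt;zfs rollback&lt;/tt&gt; only goes back to the most recent snapshot, and that the database should be stopped before you roll back its files.&lt;br/&gt;&lt;br/&gt;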

The database is really the master storage of all the Confluence data, so in addition to all the backups, we also run a redundant (remember “odd” and “even”?) db server, to which the master database is replicated on the fly via MySQL master/slave replication. During an upgrade we now also stop the replication, so that we can use the slave right away if something happens to the master during the upgrade and we can't use ZFS to roll back.&lt;br/&gt;&lt;br/&gt;

As if that was not enough, there is one more layer that allows users to recover from user errors in a fine-grained manner: Confluence wiki page versioning and the wiki space trash. The combination of these two features enables users to undo most editing mistakes on their own, without bothering site administrators (I’ll talk more about delegation in chapter VI of the guide).&lt;br/&gt;&lt;br/&gt;

There is also a built-in Confluence backup mechanism, but it works well only for small instances. This backup process is resource intensive, generates lots of data and, if I remember correctly, breaks once you reach a certain size. Don't use it. You'll have to explicitly disable it via the Confluence Admin UI.&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;Prod, Test, Dev Environments&lt;/h2&gt;
Your ability to experiment in the production environment decreases as the number of users grows. For this reason, you'll need to build a Test environment that closely matches your production environment. Here you can practice your Confluence upgrade, or run automated tests just before a release. If you are doing Confluence core or plugin development, you'll also need a dev environment. This one can be a simplified and scaled down version of production (e.g. you can forgo clustering) and should be conveniently located on your dev machine or server.&lt;br/&gt;&lt;br/&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;
If you follow my advice, you should now have an infrastructure that will help you run your Confluence site in a performant, scalable and reliable way. If you found something important missing, feel free to post your suggestions as comments.&lt;br/&gt;&lt;br/&gt;

In the next chapter of this guide we'll look at the &lt;a href="http://blog.igorminar.com/2010/07/dgc-ii-jvm-tuning.html"&gt;JVM tuning&lt;/a&gt;.&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/6ZKEh4XNylo" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/4054319096510216382/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=4054319096510216382" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/4054319096510216382?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/4054319096510216382?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/6ZKEh4XNylo/dgc-i-infrastructure.html" title="DGC I: The Infrastructure" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.igorminar.com/2010/07/dgc-i-infrastructure.html</feedburner:origLink></entry><entry gd:etag="W/&quot;C0MERHY_fCp7ImA9Wx5UGUs.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-6481863679712318374</id><published>2010-07-25T11:12:00.000-07:00</published><updated>2010-10-24T16:03:25.844-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T16:03:25.844-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Sun" /><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category 
scheme="http://www.blogger.com/atom/ns#" term="SunWikis" /><title>DevOps Guide to Confluence (DGC)</title><content type="html">After working with &lt;a href="http://www.atlassian.com/software/confluence/"&gt;Atlassian Confluence&lt;/a&gt; for 3 years, running one of the bigger public Confluence installations, I realized that there is a major lack of information about how to run Confluence on a larger scale and outside of the intranet firewalls. I'm hoping that I can improve this situation with a blog series that will describe some of the (best?) practices that I implemented while running, tweaking, patching and supporting our Confluence-based site.&lt;br/&gt;&lt;br/&gt;

Just to throw out some numbers to give context of what I mean by "relatively large":
&lt;ul&gt;
&lt;li&gt;# registered users: 180k+&lt;/li&gt;&lt;li&gt;# contributing users: 7k+&lt;/li&gt;
&lt;li&gt;# wiki spaces: 1.5k+&lt;/li&gt;
&lt;li&gt;# wiki pages: 65k+&lt;/li&gt;&lt;li&gt;# page revisions: 570k+&lt;/li&gt;
&lt;li&gt;# comments: 10k+&lt;/li&gt;&lt;li&gt;# visits per month: ~300k&lt;/li&gt;&lt;li&gt;# page views per month: ~800k&lt;/li&gt;
&lt;li&gt;# http requests per day: ~1m+ (includes crawlers and users with disabled JavaScript)&lt;/li&gt;
&lt;/ul&gt;

So I'm not talking about a huge site like Amazon, Twitter, etc., but it's still bigger than most of the public-facing Confluence instances out there.&lt;br/&gt;&lt;br/&gt;

Some of the practices described in this guide might be overkill for smaller deployments, so I’ll leave it up to you to pick the right ones for you and your environment.&lt;br/&gt;&lt;br/&gt;

There are many aspects that need careful consideration if you want to go relatively big, and there are even more of them when you run your site on the Internet as opposed to doing it internally within an organization. In my blog series I'm going to focus on these areas that I consider important:&lt;br/&gt;&lt;br/&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://blog.igorminar.com/2010/07/dgc-i-infrastructure.html"&gt;The infrastructure&lt;/a&gt;
  &lt;ul&gt;
  &lt;li&gt;Confluence cluster&lt;/li&gt;&lt;li&gt;hardware (cpu,memory,disk)&lt;/li&gt;
  &lt;li&gt;os, filesystem&lt;/li&gt;&lt;li&gt;web container&lt;/li&gt;
  &lt;li&gt;logs &amp;amp; monitoring&lt;/li&gt;
  &lt;li&gt;network&lt;/li&gt;&lt;li&gt;db&lt;/li&gt;
  &lt;li&gt;backups&lt;/li&gt;
  &lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.igorminar.com/2010/07/dgc-ii-jvm-tuning.html"&gt;The JVM tuning&lt;/a&gt;
  &lt;ul&gt;
  &lt;li&gt;heap&lt;/li&gt;
  &lt;li&gt;garbage collector&lt;/li&gt;
  &lt;li&gt;fancy switches&lt;/li&gt;
  &lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.igorminar.com/2010/07/dgc-iii-confluence-configuration-and.html"&gt;Confluence configuration and tuning&lt;/a&gt;
  &lt;ul&gt;
  &lt;li&gt;where and how to configure Confluence&lt;/li&gt;
  &lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.igorminar.com/2010/07/dgc-iv-confluence-upgrades.html"&gt;Confluence upgrades&lt;/a&gt;
  &lt;ul&gt;
  &lt;li&gt;release history and track record&lt;/li&gt;
  &lt;li&gt;following announcements and getting support&lt;/li&gt;
  &lt;li&gt;the upgrade procedure&lt;/li&gt;
  &lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.igorminar.com/2010/08/dgc-v-customizing-and-patching.html"&gt;Customizing and patching Confluence&lt;/a&gt;
  &lt;ul&gt;
  &lt;li&gt;customization options&lt;/li&gt;
  &lt;li&gt;reasons for patching&lt;/li&gt;
  &lt;li&gt;Mercurial Queues as patch management tool&lt;/li&gt;
  &lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.igorminar.com/2010/08/dgc-vi-wiki-organization-and-working.html"&gt;Wiki Organization and Working with the Community&lt;/a&gt;
  &lt;ul&gt;
  &lt;li&gt;main principles&lt;/li&gt;
  &lt;li&gt;global and default permissions&lt;/li&gt;
  &lt;li&gt;delegating decision-making to space admins&lt;/li&gt;
  &lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Internet-facing deployment and operation
  &lt;ul&gt;
  &lt;li&gt;Varnish or other caching reverse proxy&lt;/li&gt;
  &lt;li&gt;robots.txt&lt;/li&gt;
  &lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;&lt;br/&gt;

I'm not going to go into details about why to pick Confluence or why not to pick it. I really just want to focus on how to make it run smoothly and reliably while serving a relatively large audience of users (and robots).&lt;br/&gt;&lt;br/&gt;

Given that we want to run a site on the Internet, we are lucky to have &lt;a href="http://twitter.com/willsnow/status/17418065523"&gt;well defined maintenance windows&lt;/a&gt; that we can work with. This means that any downtime will be perceived by at least a portion of your users as your failure, and the only way you can avoid looking like an idiot is to keep the downtime to the absolute minimum.&lt;br/&gt;&lt;br/&gt;

You are now probably thinking that a Confluence cluster will solve all your problems with scalability and reliability.&lt;br/&gt;&lt;br/&gt;

&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_p64VvtgZDp4/TEyAdMzbz3I/AAAAAAAAAdw/vWVIgNphT3c/s1600/cluster-BS.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 74px;" src="http://2.bp.blogspot.com/_p64VvtgZDp4/TEyAdMzbz3I/AAAAAAAAAdw/vWVIgNphT3c/s400/cluster-BS.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5497910484254052210" /&gt;&lt;/a&gt;&lt;br/&gt;

Right, that's what the marketing people tell you. Anyone who knows a thing or two about software engineering knows that there is no such thing as "unlimited scalability", and ironically a Confluence cluster can hit several bottlenecks quite quickly in certain situations. That said, a Confluence cluster with all its pros and cons is really the way to go big with Confluence, but you should have realistic expectations about its scalability and reliability.&lt;br/&gt;&lt;br/&gt;

What makes things even more difficult is that if you do things right, your wiki is going to take off. More users, more content, more traffic, more spam, more crawlers, more users unhappy about any kind of downtime... Growth is what you need to take into account from day one. I'm not saying that you have to start big; you just shouldn't paint yourself into a corner, and I'm going to mention some tips on how to avoid just that.&lt;br/&gt;&lt;br/&gt;

I was inspired to write up this guide after watching &lt;a href="http://www.atlassian.com/summit/2010/presentations/development-speed/performance-tuning-application-development.jsp"&gt;George Barnett’s presentation&lt;/a&gt; from this year’s Atlassian Summit. George made some really good points and I encourage you to watch his talk. My guide will not focus just on performance and scalability, but also on reliability, smooth day-to-day operation and more.&lt;br/&gt;&lt;br/&gt;

Continue reading: &lt;a href="http://blog.igorminar.com/2010/07/dgc-i-infrastructure.html"&gt;DGC I: The Infrastructure&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/9D3WNBlEL6k" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/6481863679712318374/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=6481863679712318374" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/6481863679712318374?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/6481863679712318374?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/9D3WNBlEL6k/devops-guide-to-confluence-dgc.html" title="DevOps Guide to Confluence (DGC)" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://2.bp.blogspot.com/_p64VvtgZDp4/TEyAdMzbz3I/AAAAAAAAAdw/vWVIgNphT3c/s72-c/cluster-BS.jpg" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://blog.igorminar.com/2010/07/devops-guide-to-confluence-dgc.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkYFSX8_eyp7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-2735094712954971294</id><published>2010-05-31T18:24:00.000-07:00</published><updated>2010-10-24T18:28:38.143-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:28:38.143-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" 
term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Ruby" /><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="Solaris" /><title>Improving Satan and Solaris SMF</title><content type="html">One of the features of Solaris that we heavily rely on in our production environment at work is &lt;a href="http://en.wikipedia.org/wiki/Service_Management_Facility"&gt;Service Management Facility&lt;/a&gt; or SMF for short. SMF can start/stop/restart services, track dependencies between services and use that to optimize the boot process and lots more. Often handy in production environment is that SMF keeps track of processes that a particular service started and if a process dies, SMF restarts its services.&lt;br/&gt;
&lt;br/&gt;
One gripe I have with SMF is that its process monitoring capabilities are rather simple. A process associated with a contract (service) must die in order for SMF to get the idea that something is wrong and that the service should be restarted. In practice, more often than not a process gets into a weird state that prevents it from working properly, yet it doesn't die. Failures might include excessive cpu or memory usage or even application level failures that can be detected only by interacting with the application (e.g. via an http health check). SMF in its current implementation is incapable of detecting these failures. And this is where Satan comes into play.&lt;br/&gt;
&lt;br/&gt;
&lt;a href="http://letsgetdugg.com/2010/01/05/relax-satan-is-on-your-side/"&gt;Satan&lt;/a&gt; is a small Ruby script that monitors a process and, following the &lt;a href="http://en.wikipedia.org/wiki/Crash-only_software"&gt;Crash-only Software&lt;/a&gt; philosophy, kills it when a problem is detected. It then relies on SMF to detect the process death(s) and restart the given service. I fell in love with the simplicity of Satan (which was inspired by &lt;a href="http://god.rubyforge.org/"&gt;God&lt;/a&gt;) and started exploring the feasibility of using it to improve the reliability of SMF on our production servers.&lt;br/&gt;
&lt;br/&gt;
Upon a code review of the script, I noticed several things that I wished were implemented differently. Here are some:
&lt;ul&gt;
&lt;li&gt;Satan watches processes rather than services as defined via SMF&lt;/li&gt;
&lt;li&gt;One Satan instance is designed to watch many different processes for different services, which adds unnecessary complexity and lacks isolation&lt;/li&gt;
&lt;li&gt;Satan is merciless (what a surprise! :-) ) and uses &lt;code&gt;kill -9&lt;/code&gt; without a warning&lt;/li&gt;
&lt;li&gt;Satan has no test suite!!! :-( (i.e. I must presume that it doesn't work)&lt;/li&gt;
&lt;/ul&gt;&lt;br/&gt;&lt;br/&gt;

Thankfully the source code was &lt;a href="http://github.com/victori/satan"&gt;out there&lt;/a&gt; on GitHub and licensed under the BSD license, so it was just a matter of a few keystrokes to fork it (open source FTW!). By the time I was done with my changes, there wasn't much of the original source code left, but oh well :-)&lt;br/&gt;&lt;br/&gt;

I'm happy to present to you &lt;a href="http://github.com/IgorMinar/satan"&gt;http://github.com/IgorMinar/satan&lt;/a&gt; for review and comments. The main changes I made are the following:
&lt;ul&gt;
&lt;li&gt;One Satan instance watches a &lt;i&gt;single&lt;/i&gt; SMF service and its one or more processes&lt;/li&gt;
&lt;li&gt;The single-service design allows for automatic monitoring suspension via SMF dependencies while the monitored service is being started, restarted or disabled&lt;/li&gt;
&lt;li&gt;Several bugfixes around how rule failures and recoveries are counted before a service is deemed unhealthy&lt;/li&gt;
&lt;li&gt;At first Satan tries to invoke &lt;code&gt;svcadm restart&lt;/code&gt;, and only if the restart doesn't occur within a specified grace period does it use &lt;code&gt;kill -9&lt;/code&gt; to kill all processes for the given contract (service)&lt;/li&gt;
&lt;li&gt;Satan now has a decent &lt;a href="http://www.rspec.info/"&gt;RSpec&lt;/a&gt; test suite (more on that in my &lt;a href="http://blog.igorminar.com/2010/05/testing-matters-even-with-shell-scripts.html"&gt;previous post&lt;/a&gt;)&lt;/li&gt;&lt;li&gt;Improved HTTP condition with a timeout setting&lt;/li&gt;
&lt;li&gt;New JVM free heap space condition to monitor those pesky JVM memory leaks&lt;/li&gt;
&lt;li&gt;Extensible design now allows for new monitoring conditions (rules) to be defined outside of the main Satan source code&lt;/li&gt;
&lt;/ul&gt;
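&lt;br/&gt;
To make the restart-then-kill escalation from the list above more concrete, here is a minimal sketch of the idea. It is not the code from my fork: the &lt;code&gt;heal&lt;/code&gt; method, its parameters, the service name and the contract id are all hypothetical, the "did the restart happen" check is left to the caller, and I'm assuming that &lt;code&gt;pkill -c&lt;/code&gt; is used to target a process contract.&lt;br/&gt;&lt;br/&gt;
&lt;pre&gt;
```ruby
# Hypothetical helper, not Satan's actual code: ask SMF for a restart first
# and fall back to kill -9 on the contract only if the restart never shows up.
def heal(fmri:, contract_id:, checks:, pause:, run:, restarted:)
  run.call("svcadm restart #{fmri}")
  # Grace period: poll a caller-supplied check a fixed number of times.
  checks.times do
    return :restarted if restarted.call
    sleep(pause)
  end
  # Assumption: pkill -c selects processes by process contract id.
  run.call("pkill -9 -c #{contract_id}")
  :killed
end

# Dry run: record the commands instead of executing them.
commands = []
result = heal(fmri: 'svc:/site/myapp:default', # made-up service name
              contract_id: 123,                # made-up contract id
              checks: 3,
              pause: 0,
              run: lambda { |cmd| commands.push(cmd) },
              restarted: lambda { false })     # pretend the restart never happens
puts result
puts commands
```
&lt;/pre&gt;&lt;br/&gt;
Passing the command runner in as a lambda keeps the escalation logic testable without touching a real service.&lt;br/&gt;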

As always there are more things to improve and extend but, I'm hoping that my Satan fork will be a decent version that will allow us to keep our services running more reliably. If you have suggestions, or comments feel free to leave feedback.&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/LYzmKuVomYI" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/2735094712954971294/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=2735094712954971294" title="4 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2735094712954971294?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2735094712954971294?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/LYzmKuVomYI/improving-satan-and-solaris-smf.html" title="Improving Satan and Solaris SMF" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>4</thr:total><feedburner:origLink>http://blog.igorminar.com/2010/05/improving-satan-and-solaris-smf.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkQDQHwyfip7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-3959623257434633150</id><published>2010-05-31T11:24:00.000-07:00</published><updated>2010-10-24T18:32:51.296-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:32:51.296-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Ruby" 
/><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><title>Testing matters, even with shell scripts</title><content type="html">A few months ago, we migrated our production environment at work from Solaris 10 to OpenSolaris. We loved the change because it allowed us to take advantage of the latest inventions in Solaris land. All was good and dandy until one day one of our servers ran out of disk space and died. WTH? We have monitoring scripts that alert us long before we get even close to running out of space, yet no alert was issued this time. While investigating the cause of this incident, we found out that our monitoring scripts, which work well on Solaris 10, didn't monitor the disk space correctly on OpenSolaris. When I asked our sysadmins whether they had any tests for their scripts that could validate their functionality, they laughed at me.&lt;br/&gt;&lt;br/&gt;

Fast forward a few months. A few days ago I started looking at &lt;a href="http://letsgetdugg.com/2010/01/05/relax-satan-is-on-your-side/"&gt;Satan&lt;/a&gt;, to augment the self-healing capabilities of &lt;a href="http://en.wikipedia.org/wiki/Service_Management_Facility"&gt;Solaris SMF&lt;/a&gt; (think initd or launchd on steroids). At first sight I loved the simplicity of the solution, but one thing that startled me during the code review was that there were no tests for the code, except for some helper scripts that made manual testing a bit less painful. At the same time, I spotted several bugs that would have resulted in unwanted behavior.&lt;br/&gt;&lt;br/&gt;

Satan relies on invoking Solaris commands from Ruby, parsing the output and acting upon it. Thanks to its no-BS nature, Ruby makes for an excellent choice when it comes to writing programs that interact with the OS by executing commands. There are &lt;a href="http://blog.jayfields.com/2006/06/ruby-kernel-system-exec-and-x.html"&gt;several ways&lt;/a&gt; to do this, but the most popular looks like this:
&lt;pre&gt;
ps_output = `ps -o pid,pcpu,rss,args -p #{pid}`
&lt;/pre&gt;&lt;br/&gt;

All you need to do is to stick the command into backticks and optionally use &lt;code&gt;#{variable}&lt;/code&gt; for variable expansion. To get a hold of the output, just assign the return value to a variable.&lt;br/&gt;&lt;br/&gt;

Now if you stick a piece of code like this in the middle of a Ruby script, you get something next to untestable:
&lt;pre&gt;
module PsParser
 def ps(pid)
   out_raw = `ps -o pid,pcpu,rss,args -p #{pid}`
   out = out_raw.split(/\n/)[1].split(/ /).delete_if {|arg| arg == "" or arg.nil? }
   { :pid=&gt;out[0].to_i,
     :cpu=&gt;out[1].to_i,
     :rss=&gt;out[2].to_i*1024,
     :command=&gt;out[3..out.size].join(' ') }
 end
end
&lt;/pre&gt;&lt;br/&gt;

With the code structured (or unstructured) like this, you'll never be able to test if the code can parse the output correctly. However, if you extract the command execution into a separate method call:
&lt;pre&gt;
module PsParser
 def ps(pid)
   out = ps_for_pid(pid).split(/\n/)[1].split(/ /).delete_if {|arg| arg == "" or arg.nil? }
   { :pid=&gt;out[0].to_i,
     :cpu=&gt;out[1].to_i,
     :rss=&gt;out[2].to_i*1024,
     :command=&gt;out[3..out.size].join(' ') }
 end

 private
 def ps_for_pid(pid)
   `ps -o pid,pcpu,rss,args -p #{pid}`
 end
end
&lt;/pre&gt;&lt;br/&gt;

You can now open the module and redefine the &lt;code&gt;ps_for_pid&lt;/code&gt; in your tests like this:
&lt;pre&gt;
require 'ps_parser'

PS_OUT = {
 1 =&gt; "  PID %CPU    RSS ARGS
12790   2.7 707020 java",
 2 =&gt; "  PID %CPU    RSS ARGS
12791  92.7 107020 httpd"
}

module PsParser
 def ps_for_pid(pid)
   PS_OUT[pid]
 end
end
&lt;/pre&gt;&lt;br/&gt;

And now you can simply call the &lt;code&gt;ps&lt;/code&gt; method and check if the fake output stored in &lt;code&gt;PS_OUT&lt;/code&gt; is being parsed correctly. The concept is the same as when mocking webservices or other complex classes, but applied to running system commands and programs.&lt;br/&gt;&lt;br/&gt;
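Put together, such a test could look roughly like the sketch below. It's plain Ruby with bare &lt;code&gt;raise&lt;/code&gt; checks rather than RSpec, so that it runs on its own. The canned &lt;code&gt;ps&lt;/code&gt; output (including the &lt;code&gt;-k start&lt;/code&gt; arguments) is made up, and I use a bare &lt;code&gt;split&lt;/code&gt; as a shorthand for splitting on spaces and dropping the empty strings.&lt;br/&gt;&lt;br/&gt;
&lt;pre&gt;
```ruby
# The parser under test, with the command execution extracted into ps_for_pid.
module PsParser
  def ps(pid)
    # Skip the header line, then split the data row on runs of whitespace.
    out = ps_for_pid(pid).split(/\n/)[1].split
    { pid: out[0].to_i,
      cpu: out[1].to_i,
      rss: out[2].to_i * 1024,
      command: out[3..-1].join(' ') }
  end

  private

  def ps_for_pid(pid)
    `ps -o pid,pcpu,rss,args -p #{pid}`
  end
end

# Canned output keyed by a fake pid (all values are made up for the test).
PS_OUT = {}
PS_OUT[1] = "  PID %CPU    RSS ARGS\n12790   2.7 707020 java"
PS_OUT[2] = "  PID %CPU    RSS ARGS\n12791  92.7 107020 httpd -k start"

# Reopen the module and swap the real command for a lookup in PS_OUT.
module PsParser
  def ps_for_pid(pid)
    PS_OUT[pid]
  end
end

parser = Object.new.extend(PsParser)
parsed = parser.ps(2)
raise 'pid not parsed' unless parsed[:pid] == 12791
raise 'command not parsed' unless parsed[:command] == 'httpd -k start'
puts 'ps output parsed as expected'
```
&lt;/pre&gt;&lt;br/&gt;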

To conclude, what makes you more confident about software you want to rely on? An empty test folder: &lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_p64VvtgZDp4/TAP8glKXB-I/AAAAAAAAAYQ/Us1EMXP_o_g/s1600/satan-test-MIA.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 244px; border: none" src="http://3.bp.blogspot.com/_p64VvtgZDp4/TAP8glKXB-I/AAAAAAAAAYQ/Us1EMXP_o_g/s400/satan-test-MIA.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5477499208474232802" /&gt;&lt;/a&gt;
Or all green results from a test/spec suite? &lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_p64VvtgZDp4/TAP7nzoGgmI/AAAAAAAAAYI/dd3eXbJOJAo/s1600/satan-rspec.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 132px; border: none" src="http://3.bp.blogspot.com/_p64VvtgZDp4/TAP7nzoGgmI/AAAAAAAAAYI/dd3eXbJOJAo/s400/satan-rspec.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5477498233104532066" /&gt;&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/8gwTG3XLbxg" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/3959623257434633150/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=3959623257434633150" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/3959623257434633150?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/3959623257434633150?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/8gwTG3XLbxg/testing-matters-even-with-shell-scripts.html" title="Testing matters, even with shell scripts" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://3.bp.blogspot.com/_p64VvtgZDp4/TAP8glKXB-I/AAAAAAAAAYQ/Us1EMXP_o_g/s72-c/satan-test-MIA.png" height="72" width="72" 
/><thr:total>2</thr:total><feedburner:origLink>http://blog.igorminar.com/2010/05/testing-matters-even-with-shell-scripts.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkIFQX84cSp7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-1993805800487624943</id><published>2009-12-21T12:30:00.000-08:00</published><updated>2010-10-24T18:35:10.139-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:35:10.139-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="Glassfish" /><title>Configuring Common Access Log Format in GlassFish v2 and v3</title><content type="html">For a long time &lt;a href="http://blogs.sun.com/lowbit/"&gt;Matthew&lt;/a&gt; and I had a dilemma about changing the non-standard access log format used by &lt;a href="https://glassfish.dev.java.net/"&gt;GlassFish&lt;/a&gt; v2 and v3, to the commonly used &lt;a href="http://httpd.apache.org/docs/1.3/logs.html#common"&gt;common&lt;/a&gt; or &lt;a href="http://httpd.apache.org/docs/1.3/logs.html#combined"&gt;combined format&lt;/a&gt; used by Apache.&lt;br/&gt;&lt;br/&gt;

GlassFish does allow one to specify the access log format, but how this works is not obvious. If one tries to create a formatting string that should result in one of the Apache access log formats, the resulting output does contain all the specified fields and in the right order, but the field delimiters are not preserved from the formatting string and instead all the fields are quoted and separated by spaces. That's not quite what we want, especially if you plan to feed the logs into a log analyzer that expects the usual Apache syntax.&lt;br/&gt;&lt;br/&gt;

While getting &lt;a href="http://blog.igorminar.com/2009/12/running-confluence-on-glassfish-v3.html"&gt;Confluence wiki to run on GlassFish v3&lt;/a&gt;, I fetched the GF source code and since I already had it, I thought that it should be trivial to find out how the Access log format gets processed in GF.&lt;br/&gt;&lt;br/&gt;

To my big surprise, I found out that there are classes with very suspicious names: &lt;code&gt;CommonAccessLogFormatterImpl&lt;/code&gt;, &lt;code&gt;CombinedAccessLogFormatterImpl&lt;/code&gt; and &lt;code&gt;DefaultAccessLogFormatterImpl&lt;/code&gt;. A minute later I also found this piece of code "hidden" in &lt;code&gt;PEAccessLogValve&lt;/code&gt;:
&lt;pre&gt;
    // Predefined patterns
   private static final String COMMON_PATTERN = "common";
   private static final String COMBINED_PATTERN = "combined";


   ...
   ...


   /**
    * Set the format pattern, first translating any recognized alias.
    *
    * @param p The new pattern
    */
   public void setPattern(String p) {
       if (COMMON_PATTERN.equalsIgnoreCase(p)) {
           formatter = new CommonAccessLogFormatterImpl();
       } else if (COMBINED_PATTERN.equalsIgnoreCase(p)) {
           formatter = new CombinedAccessLogFormatterImpl();
       } else {
           formatter = new DefaultAccessLogFormatterImpl(p, getContainer());
       }
   }
&lt;/pre&gt;&lt;br/&gt;&lt;br/&gt;

Whoa! So both Apache formats are implemented already and one just needs to know how to "unlock" them. The "common" and "combined" constants looked like the magic keywords to do just that, and sure enough, when one sets either of them as the formatting string, the log will contain the expected output.&lt;br/&gt;&lt;br/&gt;

&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_p64VvtgZDp4/Sy_MIsUSfyI/AAAAAAAAAQM/WgXPNXpOR2g/s1600-h/common+log+format.png"&gt;&lt;img style="border: medium none ; margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 312px;" src="http://2.bp.blogspot.com/_p64VvtgZDp4/Sy_MIsUSfyI/AAAAAAAAAQM/WgXPNXpOR2g/s400/common+log+format.png" alt="" id="BLOGGER_PHOTO_ID_5417773326457274146" border="0" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;

You can also use &lt;code&gt;asadmin&lt;/code&gt; to make this config change: 
&lt;pre&gt;
asadmin set server.http-service.access-log.format="combined"
&lt;/pre&gt;&lt;br/&gt;

After a restart the log now uses the requested format:
&lt;pre&gt;
0:0:0:0:0:0:0:1%0 - - [21/Dec/2009:07:42:45 -0800] "GET /s/1722/3/_/images/icons/star_grey.gif HTTP/1.1" 304 0
0:0:0:0:0:0:0:1%0 - - [21/Dec/2009:07:42:45 -0800] "GET /images/icons/add_space_32.gif HTTP/1.1" 304 0
0:0:0:0:0:0:0:1%0 - - [21/Dec/2009:07:42:45 -0800] "GET /images/icons/feed_wizard.gif HTTP/1.1" 304 0
0:0:0:0:0:0:0:1%0 - - [21/Dec/2009:07:42:45 -0800] "GET /images/icons/people_directory_32.gif HTTP/1.1" 304 0
0:0:0:0:0:0:0:1%0 - - [21/Dec/2009:07:42:45 -0800] "GET /s/1722/3/_/images/icons/add_12.gif HTTP/1.1" 304 0
&lt;/pre&gt;&lt;br/&gt;&lt;br/&gt;

Believe it or not, this information is not documented anywhere in the official documentation, and even folks Matthew chatted with on Sun's internal support mailing lists had no clue about it. Ideally the GF documentation and UI should be updated to make changing the access log format as simple as it should be.&lt;br/&gt;&lt;br/&gt;

Oh btw, open source FTW, when documentation is lacking one can at least read the sources!&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/RTwv0GRdsUA" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/1993805800487624943/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=1993805800487624943" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/1993805800487624943?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/1993805800487624943?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/RTwv0GRdsUA/configuring-common-access-log-format-in.html" title="Configuring Common Access Log Format in GlassFish v2 and v3" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://2.bp.blogspot.com/_p64VvtgZDp4/Sy_MIsUSfyI/AAAAAAAAAQM/WgXPNXpOR2g/s72-c/common+log+format.png" height="72" width="72" /><thr:total>2</thr:total><feedburner:origLink>http://blog.igorminar.com/2009/12/configuring-common-access-log-format-in.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkIARX88cSp7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-5261615407711755181</id><published>2009-12-20T19:31:00.000-08:00</published><updated>2010-10-24T18:35:44.179-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:35:44.179-07:00</app:edited><category 
scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="Glassfish" /><category scheme="http://www.blogger.com/atom/ns#" term="SunWikis" /><title>Running Confluence on GlassFish v3</title><content type="html">Even though Atlassian considers &lt;a href="https://glassfish.dev.java.net/"&gt;GlassFish&lt;/a&gt; to be an unsupported servlet container for &lt;a href="http://www.atlassian.com/software/confluence/"&gt;Confluence&lt;/a&gt;, it is quite easy to use Confluence with GlassFish v2.1. In fact that's the container that I've been using for a long time during my Confluence and Confluence plugin development.&lt;br/&gt;&lt;br/&gt;

I've been monitoring progress of GlassFish v3 development for several months and noticed that at some point Confluence 2.x and 3.0.x &lt;a href="https://glassfish.dev.java.net/issues/show_bug.cgi?id=8537"&gt;stopped working&lt;/a&gt; due to conflicts between different versions of Apache Felix used by both GFv3 and Confluence.&lt;br/&gt;&lt;br/&gt;

Fortunately Confluence 3.1 now contains Felix v2.x (an upgrade from 1.x), which solves the previously mentioned issues. Excited about the change, I tried to deploy Confluence 3.1 to GFv3 (final) and observed that there are a few more issues that one needs to deal with. I filed these two bugs and one RFE against Confluence and provided patches that anyone can use to get Confluence to run with GFv3:&lt;ul&gt;&lt;li&gt;&lt;a href="http://jira.atlassian.com/browse/SER-144"&gt;SER-144 - Illegal cyclic dependency between LoginFilter and SecurityFilter&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://jira.atlassian.com/browse/CONF-18093"&gt;CONF-18093 - DefaultUserProfileService tries to modify an unmodifiable collection (PATCH)&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://jira.atlassian.com/browse/CONF-18094"&gt;CONF-18094 - Bundle sun-web.xml with classloader settings required for GlassFish (PATCH)&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br/&gt;&lt;br/&gt;

Once the patches are applied to the Confluence source code, build a war file in the usual way, deploy it to GFv3 and you should be good to go. (There is a harmless exception thrown when Confluence starts, &lt;a href="https://glassfish.dev.java.net/issues/show_bug.cgi?id=11341"&gt;more info&lt;/a&gt;, just ignore it)&lt;br/&gt;&lt;br/&gt;

Oh, and be sure to vote for &lt;a href="http://jira.atlassian.com/browse/CONF-6603"&gt;CONF-6603&lt;/a&gt; to get Atlassian to officially support GlassFish.&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/fZPrLeJwynU" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/5261615407711755181/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=5261615407711755181" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/5261615407711755181?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/5261615407711755181?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/fZPrLeJwynU/running-confluence-on-glassfish-v3.html" title="Running Confluence on GlassFish v3" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><feedburner:origLink>http://blog.igorminar.com/2009/12/running-confluence-on-glassfish-v3.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkEEQXY-fyp7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-1312389829890063746</id><published>2009-11-14T14:05:00.000-08:00</published><updated>2010-10-24T18:36:40.857-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:36:40.857-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Java" /><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><title>Using Mercurial Bisect to Find Bugs</title><content 
type="html">Yesterday I tried to find a regression bug in &lt;a href="https://grizzly.dev.java.net/"&gt;Grizzly&lt;/a&gt; that was preventing &lt;a href="http://grizzly-sendfile.kenai.com/"&gt;grizzly-sendfile&lt;/a&gt; from using blocking IO. I knew that the bug was not present in grizzly 1.9.15, but somewhere between that release and the current head someone introduced a changeset that broke things for me. Here is how I found out who that person was.&lt;br/&gt;&lt;br/&gt;

Grizzly is unfortunately still stuck with Subversion, so the only thing (besides complaining) that I can do to make my life easier is to convert the grizzly svn repo to some sane SCM, such as Mercurial. I used &lt;a href="http://pypi.python.org/pypi/hgsvn"&gt;hgsvn&lt;/a&gt; to convert the svn repo.&lt;br/&gt;&lt;br/&gt;

Once I had a mercurial repo, I wrote BlockingIoAsyncTest - a JUnit test for the bug. And that was all I needed to run &lt;a href="http://mercurial.selenic.com/wiki/BisectExtension"&gt;bisect&lt;/a&gt;:
&lt;pre&gt;
$ echo '#!/bin/bash
mvn clean test -Dtest=com.sun.grizzly.http.BlockingIoAsyncTest' &gt; test.sh
$ chmod +x test.sh
$ hg bisect --reset #clean repo from any previous bisect run
$ hg bisect --good 375 #specify the last good revision
$ hg bisect --bad tip #specify a known bad revision
Testing changeset 604:82e43b848ae7 (458 changesets remaining, ~8 tests)
517 files updated, 0 files merged, 158 files removed, 0 files unresolved
$ hg bisect --command ./test.sh #run the automated bisect
...
(output from the test)
...
...
...

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 9 seconds
[INFO] Finished at: Sat Nov 14 11:41:07 PST 2009
[INFO] Final Memory: 25M/79M
[INFO] ------------------------------------------------------------------------
Changeset 500:983a3fc2debe: good
The first good revision is:
changeset:   501:b5239bf9427b
branch:      code
tag:         svn.3343
user:        jfarcand
date:        Wed Jun 17 17:20:32 2009 -0700
summary:     [svn r3343] Fix for https://grizzly.dev.java.net/issues/show_bug.cgi?id=672
&lt;/pre&gt;

In under two minutes I found out who caused all the trouble, and with which revision!

Bisect is very useful for large projects, developed by multiple people, where the amount of code and changes is not trivial. Finding regressions in this way can save a lot of time otherwise spent debugging.&lt;br/&gt;&lt;br/&gt;

The only caveat I noticed is that you need to create a shell script that is passed in as an argument to the bisect command. It would be a lot easier if I could just specify the maven command directly without the intermediate shell script.&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/FI8SiPEpkbc" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/1312389829890063746/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=1312389829890063746" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/1312389829890063746?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/1312389829890063746?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/FI8SiPEpkbc/using-mercurial-bisect-to-find-bugs.html" title="Using Mercurial Bisect to Find Bugs" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total><feedburner:origLink>http://blog.igorminar.com/2009/11/using-mercurial-bisect-to-find-bugs.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkEDRHo_eip7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-2705753778911666250</id><published>2009-05-28T19:30:00.000-07:00</published><updated>2010-10-24T18:37:55.442-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:37:55.442-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="grizzly-sendfile" /><category 
scheme="http://www.blogger.com/atom/ns#" term="Java" /><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="Glassfish" /><title>grizzly-sendfile to Become an Official Grizzly Module</title><content type="html">After a chat with &lt;a href="http://weblogs.java.net/blog/jfarcand/"&gt;JFA&lt;/a&gt; about &lt;a href="http://grizzly-sendfile.kenai.com/"&gt;grizzly-sendfile&lt;/a&gt;'s future, I'm pleased to announce today that grizzly-sendfile 0.4 will be the first version of grizzly-sendfile released as an official module of grizzly. This is huge news for grizzly-sendfile and, I believe, equally important news for grizzly and its community.&lt;br/&gt;&lt;br/&gt;

&lt;h4&gt;What this "merger" means for grizzly-sendfile:&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;great opportunity to extend the reach&lt;/li&gt;
&lt;li&gt;opportunity to become the default static file handler in Grizzly&lt;/li&gt;
&lt;li&gt;aspiration to become the default static file handler in GlassFish v3&lt;/li&gt;
&lt;li&gt;more testing and QA&lt;/li&gt;
&lt;li&gt;easier and faster access to grizzly developers and contributors&lt;/li&gt;
&lt;/ul&gt;&lt;br/&gt;

&lt;h4&gt;What this "merger" means for grizzly:&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;contribution of 1 year of research, development and testing time in the area of static http downloads&lt;/li&gt;
&lt;li&gt;several times better performance and scalability of http static file downloads&lt;/li&gt;
&lt;li&gt;built-in X-Sendfile functionality&lt;/li&gt;&lt;li&gt;better JMX instrumentation for http downloads&lt;/li&gt;
&lt;li&gt;and more&lt;/li&gt;
&lt;/ul&gt;&lt;br/&gt;

If you can't wait for 0.4, go and get the &lt;a href="http://blog.igorminar.com/2009/05/grizzly-sendfile-03-is-out.html"&gt;recently released version 0.3&lt;/a&gt;.&lt;br/&gt;&lt;br/&gt;

This is a great day for both projects! :-)&lt;br/&gt;&lt;br/&gt;

Project site: &lt;a href="http://grizzly-sendfile.kenai.com/"&gt;http://grizzly-sendfile.kenai.com/&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/1PT2mreMcAk" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/2705753778911666250/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=2705753778911666250" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2705753778911666250?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2705753778911666250?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/1PT2mreMcAk/grizzly-sendfile-to-become-official.html" title="grizzly-sendfile to Become an Official Grizzly Module" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><feedburner:origLink>http://blog.igorminar.com/2009/05/grizzly-sendfile-to-become-official.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkAGSHc4fSp7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-7332903862043741427</id><published>2009-05-14T00:50:00.000-07:00</published><updated>2010-10-24T18:38:49.935-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:38:49.935-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="grizzly-sendfile" /><category scheme="http://www.blogger.com/atom/ns#" term="Java" /><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category 
scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="Glassfish" /><title>grizzly-sendfile 0.3 is out!</title><content type="html">After a few months of late night hacking, &lt;a href="http://grizzly-sendfile.kenai.com/"&gt;grizzly-sendfile&lt;/a&gt; 0.3 is finally ready for prime time!&lt;br/&gt;&lt;br/&gt;

New features include:
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/SettingUp"&gt;grizzly 1.9.15 compatibility&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/SettingUp#Using_grizzly-sendfile_with_GlassFish_v3"&gt;glassfish v3b48 compatibility (via an OSGi bundle)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.igorminar.com/2009/05/grizzly-sendfile-and-comparison-of.html"&gt;kick-ass performance and scalability&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/Grizzly-sendfile-server"&gt;grizzly-sendfile-server&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/AutoSendfile"&gt;autosendfile mode&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/Algorithms"&gt;Sendfile Algorithms&lt;/a&gt;: new EqualBlockingAlgorithm + improved EqualNonBlockingAlgorithm&lt;/li&gt;
&lt;li&gt;lots of improvements to &lt;a href="http://kenai.com/projects/grizzly-sendfile/sources/mercurial/show/test/faban/grizzly-sendfile-benchmark"&gt;grizzly-sendfile-benchmark&lt;/a&gt;&lt;/li&gt;&lt;li&gt;maven build system&lt;/li&gt;&lt;li&gt;lots of smaller improvements and bug fixes&lt;/li&gt;
&lt;/ul&gt;&lt;br/&gt;&lt;br/&gt;

I also started using &lt;a href="http://kenai.com/jira/secure/IssueNavigator.jspa?reset=true&amp;mode=hide&amp;pid=10034&amp;resolution=-1&amp;sorter/field=updated&amp;sorter/order=DESC"&gt;kenai's JIRA&lt;/a&gt; for issue tracking, so feel free to file bugs or RFEs there.&lt;br/&gt;&lt;br/&gt;

Benchmark &amp; Enjoy!&lt;br/&gt;&lt;br/&gt;

&lt;a href="http://grizzly-sendfile.kenai.com/"&gt;Project Website&lt;/a&gt;
&lt;a href="https://kenai.com/hg/grizzly-sendfile~mercurial/file/e91c72bf1346"&gt;The source code&lt;/a&gt;
&lt;a href="http://kenai.com/projects/grizzly-sendfile/downloads"&gt;The binaries&lt;/a&gt;&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/EwJh1LRBxJk" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/7332903862043741427/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=7332903862043741427" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/7332903862043741427?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/7332903862043741427?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/EwJh1LRBxJk/grizzly-sendfile-03-is-out.html" title="grizzly-sendfile 0.3 is out!" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://blog.igorminar.com/2009/05/grizzly-sendfile-03-is-out.html</feedburner:origLink></entry><entry gd:etag="W/&quot;C0UGQH88eSp7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-1840308271841899595</id><published>2009-05-10T19:00:00.000-07:00</published><updated>2010-10-24T18:47:01.171-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:47:01.171-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="grizzly-sendfile" /><category scheme="http://www.blogger.com/atom/ns#" term="Java" /><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" 
/><title>grizzly-sendfile and Comparison of Blocking and NonBlocking IO</title><content type="html">From the very beginning of my work on &lt;a href="http://kenai.com/projects/grizzly-sendfile"&gt;grizzly-sendfile&lt;/a&gt; (&lt;a href="http://blog.igorminar.com/2009/02/announcing-grizzly-sendfile.html"&gt;intro&lt;/a&gt;) I was curious to compare blocking and non-blocking IO side by side. Since I didn't have any practical experience to understand which one would be more suitable when, I designed grizzly-sendfile to be flexible so that I could try different strategies and come to conclusions based on some real testing rather than on theorizing or the words of others. In this post I'd like to compare blocking and non-blocking IO, benchmark them, and draw some conclusions as to which one is more suitable for specific situations.&lt;br/&gt;&lt;br/&gt;

grizzly-sendfile has a notion of algorithms that control the IO operations responsible for writing data to a &lt;a href="http://java.sun.com/javase/6/docs/api/java/nio/channels/SocketChannel.html"&gt;SocketChannel&lt;/a&gt; (grizzly-sendfile is based on &lt;a href="http://java.sun.com/javase/6/docs/api/java/nio/package-summary.html"&gt;NIO&lt;/a&gt; and leverages lots of great work put into &lt;a href="http://grizzly.dev.java.net/"&gt;grizzly&lt;/a&gt;). Different algorithms can do this in different ways, and this is &lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/Algorithms"&gt;explained in depth&lt;/a&gt; on the &lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/Home"&gt;project wiki&lt;/a&gt;. The point is that this allows me to create algorithms that use blocking or nonblocking IO in different ways, and easily swap them and compare their performance (in a very isolated manner).&lt;br/&gt;&lt;br/&gt;

Two algorithms I implemented right away were &lt;a href="http://kenai.com/projects/grizzly-sendfile/sources/mercurial/content/grizzly-sendfile/src/main/java/com/igorminar/grizzlysendfile/algorithm/SimpleBlockingAlgorithm.java"&gt;SimpleBlockingAlgorithm&lt;/a&gt; (SBA) and &lt;a href="http://kenai.com/projects/grizzly-sendfile/sources/mercurial/content/grizzly-sendfile/src/main/java/com/igorminar/grizzlysendfile/algorithm/EqualNonBlockingAlgorithm.java"&gt;EqualNonBlockingAlgorithm&lt;/a&gt; (ENBA), and only recently followed by &lt;a href="http://kenai.com/projects/grizzly-sendfile/sources/mercurial/content/grizzly-sendfile/src/main/java/com/igorminar/grizzlysendfile/algorithm/EqualBlockingAlgorithm.java"&gt;EqualBlockingAlgorithm&lt;/a&gt; (EBA). The first one employs the traditional approach of sending a file via the network (while not EOF write data to a SocketChannel using blocking writes), while ENBA uses non-blocking writes and &lt;a href="http://java.sun.com/javase/6/docs/api/java/nio/channels/Selector.html"&gt;Selector&lt;/a&gt; re-registration (in place of blocking) to achieve the same task. This means that a download is split into smaller parts, each sequentially streamed by an assigned worker thread to the client. EBA works very similarly to ENBA, but uses blocking writes.&lt;br/&gt;&lt;br/&gt;
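
The difference between the two write styles can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual grizzly-sendfile code: a &lt;code&gt;java.nio.channels.Pipe&lt;/code&gt; stands in for the SocketChannel, and the names &lt;code&gt;WriteStyles&lt;/code&gt;, &lt;code&gt;writeFully&lt;/code&gt; and &lt;code&gt;writeBurst&lt;/code&gt; are made up for the example.&lt;br/&gt;
&lt;pre&gt;
```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper names; a Pipe stands in for a client connection.
class WriteStyles {

    // SBA-style: loop until everything is written. In blocking mode each
    // write() parks the worker thread whenever the receiver is slow.
    static int writeFully(WritableByteChannel channel, ByteBuffer data) throws IOException {
        int total = 0;
        while (data.hasRemaining()) {
            total += channel.write(data);
        }
        return total;
    }

    // ENBA-style: in non-blocking mode write() takes only what the OS buffer
    // can hold right now. Stop at the first zero-byte write so the worker is
    // free to serve another download while this one waits for the client.
    static int writeBurst(WritableByteChannel channel, ByteBuffer data) throws IOException {
        int total = 0;
        int written;
        do {
            written = channel.write(data);
            total += written;
        } while (written != 0);
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Blocking pipe, small payload: it fits into the buffer and returns at once.
        Pipe blocking = Pipe.open();
        int small = writeFully(blocking.sink(), ByteBuffer.allocate(1024));
        System.out.println("blocking write sent " + small + " bytes");

        // Non-blocking pipe with nobody reading: the burst ends early.
        Pipe nonBlocking = Pipe.open();
        nonBlocking.sink().configureBlocking(false);
        ByteBuffer big = ByteBuffer.allocate(8 * 1024 * 1024);
        int accepted = writeBurst(nonBlocking.sink(), big);
        System.out.println("burst accepted " + accepted + " bytes, "
                + big.remaining() + " bytes left for a later burst");

        blocking.sink().close();
        blocking.source().close();
        nonBlocking.sink().close();
        nonBlocking.source().close();
    }
}
```
&lt;/pre&gt;&lt;br/&gt;
With a blocking channel and a slow reader, the first method would simply sit inside &lt;code&gt;write()&lt;/code&gt;; the second returns as soon as the buffer is full, leaving the rest of the data for a later burst.&lt;br/&gt;&lt;br/&gt;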

I ran two variations of my &lt;a href="http://faban.sunsource.net/"&gt;faban&lt;/a&gt; &lt;a href="http://kenai.com/projects/grizzly-sendfile/sources/mercurial/show/test/faban/grizzly-sendfile-benchmark"&gt;benchmark&lt;/a&gt;&lt;a href="#footnote-0"&gt;&lt;sup&gt;[0]&lt;/sup&gt;&lt;/a&gt; against these three algorithms. At first I made my simulated clients hit the server as often as possible and download files of different sizes as quickly as possible. Afterward I throttled the file download speed to 1MB/s per client (throttling was done by clients). While the first benchmark simulates traffic close to that of a private network in a datacenter, the second benchmark better represents client/server traffic on the Internet.&lt;br/&gt;&lt;br/&gt;

grizzly-sendfile delegates the execution of the selected algorithm to a pool of worker threads, so the maximum number of threads in the pool, along with the selected algorithm, is one of the major factors that affect the performance&lt;a name="#footnote-1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt; and scalability&lt;a name="#footnote-2"&gt;&lt;sup&gt;[2]&lt;/sup&gt;&lt;/a&gt; of the server. In my tests I kept the pool size relatively small (50 threads) in order to easily simulate situations when there are more concurrent requests than workers, which is common during traffic spikes.&lt;br/&gt;&lt;br/&gt;

&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th&gt;Conc. Clients&lt;/th&gt;&lt;th&gt;Download limit&lt;/th&gt;&lt;th&gt;Algorithm&lt;/th&gt;&lt;th&gt;Avg init time&lt;a name="#footnote-3"&gt;&lt;sup&gt;[3]&lt;/sup&gt;&lt;/a&gt; (sec)&lt;/th&gt;&lt;th&gt;Avg speed&lt;a name="#footnote-4"&gt;&lt;sup&gt;[4]&lt;/sup&gt;&lt;/a&gt; (MB/s)&lt;/th&gt;&lt;th&gt;Avg total throughput (MB/s)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;50&lt;/td&gt;&lt;td&gt;none&lt;/td&gt;&lt;td&gt;SBA&lt;/td&gt;&lt;td&gt;0.019&lt;/td&gt;&lt;td&gt;&lt;b&gt;4.36&lt;/b&gt;&lt;/td&gt;&lt;td&gt;&lt;b&gt;208.76&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;50&lt;/td&gt;&lt;td&gt;none&lt;/td&gt;&lt;td&gt;ENBA&lt;/td&gt;&lt;td&gt;0.021&lt;/td&gt;&lt;td&gt;4.15&lt;/td&gt;&lt;td&gt;198.79&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;50&lt;/td&gt;&lt;td&gt;none&lt;/td&gt;&lt;td&gt;EBA&lt;/td&gt;&lt;td&gt;&lt;b&gt;0.018&lt;/b&gt;&lt;/td&gt;&lt;td&gt;4.23&lt;/td&gt;&lt;td&gt;202.29&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="6" style="border-top: 1px solid black"&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;100&lt;/td&gt;&lt;td&gt;none&lt;/td&gt;&lt;td&gt;SBA&lt;/td&gt;&lt;td&gt;4.666&lt;/td&gt;&lt;td&gt;&lt;b&gt;4.32&lt;/b&gt;&lt;/td&gt;&lt;td&gt;&lt;b&gt;212.79&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;100&lt;/td&gt;&lt;td&gt;none&lt;/td&gt;&lt;td&gt;ENBA&lt;/td&gt;&lt;td&gt;&lt;b&gt;0.048&lt;/b&gt;&lt;/td&gt;&lt;td&gt;1.84&lt;/td&gt;&lt;td&gt;168.15&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;100&lt;/td&gt;&lt;td&gt;none&lt;/td&gt;&lt;td&gt;EBA&lt;/td&gt;&lt;td&gt;0.140&lt;/td&gt;&lt;td&gt;1.96&lt;/td&gt;&lt;td&gt;175.71&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="6" style="border-top: 1px solid 
black"&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;200&lt;/td&gt;&lt;td&gt;none&lt;/td&gt;&lt;td&gt;SBA&lt;/td&gt;&lt;td&gt;14.288&lt;/td&gt;&lt;td&gt;&lt;b&gt;4.31&lt;/b&gt;&lt;/td&gt;&lt;td&gt;&lt;b&gt;208.59&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;200&lt;/td&gt;&lt;td&gt;none&lt;/td&gt;&lt;td&gt;ENBA&lt;/td&gt;&lt;td&gt;&lt;b&gt;0.108&lt;/b&gt;&lt;/td&gt;&lt;td&gt;0.87&lt;/td&gt;&lt;td&gt;144.69&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;200&lt;/td&gt;&lt;td&gt;none&lt;/td&gt;&lt;td&gt;EBA&lt;/td&gt;&lt;td&gt;0.264&lt;/td&gt;&lt;td&gt;0.97&lt;/td&gt;&lt;td&gt;158.83&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br/&gt;&lt;br/&gt;

&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th&gt;Conc. Clients&lt;/th&gt;&lt;th&gt;Download limit&lt;/th&gt;&lt;th&gt;Algorithm&lt;/th&gt;&lt;th&gt;Avg init time&lt;a name="#footnote-3"&gt;&lt;sup&gt;[3]&lt;/sup&gt;&lt;/a&gt; (sec)&lt;/th&gt;&lt;th&gt;Avg speed&lt;a name="#footnote-4"&gt;&lt;sup&gt;[4]&lt;/sup&gt;&lt;/a&gt; (MB/s)&lt;/th&gt;&lt;th&gt;Avg total throughput (MB/s)&lt;/th&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;50&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;SBA&lt;/td&gt;&lt;td&gt;0.003&lt;/td&gt;&lt;td&gt;1.0&lt;/td&gt;&lt;td&gt;&lt;b&gt;42.9&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;50&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;ENBA&lt;/td&gt;&lt;td&gt;0.002&lt;/td&gt;&lt;td&gt;0.98&lt;/td&gt;&lt;td&gt;41.82&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;50&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;EBA (81k)&lt;a name="#footnote-5"&gt;&lt;sup&gt;[5]&lt;/sup&gt;&lt;/a&gt;&lt;/td&gt;&lt;td&gt;0.003&lt;/td&gt;&lt;td&gt;0.98&lt;/td&gt;&lt;td&gt;41.91&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;50&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;EBA (40k)&lt;a name="#footnote-6"&gt;&lt;sup&gt;[6]&lt;/sup&gt;&lt;/a&gt;&lt;/td&gt;&lt;td&gt;&lt;b&gt;0.002&lt;/b&gt;&lt;/td&gt;&lt;td&gt;0.98&lt;/td&gt;&lt;td&gt;42.85&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="6" style="border-top: 1px solid black"&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;100&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;SBA&lt;/td&gt;&lt;td&gt;20.5&lt;/td&gt;&lt;td&gt;1.0&lt;/td&gt;&lt;td&gt;40.4&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;100&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;ENBA&lt;/td&gt;&lt;td&gt;&lt;b&gt;0.003&lt;/b&gt;&lt;/td&gt;&lt;td&gt;0.99&lt;/td&gt;&lt;td&gt;&lt;b&gt;85.14&lt;b&gt;&lt;/b&gt;&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;100&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;EBA (81k)&lt;/td&gt;&lt;td&gt;0.018&lt;/td&gt;&lt;td&gt;1.0&lt;/td&gt;&lt;td&gt;84.19&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;100&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;EBA 
(40k)&lt;/td&gt;&lt;td&gt;0.013&lt;/td&gt;&lt;td&gt;0.99&lt;/td&gt;&lt;td&gt;84.34&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="6" style="border-top: 1px solid black"&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;200&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;SBA&lt;/td&gt;&lt;td&gt;64.8&lt;/td&gt;&lt;td&gt;1.0&lt;/td&gt;&lt;td&gt;37.12&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;200&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;ENBA&lt;/td&gt;&lt;td&gt;&lt;b&gt;0.112&lt;/b&gt;&lt;/td&gt;&lt;td&gt;0.86&lt;/td&gt;&lt;td&gt;141.8&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;200&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;EBA (81k)&lt;/td&gt;&lt;td&gt;0.2&lt;/td&gt;&lt;td&gt;0.95&lt;/td&gt;&lt;td&gt;&lt;b&gt;156.59&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;200&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;EBA (40k)&lt;/td&gt;&lt;td&gt;0.159&lt;/td&gt;&lt;td&gt;0.96&lt;/td&gt;&lt;td&gt;154.2&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td colspan="6" style="border-top: 1px solid black"&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;300&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;SBA&lt;/td&gt;&lt;td&gt;113.9&lt;/td&gt;&lt;td&gt;1.0&lt;/td&gt;&lt;td&gt;34.2&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;300&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;ENBA&lt;/td&gt;&lt;td&gt;&lt;b&gt;0.185&lt;/b&gt;&lt;/td&gt;&lt;td&gt;0.58&lt;/td&gt;&lt;td&gt;127.53&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;300&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;EBA (81k)&lt;/td&gt;&lt;td&gt;0.31&lt;/td&gt;&lt;td&gt;0.61&lt;/td&gt;&lt;td&gt;&lt;b&gt;133.66&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;300&lt;/td&gt;&lt;td&gt;1MB/s&lt;/td&gt;&lt;td&gt;EBA (40k)&lt;/td&gt;&lt;td&gt;0.239&lt;/td&gt;&lt;td&gt;0.63&lt;/td&gt;&lt;td&gt;132.75&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br/&gt;&lt;br/&gt;

The interpretation of the results is that with SBA the individual download speeds are slightly higher than when ENBA or EBA is used (this is mainly due to some extra scheduling related to the Selector re-registration). However, EBA and ENBA can utilize the available server bandwidth significantly better than SBA. This is especially true when there is a large number of slow clients downloading larger files: with 50 workers each tied to a client throttled to 1MB/s, SBA cannot push more than about 50MB/s in total, which matches the 34-43MB/s measured in the throttled tests. One big downside of SBA is that if the number of concurrent downloads is higher than the number of worker threads, the time to initiate a download climbs steeply (from 0.003 sec with 50 throttled clients to 113.9 sec with 300).&lt;br/&gt;&lt;br/&gt;

The conclusion is that SBA is well suited for controlled environments where there is a small number of fast clients (server-to-server communication on an internal network), while EBA shines in environments where there is very little control over the number and speed of clients (file hosting on the Internet). While EBA performs and scales better than ENBA, two advantages that ENBA has over EBA are lower latency and higher resiliency to DoS or DDoS attacks in which malicious clients open connections and block.&lt;br/&gt;&lt;br/&gt;

The results above do not represent well the performance and throughput of grizzly-sendfile (the benchmarks were run on my laptop!), but they certainly provide some evidence that can be used to determine characteristics of different algorithms. I'll do some more thorough testing later when I feel that grizzly-sendfile has enough features and it is time for some fine tuning (there is still a lot that can be done to make things more efficient).&lt;br/&gt;&lt;br/&gt;

I love explaining things visually, so I drew the following diagrams that describe the differences between the three algorithms much better than many words.&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;SimpleBlockingAlgorithm&lt;/h3&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_p64VvtgZDp4/Sgc_NPFQKKI/AAAAAAAAAOY/Iy1jDGE_MyU/s1600-h/Blocking_SingleBurst_Algorithm.png"&gt;&lt;img style="border: 0pt none ; margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 171px;" src="http://2.bp.blogspot.com/_p64VvtgZDp4/Sgc_NPFQKKI/AAAAAAAAAOY/Iy1jDGE_MyU/s400/Blocking_SingleBurst_Algorithm.png" alt="" id="BLOGGER_PHOTO_ID_5334301780263053474" border="0" /&gt;&lt;/a&gt;The server can concurrently handle only as many requests as there are worker threads in the pool. All the extra downloads have to be queued until one of the workers completes a download. The downloads take less time to process, but the worker is blocked while a write is blocked. The slower the client, the more blocking occurs and the lower the worker utilization (and server throughput).&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;EqualNonBlockingAlgorithm&lt;/h3&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_p64VvtgZDp4/Sgc_VRCD4LI/AAAAAAAAAOg/XOuCWYo7-y8/s1600-h/NonBlocking_MultiBurst_Algorithm.png"&gt;&lt;img style="border: 0pt none ; margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 215px;" src="http://2.bp.blogspot.com/_p64VvtgZDp4/Sgc_VRCD4LI/AAAAAAAAAOg/XOuCWYo7-y8/s400/NonBlocking_MultiBurst_Algorithm.png" alt="" id="BLOGGER_PHOTO_ID_5334301918225490098" border="0" /&gt;&lt;/a&gt;The number of downloads a server can concurrently handle is not limited by the number of workers. The downloads take slightly longer to process, but the workers are much better utilized. The increase in utilization is due to the fact that no blocking occurs and workers can multiplex - "pause" downloads for which clients are processing the data (stored in OS/network buffers). In the meantime workers can serve whichever client is ready to receive more data. This causes the download to be split into several parts, each possibly served by a different worker thread.&lt;br/&gt;&lt;br/&gt;
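
The "pause until the client is ready" idea can be sketched roughly like this. It is a simplified illustration under stated assumptions, not the ENBA implementation: a &lt;code&gt;java.nio.channels.Pipe&lt;/code&gt; stands in for the client connection, a single thread handles a single download, and the names &lt;code&gt;SelectorResume&lt;/code&gt;, &lt;code&gt;send&lt;/code&gt;, &lt;code&gt;transfer&lt;/code&gt; and &lt;code&gt;Client&lt;/code&gt; are made up for the example. In a real server the thread would serve other downloads instead of waiting in &lt;code&gt;select()&lt;/code&gt;.&lt;br/&gt;
&lt;pre&gt;
```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical names; a Pipe stands in for the connection to a client.
class SelectorResume {

    // Sends data in bursts and returns how many bursts it took. Between
    // bursts the thread waits in select() rather than inside write().
    static int send(Pipe.SinkChannel sink, ByteBuffer data) throws IOException {
        int bursts = 0;
        sink.configureBlocking(false);
        try (Selector selector = Selector.open()) {
            sink.register(selector, SelectionKey.OP_WRITE);
            while (data.hasRemaining()) {
                selector.select();               // wake up once the sink has room
                selector.selectedKeys().clear(); // reset readiness for the next round
                sink.write(data);                // write whatever fits, never blocks
                bursts++;
            }
        }
        return bursts;
    }

    // Stand-in for a client: drains the read end until end-of-stream.
    static final class Client extends Thread {
        private final Pipe.SourceChannel source;
        long received;

        Client(Pipe.SourceChannel source) {
            this.source = source;
        }

        @Override
        public void run() {
            ByteBuffer buf = ByteBuffer.allocate(16 * 1024);
            try {
                int n;
                while ((n = source.read(buf)) != -1) {
                    received += n;
                    buf.clear();
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Streams size bytes to a Client and returns how many bytes it received.
    static long transfer(int size) throws Exception {
        Pipe pipe = Pipe.open();
        Client client = new Client(pipe.source());
        client.start();
        int bursts = send(pipe.sink(), ByteBuffer.allocate(size));
        pipe.sink().close();   // signals end-of-stream to the client
        client.join();
        pipe.source().close();
        System.out.println(client.received + " bytes delivered in " + bursts + " bursts");
        return client.received;
    }

    public static void main(String[] args) throws Exception {
        transfer(4 * 1024 * 1024);
    }
}
```
&lt;/pre&gt;&lt;br/&gt;
The number of bursts depends on the OS buffer size and on how quickly the reader drains it, so it will vary from run to run.&lt;br/&gt;&lt;br/&gt;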

&lt;h3&gt;EqualBlockingAlgorithm&lt;/h3&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_p64VvtgZDp4/Sgc_HbmHalI/AAAAAAAAAOQ/bbb_61oZbBE/s1600-h/Blocking_MultiBurst_Algorithm.png"&gt;&lt;img style="border: 0pt none ; margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px;" src="http://2.bp.blogspot.com/_p64VvtgZDp4/Sgc_HbmHalI/AAAAAAAAAOQ/bbb_61oZbBE/s400/Blocking_MultiBurst_Algorithm.png" alt="" id="BLOGGER_PHOTO_ID_5334301680542902866" border="0" /&gt;&lt;/a&gt;This algorithm is a combination of the previous two. It uses blocking IO and multiplexing. I think that thanks to multiplexing the client has more time to process data in OS/network buffers and thus deplete the buffer more than without multiplexing, which results in less blocking. The blocked writes are also more efficient because they decrease the number of parts a download is split into and thus decrease the amount of overhead associated with re-registration.&lt;br/&gt;&lt;br/&gt;

Based on these benchmarks, I'm going to call EqualBlockingAlgorithm the winner and make it the default grizzly-sendfile algorithm for now. It is quite easy to override this via the grizzly-sendfile configuration, so one will still be able to pick the algorithm that fits their deployment environment the best. To be honest, the results of the EBA benchmarks surprised me a bit because I expected ENBA to be the winner in the throttled tests, so I'm really glad that I went to the trouble of creating and benchmarking it.&lt;br/&gt;&lt;br/&gt;

All this work will be part of the 0.3 release, which is due any day now. &lt;a href="http://kenai.com/projects/grizzly-sendfile"&gt;Follow the project on kenai&lt;/a&gt; and subscribe to the &lt;a href="http://kenai.com/projects/grizzly-sendfile/lists"&gt;announce mailing list&lt;/a&gt; if you are interested to hear more news.&lt;br/&gt;&lt;br/&gt;

&lt;hr style="display:block"/&gt;&lt;br/&gt;
&lt;a name="footnote-1"&gt;&lt;/a&gt;[0] a few notes on how I tested - both clients and the &lt;a href=""&gt;grizzly-sendfile-server&lt;/a&gt; were running on my mac laptop using Java 6 and server vm. I set the ramp up period to 1min so that the server can warm up and then I started counting downloads for 10min and calculated the result. Files used for the tests were 1KB, 200KB, 500KB, 1MB 20MB and 100MB large and equaly represented. All the tests passed with 0 errors. A download is successful when the length, md5 checksum and http headers match expected values.&lt;br/&gt;
&lt;a name="footnote-1"&gt;&lt;/a&gt;[1] performance - ability to process a single download quickly&lt;br/&gt;
&lt;a name="footnote-2"&gt;&lt;/a&gt;[2] scalability - ability to process large number of concurrent downloads at acceptable performance&lt;br/&gt;
&lt;a name="footnote-3"&gt;&lt;/a&gt;[3] init time - duration from the time a request is made, until the first bytes of the http response body (not headers) are received&lt;br/&gt;
&lt;a name="footnote-4"&gt;&lt;/a&gt;[4] avg speed - for 100MB files. The smaller the file the lower the effective download speed because of the overhead associated with a download execution.&lt;br/&gt;
&lt;a name="footnote-5"&gt;&lt;/a&gt;[5] EBA (81k) - EqualBlockingAlgorithm with buffer size 81KB (the default socket buffer size on mac)&lt;br/&gt;
&lt;a name="footnote-6"&gt;&lt;/a&gt;[6] EBA (40k) - EqualBlockingAlgorithm with buffer size 40KB&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/jjGa2ljnVes" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/1840308271841899595/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=1840308271841899595" title="4 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/1840308271841899595?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/1840308271841899595?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/jjGa2ljnVes/grizzly-sendfile-and-comparison-of.html" title="grizzly-sendfile and Comparison of Blocking and NonBlocking IO" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://2.bp.blogspot.com/_p64VvtgZDp4/Sgc_NPFQKKI/AAAAAAAAAOY/Iy1jDGE_MyU/s72-c/Blocking_SingleBurst_Algorithm.png" height="72" width="72" /><thr:total>4</thr:total><feedburner:origLink>http://blog.igorminar.com/2009/05/grizzly-sendfile-and-comparison-of.html</feedburner:origLink></entry><entry gd:etag="W/&quot;Ck8DRnk7fip7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-7154859016006159489</id><published>2009-03-31T17:05:00.000-07:00</published><updated>2010-10-24T18:41:17.706-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:41:17.706-07:00</app:edited><category 
scheme="http://www.blogger.com/atom/ns#" term="Java" /><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="SunWikis" /><title>Identifying ThreadLocal Memory Leaks in JavaEE Web Apps</title><content type="html">A few weeks ago &lt;a href="http://wikis.sun.com/"&gt;wikis.sun.com&lt;/a&gt; powered by &lt;a href="http://www.atlassian.com/software/confluence"&gt;Confluence "Enterprise" Wiki&lt;/a&gt; grew beyond yet another invisible line that triggered intermittent instabilities. Oh boy, how I love these moments. This time the issue was that Confluence just kept on running out of memory. Increasing the heap didn't help, even breaking the 32bit barrier and using a 64bit JVM was not good enough to keep the app running for more than 24 hours.&lt;br/&gt;&lt;br/&gt;

The Xmx size of the heap suggested that something was out of order. It was time to take a heap dump using &lt;code&gt;jmap&lt;/code&gt; and check what was consuming so much memory. I tried &lt;code&gt;jhat&lt;/code&gt; to analyze the heap dump, but the 3.5GB dump was just too much for it. The next tool I used was &lt;a href="http://www.alphaworks.ibm.com/tech/heapanalyzer"&gt;IBM's Heap Analyzer&lt;/a&gt; - a decent tool, which was able to read the dump, but consumed a lot of memory in order to do so (~8GB), and was pretty hard to use once the dump was processed.&lt;br/&gt;&lt;br/&gt;

While looking for more heap analyzing tools, I found SAP Memory Analyzer, now known as &lt;a href="http://www.eclipse.org/mat/"&gt;Eclipse Memory Analyzer, a.k.a. MAT&lt;/a&gt;. I thought "What the heck does SAP know about JVM?" and reluctantly gave it a try, only to find out how prejudiced I was. MAT is a really wonderful tool, which was able to process the heap really quickly, visualize the heap in an easy-to-navigate way, use special algorithms to find suspicious memory regions, and all of that while using only ~2GB of memory. An excellent preso that walks through MAT features and how heap and memory leaks work can be found &lt;a href="http://jazoon.com/jazoon07/en/conference/presentationdetails.html?type=sid&amp;amp;detail=2160"&gt;here&lt;/a&gt;.&lt;br/&gt;&lt;br/&gt;

Thanks to MAT I was able to create two bug reports for folks at Atlassian (&lt;a href="http://jira.atlassian.com/browse/CONF-14988"&gt;CONF-14988&lt;/a&gt;, &lt;a href="http://jira.atlassian.com/browse/CONF-14989"&gt;CONF-14989&lt;/a&gt;). The only feature I missed was some kind of PDF or HTML export, but I did quite well with using &lt;a href="http://skitch.com/"&gt;Skitch&lt;/a&gt; to take screenshots and annotate them.&lt;br/&gt;&lt;br/&gt;

One of the leaks was confirmed right away, while it wasn't clear what was causing &lt;a href="http://jira.atlassian.com/browse/CONF-14988"&gt;the other one&lt;/a&gt;. All we knew was that significant amounts of memory were retained via &lt;a href="http://java.sun.com/javase/6/docs/api/java/lang/ThreadLocal.html"&gt;ThreadLocal&lt;/a&gt; variables. More debugging was in order.&lt;br/&gt;&lt;br/&gt;

I got this idea to create a servlet filter that would inspect the thread-local store for the thread currently processing the request and log any thread-local references that exist before the request is dispatched down the chain and also when it comes back. Such a filter could be packaged as a &lt;a href="http://confluence.atlassian.com/display/DOC/Servlet+Filter+Plugins"&gt;Confluence Servlet Filter Plugin&lt;/a&gt;, which makes it convenient to develop and deploy.&lt;br/&gt;&lt;br/&gt;

There was only one problem with this idea: the thread-local store is a non-public field of the &lt;code&gt;Thread&lt;/code&gt; class, and its type is a nested class with package-default access - kinda hard to get your hands on. Thankfully &lt;a href="http://www.chuckcaplan.com/blog/archives/2005/08/java_private_do_1.html"&gt;private stuff is not necessarily private in Java&lt;/a&gt;, if you get your hands dirty with reflection code:&lt;br/&gt;
&lt;pre&gt;
import java.lang.reflect.Array;
import java.lang.reflect.Field;

// The statements below go inside a method that may throw reflection exceptions.
// On Java 16 and later the JVM must be started with
// --add-opens java.base/java.lang=ALL-UNNAMED for setAccessible() to succeed.
Thread thread = Thread.currentThread();

// Thread.threadLocals holds the per-thread map (null until first use)
Field threadLocalsField = Thread.class.getDeclaredField("threadLocals");
threadLocalsField.setAccessible(true);
Object threadLocalMap = threadLocalsField.get(thread);

// ThreadLocalMap.table is the array of entries backing the map
Field tableField = Class.forName("java.lang.ThreadLocal$ThreadLocalMap").getDeclaredField("table");
tableField.setAccessible(true);
Object table = (threadLocalMap == null) ? null : tableField.get(threadLocalMap);

// table capacity; most slots are usually empty
int threadLocalCount = (table == null) ? 0 : Array.getLength(table);
StringBuilder classSb = new StringBuilder();
int leakCount = 0;

for (int i = 0; i &lt; threadLocalCount; i++) {
    Object entry = Array.get(table, i);
    if (entry != null) {
        // each entry keeps the thread-local's value in its "value" field
        Field valueField = entry.getClass().getDeclaredField("value");
        valueField.setAccessible(true);
        Object value = valueField.get(entry);
        if (leakCount != 0) {
            classSb.append(", ");
        }
        classSb.append(value != null ? value.getClass().getName() : "null");
        leakCount++;
    }
}

logger.warn("possible ThreadLocal leaks: " + leakCount + " of " + threadLocalCount
        + " = [" + classSb + "]");
&lt;/pre&gt;&lt;br/&gt;

A simple plugin like this &lt;a href="http://jira.atlassian.com/browse/CONF-14988?focusedCommentId=151770&amp;amp;page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_151770"&gt;was able to confirm&lt;/a&gt; that the leaked &lt;code&gt;SAXParser&lt;/code&gt; instances are created and stored as thread-local variables somewhere within the code that exports content as PDF. That is good enough info to pinpoint the exact line of code that creates the thread-local instance with &lt;a href="http://blog.igorminar.com/2008/06/btrace-dtrace-for-java.html"&gt;BTrace&lt;/a&gt; (or code review), but that's a story for a separate blog post.&lt;br/&gt;&lt;br/&gt;

The moral of the story: &lt;code&gt;ThreadLocal&lt;/code&gt; variables are a very powerful feature, which, as is common for powerful stuff, can result in a lot of nasty things when not used properly. Hopefully all the info I provided to Atlassian will be enough to get a speedy fix for the issue and bring stability to wikis.sun.com - at least until we step over the next "invisible line".&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/vASYSuH4eBs" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/7154859016006159489/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=7154859016006159489" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/7154859016006159489?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/7154859016006159489?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/vASYSuH4eBs/identifying-threadlocal-memory-leaks-in.html" title="Identifying ThreadLocal Memory Leaks in JavaEE Web Apps" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>3</thr:total><feedburner:origLink>http://blog.igorminar.com/2009/03/identifying-threadlocal-memory-leaks-in.html</feedburner:origLink></entry><entry gd:etag="W/&quot;Ck4AQHs5eSp7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-2010648548288377382</id><published>2009-02-05T20:52:00.000-08:00</published><updated>2010-10-24T18:42:21.521-07:00</updated><app:edited 
xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:42:21.521-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="grizzly-sendfile" /><category scheme="http://www.blogger.com/atom/ns#" term="JRuby" /><category scheme="http://www.blogger.com/atom/ns#" term="Java" /><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="Mediacast" /><title>Announcing grizzly-sendfile!</title><content type="html">It's my pleasure to finally announce &lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/Home"&gt;grizzly-sendfile&lt;/a&gt; v0.2 - the first stable version of a project that I started after I got one of those "Sudden Burst of Ideas" last summer.&lt;br/&gt;&lt;br/&gt;

For people who follow the &lt;a href="https://grizzly.dev.java.net/"&gt;grizzly&lt;/a&gt; development or &lt;a href="http://mediacast.sun.com/"&gt;mediacast.sun.com&lt;/a&gt;, this is not exactly hot news. grizzly-sendfile has been used by mediacast since &lt;a href="http://wikis.sun.com/display/mediacast/2008/09/17/Mediacast+2.3+Deployment"&gt;last September&lt;/a&gt; and mentioned on the grizzly mailing list several times since then, but I haven't had time to promote it and explain what it does and how it works, so here I go.&lt;br/&gt;&lt;br/&gt;

&lt;i&gt;If you don't care about my diary notes, skip down to "What is grizzly-sendfile".&lt;/i&gt;&lt;br/&gt;&lt;br/&gt;

A bit of background: the whole story goes back to the end of 2007 when a bunch of us were finishing up the &lt;a href="http://blog.igorminar.com/2008/01/jruby-on-rails-rewrite-of.html"&gt;rewrite&lt;/a&gt; of mediacast.sun.com in &lt;a href="http://jruby.codehaus.org/"&gt;JRuby&lt;/a&gt; on Rails. At that time we realized that one of the most painful parts of the rewrite would be implementing the file streaming functionality. Back then Rails was single-threaded (&lt;a href="http://blog.headius.com/2008/08/qa-what-thread-safe-rails-means.html"&gt;not&lt;/a&gt; &lt;a href="http://guides.rubyonrails.org/2_2_release_notes.html#_thread_safety"&gt;any&lt;/a&gt; &lt;a href="http://blog.igorminar.com/2009/01/benchmarking-jruby-on-rails.html"&gt;more&lt;/a&gt;, yay!), so sending the data from Rails was not an option. Fortunately, my then-colleague Peter came up with an idea to use a servlet filter to intercept empty download responses from Rails and stream the files from this filter. That did the trick for us, but it was a pretty ugly solution that was unreliable from time to time and was a PITA to extend and maintain.&lt;br/&gt;&lt;br/&gt;

At around this time, I learned about X-Sendfile - a not very well-known HTTP header - that some web servers (e.g. Apache and lighttpd) support. This header can be used to offload file transfers from an application to the web server. Rails supports it natively via the &lt;code&gt;:x_sendfile&lt;/code&gt; option of the &lt;a href="http://api.rubyonrails.org/classes/ActionController/Streaming.html#M000401"&gt;&lt;code&gt;send_file&lt;/code&gt;&lt;/a&gt; method.&lt;br/&gt;&lt;br/&gt;

I started looking for the X-Sendfile support in &lt;a href="https://glassfish.dev.java.net/"&gt;GlassFish&lt;/a&gt;, which we have been using at mediacast, but it was missing. After some emails with glassfish and grizzly folks, mainly &lt;a href="http://weblogs.java.net/blog/jfarcand/"&gt;Jean-Francois&lt;/a&gt;, I learned that the core component of glassfish called &lt;a href="https://grizzly.dev.java.net/"&gt;grizzly&lt;/a&gt; could be extended via custom filters, which could implement this functionality.&lt;br/&gt;&lt;br/&gt;

The idea stuck in my head for a few weeks. I looked up some info on grizzly and &lt;a href="http://en.wikipedia.org/wiki/New_I/O"&gt;NIO&lt;/a&gt; and then during one overnight drive to San Diego, I designed grizzly-sendfile in my head. It took many nights and a few weekends to get it into reasonable shape and test it under load with some custom &lt;a href="http://faban.sunsource.net/"&gt;faban&lt;/a&gt; benchmarks that I had to write, but in late August I had version 0.1 and was able to "sell" it to &lt;a href="http://blogs.sun.com/rama"&gt;Rama&lt;/a&gt; as a replacement for the servlet filter madness that we were using at mediacast.&lt;br/&gt;&lt;br/&gt;

Except for a few initial bugs that showed up under some unusual circumstances, the 0.1 version was very stable. A few minor 0.1.x releases were followed by the 0.2 version, which was installed on mediacast servers some time in November. Since then I've worked on docs and setting up &lt;a href="http://kenai.com/projects/grizzly-sendfile"&gt;the project&lt;/a&gt; at &lt;a href="http://kenai.com/"&gt;kenai.com&lt;/a&gt;.&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;What is grizzly-sendfile?&lt;/h3&gt;From the &lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/Home"&gt;wiki&lt;/a&gt;:
grizzly-sendfile is an extension for grizzly - a NIO framework that among other things powers GlassFish application server.&lt;br/&gt;&lt;br/&gt;

The goal of this extension is to facilitate an efficient file transfer functionality, which would allow applications to delegate file transfers to the application server, while retaining control over which file to send, access control or any other application specific logic.&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;How does it work?&lt;/h3&gt;By mixing some NIO "magic" and leveraging code of the hard working grizzly team, I was able to come up with an ARP (asynchronous request processing) filter for grizzly. This filter can be easily plugged into grizzly (and glassfish v2) and will intercept all the responses that contain the &lt;code&gt;X-Sendfile&lt;/code&gt; header. The value of this header is the path of the file that the application that processed the request wants to send to the client.&lt;br/&gt;&lt;br/&gt;

All that an application needs to do is to set the header. In Java, a simple example of such code looks like this:&lt;pre&gt; response.setHeader("X-Sendfile", "/path/to/file.avi");&lt;/pre&gt;
In Rails, it looks even nicer:&lt;pre&gt;send_file '/path/to.png', :x_sendfile =&gt; true&lt;/pre&gt;
That's it, grizzly-sendfile will take care of the rest.&lt;br/&gt;&lt;br/&gt;
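To make the application side a little more concrete, here is a minimal Ruby sketch of the "decide in the app, delegate the transfer" pattern described above. The method name sendfile_response and the Rack-style return value are assumptions made up for illustration; they are not part of Rails or grizzly-sendfile. The only thing taken from the text is that the application sets the X-Sendfile header to the path of the file and leaves the transfer to the server.&lt;br/&gt;&lt;br/&gt;

```ruby
# Hypothetical helper (not part of Rails or grizzly-sendfile): the app
# decides who may download, then delegates the transfer via X-Sendfile.
def sendfile_response(allowed, file_path)
  # Access control stays in the application.
  return [403, {}, []] unless allowed

  headers = {}
  headers["X-Sendfile"] = file_path # path of the file the server should send
  [200, headers, []]                # empty body; the server streams the file
end

status, headers, _body = sendfile_response(true, "/path/to/file.avi")
puts "#{status} #{headers.inspect}"
```

The response body stays empty on purpose: in this scheme the application never touches the file contents, it only names the file.&lt;br/&gt;&lt;br/&gt;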

&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_p64VvtgZDp4/SYut9yazg1I/AAAAAAAAAMw/8Z6La5M7k1k/s1600-h/grizzly-sendfile_architecture.png"&gt;&lt;img style="border: medium none ; margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 229px;" src="http://4.bp.blogspot.com/_p64VvtgZDp4/SYut9yazg1I/AAAAAAAAAMw/8Z6La5M7k1k/s400/grizzly-sendfile_architecture.png" alt="" id="BLOGGER_PHOTO_ID_5299520663549346642" border="0" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;Why should you care?&lt;/h3&gt;&lt;br/&gt;
For me it was all about keeping my code clean and solving problems at layers where they made the most sense. Then it was also about &lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/Benchmarks"&gt;performance and scalability&lt;/a&gt; - the kind of stuff that one can do with NIO can't currently be done in JavaEE because of its synchronous nature. And then of course it was about having full control over downloads (like &lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/Plugins#DownloadResultSender_Plugin"&gt;successful download notification&lt;/a&gt; and other customizations that are possible via &lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/Plugins"&gt;grizzly-sendfile plugins&lt;/a&gt;). Oh, and I must mention &lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/JMXInstrumentation"&gt;JMX monitoring&lt;/a&gt;:&lt;br/&gt;&lt;br/&gt;

&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://kenai.com/attachments/wiki_images/grizzly-sendfile/grizzly-sendfile_jmx.png"&gt;&lt;img style="border: medium none ; margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 600px;" src="http://kenai.com/attachments/wiki_images/grizzly-sendfile/grizzly-sendfile_jmx.png" alt="" border="0" /&gt;&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;What's next?&lt;/h3&gt;There is a lot of stuff on my &lt;a href="http://kenai.com/projects/grizzly-sendfile/pages/Roadmap"&gt;roadmap&lt;/a&gt;. Two of the main missing features are partial downloads and glassfish v3 (grizzly 1.9.x) support. Then there is better monitoring and tons of performance and scalability tuning, which I haven't really focused on yet. A lot of the API still needs to be polished and cleaned up. Also needed is a solid test suite that is more fine-grained than the system/integration tests that I created with faban.&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;Can you use grizzly-sendfile?&lt;/h3&gt;Yeah, &lt;a href="http://kenai.com/projects/grizzly-sendfile/downloads"&gt;go for it&lt;/a&gt;. This is my pet project that I developed in my free time. The project is licensed under GPL2, so you can even grab the &lt;a href="http://kenai.com/projects/grizzly-sendfile/sources/mercurial/show"&gt;code&lt;/a&gt; if you want.&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;Can you help?&lt;/h3&gt;Sure. Code reviews, patches, suggestions and help with testing and documentation are more than welcome!&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/Xc1wNhcRKgg" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/2010648548288377382/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=2010648548288377382" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2010648548288377382?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/2010648548288377382?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/Xc1wNhcRKgg/announcing-grizzly-sendfile.html" title="Announcing grizzly-sendfile!" 
/><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://4.bp.blogspot.com/_p64VvtgZDp4/SYut9yazg1I/AAAAAAAAAMw/8Z6La5M7k1k/s72-c/grizzly-sendfile_architecture.png" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://blog.igorminar.com/2009/02/announcing-grizzly-sendfile.html</feedburner:origLink></entry><entry gd:etag="W/&quot;Ck4NQn0_eip7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-4814098732920095589</id><published>2009-01-31T13:39:00.000-08:00</published><updated>2010-10-24T18:43:13.342-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:43:13.342-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="JRuby" /><category scheme="http://www.blogger.com/atom/ns#" term="Ruby" /><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="Rails" /><title>Benchmarking JRuby on Rails</title><content type="html">Last night, while working on a project I found a really neat use of Rails &lt;a href="http://api.rubyonrails.org/classes/ActionController/Components.html"&gt;Components&lt;/a&gt;, but I also noticed that this part of Rails is deprecated, among other reasons because it's slow.&lt;br/&gt;&lt;br/&gt;

Well, how slow? During my quest to find out, I collected some interesting data, and even more importantly put JRuby and MRI Ruby face to face.&lt;br/&gt;&lt;br/&gt;

&lt;b&gt;Disclaimer&lt;/b&gt;: the benchmarks were not done on a well isolated and specially configured test harness, but I did my best to gather data with informational value. All the components were used with OOB settings.&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;Setup&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;ruby 1.8.6 (2008-03-03 patchlevel 114) [universal-darwin9.0] + Mongrel Web Server 1.1.4&lt;/li&gt;&lt;li&gt;jruby 1.1.6 (ruby 1.8.6 patchlevel 114) (2008-12-17 rev 8388) [x86_64-java] + GlassFish gem version: 0.9.2&lt;/li&gt;&lt;li&gt;common backend: mysql5  5.0.75 Source distribution (InnoDB table engine, Rails pool set to 30)&lt;/li&gt;&lt;/ul&gt;&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;Benchmarks&lt;/h3&gt;I used an excellent high quality benchmarking framework &lt;a href="http://faban.sunsource.net/"&gt;Faban&lt;/a&gt; for my tests. I was lazy, so I only used &lt;a href="http://blogs.sun.com/shanti/entry/http_load_generator"&gt;fhb&lt;/a&gt; (very similar to &lt;a href="http://httpd.apache.org/docs/2.0/programs/ab.html"&gt;ab&lt;/a&gt;, but without &lt;a href="http://weblogs.java.net/blog/sdo/archive/2007/03/ab_considered_h.html"&gt;its flaws&lt;/a&gt;) to invoke simple benchmarks:&lt;ul&gt;&lt;li&gt;simple request benchmark: bin/fhb -r 60/120/5 -c 10 http://localhost:3000/buckets/1&lt;/li&gt;&lt;li&gt;component request benchmark:  bin/fhb -r 60/120/5 -c 10 http://localhost:3000/bucket1/object1&lt;/li&gt;&lt;/ul&gt;Both tests were run with JRuby as well as with MRI Ruby, and in addition to that I ran the tests with Rails in single-threaded as well as multi-threaded modes. I didn't use mongrel clusters or glassfish pooled instances - there was always only one Ruby instance serving all the requests.&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;Results&lt;/h3&gt;
&lt;pre&gt;
ruby 1.8.6 + mongrel
---------------------------------
simple action + single-threaded:
ops/sec: 210.900
% errors: 0.0
avg. time: 0.047
max time: 0.382
90th %: 0.095

simple action + multi-threaded:
ops/sec: 226.483
% errors: 0.0
avg. time: 0.044
max time: 0.180
90th %: 0.095

component action + single-threaded:
ops/sec: 132.950
% errors: 0.0
avg. time: 0.075
max time: 0.214
90th %: 0.130

component action + multi-threaded:
ops/sec: 131.775
% errors: 0.0
avg. time: 0.076
max time: 0.279
90th %: 0.125

jruby 1.1.6 + glassfish gem 0.9.2
----------------------------------
simple action + single-threaded:
ops/sec: 141.417
% errors: 0.0
avg. time: 0.070
max time: 0.259
90th %: 0.115

simple action + multi-threaded:
ops/sec: 247.333
% errors: 0.0
avg. time: 0.040
max time: 0.318
90th %: 0.065

component action + single-threaded:
ops/sec: 107.858
% errors: 0.0
avg. time: 0.092
max time: 0.595
90th %: 0.145

component action + multi-threaded:
ops/sec: 179.042
% errors: 0.0
avg. time: 0.055
max time: 0.357
90th %: 0.085
&lt;/pre&gt;&lt;br/&gt;&lt;br/&gt;

&lt;table style="width: 458px; height: 117px;"&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Platform/Action&lt;/th&gt;&lt;th&gt;Simple&lt;/th&gt;&lt;th&gt;+/-&lt;/th&gt;&lt;th&gt;Component&lt;/th&gt;&lt;th&gt;+/-&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;th&gt;Ruby ST
&lt;/th&gt;&lt;td&gt;210ops&lt;/td&gt;&lt;td&gt;0%&lt;/td&gt;&lt;td&gt;132ops&lt;/td&gt;&lt;td&gt;0%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;Ruby MT&lt;/th&gt;&lt;td&gt;226ops&lt;/td&gt;&lt;td&gt;7.62%&lt;/td&gt;&lt;td&gt;131ops&lt;/td&gt;&lt;td&gt;-0.76%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;JRuby ST&lt;/th&gt;&lt;td&gt;141ops&lt;/td&gt;&lt;td&gt;-32.86%&lt;/td&gt;&lt;td&gt;107ops&lt;/td&gt;&lt;td&gt;-18.94%&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;th&gt;JRuby MT&lt;/th&gt;&lt;td&gt;247ops&lt;/td&gt;&lt;td&gt;17.62%&lt;/td&gt;&lt;td&gt;179ops&lt;/td&gt;&lt;td&gt;35.61%&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;(ST - single-threaded; MT - multi-threaded)&lt;br/&gt;&lt;br/&gt;
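For anyone who wants to check the +/- columns: they appear to be computed from the whole-number ops figures in the table, relative to the Ruby ST row. A small Ruby sketch of that arithmetic follows; the method names are made up for illustration and the numbers are copied from the table.&lt;br/&gt;&lt;br/&gt;

```ruby
# Whole-number ops/sec from the table: [platform, simple, component].
def results
  [
    ["Ruby ST", 210, 132],
    ["Ruby MT", 226, 131],
    ["JRuby ST", 141, 107],
    ["JRuby MT", 247, 179]
  ]
end

# Percentage change of value relative to baseline, rounded to 2 decimals.
def percent_change(value, baseline)
  ((value - baseline) * 100.0 / baseline).round(2)
end

# Ruby ST (the first row) is the baseline for both columns.
def changes
  _, base_simple, base_component = results.first
  results.map do |name, simple, component|
    [name, percent_change(simple, base_simple), percent_change(component, base_component)]
  end
end

changes.each do |name, simple, component|
  puts format("%-9s %7.2f%% %7.2f%%", name, simple, component)
end
```

Using the unrounded ops/sec values from the raw results instead would shift the percentages slightly, so treat the table as approximate.&lt;br/&gt;&lt;br/&gt;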

&lt;h3&gt;Conclusion&lt;/h3&gt;From my tests it appears that MRI is faster in single-threaded mode, but JRuby makes up for the loss big time in the multi-threaded tests. It's also interesting to see that the multi-threaded mode gives MRI (green threads) a small performance boost on the simple action, but it's nowhere close to the boost that JRuby (native threads) can squeeze out from using multiple threads.&lt;br/&gt;&lt;br/&gt;

During the tests I noticed that Rails was reporting more time spent in the db when using JRuby (2-80ms) compared to MRI (1-3ms). I don't know how reliable this data is, but I wonder if this is the bottleneck that is holding JRuby back in the single-threaded mode.&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/-ClhGFVbu-I" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/4814098732920095589/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=4814098732920095589" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/4814098732920095589?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/4814098732920095589?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/-ClhGFVbu-I/benchmarking-jruby-on-rails.html" title="Benchmarking JRuby on Rails" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total><feedburner:origLink>http://blog.igorminar.com/2009/01/benchmarking-jruby-on-rails.html</feedburner:origLink></entry><entry gd:etag="W/&quot;C0cDRn4ycCp7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-4122874931418174796</id><published>2009-01-13T20:43:00.000-08:00</published><updated>2010-10-24T18:44:37.098-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:44:37.098-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="MacBook Pro" /><category scheme="http://www.blogger.com/atom/ns#" 
term="Solaris" /><category scheme="http://www.blogger.com/atom/ns#" term="MacOS" /><title>Using ZFS with Mac OS X 10.5</title><content type="html">A few days ago I got a new MacBook Pro. While waiting for it to be delivered, I started thinking about how I wanted to lay out the installation of the OS. For a long, long time I had wanted to try the &lt;a href="http://opensolaris.org/os/community/zfs/"&gt;ZFS file system&lt;/a&gt; on a Mac, and this looked like a wonderful opportunity. Getting rid of HFS+, which was causing me lots of problems (especially its case insensitive re-incarnation), sounds like a dream come true.&lt;br/&gt;&lt;br/&gt;

If you've never heard of ZFS before, check out this good &lt;a href="http://opensolaris.org/os/community/zfs/demos/basics/"&gt;5min screencast&lt;/a&gt; of some of the important features.&lt;br/&gt;&lt;br/&gt;

A brief google search revealed that there are several people using and developing ZFS for Mac. There is a Mac ZFS porting project at &lt;a href="http://zfs.macosforge.org/"&gt;http://zfs.macosforge.org&lt;/a&gt; and I found a lot of good info at &lt;a href="http://alblue.blogspot.com/search/label/zfs"&gt;AlBlue's blog&lt;/a&gt;.&lt;br/&gt;&lt;br/&gt;

Some noteworthy info:&lt;ul&gt;&lt;li&gt;The current ZFS port (build 119) is based on ZFS code that shipped with Solaris build 72&lt;/li&gt;&lt;li&gt;It's currently not possible to boot Mac OS X from a ZFS filesystem&lt;/li&gt;&lt;li&gt;Finder integration is not perfect yet - Finder lists a ZFS pool as an unmountable drive under devices&lt;/li&gt;&lt;li&gt;There are several reports of kernel panics, most of which appeared in connection to the use of cheap external USB disks (I haven't experienced any)&lt;/li&gt;&lt;li&gt;There are &lt;a href="http://zfs.macosforge.org/trac/wiki/issues"&gt;a bunch of minor issues&lt;/a&gt;, which I'm sure will eventually go away.&lt;/li&gt;&lt;/ul&gt;None of the above was a show stopper for me, so I went ahead with the installation. My plan was simple - repartition the internal hard drive to a small bootable partition and a large partition used by ZFS, which will hold my home directory and other filesystems.&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;Install ZFS&lt;/h3&gt;Even though Mac OS X 10.5 comes with ZFS support, that support is read-only. To really use ZFS, the full ZFS implementation must be installed.&lt;br/&gt;&lt;br/&gt;

The installation is very simple and can be done by following these instructions: &lt;a href="http://zfs.macosforge.org/trac/wiki/downloads"&gt;http://zfs.macosforge.org/trac/wiki/downloads&lt;/a&gt;. Alternatively, AlBlue created a &lt;a href="http://alblue.blogspot.com/2008/11/zfs-119-on-mac-os-x.html"&gt;fancy installer&lt;/a&gt; for the lazy ones out there.&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;Repartition Disk&lt;/h3&gt;Once ZFS was installed and the OS rebooted, I could repartition the internal disk. If you are using an external hard drive, you'll most likely need to use the &lt;code&gt;zpool&lt;/code&gt; command instead.&lt;br/&gt;&lt;br/&gt;

First let's check what the disk looks like:
&lt;pre&gt;
$ diskutil list
/dev/disk0
#:                       TYPE NAME                    SIZE       IDENTIFIER
0:      GUID_partition_scheme                        *298.1 Gi   disk0
1:                        EFI                         200.0 Mi   disk0s1
2:                  Apple_HFS boot                    297.8 Gi   disk0s2&lt;/pre&gt;Good, the internal disk was identified as &lt;code&gt;/dev/disk0&lt;/code&gt; and it currently contains an EFI (boot) slice and ~300G data slice/partition.

Let's repartition the disk so that it contains two data partitions.&lt;pre&gt;$ sudo diskutil resizeVolume disk0s2 40G ZFS tank 257G
Password:
Started resizing on disk disk0s2 boot
Verifying
Resizing Volume
Adjusting Partitions
Formatting new partitions
Formatting disk0s3 as ZFS File System with name tank
[ + 0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100% ]
Finished resizing on disk disk0
/dev/disk0
#:                       TYPE NAME                    SIZE       IDENTIFIER
0:      GUID_partition_scheme                        *298.1 Gi   disk0
1:                        EFI                         200.0 Mi   disk0s1
2:                  Apple_HFS boot                    39.9 Gi    disk0s2
3:                        ZFS tank                    252.0 Gi   disk0s3
&lt;/pre&gt;&lt;br/&gt;&lt;br/&gt;

Great, the disk was repartitioned and the existing data partition, which I call &lt;code&gt;boot&lt;/code&gt;, was resized into a smaller 40GB partition and the extra space was used to create a ZFS pool called &lt;code&gt;tank&lt;/code&gt;. Btw all the data on the &lt;code&gt;boot&lt;/code&gt; partition was preserved.&lt;br/&gt;&lt;br/&gt;

Let's check my new pool:
&lt;pre&gt;
$ zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
tank                    256G    360K    256G     0%  ONLINE     -&lt;/pre&gt;&lt;pre&gt;$ zpool status
pool: tank
state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
 still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
 pool will no longer be accessible on older software versions.
scrub: none requested
config:

 NAME        STATE     READ WRITE CKSUM
 tank        ONLINE       0     0     0
   disk0s3   ONLINE       0     0     0

errors: No known data errors
&lt;/pre&gt;

The warning above just means that a new ZFS storage format is available but is not used by the current pool. As far as I could find, there are no benefits to upgrading to the new format on Mac, and if I did upgrade, I would lose compatibility with Macs that have only the read-only ZFS support.&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;Create Filesystems&lt;/h3&gt;
So now that the new pool exists, I can create a shiny new filesystem using a single command:
&lt;pre&gt;
$ sudo zfs create tank/me3x
$ zfs list
NAME        USED  AVAIL  REFER  MOUNTPOINT
tank        388K   252G   270K  /Volumes/tank
tank/me3x    19K   252G    19K  /Volumes/tank/me3x&lt;/pre&gt;To configure this new filesystem as my home directory, I created a temporary admin account, logged in under this account and mounted the ZFS fs as /Users/me3x:&lt;pre&gt;$ sudo mv /Users/me3x /Users/me3x.hfs
$ sudo zfs set mountpoint=/Users/me3x tank/me3x
# the trailing slash makes cp copy the directory's contents rather than the directory itself
$ sudo cp -Rp /Users/me3x.hfs/ /Users/me3x
&lt;/pre&gt;

That's it. My Mac account now resides on a ZFS file system. Now I can finally enjoy all the benefits of using ZFS on my &lt;a href="http://opensolaris.org/os/"&gt;OpenSolaris&lt;/a&gt; box in my office as well as on my Mac. Bye bye HFS, I won't miss you! &lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/lOgusv8Iw_Q" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/4122874931418174796/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=4122874931418174796" title="30 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/4122874931418174796?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/4122874931418174796?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/lOgusv8Iw_Q/using-zfs-with-mac-os-x-105.html" title="Using ZFS with Mac OS X 10.5" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>30</thr:total><feedburner:origLink>http://blog.igorminar.com/2009/01/using-zfs-with-mac-os-x-105.html</feedburner:origLink></entry><entry gd:etag="W/&quot;C0YCR38_fip7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-7339631980403720090</id><published>2009-01-11T16:16:00.000-08:00</published><updated>2010-10-24T18:46:06.146-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T18:46:06.146-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="JRuby" /><category scheme="http://www.blogger.com/atom/ns#" 
term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="Rails" /><title>Freezing activerecord-jdbc Gems into a Rails Project</title><content type="html">Over the Christmas break a Slovak friend of mine (Hi Martin!) asked me to build a simple book library management app for a school in the Philippines where he's been volunteering for the past year. I thought to myself that if someone can volunteer one year of his life in such an amazing way, I could spend a few hours to help him out too.&lt;br/&gt;&lt;br/&gt;

Since from his description it was obvious that he was looking for a low maintenance solution, I thought that a Rails application with an embedded database would be a good choice. I worked with &lt;a href="http://db.apache.org/derby/"&gt;derby (JavaDB)&lt;/a&gt; in the past and I knew that derby drivers were already available as an &lt;a href="http://kenai.com/projects/activerecord-jdbc/pages/Home"&gt;active-record adapter gem&lt;/a&gt;, so I thought that it would be pretty simple to set up a dev environment using Rails, JRuby, and embedded derby db. Surprisingly there were a few issues along the way.&lt;br/&gt;&lt;br/&gt;

I started with defining the database config in &lt;code&gt;config/database.yml&lt;/code&gt;:
&lt;pre&gt;
development:
  adapter: jdbcderby
  database: db/library_development
  pool: 5
  timeout: 5000
...
...
&lt;/pre&gt;
The database files for the dev db will be stored under &lt;code&gt;RAILS_ROOT/db/library_development&lt;/code&gt;&lt;br/&gt;&lt;br/&gt;

Secondly I specified the gem dependency in &lt;code&gt;config/environment.rb&lt;/code&gt; (you gotta love this Rails 2.1+ feature):&lt;pre&gt;
Rails::Initializer.run do |config|
...
config.gem "activerecord-jdbcderby-adapter", :version =&gt; '0.9', :lib =&gt; 'active_record/connection_adapters/jdbcderby_adapter'
...
end&lt;/pre&gt;Note that you must specify the &lt;code&gt;:lib&lt;/code&gt; parameter, otherwise Rails won't be able to initialize the gem and you'll end up with:&lt;pre&gt;no such file to load -- activerecord-jdbcderby-adapter&lt;/pre&gt;So far so good. Now let's install the gems we depend on:&lt;pre&gt;$ jruby -S rake gems:install
(in /Users/me3x/Development/library)
rake aborted!
Please install the jdbcderby adapter: `gem install activerecord-jdbcderby-adapter` (no such file to load -- active_record/connection_adapters/jdbcderby_adapter)

(See full trace by running task with --trace)
&lt;/pre&gt;

Huh? I asked rake to install gems and I get an error that I need to install gems first? It turns out that this error comes from ActiveRecord, which tries to initialize the db according to &lt;code&gt;database.yml&lt;/code&gt; before &lt;code&gt;environment.rb&lt;/code&gt; gets read.&lt;br/&gt;&lt;br/&gt;

Ok, so let's install the db dependencies manually:
&lt;pre&gt;
$ sudo jruby -S gem install activerecord-jdbcderby-adapter
Password:
JRuby limited openssl loaded. gem install jruby-openssl for full support.
http://wiki.jruby.org/wiki/JRuby_Builtin_OpenSSL
Successfully installed activerecord-jdbc-adapter-0.9
Successfully installed jdbc-derby-10.3.2.1
Successfully installed activerecord-jdbcderby-adapter-0.9
3 gems installed
Installing ri documentation for activerecord-jdbc-adapter-0.9...
Installing ri documentation for jdbc-derby-10.3.2.1...
Installing ri documentation for activerecord-jdbcderby-adapter-0.9...
Installing RDoc documentation for activerecord-jdbc-adapter-0.9...
Installing RDoc documentation for jdbc-derby-10.3.2.1...
Installing RDoc documentation for activerecord-jdbcderby-adapter-0.9...&lt;/pre&gt;
Cool, let's check if all the dependencies are available:&lt;pre&gt;$ jruby -S rake gems
(in /Users/me3x/Development/library)
- [I] activerecord-jdbcderby-adapter = 0.9
- [I] activerecord-jdbc-adapter = 0.9
- [I] jdbc-derby = 10.3.2.1

I = Installed
F = Frozen
R = Framework (loaded before rails starts)
&lt;/pre&gt;

Yay, all dependencies are &lt;i&gt;installed&lt;/i&gt;.&lt;br/&gt;&lt;br/&gt;

In the past when dependencies couldn't be declared in environment.rb, I found developing with frozen rails and gems much more manageable, especially when the app is being developed by more than one person. This also made for fewer deployment surprises. With the &lt;code&gt;config.gem&lt;/code&gt; defined dependencies, the situation changes quite a bit, but there are situations when it still makes sense to freeze gems into the project. So let's freeze the gems:
&lt;pre&gt;
$ jruby -S rake gems:unpack:dependencies
(in /Users/me3x/Development/library)
WARNING:  Installing to ~/.gem since /usr/local/jruby/jruby-1.1.6/lib/ruby/gems/1.8 and
   /usr/local/jruby/current/bin aren't both writable.
Unpacked gem: '/Users/me3x/Development/library/vendor/gems/activerecord-jdbcderby-adapter-0.9'
Unpacked gem: '/Users/me3x/Development/library/vendor/gems/activerecord-jdbc-adapter-0.9'
Unpacked gem: '/Users/me3x/Development/library/vendor/gems/jdbc-derby-10.3.2.1'
&lt;/pre&gt;

Looks good, let's check it:
&lt;pre&gt;
$ jruby -S rake gems
(in /Users/me3x/Development/library)
- [F] activerecord-jdbcderby-adapter = 0.9
- [F] activerecord-jdbc-adapter = 0.9
- [F] jdbc-derby = 10.3.2.1

I = Installed
F = Frozen
R = Framework (loaded before rails starts)&lt;/pre&gt;

Nice all the dependencies are now &lt;i&gt;frozen&lt;/i&gt;!&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/_zqat4tOAlE" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/7339631980403720090/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=7339631980403720090" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/7339631980403720090?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/7339631980403720090?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/_zqat4tOAlE/freezing-activerecord-jdbc-gems-into.html" title="Freezing activerecord-jdbc Gems into a Rails Project" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total><feedburner:origLink>http://blog.igorminar.com/2009/01/freezing-activerecord-jdbc-gems-into.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CUYEQno4cSp7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-6179506886443323648</id><published>2008-10-30T14:42:00.000-07:00</published><updated>2010-10-24T19:18:23.439-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T19:18:23.439-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="Glassfish" /><title>How to Install a Glassfish Patch</title><content type="html">Recently I've been working quite extensively with &lt;a 
href="https://glassfish.dev.java.net"&gt;glassfish&lt;/a&gt; and had a need to apply a patch for some of the core code. &lt;br/&gt;&lt;br/&gt;

To my big surprise I was not able to find the official way to apply patches to a glassfish installation. I found several patches posted on the issue tracker and paying customers have access to &lt;a href="http://blogs.sun.com/theaquarium/entry/new_patch_releases_for_sun"&gt;tested and supported patch bundles&lt;/a&gt;, which are released outside of the regular release cycle. But even some extensive googling didn't easily reveal how to apply them.&lt;br/&gt;&lt;br/&gt;

Then luckily I found &lt;a href="http://www.nabble.com/Selectively-replacing-server-classes-td13456703.html"&gt;this discussion&lt;/a&gt; on the glassfish mailing list, which describes the process.&lt;br/&gt;&lt;br/&gt;

To make life easier for others and google to index this information, here is a brief recap:&lt;ol&gt;&lt;li&gt;Create a directory &lt;code&gt;&amp;lt;glassfishroot&amp;gt;/lib/patches&lt;/code&gt;&lt;/li&gt;&lt;li&gt;Copy the jar with your patch into this directory&lt;/li&gt;&lt;li&gt;Edit your &lt;code&gt;&amp;lt;glassfishroot&amp;gt;/domain/&amp;lt;domainname&amp;gt;/config/domain.xml&lt;/code&gt; and add attribute &lt;code&gt;classpath-prefix="${com.sun.aas.installRoot}/lib/patches/&amp;lt;patchname.jar&amp;gt;"&lt;/code&gt; to the &lt;code&gt;&amp;lt;java-config ...&amp;gt;&lt;/code&gt; node&lt;/li&gt;&lt;li&gt;Restart your domain&lt;/li&gt;&lt;/ol&gt;
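As a rough illustration of step 3, the edited element might look like the fragment below. The jar name &lt;code&gt;my-patch.jar&lt;/code&gt; is only a placeholder, and the element's other attributes (elided here) stay exactly as they are in your &lt;code&gt;domain.xml&lt;/code&gt;:&lt;pre&gt;&amp;lt;java-config classpath-prefix="${com.sun.aas.installRoot}/lib/patches/my-patch.jar" ...&amp;gt;&lt;/pre&gt;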
I tested this with Glassfish v2, I'll have to check if it works with the upcoming v3 as well.&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/5JCY90JWs4o" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/6179506886443323648/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=6179506886443323648" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/6179506886443323648?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/6179506886443323648?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/5JCY90JWs4o/how-to-install-glassfish-patch.html" title="How to Install a Glassfish Patch" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total><feedburner:origLink>http://blog.igorminar.com/2008/10/how-to-install-glassfish-patch.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CEQBQ3g-fyp7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-1718141255510051649</id><published>2008-10-13T18:50:00.000-07:00</published><updated>2010-10-24T19:05:52.657-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T19:05:52.657-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Projects" /><category scheme="http://www.blogger.com/atom/ns#" term="Software Engineering" /><category scheme="http://www.blogger.com/atom/ns#" term="SunWikis" /><title>My Confluence 3.0 Wishlist</title><content 
type="html">Confluence 2.9 was released last month and I've seen references to 2.10 in the Confluence issue tracker, so I expect to see it out in 1-2 months. That makes me think about what's next.&lt;br/&gt;&lt;br/&gt;

As a part of my adventures of working on Sun's external wiki &lt;a href="http://wikis.sun.com/"&gt;wikis.sun.com&lt;/a&gt;, I've been working on Confluence plugins and even the Confluence core code for a year and a half now, adding new features, enhancing the existing features and very often fixing bugs. Sometimes it was trivial to enhance the code or fix a bug, other times it was not, but what I want to write about today are things that were not possible at all without irreversibly forking the code.&lt;br/&gt;&lt;br/&gt;

Confluence 3.0 should be a version that really deserves to have the first digit incremented. Not because &lt;a href="http://confluence.atlassian.com/display/DOC/Confluence+Release+Cycle"&gt;marketing said it's time for that&lt;/a&gt;, but because the changes in the application are so significant.&lt;br/&gt;&lt;br/&gt;

I'm sure that Atlassian has lots of ideas about what Confluence 3.0 should look like, but Atlassian guys, in case you start to run out of ideas, here is my wish list:&lt;br/&gt;&lt;br/&gt;

&lt;h3&gt;Fix the Database Schema&lt;/h3&gt;
Confluence has been in development for years and the database schema definitely shows that. Since the database is the heart of the application, I think it deserves a lot of attention, and a major performance boost could be gained by cleaning it up.&lt;br/&gt;&lt;br/&gt;

Specific improvements:
&lt;ul&gt;&lt;li&gt;Establish and in the future enforce naming conventions&lt;/li&gt;&lt;li&gt;Replace all the natural foreign keys with surrogate keys, e.g. user name, spacekey, group name should be replaced with ids in all the referencing tables (this would finally allow &lt;a href="http://jira.atlassian.com/browse/CONF-4063"&gt;CONF-4063&lt;/a&gt; to be implemented)&lt;/li&gt;&lt;li&gt;Add caches for the lower function (&lt;a href="http://jira.atlassian.com/browse/CONF-10030"&gt;patch&lt;/a&gt;) and maybe &lt;a href="http://railscasts.com/episodes/23"&gt;counter caches&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Rework the Clustering&lt;/h3&gt;Clustering is usually supposed to fulfill two functions: scalability and robustness. In the case of Confluence mainly the second attribute is missing. In fact, I'd go as far as saying that a Confluence cluster is less robust than a single instance of Confluence. Why? Because the way it is implemented makes the entire cluster vulnerable when one node has problems.&lt;br/&gt;&lt;br/&gt;

I personally experienced several cluster lock-ups or crashes, usually triggered by a separate Confluence bug whose effect was multiplied by the clustering code. One such bug is &lt;a href="http://jira.atlassian.com/browse/CONF-12319"&gt;CONF-12319&lt;/a&gt;.&lt;br/&gt;&lt;br/&gt;

&lt;a href="http://www.blogger.com/www.parleys.com/display/PARLEYS/Pragmatic+Clustering+Guide"&gt;Mike's presentation&lt;/a&gt; covers quite a few design goals behind the implementation in Confluence. Clustering can really get ugly and complicated and Mike covered it pretty well. Unfortunately the distributed share part of the clustering makes Confluence prone to problems.&lt;br/&gt;&lt;br/&gt;

One of the clustering goals that Mike emphasizes in his presentation is that clustering should be "admin-friendly" (low admin overhead and easy setup). While I agree with the low overhead part, the easiness of setup should not compromise the goals which clustering is trying to fulfill in the first place. Clustering is for people who are serious about running Confluence, and as such should be expected to be qualified for the job.&lt;br/&gt;&lt;br/&gt;

Specific improvements:&lt;ul&gt;&lt;li&gt;Either reevaluate the distributed share clustering so that it is super robust, or consider implementing clustering via a centralized share&lt;/li&gt;&lt;li&gt;Avoid shutting down the entire cluster when "cluster panic" is detected. A better solution, which avoids unnecessary downtime, would be to shut down all the nodes, except for the nodes properly clustered with the oldest node.&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Clean Up the HTML and CSS Code&lt;/h3&gt;The html code that comes out of Confluence is horrendous. While the rendered output looks pretty pleasant, looking under the hood (browsing the source code in a browser) is not recommended for pregnant women, men with ED, high cholesterol, and generally not recommended for people over 50.&lt;br/&gt;&lt;br/&gt;

Some improvements were done in this area in the recent releases, but all of them were just minor cosmetic surgeries. Confluence really needs major surgery that will bring the html code up to current standards. The benefit of this will be much faster page loads and code that is easier to maintain and enhance.&lt;br/&gt;&lt;br/&gt;

Specific improvements:&lt;ul&gt;&lt;li&gt;Rewrite most of the templates and macros to make them XHTML 1.0 compliant&lt;/li&gt;&lt;li&gt;Minify and combine javascript and css files (&lt;a href="http://jira.atlassian.com/browse/CONF-8622"&gt;CONF-8622&lt;/a&gt;)&lt;/li&gt;&lt;li&gt;Use &lt;a href="http://alistapart.com/articles/sprites"&gt;image sprites&lt;/a&gt; to even further speed up page loads (especially in the rich text editor)&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Redo the URI Namespace&lt;/h3&gt;Human friendly URIs and URLs are becoming more and more important on today's Internet. Confluence is not doing well in this area.&lt;br/&gt;&lt;br/&gt;

Specific improvements:&lt;ul&gt;&lt;li&gt;&lt;code&gt;/display/MySpace/My+Page&lt;/code&gt; - is the &lt;code&gt;/display&lt;/code&gt; part really necessary? Can't we do &lt;code&gt;/MySpace/My+Page&lt;/code&gt;?&lt;/li&gt;&lt;li&gt;&lt;code&gt;/pages/diffpages.action?pageId=2490471&amp;amp;originalId=45714293&lt;/code&gt; - What is this? I don't know. How about: &lt;code&gt;/MyWiki/My+Page/diff/23:22&lt;/code&gt;. I think that actually means something. There might be a better format, this is just a thought.&lt;/li&gt;&lt;li&gt;I think in general redoing the URI name space using REST conventions would be interesting.&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Improve Atlassian-Renderer&lt;/h3&gt;When I was creating some patches for the atlassian-renderer I was surprised to find that atlassian-renderer, the module responsible for rendering wiki markup into html, is full of hardcoded html snippets. The main reason why this surprised me is that most of the Confluence code is pluggable, which allows parts of the code to be replaced with a better version without much trouble. This is not the case with the renderer. And this presents two problems: it's not possible to get Confluence to &lt;b&gt;directly&lt;/b&gt; render anything other than html (pdf and doc are only derived from the html), and it's not possible to use anything other than Confluence markup as the input for the renderer.&lt;br/&gt;&lt;br/&gt;

The first problem makes me unable to render custom output like &lt;a href="http://en.wikipedia.org/wiki/DocBook"&gt;docbook&lt;/a&gt; or to improve the PDF output, which is pretty poor.&lt;br/&gt;&lt;br/&gt;

The second issue means that all the customers that use Confluence are locked-in because all the content created via Confluence is Confluence-specific and can't be easily moved to a different wiki engine when needed.&lt;br/&gt;&lt;br/&gt;

In my opinion the sooner all major wiki engine developers settle on one wiki markup standard the sooner we will all be better off. This might be especially difficult for Atlassian to swallow and implement, because they standardized on their own markup that they also use in their other products.&lt;br/&gt;&lt;br/&gt;

An interesting initiative that is gaining a lot of traction is &lt;a href="http://www.wikicreole.org/"&gt;Creole&lt;/a&gt;, a standardized wiki markup. Confluence is one of the few major wiki players that doesn't support this initiative.&lt;br/&gt;&lt;br/&gt;

Specific improvements:&lt;ul&gt;&lt;li&gt;Split the current renderer into two pluggable parts: parser and renderer&lt;/li&gt;&lt;li&gt;Implement Creole support (&lt;a href="http://jira.atlassian.com/browse/CONF-12077"&gt;CONF-12077&lt;/a&gt;)&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Improve Developer Documentation
&lt;/h3&gt;I spent countless hours, especially in the beginning, trying to figure out how Confluence works and how Confluence plugins should be written. I learned some new tricks, and that's the good part; the bad part is that the experience could have been much better if the code contained more javadoc comments and if the plugin interfaces, and mainly the configuration file format, were better documented.&lt;br/&gt;&lt;br/&gt;

Specific improvements:&lt;ul&gt;&lt;li&gt;Add JavaDoc comments where missing&lt;/li&gt;&lt;li&gt;Finally provide &lt;b&gt;complete&lt;/b&gt; specification and documentation for the plugin config file (&lt;a href="http://jira.atlassian.com/browse/JRA-12183"&gt;JRA-12183&lt;/a&gt;)&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Thanks! :-)&lt;/h3&gt;That's about as much as I can think of for now. There are probably other things that I missed, and then there are enhancements around security, which I know are already on the roadmap.&lt;br/&gt;&lt;br/&gt;

I understand that most of the changes above will create incompatibilities with many existing themes and plugins, but hey, Confluence 3 will happen only once EVAR and releases like this are expected to bring major incompatibilities. Data can always be migrated automatically and existing plugins and themes will be migrated when there are people interested in using them and proper migration instructions are provided.&lt;br/&gt;&lt;br/&gt;

I hope that Confluence 3 will not be a "marketing" release, but instead something really cool that all users &lt;b&gt;and&lt;/b&gt; developers will enjoy working with.&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/f3TGjEFcOoE" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/1718141255510051649/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=1718141255510051649" title="10 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/1718141255510051649?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/1718141255510051649?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/f3TGjEFcOoE/my-confluence-30-wishlist.html" title="My Confluence 3.0 Wishlist" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>10</thr:total><feedburner:origLink>http://blog.igorminar.com/2008/10/my-confluence-30-wishlist.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CEMFRns-cCp7ImA9Wx5UGUo.&quot;"><id>tag:blogger.com,1999:blog-6406593750327945950.post-114328985511979489</id><published>2008-09-11T20:21:00.000-07:00</published><updated>2010-10-24T19:06:57.558-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-24T19:06:57.558-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Security" /><category scheme="http://www.blogger.com/atom/ns#" term="OpenSolaris" /><category scheme="http://www.blogger.com/atom/ns#" term="MacOS" 
/><title>Unintrusive but secure passwordless ssh authentication</title><content type="html">On a daily basis I need to log in to many remote servers inside or outside of Sun via SSH, often dozens of times per day. This can get pretty tiresome if you need to type in your password with every log in.&lt;br/&gt;&lt;br/&gt;

Some suggest setting up so-called "passwordless" authentication by generating ssh keys and specifying an empty passphrase for the private key. This does give you passwordless authentication, but it also decreases security. Should anyone get hold of your private key, (s)he'll get access to all of your remote systems.&lt;br/&gt;&lt;br/&gt;

&lt;a href="http://en.wikipedia.org/wiki/Ssh-agent"&gt;&lt;code&gt;ssh-agent&lt;/code&gt;&lt;/a&gt; can help a lot in keeping the security level high and minimizing the number of times you need to type in the password. However, if you use a terminal with tabs or use both local and remote terminals on your workstation, you'll end up running many ssh-agent processes and having to authenticate every time you start such a process, which diminishes most of the conveniences of using &lt;code&gt;ssh-agent&lt;/code&gt;.&lt;br/&gt;&lt;br/&gt;

Frustrated with this situation and with a bit of help from &lt;a href="http://blogs.sun.com/martin"&gt;Martin&lt;/a&gt;, I created a shell script, which I added to my &lt;code&gt;.bash_profile&lt;/code&gt; startup script. All I have to do now is to authenticate when my first terminal session starts and I'm good until the next time I restart my OS. sweeeet...&lt;br/&gt;&lt;br/&gt;

Here is how you could set it up on a &lt;b&gt;workstation&lt;/b&gt; and a &lt;b&gt;remote-server&lt;/b&gt;:&lt;br/&gt;&lt;br/&gt;

First, if you haven't generated your private/public ssh key pair, do that now:&lt;pre&gt;workstation $ &lt;b&gt;ssh-keygen&lt;/b&gt;
Generating public/private rsa key pair.
Enter file in which to save the key (/Users/user/.ssh/id_rsa):
Enter passphrase (empty for no passphrase): &lt;b&gt;**********************&lt;/b&gt;
Enter same passphrase again: &lt;b&gt;**********************&lt;/b&gt;
Your identification has been saved in id_rsa.
Your public key has been saved in id_rsa.pub.
The key fingerprint is:
01:a6:95:23:1c:74:53:c7:f4:87:07:a2:50:ef:99:16 user@Computer.local&lt;/pre&gt;
Now make sure that the file system permissions are set up correctly:&lt;pre&gt;workstation $ &lt;b&gt;cd ~/.ssh/&lt;/b&gt;
workstation $ &lt;b&gt;ls -l&lt;/b&gt;
total 56
-rw-------  1 user  staff  1743 Aug 31 00:13 id_rsa
-rw-r--r--  1 user  staff   398 Aug 31 00:13 id_rsa.pub
&lt;/pre&gt;The &lt;code&gt;id_rsa&lt;/code&gt; file must be readable only by its owner; otherwise ssh will ignore the key.&lt;br/&gt;&lt;br/&gt;

On the &lt;b&gt;remote server&lt;/b&gt; you need to authorize your newly generated key pair by appending its public key to the &lt;code&gt;~/.ssh/authorized_keys&lt;/code&gt; file under your remote home directory:&lt;pre&gt;workstation $ &lt;b&gt;cat ~/.ssh/id_rsa.pub |ssh user@remote-server 'sh -c "cat - &gt;&gt;~/.ssh/authorized_keys"'&lt;/b&gt;&lt;/pre&gt;
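If the remote account has no &lt;code&gt;~/.ssh&lt;/code&gt; directory yet, the append above fails. A hedged variant that creates the directory first and keeps newly created files private (this assumes a POSIX shell on the remote side) could look like this:&lt;pre&gt;workstation $ &lt;b&gt;cat ~/.ssh/id_rsa.pub | ssh user@remote-server 'umask 077; mkdir -p ~/.ssh; cat &gt;&gt; ~/.ssh/authorized_keys'&lt;/b&gt;&lt;/pre&gt;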
Now you can try to log in:&lt;pre&gt;workstation $ &lt;b&gt;ssh user@remote-server&lt;/b&gt;
Enter passphrase for /Users/user/.ssh/id_rsa: **********************
Identity added: /Users/user/.ssh/id_rsa (/Users/user/.ssh/id_rsa)
Last login: Thu Sep 11 20:19:19 2008
remote-server $&lt;/pre&gt;&lt;br/&gt;
If you open a new tab in your terminal and try to log in again, you'll be asked to enter the passphrase yet again. This is where my script becomes useful. First download the script from &lt;a href="http://mediacast.sun.com/"&gt;Mediacast&lt;/a&gt;: &lt;a href="http://mediacast.sun.com/users/IgorMinar/media/ssh-agent-init.sh/details"&gt;ssh-agent-init.sh&lt;/a&gt; and store it somewhere in your home directory&lt;pre&gt;workstation $ &lt;b&gt;mkdir ~/bin&lt;/b&gt;
workstation $ &lt;b&gt;cd ~/bin&lt;/b&gt;
workstation $ &lt;b&gt;wget http://mediacast.sun.com/users/IgorMinar/media/ssh-agent-init.sh&lt;/b&gt;
workstation $ &lt;b&gt;chmod u+x ~/bin/ssh-agent-init.sh&lt;/b&gt;&lt;/pre&gt;&lt;br/&gt;
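For readers who want a feel for the technique, a minimal sketch of an agent-sharing startup script follows. This is not the original &lt;code&gt;ssh-agent-init.sh&lt;/code&gt;; the file name &lt;code&gt;~/.ssh/agent-env&lt;/code&gt; and the way the &lt;code&gt;NONINTERACTIVE&lt;/code&gt; variable is interpreted are assumptions made purely for illustration:&lt;pre&gt;# Hypothetical sketch, NOT the original ssh-agent-init.sh.
# Assumed location for sharing the agent settings between shells.
agent_env="$HOME/.ssh/agent-env"

# Pick up the settings of a previously started agent, if there are any.
if [ -f "$agent_env" ]; then
    . "$agent_env" &gt; /dev/null
fi

# ssh-add -l exits with status 2 when no agent can be contacted.
ssh-add -l &gt; /dev/null 2&gt; /dev/null
if [ $? -eq 2 ]; then
    # Start a new agent and save its settings, readable by the owner only.
    (umask 077; ssh-agent -s &gt; "$agent_env")
    . "$agent_env" &gt; /dev/null
fi

# Status 1 means the agent is running but holds no keys yet.
# Prompt for the passphrase only when NONINTERACTIVE is unset (assumption).
ssh-add -l &gt; /dev/null 2&gt; /dev/null
if [ $? -eq 1 ]; then
    if [ -z "$NONINTERACTIVE" ]; then
        ssh-add
    fi
fi&lt;/pre&gt;The sketch relies on the exit status of &lt;code&gt;ssh-add -l&lt;/code&gt; (0 when keys are loaded, 1 when the agent has none, 2 when no agent is reachable), so a stale settings file left over from before a reboot simply leads to a fresh agent being started.&lt;br/&gt;&lt;br/&gt;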
The next (last) step is optional; if you want to start the script manually, you can skip it.&lt;br/&gt;&lt;br/&gt;

I wanted to have this script automatically invoked when I start my terminal for the first time in the interactive mode. All I needed to modify were my &lt;code&gt;.bash_profile&lt;/code&gt; (used for interactive sessions) and &lt;code&gt;.bashrc&lt;/code&gt; (used for non-interactive sessions) startup scripts for my bash shell (modifications are in italics):&lt;pre&gt;workstation $ &lt;b&gt;cat ~/.bash_profile&lt;/b&gt;
...
...

&lt;i&gt;. ~/bin/ssh-agent-init.sh&lt;/i&gt;

workstation $ &lt;b&gt;cat ~/.bashrc&lt;/b&gt;
&lt;i&gt;export NONINTERACTIVE=1&lt;/i&gt;
. ~/.bash_profile
&lt;/pre&gt;(Note: I source the &lt;code&gt;~/.bash_profile&lt;/code&gt; script from the &lt;code&gt;~/.bashrc&lt;/code&gt; script to enable code reuse between the two scripts.)&lt;br/&gt;&lt;br/&gt;

That's it! If you now try to open a terminal tab for the first time, you'll be asked for the passphrase. Once that is done, any new tab or any other shell session created under the same account will reuse the same ssh-agent process.&lt;br/&gt;&lt;br/&gt;

I've been using this script on my MacOS X laptop as well as &lt;a href="http://opensolaris.com/"&gt;OpenSolaris&lt;/a&gt; workstation for a few weeks now and it's been working like charm.&lt;img src="http://feeds.feedburner.com/~r/IgorMinarsBlog/~4/E1dV7F9YuJI" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.igorminar.com/feeds/114328985511979489/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6406593750327945950&amp;postID=114328985511979489" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/114328985511979489?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6406593750327945950/posts/default/114328985511979489?v=2" /><link rel="alternate" type="text/html" href="http://feeds.igorminar.com/~r/IgorMinarsBlog/~3/E1dV7F9YuJI/unintrusive-but-secure-passwordless-ssh.html" title="Unintrusive but secure passwordless ssh authentication" /><author><name>Igor Minar</name><uri>http://www.blogger.com/profile/03520548417275543432</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total><feedburner:origLink>http://blog.igorminar.com/2008/09/unintrusive-but-secure-passwordless-ssh.html</feedburner:origLink></entry></feed>
