30 September 2008
My plan is to put together a few clumps of slides, prepare a cluster on EC2, and see if anyone wants to hear about JBoss Rails. I'll of course put them online sometime before/during/after the camp, since that's part of the rules.
29 September 2008
JBoss on Rails will indeed cluster!
After modifying and dropping my jboss-rails.deployer into an 'all' configured server of JBoss AS 5, and firing up 3 instances on my localhost (non-trivial on OSX...):
10:43:28,409 INFO [RPCManagerImpl] Received new cluster view: [127.0.0.10:63740|2] [127.0.0.10:63740, 127.0.0.11:63747, 127.0.0.12:63749]
10:43:28,435 INFO [RPCManagerImpl] Cache local address is 127.0.0.12:63749
10:43:28,469 INFO [ComponentRegistry] JBoss Cache version: JBossCache 'Poblano' 2.2.0.GA
And I've got 3 nodes running the same Rails app, all sharing a cookie and a JBossCache cache. Nick Sieger's JRuby-Rack handles binding the Rails session to the actual servlet session, and JBossCache takes care of the rest.
A little 8-line perl round-robinning load-balancer is wired up through mod_rewrite in my Apache httpd.conf to throw requests to each of the nodes. Anything set in the session is immediately available at the next request which lands at a different node.
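For the curious, here is a rough shell sketch of the round-robin idea. It is not the perl script itself. It assumes the balancer is fed one lookup key per line on stdin and answers with one node per line on stdout, which is how a mod_rewrite external map program behaves. The node addresses come from the log above, but the 8080 HTTP port and the pick_node and balance names are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical node list: addresses from the cluster view, HTTP port assumed.
NODES="127.0.0.10:8080 127.0.0.11:8080 127.0.0.12:8080"

# Print the node that should serve request number $1 (counting from 0).
pick_node() {
    count=0
    for node in $NODES; do
        count=$((count + 1))
    done
    index=$(( $1 % count ))
    i=0
    for node in $NODES; do
        if [ "$i" -eq "$index" ]; then
            echo "$node"
            return 0
        fi
        i=$((i + 1))
    done
}

# Read one key per line and answer each with the next node in turn.
balance() {
    n=0
    while read -r _key; do
        pick_node "$n"
        n=$((n + 1))
    done
}
```

Piping four request lines through balance hands out the three nodes in order and then wraps around to the first.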
Further down the line, we can look at a clustered cache for caching AR models and view fragments. Not too shabby.
It should be fairly easy to create a nice Amazon EC2 AMI with Fedora+AS5+jboss-rails, plus some better Rake/capistrano tasks, and make for quick cluster deployment. Any EC2 experts wanting to jump in?
28 August 2008
Back in May, I was a manager.
I feebly attempted to direct 8 great guys and gals to further the goals of JBoss.org. After the Codehaus, you'd think I'd be able to help build an opensource community with fun and flair. But I came to realize that it's hard to build a community as an active effort. Instead, I think community develops as a by-product of a useful and well-run project. And that's under the control of the project leaders and contributors, not necessarily some external third party.
Back in May, I gave up being a manager.
Now, the day after Labor Day, fittingly enough, I'll be jumping back into the world of JBoss. But not as a manager. When I was burned out and felt like resigning, Mark Proctor and Sacha Labourey instead talked me into taking a sabbatical. And I'm truly grateful to them. Now, after unwinding for a few months, I asked to rejoin the team as an engineer. Through Sacha's patience and budget manipulation, I'm once again excited to go to work. I think JBoss should definitely be held up as a company that takes care of its people. They could've easily given me the boot, but instead they've been extremely kind and accommodating.
So, what will I be doing?
After talking to Java developers and Rubyists alike, my first goals are to look at Rails as just-another-way to write J2EE apps (or "JEE" I reckon, these days...). Yes, I know about (and plan to use) things like Warbler and JRuby-Rack. Both are good things.
But I also have full control of the deployment environment, so I can build a stack that makes things happier than "build and deploy a WAR".
Through the miracle of AS5 built on JBossMicrocontainer, along with the awesome VFS bits, it should be possible to deploy a Rails app in-situ, right from your working directory. There should be no reason to have to build a WAR while you're hacking a rails app. And deployment to a server should still involve capistrano (in my opinion). Stick to the Rails way of doing things, but make it Java under the covers.
Various blog posts have shown Rails apps on Glassfish in 12, 10, or 5 steps. My goal is to get it down to 1 step. And you should magically be able to pick up and use all the wonderful JEE bits that map to the Rails functionality the Railers of the world enjoy, without having to be aware of the JEE bits.
Speaking with Mark Newton (the guy who runs JBoss.org now), it seems sensible to view Rails as simply yet-another-programming-model for writing Java apps. The idea is to avoid leaky abstractions, so we're not having to write some pseudo RubyJava application.
Once we've got that base covered, then we can make fun and exciting Ruby bindings to all the powerful JBoss tools, such as Drools, ESB, Cache or MQ.
I expect to have a bit of fun with this. More fun than being a manager, certainly.
21 August 2008
In addition to the previously-mentioned JRuby mirror from Codehaus SVN to GitHub, I'm now also mirroring:
All are trunk-only mirrors, not picking up branches or tags. Since the JBoss repository path has about 77,000 subversion revisions, and at one point held any and all JBoss software ever written, I have not mirrored it in its entirety. Instead, I've only grabbed http://anonsvn.jboss.org/repos/jbossas/trunk back to revision 77,200. It'll mirror going forward, but the github repository does not include any ancient history.
For those of you playing along at home, the way to fetch just a cauterized "tip" from SVN to a git repository is to mirror as before, but for the initial "git svn fetch" command, add an SVN-style revision range:
git svn fetch -r77200:HEAD
For me, at least, trying to fetch the tip revision for the directory resulted in failure. Going back a few revisions, and using a range that includes HEAD, worked much better. Then just push to GitHub as normal, and start your rebase/push cronjob.
The JBoss projects are updated from SVN every 15 minutes. But we're updating from the anonymous SVN repository at JBoss, which itself is delayed from the developer repository by some amount of time. So, ultimately, the GitHub mirror should be mostly up-to-date, but could lag behind actual developer commits by up to an hour, I reckon.
If you're wanting to track these repositories using my git mirror, only track the vendor branch. I make no claims about the stability or sanity of the 'master' ref at any point in time. I will make sure 'vendor' exactly matches the Subversion history, though.
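A minimal sketch of what tracking only the vendor branch might look like on your end. The track_vendor helper name and the GIT variable are inventions of this sketch. The clone URL is left as an argument, since it depends on which mirror you're after.

```shell
#!/bin/sh
# Set GIT=echo to print the git commands instead of running them.
GIT=${GIT:-git}

# Clone the mirror at URL $1 into directory $2, then create a local
# 'vendor' branch that follows origin/vendor.
# (track_vendor is a hypothetical helper name.)
track_vendor() {
    $GIT clone "$1" "$2" || return 1
    ( cd "$2" && $GIT checkout -b vendor origin/vendor )
}
```

After that, staying on 'vendor' and pulling should simply pick up whatever the mirror has appended since the last time.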
20 August 2008
So, I'm gearing up to work on some Java+Ruby (via JRuby) stuff. The Java world still seems fairly entrenched in the cult of Subversion, while the Rubyists have gone with Git lately.
I'm still wrapping my mind around Git, but with GitHub, it's fairly easy and straight-forward. I paid my $7 for the micro account, to give me room to screw around.
There's quite a few posts about mirroring SVN to a Git repository, but I feel the need to add my own, of course.
My goal is to mirror the trunk of the JRuby project from Codehaus SVN to my account on GitHub. By doing this, I can track the trunk development, and also work on my own patches.
I started by creating an empty repository on my GitHub account, called 'jruby'.
Now, over on my always-on, Contegix-powered server, I create a brand new local git repository, also called jruby.
mkdir jruby
cd jruby
git init
Next I use 'git svn init' to set up the SVN repository as a remote code source to track. Using the -T switch points git to the trunk, and ignores branches and tags, which is fine for my purposes.
git svn init -T http://svn.codehaus.org/jruby/trunk/jruby/
That does not pull any code, but it lets my local working tree know that I'm going to be pulling from an SVN repository at some point. This setup only occurs in your local repository, and does not seem to ever get pushed to GitHub once we get to that point.
So, now we do the initial pull. Once again, this is on my always-on, Contegix-powered server, not my local laptop. I'm doing this on a server because towards the end, we'll be setting up a cronjob to accomplish it all.
git svn fetch
It'll think for a while, it'll slurp down the SVN revision history, it'll stop and ponder occasionally, and eventually, it'll be done. Woo-hoo! Our local working tree is now up-to-date with the subversion HEAD as of that moment.
To reduce disk-space used by your local repository, go ahead and run the garbage collector
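Git's garbage collector is invoked as "git gc". Here's a small sketch that also reports the size of the .git directory before and after, so you can see what you saved. The compact_repo name and the GIT variable are assumptions of this sketch, and it expects to be run from the top of the working tree.

```shell
#!/bin/sh
# Set GIT=echo (or GIT=true) to try the function without touching a real repo.
GIT=${GIT:-git}

# Run the garbage collector in the current repository and report the
# size of .git, in kilobytes, before and after.
# (compact_repo is a hypothetical helper name.)
compact_repo() {
    before=$(du -sk .git | cut -f1)
    $GIT gc
    after=$(du -sk .git | cut -f1)
    echo "before: ${before} KB, after: ${after} KB"
}
```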
On my system, that reduced the space from over 600mb to under 70mb.
Now, that's great, but it's still just on my local repository. Time to push it to GitHub. We're not going to follow their directions exactly, since this will ultimately be a cronjob and needs to use ssh. And I'm slightly paranoid about my ssh keys.
So, the first thing I do is create another keypair, for use only by my mirroring process, and only for pushing changes to GitHub. It has no passphrase. This allows me to keep my top-secret keys off my shared, always-on server. If these keys are compromised, all an attacker can use them for is to push changes to GitHub. Which, being revision-control, is more annoying than dangerous. (Hooray for "git reset").
ssh-keygen -t dsa -f .ssh/id_dsa_github_mirroring
Next, I edit my .ssh/config to add a "fake host" so that ssh connections invoked by git will use this new key.
As with all previous bits, this is still on my always-on server, not my local laptop.
Host githubmirror
    User git
    Hostname github.com
    IdentityFile /home/bob/.ssh/id_dsa_github_mirroring
This will turn any invocation of "ssh githubmirror" into "ssh git@github.com -i .ssh/id_dsa_github_mirroring".
I then installed id_dsa_github_mirroring.pub into my GitHub account.
Now, GitHub's instructions say to run this command to add the GitHub repository as a remote named "origin"
git remote add origin git@github.com:bobmcwhirter/jruby.git
Instead, we tweak it to use the "fake host" we added to .ssh/config
git remote add origin git@githubmirror:bobmcwhirter/jruby.git
We're almost done, I promise.
Next, we need to do the first push from my server up to GitHub. We first push to the 'master' branch, since the repo really wants to have a master branch.
git push origin master
Now, GitHub doesn't allow you to fork a repository you own, and since this mirror is owned by me, where can I do my own hacks and patches? The 'master' branch of course. But I still want an unmolested, straight-from-subversion mirror. So, I create a 'vendor' branch in my workspace. It's initialized to match 'master' exactly.
git checkout -b vendor
Now, I push that to GitHub, too.
git push origin vendor
Awesome. I now have two branches, identical at the moment, called "vendor" and "master".
Now, as far as I can tell, all the Subversion setup that we did only lives in the local repository on my always-on server. Anyone who clones from the GitHub repository will not have that stuff. They can of course do a 'git svn init' themselves, to add it to their local repository. But it doesn't flow through GitHub.
But that's fine, since I've been doing this on my always-on server anyhow. My workspace is sitting in the 'vendor' branch that's tracking the vendor branch from github.
I can pull the latest changes from Subversion by typing
git svn rebase
The 'rebase' command is neat, in that any changes that exist in the git repository are floated to be applied to whatever the latest HEAD is. But since I'm only concerned with a one-way SVN-to-Git mirror, there will never be any changes to float, and this will just tack on subsequent SVN commits as Git commits onto the 'vendor' branch. It'll leave the 'master' branch un-touched.
After rebasing, you gotta push the 'vendor' branch up to GitHub.
git push origin vendor
Now, type that every 15 minutes, and your 'vendor' branch will stay mostly up-to-date.
Or use cron.
I've cronned a script that fires every 15 minutes
#!/bin/sh
cd /home/bob/github-svn-mirrors/$1
git svn rebase
git push origin vendor
It's run with the repository name as the first (and only) argument
*/15 * * * * /home/bob/github-svn-mirrors/bin/mirror jruby
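If a rebase or push ever takes longer than 15 minutes, two cron runs could step on each other. Here's a hedged sketch of a slightly more defensive variant, which is not what I actually run. It skips a run when the previous one is still going, and it doesn't push if the rebase fails. The mirror function, the .mirror-lock directory name, and the MIRROR_ROOT and GIT variables are all assumptions made up for this sketch.

```shell
#!/bin/sh
# Where the local mirrors live; override for testing.
MIRROR_ROOT=${MIRROR_ROOT:-/home/bob/github-svn-mirrors}
# Set GIT=echo to print the git commands instead of running them.
GIT=${GIT:-git}

# Rebase the named mirror from SVN and push 'vendor', one run at a time.
mirror() {
    repo_dir="$MIRROR_ROOT/$1"
    lock_dir="$repo_dir/.mirror-lock"

    if [ ! -d "$repo_dir" ]; then
        echo "no such mirror: $1" >&2
        return 1
    fi

    # mkdir either creates the directory or fails, so it works as a
    # simple lock. An empty directory is invisible to git.
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "mirror of $1 is already running, skipping" >&2
        return 0
    fi

    # Only push if the rebase succeeded.
    (
        cd "$repo_dir" &&
        $GIT svn rebase &&
        $GIT push origin vendor
    )
    status=$?

    rmdir "$lock_dir"
    return "$status"
}
```

The script cron calls would then end with a line such as mirror "$1". One caveat: if the machine dies mid-run, the stale lock directory has to be removed by hand.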
Now, over on my laptop, finally, I can clone the repository, work on topic branches, push to master and have my own controlled environment and fork, while knowing the 'vendor' branch reflects the pure SVN state which I can also pull into my hackings as-desired.
When I submit a patch, if it ultimately floats back to me through the vendor branch, git is supposedly smart enough to realize that the same changes have arrived in my 'master' (assuming it's applied verbatim) and keep things nice and tidy. Else, I can force a merge, trampling my half-assed patch with the official JRuby code.
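Roughly, the laptop side of that might look like the sketch below. The helper names and the GIT variable are hypothetical, and whatever topic branch name you pass in is your own choice.

```shell
#!/bin/sh
# Set GIT=echo to print the git commands instead of running them.
GIT=${GIT:-git}

# Start a topic branch off 'master' for a patch.
# (start_topic is a hypothetical helper name.)
start_topic() {
    $GIT checkout -b "$1" master
}

# Fold the latest pristine SVN history from 'vendor' into 'master'.
# (merge_vendor_into_master is a hypothetical helper name.)
merge_vendor_into_master() {
    $GIT fetch origin &&
    $GIT checkout master &&
    $GIT merge origin/vendor
}
```

If a patch of mine has come back through SVN unchanged, that merge should be uneventful. If it came back altered, this is where the conflicts will show up.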