I remember the proposal for this project (by Nick Solter) in the OpenSolaris OHAC community, and only now have I been able to play with it. Shame on me.
Two things became very clear to me with this… First, the OpenSolaris project is a success without a doubt; there are so many people participating and so many projects that you just cannot follow them all. Second, moving to another city is a tough procedure. I lost contact with many projects, and it really takes some time until you get your life back on the rails. OK, no more crying, let's talk about this excellent project…
When I first read the proposal for this project, the main point for me was the integration with the OpenSolaris OS. But now I see that project Colorado is a big step towards being pragmatic (I love this word), and towards giving options to users and administrators. The integration with the COMSTAR and Crossbow projects, and features like weak membership, give a lot of possibilities to Open High Availability Cluster.
Talking about weak membership, it’s true that you will have some limitations if you set up a two-node cluster without a quorum device, but at least you will have the option. And let me say, in our profession I think the solution is *to have options*. Flexibility. Resilience. I think that is the key to highly available services. When I started (post in Brazilian Portuguese) to run some tests with Sun Cluster 3.2, I did it because that was the first version in which we could create a two-node cluster without at least two interfaces for the cluster interconnect. It was not easy to find a way to do the configuration because, being the first version with that feature, it had a bug in the configuration phase that prevented me from using it.
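To make the limitation concrete, here is a toy sketch of the usual majority-vote rule. It is only an illustration under my own assumptions (one vote per node, one vote for the quorum device, a hypothetical `has_quorum` helper), and it is not the actual OHAC membership algorithm.

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """Strict majority rule: a partition may keep running only if it
    holds more than half of all configured votes."""
    return votes_present * 2 > total_votes


# Two nodes, no quorum device: 2 votes configured in total.
# If one node dies, the survivor holds 1 of 2 votes, which is not a majority.
lone_node_without_device = has_quorum(votes_present=1, total_votes=2)

# Two nodes plus a quorum device: 3 votes configured in total.
# The survivor that reserves the device holds 2 of 3 votes.
lone_node_with_device = has_quorum(votes_present=2, total_votes=3)

print("survivor without quorum device may run:", lone_node_without_device)
print("survivor with quorum device may run:", lone_node_with_device)
```

Under strict majority voting, the two-node cluster without a quorum device cannot survive the loss of a node. As far as I understand it, weak membership relaxes that rule so the survivor can keep running, and the price is that you have to accept the risk of both nodes running at the same time.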
Well, I did all the tests and everything worked just fine. The machines that I had for the tests had two onboard NICs and only PCI-e expansion slots. We did not have PCI-e add-on NICs at the time, and at the old company where I worked, the process to buy something was not an easy one. Well, the fact is that options are always good.
I know that two interconnects are better than one, but that is not to say that just one will not work. Another point is that shared storage is not a simple thing. I guess many users do not have any kind of HA because of that. The goal of my project is just to eliminate that requirement, and give users the option to have some kind of HA without shared storage at all. We have some problems with AVS, which requires shared storage when it is installed on a cluster, but now there is another feature from project Colorado: iSCSI support.
That is great! So we can use it to give AVS the shared storage it wants, and we can use iSCSI as a quorum device, use a quorum server, or use no quorum device at all. It’s easy to create a solution with so many options. Thanks to OHAC.
These days, when everyone is thinking about horizontal scalability, the need for HA without shared storage is more evident. There are a lot of solutions that use some kind of replication to get a good MTTR or HA, and I think we need one in our community; that’s why I implemented the NON-SharedDevice resource type (agent). We need a simple way to make Apache, MySQL, NFS, etc. highly available without shared storage, because many people have many small machines, like soldiers ready to save the country.
But I ran into some problems trying to compile project Colorado. I know my laptop is not the best: it has just 2GB of memory, and I used an OpenSolaris VBox guest with 512MB of RAM to compile it. I don’t know if the problem is in the OHAC build scripts or in SunStudioExpress, but something tries to create thousands of forks, and that just gives us *out of memory* or *fork failed* errors. In the opensolaris.sh script used to build ON there is a simple edit to limit the parallelism, but in OHAC I could not find one. So, for the compilation to work, I had to reconfigure the machine to 1GB and add 2GB of swap space (and we need a tip in the compilation instructions about the “-i” option, which we can use to resume the compilation after an error). ;-) By the way, while reading about memory allocation on Solaris/OpenSolaris and Sun Studio, I found this excellent post about how it works in GNU/Linux, and this other post playing with the “GNU/Linux way to handle memory allocation”, which I need to reproduce here:
An aircraft company discovered that it was cheaper to fly its planes
with less fuel on board. The planes would be lighter and use less fuel
and money was saved. On rare occasions however the amount of fuel was
insufficient, and the plane would crash. This problem was solved by
the engineers of the company by the development of a special OOF
(out-of-fuel) mechanism. In emergency cases a passenger was selected
and thrown out of the plane. (When necessary, the procedure was
repeated.) A large body of theory was developed and many publications
were devoted to the problem of properly selecting the victim to be
ejected. Should the victim be chosen at random? Or should one choose
the heaviest person? Or the oldest? Should passengers pay in order not
to be ejected, so that the victim would be the poorest on board? And
if for example the heaviest person was chosen, should there be a
special exception in case that was the pilot? Should first class
passengers be exempted? Now that the OOF mechanism existed, it would
be activated every now and then, and eject passengers even when there
was no fuel shortage. The engineers are still studying precisely how
this malfunction is caused.

P.S.: There is a way to avoid that behaviour on GNU/Linux, which I think should be more evident. But, like I said before, options are always good.
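For the curious, I am assuming the "way" is the `vm.overcommit_memory` sysctl, where 0 means heuristic overcommit, 1 means always overcommit, and 2 means strict accounting. A minimal sketch to check what a machine is using follows; the `describe_overcommit` helper name is mine, and the path is a parameter only so that it can be pointed at another file.

```python
from pathlib import Path

# Meaning of the values accepted by /proc/sys/vm/overcommit_memory.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Read the overcommit mode from procfs and return (mode, meaning)."""
    mode = int(Path(path).read_text().strip())
    return mode, OVERCOMMIT_MODES.get(mode, "unknown mode")


if __name__ == "__main__":
    try:
        mode, meaning = describe_overcommit()
        print(f"vm.overcommit_memory = {mode}: {meaning}")
    except (OSError, ValueError):
        # Not a GNU/Linux box, or procfs is not readable here.
        print("could not read vm.overcommit_memory on this system")
```

With mode 2 the kernel refuses allocations beyond the commit limit instead of promising memory it may not have, so a greedy process gets an error at allocation time and nobody has to be thrown out of the plane later.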
peace.