Semintelligent

Why you should not use the default HSQLDB backend with Confluence · 19 July 2008, 15:18

Atlassian clearly states on its web site that Confluence should not be run in production on the default HSQLDB backend. However, it does not give any reason why. This may lead some people (like me) into thinking that HSQLDB will be OK for the long term.

So we stuck with the HSQLDB back end for our Wiki. It was bad. After a few months we did have some data integrity problems (fortunately, with records that had been deleted), but even this was enough to break the built-in Hibernate-based backups and force us to migrate.

The migration was hard too. Why:

Some of the challenges we ran into:

I really wish Atlassian would list some of these potential problems on their site. Had they given more than just ‘do not use it’, we would have listened :).

— Max Schubert

---

How come I get 'no server suitable for synchronization found' from my NTP client when the server is returning a valid NTP response to the client? · 4 May 2008, 12:44

This is one I hadn’t seen before last week. On our Solaris 9 clients, running

ntpdate -d -u ip.of.server

kept on returning

no server suitable for synchronization found

even though the debug mode showed UDP responses coming back from the server. The server in question runs using UDP / unicast mode.

We used snoop to look at the NTP response

snoop -v -v port 123

(use -v -v to get the protocol decode output), and saw these suspicious field/value pairs:

NTP: Leap: 0x03 - clock unsynchronized
NTP: Reference clock: INIT
NTP: Reference time: 0x00000000.00000000

There were other headers, but they did not indicate problems. The 0x03 in the Leap field, the INIT reference clock, and the all-zero reference time indicated that the NTP server was not properly initialized or configured. Further investigation confirmed this: the Sidewinder / G2 NTP server was not properly configured.
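
The check we did by eye on the snoop output can also be sketched in Python. This is only a minimal sketch, assuming the standard 48-byte NTP header layout (leap indicator in the top two bits of the first byte, reference ID in bytes 12-15, reference timestamp in bytes 16-23). The function names are made up for illustration and are not part of ntpdate or snoop, and the test for "unsynchronized" only mirrors the three fields called out above, not whatever ntpdate checks internally.

```python
import struct

LEAP_UNSYNCHRONIZED = 3  # leap value 0x03, "clock unsynchronized"


def parse_ntp_header(packet: bytes) -> dict:
    """Pull the fields discussed above out of a raw NTP packet."""
    if len(packet) < 48:
        raise ValueError("NTP packet must be at least 48 bytes")
    first = packet[0]
    # Reference timestamp: 32-bit seconds + 32-bit fraction, network byte order.
    ref_secs, ref_frac = struct.unpack("!II", packet[16:24])
    return {
        "leap": first >> 6,              # top two bits
        "version": (first >> 3) & 0x7,   # next three bits
        "mode": first & 0x7,             # low three bits
        "stratum": packet[1],
        "reference_id": packet[12:16],   # e.g. b"INIT" on an unconfigured server
        "reference_time": (ref_secs, ref_frac),
    }


def looks_unsynchronized(header: dict) -> bool:
    """Hypothetical helper: flag the symptoms seen in the snoop decode."""
    return (
        header["leap"] == LEAP_UNSYNCHRONIZED
        or header["reference_id"] == b"INIT"
        or header["reference_time"] == (0, 0)
    )
```

Feeding it the 48-byte UDP payload of a server response (captured however you like) gives a quick yes/no on whether the server is answering but not actually synchronized.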

— Max Schubert

---

How do I know what group Oracle expects to run as? · 4 May 2008, 12:29

Learned the hard way how picky Oracle is about the group it runs as … do not change the group ownership of files owned by Oracle unless you have the .c source for the distribution of Oracle you are using and a lot of time to fix things :p.

The group Oracle runs as is found in the files

/rdbms/lib/config.c
/rdbms/lib/config.s

You can also do a ‘strings’ command on config.o and you will see the group name.
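
If `strings` is not handy on the box, a rough equivalent is easy to sketch in Python. This only pulls runs of printable ASCII out of a binary; the function name and the 4-character default minimum are assumptions for illustration, and you still have to spot the group name among the results yourself.

```python
import re
from typing import List


def printable_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII at least min_len characters long."""
    # 0x20-0x7e is the printable ASCII range (space through tilde).
    pattern = re.compile(b"[\\x20-\\x7e]{" + str(min_len).encode("ascii") + b",}")
    return [match.decode("ascii") for match in pattern.findall(data)]
```

Read the object file in binary mode (for example `open("config.o", "rb").read()`) and pass the bytes in. Short group names may need a smaller `min_len`.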

Unfortunately, in my case the version of Oracle came with a commercial product (eHealth), and the only thing we found we could do was reinstall. It took our in-house database staff to figure out what the problem was; CA left our ticket open for over a month without a resolution (RMAN backups were failing, not a trivial problem).

Really getting tired of the lack of support that commercial support contracts provide; seems like support used to be so much better back in the 90s :p.

— Max Schubert

---

Solaris SVM: stuck in pre-maintenance mode: resolution · 21 April 2008, 18:24

Well, the Sun techs just recommended we restore from backups or upgrade our distribution .. great, thanks for the in-depth technical insight … so one system we did restore from backups.

A coworker of mine who is quite brilliant with bare metal troubleshooting was able to get the first system back online by doing the following:

rm -rf /mnt/dev/*

cp -Rp /dev/* /mnt/dev/
cp /etc/path_to_inst /mnt/etc/path_to_inst

He then unmounted the drive, moved it back to the original system, and voila, we could get into single user (maintenance) mode with

boot -m milestone=none

Turned out that the HBA card coincidentally (no joke) went bad during or after being removed. Replacing the HBA card fixed that issue .. and let us boot single user, great!

So, I then re-initialized the meta device database, restored all mirrors and submirrors and rebooted .. and … whoops, kernel starts complaining about /etc/system being full of junk and the system doesn’t boot.

A boot from CD-ROM showed that now both root partitions on both disks were full of what appeared to be random garbage (2 MB worth!)

The Sun tech wrote back a day or two after this failed and proceeded to, in essence, ‘scold’ us for trying to copy devices from one system to another .. well, at least that got the system to boot single user! She then asked again about restoring, at which point I told her to just close the ticket, as we were making more progress on our own than we had with her (she is also responsible for the ‘9MB zip file’ quote under the Humor section of my blog).

End result – we had to restore both systems and have no idea why breaking the mirror on these systems wasn’t something we could recover from the way we are supposed to be able to do with SVM.

Disappointing, especially since our tier 4 (Sun) was not able to help us get through this without restoring. In fact, they started suggesting a restore after one call to their tier 2 people. So much for paying for support contracts and expecting expertise :(.

— Max Schubert

---

Solaris SVM: stuck in pre-maintenance mode · 9 April 2008, 17:52

I rarely have pleasant encounters with LVM/disksuite/SVM. Maybe that is because I just suck at Solaris :p. The problem this time:

All self-tests pass, then the system starts to boot from the disk .. it gets to the point where the Sun copyright message and kernel revision are displayed, then the following is displayed over and over in a loop:

Requesting Maintenance Mode (see /lib/svc/share/README for more information.)

After breaking the boot process, and rebooting using boot -m debug, the following additional information is displayed:

INIT: Executing svc.startd

INIT: Restarting svc.startd
Requesting Maintenance Mode
(see /lib/svc/share/README for more information.)

Root console services are never started, svc.startd doesn’t dump core, or output messages in system logs, or output anything on STDOUT or STDERR.

Sun technicians are working with me to try to figure out what is causing svc.startd to die (boy does phone support NOT want to escalate a ticket to tier 3 .. they were suggesting I restore the system from backups rather than continue to help me figure out what is causing this (same thing is happening on two systems) .. frustrating!)

Will post more when I get a resolution to this (hopefully more than just “I restored from backups”).

— Max Schubert

---

Disruptive Technology · 6 June 2007, 18:23

Some time ago a client I work for talked to a group of us about Disruptive Technology; the majority of the presentation focused on technological history and Internet history. Personally I see how Internet technologies were disruptive at the inception of the Internet, but in 2007? How is the Internet disruptive now??

What technologies do I find to be disruptive these days? Outdated interfaces, physical or programmed, that hang around because the companies who sell / promote them have a customer base that is used to whatever awkward / unwieldy interface they are pushing.

A prime example: remote controls. Every manufacturer decides that layout N is the layout they will use; rarely do the layouts between vendors match, and rarely do vendors choose the same naming conventions beyond basics like Play, fast forward, and rewind.

Take a 70 year-old who just wants to have a nice experience with his new home theater, hand him 6 remotes, and ask him what his definition of disruptive technology is!

— Max Schubert

---