alex chiang: web 6.0

November 18, 2008

git as code drop condom

Filed under: geek — alex @ 1:18 am

[Continuing with my experiment of posting something every day, I think I'll go with posting photos on MWF and actually writing something on TuTh. Of course, those are minimum goals; maybe I'll get motivated and post even more than that, but the intarwebs can only handle so much of one person's BS, especially if that person is me, so maybe not.]

By now, the hip kids have cottoned on to the fact that git is the new hotness when it comes to managing source code. [Yes, git manages content, but I choose to ignore the maniacs who want to use it as a blog or wiki backend or manage their home directory with it or maybe stick it in a pipe and attempt to smoke it just to see what happens.]

So yes, git is fantastic, and if you’re not using it for your own source management, then you have my sincere sympathy that you are doing the equivalent of sticking raw lemons on your eyes every day.

One thing that I’ve found git to be extremely useful for is protecting yourself from, shall we say, unenlightened partners.

If you happen to find yourself in the hellish position between the Scylla of enabling hardware you do not control and the Charybdis of testing 3rd-party vendor-supplied drivers that are supposed to enable said hardware, well, the first thing I would tell you is to get a new job because if you don’t, the only long term answer to reduce the world of hurt you’re experiencing is suicide. But if you need to make mortgage payments for a little while longer, then git is your friend.

For those Web 2.0 wunderkinds who don’t know what a pointer is and whose testes haven’t yet descended, there’s an entire world of software development that doesn’t include the strings “java” or “script” or “extensible”, and consists of, you know, actually booting the machine that runs your lame Facebook app and other boring stuff like transmitting and receiving packets from the “ether nets”.

This world is inhabited by suspender-wearing greybeard overlords and bitter college new hires whose dreams of becoming a physicist were ruined by the soul-crushing mountain of horrible FORTRAN code crapped on their heads in grad school before dropping out of a Ph.D program, thinking that “hey this Linux thing is pretty neat and C is niiice!” but not realizing that they’d only made the 2nd of a long series of bad career moves, choosing a lifetime of bit banging instead of spending 7 minutes to learn “hello world” in Rails and then spending another 10 minutes to make another Google Maps + Craigslist hooker mashup that they could sell for $72 million and spend the rest of their lives doing extreme sports or something equally useless like the Peace Corps.

No, in this world, if you’re lucky, the company you work for actually makes the chip and you get to write the driver. But woe to the extremely unlucky schlep whose company merely integrates the chip into some other device.

This circle of hell typically unfolds as you, the hapless junior programmer, receive a gigantic tarball from clueless $VENDOR, with the latest barfy garbage that they “released” by basically sending you their cvs (or god help you rcs) repository, but without, like, the useful stuff like changelogs or diffs.

And then your fun job is to build their poxy driver, smoosh it into your environment, and then pray that nothing broke. Of course, since god doesn’t exist (if he did, why do you have your current job), inevitably something breaks. Which means you have to puzzle over the pitiful README that they call “release notes” and try and figure out what the hell changed (not that it matters anyway because your company and their company don’t have a common bug tracking system so you just send fifteen thousand emails with kazoollians of sub-threads and random worse-than-useless program managers who get cc’ed at inappropriate times who decide to “help” by cc’ing even more people out of context until basically your company’s global address book is emailing their company’s global address book because their engineer in a more “competitive” geography forgot to byte swap a certain field to take your big-endian system into account. Oops!)

Actually, I’m sorry to break this to you, but even git can’t really help you. It can only make the pain marginally less. And I’m truly talking marginal here; I mean, it’s the difference between swapping bodily fluids with aforementioned Craigslist hookers and smoking crack with them afterwards, or merely taking a pass on the crack. This time.

Anyway, the way to minimize your pain is to take the earliest version of their monolithic tarball that you can find, and:

$ tar zxvf codedrop.tar.gz
$ git init
$ git add .
$ git commit -m "Those schlubs use Visual SourceSafe!?!?"

From this point on, you can now simply untar each new code drop into the same directory and:

$ git add -A .    # plain "commit -a" would skip files that are new in this drop
$ git commit -m "Revision X claims to fix following bugs..."

What you get is a semi-readable changelog, with hopefully manageable diffs. If you really get motivated, you could:

$ tar xzvf codedrop.tar.gz
$ git diff        # eyeball the changes
$ git add -i      # interactively add hunks in ways that make sense to you
$ git commit -m "I need a new job"

That latter case is useful in case a code drop involves 13k lines of diff that is an ASCII-encoded firmware update in a header file and 40 lines of functional change.
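
One wrinkle with untarring each drop on top of the last one: files the vendor deleted stay in your tree, so the diff never shows them going away. Here is a rough sketch of a helper that clears the tracked files first. The name import_drop and its arguments are made up for illustration, and it assumes the tarball unpacks straight into the current directory the same way every time, as in the commands above.

```shell
# import_drop TARBALL MESSAGE
# Hypothetical helper: replace the working tree with the contents of
# TARBALL and commit the result, so files missing from the new drop
# are recorded as deletions instead of lingering from the last one.
import_drop() {
    tarball=$1
    message=$2

    # Remove every tracked file from disk; .git itself is untouched.
    git ls-files -z | xargs -0 rm -f --

    # Unpack the new drop into the now-empty tree.
    tar xzf "$tarball" || return 1

    # Stage additions, modifications, and deletions, then commit.
    git add -A . &&
    git commit -q -m "$message"
}
```

Run it from the top of the repository, for example as import_drop ../codedrop.tar.gz "Revision X claims to fix following bugs...". If you want the hunk-by-hunk treatment instead, stop after the tar step and carry on by hand.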

Now, when 6 different program managers are harassing you via Blackberry, asking you “what changed in the driver?” you can give them a reasonable answer (or if you’re feeling vindictive, just send them the output of git diff).

Pray that the baby jesus resurrects himself soon and puts you out of your misery.

November 10, 2008

open source consultancy

Filed under: geek — alex @ 9:42 pm

One of my guilty pleasures is reading the Joel on Software forums. In general, the participants are pretty bright, and they tend to have a decently broad technical background (web 2.0 experts mingle with low-level asm/C guys like me).

There was an interesting thread titled Pushing open source, where the guy thinks about pushing more open source in his consultancy business.

I haven’t written anything substantial in a while (in any form, whether email, code, blog, or misc stuffas), so I ripped out an off-the-cuff ~1000 word reply. I figured I might as well rip myself off and repost it here for my loyal readers (har har har, all 2 of you that might make it through this nerd paean).

Anyhow, enjoy (or skip). [Also, note that I botched the joke in the postscript; real induction says that if the n=1 case is true and the n case implies the n+1 case, then every case is true; oh well, it's been too long since I took maths.]

To the OP:

Your first post intrigued me, and I thought that you had found a nice niche to play in. I think your initial instinct to approach the problem from a TCO point of view was correct, but it looks like you learned that evaluating TCO is trickier than it looks.

The key thing to keep in mind is that your clients want to run a _business_ and don’t care at all what the enabling technology might be.

So if you can _seamlessly_ drop in a FOSS solution to replace a more expensive MSFT solution, you’re set. If your client experiences _any_ transactional cost in having to learn something new, or gets annoyed by some new interop problems, game over, you’ve lost. Well, technically, _you_ haven’t lost, but at least you’ve given FOSS a black eye of sorts and helped contribute to the FUD that FOSS is inferior software.

[As someone who is paid to work on FOSS software (linux kernel engineer), I'd prefer not to see this happen ;) ]

Anyhow, if you accept my premise that 100% seamless transition at lower financial cost to the client is your goal, then your best bet is to find a standardized part of the stack and replace just that bit.

The easiest example that comes to mind is a mail server. If your client is currently running Exchange (for mail only, I’ll get back to this in a sec), then you have a pretty good chance at being able to drop in a Postfix install without anyone noticing. The hardware requirements are likely lower, and certainly the software license can’t be beat.

Unfortunately, if your client wants the shared calendar feature of Exchange, you’re hosed. There is no good FOSS replacement (especially one that Outlook can work well with); you’ve introduced some “switching pain” and game over.

Another possibility might be replacing an Active Directory server and/or file server with a samba server. Or maybe ripping out their costly firewall and dropping in pf or the like.

Because of the “switching pain” I talked about earlier, Postgres/MySQL probably aren’t 100% drop-in replacements for MS SQL server due to slightly different SQL syntaxes. The last thing you want to do is piss off your client’s developers who will only see your change as a hindrance to their productivity.

I would describe the tactic above as “phase 1”, where your client is completely “normal” and just wants the cheapest solution. Phase 1 allows you to swap out expensive parts with cheap parts that are as good (or better ;) without anyone noticing.

What you do with Phase 1 will depend on your long term goals. You could use it as a competitive advantage over the other consultants in the area, and make cheaper bids on jobs. Or, you could keep your bid the same, and simply pocket the difference as profit. [NB, I don't really see that as an unethical move; you are selling a client a total solution, not a bunch of individual parts, but that is a discussion for a separate thread.]

Now, if you are ambitious, the part that’s interesting to me is implementing Phase N.

In this step, you’re allowed to impose a “switching pain”, with the goal of an eventual lower TCO. Again, it’s a selective replacement of parts of the solution with FOSS bits that are likely to work well. Postgres/MySQL are likely candidates here, since SQL is close enough to a standard that switching away _from_ MS SQL Server is probably doable. Another potential _might_ be Zimbra to implement a shared calendaring solution, but I’m just a kernel guy who knows enough about userspace to be dangerous, so take that last one with a boulder of salt. ;)

Of course, Phase N will require a progressive client who is willing to experiment a little and take risks with the end goal of lower TCO. And don’t fool yourself, there _will_ be risks if they go down this route. Their SQL might be impossible to port, or they might hit a million other little gotchas.

I imagine if you have a client with whom you could successfully implement Phase N, they might be experimental enough to try Phase N+1, where you escape out of the server room and onto the desktop. Of course, Phase N+1 has nuances — maybe you deploy the Windows version of OpenOffice.org before replacing their entire desktop with Ubuntu, but you get the point.

You mentioned that you’re not totally familiar with the FOSS stack, so I guess that implementing the strategy I outline above might be hard at first. But getting over the steepish learning curve is worth it.

This is especially true because I haven’t yet mentioned the best part for you: once you become _good_ at deploying and managing those FOSS solutions, you’ll find that mastering their complexity truly results in more power for you. In general, the FOSS apps tend to be scriptable, extensible, and automate-able (a neologism I just made up). Moreso than typical MSFT apps. [Note, I said "in general", there are probably counter-examples of course, but in the large, I believe I'm correct.]

So _your_ TCC (total cost of consultancy) drops, giving you a lot more flexibility over your topline. You can choose to pass those cost savings on or keep topline revenue constant, in which case I just described the fabled “Step 2: ???”, and you get to celebrate Step 3: Profit!1!

Anyhow, that’s what I managed to think of in-between bites of dinner. There are probably some holes, but overall, I think the strategy is sound.

Hope this helps.

-Alex

ps, if you were paying attention, you’ll notice that if you ever _do_ manage to successfully implement Phase N+1, then you will have proven via induction that FOSS can take over the world. ;) cheers!

September 24, 2008

evites suck

Filed under: geek — alex @ 8:59 am

Evites are one of the suckiest things ever invented.

The evite sender puts my email address into some 3rd-party website, which I never asked for. They claim they will never sell my email address, but why should I trust them? If I wanted evite to have my address, I would have sent it to them.

As an evite recipient, I get a spam in my inbox. Ok, fine, I get lots of emails. But an evite email doesn’t give you the fucking information! It’s just useless noise: “hi! I sent you an email! But it’s empty! You should really go to this web page instead!”

What a waste.

Today, I realized that evite gets one thing right: it sets the Reply-To: field to the original sender’s email address. So from now on, I’m just going to respond via email and let the sender keep track of me the old-fashioned way.

September 6, 2008

czech your web

Filed under: geek — alex @ 1:37 am

Net access here (Prague) is really slow, seems to be slowest in establishing new connections (and then decent once connection is established), but I don’t have enough networking-fu to figure out why (maybe DNS is slow or something).

Also, Google keeps trying to fake me out, sending me to localized versions of their pages, which is very annoying.

I was gnashing my teeth trying to figure out a solution when I realized I’m already ssh forwarding a local port to a squid server back in the States. And with some other magic, all I need to do is start up one ssh connection, and all the other ones are muxed through it, so you only pay the DNS lookup once, and from there on out, all your new connections are fast.

As an extra benefit, I don’t have to worry about my unencrypted http traffic getting sniffed by scary Eastern European hax0rs.
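
For the curious, here is roughly what that setup can look like in ~/.ssh/config. This is a sketch, not my exact config: the host alias, host name, and user are placeholders, and it assumes squid is listening on its default port 3128 on the remote machine. The “other magic” is taken here to be OpenSSH connection sharing.

```
# ~/.ssh/config (sketch; squidbox, proxy.example.com and alex are placeholders)
Host squidbox
    HostName proxy.example.com
    User alex
    # Forward local port 3128 to squid on the remote machine.
    LocalForward 3128 localhost:3128
    # Let later ssh sessions to this host reuse the first connection.
    ControlMaster auto
    ControlPath ~/.ssh/mux-%r@%h:%p
```

Start one session with ssh squidbox, point the browser’s HTTP proxy at localhost:3128, and any further ssh sessions to the same host ride over the existing connection.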

September 5, 2008

unicode mutt

Filed under: geek — alex @ 1:05 pm

It turns out that in places like the Czech Republic, they actually use all the crazy accented characters that Americans are used to ignoring. You know, like “č”, “ů”, “ř”, etc. Anyhow, I finally got annoyed that I was just seeing “?” instead of “ö” (or whatever) and fixed the problem.

First:

$ xterm -u8 [other flags, etc...]

Next:

$ env | grep LANG
LANG=en_US.utf8

Finally:

$ grep charset .muttrc
  set charset="utf-8"

et voila! (Note the name in the “From:” header.)
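
If you would rather have a script tell you whether the locale step took, something like the following works as a rough check. The function name is made up, and it only looks at the locale name, not at whether that locale is actually installed.

```shell
# is_utf8_locale NAME: succeed if a locale name such as en_US.utf8
# or cs_CZ.UTF-8 selects a UTF-8 character set.
is_utf8_locale() {
    case "$1" in
        *.[Uu][Tt][Ff]8*|*.[Uu][Tt][Ff]-8*) return 0 ;;
        *) return 1 ;;
    esac
}

# LC_ALL overrides LC_CTYPE, which overrides LANG.
effective=${LC_ALL:-${LC_CTYPE:-${LANG:-}}}

if is_utf8_locale "$effective"; then
    echo "locale is UTF-8: $effective"
else
    echo "locale is not UTF-8: ${effective:-unset}"
fi
```

The check goes through LC_ALL and LC_CTYPE before LANG because either of them can quietly override the LANG value shown above.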

August 21, 2008

sneaky pci_dev refcounts

Filed under: geek — alex @ 5:15 pm

Normal friends and family, just skip this post. Trust me.

Recently, while doing some PCI slot symlink work, I noticed that we weren’t calling pci_release_dev() after taking a card offline (via sysfs). Well, at least not the first time, due to a leaked refcount in pci_get_dev_by_id(). gregkh fixed it, so that’s good, but I had a slightly difficult time tracking down the exact problem.

See, I had instrumented pci_dev_get() and pci_dev_put() to call dump_stack() every time the device that I cared about was touched:

--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -789,6 +789,11 @@ struct pci_dev *pci_dev_get(struct pci_dev *dev)
 {
        if (dev)
                get_device(&dev->dev);
+       if (dev && (dev->bus->number == 1) && (PCI_SLOT(dev->devfn) == 2)) {
+               dump_stack();
+               printk(KERN_INFO "dev refcount: %d\n",
+                       atomic_read(&dev->dev.kobj.kref.refcount));
+       }
        return dev;
 }
 

[mutatis mutandis for pci_dev_put()]

But I wasn’t getting all the information I needed. Other pieces of code were playing with that device’s refcount too. So I did this instead (thanks to willy for the hint!):

@@ -983,7 +984,20 @@ int device_register(struct device *dev)
  */
 struct device *get_device(struct device *dev)
 {
-       return dev ? to_dev(kobject_get(&dev->kobj)) : NULL;
+
+       if (!dev)
+               return NULL;
+
+       if (dev->bus == &pci_bus_type) {
+               struct pci_dev *pci_dev = to_pci_dev(dev);
+               if ((pci_dev->bus->number == 1) && (PCI_SLOT(pci_dev->devfn) == 2)) {
+                       dump_stack();
+                       printk(KERN_INFO "dev refcount: %d\n",
+                              atomic_read(&pci_dev->dev.kobj.kref.refcount));
+               }
+       }
+
+       return to_dev(kobject_get(&dev->kobj));
 }

Armed with that patch, this is the list of code paths (as of 2.6.27) that can affect a struct pci_dev’s refcount:

* pci_dev_get
* pci_dev_put
* device_add -> get_device
* device_add -> klist_add_tail -> ... -> get_device
* device_del -> put_device
* device_del -> klist_del -> ... -> put_device
* bus_attach_device -> klist_add_tail -> ... -> get_device
* bus_remove_device -> klist_del -> ... -> put_device
* acpi_platform_notify -> acpi_bind_one -> get_device

There are lots of code paths that can potentially call device_add/del directly, so if you’re debugging that, you’ll need to figure out which ones affect you. I included acpi_bind_one() as an example of a surprising place where we might directly call get_device().

The only surprising gotcha might be the calls to klist_add_tail/klist_del. Those calls will be non-obvious to grep for while debugging your device’s lifetime. So if you are wondering why your struct pci_dev isn’t calling its ->release() properly, just patch get_device() directly, and be prepared to stare at the output for a while. Good luck.

[Note that device_add bumps the refcount by 2, but then calls put_device() during cleanup, for a net gain of +1.]
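
Staring at the output gets easier if you pull the numbers out from between the stack dumps. Here is a small filter for that, as a sketch: it assumes the kernel log arrives on stdin (from dmesg, say) and that the lines contain the "dev refcount: %d" text printed by the patches above. The function name is hypothetical.

```shell
# refcount_trace: read kernel log text on stdin and print each
# "dev refcount: N" value, followed by the change from the previous one.
refcount_trace() {
    awk '
        match($0, /dev refcount: -?[0-9]+/) {
            # Skip the 14-character "dev refcount: " prefix.
            n = substr($0, RSTART + 14, RLENGTH - 14) + 0
            if (seen) printf "%d (%+d)\n", n, n - prev
            else      printf "%d\n", n
            prev = n
            seen = 1
        }'
}
```

Typical use would be dmesg | refcount_trace. A jump you can’t match to a stack trace in the surrounding log is the place to start looking.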

July 26, 2008

back from the dead

Filed under: dreck, geek — alex @ 5:45 pm

A few days ago, Fort Collins got nailed by a seeming monsoon. I had the misfortune of bike commuting home from the time the rain started, continuing into the time the rain converted into hail, and finishing my commute as the storm finally petered out. It wasn’t terribly bad, except for a minor freakout during the extreme flash-BOOM!s exploding overhead as I stupidly rode underneath the very exposed Powerline Trail. That was scary.

Worse, though, was my wide open window near my computer desk at home. Where my computer sits. With an open, exposed chassis.

I walked in and saw the Mac mini pulsating sadly and making sick noises. Sigh. Unplug it, and turn it upside-down and the inch of standing water in the case poured onto my desk. As a computer professional, I happen to know that dumping out that much water is probably a Bad Thing™.

Let’s check my other system downstairs, to see the last time I did a backup of the Mac. February? Crap.

Stuff dries out kinda quick in Colorado, but I gave it an extra day or two before plugging everything back in. The external hard drive that I boot off of is having a really hard time spinning back up and I make a frowny face. On a whim, I decide to see if I can reinstall to the mini’s internal drive. Minor success for our hero.

Ok, so booting back up on the internal drive… woop woop, I can see the external drive again! Disk Repair finds some bad beeble-bops and fixes them up, so I think I’m set. Attempt a reboot and…. uh oh. Sad pulsating light, sick drive noises, and a frowny face from me again.

Boot back to internal drive, reconnect external drive and all is well. Buh? Run Disk Repair again and notice something very odd — it’s telling me that it’s connected via USB2.0, but I told the external drive to prefer firewire (both ports are hooked up). Now that’s a little weird.

Let’s think about this… Ok, theory formed, time to experiment.

Reboot with only firewire hooked up. Frowny faces all around. Reboot again with only USB hooked up. Happy face. Reboot one last time with both hooked up. Frown and gnash teeth.

Conclusion: the rain fried the firewire chip either on my mac or on the external enclosure. Most likely the enclosure. This kinda sucks because you can only boot off an external firewire drive, can’t boot off external USB. On the plus side, I have all my data back… which is nice and only minorly inconvenient.

Lesson learned. Computron moved away from window. Backup scripts dusted off and added to cron. Burn incense peace offering to the computer gods. Disaster averted.

July 16, 2008

chinglish, tomoyo, espresso

Filed under: dreck, geek — alex @ 10:39 am

Some insights into what I find interesting.

An essay on Chinglish says:

An estimated 300 million Chinese — roughly equivalent to the total US population — read and write English but don’t get enough quality spoken practice. The likely consequence of all this? In the future, more and more spoken English will sound increasingly like Chinese.

Speaking of bon mots, a slide deck from Toshiharu Harada on the trials and tribulations of merging TOMOYO linux into mainline had me in stitches. His good-natured humility is absolutely refreshing, and captures the emotions of my job and how I felt when I made my first major kernel submit. Some favorite quotes:

  • It was a meeting of the Hell
  • What You Need to Join the kernel development — COURAGE
  • AppArmor and SELinux guys began fighting (complete with ASCII art!)
  • Beware I don’t have long legs as HE has

And finally, I’m glad I haven’t gotten attitude at any of the coffee shops I’ve visited like this guy did. It took some reading between the lines, wondering “they really care that much about iced espresso?” But it turns out that no, the murky coffee owner is just trying to prevent people from doing a ghetto latte — ordering a cheaper iced espresso and then filling the remaining 3/4 of the cup with expensive cream. Hiding your fear of thieves behind some arrogant claims of coffee perfectionism seems… douchey.

June 22, 2008

google earth, fedora 9, intel 945GM, drmWaitVBlank

Filed under: geek — alex @ 7:44 pm

Google Earth can be excruciatingly slow on a misconfigured system. For me, it was extra confusing because lookie here:

[achiang@ethanol ~]$ glxinfo | grep direct
direct rendering: Yes

That’s goodness, and so one would think that I’m all set, right? Not so. I was seeing these errors:

do_wait: drmWaitVBlank returned -1, IRQs don't seem to be working correctly.
Try running with LIBGL_THROTTLE_REFRESH and LIBL_SYNC_REFRESH unset.

After endless googling, I got a clue to install something called driconf, except the concomitant advice was bad as it referred to obsolete options.

But hey, let’s use our brain and guess that the error message about drmWaitVBlank means we should change the magic setting labelled “Synchronization with vertical refresh (swap intervals)”. (This would be after launching driconf as root, duh.)

On my system, the default was “Initial swap interval 0, obey application’s choice”, which resulted in epic fail.

Changing it to “Never synchronize with vertical refresh, ignore application’s choice” was the answer for me.
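
If you’d rather skip the GUI: driconf saves its settings to ~/.drirc, and the vertical refresh setting corresponds to the vblank_mode option, where 0 means “Never synchronize with vertical refresh, ignore application’s choice”. A sketch of the relevant fragment follows. The driver name and screen number are assumptions based on the hardware listed in this post; I haven’t checked them against what driconf actually writes on this machine.

```
<driconf>
    <!-- screen and driver values are assumptions for this hardware -->
    <device screen="0" driver="i915">
        <application name="Default">
            <!-- 0 = never synchronize with vertical refresh -->
            <option name="vblank_mode" value="0" />
        </application>
    </device>
</driconf>
```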

This would be with DRIconf 0.9.1, Fedora 9, the i915 module, and:

Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller (rev 03)

Hope this helps. Now I can use Google Earth to plot the ridiculous 137 mile, one-day bike ride I did over the weekend.

Edit: ok, I lied. The other thing you have to do to make Google Earth perform at a reasonable speed is to go to View -> Atmosphere and turn that display off.

May 6, 2008

buy.com sucks

Filed under: dreck, geek — alex @ 12:59 pm

I tried purchasing a gift for someone from buy.com and having it shipped to his residence.

You know, the feature that Amazon.com has had since, oh, I don’t know, 1997?

I get this email in response from Tasha Eastman:

Hello Alexander,

We are anxious to ship your order # 39677038; however, we need some additional information to complete the order process.

The shipping address information entered on your order is not on file with your credit card company. To release your order, please contact your credit card company to have the address added on file as an alternate ship-to address.

Um, right. I’m not going to call Amex and have them put my brother’s address on my credit card file. That is the stupidest thing I’ve ever heard of in my life.

My response:

This is silly.

Amazon.com doesn’t ask me to call my credit card company if I want to purchase a gift for a family member and have it shipped to their residence.

In fact, no other retailer that I can think of has this requirement.

If you are still interested in my current and future business — I spend roughly $1000 / year on electronics purchases and influence the purchasing decisions of friends/family around roughly $7000 / year — then please fulfill the order as originally placed.

On the other hand, if you are really going to attempt to make me jump through this hoop, then please cancel my order.

Thank you.

Their response was to cancel my order.

So if you’re thinking of buying something off the intarwebs, don’t use buy.com. They are a bunch of JV bush-league-amateur-hour mouth-breathers who clearly do not understand customer service. Buy.com sucks.
