lca2008 — open day

Whoops, one last nerdlinger entry, and that’s just to give a quick report on Open Day. This year, we (HP) had a few ideas on how to make Open Day better (from our perspective), namely, giving away schwag and actually bringing a product running Linux. Good ideas, right?

On the first front, for some reason, the free shirts we wanted to give out ended up in Honolulu, so that was somewhat of a bust. On the plus side, we did have two HP r927 cameras to raffle away, which worked out really well.

As for the Linux part, we brought an HP MediaVault 2100, which is a cool little NAS running a 2.6.12 kernel and some oldish busybox installation. I’d spent about half an hour in Fort Collins grilling the developers about two weeks before departure, and had a nice little cheat sheet to help talk it up. Most importantly, I convinced the developer that he wanted to give me the root password to the machine because our audience at Open Day would care about it. And it all would have been great except…

It turns out that the 2100 can only do wired ethernet. It also turns out that we were situated at a table where the courageous conference organizers could not run a cable. Solution? A quick ‘apt-get install dhcp’, and then carefully double-checking to make sure the daemon was bound only to my wired port (so as not to completely destroy the wireless network), and with a final bit of magic in that HP laptops have built-in crossover capabilities, the MediaVault was happily obtaining a DHCP address from my machine, and I could open a root console on my laptop. Voila!

The demo was actually a great hit, which was surprising to me, but it turns out if you let a bunch of Linux nerds poke around on a neat little piece of hardware and cat some procfs and sysfs entries, they’re happier than cats in a twine factory. It’s the digital equivalent of kicking the tyres (as my colleague tpot put it so eloquently); doesn’t actually prove much but it’s quite satisfying nonetheless. One enterprising fellow tried to write and build a hello world, but gcc was interacting weirdly with busybox and it didn’t quite work. Oh well.

In any case, Open Day 2008 was a whopping success and it was quite satisfying to see people walk away with a positive impression of HP and open source. Ok, now I’m done with the techno-babble. I promise.

lca2008 — friday (finally!)

The Friday keynote seemed a little weak — something about the upcoming python releases, of which I care about a little bit because I actually do like python, but didn’t seem generally applicable to something like Linux conf — so I skipped it and enjoyed a nice short black and salmon pide at a local cafe instead. Lovely, really.

The first talk I did go to was Zach Brown’s coherent remote file system talk (CRFS), and it was pretty impressive. On paper, CRFS looks to be heaps better than NFS both in terms of performance and reliability. This was demonstrated quite nicely by showing a tcpdump capture of the operations needed for an NFS delete, something like 9200 packets, vs the same operation in CRFS, which took 89 packets. But what really made it a good talk, in my opinion, was the fact that it made me excited to go back and start hacking on it (well, “it” in this case being btrfs as crfs is still stuck behind the Oracle firewall). To me, that’s the true nature of being an open source developer: doing enough implementation on a cool idea to get others excited enough to want to help you out with it.

Next up was Matthew Garrett’s suspend to disk (why does it hurt so?) discussion. It was a bit chaotic, as Matthew kept getting interrupted by other kernel hackers in the audience, but in the end, we learned that the process freezer in Linux was a bad implementation, and that kexec might be our savior. Whoo.

I couldn’t get the gumption to care about the first afternoon session (typical end-of-conference burnout combined with a weak lineup), and the final talk of the day initially seemed interesting (how to write a library API that your users will love), but the actual talk was a bit on the remedial side for me. The most useful piece of information about that talk was, “go look at Rusty’s ideas on how to make a good API”. Ok, fair enough; I’d not heard about that before.

The lightning talks and conference wrapup seemed to take forever, but I did learn at least one piece of useful information about Greasemonkey and Firebug. Paul Wayper talked about making the web suck less using those tools and demonstrated the results as applied to Myspace; after removing all the annoying stuff, the end result was a blank page. This was clearly the crowd favorite lightning talk and Paul got a little prize for his efforts.

All in all, a simultaneously energizing and exhausting week, and YT is glad that it’s mostly wrapped up. Saturday is Open Day, and then after that, it’s nothing but blue skies, as I’ve got a one way ticket to Christchurch, New Zealand and a blank agenda.

</geek>

lca2008 — thursday summary

Thursday started off with a keynote by Stormy Peters titled “Would You Do It Again For Free?” Some interesting things came out of this talk. First, open source developers are mainly motivated by internal factors. Second, studies show that giving external rewards to people who are doing an activity and later removing the rewards tends to lead to cessation of the activity. Put the two concepts together, and all of a sudden, the idea of, “are companies killing open source by paying for it?” whacks you across the face like a 2×4.

Unfortunately, Stormy shifted focus slightly during this talk, and started talking about how company processes tend to squash developers’ creativity, thus demotivating them, which is still an interesting topic, but not the one she started with. The idea here is that companies need to figure out how to balance traditional corporate process with the creativity and individualism needed to develop software.

I did shoot off an email to Stormy about the topic shift and we ended up chatting for a bit later that evening at the networking event, and we agreed that both topics were interesting, but she was really focusing on the latter.

I skipped the tutorial sessions because I couldn’t build a kernel for Ubuntu that had lguest and wireless both working at the same time. I only mention lguest because that’s the session I’d wanted to attend; I’m sure my personal problems were all contained in ipw2200.

During the afternoon, I went to Nick Piggin’s NUMA pagecache replication talk, and came away with mixed impressions. In a nutshell, it’s a performance feature that replicates a cache coherency algorithm in the operating system rather than in the hardware, and at the page granularity rather than cacheline granularity. So not a groundbreaking concept, but it does take serious hacking-fu to implement it correctly (in other words, please don’t think that I’m trying to detract anything from Nick at all).

My impression was that getting it all working correctly was really hairy and complicated, and that scared me. I’m not entirely convinced that it will actually lead to performance gains, at least not in the world I typically inhabit (large ia64 SMP machines), but it may help on smaller x86_64 machines. Nick doesn’t have any performance numbers yet, so it’s all somewhat up in the air as to how well it’s all going to work, and it’ll be interesting to see where it all ends up.

Next was HP’s own Doug Chapman (drc) doing a kickass demo of his autonomous rockhopper robot to a packed room. All sorts of cool hackery was demonstrated, including streaming live video from his workshop in New Hampshire to Melbourne (with an assist from his neighbor Paul). At least one audience member remarked that it was perhaps the best talk of the conference that he’d seen. Cool beans!

I went to davem’s “Linux on Sun logical domains” talk but didn’t get anything out of it. The fault was mine, not his, but in any case, I checked out kinda quickly.

Last talk of the day was conceptually the most mind-blowing for me. Vik Olliver presented his RepRap which is essentially an open-source rapid prototyping machine, aka a 3-d printer. The printer itself was kinda cool, but more interesting were the concepts around the ability for basically everyone to own a universal fabricator. Think about it — all of a sudden, a lot of problems go away if you can just make stuff on demand, on location.

No more holding inventory, no more transporting finished goods around, different tax implications (people aren’t buying stuff anymore, they’re making stuff for themselves), no ability to embargo or prohibit goods, etc. etc. etc. The implications are powerful.

One last note on why LCA is seriously cool — my G1G1 OLPC was having some problems with the control key getting stuck (a known issue) so Jim Gettys just swapped mine out for a new one. Schweet.

lca2008 — big honking wednesday summary

A few notes on Wednesday…

Bruce Schneier keynote address today. Cool stuff, although nothing groundbreaking (due to years of reading Crypto-Gram and his blog). Some important security concepts:

  • security is best viewed via an economic lens; specifically, we are in a lemons market, aka information asymmetry.
  • in this market, we are all security consumers, making economic tradeoffs (should I wear this bullet-proof vest today? no, it’s kinda hot, and besides, it would clash with my belt)
  • the concept of feeling secure vs. being secure is quite important; divergence between the two is where all of our security problems come from essentially
  • society (broad term to include social conventions, technology, etc.) is evolving faster than our species. human brains aren’t equipped to be good judges of risk anymore.
  • the only way to get out of the current mess is by disseminating more information, and reducing the information asymmetry

I went to Jonathan Oxer’s Second Life talk, and was a bit disappointed, although this was entirely my own fault. I was hoping to get an explanation of why a sane person would care about Second Life. Jonathan’s talk was predicated upon the assumption that one already cared about Second Life and he explained how to do insane things with it.

I should have read his abstract closer, because it was all there. Regardless, it was still quite an entertaining talk (Jonathan is an excellent speaker) and amazingly his demo worked on the first try! Yowza!

Oh, you want to know exactly what insane things he was talking about, probably. Well, the talk mainly focused on attaching a programmable Arduino board to various appliances in your house so that when a real life switch is flipped, it triggers an event on the Arduino which sends some http request to a web server somewhere that then kicks off a Second Life script that then makes a Second Life object do something.

For instance, Jonathan has hooked a switch up to his real life snail mailbox so that when letters are placed inside, lots of magic happens, and then in his Second Life world, his fake mailbox’s flag raises up and it looks like it’s stuffed with snail mail. Someone from the audience suggested hooking up a scanner to the mailbox so he could actually read the mail in Second Life, and of course this idea was extended to then pipe the scanned text through some OCR tool and then run a spam filter on the scanned text so you could actually do automated Bayesian filtering on your real life snail mail.

Like I said: insane.

Next talk was Jonathan Corbet’s Linux Weekly News “State of the Kernel” talk. Jonathan is a good speaker too, but I really shouldn’t have gone to this talk considering I already read lkml, and I was the wrong audience. Oh well, my fault again. I did notice that HP was not one of the top 20 organizations contributing to the kernel (as measured in changesets). Boo.

I attended Jim Gettys’ OLPC talk. Again, poor talk selection on my part because he was preaching to the choir (me). It was interesting to hear the emphasis that the number one inhibitor for laptops in the third world is power. They’re exploring the use of solar panels to replace the hand crank (he had demos of both), but power issues dominate the technical difficulties for actually deploying these things.

By the way, have you noticed that power consumption happens to be a huge issue in “normal” laptops right now? Oh, and in the data center too. As jwz would say, intertwingularity! It sounds completely ridiculous, but theoretically, a motivated kernel hacker who sits down with PowerTOP could do more to save the world than Al Gore ever could. A geek can dream, right?

Last talk of the day was the best. HP’s own Bdale Garbee gave a great talk on peace, love, and rockets. It isn’t every day that you get to hear a CTO of a $100B company talk about the difficulties in getting JTAG working on the ARM chip which he surface-mount soldered onto a two-layer PCB that he designed himself. Fascinating stuff, and I’m pretty sure that as a company, HP most definitely wins the “My CTO has a bigger, um, dongle than your CTO” contest.

Ok, enough! This entry has dragged on long enough, and I need to sleep.

lca2008 — tuesday wrapup

The nerdlinger portion of Tuesday concluded with me wandering back over to the kernel mini-conf to catch the lightning talks and the kernel panel.

Nothing super interesting at the lightning talks (not to detract from them, but they were basically plugs of upcoming LCA events or brief overviews of patch sets or upcoming changes, so nothing earth shattering).

The kernel panel was slightly better, with two very interesting problems and one very humorous moment.

First problem — is it reasonable to ask a user to perform a git-bisect to help a kernel hacker debug a problem? Lots of debate back and forth in the room here, and my personal take is, “probably not”. Joe Average doesn’t have multiple computers — he has his one machine that he wants to play Quake on, and your kernel bug is preventing his machine from booting and fragging or gibbing (or whatever the hell they call it) his opponents. Each git-bisect is going to take two reboots, and it usually takes, what, 7 tries on average to find out what commit broke his machine? Add on top of that the overhead of installing git, cloning a kernel, building said kernel, installing it and the initrd correctly, bla bla bla… man, that makes me tired just thinking about it.

Perhaps some work can be done in this space to write some tools to help automate this process. It would be a win-win, in my opinion, as kernel hackers are certainly going to save time since they’re doing bisects all day long, and Joe Average won’t be intimidated by the arduous process and will get back to his happy state of blowing up professional Korean Quake players all the sooner.
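For the curious, here is a rough sketch of why a bisect over a hundred-odd commits lands at about 7 tries. It is a toy model, not how git itself is implemented: the commit list, the `is_bad` callback, and the position of the bad commit are all made up for illustration, and it assumes history is linear with every commit after the first bad one also being bad.

```python
def bisect_first_bad(commits, is_bad):
    """Return (index of first bad commit, number of kernels tested).

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit after the first bad one is also bad (linear history).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1                 # one build + install + reboot cycle
        if is_bad(commits[mid]):
            bad = mid              # breakage is at mid or earlier
        else:
            good = mid             # breakage is after mid
    return bad, tests


# Hypothetical range: 128 commits after the last known-good kernel.
commits = list(range(129))
first_bad = 77                     # made-up culprit
idx, tests = bisect_first_bad(commits, lambda c: c >= first_bad)
print(idx, tests)
```

Each test halves the suspect range, so 128 candidate commits need 7 build-and-boot cycles. That halving is also what makes the process a decent candidate for automation: the only human input per step is "did it boot or not".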

Second problem — how to track incoming bugs? A thousand bugzillas dot the landscape, and it’s difficult to get them to talk to each other. The fundamental problem is that the underlying database schemas are often different, so it’s not just a simple matter of forwarding bugs to one master bugzilla. I really believe that the distros need to show leadership here and work together to create some set of common schema that they can all agree on to solve this problem. It’s not a technical issue — it’s a human issue. Of course, other developers hate bugzilla, prefer to do everything out of email, and that’s fine too. But for those people using tools, let us, as a community, come to some agreement on the best way to use those tools.

Funny ha ha — question was raised about upstream acceptance of kernel debuggers. Some talk in the room ensued, and at one point, someone on the panel suggested slipping the patches in now, since Linus might not be paying so much attention. In a beautiful moment of comedic timing, willy says, “oh, hi there, we were, um, just talking about ice cream!” just as Linus himself had snuck in the back of the auditorium. Linus seemed to just wave and smile, but I think the “wave” was really that Darth Vader “use the dark side of the Force to mentally death choke” David Miller from across the auditorium. I saw davem twitch a little bit when Linus raised his hand, so it’s the only reasonable conclusion.

lca2008 — release management in free software projects minitalk

Final mini-conf talk of Tuesday for me was tbm’s “Release Management in Large Free Software Projects”. The most interesting part of this talk for me was seeing all the parallels between shipping free software and shipping proprietary software.

It turns out that while free software development is radically different from proprietary software development, trying to ship the end result is remarkably similar, due to the foibles of human nature.

In the end, what large free software projects strive for is… wait for it… yes! predictability! woo! Hey, that’s basically what proprietary software products try to provide too. So fundamentally, the problem space is equivalent.

tbm’s research showed that many large free software projects have settled on time-based releases to try and achieve predictability, and then gave a bunch of examples. Again, I won’t repeat his talk here; go to his website to read tbm’s entire thesis.

The interesting implication for me is that management at proprietary software companies shouldn’t despair about the growing momentum in free software development. Rather, they should view it as an opportunity: management at top-notch software companies know how to ship software, so they ought to be thinking of ways to marry their management expertise with the benefits of open source development models.

Perhaps there’s a business model in there somewhere, where a strong management team can make money by shipping open source software to interested buyers with some level of predictability and responsibility. And yes, you’re reading me correctly if you think that I’m implying that this management team startup idea is marketing both to developers (“hey, let us manage your project and sell it to customers!”) and buyers (“hey, let us sell you this excellent piece of software!”).

Food for thought, anyhow.

lca2008 — fossology minitalk

I got away from the kernel mini-conf for the early part of the afternoon to check out the distro mini-conf. HP’s own Bob Gobeille gave a talk on FOSSology that turned out to be quite entertaining.

The basic problem they’re trying to solve can be described as: what content, exactly, is in a file, and does that content resemble anything that we might care about?

Buh?

Ok, some examples might help. Let’s say both Fedora and Debian are shipping a library, zlib say, and you run ldconfig on both distros and both times, ldconfig says v1.2.3. Cool, they’re shipping exactly the same library right?

Erm, maybe. Who knows what patch levels each distro might be shipping, etc. Source level inspection is pretty much the only way to figure this out.

Let’s say you’re some poor schlub teaching assistant at Uni because the brochure tricked you into thinking that you’d go to some prestigious institution to get a great education, but in reality, you have to teach data structures to 500 indifferent undergraduate halfwits. The half clever bit of wit that all those undergrads have is to figure out who the nerd in the class is that actually understands what a pointer is, promise to get him some girl’s phone number, copy his program, but only hand it in after cleverly changing all the variable names. Genius! The same algorithm used to analyze zlib above can be applied to help detect exactly this sort of cheating independent of variable names. Zing!

Oh, and you can use the algorithm to figure out such mundane things as what free software license any given source file might be distributed under. And it turns out that’s actually the reason HP wrote this tool, but the general underlying principle is extremely powerful because it’s so simple.
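To make the "independent of variable names" bit concrete, here is a minimal sketch of one way to do it. To be clear, this is not FOSSology's actual algorithm (the talk notes above don't spell that out); it is just a common trick, assumed here for illustration: tokenize the source and replace every identifier with a placeholder before comparing. It uses Python's standard `tokenize` module on Python source, and the sample functions are invented.

```python
import io
import keyword
import tokenize

# Token types that carry no information about program structure here.
SKIPPED = (tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
           tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER)


def fingerprint(source):
    """Return the token stream of `source` with identifier names erased."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")       # any variable or function name
        else:
            out.append(tok.string)  # keywords, operators, literals
    return out


# Invented examples: the "nerd's" program, a renamed copy, and honest work.
original = "def total(values):\n    acc = 0\n    for v in values:\n        acc += v\n    return acc\n"
copied = "def sum_all(xs):\n    result = 0\n    for item in xs:\n        result += item\n    return result\n"
different = "def total(values):\n    return sum(values)\n"

print(fingerprint(original) == fingerprint(copied))
print(fingerprint(original) == fingerprint(different))
```

The renamed copy produces the same fingerprint as the original, while the genuinely different program does not. A real tool would need to be fuzzier than an exact match (reordered statements, added comments, partial copies), but the principle is the same.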

The grand vision behind FOSSology is to create a huge repo of source files, analyzing them for… whatever. Licenses, code reuse/originality, dependencies, etc. And HP gets it, which is really cool — we’re going to hand this repo over to the Linux foundation and allow anyone to poke at it.

Good job guys, and good job bobg for the entertaining talk.

lca2008 — cache efficient data structures minitalk

The second talk I attended on Tuesday was “Cache Efficient Data Structures” given by Joern Engel. Personally, I found this talk to be very interesting on several levels, but primarily because it focused on the nexus between theory and implementation that happens to excite me (yes, I am a dork. A good-looking dork.).

The key takeaway from Joern’s talk, IMO, was that computer science theory is dandy in the classroom, but when it comes to actually implementing stuff, theory is insufficient.

The classic data structure that computer scientists prefer using to store information is the hash table, because it can be proven mathematically that operations such as insertions, removals, and lookups all take constant time on average.

Unfortunately, in real life, it turns out that once a data set grows large enough, it doesn’t really matter what you do with it so much as how you access it. In other words, the penalty incurred from a cache miss completely dominates the theoretical running time of data structure accesses.

This means that while a computer scientist is happy to use a hash table to store a million objects, a kernel hacker realizes that the number of cache misses incurred when inserting / removing objects is going to be sufficiently large (as you can only fit one hash table object per cache line) to the point where performance will suffer. You’re going to be spending all your time flushing your caches instead of doing useful work. And may the gods help you if you’re on some sort of multi-cpu cache-coherent system with concurrent readers and writers. Egads.

More importantly for a kernel hacker, that memory doesn’t even belong to you; it’s the user’s memory! Bad hacker! Bad!

The solution is to use more exotic data structures like radix trees or perhaps a B tree variant. The memory characteristics of these structures happen to be much more cache friendly (lower memory overhead and higher number of objects per cacheline), and the gains are worth the increased complexity when managing these structures.
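Some back-of-the-envelope arithmetic shows why objects-per-cacheline matters so much. This is only a sketch under assumed numbers: a 64-byte cache line and 8-byte entries are my guesses for illustration, not figures from the talk, and the helper name is made up.

```python
CACHE_LINE_BYTES = 64    # assumed line size; varies by CPU
ENTRY_BYTES = 8          # assumed size of one packed key or pointer
N = 1_000_000            # the "million objects" from above


def lines_occupied(n_objects, objects_per_line):
    """Cache lines needed to hold n_objects (ceiling division)."""
    return -(-n_objects // objects_per_line)


# One object per cache line, as in the hash table case described above.
sparse = lines_occupied(N, 1)
# Entries packed tightly into tree nodes, several per line.
dense = lines_occupied(N, CACHE_LINE_BYTES // ENTRY_BYTES)

print(sparse, sparse * CACHE_LINE_BYTES)
print(dense, dense * CACHE_LINE_BYTES)
```

Under these assumptions the sparse layout spreads the data over a million cache lines (64 MB of footprint), while the packed layout needs 125,000 lines (8 MB). Every line you don't touch is a cache miss you don't take, and memory you leave for the user.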

Two quick final notes (watch the video if you’re interested; I’ll not repeat it all here): willy mentioned a crazy data structure called an “xor list” which is basically an optimization trick on a doubly-linked list where you xor the forward and backward pointers together to save yourself one pointer’s worth of overhead, and second, Judy hashes (or whatever they’re called) could be cool if normal humans could understand them. As it is, the author of the library is probably the only person on this planet who actually understands how they work. The verbatim quote from the slide:

Judy trees: !#@$$%@##%^&& ????

Might be an interesting project for the motivated individual, to rewrite this library with a usable API.
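As for the xor list trick mentioned above, here is a small sketch of how it works. Python has no raw pointers, so integer indices into a list stand in for addresses, with index 0 reserved as the null pointer; the class and method names are my own invention, and a real implementation would be C with actual pointer arithmetic.

```python
class XorList:
    """Doubly linked list with one link field per node: prev XOR next."""

    def __init__(self):
        self.values = [None]   # slot 0 plays the role of NULL
        self.links = [0]       # links[i] == addr(prev) ^ addr(next)
        self.head = 0
        self.tail = 0

    def append(self, value):
        addr = len(self.values)
        self.values.append(value)
        self.links.append(self.tail)       # prev = old tail, next = 0
        if self.tail:
            # Old tail's next changes from 0 to addr; XOR folds that in.
            self.links[self.tail] ^= addr
        else:
            self.head = addr
        self.tail = addr

    def _walk(self, start):
        prev, cur = 0, start
        while cur:
            yield self.values[cur]
            # Knowing where we came from, XOR recovers where to go next.
            prev, cur = cur, self.links[cur] ^ prev

    def forward(self):
        return list(self._walk(self.head))

    def backward(self):
        return list(self._walk(self.tail))


lst = XorList()
for v in "abc":
    lst.append(v)
print(lst.forward(), lst.backward())
```

The same walk works in both directions because `prev ^ next ^ prev` gives `next` and `prev ^ next ^ next` gives `prev`. The price is that you can't jump into the middle of the list with a single node pointer; you always need the neighbour you arrived from.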

lca2008 — writing a PCI driver using qemu minitalk

Started off Tuesday at the kernel mini-conf. Unfortunately, due to a typo in the printed program, I missed a talk I was kinda interested in seeing (How not to invent kernel interfaces by Arnd Bergmann), and didn’t get there until hch’s talk on writing a PCI driver using qemu.

hch showed a lot of boilerplate code saying that much of it was “trivial, trivial”, and it certainly did seem like it, even to a PCI newbie such as YT. On the qemu side, it was some more “trivial boilerplate code” and again, it all seemed pretty straightforward. He mentioned that the RealTek 8139 would be a good example on the qemu side to try and learn from. He also pointed out that qemu was written by a former winner of the Obfuscated C contest, so the only way to really understand it is to get the swords out and start cutting your way through line after line of uncommented code. Good luck.

Unfortunately for me, I wasn’t smart enough to understand the implication of *why* one would want to use qemu to write a fake piece of hardware and then use it to write a device driver. hch pointed out that it was much easier to simulate hardware using C and qemu compared to writing out VHDL (which I certainly agree with), but I didn’t have any “ah ha” moments from a kernel guy perspective. Oh well, I’m certainly not as smart as hch, so maybe this is something useful for him, but I just didn’t get it.

On a side note, hch seems pretty personable to me; surprising, given his kinda bad guy reputation on lkml. Before his talk, we happened to be sitting outside the lecture hall, and he had no problems watching my bag whilst I got a coffee. Maybe the secret to making him like you is by not writing any SCSI code and certainly not showing it to him if you did.

lca2008 — keysigning and monday closing thoughts

LCA is about letting nerds be nerds, and what’s more nerdy than a GPG keysigning party? (If there exists such a beast, I sure don’t know about it.)

The original plan was to use the Sassaman projected method, but the projector wasn’t so great (projected everything mirrored, which made it rather hard to read the text on the IDs), so after a little bit of confusion, the decision was made to do it the old fashioned way: by standing in a circular queue, checking the IDs of the person in front of you, and rotating. To be honest, I kinda liked that method a bit better because it was much more personal than the Sassaman method.

madduck pulled his “Transnational Republic” trick again, and I guess I fell for it. I asked for his ID, he showed it to me and said, “this is from the Transnational Republic” and I said, “ok, looks good to me”. Hey, I’m just a dumb ‘merkin, I figured it was some crazy Eastern European Slavic country I’d never heard of. Not my fault, says I.

A short break to recuperate, and then I tagged along on a small, ad hoc dinner that ended up as a mix of HP (bobg, tpot, tbm, achiang) and Debian developers (DDs: pasc, madduck, and tbm (tbm wears two hats)). Didn’t talk about anything groundbreaking; mostly revolved around the possibilities and challenges of debian in the corporate world, and things HP might be able to do to help improve on that front (we could start by providing non-broken m11y .debs!).

It was interesting for me to see the discussion evolve into something resembling corporate Linux guys vs keepers-of-the-flame, with bobg and YT taking the lead for $MEGACORP and madduck on point for the hippies. What made it interesting was the challenge in explaining that even though we’re corporate droids, we still believe in the Right Thing ™ and actually try to effect change to the extent that we can, but that there are many internal corporate challenges when someone as large as HP is signing your paystubs. I remember being on the other side of that fence while in University and wondering why the corporations just didn’t get it.

This is not to say madduck is naive; I don’t believe that to be the case at all (although I do find him to be a somewhat representative example of the true believer mentality, aka smart, rational, and holding the unfortunate assumption that the world is populated with other rational agents who can be reasoned with, which in my opinion is sadly and shamefully not the case). Until one works for a large $MEGACORP, one doesn’t truly appreciate the complete dichotomy between developers, whose goal in life is to enable others, and lawyers/managers, whose goal in life is to avoid risk. If you think about it, those goals are almost perfectly at cross-purposes, and more often than not, those with the money (aka, lawyers/managers) win the battle.

But that’s mostly self-evident anyhow and not saying anything interesting. What might be interesting for those on the other side of the fence is the realization that the lawyers and management at $MEGACORP a) are humans too b) can actually be quite clueful when it comes to open source and c) want to effect change for the greater good as much as anyone. But change doesn’t come quick in gigantic companies, and it may be a useful analogy to remind ourselves that the patient, steady dripping of water can eventually create the Grand Canyon.

Well, enough of the sycophantry for now. Time for bed and day 2 of LCA2008.