Archive for October, 2007

Donated web hosting

Having helped set up a handful of websites for non-profit and ad-hoc groups, I’m increasingly convinced that donated web hosting is worth every penny you pay for it. While it seems great to have someone offer to host a site for free, or even donate rack space for a dedicated box, the trade-offs in availability, support, and accountability are severe.

Volunteer tech help is a valuable commodity for non-profits. Anything that makes it harder or less satisfying for your techies to do their work is a Bad Thing, and will cause a host of problems, up to and including total burn-out of your volunteers.

It isn’t that hard to find a single donor who will pay for an economical VPS or managed-server solution as an in-kind contribution. That $30-50 per month will free up many more hours of volunteer time, and should bring with it uptime guarantees, 24×7 emergency support, and secure off-site backup.

backlit


Feature request

Dear Intarwebs: I would like to request that someone implement a “meme killfile.” Specifically, I would like to be able to block all further discussion of [certain](http://bikeportland.org/2007/10/22/kgw-cyclist-dies-after-collision-with-garbage-truck/) [topics](http://www.blueoregon.com/2007/10/smearing-jeff-m.html), no matter how interesting I may find other discussions at those same sites.

Don’t even get me started on [this guy](http://politics.reddit.com/search?q=ron+paul). I would gladly pay to never have to see another discussion of his legitimacy as a candidate.

Perhaps a Greasemonkey script which ties into a Bayesian filtering engine to simply rewrite content I’m sick of out of pages? Think AdBlock for toxic memes.

Tail recursion

Fast Ruby IO

I’ve been writing some simple low-level IO code to copy a few GB at a time around, and since it’s all wrapped up in Ruby synchronization logic, I used my preferred idiom of a `sysread`/`syswrite` loop with a reasonable buffer size:

    bsiz = 65536
    open(from) do |inh|
      open(to, 'w') do |outh|
        begin
          loop do
            # sysread raises EOFError at end of file, which ends the loop
            outh.syswrite(inh.sysread(bsiz))
          end
        rescue EOFError; end
      end
    end

However, I’ve always just sort of picked the above `bsiz` value more or less out of thin air, and realized that it might be far from optimal.

So, I dragged out my old friend `Benchmark.bm`, and ran something like the following:

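    require 'benchmark'

    # Open src_path for reading and dst_path for writing, yield both handles,
    # and treat EOFError (raised by sysread at end of file) as normal completion.
    def with_open_files(src_path, dst_path)
      open(src_path) do |src|
        open(dst_path, 'w') do |dst|
          begin
            yield src, dst
          rescue EOFError; end
        end
      end
    end

    # Copy src_path to dst_path in chunks of bsiz bytes using unbuffered IO.
    def basic_syscopy(src_path, dst_path, bsiz)
      with_open_files(src_path, dst_path) do |src, dst|
        loop do
          dst.syswrite(src.sysread(bsiz))
        end
      end
    end

    src_path = '__data__.in'
    dst_path = '__data__.out'

    # Generate 64 MB of random test data on the first run.
    if !File.exist?(src_path)
      print "generating test data..."
      STDOUT.flush
      `dd if=/dev/urandom of=#{src_path} bs=1024 count=65536`
      puts "done."
    end

    # Time the copy for buffer sizes from 1 KB (2**10) to 4 MB (2**22).
    Benchmark.bm(14) do |b|
      (10..22).each do |exp|
        bsiz = 2**exp
        b.report("bsiz=%8d: " % bsiz) { basic_syscopy(src_path, dst_path, bsiz) }
      end
    end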

The output was interesting, if not earth-shaking:

    lennon@firefly:~$ ruby copy_bm.rb
                        user     system      total        real
    bsiz=    1024:   0.150000   0.700000   0.850000 (  0.854416)
    bsiz=    2048:   0.090000   0.580000   0.670000 (  0.662276)
    bsiz=    4096:   0.070000   0.420000   0.490000 (  0.516608)
    bsiz=    8192:   0.040000   0.390000   0.430000 (  0.433775)
    bsiz=   16384:   0.030000   0.380000   0.410000 (  0.410382)
    bsiz=   32768:   0.020000   0.370000   0.390000 (  0.390833)
    bsiz=   65536:   0.010000   0.370000   0.380000 (  0.379887)
    bsiz=  131072:   0.010000   0.360000   0.370000 (  0.374959)
    bsiz=  262144:   0.010000   0.370000   0.380000 (  0.374990)
    bsiz=  524288:   0.010000   0.380000   0.390000 (  0.586017)
    bsiz= 1048576:   0.020000   0.360000   0.380000 (  0.390283)
    bsiz= 2097152:   0.000000   0.380000   0.380000 (  0.384693)
    bsiz= 4194304:   0.010000   0.370000   0.380000 (  0.380670)

Basically, this tells me that a) past 16 KB or so, buffer size really doesn’t make a big difference (aside from the weird spike in wall-clock time at 512 KB), and b) Ruby really can turn in respectable IO performance, since the baseline for using the `cp` command is only a few percentage points faster:

    lennon@firefly:~$ time cp __data__.in __data__.out

    real    0m0.366s
    user    0m0.000s
    sys     0m0.364s
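
For a sense of scale, the wall-clock timings can be converted into throughput. The test file is 64 MiB (`bs=1024 count=65536`), so dividing that size by each "real" time gives MiB per second. The snippet below is only arithmetic on a few of the figures quoted above; the names `FILE_BYTES`, `REAL_SECONDS`, and `throughput_mib_per_s` are made up for this illustration.

```ruby
# Size of the test file generated by dd: bs=1024 count=65536
FILE_BYTES = 1024 * 65536
MIB = 1024.0 * 1024.0

# Wall-clock ("real") seconds copied from the benchmark and `time cp` output.
REAL_SECONDS = {
  'bsiz=1024'   => 0.854416,
  'bsiz=65536'  => 0.379887,
  'bsiz=131072' => 0.374959,
  'cp'          => 0.366
}

# Convert a byte count and an elapsed time into MiB per second.
def throughput_mib_per_s(bytes, seconds)
  bytes / MIB / seconds
end

REAL_SECONDS.each do |label, secs|
  puts format('%-12s %6.1f MiB/s', label, throughput_mib_per_s(FILE_BYTES, secs))
end
```

Seen this way, the 1 KB buffer manages less than half the throughput of the 64 KB one, while the larger buffer sizes and `cp` all land within a few percent of each other.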

Of course, I don’t know that I’d put a lot of faith in Ruby keeping up with hand-tooled C over the long haul; as the size of the data (or longevity of the process) went up, I would expect memory allocation and garbage collection to start having an impact.
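
One way to take some pressure off the allocator would be to reuse a single buffer instead of letting every `sysread` call build a fresh String. The following is a minimal sketch under that assumption and was not part of the benchmark above; `buffered_syscopy` is a hypothetical name, and it relies on the optional second argument to `IO#sysread`, which fills an existing String in place.

```ruby
require 'tempfile'

# Copy src_path to dst_path, reusing one String as the read buffer.
# sysread(maxlen, outbuf) overwrites outbuf instead of allocating a
# new String on every iteration of the loop.
def buffered_syscopy(src_path, dst_path, bsiz = 65536)
  buf = String.new
  File.open(src_path, 'rb') do |src|
    File.open(dst_path, 'wb') do |dst|
      begin
        loop { dst.syswrite(src.sysread(bsiz, buf)) }
      rescue EOFError
        # sysread raises EOFError at end of file; the copy is complete.
      end
    end
  end
end

# Small demonstration on throwaway files.
demo_src = Tempfile.new('demo_src')
demo_src.binmode
demo_src.write('x' * 200_000)
demo_src.close
demo_dst = Tempfile.new('demo_dst')
demo_dst.close

buffered_syscopy(demo_src.path, demo_dst.path, 4096)
puts "copied #{File.size(demo_dst.path)} bytes"
```

Whether this makes a measurable difference would need its own benchmark run. Ruby 1.9 and later also provide `IO.copy_stream`, which hands the whole loop to the interpreter.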

Gobuntu != Debian-lite

Mark Pilgrim has an [excellent writeup](http://diveintomark.org/archives/2007/10/18/gobuntu-has-already-failed) of how the folks over at Canonical just “don’t get it” when it comes to the Free Software vs. non-free distinction and its importance.

Basically, they’re bundling Firefox in [Gobuntu](https://wiki.ubuntu.com/Gobuntu), their “Free Software-only” variant of Ubuntu. As has been [well-established elsewhere](http://en.wikipedia.org/wiki/Naming_conflict_between_Debian_and_Mozilla), Firefox (and Thunderbird, and other Mozilla-derived projects) is not 100% free-as-in-speech, since its graphics and logos are both trademarked and copyrighted, and the license for their distribution does not allow redistribution or modification.

Personally, I still chuckle a bit every morning when I get to work, and the first thing I do after logging into my workstation is to launch Iceweasel and Icedove, the Debian-named variants of Firefox and Thunderbird, respectively. I don’t really care what the icons look like, and when I’m discussing the applications with coworkers and friends, I still use the “official” names to avoid confusion.

I am quite thoroughly in support of the Debian folks on this one, though. The protection earned for the Firefox “brand” by barring modification of its freaking logo is minimal, and the ill-will they’re earning in the community is real.

To be fair, the whole Mozilla/Firefox concept has never really been about open source as an end so much as getting a competitor to IE out there when Netscape couldn’t keep up. If you can’t effectively build an open source app, does it really count as open source?

I don’t think that makes it evil, mind you; Safari and Opera are fine browsers, too, and I think that diversity in the web browser market is essential.

That being said, let’s call a spade a spade, and acknowledge that the Mozilla Foundation is no more a supporter of Free Software than is Sun, or IBM, or any other group that has latched onto “open source” as a life-preserver after nearly drowning in the MS sea.

Going back

beach shadows

I’ve been slowly working my way back through the last couple of years’ worth of photos that never made it onto Flickr. Many are simply crap, but a fair number simply need some love to coax out a decent image.

I don’t know that I’m ever going to add decent tags and other metadata to them, though, unless some bored programmer implements a Flickr-to-Lightroom metadata extractor. (I realize that I have a fine skillset to accomplish such a thing, but there are only so many hours in the day…)

crab sandwich

i’ve been cooking up a storm lately…we brought back something like 25-30 pounds of pork from the pig roast, so i’ve been motivated to spend more time in the kitchen rather than let any of it go to waste.

this was one of the first meals i’ve cooked all week which didn’t include any pig parts, actually.

it’s a simple setup:

i mixed the left-over garlic mashed potatoes from last night about 3:1 with cornmeal and an egg, then pan-fried them in olive oil on medium-low heat.

smooshed between each pair of potato cakes was about 2 ounces of cooked dungeness crab meat, spritzed with a bit of lemon juice.

i plated ‘em with a bit of chive and a poached egg, and we sat down to a satisfying (if somewhat coma-inducing) brunch.

(of course, the orange-pineapple mimosas probably didn’t help with the "coma" aspect…)


Fair use, Sci Fi, and civility

Those who have spoken with me on the subject of copyright, IP law, and fair use doctrine know that I am passionate about the subject, and consider open source and the Creative Commons to be an essential means of ensuring the expansion and propagation of human knowledge.

With that in mind, I was immensely saddened to read of [Cory Doctorow's recent disagreement with Ursula K. Le Guin](http://www.boingboing.net/2007/10/14/an-apology-to-ursula.html). Both are favorite authors of mine, but I think that both have acted less-than-maturely in this case. Le Guin’s response of “sending in the attack dogs” for what amounted to an overly-long quote (with full attribution intact, mind you) was short-sighted, while Doctorow’s expunging of all references to her work in the BoingBoing archives seems more petulant than prudent.

In the end, though, I really have to side with Doctorow on this one. I would expect as dignified and visionary a writer as Le Guin to understand just how critical an open dialog about issues of copyright and license will be not only to her profession, but to all of society. Unfortunately, her behavior — mediated entirely through her literary agents and the SFWA — suggests an all-too-common misunderstanding of the role and intent of the Creative Commons.

Re-use, re-mixing, citation, linking, and attribution of content is the only sane and sustainable approach to dissemination of content in the networked era. The traditional hard-line model of litigating any alleged misappropriation of works is simply not viable, and it saddens me to see such a wonderful, creative mind so obviously fail to comprehend this fact.

I would attempt to contact her about this issue, but since she lists no means of contact on her website aside from a mailing address (and basically states that she will make no effort to respond to letters unless it is convenient), I feel my only recourse is to refrain from buying her works in the future.

Hello, LDAP!

_Edit: Fleshed out example a bit more to actually do something interesting, and to pass compiler warnings._

I’ve finally decided to hunker down and force myself to start writing some Haskell, even if only as an intellectual exercise. Looking forward, I actually think it could be a really interesting exercise to start re-implementing key pieces of our security infrastructure that way, but I’m still in the early “baby steps” stage.

So without further ado, here is my first working Haskell program:

    import LDAP.Types
    import LDAP.Init
    import LDAP.Search
    import System.IO (hFlush, stdout)

    baseDN :: String
    baseDN = "ou=people,dc=reed,dc=edu"

    hostName :: String
    hostName = "localhost"

    portNum :: LDAPInt
    portNum = 10389

    -- Attributes to match the search value against.
    searchAttributes :: [String]
    searchAttributes = ["uid", "cn", "givenName", "sn", "mail", "title"]

    -- Build a single equality clause, e.g. "(uid=value)".
    buildOneFilter :: String -> String -> String
    buildOneFilter attr value = "(" ++ attr ++ "=" ++ value ++ ")"

    -- OR together one equality clause per search attribute.
    buildSearchFilter :: String -> String
    buildSearchFilter value =
      "(|" ++ concatMap (\attr -> buildOneFilter attr value) searchAttributes ++ ")"

    main :: IO ()
    main = do
      putStr "search filter: "
      hFlush stdout  -- make sure the prompt appears before blocking on input
      filterValue <- getLine
      conn <- ldapOpen hostName portNum
      ldapSimpleBind conn "" ""
      -- putStrLn ("using filter " ++ buildSearchFilter filterValue)
      results <- ldapSearch conn
                   (Just baseDN)
                   LdapScopeSubtree
                   (Just (buildSearchFilter filterValue))
                   LDAPAllUserAttrs False
      -- concatMap (unlike foldl1) also copes with an empty result list
      putStrLn ("results: " ++ concatMap (\r -> "\n  " ++ ledn r) results)

Lovely, no?

Okay, so it’s kind of a pathological example; I mean, no one starts their Haskell tutorials with database integration tasks *for a reason*. This program doesn’t even define any interesting functions — it’s basically a transliteration of the equivalent Perl code to monadic Haskell.

That being said, it’s a start. I’ve never had a terribly hard time thinking in terms of abstract data structures, so I’m hoping that once the initial shock to my system has passed, I’ll be able to start using Haskell to prototype (or at least mock up) some new tools.