Homebrewing Knowledge-Base from HBD Archives?

Uh-oh!

Started thinking again. This time about a way to repurpose messages on the HomeBrew Digest into a kind of database of brewing knowledge. I can just see it. It’d be ah-some!

Anybody know how to transform email messages from well-structured digests into database entries? Seems to me that it should be a trivial task, especially for someone well-versed in Perl and/or PHP. But what do I know?

That venerable HBD mailing-list contains a wealth of information about pretty much every single dimension of beer homebrewing. For a large number of reasons, content from the HBD.org site turns up quite often in Web searches for brewing terms.

One issue with the HBD, though, is that it’s a bit hard to search. There used to be a custom-built search feature on the site but we now need to rely on Google and AltaVista. This wouldn’t be too much of an issue if not for the fact that those engines search complete digests instead of individual messages. So the co-occurrence of two terms in the same digest can be due to two messages on completely different subjects.
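To make the co-occurrence problem concrete, here's a toy sketch in Python. The two "messages" are made up for illustration and the `matches` helper is hypothetical; real search engines do far more than substring matching, so take this only as a picture of the problem.

```python
# Toy data: one digest holding two unrelated messages (contents are made up).
digest = [
    "Does hot-side aeration really cause staling?",
    "Looking for a batch sparging schedule for a pale ale.",
]


def matches(text, terms):
    """Return True if every search term occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return all(term.lower() in lowered for term in terms)


terms = ["aeration", "sparging"]

# Digest-level search: the whole digest counts as one document.
digest_hit = matches("\n".join(digest), terms)

# Message-level search: each message is checked on its own.
message_hits = [msg for msg in digest if matches(msg, terms)]

print(digest_hit)    # True: the terms co-occur in the digest
print(message_hits)  # []: no single message mentions both
```

The digest "matches" even though nobody ever wrote about both topics in the same message, which is exactly why indexing individual messages would help.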

Another issue with the HBD (as with many other mailing-lists) is the relatively high redundancy in message content. Some topics come up cyclically on the mailing-list and, though some kind souls are gracious enough to respond to the same queries over and over again, the mailing-list often looks like an outlet for FAQs. Among HBD “perennials” (or cyclical topics) are discussions of the effects of HSA (hot-side aeration), decoction mashing, and batch sparging, to name but a few technical issues.

Unfortunately, it looks like the HBD might need to be retired at some point in the not-so-distant future, if only for lack of sponsorship. Also, Pat Babcock, the digest’s “janitor,” recently asked for mirror space and announced the retrieval of some of the older digests (from the late 1980s).

Of course, there are lots of other brewing resources out there. So many, in fact, that it can be overwhelming to the newbie brewer. One impact of having so much information so easily available about homebrewing (and commercial brewing, for that matter) is a “democratization of beer knowledge.” Unlike the brewing guilds of medieval times, brew groups are open and free. Yet a side-effect of this is that there isn’t a centralized authority to prevent disinformation. Also, because the accumulated knowledge is difficult to peruse, people tend to “reinvent the wheel.”

In Internet terms, the HBD is the closest equivalent to a historical source. Few other mailing-lists have been running continuously since 1986.

Luckily, all the digests since October 1988 are available as HTML files. And the digest format has remained almost unchanged since that time.
All of the content is in plain ASCII. Messages never exceed a certain length. IIRC, line length is also controlled. And HTML was officially not admitted. Apparently, some messages did contain a bit of HTML code, but that shouldn’t be an issue.

Here’s what I imagine could be done:

  1. “Burst” out digests into individual messages (with each message containing digest information)
  2. Put all the individual messages (350MB worth) into a Content Management System
  3. Host the archived messages in the form of a knowledge-base
  4. Process those entries for things like absolute links and line breaks
  5. Collect messages in threads
  6. Add relevant del.icio.us-like tags and slashdot- or digg-like ratings
  7. Use this knowledge-base for wiki-like collaborative editing
  8. Assess some key issues to be taken up by brewing communities
  9. Add to the brewing knowledge-base
  10. Build profiles for major contributors and major groups
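For what it's worth, step 1 might look something like the Python sketch below. It rests on assumptions I haven't checked against the actual archives: that messages within a digest are separated by lines made only of hyphens (30 or more), and that each message opens with `Date:`, `From:` and `Subject:` header lines followed by a blank line. The names `burst`, `SEPARATOR` and `sample` are made up for illustration, and so is the sample digest.

```python
import re
from email import message_from_string

# ASSUMPTION: messages are separated by lines made only of 30+ hyphens.
SEPARATOR = re.compile(r"^-{30,}[ \t]*$", re.MULTILINE)


def burst(digest_text, digest_id):
    """Split one digest into per-message records that keep the digest info."""
    records = []
    for chunk in SEPARATOR.split(digest_text):
        chunk = chunk.strip()
        if not chunk:
            continue
        msg = message_from_string(chunk)
        # Preamble and trailer chunks carry no Subject header: skip them.
        if msg["Subject"] is None:
            continue
        records.append({
            "digest": digest_id,
            "position": len(records) + 1,
            "subject": msg["Subject"],
            "from": msg["From"],
            "date": msg["Date"],
            "body": msg.get_payload().strip(),
        })
    return records


# A made-up digest, only loosely shaped like a real one.
sample = """\
HOMEBREW Digest #0000 (made-up sample)

Topics:
  Decoction question
  Re: Batch sparging
----------------------------------------------------------------------
Date: Mon, 1 Jan 2001 10:00:00 -0500
From: brewer@example.com
Subject: Decoction question

Is a single decoction worth the effort?
------------------------------
Date: Mon, 1 Jan 2001 11:00:00 -0500
From: other@example.com
Subject: Re: Batch sparging

Works fine for me.
------------------------------
End of sample digest
"""

records = burst(sample, "0000")
for record in records:
    print(record["digest"], record["position"], record["subject"])
```

Each record could then become one entry in whatever Content Management System ends up hosting the archive. A message body that happens to contain a long line of hyphens would be split in the wrong place, so a real version would need a stricter rule.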

Because I couldn’t help it, I started writing down some potential tags I might use to label messages on the HBD. It could be part “folksonomy,” part taxonomy. For one thing, it’d be useful to distinguish messages based on “type” (general queries about a brewing technique vs. recipe posted after a competition) since many of the same terms and tags would be found in radically different messages.

Beer Explosion and Other Cautionary Tales

Here’s an old message I sent to the Members of Barleyment brewclub mailing-list, a while ago.

-------- Original Message --------

Subject: Beer Explosion and Other Cautionary Tales
Date: Mon, 1 Mar 2004 09:04:41 -0400
From: Alexandre Enkerli <aenkerli@indiana.edu>
To: brewers@wort.ca
Got back from the in-laws this morning. The house smelled like beer.
Not really a good sign.
Had brewed a batch and bottled another one on Thursday. Left Friday
afternoon. Thought the yeasties didn't need their herder for the
weekend. The new Scotch Ale seemed happy, bubbling in a cool carboy
with blow-off tube. The bottles of Mep were all warm and cozy, didn't
seem to want to transform into little bottle bombs, yet.
Where's that smell coming from? Oh, well, people were in the house
during the weekend so if a catastrophe happened, they probably know
about it. But let's check the bottles, just to make sure. Snif.
Snif-snif. Sniffffffff... Nope, no b.o. (beer odour) here. Fine, then.
Talked a bit with SWMBO before she left for work. Thought about going
back to bed (got home before 7am). Hey, it's Spring Break for everyone,
right. But no /Girls Gone Wild/ shooting in perspective. Just this beer
smell...
Speaking of beer: how's the new batch coming? It's always cool to check
on a fermenting beer. Except, that...
OMG! What's that thing where the carboy used to be? Did someone put it
somewhere else? Looks like it. An empty beer pack isn't where it was on
Friday. But, wait. This is the t-shirt that served as a carboy-jacket.
Why's it all wet? And where's the Scotch Ale?
Hey, the blow-off tube's still here. So is the wine bottle at the end
of the blow-off tube...
Uh-oh!
Oops!
There you go. That's where the b.o.'s coming from. And that's where the
carboy morphed into a pile of shattered glass in a pool of wort. Smells
good, though.

Let's learn some lessons:
a) Murphy's Law applies to brewing
b) yeast can be mighty strong
c) a rubber stopper can stick to a carboy more strongly than the
carboy's walls themselves
d) a blow-off tube shouldn't be constricted
e) there's a reason to have a headspace above fermenting wort in a
primary
f) it's a good thing to have your fermenters in the basement
g) carboys break fairly cleanly
h) a 5 gallon carboy filled with about 4.8 gallons of wort might make a
mess of ca. 1.5m^2
i) New Brunswick's blue plastic bags for "dry" trash aren't really
sturdy
j) there are situations where beer odors don't smell so good
k) it's probably a good thing to open-ferment ales in primary

["Whoooooo are you? Who-Who? Who-Who?"]
Sara's surprisingly not in the mood for beer this early in the morning,
so Warrick's the one taking the pictures and sending the yeast to Greg
for DNA analysis. Al establishes time and cause of death: carboy
explosion. Grissom, using his in-depth knowledge of brewing,
establishes a timeline. Lag time was probably around 9–10 hours,
blow-off tube was blocked after 30 to 48 hours, pressure accumulated at
a rate of 2 PSI/hour, carboy exploded about 66 hours after pitch-in,
most of the wort dried off in the remaining 18 hours.
Stokes notices some mud-like substance on a fragment of glass. Analysis
comes back: precipitated protein, yeast sediment... Yup, it's trub. But
how did it get there?
Catherine tours brewpub to identify the victim. The brewmaster at the
pub: "Hey, it looks *somewhat* like Scotch Ale, but real Scotch Ale
would be maltier and bigger." A botched attempt at Scotch Ale? A
lagered Tripel? Maybe...

Ale-X, not in Vegas

References/Apologies to:
http://www.homebrewers.com/product/600671
http://www.hum.utah.edu/english/faculty/brunvand.html
http://www.acsu.buffalo.edu/~insrisg/nature/nw00/laFontaine.html
http://www.edwards.af.mil/history/docs_html/tidbits/murphy's_law.html
http://www.cbs.com/primetime/csi/main.shtml

I hope this might help others, if only as a funny anecdote.

TechYesRati!

Woohoo! We’re back on, baby!

Technorati Blog Info: Disparate

Mysteriously, this here main blog of mine wasn’t getting updated in Technorati’s famously unreliable databases. For about six months, my new posts and incoming links weren’t showing up. It now works. So, that’s cool.

Not that it’s likely to bring me traffic or to increase my ranking somewhere. But it might bring me more of the attention from cool people that some entries have garnered me, on occasion. Call me vain all you want (anyone who discusses blogging eventually calls some bloggers vain, it seems) but there’s something fun about getting noticed if you eventually get to contribute something back. Usually, non-blog venues work better for me to achieve those goals. Mailing-lists are especially good, for me. Or, possibly, forum comments.

Actually, the problem might be that blogging is still not a very natural thing for me to do. And my writing habits are possibly incompatible with blogging culture (though, not with the nature of blogging).

Still, blogging has been fun. Technorati might even make it a little bit more fun.

Who knows, maybe some people will eventually comment on my posts… 😉

For Those Who Don’t Grok Blogging

A friend sent me this link:

How to Dissuade Yourself from Becoming a Blogger – WikiHow
Cute, but not that insightful.

“Defending” Mailing-Lists (Draft)

[Should edit this heavily. At some point. If time allows. It’s already somewhat on the long side of things…]

Been on hundreds of mailing-lists during the last thirteen years. Yes, literally.
All sorts of distribution lists, listservs, Yahoo! Groups, listprocs, Google Groups, majordomos, announcement lists, etc. Lists about community projects (ilesansfil.org, BiscuitChinois.net, AcadieUrbaine.net…), academic disciplines (linguist-list, Anthro-L, SEM-L…), open-source projects (BibDesk.sf.net, StrangebrewJava.sf.net…), commercial software (OmniOutliner-Users, EccoPro…), hobbies (MontreAlers.ca, SweetMarias.com…), communities (Causerie, MaliNet…), online stores (CalabashMusic.com, SAQ.com…). The list goes on and on.
Those who assume that “email is dead” probably give little consideration to mailing-lists. To them, mailing-lists may easily be replaced by Web-based forum- or blog-style comment systems. Yet, to me, and despite all the hype about what Tim O’Reilly calls “Web 2.0,” mailing-lists are one of the most interesting things happening online. Yes, even today. Maybe they’re not really here to stay but mailing-lists have yet to be replaced by “better” technology.

Mailing-lists are based on simple technology and vary greatly in the way such technology may be implemented on each list. Several announcement lists are quite similar in effect to XML-based syndication (RSS and Atom). You get a message anytime new content is added (for instance, on BorowitzReport.com or MarkFiore.com). Others are very interactive and dynamic, with dozens of people sending each other messages throughout the day (Members of Barleyment are part of one such list). In either case, a mailreader (Eudora, Thunderbird, Mail.app, Entourage, Outlook…) is a very convenient “aggregator” as list messages can be checked quite regularly, may be routed into different folders automatically or manually, are easy to label and archive, and use relatively little bandwidth or disk space (though my current mail folder weighs in at about 5GB and doesn’t include all of my mailing-list content for even the last five years).
On more interactive mailing-lists, using a mailreader is even more beneficial because mail editors are usually much more efficient than browser- or Web-based editors, especially when replying to somebody else’s comments. Furthermore, editing list posts in a mailreader makes it easy to archive and search their contents in a centralized place. On several occasions, looking through my list archives for my own submissions or those of others has been a very efficient way to find information and put it in its proper context.

Contrary to Web-based content, mailing-lists are not usually about getting larger audiences. While some list subscription numbers are rather impressive, many mailing-lists give more value to what happens on-list and off-list between listmembers than to the possibility of getting advertisement monies. As such, mailing-lists are much less likely to get hyped than Web-based “social” projects. Yet mailing-lists are often where important things are really happening online.

In some ways, mailing-lists are “push technology” done right (anybody remember the hype surrounding PointCast? Anybody believe PointCast had that much impact?).
Some mailing-lists (Humanist, HomeBrew Digest) have long histories and their archives are among the most valuable sources of online information.

Much mailing-list traffic is made of threads. Threads have lives of their own, often splitting into multiple subthreads and follow-ups. As such, they do look like comments on a Web forum or blog, but are quite possibly more fluid. This fluidity might imply a lower “signal to noise ratio” in some cases as off-topic messages multiply, but some of the more open mailing-lists greatly benefit from the “stream of consciousness” effect of having threads develop in different directions.

Many mailing-lists are really about building communities. Though blogs and “social networking” sites are seen as community-builders, mailing-lists are, in my humble opinion, more efficient ways to build stronger and longer-lasting online communities.
Although subscribing to a mailing-list is almost as easy a process as subscribing to an XML-based “feed” (RSS or Atom), becoming a listmember is often an easy way to fully integrate into a community. It’s common practice, on many interactive mailing-lists, to introduce yourself as soon as you subscribe to a list or before you start posting queries. Responses to these introductions are typically welcoming and often generate interesting discussions. On some of the more personal mailing-lists, unsubscribing from a list may also be an interesting process as people’s parting words can be quite revealing.
Mailing-lists often emulate societies as group dynamics grow from the meeting of individual personalities. Contrary to blogs, mailing-lists are often based on large numbers of “authors” and “replies” have the same status as “posts.”
Members of mailing-lists often develop long-lasting relationships. This is especially obvious on the more personal lists where members will go to great lengths to visit each other. But even academic mailing-lists often give way to important collaborations between members. In some ways, listmembers know each other on a deeper level than comment writers on Web-based content.
Even more important than list posts, listmembers interact through private messages. Yes, like many might do on “social” sites. The difference here is in the transition from list to private communication which, though not strictly codified, often follows interesting lines. Because listmembers form a specific group (however open and large that group may be), those who interact through private messages already have the possibility to refer to a shared “history,” especially if both of them have been active members of the list for a significant amount of time. Similar processes have been happening on some IRC channels, chatrooms, MMORPGs, and in some blogging communities but private interactions stemming from mailing-lists tend, in my experience, to be broader-reaching than other forms of online communication.

None of this is meant to say that mailing-lists are the only “cool” thing happening online. In fact, the claim is that mailing-lists are simply more useful than “cool.” The hope isn’t to have mailing-lists remain what they currently are, but for mailing-lists to transform and integrate into other online technologies. For instance, a few Web forum commenting systems send detailed notifications when new messages are added in a thread. This could be improved by allowing replies to these notification messages as an easy way to post Web comments. Mailreaders could greatly improve their handling of mailing-lists as, to this day, none of them seems to even facilitate the distinction between a list address and a personal address. While some scripts exist to facilitate the creation of separate folders for different mailing-lists, mailing-list content often remains difficult to distinguish from private messages. List messages received in digest formats are “unpacked” by only a few mailreaders. Threaded mailreading (in Gmail, Mail.app, and Thunderbird) has improved over the years but is still imperfect. Mailing-list software has come a long way but much more could be done in terms of archiving and repurposing list content.
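As a rough idea of how a script could tell list traffic from private mail, here's a minimal Python sketch. It assumes the list software adds standard headers such as `List-Id` (RFC 2919) or `List-Post` and `List-Unsubscribe` (RFC 2369); plenty of lists, especially older ones, may not, so this is a heuristic at best. The `is_list_message` helper, the addresses and both sample messages are made up.

```python
from email import message_from_string

# Headers that list software commonly adds (RFC 2919 and RFC 2369).
LIST_HEADERS = ("List-Id", "List-Post", "List-Unsubscribe")


def is_list_message(raw_message):
    """Guess whether a raw message came through a mailing-list."""
    msg = message_from_string(raw_message)
    return any(msg[name] is not None for name in LIST_HEADERS)


# Two made-up messages: one sent through a list, one sent privately.
list_mail = """\
From: someone@example.com
To: brewers@example.org
List-Id: Example brewing list <brewers.example.org>
Subject: Batch sparging

Anybody tried it?
"""

private_mail = """\
From: someone@example.com
To: me@example.net
Subject: Re: Batch sparging

Replying off-list.
"""

print(is_list_message(list_mail))     # True
print(is_list_message(private_mail))  # False
```

A mailreader doing something along these lines could route list mail to its own folder, or at least flag it, without a separate rule for each list.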

Ah, well…