Rod Waldhoff's Weblog  

 Wednesday, 30 July 2003 
The Silent Majority Pays for Open Source #

When you think of a writer, you probably imagine a book author, a journalist, an essayist, or even a script writer of some sort, but it's generally someone who makes a living selling their words. But the majority of text isn't produced for direct or even indirect sale: it's used to remind your spouse to pick up a gallon of milk at the store, or to thank Aunt Rita for that lovely sweater, or to point your friend to that really funny website you saw this morning. Even the majority of text someone gets paid to write isn't written for sale: it's in business correspondence, or restaurant menus, or on mortgage applications, or bus schedules, or in instruction manuals, or on the back of cereal boxes. Practically everywhere you look you can find the written word. Someone was paid to write much of it, and yet hardly ever are you buying the words themselves.

Most of the time, text isn't the end product; it's a tool for communication. Lots of folks are paid to write something, but very few of them view the text itself as the product. It's a means to an end.

Yesterday Alan Williamson asked "Who pays for open source?" and answered his own question with "The great belief, as sung by pretty much all companies involved in open source, is 'we charge for support'."

Andy Oliver suggests this is but one open source business model, but goes on to define four "forms" that all pretty much come down to "we charge for support".

I think Steven Berkowitz, in a comment to Alan's post, gets much closer to the truth: "Open source projects that somehow make money for someone, be it through support, consulting, etc., are the exceptions." I'll take that comment one step further: Modulo outsourcing, software projects that directly make money for someone, be it through sales, support, etc., are the exceptions.

I'm going to make an assumption here, but I think it's an assumption I can safely make. The majority of software developers aren't selling software, or even software support and services.

Living as I do in Chicago, where I'm told both the per capita and absolute numbers of software developers are higher than in Silicon Valley, I think this observation is a bit more obvious. I know a lot of software developers, but practically none of them build software for direct sale to consumers or businesses. When a "software development" firm is hired, it's often for custom (or at least customized) development. Even when we sell shrink-wrapped software, as my company does, it's not the software that customers are really buying.

For most companies and for most developers, software isn't a business, it's a tool for getting the real work done.

In this scenario, it's easy to see why a company might use open source software: just find me the best tool for the job. It's also easy to see why a company might allow its IT staff (internal or outsourced) to contribute to open source development. Critical as it might be, and as disastrous as it can be when it fails, few companies rely upon software for their competitive advantage. I don't care if my competitors use the same web or database server that I do, or for that matter the same XML parser, caching engine, unit testing framework, or database connection pool that I do. Indeed there is some advantage to me if the non-proprietary parts of my infrastructure become commodities.

While the public face of open source software might be the folks that are trying to build a business around it, I suspect that's because, well, they're selling something. Who pays for open source development? My guess is that the answer is the same as it is for most software development: the folks who are trying to get something else done.

 Tuesday, 29 July 2003 
Wanted: GUI Wrapper for Text-Based Java Applications #

I've got a simple console-based (i.e., text interface) Java application that I'd like to expose via Java Web Start or as an Applet. I can't, however, because System.in and System.out aren't attached to anything useful, so my application runs invisibly and can't collect any input.

What I need is a simple wrapper application that creates a basic text i/o frame that looks like stdin/stdout to my application's main method. (Or, failing that, to a custom main(String[] args, InputStream stdin, PrintStream stdout, PrintStream stderr) method.) More elaborate interfaces, such as color coded output, distinct stdout and stderr frames, spool to file, etc., are easily imagined.
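Here is a minimal sketch of the plumbing such a wrapper would need, assuming Swing is available (so it would not help on a JDK 1.1 VM). The names ConsoleStreams and ConsoleFrame are made up for illustration; this is not an existing library. A piped stream stands in for stdin, and a PrintStream forwards whatever the application prints to a text component:

```java
import java.awt.BorderLayout;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.PrintStream;
import java.util.function.Consumer;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.JTextField;
import javax.swing.SwingUtilities;

/** Stream plumbing only, no GUI: a fake stdin fed by type() and a fake stdout. */
class ConsoleStreams {
  private final PipedOutputStream typed = new PipedOutputStream();
  private final PipedInputStream stdin;
  private final PrintStream stdout;

  /** The sink receives each chunk of text the wrapped application prints. */
  ConsoleStreams(Consumer<String> sink) throws IOException {
    stdin = new PipedInputStream(typed);
    stdout = new PrintStream(new OutputStream() {
      @Override public void write(int b) {
        write(new byte[] { (byte) b }, 0, 1);
      }
      @Override public void write(byte[] buf, int off, int len) {
        // Decodes with the platform default charset.
        sink.accept(new String(buf, off, len));
      }
    }, true);
  }

  InputStream in() { return stdin; }

  PrintStream out() { return stdout; }

  /** Feeds one line of user input to the fake stdin. */
  void type(String line) throws IOException {
    // The pipe buffer is small; this blocks if the application stops reading.
    typed.write((line + "\n").getBytes());
    typed.flush();
  }
}

/** Swing front end: a text area for output, a text field for input. */
class ConsoleFrame {
  static ConsoleStreams open(String title) throws IOException {
    // For brevity the frame is built on the calling thread.
    JTextArea output = new JTextArea(24, 80);
    output.setEditable(false);
    JTextField input = new JTextField();

    // Output arrives on the application's thread; hand it to the event thread.
    ConsoleStreams streams =
        new ConsoleStreams(text -> SwingUtilities.invokeLater(() -> output.append(text)));

    // Pressing Enter sends the field's contents to the application's stdin.
    input.addActionListener(event -> {
      try {
        streams.type(input.getText());
        input.setText("");
      } catch (IOException e) {
        output.append("[stdin closed: " + e.getMessage() + "]\n");
      }
    });

    JFrame frame = new JFrame(title);
    frame.getContentPane().add(new JScrollPane(output), BorderLayout.CENTER);
    frame.getContentPane().add(input, BorderLayout.SOUTH);
    frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
    frame.pack();
    frame.setVisible(true);
    return streams;
  }
}
```

A launcher would call ConsoleFrame.open("My App"), pass the result's in() and out() to System.setIn, System.setOut and System.setErr, and then invoke the wrapped application's main method. In a sandboxed applet or Web Start application those System calls need the setIO runtime permission, so the custom main signature above may be the only option there.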

I believe such an application would have fairly broad utility. A web search finds quite a few elaborate telnet/ssh oriented terminal emulators, which one might be able to pare down to what I'm looking for, but it seems like someone would have already tackled this problem. Lazy web to the rescue?

 Thursday, 24 July 2003 
A is for Axion ... Z is for Zaurus #

I've got Axion up and running on my Zaurus.

If you'd like to do the same, here's how:

  • Obtain a build of the Axion HEAD (1.0M3-dev). See the instructions on building Axion for details.

  • Obtain binaries of Axion's runtime dependencies, namely Jakarta Commons Collections (currently a nightly build is required) and Jakarta Commons Logging (you'll want release 1.0.3; earlier versions don't work on the Jeode VM).

    If you want to use the LIKE operator, you'll also need Jakarta Regexp.

    If you want to use the BASE64ENCODE or BASE64DECODE functions, you'll also need Jakarta Commons Codec.

  • Ensure that you have the java.util Collections and java.sql JDBC packages available on your platform. Since the Insignia Jeode VM installed on the Zaurus by default is JDK 1.1 based, I had to copy over the JDK Collections manually. I had thought the Collections package was available as a standalone JAR to drop into JDK 1.1 environments, but I was unable to find it, so I just copied over the rt.jar from a JDK 1.3 installation. I think the java.util package alone would suffice.

  • Obtain a copy of axiondb.properties. You can pull this out of the Axion JAR or from CVS. Normally Axion loads this file out of the JAR automatically, but the Jeode VM seems to always return null for getClassLoader so we had to add a mechanism for specifying the configuration as an external file.

  • Now simply copy those files over to the Zaurus, add the JARs to your classpath, and you are ready to go. For example to run the Axion console I use:

    evm \
     -Dorg.axiondb.engine.BaseDatabase.properties=axiondb.properties \
     -classpath axion-1.0-M3-dev.jar:commons-collections.jar:commons-logging.jar:commons-codec.jar:regexp.jar:rt.jar \
     org.axiondb.tools.Console $1 $2

Similar steps should get Axion up and running on other micro or J2ME platforms or for that matter, other JDK 1.1 environments. If the console is any indication, Axion seems to run rather well on the Zaurus (even running off a compact flash card rather than the RAM disk).

If folks were interested, it wouldn't be difficult to create an IPK installer for Axion, although my interest here, and others as well I imagine, is in using Axion within other apps on the Zaurus, rather than playing with Axion via the console.

If you're curious, I did have to make a few minor changes to Axion to get it to run on the Jeode/JDK 1.1 VM. Here's a brief list of what I had to change:

  • The Jeode VM always seems to return null for getClassLoader (rather than throwing a security exception), so I had to add checks for null, and provide an alternative mechanism for loading the properties file.

  • We had used File.createNewFile in several places, which is a JDK 1.2 method. These calls turned out to be unnecessary anyway, so I simply removed them.

  • Although it's easy to add java.util.Comparable and java.util.Comparator to the classpath, none of the core objects (Number, String, etc.) actually implement Comparable in JDK 1.1, so I had to add custom Comparators replacing ComparableComparator for those DataTypes.

  • For reasons I don't understand, in several places where we had a hierarchy like this:

    interface Foo {
      void someMethod();
    }

    abstract class AbstractBar implements Foo {
      void anotherMethod() {
        doSomething();
      }
    }

    class Bar extends AbstractBar {
      // interface methods are implicitly public, so the implementation must be too
      public void someMethod() {
        doSomethingElse();
      }
    }

    I had to declare the interface methods in the abstract class:

    abstract class AbstractBar implements Foo {
      public abstract void someMethod();

      void anotherMethod() {
        doSomething();
      }
    }

    to make various AbstractMethodErrors go away.

These changes have already been checked into the HEAD version of Axion, and should be part of the Milestone 3 release.
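The first change in that list can be sketched roughly as follows. PropertiesLoader and its methods are hypothetical names for illustration, not Axion's actual code, and the system property is simply the one from the evm command line earlier in this post. The point is the null check before touching the class loader, with an external file as the fallback:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

class PropertiesLoader {
  /** Names an external properties file; taken from the evm command line above. */
  static final String FILE_KEY = "org.axiondb.engine.BaseDatabase.properties";

  /** Loads the resource via owner's class loader, else from the external file. */
  static Properties load(Class<?> owner, String resource) throws IOException {
    Properties props = new Properties();
    // try-with-resources tolerates a null resource, so no extra guard is needed to close
    try (InputStream in = open(owner.getClassLoader(), resource)) {
      if (in != null) {
        props.load(in);
      }
    }
    return props;
  }

  /** Returns a stream for the resource, or null if neither source is available. */
  static InputStream open(ClassLoader loader, String resource) throws IOException {
    // getClassLoader() may return null, so check before dereferencing it
    if (loader != null) {
      InputStream in = loader.getResourceAsStream(resource);
      if (in != null) {
        return in;
      }
    }
    String path = System.getProperty(FILE_KEY);
    return (path == null) ? null : new FileInputStream(path);
  }
}
```

On a standard JVM, String.class.getClassLoader() returns null because String is loaded by the bootstrap loader, which makes it a convenient way to exercise the fallback path without a Zaurus at hand.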

There are a few Axion-based apps I've been thinking of tinkering with on my Zaurus, some for personal use, others for my day job. That may shake out a few issues I haven't yet encountered, but so far I've gotten every feature I've tried working without too much trouble.

 Wednesday, 23 July 2003 
The Great Wall of China #

Although I'm neither old enough nor wise enough to pull it off effectively, I'm a fan of management by narrative: using a brief story or parable to illustrate a point. Often I find it is sufficient to present the narrative without explicitly connecting the dots--I'll just tell the story and then launch into what I was otherwise going to say. (As always, you need to understand your audience to use this approach effectively. I've noticed developers often respond well to leading with the narrative. Suits, whether management, business or marketing folks, often respond best to hearing the explanation and then the metaphor.)

Yesterday, in the context of the early stages of a fairly large scale development project, I had the opportunity to tell the following.

When the emperor decided to build the 10,000 Li Wall, he didn't amass an army of men to start at one end and build to the other. Nor did they start at both ends and build toward the middle, the way one might dig a long tunnel. Rather, they assembled small teams of men and spread them far and wide. Each crew was responsible for constructing a small section of the wall, separated by many li from the neighboring sections.

Like all ancient construction, the work was difficult, and by the time the section was complete the crew was greatly discouraged. They had worked so hard, and seemed to have little to show for it but a single, isolated tower.

Yet as the crew made the long journey back to their homes, they would come across other sections of the wall--some complete, others being busily attended to by other work crews. They came to discover that the wall stretched, section by section, from the Yellow Sea deep into the western desert. They arrived home in awe of the enormity of what had already been accomplished, and more inspired than ever to complete the construction.

I have no idea if this story is true (never let the facts stand in the way of a good story); my source is a half-remembered parable written by Franz Kafka. In my limited understanding of Chinese history, the story is a half-truth: the Great Wall was assembled a bit more organically out of a series of smaller fortifications.

I was a little hesitant to tell this story as I think the metaphor can be easily misinterpreted. I'm not suggesting that large scale systems be built by teams with only local knowledge of the overall system. Nor am I suggesting that this approach is a good one if "isolated towers" aren't useful to you (or your customer). (This latter case is an example of where the reality makes a better metaphor than the story. In practice Shih huang-ti connected a number of existing, independently useful fortifications to form the Great Wall.) In fact, I think laying out a grand vision of the project and working on little bits here and there is a remarkably bad way to approach the development unless those little bits are useful to you in isolation. Otherwise you're likely to end up with bits that aren't useful to you at all.

But with a little bit of discipline about which "towers" to build first, I think this serves as a useful metaphor for an effective (and not uncommon) development practice. We're implementing a large and complex system via a series of user stories. Rather than building out a major subsystem in its entirety and then moving on to the next one, we're implementing the skeleton of the end-to-end system, to which we will add meat (that is, complexity) in subsequent iterations. We know any given story describes an incomplete subsystem at this stage, but each describes a cohesive unit of functionality, to which we can add complexity and functionality to fill out the walls between the towers.

 Tuesday, 22 July 2003 
Where are you all coming from? #

An unusually large number of my visitors today seem to be referred by various email clients. This and the number of direct hits (non-browser email clients?) are generating enough traffic to push this blog higher than usual in UserLand's rankings of Radio sites.

On other occasions the source of this sort of jump in traffic is obvious. This time I can only assume a link went out on some mailing list, seemingly to an older post, but my ears are burning with curiosity. Will someone post a comment indicating what list, if any, contained the link?

UPDATE: JavaBlogs sucks. This post isn't in either of my java or tech feeds, and hence shouldn't have been picked up by the aggregator, but there it is. Someone please tell me how to get JavaBlogs to pay attention to the proper feed. (Cf. a previous post.)

 Monday, 21 July 2003 
Curly braces, Pipes, Escape and other characters on the Zaurus #

If you've ever tried to type code, pseudo-code or shell scripts on the Sharp Zaurus, you may have noticed that the little slide-out keyboard is missing some useful keys. If you've done much typing with this thumb-keyboard, you may have noticed that when you fat-finger a couple of keys, you can get some of those extra characters. For my reference as much as yours, here's a short list:

keys                        character   unicode value   name
[Fn]+z                                  0x005A(?)       undo
[Fn]+[Shift]+c              €           0x20AC          euro symbol
[Fn]+[Shift]+[Backspace]    [           0x005B          left square bracket
[Fn]+[Shift]+,              ]           0x005D          right square bracket
[Fn]+[Shift]+.              {           0x007B          left curly brace
[Fn]+[Shift]+[Enter]        }           0x007D          right curly brace
[Fn]+[Shift]+'              ^           0x005E          caret/circumflex
[Fn]+[Shift]+[Space]        `           0x0060          tick/backquote
[Shift]+[Space]             |           0x007C          pipe
[Shift]+[Tab]               \           0x005C          backslash (note that this is listed incorrectly in the Sharp doc)

In the table above, "[Fn]" means the purple "function" key, "[Shift]" means the arrow-up "shift" key, "[Backspace]" means the white back-arrow/delete key, "[Enter]" means the purple "return" key, "[Space]" means the space bar, "[Tab]" means the purple tab key, and a "+" means hit these keys in combination, typically by holding down the "meta" keys first.

Also notice that the "Cancel" button works like "Escape", which makes VI usable without resorting to the on-screen (virtual) keyboard.

There's a full keycode mapping table available on Sharp's site.

 Friday, 18 July 2003 
LGPL and Java: More confused than ever #

After reading the various comment threads, and seeing Andy and David's exchange in a bit more context, I'm more confused than ever.

As quoted by Andy, Roy wrote:

What the FSF needs to say is that inclusion of the external interface names (methods, filenames, imports, etc.) defined by an LGPL jar file, so that a non-LGPL jar can make calls to the LGPL jar's implementation, does not cause the including work to be derived from the LGPL work even though java uses late-binding by name (requiring that names be copied into the derived executable), and thus does not (in and of itself) cause the package as a whole to be restricted to distribution as (L)GPL or as open source per section 6 of the LGPL.

and Andy asked:

Is this statement true with regards to the use of LGPL Java libraries by non-LGPL Java libraries?

To which David replied:

If I understand the statement correctly, yes -- that's exactly what section 6 is for.

Seen in context, my reading is that LGPL does not infect Java code that simply references, invokes methods of, or extends an LGPL'ed class. Brad Kuhn's comments mentioned in the slashdot "update" seem to be reiterating that position. I see this as saying all of my examples are not infectious. Joe's comments on that post focus on the word "distribute" in what may be a clarifying way (i.e., the in-memory, late-bound version of code that combines LGPL and non-LGPL code is not distributed, it is assembled at runtime, and therefore not in violation).

I hesitate to ask but the question seems inevitable: what does it mean to apply aspects to LGPL'ed code? If I hook in code to a cut-point in some LGPL'ed code, is my cut-point code infected? Does it matter if my AOP-system is implemented with byte-code manipulation, reflection or composition? Does it matter if this happens at runtime or at compile time? It seems that it would, and that feels like a fairly arbitrary distinction in some ways.

IANAL and this makes my head spin. I'm going back to writing code. Someone let me know when there's a clear answer here. (Unfortunately I think it may take a court to decide that. Let's hope it never comes to that.)

 Wednesday, 16 July 2003 
The [L]GPL, Java and Asymmetry #

Andy has finally tracked down some answers on applying the LGPL to Java. Dave and Sam had some additional comments.

A few points:

  1. In his comment thread Dave suggests that "Class.forName() might also be a workaround", echoing questions on the LGPL in Java I previously mentioned. The answers to those questions still aren't entirely clear to me, although the position of simply steering clear of LGPL/GPL code for Java seems increasingly rational.

  2. Sam writes "An example of something that does solve the issue: JDBC" but I'm not sure I follow. Does this mean Class.forName? In other words, I can use LGPL'd database code as long as I never directly invoke the LGPL code?

  3. Andy writes "Thus we need a license for Java that guarantees contributors will donate back to the library that does not infect the outside code."

    Do we? Why? I think there is sufficient incentive to release most derivative works anyway, and if someone doesn't, who cares? The open source project still gets more users, more support, more field testing, mo' better, than it otherwise would. If you make a few bucks selling proprietary extensions to my open source code, more power to you. (You run the threat, of course, of the community re-implementing your proprietary extensions in an open source fashion, thus destroying your business model. This disincentive to selling proprietary derivatives augments the incentives for releasing those derivatives to the community for further extension, testing, maintenance and support.)
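For what it's worth, the mechanical difference behind points 1 and 2 looks something like the following. This is only a sketch of the technique: java.util.ArrayList stands in for a hypothetically LGPL'ed implementation class, LateBinding is a made-up name, and none of it says anything about what the license actually requires. The calling code compiles against an interface only, while the implementation class is named in a string and located at runtime, much as a JDBC driver is:

```java
import java.util.List;

class LateBinding {
  /**
   * Instantiates a List implementation known only by name at runtime.
   * No implementation class name appears in this class's compiled imports.
   */
  @SuppressWarnings("unchecked")
  static List<String> newList(String className) throws ReflectiveOperationException {
    Class<?> impl = Class.forName(className);
    // Requires a public no-argument constructor on the named class.
    return (List<String>) impl.getDeclaredConstructor().newInstance();
  }
}
```

A direct `new ArrayList<String>()` would bind the caller to the implementation's name at compile time; here the name could just as easily come from a configuration file.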

A big issue I have with the [L]GPL is the asymmetry of the license. If Company X releases code under the [L]GPL, I'm less troubled by the need to similarly open my derivative works (although the people who sign my paycheck quite reasonably have a different perspective on that--there is some work that is best kept proprietary, if only because it pays the bills) than by the fact that Company X, typically acting as the "umbrella" copyright holder (to both their original work and my contributions) now has more rights to my contributions than I do. Under the viral GPL, I cannot use the larger work under any terms but those provided by the GPL, and in isolation my contributions are probably pretty much useless. Yet Company X (as the holder of the collective copyright) is free to take my (donated) work and theirs and release (or sell) it under whatever terms they choose. I wonder if Stallman had foreseen this consequence when constructing the GPL.

The "umbrella" copyright holder on a BSD- or ASF-type license has the same rights of course, but the license grants me the ability to do pretty much whatever I want with the larger work, short of claiming it all to be my own. There is very little that the copyright holder can do that I, as a user let alone contributor, cannot also do. It seems to me that the license with the least restrictive terms is the one that is most "free".

 Tuesday, 15 July 2003 
my pet bug, or another example of how sun doesn't get community development #

I've submitted a few insignificant bugs to Sun's Bug Parade over the years, most recently "Collections.ReverseOrder.equals method is lacking", which I submitted back in January of 2003 and was finally accepted yesterday (14 July).

This isn't a critical bug, of course: It's easy to work around (that is, replace) and frankly it's been an annoyance but not a real problem in my development efforts. Moreover, it seems like Sun has been pretty busy over the last few months working on other things. Nevertheless, this whole experience has been frustrating.

First, in my limited and anecdotal experience, the time to respond to bug reports is getting much worse. It took six months for the Java team to acknowledge this simple and, if I may say, reasonably well documented (including a "patch" and unit test) bug.

Second, this is a good if trivial example of how Sun's community efforts fail in execution. Here's a trivially simple (it's clear from inspection alone) and readily demonstrated (a complete unit test is provided) defect. A trivially simple (one line, two if you add hashCode()) patch is provided, and yet it took six months to get the issue acknowledged. Now I'll wait an indeterminate amount of time to see the problem fixed. All that and this problem could have been analyzed, patched and regression tested in less time than it took me to write this post. Even the slowest moving of open source projects would have had this problem patched, if not released, in a matter of weeks. If Sun can't find a way to move more quickly (and stop chasing the most superficial features of C#) Java seems destined to be eclipsed by community developed languages.
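To give a sense of scale, here is a rough sketch of a fix of this sort, assuming a stateless reverse-order comparator. The class below is a stand-alone illustration with a made-up name; it is not Sun's source and not the patch attached to the bug report:

```java
import java.io.Serializable;
import java.util.Comparator;

/** Reverses the natural ordering; holds no state, so all instances are interchangeable. */
final class ReverseOrder<T extends Comparable<? super T>>
    implements Comparator<T>, Serializable {
  private static final long serialVersionUID = 1L;

  @Override public int compare(T a, T b) {
    return b.compareTo(a);
  }

  // The "one line" fix: any two instances impose the same ordering.
  @Override public boolean equals(Object other) {
    return other instanceof ReverseOrder;
  }

  // The second line: keep hashCode consistent with equals.
  @Override public int hashCode() {
    return "ReverseOrder".hashCode();
  }
}
```

With equals defined this way, two separately constructed (or deserialized) instances compare as equal, which matters to any code that checks whether two sorted collections use the same ordering.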

I'll uncharacteristically paraphrase Jerry Seinfeld here: "Sun knows how to take community input, they just don't know what to do with it. And that's really the most important part of community: the doing." (text, audio)

PS: Also note that the second bug I linked to above, "Repackage sun.net.www.content.* Classes (java.net.ContentHandler)", was already addressed by the time I submitted it. I personally added a comment to this effect nearly two years ago, and yet the bug sits marked "In progress" (and with 3 votes, none of them mine).

 Monday, 14 July 2003 
Given enough eyeballs, are all trends shallow? #

As part of some internal strategic planning, I recently found myself constructing various "scenarios" (in the scenario planning sense) that try to address the next 2 to 5 years of development in the areas of mainstream language development, "enterprise" application platforms, etc.

It occurs to me that (a) very little of this analysis is proprietary, either in the sense of "containing trade secrets" or in the sense of "unique to the organization that created it"; (b) this analysis would be more insightful and likely more accurate when developed by a larger group; and (c) this sort of information would be a valuable community resource.

I wonder if it is possible to grow a body of "open", community developed scenario documents, initially focused on technical topics, to be used by individuals, corporations, and even open source projects for strategic planning. How would one organize this information? In what format would these documents be developed? (A Wiki?) What license would be appropriate? (creative commons?) Does such an initiative already exist?

 Friday, 11 July 2003 
Axion 1.0 Milestone 2 Released #

It seems Friday is a good day for releases:

Axion has released their third binary distribution, just one year after their first release, and only three months since their previous one.

This release adds several new features, including the DML and BTree performance tuning I previously described, as well as support for OUTER JOINs, LIKE clauses and JDBC 3.

 Wednesday, 9 July 2003 
Do you have an intranet portal? #

I'll show you mine if you show me yours.

screenshot

For the curious, here's where those links go:

Links
Iteration Documents
    index to our current and historical iteration status documents with some summary statistics.
Publications
    various internal whitepapers, HOWTOs, and articles. Most of the interesting content has moved to the wiki we introduced a couple of years ago.
Wiki
    the aforementioned wiki.
JavaDocs
    JavaDoc documentation for our codebase, updated as part of the continuous integration process. The "(more)" link points to a wiki page that links to various external JavaDocs.
CVS Tree
    our ViewCVS instance.
Build Results
    our CruiseControl build servlet.
Test Coverage Reports
    a JCoverage report across our (Java) codebase, updated as part of the continuous integration process.
Latka Test Suite
    a Latka webapp instance, which allows one to execute our Latka functional test suite against various development, integration, QA and production environments.
TrackNet
    a defect tracking database.
Reports
    web traffic analysis and similar reports.
Headlines
    a simple RSS browsing web application.
Recent Wiki Changes
    lists the last few changed pages on our internal wiki.
SysArch Intranet Traffic
    web traffic analysis for the portal site itself (just below the fold in the screenshot).
Current Build Results
    a one line summary of our continuous integration build status.
Iteration N Status
    current status of the current and previous development iterations, in summary form.

This "portal" is largely cobbled together out of server side includes. It'd be nice to upgrade to a true portlet implementation, or at least use RSS (or whatever the "better" RSS is these days). In fact, I've been looking into introducing a number of blogging inspired ideas.

 Tuesday, 8 July 2003 
50% of crashes caused by 1% of bugs #

An interesting statistic: In Disruptive Programming Language Technologies, Todd Proebsting of the Programming Language Systems group at Microsoft Research notes:

Microsoft “Watson” experience: 50% of crashes caused by 1% of bugs.
[Slide 17]

There's some other interesting stuff in that presentation as well.

[Via Frank Mitchell's post in the comment thread at After Java and C# - what is next?.]

 Monday, 7 July 2003 
Wanted: Modular/Extensible Parser Generator #

Over at the Axion project, we're using a JavaCC grammar to implement the SQL parser.

One of the things we've found Axion to be good for is unit testing database applications. In other words, one can use an in-memory, in-process Axion database as a "mock" replacement for a regular production database. In this case, it's quite useful to have Axion closely mimic the syntax of other RDBM systems. For example, Axion supports LIMIT and OFFSET clauses like PostgreSQL and MySQL as well as a ROWNUM pseudo-column like Oracle. Similarly, Axion supports the ISO SQL 99 syntax for outer joins (FROM a LEFT OUTER JOIN b ON a.id = b.id), but it would be nice to support Oracle's custom syntax (FROM a, b WHERE a.id = b.id(+)) as well.

Supporting the idiosyncrasies of several of the popular database engines in a single grammar file seems cumbersome at best, and probably impossible. It tends to bloat our keyword namespace. Eventually, it must lead to conflicts. (For example, if I'm trying to unit test code that eventually interacts with an Oracle database, then the "mock" database shouldn't accept LIMIT and OFFSET clauses, and shouldn't consider those to be keywords either.)

Axion's design is modular enough to allow for pluggable parser implementations. Indeed, anything that implements:

interface Parser {
  AxionCommand parse(String sql) throws AxionException;
}

can be dropped right in. Hence it is straightforward to define, for example, MySqlSyntax.jj, OracleSyntax.jj, SqlServerSyntax.jj, etc. to support a specific SQL dialect. The trouble is, each of those files is going to be 90% or more the same. What I'd like is a clean mechanism for either:

  • declaring SpecializedGrammar to extend from GeneralizedGrammar, perhaps with abstract productions and the like, or
  • declaring SpecializedGrammar to be the composition of SubGrammarA and SubGrammarB

or both. I'm especially interested in being able to combine grammars at runtime. Anyone have any suggestions? Can you point to an example?

(I'll be honest, I'm not much of a *CC expert. This might be straightforward in JJTree or ANTLR, I just haven't dug into it much. I'm pretty sure it's not straightforward in plain old JavaCC.)
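In the meantime, the pluggable Parser interface does allow a crude form of runtime composition at the level of whole parsers rather than grammar productions. The sketch below is an assumption-laden illustration: AxionCommand and AxionException are empty stand-ins so that it compiles on its own, and FirstMatchParser is a made-up name. It tries each delegate in turn and returns the first successful parse, which does nothing for the keyword-conflict problem but shows where a composed parser would plug in:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-ins for Axion's real types so the sketch compiles on its own.
class AxionException extends Exception {
  AxionException(String message) { super(message); }
}

class AxionCommand {
  final String description;
  AxionCommand(String description) { this.description = description; }
}

interface Parser {
  AxionCommand parse(String sql) throws AxionException;
}

/** Tries each delegate in order and returns the first successful parse. */
class FirstMatchParser implements Parser {
  private final List<Parser> delegates;

  FirstMatchParser(Parser... delegates) {
    this.delegates = new ArrayList<>(Arrays.asList(delegates));
  }

  @Override public AxionCommand parse(String sql) throws AxionException {
    AxionException last = new AxionException("no parsers configured");
    for (Parser delegate : delegates) {
      try {
        return delegate.parse(sql);
      } catch (AxionException e) {
        last = e; // remember the failure and try the next dialect
      }
    }
    throw last; // every delegate rejected the statement
  }
}
```

Ordering matters here: a dialect-specific parser would go first and the general one last, so the general parser only sees statements the specialized one rejected.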

 Tuesday, 1 July 2003 
Re: Test Driven Development versus Component Reuse #

Over on the Software Craftsmen blog, Mike Hogan asks what is meant by "the simplest thing that could possibly work". For what it's worth, Beck actually addresses this point directly in Extreme Programming Explained [ISBN:0201616416]:

Here is what I mean by simplest--four constraints, in priority order:

  1. The system (code and tests together) must communicate everything you want to communicate.
  2. The system must contain no duplicate code. (1 and 2 together constitute the Once and Only Once Rule).
  3. The system should have the fewest possible classes.
  4. The system should have the fewest possible methods.
[...]

If you view the design as a communication medium, then you will have objects or methods for every important concept. You will choose the names of the classes and methods to work together.

I have my own issues with that definition that I hope to pick up in a later post. (For starters, define "everything you want to communicate".)

Mike goes on to question whether there are times when TSTTCPW conflicts with design for reuse.

I think this misses the test first aspect. Consider approaching the Grep example test first. A small set of tests that might lead to the first Grep implementation is:

static final String TEXT = "This is\na simple test";
Grep grep = null;
BufferedReader reader = null;

// JUnit 3 conventions: setUp is protected, test methods are public
protected void setUp() {
  grep = new Grep();
  reader = new BufferedReader(new StringReader(TEXT));
}

public void testDoesContain() {
  assertTrue(grep.contains(reader,"is"));
}

public void testDoesNotContain() {
  assertFalse(grep.contains(reader,"is not"));
}

public void testPatternsSplitAcrossMultipleLinesAreNotFound() {
  assertFalse(grep.contains(reader,"is*a"));
}

Mike asserts that the Grep implementation would be more useful if it could interoperate with multiple regular expression frameworks, and provides an example "Inversion of Control" approach for doing so. Want to make that Grep implementation work with multiple regular expression frameworks? Great. First, write a test that fails:

public void testGrepWithJakartaRegexp() {
   RegexpProvider rep = new RegexpRegexpProvider();
   ReusableGrep grep = new ReusableGrep(rep);
   assertTrue(grep.contains(reader,"is"));
   assertFalse(grep.contains(reader,"is not"));
}

public void testGrepWithJakartaOro() {
   RegexpProvider rep = new OroRegexpProvider();
   ReusableGrep grep = new ReusableGrep(rep);
   assertTrue(grep.contains(reader,"is"));
   assertFalse(grep.contains(reader,"is not"));
}

(Alternatively, (1) define a "mock" instance of RegexpProvider for the purpose of this test rather than using specific implementations and (2) define an abstract getRegexpProvider method in your test class, and implement these tests as concrete extensions of that abstract test case, but I digress.)

Now we can justify the creation of the RegexpProvider interface, and ReusableGrep still meets the "simplest" criteria. (Ignoring that ORO and Regexp likely support slightly different syntaxes.)
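A sketch of what those tests might drive out follows. RegexpProvider and ReusableGrep are the names used in the tests above, but the find method, its signature, and the JdkRegexpProvider class (backed by java.util.regex rather than Jakarta Regexp or ORO, to keep the example free of external dependencies) are assumptions made for illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.regex.Pattern;

/** Hides the choice of regular expression library from Grep. */
interface RegexpProvider {
  /** True if the pattern is found anywhere in the line. (Method name assumed.) */
  boolean find(String pattern, String line);
}

/** One possible provider, using the JDK's own regex package. */
class JdkRegexpProvider implements RegexpProvider {
  @Override public boolean find(String pattern, String line) {
    return Pattern.compile(pattern).matcher(line).find();
  }
}

class ReusableGrep {
  private final RegexpProvider provider;

  ReusableGrep(RegexpProvider provider) {
    this.provider = provider;
  }

  /** Reads line by line, so a pattern split across lines is never found. */
  boolean contains(BufferedReader reader, String pattern) throws IOException {
    for (String line = reader.readLine(); line != null; line = reader.readLine()) {
      if (provider.find(pattern, line)) {
        return true;
      }
    }
    return false;
  }
}
```

Note that contains stops reading at the first matching line, so a reader passed to it twice picks up where the previous call left off.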

I think Mike's first instinct--simple is "the smallest amount of code" that will "get the test case[s] running and refuses to concern itself with any potential future requirement"--is the right one. Have additional requirements you'd like to assert? Then express them as tests and this simple rule allows you to support them. When approaching development test first I think a lot of these questions about "what is simple" simply fade away, as does much premature generalization. (And I say that without taking a position on whether ReusableGrep represents premature generalization or not. I recognize that it's meant as a trivial example.)

PS: I can't resist the temptation to plug a commons-functor approach to this Grep implementation. How about something like:

Lines.from(reader).contains(RegexpMatch.of("my expression"))

Re: "Test First Design With UML" and who's receptive to test first development #

I ran across David Hussman's "Test First Design With UML" in a list of bookmarks, though I'm not sure where I picked it up originally (likely another blog, I suppose).

This brief (two page) but interesting paper had two points I found especially worthy of note:

First, Hussman suggests that every story card be accompanied by a sequence ("lollipop") diagram, right on the back of the card (and further suggests a protocol for implementing that card using that diagram). As a frequent but informal user of sequence diagrams, I find this idea appealing.

Second, Hussman makes this incidental comment:

"I think that most developers that have gravitated toward test first design, have done so because it matched (or formalized) their development habits."

I find that a rather insightful statement, and I wonder what it means to those of us who'd like to convince developers to whom test first doesn't come naturally to follow this approach. (Cf. this post, its comment thread, and others.)