31 October 2002
A brief aside from all the techno-babble.
You know how when you're on the other side of the world and you say where you're from people always say things like, "Oh, my second cousin Jake lives there, you must know them..."
Well in the spirit of the aforementioned, and in the knowledge that there are quite a few java bloggers in Sydney: My sister (Anna Hobbs) lives there and some of her friends are developers. Anyone reading this know her?
Hi Anna!
10:32:44 PM
|
|
JMock is your friend... [Joe's Jelly]
JMock does actually look quite similar to what I've done, but it is focussed more on the normal use of mock objects - i.e. checking they were called with the expected values.
I'm not really doing Mocks at this point - it's more like convincing an application that it's running inside a real servlet container. Which means that I have to be able to understand that when I get a call like request.getHeader("user-agent"), I need to return "Fakezilla", but when I get request.getHeader("if-modified-since"), I have to return a valid date string, or null. I'm not trying to assert how the mocks are called, just fake up a convincing looking environment. What I've ended up with is essentially a form of dynamic dispatch: each mock has a collection of MethodReceivers in a Map keyed by method name, and each MethodReceiver has a Map of MethodReturn objects, keyed by an ArrayList of the arguments. It's then a case of calling getReturnValue on the MethodReturn object (confused yet?). I can therefore return different responses based on the actual values (not just type) of the arguments.
I can also simply make the Mock return a hard-coded response directly where I can get away with it (methods with no arguments for example).
There's some syntactic ugliness in the setup code due to Java's lack of closures - the closest approximation I've found is to implement an interface 'in-place' with an anonymous class. The alternative is hundreds of really tiny and very similar looking concrete classes.
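A minimal sketch of the two-level lookup described above, not the original code. MethodReceiver, MethodReturn and getReturnValue are the names from the post; HeaderSource, DispatchingHandler, receiverFor, on and fakeRequest are made up for illustration, and HeaderSource stands in for the real servlet request interface so the example runs without a servlet jar. The first registration uses an anonymous class as the post describes; the second uses a lambda, which needs Java 8 or later.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ArgumentDispatchDemo {

    /** Hypothetical stand-in for the one request method faked here. */
    public interface HeaderSource {
        String getHeader(String name);
    }

    /** Produces one canned return value. */
    public interface MethodReturn {
        Object getReturnValue();
    }

    /** All canned returns for one method, keyed by the actual argument values. */
    static final class MethodReceiver {
        private final Map<List<Object>, MethodReturn> returns = new HashMap<>();

        void on(MethodReturn methodReturn, Object... args) {
            returns.put(new ArrayList<Object>(Arrays.asList(args)), methodReturn);
        }

        Object receive(Object[] args) {
            // A proxy passes null rather than an empty array for no-arg methods.
            Object[] actual = (args == null) ? new Object[0] : args;
            MethodReturn methodReturn = returns.get(new ArrayList<Object>(Arrays.asList(actual)));
            return (methodReturn == null) ? null : methodReturn.getReturnValue();
        }
    }

    /** First level of dispatch: method name to MethodReceiver. */
    static final class DispatchingHandler implements InvocationHandler {
        private final Map<String, MethodReceiver> receivers = new HashMap<>();

        MethodReceiver receiverFor(String methodName) {
            return receivers.computeIfAbsent(methodName, name -> new MethodReceiver());
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            // Returning null is only safe because HeaderSource has no primitive returns.
            MethodReceiver receiver = receivers.get(method.getName());
            return (receiver == null) ? null : receiver.receive(args);
        }
    }

    static HeaderSource fakeRequest() {
        DispatchingHandler handler = new DispatchingHandler();
        MethodReceiver getHeader = handler.receiverFor("getHeader");
        // The 'in-place' anonymous implementation from the post.
        getHeader.on(new MethodReturn() {
            @Override
            public Object getReturnValue() {
                return "Fakezilla";
            }
        }, "user-agent");
        // The same thing as a lambda.
        getHeader.on(() -> null, "if-modified-since");
        return (HeaderSource) Proxy.newProxyInstance(
                HeaderSource.class.getClassLoader(),
                new Class<?>[] {HeaderSource.class},
                handler);
    }

    public static void main(String[] args) {
        HeaderSource request = fakeRequest();
        System.out.println(request.getHeader("user-agent"));
        System.out.println(request.getHeader("if-modified-since"));
    }
}
```

Keying on a List works because List equality and hashing are value based, so a fresh list built from the incoming arguments finds the entry registered at setup time.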
The whole thing isn't exactly generic or lightweight, as you have to know quite a lot about what your application is going to need in the way of parameters, and which methods it's going to call on the request/response objects, but it has been quite illuminating finding out just how the application in question interacts with its environment.
And it's nice to work on something a little off the wall every now and then :)
9:25:46 PM
|
|
|
|
30 October 2002
The Lucene guys have added an updated file format information page. Interesting for wannabe searchengine technology mavens (like me).
There's also a new article on the site about the LARM webcrawler project that's currently underway here.
10:14:01 PM
|
|
Spent the afternoon making a servlet based application run all by itself by mocking the HttpServletRequest/Response, ServletContext etc etc. interfaces. It was also a great opportunity to use dynamic proxies in anger for the first time. I ended up with a rather nice abstract implementation of InvocationHandler that contained a Map of method names and MethodReceiver objects for the calls I cared about. Using dynamic proxies saved loads of time - I could code up canned responses to the method calls I was interested in and ignore the rest. It was kind of fun being able to instantiate a servlet inside a TestCase, call doGet() and get all the HTML dumped to System.out.
I didn't quite reimplement EasyMock from scratch, but there's probably some synergy to be had there...
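A rough sketch of the dynamic proxy idea, assuming Java 8 or later for the lambdas. It is not the code from that afternoon: FakeableRequest is a hypothetical stand-in for the servlet interfaces (so it runs without a servlet jar), the handler is concrete rather than abstract, and CannedHandler, when and the receive method are invented names. Only the Map of method names to MethodReceiver objects comes from the description above.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

final class CannedProxyDemo {

    /** Hypothetical stand-in for the few request methods the app calls. */
    public interface FakeableRequest {
        String getMethod();
        String getRequestURI();
        int getContentLength();
    }

    /** Supplies the canned return value for one method (assumed shape). */
    public interface MethodReceiver {
        Object receive(Object[] args);
    }

    /** Answers registered method names from the Map and ignores the rest. */
    static final class CannedHandler implements InvocationHandler {
        private final Map<String, MethodReceiver> receivers = new HashMap<>();

        CannedHandler when(String methodName, MethodReceiver receiver) {
            receivers.put(methodName, receiver);
            return this;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            MethodReceiver receiver = receivers.get(method.getName());
            if (receiver != null) {
                return receiver.receive(args);
            }
            // Unregistered call: hand back a harmless default. Only int, long and
            // boolean primitives are covered; other primitive types would need adding.
            Class<?> type = method.getReturnType();
            if (type == int.class) return 0;
            if (type == long.class) return 0L;
            if (type == boolean.class) return false;
            return null;
        }
    }

    static FakeableRequest fakeRequest() {
        CannedHandler handler = new CannedHandler()
                .when("getMethod", args -> "GET")
                .when("getRequestURI", args -> "/index.html");
        return (FakeableRequest) Proxy.newProxyInstance(
                FakeableRequest.class.getClassLoader(),
                new Class<?>[] {FakeableRequest.class},
                handler);
    }

    public static void main(String[] args) {
        FakeableRequest request = fakeRequest();
        System.out.println(request.getMethod() + " " + request.getRequestURI());
        System.out.println("content length: " + request.getContentLength());
    }
}
```

The time saving comes from the fall-through branch: the interface can declare dozens of methods, and only the ones the application actually calls need a receiver.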
9:51:30 PM
|
|
I've posted a straight copy of my ANT build file for Webwork up here.
The project is just something I was toying with as an excuse for learning Webwork & Castor - its vapour at the moment - don't expect a Roller-killer anytime soon :)
I'll add some more details (source code, jsp's, required jar files etc.) as and when it occurs to me.
9:29:29 PM
|
|
Untangling the Gordian Knot. Currently working on a project to untangle a hairball project. It's quite tempting to cut the Gordian Knot, but we... [development]
Interfaces are my friend in this situation. I have found it really useful to extract a single method from the bloated object I'm trying to break up into an interface, and to substitute a reference to that interface in the calling code wherever possible. I repeat this for each method, initially defining one interface per method, and occasionally consolidating the interfaces where I find that several of them are passed around together. It does mean that for really blob-like objects, at about the halfway stage I find myself with humorous constructors like:-
MyThing thing = new MyThing(this, this, this, this, this);
Where 'this' is some bloated object that now implements 5 distinct interfaces, and the target class has been refactored to work through the interfaces. It's so obviously wrong that sheer embarrassment forces you to keep refactoring until it's all clean. Writing tests gets really easy too. It's easy to make the TestCase implement an interface and pass itself to the object under test (self-shunt pattern) when the interface only declares one method!
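To make that concrete, here is a small sketch under invented names: PriceLookup and AuditLog play the part of two one-method interfaces extracted from the blob, OrderTotaller is the refactored target class, and OrderTotallerTest is a plain class standing in for a JUnit TestCase. None of these names come from the real project.

```java
import java.util.ArrayList;
import java.util.List;

final class SelfShuntDemo {

    /** One-method interface extracted from the bloated object (hypothetical). */
    public interface PriceLookup {
        int priceInPence(String productCode);
    }

    /** A second extracted one-method interface (hypothetical). */
    public interface AuditLog {
        void record(String message);
    }

    /** The refactored class only knows about the narrow interfaces. */
    static final class OrderTotaller {
        private final PriceLookup prices;
        private final AuditLog audit;

        OrderTotaller(PriceLookup prices, AuditLog audit) {
            this.prices = prices;
            this.audit = audit;
        }

        int total(String... productCodes) {
            int sum = 0;
            for (String code : productCodes) {
                sum += prices.priceInPence(code);
            }
            audit.record("totalled " + productCodes.length + " items");
            return sum;
        }
    }

    /** Self-shunt: the test implements the interfaces and passes itself in. */
    static final class OrderTotallerTest implements PriceLookup, AuditLog {
        final List<String> recorded = new ArrayList<>();

        @Override
        public int priceInPence(String productCode) {
            return 250; // every product costs the same in this test
        }

        @Override
        public void record(String message) {
            recorded.add(message);
        }

        int runTotal() {
            // Same shape as the 'this, this, this' constructor: one object fills every slot.
            OrderTotaller totaller = new OrderTotaller(this, this);
            return totaller.total("tea", "milk");
        }
    }

    public static void main(String[] args) {
        OrderTotallerTest test = new OrderTotallerTest();
        System.out.println(test.runTotal());
        System.out.println(test.recorded);
    }
}
```

In the test the repeated 'this' is harmless, because the test object is meant to be throwaway. In production code it is the signal to keep splitting.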
8:15:13 PM
|
|
|
|
28 October 2002
AOP Explained
Another useful AOP article here.
1:03:49 PM
|
|
Aspects == Indirection ^2
Jumping back on the aspects bandwagon (trying to explain something often helps
me to understand it):
Indirection, aka abstraction: You call a method on an interface, and the
recipient is determined at runtime, and may be at the end of several 'middlemen'
who simply pass the message along. What AOP does for you is allow you to
dynamically compose a whole method call chain using reusable interceptors. So
calling setFoo() on something that appears to be a simple bean can actually take
a quick detour to your validation interceptor to check the syntax, then visit
the persistence interceptor, telling it that the field is dirty, before
returning. This is nothing that can't already be done without AOP, but AOP (i.e.
interceptors & extensions) appears to be much more flexible and reusable. Being
able to define your interceptor stack in a descriptor or programmatically means
you can substantially alter the behaviour of bits of your application while it's
running, and the calling code still sees a simple 'setFoo' bean-like interface.
And the possibilities for code reuse look huge.
Of course, I could be entirely wrong - I'm still trying to wrap my brain around
the concept.
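Here is how I picture the setFoo() detour, as a sketch built on a plain JDK dynamic proxy rather than on any real AOP framework. Every name in it (FooBean, Interceptor, Invocation, ValidationInterceptor, PersistenceInterceptor, wrap) is invented for illustration, and the 'persistence' interceptor just records dirty field names in a list.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class InterceptorChainDemo {

    /** The simple bean-like interface the calling code sees. */
    public interface FooBean {
        void setFoo(String foo);
        String getFoo();
    }

    static final class SimpleFooBean implements FooBean {
        private String foo;
        @Override public void setFoo(String foo) { this.foo = foo; }
        @Override public String getFoo() { return foo; }
    }

    /** One reusable link in the call chain (hypothetical API). */
    public interface Interceptor {
        Object intercept(Invocation invocation) throws Throwable;
    }

    /** A call in flight: visits each interceptor in turn, then the target. */
    static final class Invocation {
        final Method method;
        final Object[] args;
        private final Object target;
        private final Iterator<Interceptor> remaining;

        Invocation(Object target, Method method, Object[] args, Iterator<Interceptor> remaining) {
            this.target = target;
            this.method = method;
            this.args = args;
            this.remaining = remaining;
        }

        Object proceed() throws Throwable {
            if (remaining.hasNext()) {
                return remaining.next().intercept(this);
            }
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the target's own exception
            }
        }
    }

    /** Rejects null arguments to single-argument setters before the call proceeds. */
    static final class ValidationInterceptor implements Interceptor {
        @Override
        public Object intercept(Invocation invocation) throws Throwable {
            String name = invocation.method.getName();
            Object[] args = invocation.args;
            if (name.startsWith("set") && args != null && args.length == 1 && args[0] == null) {
                throw new IllegalArgumentException(name + ": null not allowed");
            }
            return invocation.proceed();
        }
    }

    /** Notes which fields were written once the setter has run. */
    static final class PersistenceInterceptor implements Interceptor {
        private final List<String> dirtyFields = new ArrayList<>();

        @Override
        public Object intercept(Invocation invocation) throws Throwable {
            Object result = invocation.proceed();
            String name = invocation.method.getName();
            if (name.startsWith("set") && name.length() > 3) {
                dirtyFields.add(Character.toLowerCase(name.charAt(3)) + name.substring(4));
            }
            return result;
        }

        List<String> dirtyFields() {
            return dirtyFields;
        }
    }

    /** Composes the interceptor stack around the target behind the same interface. */
    static FooBean wrap(FooBean target, Interceptor... stack) {
        InvocationHandler handler = (proxy, method, args) ->
                new Invocation(target, method, args, Arrays.asList(stack).iterator()).proceed();
        return (FooBean) Proxy.newProxyInstance(
                FooBean.class.getClassLoader(), new Class<?>[] {FooBean.class}, handler);
    }

    public static void main(String[] args) {
        PersistenceInterceptor persistence = new PersistenceInterceptor();
        FooBean bean = wrap(new SimpleFooBean(), new ValidationInterceptor(), persistence);
        bean.setFoo("bar");
        System.out.println(bean.getFoo() + " " + persistence.dirtyFields());
    }
}
```

The stack is just an argument to wrap(), so swapping or reordering interceptors changes behaviour without touching either the bean or its callers.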
10:03:19 AM
|
|
|
|
27 October 2002
Webwork+XDoclet 1.2 beta+ANT+Resin == Rapid Prototyping City
This is somewhat sketchy on the details, but I wanted to get it out there while it was fresh in my mind.
How to iteratively prototype a webapp:-
Get Webwork and import it into your project.
Slap together an ANT script that compiles your stuff and WARs it up, puts the web.xml file from webwork in WEB-INF and includes webwork.jar in your WEB-INF/lib folder. Add an extra line to copy your WAR to your servlet container's webapps directory.
Build at least one class that extends ActionSupport. Override execute() to return Action.SUCCESS.
Put a class-level javadoc comment that looks something like this in it:
/**
* @webwork.action
* name="myCoolAction"
* input="myInput.jsp"
* success="myCoolActionSuccess.jsp"
*/
Write a simple jsp called myInput.jsp with an input form along the lines of:-
<FORM METHOD="POST" ACTION="myCoolAction.action">
Write a quick success.jsp page with a 'well done' or similar message.
Get XDoclet 1.2 beta. Stick the jars in Ant's classpath. Add a target that looks like this:-
<target name="xdoclet" depends="init">
<taskdef
name="webdoclet"
classname="xdoclet.modules.web.WebDocletTask"
classpathref="class.path"
/>
<webdoclet
destdir="${build}/src"
mergedir="parent-fake-to-debug"
excludedtags="@version,@author,@todo"
addedtags="@xdoclet-generated at ${TODAY}, XDoclet,@version ${version}"
verbose="false"
>
<fileset dir="${src.java}">
<include name="**/*Servlet.java"/>
<include name="**/*Filter.java"/>
<include name="**/*Tag.java"/>
<include name="**/*Action.java"/>
</fileset>
<webWorkConfigProperties destDir="${webinf.classes}"/>
</webdoclet>
</target>
Include the target into your default build, just before the compilation step.
Run ANT. Wait for resin to reload your webapp.
Open /yourWebapp/myInput.jsp. Submit it. See the success.jsp page. Modify your Action class's execute() method to actually do something, like check for the right name/value pairs. Make it return Action.INPUT to go round again, Action.SUCCESS to succeed and go to success.jsp, Action.ERROR to report an error etc.
Repeat.
More details when I have time.
4:43:15 PM
|
|
|
|
26 October 2002
it's actually really simple. Web-the_simplest_thing_that_can_possibly-Work. I get WebWork. Finally. For some reason I've had a mental block on it up until now. Every time I sat down to explore it something came up, or I got bored. Finally cracked it this afternoon. Turns out I was expecting something complex, and it's actually really simple. This appeared to cause me as much (or more) mental discontinuity as when I'm expecting something to be trivial, and it turns out to be much more complex. WW really is very simple, but it lacks an 'idiots guide' which would have been very helpful to me. I'll see if I can retrace my steps and post my experiences up sometime. [Pushing the envelope]
Great! A WW idiots guide would be fantastic. Did you read Joseph Ottinger's tutorial? Maybe that would be a good starting point to collaborate on? [rebelutionary]
Doh! No I didn't know about Joseph's tutorial, I must have skimmed over the post where you mentioned it. That would have made life much easier :)
I find this kind of blow-by-blow guide to be an excellent 'kickstart' when using a new tool or framework. It helps me get my mind around the paradigm of the tool being used. I wish every new technology had one of these. For anyone else like me who missed Mike's link, the guide can be found here:-
http://enigmastation.com/~joeo/webwork.html
6:43:48 PM
|
|
|
|
24 October 2002
I hate starting projects. Writing ANT scripts, setting up the directory structure, configuring the IDE settings, putting all the library jars in the right place etc. I find it all immensely tedious.
I love starting projects. There's no code written, you are full of ideas, with an infinite solution space to choose from. The journey lies ahead.
So to recap, I love hate love, have mixed feelings about starting projects.
8:59:23 PM
|
|
I get WebWork. Finally. For some reason I've had a mental block on it up until now. Every time I sat down to explore it something came up, or I got bored. Finally cracked it this afternoon. Turns out I was expecting something complex, and it's actually really simple. This appeared to cause me as much (or more) mental discontinuity as when I'm expecting something to be trivial, and it turns out to be much more complex.
WW really is very simple, but it lacks an 'idiots guide' which would have been very helpful to me. I'll see if I can retrace my steps and post my experiences up sometime.
8:53:58 PM
|
|
|
|
23 October 2002
Found this rather interesting example of slogan reuse while browsing Netscape's homepage on the wayback machine. Right at the bottom of the page, a reference to 'Netscape ONE, the Open Network Environment'. Fast forward 6 years and a joint venture or so later, and we have Sun ONE: Open Net Environment. Wow, they have been planning this for a long time, apparently.
8:27:42 PM
|
|
Conditional Get Update. Various news aggregators are leaping on the conditional get bandwagon. (108 Words) [The Fishbowl]
It's somewhat ironic that all this effort is suddenly being devoted to making aggregators only perform full GETs when they are needed, when as anyone who's ever developed a public facing stateful web-app knows, it's usually a knock-down drag-out fight to prevent misconfigured proxy servers from caching all your pages (and cookies, some of them) for all eternity and breaking your app in a variety of interesting and humorous ways.
8:14:23 PM
|
|
|
|
22 October 2002
OSAF Post Feedbacks.
I disagree that there is no sense of value for software in this country. I do agree that I seem to be buying less software than before, but I ask what factors might have caused this change? Microsoft contributed, but there are other factors involved. [Don Park's Blog]
I do pay for software. I would pay for more software, if there was quality software out there worth paying for. I think I've said it before, but free / open-source software entering a market won't ipso-facto kill commercial software, but it certainly will raise the quality bar. If the only means for a commercial product to survive in a marketplace is to monopolise it, then it probably doesn't deserve to survive. I bought The Bat, not because there are no free (or indeed pre-installed) email clients, but because it was sufficiently better to warrant its price. If a product takes a team of 50 a year to develop and costs tens of thousands per cpu, should it really have anything to fear from a competing OSS alternative developed in a few months by 3 or 4 people who've never met working in their evenings and weekends? And if it does, does that say anything about its true 'value'?
8:49:09 PM
|
|
Mark Pilgrim and Sam Ruby have released a nifty RSS
Validator. I had a very quick go at running it through Jython, but it fell over as Jython doesn't have
the 'select' module. A java port would be nice, but it's unlikely I'll have the
time so rather than make promises I can't keep I thought I'd just flag it up and
see if any of the other java guys feels like taking up the challenge.
Oh yes, and I'm valid apparently. Although I do now feel like an extra in Gattaca.
2:06:33 PM
|
|
Someone else has a software blog entitled ' Pushing the
envelope'. Someone from Microsoft to be precise. I knew it wasn't a
sparklingly original title, and when I first started I was going to call it
'Random Thoughts' - until I saw Rickard's. I'd be annoyed, except the other PTE
was there first, so my bad (although there only appear to be a couple of posts).
I'm pondering the merits of switching to a fully web-based system such as
moveable type, roller, or miniblog, which will involve a URL change and all
sorts of other inconveniences, so maybe I'll change the title at the same time
just to add to the fun. Darren's Daily Diatribe, perhaps?
On the other hand I'm no.1 on google for 'pushing the envelope' so maybe I can
just pull rank (bad pun intended).
11:37:43 AM
|
|
Delegating SAX Parser Handler
Delegating SAX Parser Handler. At
work I'm working on refactoring / redesigning something that started as a
cool idea. Basically, you register sub handlers to a root handler with the
path you're interested in getting messages for (like "/document/header/title"
would get you the events for the document title).[Jason
Carreira]
It does sound quite similar to digester
which allows you to register interest in SAX events using absolute paths,
relative paths, wildcards and so forth and apply Rules when the SAX events
fire. Plus there's default rules for all kinds of things like creating beans,
setting bean properties, invoking methods. There's a default object stack
so it's very handy for parsing XML config files and turning them into your
domain objects.
Competition in open source can be healthy, though I do prefer reuse when
it makes sense since it promotes a bigger user community which often results
in better software. So I'd recommend evaluating digester first to see if the effort of starting your own project
and supporting it is worth the effort.
[James Strachan's Radio Weblog]
I discovered Digester a few weeks ago and have found it very useful. Couple it
with BeanUtils
and you've got a great way of automagically populating your beans from an XML
config. I have a Configuration object that contains a Map of the 'digested'
name-value pairs and uses BeanUtils.copyProperties to set the fields on any
Object it gets passed. All I have to do is obey the javabean naming conventions
and it just works. For an additional check you could include the bean classname
in the XML and have the Configuration object complain if it was passed an object
of the wrong type.
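A sketch of the shape of that Configuration object. To keep it runnable without the Digester and BeanUtils jars, plain reflection stands in for BeanUtils.copyProperties and the name-value pairs are added by hand instead of being digested from XML. The conversion covers only String, int and boolean properties, and MailSettings, put and applyTo are hypothetical names.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

final class ConfigurationDemo {

    /** Holds the name-value pairs and pushes them onto any bean it is given. */
    static final class Configuration {
        private final Map<String, String> values = new HashMap<>();
        private final String expectedClassName; // null means "accept any bean"

        Configuration(String expectedClassName) {
            this.expectedClassName = expectedClassName;
        }

        Configuration put(String name, String value) {
            values.put(name, value);
            return this;
        }

        void applyTo(Object bean) {
            String actual = bean.getClass().getName();
            if (expectedClassName != null && !expectedClassName.equals(actual)) {
                throw new IllegalArgumentException(
                        "expected " + expectedClassName + " but was given " + actual);
            }
            for (Map.Entry<String, String> entry : values.entrySet()) {
                String name = entry.getKey();
                // JavaBean naming convention: property 'host' maps to setHost(...).
                String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
                for (Method method : bean.getClass().getMethods()) {
                    if (method.getName().equals(setter) && method.getParameterCount() == 1) {
                        try {
                            method.invoke(bean, convert(entry.getValue(), method.getParameterTypes()[0]));
                        } catch (ReflectiveOperationException e) {
                            throw new IllegalStateException("could not set " + name, e);
                        }
                    }
                }
            }
        }

        /** Minimal string conversion; a real converter handles far more types. */
        private static Object convert(String raw, Class<?> type) {
            if (type == int.class || type == Integer.class) return Integer.valueOf(raw);
            if (type == boolean.class || type == Boolean.class) return Boolean.valueOf(raw);
            return raw;
        }
    }

    /** Hypothetical bean that simply obeys the naming conventions. */
    public static final class MailSettings {
        private String host;
        private int port;

        public String getHost() { return host; }
        public void setHost(String host) { this.host = host; }
        public int getPort() { return port; }
        public void setPort(int port) { this.port = port; }
    }

    public static void main(String[] args) {
        MailSettings settings = new MailSettings();
        new Configuration(MailSettings.class.getName())
                .put("host", "mail.example.org")
                .put("port", "25")
                .applyTo(settings);
        System.out.println(settings.getHost() + ":" + settings.getPort());
    }
}
```

Names with no matching setter are silently skipped, which is what lets one Configuration be handed to several different beans.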
11:09:13 AM
|
|
|
|
21 October 2002
In the script group, the Perl subjects may
be more capable than the others, because the Perl language appears more than
others to attract especially capable
people.
[Lutz
Prechelt, An Empirical Comparison of C, C++, Java, Perl, Python, Rexx, and Tcl]
Ready flamethrowers... Fire!
Could this perhaps be that it takes a better than average developer to wield
Perl in anger without hurting themselves or others? I have something of a
love-hate relationship with Perl. Its power and flexibility is undeniable, its
syntax questionable. I have this nagging fear that the more I learn, the
greater its seductive attraction will become, simply because it allows you to
get away with almost anything.
That's it. I'm not posting anymore on this (although I reserve the right to
change my mind). There are more important things in life than 'my language is
better than yours' catfights. Put your energy into writing code instead.
12:32:14 PM
|
|
With regard to Joe's recent
post about the deficiencies of XML, I have something of a counterpoint to
offer. XML was invented as an attempt to unify and simplify data interchange
between disparate systems. This had been attempted before, but the efforts
never gained sufficient momentum to achieve general acceptance.
XML is a subset of SGML, which has been around for a number of years. SGML is
also the language from which HTML is derived. SGML itself is very complex, as
it includes all sorts of mechanisms for defining domain-specific dialects (such
as HTML and XML). XML was released on the back of the general and massive
uptake of HTML, and was similar enough to HTML to be explained as 'HTML that
computers can understand'. Part of the reason for XML's success is the huge
surge in popularity of the internet and its promise of global connectivity, part
is due to its design. XML is simple and formal enough to be relatively easy to
design parsers for, while being flexible enough to describe most types of data.
Developers were also used to dealing with HTML style markup. This combination
of factors probably accounts for XML's huge popularity. The biggest hurdle for
any attempt to standardise on a data interchange format was always going to be
garnering enough general support to make it the 'de facto' standard.
There is always more than one way to do things, and XML may not be the prettiest
or the best, but the details of its design are probably less important than the
fact that it succeeded in its goal of achieving a standard means of describing
data that was easy to pass around between otherwise incompatible systems. Now
that we have come to expect easy data exchange, we are free to explore
improvements, but we wouldn't be in this happy position were it not for XML.
12:00:34 PM
|
|
|
Skimming over some of my old posts, I can tend to spot which ones were written
from home and which ones from work (it helps that I also remember writing
them!). Generally speaking, the more 'off-topic' and emotive posts tended to be
written from home. It seems that being at work causes me to put on my
'professional' hat, while at home I'm more likely to just bash out whatever's on
my mind at the time. Interesting.
11:35:14 AM
|
|
|
|
20 October 2002
As promised...
XDoclet 1.2 beta looks good. I've been trying out the castor and servlet tags today. It makes it a lot easier to evolve the design when you don't have to keep changing your mapping file. It was the work of moments to throw together a couple of beans and collection objects and dump them out to XML. Nice.
Still not sure what to do with regard to persistence. I don't need a relational database (or the hassle), just something quick'n'easy. I toyed with the idea of just storing the raw xml in Lucene, indexed on the various fields and attributes, but I need to look into the querying side a bit more to see how easy (or even possible) it would be to construct queries like 'select * from documents where date is between 10-OCT-2002 and 20-OCT-2002'. I have a feeling this may be difficult, and probably isn't the best use of a search engine anyway.
Things to check out further:
Any other suggestions out there?
10:08:39 PM
|
|
It turns out my radio problem is a known bug. Don't know why I got so annoyed, other than the fact that most of the times I break something out of curiosity the only person who suffers is me, and most of the time I can dig around the source until I know enough to fix it. Or simply roll back. It's a little hard to hide it if you break your blog.
Found a fix on google, so this will be my last rant on that subject. Back to more interesting matters.
9:27:00 PM
|
|
I've broken radio. All the so-called 'dynamic' links have frozen pointing to 'www.darrenhobbs.com' after I played around with the upstream via ftp option yesterday. I aborted that idea after finding out that turning on ftp switched off the normal upstreaming to my radio account. Now I've found out that if you ever use the ftp option you can't apparently ever go back. Thank you Userland.
This wouldn't have annoyed me if I could go in and fix the problem, but I can't find the reference that the macros are using, leading me to believe it's buried somewhere in one of the mysterious .root files, which are, naturally, binary.
Now nobody new will be able to subscribe to my RSS feed until I hard code the links back to what they should be.
And to think this morning I'd just about decided to stick with radio for the time being, to minimise the inconvenience to my rss subscribers. Thank you for making the decision for me, radio.
Looks like I will be moving blog software after all.
4:12:55 PM
|
|
|
Actually, the biggest incentive for me to leave radio is because I can't post to it or access my RSS feeds through the web.
2:37:36 PM
|
|
Another leaves radio land. It looks as if more people are leaving Radio behind. The Desktop Fishbowl has left its old Radio shanty to... [<big>kev's</big> catalogue of this and that.]
I'm not sure what the incentive is (even though I'm now feeling it myself) to leave Radio. Maybe it's developer masochism, that 'not invented here' feeling that because I didn't write it / it's not in my pet language, it must be bad for me. Maybe we feel vaguely guilty about using a 'consumer' product. We're geeks, we're supposed to use two or three arcane command line utilities and a dodgy perl script to achieve the same thing as a normal person using a pointy-clicky GUI app. Or something.
Maybe once the blogging addiction bites then you start wanting to add your own features, which is far easier to do when you have the source in front of you, in a programming language you're comfortable with.
9:40:22 AM
|
|
|
|
19 October 2002
|
While waiting for Roller to download from sourceforge I grabbed Miniblog, unzipped it and had a browse through several of the JSP pages. It's so tiny! Bijou. I know the goals of Roller and Miniblog are very different, but even so, I didn't think they'd be quite that different.
10:14:06 PM
|
|
|
Okay, first hurdle. It appears that Radio will let you upstream either to their servers OR via ftp to a server of your choice, but not both. That's no good. I'm not ready yet to join the ranks of the blogging rollers either. Time for a rethink...
9:18:09 PM
|
|
Woohoo, www.darrenhobbs.com is up and running. I can feel my ego expanding. Currently it's just a static copy of my radio blog, and there may be some to-ing and fro-ing, but now I've got Orion running almost anything could happen. First plan is to put all my Lucene studies to work and get some of that full-text search action going on.
As soon as I get my head out of this door.
8:59:07 PM
|
|
|
Been unsubscribing from mailing lists this week. I have a tendency to subscribe to the dev-lists of any OSS project that looks interesting, and get so swamped I end up reading nothing. There're just too many good projects going on to follow them all. Time for a clear out.
4:51:53 PM
|
|
NBML - Not Bunk Markup Language (Okay the name might change.)
How long have we been staring at screens full of XML and not complaining? Have a look at a typical chunk of XML - it's all noise. [Joe's Jelly]
I thought the whole point of XML was that it allowed machine-parseable data to remain human readable. Don't tell me they could have achieved that goal without all the angular decoration? Could they? That would be like a technically superior video format being beaten by a lesser one that was just better marketed. Oh wait, that happened...
10:34:16 AM
|
|
|
|
18 October 2002
Lots of bloggers are suffering from spam it seems. One of the ahem, benefits of having a visible web presence I suppose. I recommend SpamAssassin to anyone who:
- Uses a Linux/BSD/etc. system
- Can insert the aforementioned into their email path
There is also a windows outlook plugin version that costs money, plus several other methods of integrating it (perl scripts and the like - details on the site).
With regard to my second point, one option that occurs (and I haven't tried this yet as I've just thought of it) is to have a 'public/private' email pair. Publish the public one freely on the web, subscribe to mailing lists etc, and have all the mail retrieved using a FreeBSD / Linux machine running fetchmail -> procmail -> spamassassin. Save the other one for sending messages to individuals. Procmail can be made to forward messages after they've been filtered, so you can still have all your email delivered to the same place. It does require that you have a reasonable degree of access to a connected unix box though.
I did start to write about a sort of interface-implementation separation for email, but I realised it would only be one-way: while it would be fine to have a public email you told everyone to use to contact you that was then bounced to your 'real' address, there is no easy way to make this work in reverse. Any mail you sent would have your private address in it. There are ways to work around this, using anonymous remailers or services like anonymizer.com or you could run your own mail server. None of which are entirely transparent unfortunately (you can't just hit 'reply' from your favourite mail client).
10:42:49 PM
|
|
While idly clicking through my blog stats, it quickly became obvious that I have a rather specific type of readership. Can you guess what it is yet?
Browsers
- Netscape 6.x: 50.0%
- Internet Explorer 6.x: 28.6%
- Internet Explorer 5.x: 18.6%
- Netscape 7.x: 1.4%
- Opera 6.x: 1.4%
Operating systems
- Windows 2000: 33.3%
- Windows XP: 23.2%
- Linux: 17.4%
- Windows 98: 13.0%
- Windows NT: 11.6%
- Mac OS: 1.4%
57.1% of browsers report a resolution of more than 1024x768, and exactly 50% report True Colour.
It seems I'm generally viewed on high end systems running the 'IT professionals' choice of browser, Mozilla. And nearly a fifth of readers are using Linux. Cool.
Yes, it's official. I'm a tech-head, and so is my readership. Greetings.
One question. Netscape 7? Did someone release Mozilla 2 when I wasn't looking?
7:43:22 PM
|
|
Just checked out Erik's
weblog. I like the groovy icons. Very nice. Oh, and the content is good
too :)
12:03:34 PM
|
|
|
|
17 October 2002
Wikify.... I like wiki. But... it's too hard. I want to be able to wikify my entire website, so that if... [development]
I believe that snipsnap attempts to meld blogging with wiki. Might be worth a look.
11:56:06 PM
|
|
|
|
16 October 2002
11:35:00 PM
|
|
Party like it's 1993.
-Russ [Russell Beattie Notebook]
1993 was also when I first heard the word 'Linux', which, because we were British, was pronounced 'Line-ux'. One of my Uni acquaintances showed it to me. "Look, free Unix on a pc". "That won't go anywhere", I thought... This rates alongside my other great predictions such as, 'Netscape are having an IPO, should I invest? Naa, they probably won't be worth much'. And Yahoo, and Redhat... I console myself that I was a poor student and didn't have much to invest at the time anyway. Wouldn't have had enough money to get rich, just enough to have had one whale of a time at Uni.
11:30:53 PM
|
|
I couldn't unplug.... But I'm going to be brief tonight. It's 11:19 p.m. I want to be off the computer by 11:30 (maybe 11:45's more realistic).
-Russ [Russell Beattie Notebook]
Russ, if you're reading this and it's NOT just after you got up on Thursday morning, switch off. Now. :)
11:19:44 PM
|
|
Fed my book-buying habit today with the acquisition of the O'Reilly BEEP book. Build your own network protocol. Cool. BEEP looks very interesting as an alternative for all the contortions distributed application developers have to go through to make them work over HTTP. It provides a framework where most of the complex low-level stuff is done for you, and you just have to build your application-specific stuff on top of it. So the developer gets to decide whether the connection should be pull/push or both, stateless or stateful, pipelined or multiplexed etc. And security appears to be pluggable too.
I seem to remember Paul Hammant mentioning something about writing a BEEP module for AltRMI, which sounds like a great idea, especially for doing asynchronous callbacks. Must read more in case I'm totally wrong...
8:29:57 PM
|
|
...you've replaced sendmail as your default MTA (in my case with postfix). Oh my word. Talk about stressful. At one point I thought I'd just obliterated a whole day's worth of incoming mail because I kicked off fetchmail (thinking I was ready when I wasn't), and postfix threw a wobbly. Thankfully it kept all the undelivered messages so after a few frantic minutes skimming the docs, hacking the config and one 'postfix flush' later, all my email reappeared. Phew.
I flatter myself that I can usually puzzle my way through most techie things, but email delivery systems are way more complex than I ever imagined. I had no idea what I was getting into when I started. It's still not working as I expected but I appear to be able to send email, so I think I'll leave it until my palms stop sweating.
8:21:39 PM
|
|
Interesting article by Mark Harwood here regarding distributed
lucene indexes. Using distributed indexes is how google achieves its scalability I
believe, but they are a fairly special case.
If scalability in the sense of concurrent users is the issue, I tend to favour
multiple identical boxes with a load balancer and an RPC frontend. This can be
as simple as a servlet, or you can use SOAP or XML-RPC etc. (Possibly RMI,
although I've never tried that across a load balancer). Doing things this way
is probably a lot simpler to manage than splitting your indexes across boxes and
means that even if your queries are asymmetric (i.e. 85% of the queries are for
the same thing), the load can be fairly balanced. Reliability is achieved for
free as well - if a box dies just stop sending requests there. Given Lucene's
performance (it has been used to index collections of more than 10 million
documents) it's pretty unlikely that your dataset will get so large that sheer
size starts to affect your query times. Unless of course, you are google :)
10:16:02 AM
|
|
|
Lucene is great, but some of the default settings are heavily biased towards interactive indexing and searching. If you're building an index in a batch process style, set the IndexWriter.mergeFactor value to something big. I use 10,000, which makes it burn about 500 meg of RAM while indexing, but speeds it up a lot over the default value of 10. YMMV as ever.
7:16:14 AM
|
|
|
|
15 October 2002
Next Generation Email Clients.
Wow, you want a lot. :-)
"A reasonable man adapts himself to the world around him. An unreasonable man expects the world to adapt to him. Therefore, all progress is made by unreasonable men." - George Bernard Shaw.
Of course, he was a lot more eloquent than me. I just look pointedly at the title of my blog :)
First of all you should probably be using IMAP in the short term as it will provide a much better means for accessing email in a centralized location. The downside is that IMAP tends to require a good connection speed because the messages stay on the server and are downloaded on demand, as opposed to the POP strategy which downloads all messages in your INBOX and lets you organize and store them locally.
Yeah, I'll probably end up doing something like that. Although I can never do things the easy way, so what I might well end up doing is running my own local IMAP server...
As for a cross-platform GUI client which actually works well, I have yet to find one which satisfies me. On W2K I use Outlook, and while it is adequate there are a number of things which really piss me off. Sometimes Outlook just sits there for 10 or 20 minutes "checking for messages". Then there is the virus issue. On OS X I use the included Mail application, which works pretty darn well. In the worst-case scenario I resort to Pine on Linux (if I can only get in via SSH).
The most powerful Windows email client I've ever used is The Bat!. Also used by Ron Jeffries, I believe.
On my side I think I have abandoned the desktop client in favor of a web client. There are just too many issues with synchronization across clients which are too difficult to overcome. The existing protocols (IMAP and POP) do not really work well when you get into the level of tens of thousands of messages in hundreds of folders. You will essentially need to have a custom server and protocol which deals with these issues so that synchronization is not completely up to the client. Unfortunately this will be a painful if not impossible uphill battle, because people have their email servers in place and would be very reluctant to replace them with your server.
I wasn't really suggesting a replacement for established servers, but something more along the lines of a web service. As connectivity increases and more people have permanent connections, it's not unforeseeable that having your own static IP will be as common as having a phone number. If you believe the IPv6 hype, even your fridge will have a net presence in the not-too-far future. Anyway, if you have your own server on the net, you have more options with regard to applications. Imagine the scenario: your powerful home server is collecting and indexing your email, news feeds, etc. according to the criteria you have defined. You have your lightweight wireless device / laptop with you, and can simply hook up to your central system and be presented with a condensed and sorted view of all the stuff it has for you. Read some emails, send some replies, organise your calendar, all centrally stored and managed from your personal server. Your personal server could equally well be a hosted service, much like many bloggers already have.
Here are my ideas for a web-based email/information manager:
- It will not emulate a typical client-side application. No folder trees. No drag and drop. My idea is a single "INBOX" which is an aggregation of your different message sources such as email, RSS, newsgroups, etc.
- It will link together email/contact management/history/tasks/issues etc. in a fashion which makes it easy to view the lifetime of a particular discussion as well as the results of a discussion.
- It will provide multi-user functionality so that a group of users could share some messages which are related to the group but not others which are related to the individual only.
- It will work with POP and IMAP servers.
- It will track EVERYTHING which comes in and goes out.
- It will link with JIRA! :-)
[All Things Java]
All good stuff. Especially the JIRA bit :)
9:19:47 PM
|
|
What were you doing in 1993? [Russell Beattie Notebook]
Hmm. Finished my A-levels, turned 18, bought my first car, started my degree in Chemical Engineering. Discovered the internet. Drunk (and drank) a lot.
I just realised I can still remember the first ftp site I ever accessed. wuarchive.wustl.edu.
8:04:58 PM
|
|
Interesting. I think MS has its work cut out for itself to build a large open source community anywhere near the sizes of Linux, GNU or Jakarta. Especially if it's under a dodgy MS licence. All I've seen so far have been C# ports of existing Java open source projects (like NAnt, NUnit etc). Is there really much of a C# open source community out there? [James Strachan's Radio Weblog]
As someone who's turned most of their development attention to .NET (splitter), I'm feeling kind of isolated. OSS is totally alien to the existing MS community and the Java community view .NET as the plague. An umbrella is badly needed for the .NET OSS community.
How long until we see dotnet.apache.org? :) [Joe's Jelly]
C# is similar enough to Java that it isn't hard to pick up for Java developers who haven't rejected it merely on the basis of its heritage. As companies (like Joe's clients) start to demand .NET stuff, developers coming in to .NET who are used to the availability of OSS tools for Java will probably start to produce equivalents for the .NET world. Although what Microsoft were thinking with their first workspaces license escapes me. Unless they offer a compelling technical reason not to use SourceForge, I think they may struggle. The community will decide, and it doesn't really matter whether the tools are hosted on SourceForge, Apache, or GotDotNet workspaces.
6:50:31 PM
|
|
|
|
13 October 2002
I am humbled, and impressed. Humbled by once again realising how much I don't know, and impressed by how much others do, and how important human interaction is in what we do.
In one single comment, Chris has pointed me to at least three Java-related email projects I somehow managed to miss. I'd heard of ICEMail already, but forgot to mention it.
Must brush up on my Google terms - it's too easy to artificially narrow the criteria and not get what you were after. Google should just know what I want, not what I say I want!
So what do I want? It's more than just a mail reader. I have a 'too many email addresses' issue. I have several personal ones and a work one. I want to be able to aggregate all my personal ones and read them at work and at home. I want emails I write to be stored centrally, irrespective of where I am when I write them. At home I use Windows XP for most things, and FreeBSD for everything else. Part of the 'everything else' includes my email. But I don't want to have to switch chairs just to write an email. Fetchmail, Procmail and Mutt form a powerful triumvirate, and I can still fire up KMail if I feel like it and have it work. What I'd love is to have something like KMail as a cross-platform remote client, so I could use a nice GUI from wherever I can get a net connection, without having to lug my entire email directory around with me. It would have to be client-server. I don't want to try and rethread a 15,000 message folder back and forth over the net. The server should do that and just show me the results.
The other thing I want (don't ask much, do I?) is actually where I started. The concept of a smart mail reader that is also a newsgroup, web page, RSS, etc. aggregator that can index all my stuff and show me the relationships between things, sort of like ZOË taken to its logical extreme.
Then I just have the problem of what to do after breakfast... :)
12:11:14 PM
|
|
|
|
12 October 2002
Inspired by ZOË, and motivated partly by the fact that Mutt is still the best go-anywhere remote mail reader, I've been daydreaming about a cross-platform client-server GUI (not web based) Java mail program with built-in smart indexing, context-sensitive threading and dynamic linking.
What have I achieved so far? I've got some very simplistic code that scans an mbox file and creates EmailMessage objects.
On the way I discovered that:
- There appear to be no open source Java libraries for parsing mbox mail files.
- Mozilla started work on a Java email reader called Grendel in 1997, but it seems to have died. There appears to be a lot of code in CVS, but most of it seems to be pre-Java 1.2, and it requires Netscape jars which probably aren't open source.
- There is apparently no official spec for the mbox format hence...
- ...the implementation varies significantly between applications and platforms. About the only thing you can assume is that each message starts with 'From ' preceded by a blank line or the start of the file.
- The 'From ' address at the start of each message isn't necessarily the same as the 'From:' header.
- Jamie Zawinski (ex-Mozillian) has lots of useful and eclectic information on his site.
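For illustration, the one rule above that seems safe to rely on (a message starts with 'From ' preceded by a blank line or the start of the file) can be sketched like this. This is not my actual code: MboxSplitter is a made-up name, it returns raw message text rather than EmailMessage objects, and it makes no attempt to handle quoting conventions like '>From ' or Content-Length headers, which vary between mbox flavours.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mbox splitter using only the "blank line + 'From '" rule. */
class MboxSplitter {

    /** Splits mbox content into raw messages, each starting at its 'From ' line. */
    static List<String> split(Reader in) throws IOException {
        List<String> messages = new ArrayList<>();
        BufferedReader reader = new BufferedReader(in);
        StringBuilder current = null;  // null until the first 'From ' line is seen
        boolean previousBlank = true;  // the start of the file counts as a boundary
        String line;
        while ((line = reader.readLine()) != null) {
            if (previousBlank && line.startsWith("From ")) {
                if (current != null) messages.add(current.toString());
                current = new StringBuilder();
            }
            // Anything before the first 'From ' line is ignored.
            // The blank separator line stays at the end of the previous message.
            if (current != null) current.append(line).append('\n');
            previousBlank = line.isEmpty();
        }
        if (current != null) messages.add(current.toString());
        return messages;
    }

    /** Convenience overload for in-memory text. */
    static List<String> split(String text) {
        try {
            return split(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that a body line starting with 'From ' is only mistaken for a new message if it directly follows a blank line, which is exactly the ambiguity that makes the format so fragile. The 'From:' header is untouched by the check, since it has a colon rather than a space after 'From'.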
9:29:50 PM
|
|
Late night ramblings.... Back to blogging... ahhhh.
I sorta mentioned my new server setup in passing before, but I'm using OrionServer again and I love it. I was thinking about slapping a hacked copy of Weblogic on the server so I can work and play in the same environment, but Orion is just so nice I decided to play nice.
-Russ [Russell Beattie Notebook]
Nice to have you back, Russ.
On Weblogic: Whoa, you've only got 4 gig of disk and a gig of RAM, are you sure she can take it captain?
On JohnCompanies: I received the following at 10 pm PST on a Friday night, after signing up at 1 pm. Impressed.
Hi,
We have a really nice welcome message that we send out to new customers for our FreeBSD product, but I haven't finished writing the welcome message for our linux customers - I assume you'd rather not wait for that and just get an informal welcome.
...[account stuff skipped]...
P.P.S. We are having a scheduled maintenance tomorrow (saturday) night for about 20 minutes ... this is rare and it is just a coincidence that it is happening the day after you sign up.
A hosting firm that actually keeps their customers informed? Remarkable.
Finally:
...we are very happy to have you as a customer...
I think I'm going to be very happy being one.
9:06:54 AM