Drew did the heavy lifting and points out that the problem isn't with .NET but with the DTDs themselves, which include a module that simply doesn't exist. I must also sheepishly admit that he had responses to some earlier questions that I missed. One was about FTP client services in .NET and the other was about creating a custom XmlResolver. Sorry, Drew.
David Johnson is having a helluva time with CQHost and is trying to find an alternative hosting service that provides Java Servlets. Until then I'll keep both his Radio- and Roller-based weblog news feeds in Aggie.
Andy McMullen pointed out to me via e-mail that I can write an XmlResolver all my own and fix the default XmlResolver's problems with relative directory names. There's even an MSDN article on creating custom resolvers. Way cool. My first experiment will be a little resolver that just passes all the work to the default resolver but lets me see exactly where .NET falls down on the job when processing the XHTML 1.1 DTD.
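A minimal sketch of what such a pass-through resolver could look like. The class name TracingResolver and its Requests list are made up for illustration; the only framework pieces assumed are XmlUrlResolver and its overridable ResolveUri and GetEntity methods. It changes nothing about how DTDs are resolved, it only records what was asked for.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper: delegates everything to the default XmlUrlResolver
// and keeps a log of each URI it is asked to resolve or fetch.
public class TracingResolver : XmlUrlResolver
{
    // Log of requests, in the order they were made.
    public List<string> Requests { get; } = new List<string>();

    // Called to turn a (possibly relative) reference into an absolute URI.
    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        Uri resolved = base.ResolveUri(baseUri, relativeUri);
        Requests.Add("resolve base=" + baseUri + " relative=" + relativeUri + " -> " + resolved);
        return resolved;
    }

    // Called to actually open the resource behind an absolute URI.
    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        Requests.Add("fetch " + absoluteUri);
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}
```

To use it, assign an instance to doc.XmlResolver before calling doc.Load("test.html"), then dump Requests after the load succeeds or throws. The last few entries should show which relative reference the default resolver mishandled.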
A reader dropped me a note to point out that my solution to XmlDocument loading up DTDs was sub-optimal, noting that if the DTD defines any entities and they are used in the document then the document will fail to load.
Now this leaves me in a bit of trouble. I want to load and manipulate an XHTML 1.1 document using System.Xml.XmlDocument, but can't because .NET chokes on the 1.1 DTD. I suppose I could download, fix, merge and pre-feed all the XHTML 1.1 DTDs before I load my document, which I could modify to have only half a doc-type, though I'm sure that has other consequences. On the other hand, I could just find and pre-feed all the entities defined for XHTML 1.1 and still use my XmlResolver = null hack.
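A rough sketch of the second option, under the assumption that "pre-feeding" means declaring the entities in an internal DTD subset ahead of the document, so nothing has to be fetched. EntityPrefeed, BuildDoctype and Load are hypothetical names, and the two entities in the usage note stand in for the full XHTML 1.1 set, which is not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;

// Hypothetical helper: declares entities inline so the document can be
// loaded with XmlResolver = null and no external DTD is ever requested.
public static class EntityPrefeed
{
    // Builds a DOCTYPE whose internal subset declares each entity.
    public static string BuildDoctype(string rootName, IDictionary<string, string> entities)
    {
        var sb = new StringBuilder();
        sb.Append("<!DOCTYPE ").Append(rootName).Append(" [");
        foreach (KeyValuePair<string, string> pair in entities)
        {
            sb.Append("<!ENTITY ").Append(pair.Key)
              .Append(" \"").Append(pair.Value).Append("\">");
        }
        sb.Append("]>");
        return sb.ToString();
    }

    // Loads markup that has no DOCTYPE of its own, prefixed with the
    // generated one. The resolver is null, so no external subset is fetched.
    public static XmlDocument Load(string bodyXml, IDictionary<string, string> entities)
    {
        var doc = new XmlDocument();
        doc.XmlResolver = null;
        doc.LoadXml(BuildDoctype("html", entities) + bodyXml);
        return doc;
    }
}
```

For example, calling EntityPrefeed.Load with the markup "<html><body>&copy;&nbsp;2002</body></html>" and a dictionary mapping "copy" to "&#169;" and "nbsp" to "&#160;" should load without touching the network. An entity that is used but missing from the dictionary will still make the load fail, so the table has to be complete for the documents at hand.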
I have just checked a fairly large change to Aggie into CVS. This change breaks out all the configuration information and puts it into a dialog box. This includes support for a proxy server (needs testing) and the ability to choose a 'skin'. The skin controls the look of the generated HTML page used for viewing the news; the application itself is currently not skinnable. Once Aggie's support for international feeds is fixed, this will become RC4.
Congrats to Dave Johnson on the birth of his third son Leo Michael Johnson!
This .NET Memory Profiler looks like a must-have tool. Thanks, Sam!
In .NET it turns out that if you load an XML document into XmlDocument, i.e.
XmlDocument doc = new XmlDocument();
doc.Load("test.html");
then all of the DTDs are pulled in and parsed, even if you do not pass the XmlDocument to a validating reader. If it can't find the DTDs then it throws an exception. Not the behaviour I expected. The way to stop this isn't obvious at first either, but this works:
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null;
doc.Load("test.html");
The reason I know this is because I was trying to load an XHTML 1.1 document into XmlDocument and it kept failing. It appears that the XmlResolver can't handle relative path names for included DTDs, and guess what, the official DTD for XHTML 1.1 loads other DTDs and does so with a relative URL in some cases.
You will have to wait a while to find out why I was trying to load an XHTML document into XmlDocument...
If you are interested in how Aggie works, I have posted a short Theory Of Operation for Aggie.
It seems like every day I run into one more reason to move to Italy. Recently it was this Wired article, and now Ben Hammersley is moving there.
NPR doesn't just suck. They are brutally stupid too. [Via Boing Boing]
I was surprised to find that the FCL is completely bereft of support for client-side FTP functionality. Bummer. I would have wanted an FTP class that did FTP, Passive FTP and SFTP. Given the very complete support for HTTP, this omission was surprising.
The Heisenberg Uncertainty Principle states that you can't measure a thing without changing it. What Google does is measure the web.
This topic came up obliquely at the RTP bloggers lunch yesterday. The topic turned to the fact that Google loves bloggers, ranks us highly and that eventually they would tune their ranking parameters to put us bloggers back down further on the list where we belong.
Google rewards good web behaviour: changing, linking and being linked to. It rewards those sites with higher page ranks. Don't want to play nicely on the web, like NPR and their idiotic policy about linking to them or the NYTimes and their dearth of links in stories, and you will be punished with low rankings. Don't believe me? Then why does the NYTimes, with all their content and all their reporters only have a page rank of 8 while Mark Pilgrim has a page rank of 7 and Dave Winer a page rank of 8?
Since the blogger lunch I have mulled this idea of Google changing its ranking system to de-rate bloggers. Should Google change to accommodate the web, or should the web change to accommodate Google? For example: right now bitworking.org has a much higher page rank than the web site of my employer. [Do not bother looking, I have never mentioned them, their industry or their location in this blog. Ever.] I do know that I could put together a website for them that is structured like a blog, with frequent entries of links to news items from their particular industry, and short notes attached about how the company relates to that news item, maybe publish some white papers and mix in some press releases, all done regularly on the home page. They could very quickly rise to a higher page rank and become authoritative in their industry, at least as far as Google is concerned. Is this better than the current static brochure-ware website? Yes. Everybody including current customers, future customers, shareholders, employees, and the web in general would be that much richer for a site like that.
So no, Google shouldn't change. The web needs to change to accommodate Google. Link, be linked to, be authoritative on a subject, keep current and offer information others want and need and you'll succeed in Google's eyes. Let page-rank stand as the carrot and the stick of good web behaviour.
Heard an interview with They Might Be Giants on NPR and followed it up with a visit to the NPR site. TMBG has released NO!, a CD for kids. There are 3 songs in RealAudio format available off the NPR site; my favorite is "Bed, Bed, Bed". It is a song about going to bed, but is probably the last thing you should actually play for children before sending them to bed. Reilly, Austin and I listened to it before they went to bed tonight. We were singing, marching and stomping all around the house. Lynne was upstairs grading papers and thought that the washing machine had become unbalanced, only to find that it was her husband that was unbalanced.
We'll be buying the CD.
I can't recommend DVRs highly enough, so if you don't already have one, consider purchasing one in the next year. You'll thank me later. [The Shifted Librarian]
I agree 100%. Our TiVo has changed TV for the whole family, with our 6 and 8 year old being completely TiVo literate, and our 2 year old is always confident that he can watch Blue's Clues (with Joe, not Steve) any time he wants. But I have also had a real difficult time explaining to people exactly why it is so great, you just gotta see it and use it to really grok it.
Even better, he's got pictures.
Another review of desktop aggregators, this time by Matthew Eberle, where he compares Aggie, AmphetaDesk, Blagg, Radio Userland, and NewzCrawler.
Very cool news from Peter Drayton:
Traffic on another list I'm on suggested that not everyone knows that MS Research is already busy adding generics to the Rotor execution engine. It should be out long before the commercial EE gets upgraded to support generics, and will be a great way to start playing with the technology early. The F# language compiler already supports emitting generic-ized IL, too. [Peter Drayton's Radio Weblog]
The lack of generics is one of the two weaknesses in .NET and it is good to see progress in this area. The second weakness is, of course, getting .NET cross-platform, but that is already being worked on.
I was down to just one web site that I read regularly that I could not get into Aggie and that site was Textism. But that era has passed and now, thanks in part to Mark Pilgrim, Textism has an RSS feed.
And today that was Rafe Colburn of rc3.org. He is the latest addition to the RTP bloggers and today he's posted a very nice review of Aggie, and no, I didn't bribe him for the good review over lunch ;).
As always, if you're a blogger in the RTP area please drop one of us a line and come by the next monthly lunch.
Developing Aggie has at times been quite a pain because of DSL and .NET. You see, my DSL service only works on my machine when it is booted into Windows ME. I have tried unsuccessfully to get my DSL connection to work under Windows 2000. My machine does dual-boot into Windows 2000 which I must do to compile Aggie since the C# compiler only works under Windows NT, 2000 or XP. So my standard development cycle is code, reboot into Win2k, compile, reboot into WinME, test. I have new hardware on order that should fix this problem.
I have been approved and Aggie is now being hosted on SourceForge. Please stop by and submit a patch, submit a feature request, submit a bug report, pull the latest sources from CVS, etc.
I have tried to add all the bugs and enhancement requests I have seen to date. Please stop by and make sure I haven't missed anything.
Here is an update on my friend and boss Dave Winer. He is in the hospital and will remain there until next weekend. To those that are sending notes, he should be OK, but in the meantime send positive healing thoughts in his direction. He will be providing more details when he is able. [John Robb's Radio Weblog]
Best wishes to Dave for a speedy recovery.
Take a look at the coolness Richard Caetano is drawing up with his TDraw class.