Scheme
The Scheme programming language: you either love it or you hate it. I happen to love it. / Tom Edelson



















Friday, April 25, 2008
 


The last five posts in this blog dealt, at perhaps amazing length, with differences between two (of the many) implementations of Scheme, both of them implemented in Java: JScheme and SISC.

In this "postscript", I want to do two things.  The first is to give an overview of the previous five posts in the sequence.  In particular, I want to call attention to the fact that they progressed from talking briefly about my preferences, to discussing a few of the differences between JScheme and SISC in much more detail.  And when I did the latter, I was no longer motivated primarily by a wish to explain my preferences: instead, my goal was to use these differences as examples of much more general points.

After doing that summarizing, I then want to return to the subject of preferences, or rather, of how one would choose between JScheme and SISC.  I want to refer, albeit briefly, to a rather larger number of differences between the two implementations: larger, that is, than the small number covered in the previous posts in the sequence.  This is in order to give a bit more practical help to someone whose interest is in making that choice; and also to make sure that I don't leave you with the mistaken impression that that small set of differences between JScheme and SISC (the ones that I discussed at length as examples of "tradeoffs in language design") is anywhere close to being all the important differences that there are.

The first of these five preceding posts, titled A Better Scheme, announced that, after working with JScheme for several months, I had decided that I preferred SISC ... enough to convert a project in mid-development from using JScheme to using SISC.

That post didn't say very much about why I prefer SISC, making just one general point on that subject, and one more specific one.  The general point was that SISC is "a complete implementation of R5RS, the most recent generally-accepted standard for the language", while JScheme claims only to be a "'nearly complete' implementation of R4RS, the previous version of the standard".  The more specific point was that in JScheme, the "... numeric types are the Java numeric types"; and that in some cases, JScheme gives you wrong answers to simple arithmetic questions.

The next four posts were titled as a series: "Comparing JScheme and SISC: Tradeoffs in Programming Language Design (part 1)" through "... (part 4)", the last of these being the most recent post prior to this one.  These were, in a sense, a continuation of "A Better Scheme", but their focus was different.  They weren't primarily about choosing between JScheme and SISC (even though they certainly presented information that might be relevant to that).  While it's a stretch, one could say that their point of view was that of a language implementor, rather than that of a [potential] user of a language implementation.

That statement would indeed be a stretch, because I wasn't writing for language implementors.  I was writing for Scheme programmers ... especially those who have reason to want access to Java code from their Scheme code ... which is to say, potential users of JScheme [and/]or SISC.  However, I wanted to give these readers a bit of appreciation for the language implementor's perspective.  In particular, I wanted to give some examples of how implementation choices may (in fact, pretty inevitably do) involve tradeoffs, in the sense that a single implementation decision may have consequences that users are apt to like ... and other consequences that many users will not like ... where there's no way to separate them, to have the "good" consequences without the "bad" ones.

It took me four long posts to deal with just two of these examples.  Each of these two examples was at least partly about how you invoke Java code from your Scheme code, in JScheme or in SISC.  There's no question of "following the Scheme standard" here, because the Scheme standard says nothing about how to do this.  Thus the implementors of JScheme and of SISC each designed their own extensions to the language for this purpose, and I was comparing those two designs.

The first example was entirely about how you do Scheme access to Java: it directly compared the syntax and semantics of how you do this in each.  This was the primary subject of the first two posts in the series.  I noted that JScheme's way of doing this, the "JavaDot notation", makes for more succinct code than does SISC's way of doing the equivalent.  On the other hand, one may (or may not) regard the JavaDot notation as a major departure from the simplicity and consistency of the Scheme syntax, and object to it accordingly.  So that's a tradeoff, at least if you do interpret the JavaDot "syntax" in that way, and do regard Scheme's simplicity of syntax as important.

The second example of a tradeoff took up the third and fourth posts in the series.  These returned to the subject of numeric data types (and limited themselves, just in order to narrow the topic, to integer data types in particular).  The underlying difference here is the fact that JScheme labels each integer value as belonging to one of the several built-in Java integer types, while to SISC (and to the R5RS standard), an integer is just an integer ... from the user's point of view, at least.

Each of the built-in Java integer data types has a fixed size in the computer memory, and, therefore, a fixed range of values that can be represented.  In Java itself, and in JScheme, this entails that some arithmetic calculations (namely, those whose true result is outside the range of representable values) produce answers that, mathematically, are simply incorrect.  This doesn't happen in SISC; I think that this can be counted as an advantage of SISC over JScheme.[1]

On the other hand, the same decision, to let the JScheme integer data types be the Java integer data types, also allows for greater simplicity (again, as compared with SISC) in how one calls Java from JScheme.  To give the Scheme programmer the ability to call Java (or, to an extent, any other language), a Scheme implementation must match up the arguments supplied by the Scheme code with the parameters required by the available Java methods.  In JScheme, since each integer argument is already tagged as one of the specific Java types, this matching process is straightforward.

The same matching process in SISC, which doesn't have such a one-to-one correspondence of types, could lead to ambiguities as to which Java method is to be called.  To prevent such ambiguities, SISC requires (and, so far as I can see, must require) that the Scheme programmer explicitly do a data type conversion before passing a Scheme value to a Java method.  Since that's [a little bit of] extra code to write, it counts as an advantage that JScheme has over SISC.

So we have a second example of an advantage and a disadvantage, flowing from one and the same design decision: a tradeoff.  Such tradeoffs are one reason why language design (or even the "implementation" of an existing language, because in real life, that still entails design decisions) is more difficult than it may look (to some).  This is a valuable thing to be aware of, partly because one may then notice that the same is true of the design of programs in general ... at least, in those cases where it makes sense to judge a program by more than just whether it meets the explicitly stated requirements.

I have used selected differences between JScheme and SISC as a vehicle for making some very general points about programming.  If I just left it at that, though, I might give some readers the mistaken impression that the differences I have mentioned thus far are the only differences between JScheme and SISC.  So, before leaving the topic of "Comparing JScheme and SISC", I want to dispel that impression.

There are many more differences between these two Scheme implementations.  Most of these differences (at least, if you can judge by the documentation) consist of features that SISC has, and JScheme doesn't.

Some of those features are specified by the R5RS standard, so that their absence in JScheme is a special case of the general statement that SISC is a more complete implementation of that standard.  One important feature in this category is the ability to create "syntactic extensions" of Scheme in Scheme, also known as "macros".  JScheme doesn't support these, whereas SISC does.

But a greater number of the features that SISC has, and JScheme doesn't have, aren't in R5RS: they are useful extensions to standard Scheme.  The SISC user manual describes too many of these to list here.  Here are three that may be regarded as important in various quarters:

  • An exception handling facility.

  • An object system: a way to define classes in Scheme code.

  • What could be called a "plug-in" architecture: a defined way to write Java code that implements new Scheme procedures.

SISC, in short, comes with a lot more stuff than JScheme does.  It's probably this, more so than the few specific differences that I examined in preceding posts, that primarily leads me to the conclusion that you almost have to prefer SISC over JScheme.  That is, it's hard for me to see how you could conclude otherwise, if you wish to use one implementation for all your Scheme programming needs, and you need to write a number of practical (and possibly large) programs.

The conclusion is strengthened if you add one more thing to your list of desiderata: if you prefer to use facilities already defined in Scheme, as opposed to explicit calls to Java code, where possible.  That's because you can pretty much do anything in JScheme that you can do in Java, but as compared to SISC, it's more often the case in JScheme that making explicit calls to Java is the only way to accomplish a task.

It's entirely appropriate, then, that JScheme's biggest claim to fame be that it provides easy access to Java.  A JScheme programmer depends more on explicit access to Java than a SISC programmer does, in order to get work done.  JScheme, as compared with SISC, seems more like a scripting language for Java.  SISC seems more like a Scheme implementation which "happens" to be written in Java ... though that fact about it means that you also can explicitly access Java from SISC, when you wish.  (And really, it isn't much more difficult.)

As a postscript to the postscript which is this post, I will add that, judging by my own experience, one of the greatest advantages of JScheme, in practice, may be something not mentioned so far.  That would be the fact that, to one category of potential users, JScheme has less of a learning curve, at the beginning.  I am referring to programmers who want or need to learn and use Scheme, but who are just beginning to do so.  Presuming that he or she also will be needing to make use of Java libraries, such a programmer, in learning to use either JScheme or SISC, is learning how to access Java from Scheme, and learning the Scheme language in general, at the same time.

This is something that one would prefer not to have to do, unless one were even more of a glutton for punishment than I am.  In such circumstances, when one hasn't yet reached a comfort level in "standard" Scheme programming, one is especially likely to look for the simplest possible way of accomplishing some additional task, such as invoking Java from Scheme.  And to such a Scheme newbie, that's what the JavaDot notation looks like: this person cares less about whether it may dilute the pristine simplicity of the Scheme syntax, and more about the fact that you have a one-screen "cheat sheet" which tells you how to do what you need to do.[2]

And that's where I pretty much was, when I first had to choose which Java-based Scheme to use: I'd been wanting to program in Scheme for years, and felt I understood the concepts pretty well; but I knew that there's a big difference between that and having some significant practice under your belt.  You could call it the difference between "knowing about" and "know-how".

So I chose to use JScheme for my first substantial Scheme project.  The project turned out to be a good bit more substantial than it seemed at first (don't they all?), and so it came to pass that, well before it was done, my perspective had changed.  Now I was comfortable with the basics of Scheme.  As a result, learning to use a modestly more complicated interface to Java was not nearly as scary as it had been a few months earlier.  In fact, the cost of doing that was now less, in my eyes, than the cost of finishing the project without access to the additional features offered by SISC.

Remembering this sequence of events helps to keep me from thinking that the original decision to start with JScheme was a "mistake". It was the right thing to do at the time, and perhaps not just in the sense that it appeared to be right, from my perspective at the time.  I might well recommend that someone else follow the same path -- that is, start with JScheme, with an awareness that one may switch to SISC fairly soon -- if that person were starting out in similar circumstances, with similar goals.



[1] This fact also returns to one of the topics of "A Better Scheme" (namely wrong arithmetic answers), and explains how and why these occur.

[2] Another reason why SISC looks scary, to a Scheme newbie, is the fact that the previously-mentioned SISC user manual is entitled "SISC for Seasoned Schemers".

Categorie(s) for this post: Scheme.



4:35:32 PM


Wednesday, April 16, 2008
 


As I explained in the previous post, in JScheme, an integer value is always tagged with a specific type.  It might be the default Java integer type, int; or it might be one of the alternatives, such as long.  But if it's an integer value which will be handled by JScheme's built-in facilities for same, then its type, as seen by the JScheme interpreter, will be one of the built-in Java integer types.

Furthermore, each of those built-in Java integer types has a size limit, a largest positive [and largest negative] value that can be represented.  A long takes up 64 bits, as opposed to 32 bits for an int, and so the maximum positive value for a long is 2^63-1, as opposed to 2^31-1.  But there is always a limit; and as a result, there is always the possibility that a seemingly simple calculation may produce a wrong answer, in JScheme as in Java.

In SISC, on the other hand, an integer is, so far as is visible to the SISC programmer, simply an integer.  The SISC interpreter does some work behind the scenes to keep track of the amount of memory that must be reserved for the value, since that is variable, with no predefined limit.  But integers of different sizes are not considered to be of different types.

As a result of this design difference, SISC cannot produce the sort of wrong answers to which I referred above, those due to "integer overflow".  The memory space occupied by the result is as big as it needs to be in order to accommodate the mathematically correct answer.

This can be viewed as an advantage of SISC over JScheme, and also over Java ... and over most other programming languages, for that matter.

Furthermore, in respect to both the original design difference, and the difference in mathematical correctness that results from it, SISC is conforming to the R5RS Scheme standard, and JScheme is not. 

That's all a recap of what I've said before.  The new business for the present post is to explain why I say that this advantage for SISC necessarily comes with a disadvantage, when it comes to calling a Java method ("method" being, pretty much, Java's name for what might otherwise be called a routine, procedure, or function) ... specifically, when passing integer parameters from Scheme code to the Java method.

As background to this, one must know and remember certain facts about Java methods themselves.  The first of these is that when you create a Java method, you explicitly declare a type for each of its parameters; and in order to call the method, you must supply a list of parameters (or, more strictly speaking, "arguments") whose types match the types in the declared parameter list, either exactly or by way of one of the few automatic "widening" conversions that Java permits (an int, for instance, will be accepted where a long is declared).  For example, if the method's definition says that there must be two arguments, the first an int and the second a String, you cannot call it with a long and a String, instead.

The second fact is that a Java class, or source file, can contain two (or more) methods with the same name.  You can have one which expects an int and a String, and another, with the same name, which expects a long and a String.  Their implementations will [necessarily] be different as well, but the difference in parameter types is the only difference which the underlying language software can use in order to figure out which of the two methods will be called.  (We would say, of these two example methods, that the type declared for the first parameter is the only difference in their "signatures".  That's because "signature", for a method, is effectively defined to mean: that collection of facts about it which are relevant to determining whether or not it should be called (or "invoked") in any given situation.)
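To make the idea of two same-named methods concrete, here is a minimal sketch in Java.  The class and method names are hypothetical, invented purely for illustration; the point is only that the two methods differ in nothing but the declared type of their first parameter, and that Java picks between them by looking at the argument types.

```java
// Hypothetical class for illustration only; not taken from any real library.
final class OverloadDemo {
    // Two methods with the same name. Their signatures differ only in
    // the declared type of the first parameter.
    static String describe(int n, String label) {
        return "int version: " + label + "=" + n;
    }

    static String describe(long n, String label) {
        return "long version: " + label + "=" + n;
    }

    public static void main(String[] args) {
        int small = 42;
        long big = 42L;
        // The compiler chooses the method from the static argument types.
        System.out.println(describe(small, "n")); // uses describe(int, String)
        System.out.println(describe(big, "n"));   // uses describe(long, String)
    }
}
```

When the caller is itself Java, as here, the choice is settled when the program is compiled.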

(When the calling code is written in Java as well, this determination can be made at compile time.  When the calling code is Scheme (JScheme or SISC), this can, in general, only be determined at run time.  This difference between the languages, while important in other contexts, is not very relevant to the particular difference between JScheme and SISC which is our topic today.)

(I will also mention that this particular fact about Java, the fact that you can have two methods with the same name, and thus distinguishable only by their parameter lists, is one of my least favorite things about Java.  It is undoubtedly considered a good thing by many who program primarily in Java itself; but for those interested in calling Java from a "dynamic" language like Scheme, it is the source of much pain, largely because of just the sort of consequences of it that are under discussion here.  However, I generally try not to fuss much about what I do or don't like about Java; Java is just there, y'know?)

So anyway, now we can look at the consequences of these facts about Java, as they affect calling Java from JScheme; and contrast that with how they affect calling Java from SISC.

From JScheme, this is relatively straightforward.  I'll continue to use the example in which there are two candidate methods, having the same name, with the only difference in their signatures being that the first method requires an int in the first argument position, while the second method requires a long.

When the JScheme interpreter encounters a method call in which the given method name, and other context, point at a choice between these two candidate methods, it decides between them by examining the internal tags which indicate the types of the arguments being supplied.  If the arguments are tagged as an int and a String, the first method will be invoked.  If they are tagged as a long and a String, the second method will be invoked.  If anything else, a run-time error will be signaled.
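The following is a rough sketch, in Java, of what "deciding by the tags at run time" can look like.  It is emphatically not JScheme's actual code: the class, the method names, and the lookup strategy are all assumptions of mine.  It uses Java's standard reflection API, treats the run-time class of each argument (Integer, Long, String) as its "tag", and, as a simplification, insists on an exact match between tags and declared parameter types.

```java
import java.lang.reflect.Method;

// Hypothetical sketch of tag-based dispatch; not how JScheme is implemented.
final class TagDispatchDemo {
    // The two candidate methods from the running example.
    static String show(int n, String s) { return "int version got " + n; }
    static String show(long n, String s) { return "long version got " + n; }

    // Translate the run-time "tag" of an argument into a Java parameter type.
    static Class<?> javaTypeFor(Object arg) {
        if (arg instanceof Integer) return int.class;
        if (arg instanceof Long) return long.class;
        if (arg instanceof String) return String.class;
        throw new IllegalArgumentException("no Java type for " + arg.getClass());
    }

    // Pick the method whose declared parameter types match the tags exactly.
    static Object call(String name, Object... args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = javaTypeFor(args[i]);
        }
        try {
            Method m = TagDispatchDemo.class.getDeclaredMethod(name, types);
            return m.invoke(null, args); // static method, so no receiver
        } catch (ReflectiveOperationException e) {
            // Corresponds to the "run-time error" case described above.
            throw new IllegalStateException("no applicable method: " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(call("show", 7, "x"));  // argument tagged as int
        System.out.println(call("show", 7L, "x")); // argument tagged as long
    }
}
```

Because every argument already carries a Java type, the lookup never has to guess; that is the whole of the advantage being described.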

Now consider the same situation, except that SISC, not JScheme, is the Scheme implementation being used.  In SISC, just as in JScheme (or any Scheme implementation), the values are tagged, internally, with some indication of the "type" of each.  But in SISC, as I said above, an integer is just an integer ... meaning that it is tagged only as such, not as one of the various specific Java integer types such as int or long.

Let me make clear that in SISC, you can have a value tagged as a Java int or as a Java long.  Indeed, one needs to make use of this capability in order to resolve the difficulty under discussion.  But a value tagged in either of these ways is not considered to be an "integer" by SISC: it doesn't let you use either of them in its native arithmetic operations.  There's really not much you can do with them, other than to pass them to Java methods, or to convert them to actual Scheme integers.

But to return to the situation at hand: the SISC program wants to call one of these two candidate methods.  It supplies an argument list in which the first parameter is tagged internally simply as a Scheme integer, not as a Java int or long.  But only by treating this argument as an int, or as a long, can the SISC interpreter legitimately determine which method to call.  So how can it make this decision?

The answer is that it can't.  If the SISC program attempts this, an error will be signalled.  Since there's no way for the SISC interpreter to make the decision, the person writing the SISC program must make it.  SISC must, and does, require that the SISC code first explicitly convert this Scheme integer value into either a Java int or a Java long, before attempting to pass it to one of these methods.  And [obviously?], the type to which it is converted determines which method is invoked.

The conversion is not, technically, part of the method call, but it is a prerequisite to it ... if you want to call one of these methods, and the value you want to pass as the first argument is [for example] the result of an arithmetic calculation done in the Scheme code.

Having to do an explicit conversion means having to write additional code; but we're not talking about a lot of additional code here.  Let me [finally!] show you what the code would actually look like.  In so doing, I'll simplify the example: now our two candidate methods each require one argument, the first method requiring it to be an int, and the second a long.

The conversion, together with the method call, could look like this in SISC:

  (my-method (->jint scheme-value))
  

in order to call the first method, or

  (my-method (->jlong scheme-value))
  

in order to call the second.  Whereas in JScheme, it could look simply like this:

  (my-method scheme-value)
  

(Of course, in either implementation, it couldn't look exactly like what I've shown ... unless you had first defined the symbol my-method so that its value is the appropriate "generic Java method".  That prior step would look different in SISC from how it would look in JScheme; I'm assuming that it's already been made, in order to abstract away this syntactic difference, and focus on the difference concerning the data type conversion, or lack of same, in SISC or JScheme, respectively.

In fact, what I am doing here is the opposite of what I did in part 2.  There, my original SISC example was

  (define later
    (->boolean ((generic-java-method 'after) date2 date1)))
  

in which the "->boolean", plus one set of parentheses, represented one of these type conversions which SISC requires, and JScheme does not.  Then I waved my hand at the type conversion, promising to explain it later ("planning to come back, in a subsequent post, to the issues around it"), and then abstracted it out, in order to focus on the syntactic difference.  The "subsequent post" in question is the one you are reading now.)

I will claim that I have now shown you a very good example of a necessary tradeoff in language design.  I stress that it is in language design, not language choice: it's not a question of having to accept the whole package of choices that one or another language designer made, with it being [merely] statistically close-to-impossible that either designer will have made each choice exactly as you would wish.

The tradeoff here could be described, either from the [presumed] perspective of the JScheme designers/implementors, or from the perspective of those creating SISC.

For the people designing JScheme, it seems clear that a top priority was given to making calls from JScheme to Java as easy ... and succinct ... as possible.  That means you can't require explicit type conversions, for those are additional code.  The only way to make the calls, unambiguously, without requiring explicit type conversions (in at least some cases) is to have the JScheme types correspond one-to-one with Java types.  Each of the built-in integer types in Java has a predefined size limit; so the same must be true of each one in JScheme. 

Finally, if there is a finite set of integer data types, each with a predefined size limit, then there must be some cases in which you can code an arithmetic operation, and give it input values, each of which fits within the range of values that can be represented; but the actual correct result of the arithmetic operation, as that operation is understood outside "computer arithmetic", is outside the range of values that can be represented.  Since the correct answer cannot be represented, JScheme cannot give you that correct answer as the result of the operation.

The people designing SISC, on the other hand, seem to have had a fundamental design goal of fully implementing the R5RS Scheme specification.  The specification requires that an implementation not give incorrect arithmetic results, at least not of the sort that are entailed by fixed limitations on the range of representable values.  Therefore SISC, or any other fully conforming Scheme implementation, cannot have such predefined limits on the range of values.

But the built-in Java integer types all do have such predefined limits.  Therefore the SISC types cannot correspond one-to-one with the Java types.  Therefore (and given the fact that two Java methods can have the same name, and be distinguishable only by the types in their parameter lists), if you were allowed simply to pass values of Scheme types as arguments to Java methods, there would necessarily be cases in which the interpreter could not determine which method to invoke.  Therefore SISC must require explicit type conversions, in at least some cases, in order to eliminate these ambiguities.
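The same chain of reasoning can be acted out entirely within Java, if one is willing to let java.math.BigInteger stand in for a Scheme integer.  That substitution is my own assumption, made for the sake of illustration, and says nothing about how SISC represents integers internally; the class and method names below are likewise hypothetical.  A BigInteger has no predefined size limit, and, sure enough, it cannot be handed directly to either of two methods overloaded on int and long: the caller has to convert it first, and the conversion chosen is what selects the method.

```java
import java.math.BigInteger;

// Hypothetical sketch; BigInteger plays the part of an unbounded Scheme integer.
final class ExplicitConversionDemo {
    // The two candidate methods from the simplified example.
    static String myMethod(int n) { return "int overload"; }
    static String myMethod(long n) { return "long overload"; }

    // The caller decides which overload is meant by choosing the conversion.
    // Both conversions throw ArithmeticException if the value does not fit.
    static String viaInt(BigInteger value) {
        return myMethod(value.intValueExact());
    }

    static String viaLong(BigInteger value) {
        return myMethod(value.longValueExact());
    }

    public static void main(String[] args) {
        BigInteger value = BigInteger.valueOf(1000);
        // myMethod(value) would not compile: neither overload takes a BigInteger.
        System.out.println(viaInt(value));
        System.out.println(viaLong(value));
    }
}
```

The two conversion calls play roughly the role that ->jint and ->jlong play in the SISC code shown earlier.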

In short, sometimes you can't have it both ways.

Categorie(s) for this post: Scheme.



3:42:59 PM


Wednesday, April 9, 2008
 


In part one of this series comparing JScheme with SISC, I said "My overall theme is that their design goals are different, and that they should be judged accordingly."  I'd say that this claim was moderately borne out by the remainder of the first two parts, which dealt with the syntactic differences between the two Scheme implementations' ways of calling Java code.  If your goal were to call Java from Scheme while writing as little Scheme code as possible, JScheme would have an advantage.  SISC's way of doing it, on the other hand, is more integrated with the overall "Scheme way" of doing things; this is an advantage for SISC (though not all would agree that it is a significant advantage), so this qualifies as an example of a tradeoff: of a decision in which each available choice has both advantages and disadvantages.

Now I want to return to one of the points discussed in the post preceding the one quoted above, the post titled A Better Scheme.  That is, I want to return to another kind of difference between JScheme and SISC, namely how they handle numbers.

In a subsequent post, I will follow this up by explaining how this difference in the treatment of numbers affects certain aspects of how you call Java from each, that is, from JScheme or from SISC.  Having done that, I hope to make an even stronger argument for the contention that designing a programming language (including a Scheme implementation) is a matter of tradeoffs: that there are cases where making the language "better" in one respect cannot help but make it "worse" in another.

In doing so, by the way, I aim to make this post a bit more accessible than the last two: not to assume, as they did, that the reader already understands the basics of Scheme programming.

I won't cover numbers in general, just integers (that is, whole numbers).  Talking about fractions would introduce complexity that isn't needed to make my point ... and the more so if I were to add those creatures that are actually called "complex numbers".

In general, a programming language will have a default integer "type": a default way of representing, in the computer, a value that is declared, or otherwise known, to be an integer.  And in many languages, probably most, a value of that default integer type takes up a fixed amount of space in the computer memory.  In Java, the default integer type is called int, and it takes up 32 bits.

It turns out, after allowing for negative as well as positive values, that the largest possible positive value representable by a Java int is 2^31-1 (raise two to the thirty-first power, then subtract one).

(Some readers, with technical backgrounds, will find this obvious.  Others may be more than happy to take it on faith.  But perhaps you don't fall into either of these categories; perhaps you'd like to see an explanation of how 32 bits yields 2^31-1 as the largest possible positive value.  I've written something which attempts to explain this, and made it available at the following location under my "home page" at The Well: http://www.well.com/user/edelsont/software/concepts/integer-range.html.)
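For those who would rather check the arithmetic than take it on faith, here is a small Java sketch (the class and method names are my own, for illustration).  One of the bits is, in effect, spent on the sign, which leaves one bit fewer for the magnitude; so for a signed type of a given width, the largest positive value is two raised to the power (width minus one), less one.  The sketch computes that figure exactly and compares it with the limits that Java itself reports.

```java
import java.math.BigInteger;

// Hypothetical helper class, for illustration only.
final class IntegerRangeDemo {
    // Largest positive value of a signed integer type of the given width:
    // 2^(bits-1) - 1, computed without any risk of overflow.
    static BigInteger maxSigned(int bits) {
        return BigInteger.ONE.shiftLeft(bits - 1).subtract(BigInteger.ONE);
    }

    public static void main(String[] args) {
        System.out.println("32 bits: " + maxSigned(32)
                + " (Integer.MAX_VALUE is " + Integer.MAX_VALUE + ")");
        System.out.println("64 bits: " + maxSigned(64)
                + " (Long.MAX_VALUE is " + Long.MAX_VALUE + ")");
    }
}
```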

The main point is that there is a fixed maximum value that can be represented as a Java int.  So what happens if a Java program takes two values, each represented that way, and tells the computer to multiply them together?  It will give you back the answer in the form of another Java int, which has, naturally enough, the same maximum value.

Perhaps you can see that there is a potential problem here: one which may arise, or not, depending on the actual values of the original two ints when the program is run.  Even though neither of the values is greater than 2^31-1, their product could still be greater than that number, and thus too large to be represented as a Java int.  Any such event, where the true result of a calculation will not fit in the data type being used to represent it, is called an "overflow".  What will Java do in such an event?

You might expect that the Java system would detect this and automatically "promote" the result to some other data type, which is capable of representing its value accurately.  Another possibility is that it might indicate that it is unable to give you a correct answer for this particular calculation.  For example, you might think of what some spreadsheet programs do when a column isn't wide enough to hold the number it's supposed to display: they fill the cell with asterisks, rather than displaying a number at all.

In fact, Java does neither of these things; it simply gives you a wrong answer.  That is, when the correct mathematical result is not within the range of integers representable as Java ints, Java gives you an answer which is within the range, but is not correct.  It may even tell you that the product of two positive numbers is negative.
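A short Java sketch shows this happening (the class name is hypothetical).  The number 46340 is the largest int whose square still fits in an int; add one to it, and the square silently comes back negative.

```java
// Hypothetical class, for illustration only.
final class OverflowDemo {
    // Plain Java int multiplication: the result is silently wrapped
    // into the 32-bit range if the true product does not fit.
    static int multiply(int a, int b) {
        return a * b;
    }

    public static void main(String[] args) {
        // True product 2147395600 fits, so the answer is correct.
        System.out.println(multiply(46340, 46340)); // prints 2147395600
        // True product 2147488281 exceeds 2^31-1, so the answer is wrong.
        System.out.println(multiply(46341, 46341)); // prints -2147479015
    }
}
```

No error is reported in the second case; the program simply carries on with the wrong number.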

If you are writing a Java program, and it includes some arithmetic calculations, and there is a possibility that the values being fed into the calculations may produce overflow, there are things you can do to prevent the program from giving the user the wrong answer when overflow does occur.  But in order to accomplish this, you have to tell the computer to do the calculation in a different way, thus making the program you write more complex than it would be if you just used the nice, simple Java notation for doing multiplication.
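One such "different way", sketched below under names of my own choosing, is to do the multiplication in the wider long type, where the product of two ints always fits, and then check whether the result is still within the range of an int.  (More recent versions of Java, from Java 8 on, offer Math.multiplyExact, which performs a similar check for you; the hand-written version is shown here because it makes the extra work visible.)

```java
// Hypothetical helper class, for illustration only.
final class CheckedMultiplyDemo {
    // Multiply two ints, refusing to return a wrong answer.
    static int multiplyChecked(int a, int b) {
        // The product of two 32-bit values always fits in 64 bits.
        long exact = (long) a * (long) b;
        if (exact < Integer.MIN_VALUE || exact > Integer.MAX_VALUE) {
            throw new ArithmeticException("int overflow: " + a + " * " + b);
        }
        return (int) exact;
    }

    // Convenience for callers that only want to know whether overflow occurs.
    static boolean overflows(int a, int b) {
        try {
            multiplyChecked(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(multiplyChecked(46340, 46340)); // fits in an int
        System.out.println(overflows(46341, 46341));       // true
    }
}
```

Compare the length of this with the single expression a * b, and the added complexity is plain.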

All this is true of Java, and of a lot of other programming languages as well: of most of them, I'm reasonably certain.  But Scheme is one of the exceptions.  According to the standard which specifies how a Scheme implementation is supposed to work, when a Scheme program says to multiply two integers together, it's supposed to get the right answer.

Actually, the Scheme programmer can specify that the starting values are not "exact" to begin with, in which case the Scheme system is allowed to avoid the extra work required in order to get the exact right answer.  But, in the case of integers, at least, this is not the default behavior: the programmer has to go out of her way to specify that exact results are not needed, and, if she hasn't done that, exact results she will get.  This is the opposite of the behavior of Java, and of most other programming languages, in this type of situation.
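
As a rough analogy only (Java has no notion of Scheme's "exact" and "inexact" numbers), the tradeoff can be illustrated by comparing arbitrary-precision BigInteger arithmetic with fixed-size double arithmetic.  The class and method names below are invented for the example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class ExactVsInexact {
    // "Exact" in the sense used above: the result has as many digits as it needs.
    public static BigInteger exactSquare(long n) {
        BigInteger big = BigInteger.valueOf(n);
        return big.multiply(big);
    }

    // "Inexact": a double has a fixed size, so large results are rounded.
    public static double inexactSquare(long n) {
        double d = (double) n;
        return d * d;
    }

    public static void main(String[] args) {
        long n = (1L << 27) + 1; // 134217729
        System.out.println(exactSquare(n)); // prints 18014398777917441
        // Convert the double to its exact integer value before printing.
        System.out.println(new BigDecimal(inexactSquare(n)).toBigInteger()); // prints 18014398777917440
    }
}
```

The inexact result is off by one: cheap to compute, and close, but not the right answer.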

How is it possible for a Scheme implementation to accomplish this?  If you have a fixed amount of computer memory in which to represent a value, that necessarily places a limit on the range of values that can be represented.  So, in essence, there's only one way to meet this requirement of accuracy.  A Scheme implementation, in order to conform to the standard in this respect, must not place a fixed limit on the number of bits (the amount of memory) used to represent an integer.  It also must not leave it up to the authors of Scheme programs to specify, by choice of data type, how much memory is to be allocated for any particular integer value, such as the result of a calculation.

Instead, the Scheme implementation must determine, at run time, how much memory space will be required to hold the mathematically correct result, and make that space available.  ("At run time" means when the program is being run, as opposed, say, to when it is written.)  The amount of space will depend on the actual values of the numbers being fed into the calculation; and those values are usually not known when the program is being written, since they may come, directly or indirectly, from run-time inputs to the program.
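
The following Java sketch shows one way such a run-time decision could be made.  It is an illustration under my own assumptions, not a description of how SISC or any other Scheme implementation actually does it: the result is kept in a fixed-size long when it fits, and handed back as a variable-size BigInteger when it does not.

```java
import java.math.BigInteger;

public class PromotingMultiply {
    // Returns a Long when the exact product fits in 64 bits,
    // and a BigInteger (which grows as needed) when it does not.
    public static Number multiply(long a, long b) {
        BigInteger exact = BigInteger.valueOf(a).multiply(BigInteger.valueOf(b));
        // bitLength() excludes the sign bit, so a signed long can hold up to 63 bits.
        if (exact.bitLength() <= 63) {
            return Long.valueOf(exact.longValue());
        }
        return exact;
    }

    public static void main(String[] args) {
        System.out.println(multiply(1L << 30, 2L));        // prints 2147483648
        System.out.println(multiply(Long.MAX_VALUE, 2L));  // prints 18446744073709551614
    }
}
```

The point is that the choice of representation is made from the actual values, at the moment of the calculation, rather than by the programmer in advance.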

So in general, the memory requirements for integer values in a Scheme program cannot be determined until run time.  This makes it harder to create a Scheme implementation which conforms to the standard.  On the other hand, it makes it easier to write reliable Scheme programs, if you know that they will "run under" [be executed by] a conforming implementation.

JScheme does not conform to this part of the Scheme specification.  Its default integer type, like Java's, has a fixed amount of memory space allocated for it; in fact, JScheme's default integer type is Java's default integer type.  And when you ask JScheme to multiply two numbers, each of which is represented internally in the default integer type, it allocates the same fixed amount of memory for the result, just as Java does.  And so, if the result of the calculation is too large to fit in that fixed amount of memory, you get a wrong answer ... again, just as in Java.

There was an example of this in the post titled A Better Scheme, dated February 11, 2008.  The example involved the number 2^30 (two to the thirtieth power).  I said "If you ask JScheme to multiply this number by two, it tells you that the answer is negative."  I didn't say so at the time, but there was a reason for picking this specific sample multiplication: its correct answer is 2^31.  Now you know that the biggest number that can be represented accurately, using JScheme's (and Java's) default integer type, is 2^31-1: the true result of the specified calculation is too big to fit ... by a margin of one.

So JScheme gets the wrong answer, as Java does.  Perhaps this is what was meant (or at least part of what was meant) by a remark that I quoted in part one, to the effect that JScheme can be thought of as a "Scheme skin" over Java.  (That was said, as you may recall, by one of those developers principally responsible for maintaining JScheme.)

A Scheme skin over Java!  That's a metaphor, of course.  But it seems to mean about the same thing as saying that JScheme is Java dressed up to look like Scheme; and that, in turn, could be said to imply that JScheme isn't really Scheme.

A more formal, and less loaded, way of putting the point is that JScheme has the syntax of Scheme, but, at least in this aspect of its handling of integers, it has the semantics of Java.

And really, to me, "less loaded" is better.  A purist, in the light of something like this, might call JScheme a faux Scheme ... and conclude that, obviously, a real Scheme programmer would therefore never touch it.

I'm not that sort of purist.  I don't think that a decision about whether to use JScheme for something should be based on whether JScheme is "really Scheme": it should be based on whether JScheme meets the requirements of the job you need to do.

That said, however ... I think that for many projects, this aspect of JScheme's handling of integers does count significantly against it.  I don't want to have to do extra work to be sure of getting right answers from an arithmetic calculation.  Nor do I want to spend time thinking about whether that is necessary: about whether I know enough about the inputs to the calculation, the values of the numbers being fed in, to be confident that the JScheme/Java semantics will produce correct results.

In earlier days, programmers thought very carefully about such things, because calculations using a fixed amount of memory take less machine resources than do ones using a variable amount, and machine resources used to be expensive enough that programmers needed to spend a lot of their time finding ways to minimize the use of them.  But the cost of programmer time has risen, while the cost of machine resources has fallen, so that it rarely makes sense, any more, to expend significant programmer effort on saving resources in these sorts of ways.

SISC does conform to this part of the Scheme specification: it doesn't have a default integer type with a fixed size limit, it just adjusts the amount of memory, and thus the range of values that can accurately be represented, as needed.

So SISC removes this source of possible wrong answers for me, without the extra effort on my part.  That contributes significantly to my tending to prefer SISC over JScheme; and I would say, more generally, that it gives SISC a significant advantage over JScheme.  (By the same reasoning, it gives any conforming Scheme implementation a significant advantage over Java, and over most other programming languages as well.)

With this advantage, however, comes a disadvantage.  You get the advantage without the disadvantage, if you are writing pure Scheme code.  But if you also need to write explicit calls from Scheme to Java code, then this same design decision, unavoidably, makes it more laborious to do it in SISC than it is in JScheme.

This sounds like the same issue which got the most attention in both parts one and two of this series: JScheme's "Java Dot Notation", which also makes calling Java easier from JScheme than from SISC, but which some criticize for being a departure from the simple Scheme syntax.  But it's not the same issue: it's also about calling Java from SISC versus doing so from JScheme, but this time, it's more specifically about passing parameters, and integer parameters in particular, between the two.

The next post in the series will explain how this design decision, the same one which makes it easier to be sure that you will get correct arithmetic results from SISC, also makes it more difficult to pass integer parameters from SISC to Java, as compared with doing so from JScheme.  Thus it will fulfill the promise to give a clear-cut example of the claim that there are unavoidable tradeoffs in designing a programming language ... or even, as in this case, in designing [what you call] an implementation of an existing programming language.

Categorie(s) for this post: Scheme.



4:50:20 PM


Wednesday, March 19, 2008
 


To recap, the previous post began a comparison of how Java methods are called in Java, JScheme, and SISC, and gave an example, using the after method of the java.util.Date class, of "equivalent" calls in each.

In Java:

  boolean later = date2.after(date1);
  

In JScheme:

   (define later (.after date2 date1))
   

In SISC:

  (define later
    (->boolean ((generic-java-method 'after) date2 date1)))
  

There are two differences between the JScheme and SISC versions.  One is SISC's call to the ->boolean primitive: there is nothing explicit in the JScheme version that corresponds to this.  Let's talk briefly about that first, to get it out of the way, for now ... planning to come back, in a subsequent post, to the issues around it.

If you remove that altogether, the SISC version becomes

  (define later ((generic-java-method 'after) date2 date1))
  

and that is, by itself, a valid call to the after method.  I added the call to ->boolean because you need it in order to make the SISC version truly equivalent to the JScheme version: given a Java method that returns a boolean, calling that method from JScheme gives you a Scheme boolean, which can be used "as is" in, for example, a Scheme "if" construct.  In SISC, on the other hand, a Java boolean is not the same thing as a Scheme boolean: you have to convert the former to the latter, by calling ->boolean, in order to use the result as you'd expect to be able to use a boolean value in Scheme.

With that gone, the remaining difference is that the JScheme version has this:

  .after
  

where the SISC version has this:

  (generic-java-method 'after)
  

We know what the SISC version means: apply a procedure, generic-java-method, to the literal symbol after.  We also know that this will return a "generic Java method", which is a special type of procedure; when that procedure is, in turn, applied to the arguments date2 and date1, the specific Java method to be called will be determined from the types of those arguments.

Now how are we to understand the simple ".after" that JScheme uses to equivalent effect?  We might, by analogy, think that in JScheme, the period just before "after" does what the application of generic-java-method does in SISC: that "." in JScheme, when it appears just before a symbol, is a function, or operator, or something, which is applied to a symbol and returns a "generic Java method".  But, from a Scheme programmer's point of view, that would be dreadfully ugly.  Scheme doesn't have "operators", and in order to write an expression that denotes the application of any procedure, macro, or built-in syntactic form, you have to enclose that expression in parentheses.

In other words, it would be a minor deviation from Scheme syntax if the JScheme version of

  (generic-java-method 'after)
  

were to be

  (. 'after)
  

It would be a violation because the R5RS Scheme standard doesn't allow a symbol name to begin with a period: in my opinion, a relatively small matter.  But allowing the JScheme programmer to denote the same thing by just

  .after
  

(while still regarding it as the application of some sort of procedure) ... that's a much more major deviation from Scheme syntax.  For a while, I did interpret JScheme's "Java Dot notation" that way, and during that time, I did, indeed, consider it to mar the elegance of the Scheme language.

I have one data point suggesting that at least one other Scheme programmer has had this type of negative reaction to the "Java Dot notation".  That data point is to be found at http://lambda-the-ultimate.org/node/936.  That's a forum post on the Lambda the Ultimate site, titled "Yearning for a practical scheme".  I want to direct your attention, more specifically, to the reply headed "Some thoughts", by user bitwize.  He or she writes (in part):

SISC is fully R5RS compliant, supports the full numeric tower, tail-call optimisation, all of that beautiful stuff.  It has an *extensive* support for bringing Java objects and methods into the Scheme world and though it's somewhat more cumbersome to use than cute little hacks like JScheme's JavaDot, too much (syntactic) sugar is bad for you.

I am interpreting what is said here about "JavaDot" as expressing a complaint similar to the one I outlined above.  (If this misconstrues what was meant, my apologies to bitwize.)  And by the way, "too much (syntactic) sugar is bad for you" is a great quote.

However, is it actually applicable here?  Should we be regarding the "Java Dot notation" as syntactic sugar, JScheme's ".after" as an abbreviated notation for, but semantically equivalent to, SISC's

  (generic-java-method 'after)
  

... ?  It might seem that the answer has to be "yes", since in the present example, they produce the same effect (modulo the bit about how JScheme's

  (.after date2 date1)
  

returns an actual Scheme boolean, while SISC's

  ((generic-java-method 'after) date2 date1)
  

returns something which still has to be converted into one).

But wait.  There's another way to understand "Java Dot" locutions like .after.  We can regard ".after" as just a plain old symbol name, and posit that JScheme has, in effect, predefined a whole bunch of symbols ... all the ones whose names have one of the forms specified in the Java Dot Notation documentation ... as referring to [what SISC would call] "generic Java methods", field accessors, and the like.

Before you call the men in the little white coats to haul me off to the looney bin, let me hasten to assure you that I don't believe that JScheme actually works this way.  It couldn't; it would have to predefine an infinite number of symbols.  But this seems to me to be an alternate, more acceptable metaphor for how JScheme works.  We shouldn't take it literally, but we can understand JScheme as working "as if" this had been done.  It's "more acceptable" in the sense that if we read the code this way, then the notation does not violate the standard Scheme syntax in any really gross way.  (Again, it does violate it, in that the standard says that a symbol name can't begin with a period; but I hope you'll agree that that is a smaller, less repulsive sort of violation.)

And I will further claim that, in the light of a couple of details about how JScheme works, this metaphor also provides us with a more accurate conceptual (though not literal) model of those workings than the one which regards the dot as a funny sort of procedure name.

The first of those details is: in JScheme, you can write something like

  (define my-after .after)
  

and subsequently, my-after will [also] do what .after did before.

The second detail, which I find considerably more convincing than the first, is that in JScheme you can also write, say,

  (set! .after
   (lambda (arg1 arg2)
    (list arg2 arg1)))
  

and thereafter, .after will, indeed, function consistently with this new definition of it; more generally, you can redefine any of these symbols that has a name in "Java Dot" format.  This makes perfect sense if you conceptualize these character sequences as [names of] predefined symbols, and no sense if you conceptualize them as expressions in which the dot, by itself, is some sort of off-syntax procedure name.  And I understand the latter way of looking at them to be the one which treats the "Java Dot notation" as "syntactic sugar".
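
The "predefined symbols" model can be sketched in Java as a symbol table that invents a binding the first time a dotted name is looked up, so that nothing infinite ever has to be stored.  Everything here (the class, the Procedure interface, the lookup rule) is hypothetical and written for illustration; I am not claiming that JScheme is implemented this way.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

public class DotSymbolSketch {
    /** A Scheme-like procedure: any number of arguments in, one value out. */
    public interface Procedure {
        Object apply(Object... args);
    }

    // Explicit bindings, as made by define or set!.
    private final Map<String, Procedure> bindings = new HashMap<>();

    public void define(String symbol, Procedure value) {
        bindings.put(symbol, value);
    }

    public Procedure lookup(String symbol) {
        Procedure bound = bindings.get(symbol);
        if (bound != null) {
            return bound; // an explicit (re)definition always wins
        }
        if (symbol.length() > 1 && symbol.startsWith(".")) {
            // Behave "as if" every dotted symbol had been predefined.
            String methodName = symbol.substring(1);
            return args -> invokeByName(methodName, args);
        }
        throw new IllegalArgumentException("unbound symbol: " + symbol);
    }

    // The first argument is the receiver; the rest are the method's arguments.
    private static Object invokeByName(String name, Object[] args) {
        Object receiver = args[0];
        Object[] rest = Arrays.copyOfRange(args, 1, args.length);
        for (Method m : receiver.getClass().getMethods()) {
            if (m.getName().equals(name) && accepts(m.getParameterTypes(), rest)) {
                try {
                    return m.invoke(receiver, rest);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("no method " + name + " for these arguments");
    }

    // Simplified matching: ignores primitive parameter types and null arguments.
    private static boolean accepts(Class<?>[] params, Object[] args) {
        if (params.length != args.length) return false;
        for (int i = 0; i < params.length; i++) {
            if (!params[i].isInstance(args[i])) return false;
        }
        return true;
    }

    public static void main(String[] unused) {
        DotSymbolSketch env = new DotSymbolSketch();
        Date date1 = new Date(1000L);
        Date date2 = new Date(2000L);
        System.out.println(env.lookup(".after").apply(date2, date1)); // prints true
        // The analogue of (set! .after (lambda (arg1 arg2) (list arg2 arg1))):
        env.define(".after", a -> Arrays.asList(a[1], a[0]));
        System.out.println(env.lookup(".after").apply(date2, date1));
    }
}
```

In this model, both the (define my-after .after) and the (set! .after ...) examples fall out naturally, because ".after" is just a name that happens to have a default value.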

Once I adopted this "predefined symbols" model, I stopped regarding the "Java Dot notation" as repulsive or unclean ... despite the fact that I probably do fit the stereotype of Scheme enthusiasts as "purists" about keeping the simplicity of the language inviolate.  Predefining a whole bunch of symbols ... even an infinite number of them ... doesn't violate the simplicity of the language; it just means that you have a lot of libraries readily available.  Which, come to think of it, could be considered to be the point of having Java-based implementations of Scheme.

One could certainly regard all this as being of small moment, especially since I do, nevertheless, prefer SISC to JScheme (for most purposes).  But it seemed to me worthwhile to be clear about which (alleged) reasons for doing so I find valid, and which I don't.

Categorie(s) for this post: Scheme.



9:04:38 PM


Monday, February 25, 2008
 


My last post dealt with how I evolved a preference between two different implementations of the Scheme programming language, each of them written in Java: how and why I came to decide that I preferred SISC over JScheme.  I want to make some follow-up points about how one might compare and contrast the two implementations.

My overall theme is that their design goals are different, and that they should be judged accordingly.  This may help to explain why, though that last post was titled A Better Scheme, at the end of it I backed away from saying that SISC is "better" than JScheme, preferring simply to say that I prefer SISC.

The key difference in design goals was stated by Tim Hickey, one of the principal developers of JScheme, in an e-mail message to the "SISC-users" mailing list, dated 2002-05-31.  He said: "SISC seems to be written with R5RS conformance and high performance in mind, whereas Jscheme was designed to allow easy access to Java. It can be thought of as a 'Scheme skin' over Java...."  (The full message can be found here, in the archives at SourceForge.net. [1])

Okay; JScheme was designed with "easy access to Java" in mind.  In other words, it is supposed to be easy to call Java methods from JScheme.  One important aspect of how this is achieved is called, by the JScheme developers, "The Java Dot Notation."

Let's look at an example of this notation: of how one would call a Java "instance method" from JScheme.  I'll randomly choose the "after" method of java.util.Date, which is declared thus:

  public boolean after (Date when)
  

So if we have two Date objects, date1 and date2, one example of a Java statement which calls this method would be

  boolean later = date2.after(date1);
  

An approximate equivalent of this in JScheme would be:

   (define later (.after date2 date1))
   

In JScheme, a symbol beginning with a period, such as ".after", is, in general, interpreted as the name of a Java method; that method is called on the Java object which is given as the first argument of this Scheme "procedure".

To do the same thing in SISC requires more code:

  (define later
    (->boolean ((generic-java-method 'after) date2 date1)))
  

So yes, in a simple sense, access to Java is easier from JScheme than from SISC.  Pragmatically, this could certainly be seen as an advantage that JScheme has over SISC, in some use cases, at least.  However, some in the Scheme community don't see it as an advantage at all, but rather as proof that JScheme is not faithful to the classic simplicity of Scheme, and should be shunned.

Since I have already said that I prefer SISC, you might assume that I would be one of these people.  I was for a while, but not any more.  For me, the Java Dot notation, per se, is not objectionable.

In a subsequent post, I will lay out the nature of the objection, and why I no longer think it has merit.



[1] Note added 2008-04-09: The e-mail message, as it appears in the archive, doesn't actually say that JScheme "can be thought of as a 'Scheme skin' over Java", but that it "can be though of as ..."  However, since I can't parse "can be though of", I have taken the liberty of assuming that Mr. Hickey meant "can be thought of", and editing the quotation accordingly.

Categorie(s) for this post: Scheme.



6:21:07 PM


Monday, February 11, 2008
 


I have now done enough programming in Scheme to be able to say confidently what I've long suspected: that it's my favorite programming language.  Of the ones I know well enough to judge, that is; I will never know for sure whether there's something else out there that I would like better ... because no one can learn enough about all existing programming languages to make an informed judgment about such a question.

All of my serious Scheme programming, to date, has involved the "scripting" of existing Java code.  This, at the very least, biases me toward using a Java-based Scheme implementation, meaning one that runs on the Java Virtual Machine: it would certainly be possible to script Java code from outside the JVM (after all, one can script Java code from Perl), but given that at least one acceptable Java-based implementation of Scheme exists, why not use one?

As those familiar with the Scheme world know, the language suffers from an embarrassment of riches where implementations are concerned.  One list, at community.schemewiki.org, has about seventy of them ... and I know for a fact that that's not a complete list.  And Robert Tolksdorf's list of Programming Languages for the Java Virtual Machine has no fewer than fourteen entries (all of them under the heading "Lisp and co") that claim to be full or partial Scheme implementations, for the JVM alone.  (Nine of the fourteen appear in the list at community.schemewiki.org as well.  Also see "Footnote 1" below.)

At any rate, of the nine implementations that appear in both lists, I now have experience with two.  I began with JScheme (see "Footnote 2"), and have now moved on to SISC (Second Interpreter of Scheme Code).  You would be correct in inferring, from the way I phrased that, that I have decided that I prefer SISC.

Why do I prefer it?  In part, because it's a complete implementation of "R5RS", the most recent generally-accepted standard for the language, and JScheme is not.  (Yes, there's a story behind that qualification, "generally-accepted", but it will wait for another time.)  I will note at once that JScheme doesn't claim to be; it's never claimed to be more than a "nearly complete" implementation of R4RS, the previous version of the standard.

However, while the JScheme page linked above says "JScheme implements all of R4RS Scheme except that continuations can only be used as escape procedures and strings are not mutable", that list of exceptions is not complete.  The JScheme implementors have written elsewhere (I can't seem to find the reference right now) that "the numeric tower is not right", or words to that effect.  I'd have to call that an understatement ... or at least, the way I initially interpreted it was an understatement.  I thought it probably meant something like: JScheme does not support complex numbers, only real ones.

As it turns out, the best succinct statement of how JScheme handles numeric data types is: the JScheme numeric types are the Java numeric types.  For example, among integer values, it distinguishes between an int (31 bits of precision) and a long (63 bits).  In some cases, if you push the limits of these types, you get wrong answers in JScheme [version 7.2].

For example, suppose that you start with the number 1,073,741,824 (two to the thirtieth power).  If you ask JScheme to multiply this number by two, it tells you that the answer is negative.  If you ask it to multiply the same number by four instead, it tells you that the answer is zero.
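
Since the JScheme numeric types are the Java numeric types, the same two results can be reproduced in plain Java.  The sketch below (with made-up names) shows Java's int arithmetic next to the mathematically correct answers computed with BigInteger; it is a Java demonstration of the int semantics, not output captured from JScheme itself.

```java
import java.math.BigInteger;

public class PowerOfTwoOverflow {
    // Fixed-size arithmetic: the result must fit in 32 bits, whatever the true answer is.
    public static int intTimes(int a, int b) {
        return a * b;
    }

    // Arbitrary-precision arithmetic: the result takes as much room as it needs.
    public static BigInteger exactTimes(int a, int b) {
        return BigInteger.valueOf(a).multiply(BigInteger.valueOf(b));
    }

    public static void main(String[] args) {
        int twoToThe30th = 1073741824;
        System.out.println(intTimes(twoToThe30th, 2));   // prints -2147483648
        System.out.println(intTimes(twoToThe30th, 4));   // prints 0
        System.out.println(exactTimes(twoToThe30th, 2)); // prints 2147483648
        System.out.println(exactTimes(twoToThe30th, 4)); // prints 4294967296
    }
}
```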

Actually, many software programs, and programming languages, are prone to such errors, where large integers are concerned.  In Java, some of the same things will happen, by default.  The programmer can prevent them, but only by making the program more complicated.  But the Scheme standard says that, in cases like these, a Scheme implementation should give the mathematically correct answer, without special effort on the part of the person writing the Scheme code.

And SISC does.

It's not clear that this limitation in JScheme would have made a practical difference, in the project I was working on when I discovered it: it's not clear that my program ever would have encountered numbers that large.  But I didn't want to have to worry about it.  And, practical consequences or no, I found it, let's say, aesthetically unpleasant, given that this is supposed to be Scheme, and that [relative] mathematical correctness is one of the defining characteristics of Scheme, in my mind at least.

This funny business with numbers was just one of several reasons why I came to decide, in the middle of a project, to stop trying to write it in JScheme: to convert what I'd already written to use SISC instead, and then resume adding functionality to the program.  From a broader point of view, one could say that I was motivated by the fact that this was my first significant Scheme programming project, which means it was my first opportunity really to learn (as opposed to learning about) the language.  I decided that it was a waste of time to learn the habit of working around limitations in JScheme, if I didn't intend to keep using JScheme for the rest of my Scheming career.

And yet ... despite the title of this post, I don't really want to make an absolute statement that SISC is "better" than JScheme.  It depends on what you want.  If you want a "mostly Scheme-like" scripting language for Java, JScheme could be a perfectly reasonable choice.  If you really want to program in Scheme, you may prefer SISC.



Footnote 1: the list of Scheme implementations at community.schemewiki.org also has at least one entry, for llava, which is clearly Java-based, yet doesn't appear in the above-mentioned Java-specific list.

Footnote 2: the Tolksdorf list of Java-based implementations has (as of today) two confusingly similar entries, the first headed "JScheme" and the second headed "Jscheme".  It is the second of these which refers to the implementation I've used ... though the correct capitalization of that implementation's name is, in fact, "JScheme".  The first "JScheme" entry is a broken link (again, as of today), but clearly refers to a different implementation with a similar name.

Categorie(s) for this post: Scheme.



3:54:19 PM


Monday, May 21, 2007
 


In my April 23 post, I said that I intended to write a Moneydance extension which would make Moneydance scriptable from [one of the Java implementations of] Scheme.  I'd still like to do that, but have tentatively decided that that will not be the first Moneydance extension that I will loose upon the world.  The first one will still be a scripting interface, but for a different scripting language: BeanShell.

So what's BeanShell?  The really short answer is that BeanShell is interpreted, simplified Java.  "Simplified" in several senses; one is that declaring the types of your variables is optional in BeanShell.  Also, BeanShell can be run in a "console" mode in which you type a single BeanShell statement, and it is executed immediately.

A BeanShell call to a Java method looks exactly like a Java call to the same method.  Behind the scenes, the BeanShell interpreter uses the Java "reflection" API to invoke the method.
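
For readers who haven't used the reflection API, here is a minimal sketch of what "invoking a method by reflection" looks like in Java itself.  It is only meant to show the general technique; the class name is made up, and I am not suggesting that this is what the BeanShell interpreter's code looks like.

```java
import java.lang.reflect.Method;
import java.util.Date;

public class ReflectiveCall {
    // Does the same thing as receiver.after(other), but finds the method by name at run time.
    public static boolean isAfter(Date receiver, Date other) {
        try {
            // Look up the public method "after" that takes a single Date parameter.
            Method after = Date.class.getMethod("after", Date.class);
            // Invoke it on the receiver; the boolean result comes back boxed as a Boolean.
            return (Boolean) after.invoke(receiver, other);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Date earlier = new Date(1000L);
        Date later = new Date(2000L);
        System.out.println(isAfter(later, earlier)); // prints true
        System.out.println(isAfter(earlier, later)); // prints false
    }
}
```

An interpreter can do this lookup from nothing more than the method name and the argument values it has in hand, which is what makes the quick, interactive experimentation possible.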

This means that, while you can use BeanShell to write and run what we ordinarily think of as "scripts", you can also use it as a learning tool.  If the Javadoc for an API doesn't make it crystal clear just how to write that method call that you need, a little experimentation can usually clear it up.  You can do that experimentation in Java, of course, but doing it in BeanShell tends to be considerably faster.

Now actually, the same ability to experiment quickly with a Java API is also present in at least one of the Scheme implementations: namely, JScheme.  The difference is that, in order to use JScheme in that manner, you have to know how to program in Scheme, and also know a good bit about Java: at least enough to read Javadoc.  It's a pretty good bet that more people, today, know Java than know Scheme, but it's a dead solid cinch that more people know Java than know both.  (Perhaps you've never heard of BeanShell before, but if you know Java, then I've already told you nearly everything you need to know in order to get started in using BeanShell to explore an API.)

So there's the major reason for the change in plans: the BeanShell scripting interface has a bigger potential user base than a Scheme scripting interface does.  In particular, the BeanShell scripting interface should be useful to anyone engaged in learning the Moneydance API, in order to write their own Moneydance extension(s).  It might even encourage more people to do so.

The plan is, as I said, "tentative".  Why?  Mostly because I'd probably abandon it, if I found out that someone else was already working on the same thing, and had a big head start.  That consideration is part of the reason for this vaporware announcement; I'm hoping to get the word out such that, if my undertaking it would indeed be a duplication of effort, I'll find out.

I suppose I'd also reconsider the change of plan if this posting led to a great clamor of demand for me to implement a Scheme-Moneydance scripting interface as soon as possible.  Or, for that matter, if it resulted in squadrons of pigs flying around my house at all hours.

Categorie(s) for this post include: Personal Finance Software; Java.



4:20:13 PM



© Copyright 2008 Tom Edelson.
Last update: 4/25/08; 4:53:02 PM.