Dynamic Structures as Part of a Language

Hi,

I hope I've got the right group for this! I've been a professional software engineer for about 8 years, and I've always tried to concentrate on design principles. I've written a few compilers and a simple VM, and partially designed a new language, and after a while I began to have a few ideas about stuff.

One of the things I came to realise was that I think it is possible to design anything in terms of dynamic data structures (for example, tree-like structures) using only a vector, set, and map.

To me, these three items provide a full complement of tools needed to build most (if not all) dynamic structures. I think they are so important, that they are almost fundamental to a modern language.

Now, clearly there are some situations where it may be more useful to use a linked-list, or I'm sure there are some other specialist data structures that we could think of to solve particular problems, but I think that in 99% of design cases these types will suffice, and in fact they have for me, in my experience.

My idea was to actually make these data structures part of the language, just like arrays, say. A possible notation/syntax might look something like this:

Declarations:

{int} m_mySet;

[int] m_myVector;

[int->String] m_myMap;

Simple usage:

int i = m_myVector[j];

m_myVector += 3; // Add to the end.

m_myVector += m_myVector; // Concatenate.

String s = m_myMap;

boolean b = m_mySet.contains(i);

There is also a little inheritance hierarchy:

A vector is a set.

A map is a set.

A [int->T] map is a vector (where T is some type).

So you could have:

boolean contains({int} is, int i)

{

return is.contains(i);

}

...and then do...

[int] is = {1,2,3};

boolean b = contains(is,1);

...fairly obviously.
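
For comparison, a minimal sketch of how the same call might look against the standard java.util interfaces (Java 5 or later). The class name ContainsSketch is made up for illustration, and using Collection<Integer> as the parameter type is an assumption standing in for the proposed {int}: in Java, List and Set both extend Collection, rather than a vector being a set.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class name, for illustration only.
class ContainsSketch {

    // Accepts any Collection<Integer>, so lists and sets can both be passed in.
    static boolean contains(Collection<Integer> is, int i) {
        return is.contains(i); // i is autoboxed to Integer
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3);
        Set<Integer> set = new HashSet<Integer>(list);

        System.out.println(contains(list, 1)); // true
        System.out.println(contains(set, 4));  // false
    }
}
```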

You could also intermix types easily, so you could add map elements to a set with just:

{int} sis;

[String->int] mis;

sis += mis;
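
The proposal does not say whether "sis += mis" adds the map's keys or its values. One plausible reading, since the keys here are Strings and the set holds ints, is that the values are added. A rough sketch of that reading using the existing collection classes follows; the class and method names are hypothetical, and the interpretation itself is an assumption.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical class name, for illustration only.
class SetPlusMapSketch {

    // Assumed meaning of "sis += mis": add every value of the map to the set.
    static Set<Integer> addValues(Set<Integer> sis, Map<String, Integer> mis) {
        sis.addAll(mis.values());
        return sis;
    }

    public static void main(String[] args) {
        Set<Integer> sis = new HashSet<Integer>();
        sis.add(1);

        Map<String, Integer> mis = new HashMap<String, Integer>();
        mis.put("one", 1); // already present in the set
        mis.put("two", 2);

        addValues(sis, mis);
        System.out.println(sis.size()); // 2: the set holds 1 and 2 once each
    }
}
```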

There could be VM instructions/optimisations for all operations associated with these three structures.

Obviously there is a lot more design work that needs to be put in, but the principles of what I'm talking about are the use of these three structures as fundamental parts of the language.

For those who have used functional languages, I'm sure you'll see some similarities in these ideas.

Any comments/suggestions?

Si.

[2530 byte] By [someideas] at [2007-9-30 10:44:59]
# 1
I think the langauage PERL is suitable for you. ^^
kennethz at 2007-7-3 20:11:31 > top of Java-index,Other Topics,Java Community Process (JCP) Program...
# 2
Spelling mistake: language
kennethz at 2007-7-3 20:11:31
# 3
Well that just illustrates how little you know, doesn't it?
someideas at 2007-7-3 20:11:31
# 4
Well this shows why nobody is willing to reply to your post, doesn't it?
kennethz at 2007-7-3 20:11:31
# 5

If you think it's a bad idea, then why don't you explain why you think that, instead of making a pointless sarcastic comment? Maybe then, one or both of us will learn something?

My point is that these structures are so useful and so widely used that it makes sense to integrate them into the language. What's your point?

someideas at 2007-7-3 20:11:31
# 6

The phrase "syntactic sugar" is designed for that sort of thing. They offer little or nothing that isn't either already available from the library data structures, or which isn't coming in 1.5

Java developers tend to be conservative about the core language. I doubt you will ever see operator overloading in Java, and your suggestions are tantamount to that.

In point of fact, your suggestions mainly imply that you don't know Java very well, or at the very minimum that you don't know the collection classes very well. Try a little humility - until you know the language REALLY well, don't think you know how to improve it.

I wouldn't presume to dictate how the Perl crowd should develop their language.

Declarations:

// Guess what, the "old" way I get to specify

// the implementation, instead of having to use

// the language dictated one:

Set mySet = new HashSet();

List myList = new ArrayList();

Map myMap = new HashMap();

versus:

{int} m_mySet;

[int] m_myVector;

[int->String] m_myMap;

Simple usage:

// Admittedly clunky

int i = ((Integer)myList.get(j)).intValue();

// But generics & autoboxing are coming:

int i = myList.get(j);

myList.add(i); // Add to the end

myList.addAll(myOtherList); // Concatenate

boolean b = mySet.contains(i);

versus:

int i = m_myVector[j];

m_myVector += 3; // Add to the end.

m_myVector += m_myVector; // Concatenate.

String s = m_myMap; // <-- What the fsck is this supposed to be ?

boolean b = m_mySet.contains(i);

There is also a little inheritance hierarchy:

A vector is a set.

A map is a set.

A [int->T] map is a vector (where T is some type).

Ahem, but they aren't. They're all Collections. Which is why (with generics & autoboxing):

List<Integer> is = new ArrayList<Integer>();

is.add(1);

is.add(2);

is.add(3);

boolean b = is.contains(1);

versus:

boolean contains({int} is, int i)

{

return is.contains(i);

}

...and then do...

[int] is = {1,2,3};

boolean b = contains(is,1);

Do you see why nobody much has bothered to respond now ? Well, that and the fact that the JCP forum's not exactly a well travelled area...

D.

dcminter at 2007-7-3 20:11:31
# 7

Correction - Map is not (and should not be) a Collection, but neither is it a Set (it's allowed to retain duplicate values) nor a List (its elements do not need to be ordered).

It shouldn't be a collection in its own right, because it groups together two distinct collections - the key set, and the values. It makes no sense to ask "does the Map contain X" unless you specify whether you're looking for a key, a value, or both.

If you really want to say "Is X in the keys or in the values ?" then you can always say:

boolean b = myMap.keySet().contains(X) || myMap.values().contains(X);

But that's an unusual request (I can't think of a single occasion when I've needed to do that). Far more normal is the operation:

boolean b = myMap.values().contains(X);

Or even more common (and this after all is the point of Maps):

Object X = myMap.get(key);
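
A small sketch of the distinctions above, using the generic form of the collection classes. The class name and sample data are invented for illustration; the calls themselves (put, get, containsKey, containsValue) are the standard java.util.Map methods.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name and sample data, for illustration only.
class MapLookupSketch {

    static Map<String, Integer> sample() {
        Map<String, Integer> ages = new HashMap<String, Integer>();
        ages.put("alice", 30);
        ages.put("bob", 30);   // duplicate value: allowed
        ages.put("alice", 31); // duplicate key: replaces the earlier entry
        return ages;
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = sample();
        System.out.println(ages.size());             // 2
        System.out.println(ages.containsKey("bob")); // true
        System.out.println(ages.containsValue(30));  // true (bob)
        System.out.println(ages.get("alice"));       // 31
        System.out.println(ages.get("carol"));       // null: no such key
    }
}
```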

D.

dcminter at 2007-7-3 20:11:31
# 8

Ah, a sensible (yet defensive) reply!

>The phrase "syntactic sugar" is designed for that sort of thing. They offer little or

>nothing that isn't either already available from the library data structures, or which

>isn't coming in 1.5

You're right, they don't offer any new functionality as such, but that wasn't the point.

Let me try to explain what I was attempting to achieve:

The idea is to try to present these structures as something that is not "just another library", but that are fundamental aspects of modern high-level programming. By making them part of the language I am trying to derive a more generic programming style whereby it becomes as natural to use this syntax as it does to use, say, array "[]" syntax. Java has already taken a small step down this road by, for example, associating length with arrays and using garbage collection. This is nothing that couldn't be done in C++ for example, but they have recognised the usefulness of those features and they have become integrated into the language.

Do you see the kind of thing I am aiming for?

>Java developers tend to be conservative about the core language. I doubt you will ever

>see operator overloading in Java, and your suggestions are tantamount to that.

Well, yes and no - the operators are overloaded, but it's restricted to those structures only. You could argue that it's confusing, but I think it adds a certain amount of "mathematical grace" (if that's the right phrase) to code. Personal preference though, I guess. I wouldn't mind seeing operator overloading in Java... They are rightly defensive, but that shouldn't restrict an open mind!

>In point of fact, your suggestions mainly imply that you don't know Java very well, or

>at the very minimum that you don't know the collection classes very well. Try a little

>humility - until you know the language REALLY well, don't think you know how to improve

>it.

I know Java extremely well, and the collection classes fairly well actually. I have written large sections of a KVM and an entire CLDC library set to go with it. I have worked extensively with STL in C++, and written many proprietary STL-style template libraries (let's face it, it's all the same, but in different packaging). I appreciate the Collections in Java - they are extremely well designed and work very well. With the current lack of generics and auto boxing they're a pain to use, but version 1.5 will be very good I'm sure. It was only after spending many years doing this kind of work that what I am suggesting began to make itself apparent to me.

>I wouldn't presume to dictate how the Perl crowd should develop their language.

I'm not sure what you're suggesting, but I can't say I've ever really used PERL, I'm making suggestions as a Java programmer. And I wasn't "dictating", I was "suggesting ideas" because I thought that was the whole point of this forum?!

>[Your X vs Y stuff]

Yes, you can write the same thing as you did. I was just trying to make the syntax appear more "streamlined" and "mathematical". I personally find that the constructs are more aesthetically pleasing with some "operator overloading". I guess you Java people don't like it though.

Have you ever used Haskell or similar? It's quite a "beautiful" language (for want of a better word) - can you see the similarities, and the kind of thing I am aiming for?

>A vector is a set.

>A map is a set.

>A [int->T] map is a vector (where T is some type).

>

>Ahem, but they aren't. They're all Collections. Which is why (with generics & >autoboxing):

Yes, ok I lost the plot a bit there (it must have been Friday), they're collections, not sets, and [int->T] is not necessarily a vector either!

I appreciate your thoughts on maps/collections/sets - I'm not sure I agree with all of it, but these are details that could be discussed later.

Let me try to illustrate my point once more:

I'm not trying to say that my stuff is better than the Java collections or that they are not adequate. I am trying to say that those structures, namely vector, map and set, have become fundamental to modern high-level programming, so much so, that it is worth making them part of the language. The reasons for doing this are:

To develop a new universal syntax for working with these structures that is streamlined, compact, aesthetically pleasing and intuitive, through the use of more symbolic syntax rather than function calls. The idea being to promote a universal, clean, almost mathematical approach to using these data structures.

To get away from the feeling that these are "libraries", but to start to realise that these are the new fundamental tools of programming, such as garbage collection has become in Java.

To allow optimisations at the VM level for manipulating these structures - you could even possibly envisage hardware (CPU) support in the future.

I can understand why you all immediately reject the idea, but can you kind of appreciate where I'm coming from?

As an aside, here's a snippet of Haskell code for those who don't know it:

-- Quicksort in Haskell

qsort [] = []

qsort (x:xs) = qsort elts_lt_x ++ [x] ++ qsort elts_greq_x

  where

    elts_lt_x = [y | y <- xs, y < x]

    elts_greq_x = [y | y <- xs, y >= x]

It has (I think) a real mathematical cleanliness and "beauty" to it. FYI: The function is concatenating "++" linked lists "[]" and working with lists almost like mathematical sets: [y | y <- xs, y < x] "y such that y is a member of xs and y is less than x".
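
For contrast, a sketch of the same algorithm written against the Java collection classes with generics. This is only one way to transliterate it; the class name QsortSketch is hypothetical, and the lt and greq lists stand in for elts_lt_x and elts_greq_x.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, for illustration only.
class QsortSketch {

    // Same algorithm as the Haskell version: partition the tail around the head.
    static List<Integer> qsort(List<Integer> xs) {
        if (xs.isEmpty()) {
            return new ArrayList<Integer>();
        }
        int x = xs.get(0);
        List<Integer> lt = new ArrayList<Integer>();   // elts_lt_x
        List<Integer> greq = new ArrayList<Integer>(); // elts_greq_x
        for (int y : xs.subList(1, xs.size())) {
            if (y < x) {
                lt.add(y);
            } else {
                greq.add(y);
            }
        }
        List<Integer> result = qsort(lt); // qsort elts_lt_x
        result.add(x);                    // ++ [x]
        result.addAll(qsort(greq));       // ++ qsort elts_greq_x
        return result;
    }

    public static void main(String[] args) {
        System.out.println(qsort(Arrays.asList(3, 1, 4, 1, 5, 9, 2, 6)));
        // prints [1, 1, 2, 3, 4, 5, 6, 9]
    }
}
```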

It's that kind of thing I'm suggesting...

>Well, that and the fact that the JCP forum's not exactly a well travelled area...

True.

someideas at 2007-7-3 20:11:31
# 9

> Ah, a sensible (yet defensive) reply!

I always try to defend Java from that sort of corruption.

> You're right, they don't offer any new functionality

> as such, but that wasn't the point.

That wasn't your point, certainly. But it is very much mine. I don't want Java to end up like C++ because if I wanted that I'd just use C++.

> By making them part of the language I am

> trying to derive a more generic programming style

> whereby it becomes as natural to use this syntax as it

> does to use, say array "[]" syntax.

With respect, I consider the collection classes to be extremely intuitive and essentially lacking only the features that are already coming in 1.5.

I don't want a lot of "special" syntax for "special" data structures to mess up otherwise quite readable code.

> Do you see the kind of thing I am aiming for?

I see exactly what you're aiming for, and I dislike it. This is not a matter of "right or wrong", it's a matter of aesthetics and frankly I think your taste sucks.

> You could

> argue that it's confusing, but I think it adds a

> certain amount of "mathematical grace"

Yes, I think it's confusing, and you can keep your mathematical grace. Everyone and his brother seems to have a problem domain where they feel that "special" syntax would be helpful. You either reject the vast majority of them, or you offer operator overloading.

The new syntax with 1.5 is quite enough for the foreseeable future, thank you very much.

> I wouldn't mind seeing operator overloading in

> Java...

Ah, there you go. My guess is that people who like the idea of operator overloading will be somewhat inclined to support your sort of addition to the language, and those like myself who think it's intrinsically evil to consider adding it will be against. I won't bother to argue the point here (do a quick search and you'll find exhaustive diatribes in either direction) but thankfully Sun seem to be on the side of the angels with this one.

> I know Java extremely well, and the collection classes

> fairly well actually.

Well, I can't say it showed in your discussion.

> I'm not sure what you're suggesting, but I can't say

> I've ever really used PERL

Ok, I've been subjected to it a few times. I'm not especially inclined to learn more, but I don't know if that's the result of a crappy language, or just crappy perl programmers. I suspect the latter.

> a Java programmer. And I wasn't "dictating", I was

> "suggesting ideas" because I thought that was the

> whole point of this forum?!

Well, "orating" if it makes you happier.

> Yes, you can write the same thing as you did. I was

> just trying to make the syntax appear more

> "streamlined" and "mathematical".

Really ? I thought it was a bit pants, personally. Basically not as clear. I'm not even sure it's syntactically unambiguous given the operators you chose.

> <snip op. overloading> I guess you Java people

> don't like it though.

No, I don't like it. Many Java people find it distasteful. Some are of your opinion.

> Have you ever used Haskell or similar?

Java is not Haskell. Java is not C++. If you want to use Haskell, or C++ feel free (I like C++ as it happens, and no, I've never used Haskell), but please refrain from trying to turn Java into either of those.

> Yes, ok I lost the plot a bit there (it must have been

> Friday),

Fair enough. We all have low-caffeine days.

> I appreciate your thoughts on maps/collections/sets -

> I'm not sure I agree with all of it, but these are

> details that could be discussed later.

I don't think so. In the unlikely event that you talked me around to your way of thinking, I would demand that they worked as the collection classes do. And since I use the collection classes every day, I'm a massive fan of their design. A rare A+ for Sun there.

> To get away from the feeling that these are

> "libraries", but to start to realise that these are

> the new fundamental tools of programming, such as

> garbage collection has become in Java.

This just in - the java.* libraries are a fundamental part of the language. Generics justify additions to the syntax because they cannot be achieved by adding new libraries.

> To allow optimisations at the VM level for

> manipulating these structures - you could even

> possibly envisage hardware (CPU) support in the

> future.

In principle that's achievable through JNI calls in the libraries. And about as likely. Java is multi-platform, remember ?

> I can understand why you all immediately reject the

> idea, but can you kind of appreciate where I'm coming

> from?

Yes. You're coming from the same place that the operator overloading people come from, I know the way there, and I'll go back if I find a need.

> qsort []= []

> qsort (x:xs) = qsort elts_lt_x ++ [x] ++ qsort

> elts_greq_x

> where

> elts_lt_x= [y | y <- xs, y < x]

> elts_greq_x = [y | y <- xs, y >= x]

I'm sorry, were you under the impression that this supported your argument ? Guess what, lots and lots of Java programmers aren't looking for mathematical cleanliness and "beauty", they're happy with plain old clarity of expression.

And lest that all seem too downbeat, no, I'm not objecting to you suggesting new features and approaches for Java, thanks for trying. I just happen to think this particular one sucks (big time).

Dave.

dcminter at 2007-7-3 20:11:31
# 10

Well, there's no convincing you is there?!

Just a couple of points:

Firstly, I wasn't suggesting complete use of operator overloading throughout the language (which you obviously hate). Java actually already overloads the + and += operators for Strings. See, you've already got "special syntax for special data structures"! You may think it "sucks", but it's already there! Do you hate this too? I personally think it is a useful syntax. I am only suggesting something similar for these other structures. For example if you overloaded [] for Vectors, I think that would add a certain amount of clarity over using elementAt(), without adding ambiguity. And why couldn't you use + and += for Vectors if you can use them for Strings, when a String is essentially a Vector<char>? These are the type of principles I am talking about.

Well, I'm sure you'll disagree, but I don't think what I'm suggesting is quite so "specialised" as you suggest, rather a natural progression.

Of course I expected the "stop trying to turn Java into C++/Haskell" / "why don't you use C++ then" -type comments. But that's just being argumentative. I'm not suggesting that one is turned into the other. I'm suggesting that there is syntax / syntactic style in other languages that may also benefit Java.

I also wanted to make clear that the syntax I suggested was an example of the type of thing that could be done - it probably is grammatically ambiguous, but I am more interested in the principles rather than the exact grammar at this stage.

I know it's a huge change to the language, and to be honest I don't for one second expect anyone to implement it (well, you never know) - I was really more interested in what people thought about it. Did people think it was a natural progression for these structures to become integral to a language? We have already seen this in garbage collection, partly in Strings; is this what will come next? I think it is, probably not as part of Java though judging by the response from this forum!

You should have a go with Haskell btw - it's well worth learning - I found it really opened my eyes to a few things.

Thanks for your comments anyway - I think we should just agree to differ.

someideas at 2007-7-3 20:11:31
# 11

> Well, there's no convincing you is there?!

Not on this subject, I suspect. Generally - yes. For example, I loathed templates in C++ because while they're immensely powerful, they were horrible to debug and I just don't like stuff that essentially generates new code at compile time very much (largely because of debugging issues). Contrarily I delighted in operator overloading in C++ because I consider it to be in the spirit of the language. When used wrong, it's a nightmare, but then that's true of pointers (for example).

Whereas my position in Java is more or less reversed; I think that Generics are a brilliantly conceived addition to Java and they avoid most of the pitfalls of templates. And I think that operator overloading was left out of Java for all the right reasons.

Here's a tip - implementations may convince me. I was hostile to generics until I saw the excellent job that the JCP had done of it. So now I'm a fan. If you don't want to implement changes yourself, don't expect me to be supportive.

> Firstly, I wasn't suggesting complete use of operator overloading throughout the language ...

No, I understand that. But my attitude is that if you add operator overloading for a few special cases now, you will make it easier to add special cases in the future, and to add full overloading later on. So you will find it virtually impossible to persuade me of the benefits of "special case" overloading.

As to the String thing ? Yes, I realise that we have limited operator overloading with Strings. It's just barely useful enough to justify its inclusion, but I wouldn't be sad to see it go. It causes enough trouble already and I don't want any more.

> You should have a go with Haskell btw - it's well worth learning

I've mentally added it to my pile of things to look at. I'm not hostile to new languages. Sadly my "copious free time" is already rather limited.

I find your analogy with garbage collection in the language rather confusing, incidentally - sure, it's a feature of the language, but it removes the need for syntax rather than adding to it. Where you invoke it explicitly, (a) you do it via a normal method call - System.gc(); and (b) it's generally a mistake anyway ! I think perhaps I'm misunderstanding what you mean wrt gc because I really don't follow your argument... ?

The mistake I think you make is that "my domain is the whole world", which is a very understandable stance - but fundamentally against the Java philosophy. Specialized collection manipulations are of use to some people (hence the collection classes) but not to everybody. Those who don't want them can ignore them. This is much less true of a syntactic extension.

I will campaign to keep the core Java language small. I will enthuse about enlargement of the standard and extension libraries, because where possible that is the "right" way to do it.

If you're feeling the pain of writing something in Java, then usually you're either doing it wrong, or Java's just not the right language. It doesn't (and shouldn't) do everything, you know.

Dave.

dcminter at 2007-7-3 20:11:31
# 12

> Did people think it was a natural progression for these structures to become integral to a language?

Nope. There are languages which stress this sort of tool - Perl is an example (hence the assumptions that you were a Perl programmer). Java tends to push the libraries in preference to language extensions.

> We have already seen this in garbage collection, partly in Strings; is this what will come next?

As I say I really don't think I understand your point about garbage collections. Strings are hardly a recent addition, and arguably the operator overloading they introduce is a mistake. Certainly a lot of code performs poorly as a result of naive users misunderstanding the behaviour of the + operator.
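
The usual form of that performance trap is string concatenation inside a loop. A minimal sketch, with made-up class and method names: the first method creates a new String on every pass, while the second appends into one buffer. StringBuilder is the unsynchronized counterpart of StringBuffer added in 1.5; on older versions StringBuffer would be used the same way.

```java
// Hypothetical class and method names, for illustration only.
class ConcatSketch {

    // Naive version: every += builds a brand new String, copying all
    // the characters accumulated so far.
    static String joinWithPlus(String[] words) {
        String result = "";
        for (String w : words) {
            result += w;
        }
        return result;
    }

    // Appends into one growable buffer and creates the String once at the end.
    static String joinWithBuilder(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) {
            sb.append(w);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] words = {"a", "b", "c"};
        System.out.println(joinWithPlus(words));    // abc
        System.out.println(joinWithBuilder(words)); // abc
    }
}
```

Both methods return the same text; the difference is only in how many intermediate objects are created along the way.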

> I think it is, probably not as part of Java though judging by the response from this forum!

I don't think it's something that will get retrofitted to many existing languages, since most of the powerful ones will allow it to be bolted on via libraries or (shock horror) operator overloading. New ones may well choose to incorporate it depending upon how inclusive their philosophy of syntax is.

Dave.

dcminter at 2007-7-3 20:11:31
# 13

>Nope. There are languages which stress this sort of tool - Perl

>is an example (hence the assumptions that you were a Perl

>programmer). Java tends to push the libraries in preference to

>language extensions.

Yes, but from what I can remember about Perl, it's not OO'ed, I think it only provides vector-type dynamic structures (not maps/sets) and is it weakly typed? It's verging on being a scripting language really. And it's interpreted too isn't it?

However if you look at a language like Haskell for example, it is based on linked lists, maps and [their version of] structures (amongst others), whilst being strongly typed, garbage collected, supports lazy evaluation and is in general very powerful.

Just because a language might support dynamic structures, doesn't mean it has to be like Perl.

>We have already seen this in garbage collection, partly in

>Strings; is this what will come next?

>

>As I say I really don't think I understand your point about

>garbage collections. Strings are hardly a recent addition, and

>arguably the operator overloading they introduce is a mistake.

>Certainly a lot of code performs poorly as a result of naive

>users misunderstanding the behaviour of the + operator.

My point is: Java is essentially a C++ derivative. If you wanted garbage collection in C++ you would have to code some libs for it. In Java though, it was recognised as something that was so useful that it is now integrated as part of the language. There would probably have been a time when this would have been ridiculed because it wasn't efficient enough, but it is now becoming a standard way to handle memory.

Same with Strings, which were always char arrays in the past, now have a certain amount of language integration (through overloaded operators) because they are so useful.

I'm drawing a parallel with vectors, maps and sets, where I'm saying they are so useful and fundamental they could become integral parts of the language too.

I do understand your point about operator overloading. By allowing the user to define their own operators there is the potential to write very confusing code, but if operators are restricted to a particular subset of types, eg. strings, vectors, maps and sets, then I think they can remain intuitive without adding confusion.

For example:

System.out.print("Name: "+name+" Age: "+age);

or:

StringBuffer s = new StringBuffer("Name: ");

s.append(name);

s.append(" Age: ");

s.append(Integer.toString(age));

System.out.print(s.toString());

[Would you do it like that 'ish?]

The first example you could argue appears to be more ambiguous due to overloading, but you could also argue that even though the operators are overloaded, it is very obvious what is happening and the code is significantly easier to read (and write).

My proposal is to allow similar operations on vectors, maps and sets. The idea partly being that due to the large amount of use of these structures, the extra language support would streamline a lot of code in a similar way to the above example.

>I don't think it's something that will get retrofitted to many

>existing languages, since most of the powerful ones will allow

>it to be bolted on via libraries or (shock horror) operator

>overloading. New ones may well choose to incorporate it

>depending upon how inclusive their philosophy of syntax is.

I agree.

someideas at 2007-7-3 20:11:31
# 14

I don't think it's quite fair to say that Java is a C++ derivative.

Take a look at this, and note just how many languages Java draws its design from:-

http://www.oreilly.com/news/graphics/prog_lang_poster.pdf

(As a point of interest, that inspires me to learn some Ruby - mongrel ancestry seems to be a Good Thing).

You could as easily say that it "added" curly braces to Ada, or what have you. So I don't think your point about gc is really valid - adding gc was a design decision certainly, but so was every other aspect of Java. As far as I know Java has had it since it was Oak; certainly it has had it for as long as I have been using it.

While it is "part of the language" in a sense, it is no more so than its absence is in C. To argue that we should consider adopting a feature because a language has some other feature strikes me as eccentric. If you argue that garbage collection adds to the complexity of the syntax then I fail entirely to understand you. Your proposals add to the complexity of the syntax - if you consider that to be in doubt, then we're really not talking the same language !

> Same with Strings, which were always char arrays in the past, now have a certain amount of language integration (through overloaded operators) because they are so useful.

Sure - but they're not just useful, they're universally useful. This is where I think you miss my point again. You can't really learn Java without learning to use Strings. Almost all "real" applications make use of them. It's therefore justifiable to add a bit of sugar to make their use pleasanter. Even so, it adds enough confusion to be dangerous - watch how many novices fail to comprehend the difference between == and equals() for strings, largely because the + operator has misled them.
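
A short sketch of the == versus equals() confusion. The class and method names are invented for illustration. The concatenation happens at run time, so the result is a newly created String object with the same characters as the literal.

```java
// Hypothetical class and method names, for illustration only.
class StringEqualitySketch {

    static String build(String prefix) {
        return prefix + "lo"; // concatenated at run time: a new String object
    }

    public static void main(String[] args) {
        String literal = "hello";
        String built = build("hel");

        System.out.println(built.equals(literal)); // true: same characters
        System.out.println(built == literal);      // false: different objects
    }
}
```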

Taking your example:

System.out.print("Name: "+name+" Age: "+age);

If I were to use a StringBuffer, I'd do:

StringBuffer sb = new StringBuffer("Name: ");

sb.append(name);

sb.append(" Age: ");

sb.append(age);

System.out.println(sb);

And yes, I would have no particular complaints if that was the only way to build up a string, because I don't find it particularly painful. Perhaps I'm just odd that way. For better or worse we're stuck with the + operator for strings, however, and I acknowledge that I'll have to make the best of it. Sometimes it's convenient.

> The idea partly being that due to the large amount of use of these structures, the extra language support would streamline a lot of code in a similar way to the above example.

Yes, but use of those structures is not universal, and where use is made, it is usually not in the mathematical sense that you propose - on the contrary, most of us use them as slightly smarter arrays.

How do you propose to permit the user to dictate the implementation type of the structures in question ? If you take that ability away, you'll have rendered them useless for many users; arguably as many as would benefit. If you keep the ability, I doubt you will come up with a syntax for doing so that offers any real attraction over the existing one.
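
To make the point about implementation choice concrete, here is a small sketch in which the same code runs against three standard Set implementations and only the constructor call changes. The class and method names are hypothetical; TreeSet, LinkedHashSet and HashSet are the standard java.util classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class name, for illustration only.
class ImplementationChoiceSketch {

    // Fills whichever Set implementation the caller picked and
    // returns its elements in iteration order.
    static List<Integer> fill(Set<Integer> target) {
        target.addAll(Arrays.asList(3, 1, 2, 3));
        return new ArrayList<Integer>(target);
    }

    public static void main(String[] args) {
        System.out.println(fill(new TreeSet<Integer>()));       // [1, 2, 3]: sorted
        System.out.println(fill(new LinkedHashSet<Integer>())); // [3, 1, 2]: insertion order
        System.out.println(fill(new HashSet<Integer>()));       // iteration order unspecified
    }
}
```

A built-in {int} type would have to either fix one of these behaviours or offer some other way to choose between them.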

> However if you look at a language like Haskell for example, it is based on linked lists, maps and [their version of] structures (amongst others), whilst being strongly typed, garbage collected, supports lazy evaluation and is in general very powerful.

Great. Why are you using Java then, if Haskell is the bees knees ? What's it BAD at in comparison to Java ? Whatever those things are, you might like to consider that it could be related to the sort of complexity of syntax that those features add.

Dave.

dcminter at 2007-7-3 20:11:32
# 15

Since we're both arguing from a position of ignorance, I'll drop the discussion of Perl (which is only tangentially relevant anyway) - but I'd like to offer up a couple of points for clarification by the Perl-aware:

1. I believe Perl offers OO constructs.

2. I believe Perl does offer maps and sets (although I think they use slightly different terminology)

3. I think Perl is weakly typed, but may have optionally stronger typing (?)

4. I think Perl is interpreted but with a move to a bytecode planned (?)

Anyone want to correct me on the above ?

Dave.

dcminter at 2007-7-3 20:11:35
# 16

> I don't think it's quite fair to say that Java is a C++

> derivative.

Sure, I'm just trying to illustrate that Java has more "built-in" features than C++ or other older languages, or that it's less "raw".

My point about gc was that its complex function has been transparently added to the language without the need for libraries. I was drawing a parallel by implying that collections could be transparently added to the language so that libraries were not needed for those either. (I agree gc makes the syntax more simple btw!)

I realise my proposal makes the "grammar" more complex, but I think it makes the code appear less complex (as in the String example).

So I'm saying gc makes the syntax more simple. If I could add vectors together with + and access elements with [], although the grammar might be more complicated, the code would appear more simplified too. (Personal preference of course)

> Sure - but they're not just useful, they're universally useful.

This is a really important point. The reason I suggested all this in the first place is because **I actually consider these structures to be universally useful**. This is really the whole point of my proposal. I find that I use these structures so often that they are universally useful - hence the reason I was trying to suggest a sort of universal [mathematical] syntax. I can't emphasise enough that this is the whole crux of my proposal.

See, I don't see them just as smart arrays. In virtually all complex data structures there is the need for variable size arrays, maps and sets. I find I can describe almost all if not all problems with these representations. To me they have become as universally useful as arrays or strings.

> Perhaps I'm just odd that way.

Well maybe you are a little bit too much anti-"+" !!

Haskell is a functional language as opposed to iterative, which has the consequence that certain iterative-type tasks are hard to code. For example you couldn't effectively program graphics applications with Haskell. It's excellent for tasks such as artificial intelligence, and searching large search spaces due to lazy evaluation. This is not down to the syntax, but the fact that it's functional - have a look at the Haskell web site if you want a quick description. It's *really* good for improving your design skills btw.

someideas at 2007-7-3 20:11:35
# 17

I now follow your argument about the GC, but I think that it is wrong, since this is a function of the JVM, not the Java language. The language just omits the library components (malloc) that are required in non-gced languages like C.

> This is a really important point. The reason I suggested all this in the first place is because **I actually consider these structures to be universally useful**. This is really the whole point of my proposal. I find that I use these structures so often that they are universally useful - hence the reason I was trying to suggest a sort of universal [mathematical] syntax. I can't emphasise enough that this is the whole crux of my proposal.

The sentence in bold is the one I take issue with. You may find it applies to all the problems that you encounter, but that's not "universally useful" - it's universally useful if it applies to all the problems that everyone encounters ! I have no trouble believing that you use these structures all the time, that you do so in "mathematical" ways, and that you would benefit enormously from the proposed changes. So what ?

I contend that most people do use Lists as slightly smarter arrays, and that they do pretty basic things with Sets and Maps as well. Things that the collection classes - as part of the library - already do very well and very intuitively, since it's the same syntax as the rest of the system uses.

Most people don't write clever AI routines.

So I think the crux of your argument rests on the assumption that people want to, or generally would, use these data structures in the sort of way you're suggesting. Rather than in the sort of way that they already do use them. Which I think is a very specious argument indeed.

Dave.

dcminter at 2007-7-3 20:11:35
# 18

>If I could add vectors together with + and access elements with [], although the grammar might be more

> complicated, the code would appear more simplified too.

Until you read someone else's code, where +(vector, vector) has been overloaded to mean the other kind of addition from the one you use (I don't know whether you mean adding two vectors to give their concatenation, or to map the + operator over their elements).
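The two readings can be written out with the existing library. This is only a sketch; VectorPlus, concat and elementwise are hypothetical names, and the element-wise version assumes both lists have the same length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class VectorPlus {
    // Reading 1: "+" means concatenation.
    static List<Integer> concat(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<Integer>(a);
        out.addAll(b);
        return out;
    }

    // Reading 2: "+" is applied element by element (assumes equal lengths).
    static List<Integer> elementwise(List<Integer> a, List<Integer> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("length mismatch");
        }
        List<Integer> out = new ArrayList<Integer>(a.size());
        for (int i = 0; i < a.size(); i++) {
            out.add(a.get(i) + b.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2);
        List<Integer> b = Arrays.asList(3, 4);
        System.out.println(concat(a, b));      // [1, 2, 3, 4]
        System.out.println(elementwise(a, b)); // [4, 6]
    }
}
```

Both are reasonable meanings for "a + b", and the method names are what currently tell the reader which one is intended.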

> This is a really important point. The reason I suggested all this in the first place is because **I

> actually consider these structures to be universally useful**.

No-one's arguing about the structures: they already exist in the libraries.

>This is really the whole point of my proposal. I find that I use these structures so often

> that they are universally useful - hence the reason I was trying to suggest a sort of universal

> [mathematical] syntax. I can't emphasise enough that this is the whole crux of my proposal.

Are you using an editor with auto complete? I've worked in plenty of languages, but Java tends to be verbose so a/c is really essential. If you work with it for a while, you forget about it. It's better than ambiguity.

> For example you couldn't effectively program graphics applications with Haskell.

You can; use a monad for the state of the GUI device, a declarative representation of the graphics to display, and having lazy functions for event handling is quite useful. There are papers at http://www.galois.com/~antony/ on GUI development using Haskell and SVG.

Pete

PeteKirkham at 2007-7-3 20:11:35
# 19

> I now follow your argument about the GC, but I think

> that it is wrong, since this is a function of the JVM,

> not the Java language. The language just omits the

> library components (malloc) that are required in

> non-gced languages like C.

Well, you can split hairs if you want - my point is that something that would always have been done with libs in the past is now integrated into the language/VM. I accept that in terms of syntax, it has reduced complexity, but that isn't my point. Incidentally, it has affected the language, because now there is no delete operator.

> The sentence in bold is the one I take issue with. You

> may find it applies to all the problems that you

> encounter, but that's not "universally useful" - it's

> universally useful if it applies to all the problems

> that everyone encounters ! I have no trouble

> believing that you use these structures all the time,

> that you do so in "mathematical" ways, and that you

> would benefit enormously from the proposed changes. So

> what ?

Ok, when I say "I" I guess I actually mean "one". I've done hundreds of different designs in my time and I find that these structures are just useful across the board (I'm not just talking about AI btw). Take vectors for example, how often do you require that functionality? Unless you're doing some Noddy design, they just crop up *everywhere*, same with maps and sets, although slightly less than vectors.

> So I think the crux of your argument rests on the

> assumption that people want to, or generally would,

> use these data structures in the sort of way you're

> suggesting. Rather than in the sort of way that they

> already do use them. Which I think is a very specious

> argument indeed.

No, not quite. The crux of my argument is that people use them *so often* that they deserve a (pardon the description) "universal mathematical syntax", which would serve not only to streamline code, but also to promote a universal syntax, which in turn promotes ease of code understanding.

In all fairness, looking at other people's code I do find that people actually *don't* use them that often! But, before you laugh at my contradiction (!) when I look at their code, what I find is a lot of complex code, that is simply doing exactly what a vector/map/set could do. It almost makes me laugh really how many times people have written their own [substandard] vector and map libraries! I also feel that integrating these structures into part of the language would actually promote their use and guide people in the direction of writing universally comprehensible code. It would also promote a universal cross-language syntax, which I believe is a good thing.

Let me just re-iterate that I'm not in any way complaining about the current Collection classes, I am attempting to promote a universal mathematical-type syntax for what I consider to be universally useful data structures with the idea of promoting streamlined universally comprehensible code.

someideas at 2007-7-3 20:11:35
# 20

> Until you read someone else's code, who has overloaded

> +(vector, vector) to do the other add operator to the

> one you use (as I don't know whether you mean the

> output of adding two vectors to be their concatenation

> or a mapping of the + operator to their elements).

Yes, but if you read one of my earlier posts, I made exactly that comment. I am saying that these operators should be restricted to the structures in question (as they currently are in Strings), in which case there would be no confusion because this would be a universal syntax, not a user-definable one.

What I'm suggesting is pretty similar to the overloaded + operator for Strings, and nobody (except maybe Dave) complains about that - in fact I would expect most people would think it was pretty useful.

> Are you using an editor with auto complete? I've

> worked in plenty of languages, but Java tends to be

> verbose so a/c is really essential. If you work with

> it for a while, you forget about it. It's better than

> ambiguity.

Yeah, they're useful, but I'm also talking about the readability/comprehensibility of code, not just the time it takes to write it. I refer you to the String example again.

I guess we just disagree on the ambiguity thing. I understand your reasoning, but I also think that most people are capable of easily identifying what an overloaded operator does. It doesn't have to be spelled out to them. I mean do you ever sit there thinking does "cout << blah" mean shift cout left by blah? No, it's obvious, and it's probably easier to read than the long-hand version.

> You can; use a monad for the state of the GUI device,

> a declarative representation of the graphics to

> display, and having lazy functions for event handling

> is quite useful. There's papers at

> http://www.galois.com/~antony/ on GUI development

> using Haskell and SVG.

Yeah, you can do it, but it's nowhere near as easy as using an iterative language, and the performance would be very poor. I was actually thinking of primitive rendering operations btw, eg. drawing triangles/texture mapping etc., not just using some graphics library. Anyway, it's all probably about as relevant as the Perl stuff!

someideas at 2007-7-3 20:11:35
# 21

> I accept that in terms of syntax, it has reduced complexity, but that isn't my point.

No, but I'm prepared to accept changes (and this isn't even a change) to the language that actively remove syntax much more readily than those which add to it !

> It almost makes me laugh really how many times people have written their own [substandard] vector and map libraries!

I don't think that's a great reason to provide them as part of the language. I doubt that the code in question "...that is simply doing exactly what a vector/map/set could do" would be hard to write using the existing libraries, so I don't see the need for syntactic extensions. If the user doesn't bother to use the libraries that's just tough. What makes you think those users will bother to learn the special set/map/vector syntax ? There are plenty who don't understand the conditional operator !

> I am attempting to promote a universal mathematical-type syntax for ...

And I contend that as Java is not a language primarily used by mathematicians, there is no call for it. I don't think either of us are going to budge from our established positions on this subject, but an enjoyable discussion,

Dave.

dcminter at 2007-7-3 20:11:35
# 22

> nobody (except maybe Dave) complains about that - in fact I would expect most people would think it was pretty useful.

I see the usefulness of it, but I also see abominations born of it where users fail to understand the underlying mechanism that is the result of its use. A mechanism which is really incredibly simple.

Dave.

dcminter at 2007-7-3 20:11:35
# 23

> It doesn't have to be spelled out to them. I mean do you ever sit there thinking does "cout << blah" mean shift cout left by blah? No, it's obvious

How many new users of Java could spell out to you exactly what the string plus operator "does". They can tell you the result (roughly), but not the mechanism.

I worked with an awful string library that had over-ridden the pointer operator in C++. I forget the details, thank god, but it was a nightmare to debug because it always looked like it was doing something completely other than what it was.

So no, I don't think that "it's just obvious" is a safe assumption.
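For anyone wondering what that mechanism is, here is a rough sketch. As far as I know, javac from Java 5 to 8 turned a string + expression into a chain of StringBuilder appends (StringBuffer before that), and newer compilers may use a different strategy, so treat spelledOut below as an approximation rather than the exact generated code. The class name and the sample values are made up.

```java
class StringPlus {
    // What people write.
    static String withPlus(String name, int age) {
        return "Name: " + name + " Age: " + age;
    }

    // Approximately what the compiler has traditionally produced for the line above:
    // one builder, one append per operand, one final toString().
    static String spelledOut(String name, int age) {
        return new StringBuilder()
                .append("Name: ")
                .append(name)
                .append(" Age: ")
                .append(age)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("Dave", 30));   // Name: Dave Age: 30
        System.out.println(spelledOut("Dave", 30)); // Name: Dave Age: 30
    }
}
```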

Dave.

dcminter at 2007-7-3 20:11:35
# 24

> I mean do you ever sit there thinking does "cout << blah" mean shift cout left by blah?

Not for cout, but for writing packed binary data to a stream it is confusing.

r = s << ((b1 << 4) | b0);

> Yeah, you can do it, but it's nowhere near as easy as using an iterative language, and the performance would be very poor.

Knowing people who write Haskell compilers for a living, I know better than to make speculative remarks as to the performance of their wares.

> I was actually thinking of primitive rendering operations btw, eg. drawing triangles/texture mapping etc., not just using some graphics library.

Since when was texture mapping a primitive rendering operation? You need to go through a library equivalent to GL anyway. A lot of modern hardware has driver libraries that operate in terms of shape primitives similar to svg; it's generally possible to be efficient.

> Anyway, it's all probably about as relevant as the Perl stuff!

No, you can compile Haskell to JVM code; you cannot compile Perl. If you want a pure functional language for the JVM, people are using that.

Pete

PeteKirkham at 2007-7-3 20:11:35
# 25

> I don't think that's a great reason to provide them as part of the language. I doubt that the code in question "...that is simply doing exactly what a vector/map/set could do" would be hard to write using the existing libraries, so I don't see the need for syntactic extensions. If the user doesn't bother to use the libraries that's just tough. What makes you think those users will bother to learn the special set/map/vector syntax ? There are plenty who don't understand the conditional operator !

True!

I guess I am saying that if one could do something "along the lines of":

[char] v = new [char]; // If [...] were representing a "built-in vector".

[char] w = new [char];

v += 'a';

w += 'b';

w += 'c';

v = v+w;// Or "v += w"

for(int i=0;i<v.length;i++) System.out.print(v[i]); // abc

...then similarity with array syntax, ease of use, and without the need to delve into libraries may promote the use of these structures where previously a user may have been more inclined to use "char[MAX_CHARS]" sort of thing, which is generally not very future proof.

Neither of us are going to budge are we?! You don't like it, but I think it's easier to read and more streamlined. Oh well!

> > I am attempting to promote a universal mathematical-type syntax for ...

> And I contend that as Java is not a language primarily used by mathematicians, there is no call for it. I don't think either of us are going to budge from our established positions on this subject, but an enjoyable discussion,

Just a quick note: I'm using the term "mathematical" not really from the point of view of a mathematician, but simply to describe a more "symbolic" syntax, as opposed to the current "textual" method calls.

Interesting discussion, like you say.

Pete,

> Not for cout, but for writing packed binary data to a stream it is confusing.

> r = s << ((b1 << 4) | b0);

I agree, you can think of some obscure cases, but I still maintain that 99% of the time it's not a problem. If you put your variables in context with their types:

ostream s, r;

int b1, b0;

r = s << ((b1 << 4) | b0);

it's pretty obvious what's going on.

Personal preference though, of course.

Quick aside, because it's not really relevant:

> Knowing people who write Haskell compilers for a living, I know better than to make speculative remarks as to the performance of their wares.

I'm sure their compilers are excellent, however because Haskell is a functional language, iterating over large sequences of data is inherently slow as each iteration requires a function call, it's just a feature of the language.

> Since when was texture mapping a primitive rendering operation? You need to go through a library equivalent to GL anyway. A lot of modern hardware has driver libraries that operate in terms of shape primitives similar to svg; it's generally possible to be efficient.

Er, since the dawn of time. Believe it or not, before OpenGL and DirectX, people actually wrote their own texture mapping routines! Or, if OpenGL isn't available, such as in an embedded environment, it may be necessary to write your own primitive renderers (as I've just had to do). You can't write an effective renderer in a functional language for the reason above. It's too slow to iterate over a few thousand/million pixels with recursive fn calls.

> No, you can compile Haskell to JVM code; you cannot compile Perl. If you want a pure functional language for the JVM, people are using that.

That's quite interesting actually. I don't want one however - I was simply using Haskell to illustrate particular types of syntax and functionality.

someideas at 2007-7-3 20:11:35
# 26

True!

I guess I am saying that if one could do something

"along the lines of":

[char] v = new [char]; // If [...] were representing a "built-in vector".

[char] w = new [char];

v += 'a';

w += 'b';

w += 'c';

v = v+w;// Or "v += w"

for(int j=0;j<v.length;j++) System.out.print(v[j]); //

// A quibble, but the following allows me to specify the implementation

// I wish to use, which your suggested syntax does not.

List<Character> v = new ArrayList<Character>();

List<Character> w = new LinkedList<Character>();

v.add('a');

w.add('b');

w.add('c');

v = new Vector<Character>(v);

v.addAll(w);

I just don't see any massive advantage of your syntax over "mine", which is the existing one. In fact in that last case, I hugely prefer the existing approach as being quite explicit about what you're assigning to v (a new List instance containing the old contents of v).

> Neither of us are going to budge are we?! You don't like it, but I think it's easier to read and more streamlined. Oh well!

More "streamlined" and more ambiguous. Nope, not shifting from this here position.

> Just a quick note: I'm using the term "mathematical" not really from the point of view of a mathematician, but simply to describe a more "symbolic" syntax, as opposed to the current "textual" method calls.

I get that, but I think the majority of users are more comfortable with the "textual" approach.

> because Haskell is a functional language, iterating

> over large sequences of data is inherently slow as

> each iteration requires a function call

Does that automatically follow ? Surely some sort of cunning inlining is possible ? Maybe even spotting the pattern for iteration and converting it to a plain old loop ? Just curious, and yes, I may be completely wrong about that.

Dave.

dcminter at 2007-7-3 20:11:35
# 27

Incidentally, in that last case:

v = v+w;

It wasn't especially obvious whether you meant:

v = new ArrayList<Character>(v);

v.addAll(w);

or

v = new ArrayList<Character>(v);

v.add(w);

dcminter at 2007-7-3 20:11:35
# 28

> v = new ArrayList<Character>(v);

> v.add(w);

The first one was what I meant.

Wouldn't the second one (the one using add) cause a type mismatch error at compile time, because you're trying to add an ArrayList<Character> as a single element to an ArrayList<Character>? You could only add(Character) in this case. Do you agree, or am I talking rubbish? (Assuming we are using 1.5.)

Your comment about being able to choose the implementation is a fair point, and definitely something I'd concede. My reply to that is that I find that virtually all the time it is adequate to use a normal vector. There are particular instances when it is clearly better to use a linked list, in which case you would use a particular library. However, I suppose that then this might not be as neat as having everything derived from Collection - something for me to think about.

> I just don't see any massive advantage of your syntax over "mine", which is the existing one. In fact in that last case, I hugely prefer the existing approach as being quite explicit about what you're assigning to v (a new List instance containing the old contents of v).

Well, my "v = v+w" stuff would equate to simply:

v.addAll(w);

Or I could have written "v += w" (as I put in the // comment)

If you wanted a new copy of v, perhaps you could do:

u = new v;

If it seems like I'm making it up as we go along, that's because I am! I'm just trying to illustrate the type of syntax I'm suggesting. My advantage as I see it is more compact syntax, and a more universal notation. We're going round in circles... aren't we?

> Does that automatically follow ? Surely some sort of cunning inlining is possible ? Maybe even spotting the pattern for iteration and converting it to a plain old loop ? Just curious, and yes, I may be completely wrong about that.

Well it's a long time since I wrote any Haskell, but there's no such thing as a "loop" as such. Essentially you can think of it as each function being written only on a single line, and therefore you can't do iterative-type operations. Data is stored in linked lists and they are iterated through using recursion; v. simple eg:

total (a:as) = a + total as

in "iterative" speak:

int total(list<int> as)

{

return as.head()+total(as.tail());

}

but there's no way to do:

int total(list<int> as)

{

int t = 0;

iterator i = as.begin();

while(i!=as.end())

{

t += *i;

i++;

}

return t;

}

Neat, but slow. I'm sure I'll now be beaten into submission by an expert Haskell programmer who can demonstrate a far better way to do it!

someideas at 2007-7-3 20:11:35
# 29

> The first one was what I meant.

> Wouldn't this ^ one cause a type mismatch error at

> compile time because you're trying to add an

> ArrayList<Character> as a single element to an

> ArrayList<Character>.

True, but you could be manipulating List objects instead of Character objects and it would be ambiguous. And if you don't have a type that represents the "primitive" lists, then you wouldn't be able to manipulate lists, which would be a handicap in itself (I have a Lisp background, you can tell, can't you ?)
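A small sketch of that case with the existing classes, assuming the element type is Object so that a list is itself a legal element. AddVersusAddAll, viaAddAll and viaAdd are hypothetical names. Both versions compile, and they give different results, so "v + w" would have to pick one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AddVersusAddAll {
    // "v + w" read as: append every element of w.
    static List<Object> viaAddAll(List<Object> v, List<Object> w) {
        List<Object> out = new ArrayList<Object>(v);
        out.addAll(w);
        return out;
    }

    // "v + w" read as: append w itself as one element.
    static List<Object> viaAdd(List<Object> v, List<Object> w) {
        List<Object> out = new ArrayList<Object>(v);
        out.add(w);
        return out;
    }

    public static void main(String[] args) {
        List<Object> v = new ArrayList<Object>(Arrays.asList('a'));
        List<Object> w = new ArrayList<Object>(Arrays.asList('b', 'c'));
        System.out.println(viaAddAll(v, w)); // [a, b, c]
        System.out.println(viaAdd(v, w));    // [a, [b, c]]
    }
}
```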

> Well, my "v = v+w" stuff would equate to simply:

> v.addAll(w);

Then it would be inconsistent with the behaviour of the + operator with respect to Strings. I thought you were making the language clearer ?
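The inconsistency shows up as soon as a second reference to the same object exists. A minimal sketch, with made-up names (PlusSemantics and its two methods): String + builds a new object and leaves the old one alone, while addAll changes the list in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PlusSemantics {
    // String "+": v is rebound to a new String, the alias still sees the old value.
    static String aliasAfterStringPlus() {
        String v = "a";
        String alias = v;
        v = v + "bc";
        return alias;
    }

    // addAll: the one list that both names refer to is modified.
    static List<Character> aliasAfterAddAll() {
        List<Character> v = new ArrayList<Character>(Arrays.asList('a'));
        List<Character> alias = v;
        v.addAll(Arrays.asList('b', 'c'));
        return alias;
    }

    public static void main(String[] args) {
        System.out.println(aliasAfterStringPlus()); // a
        System.out.println(aliasAfterAddAll());     // [a, b, c]
    }
}
```

So if "v = v + w" on vectors meant addAll, the same-looking expression would copy for Strings and mutate for vectors.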

> If you wanted a new copy of v, perhaps you could do:

> u = new v;

Yuck.

> If it seems like I'm making it up as we go along,

> that's because I am! I'm just trying to illustrate the

> type of syntax I'm suggesting.

But since my point is that I think your suggestions would damage the language, I don't see that as much of a defence. Sure "some wonderfully unambiguous and elegant syntax" would improve things, but if you don't give me concrete examples, you don't have much of an argument, and if your concrete examples don't stand up to scrutiny, you still don't have much of an argument.

> My advantage as I see

> it is more compact syntax, and a more universal

> notation.

I don't value compactness particularly. Java is a verbose language. What do you mean by "universal" here ? No other language that I'm familiar with uses your proposed notation. Yes, it might some day in the future if it caught on, but in that case the existing collection classes are equally "universal".

> We're going round in circles... aren't we?

Present me with a syntax which is unambiguous and really enhances the language, and I might change my mind. Without that, no, there's very little chance I'll stop thinking that the idea really sucks.

> Well it's a long time since I wrote any Haskell, but

> there's no such thing as a "loop" as such.

Yes, but a compiler which compiled Haskell into more Haskell would be, um, of limited use. I'm assuming that Haskell can be targeted to any platform (since someone mentioned the JVM as one possible target), in which case the underlying platform is more likely than not to have JMP or its equivalent. Surely ? So if you can spot the Haskell iteration pattern you can optimise it to a plain old loop in the native platform. I'm not saying that's easy or indeed possible, but it doesn't follow (to me) that because Haskell has no notion of a loop directly that it's impossible to optimise it into one.

Dave.

dcminter at 2007-7-3 20:11:35
# 30

> Er, since the dawn of time. Believe it or not, before OpenGL and DirectX, people actually wrote their own

> texture mapping routines!

Evidently we're using different definitions for primitive; this definition (http://www.swif.uniba.it/lei/foldop/foldoc.cgi?primitive) fits with my understanding. Twenty years ago when I started doing machine code graphics, it took more than a 'few machine instructions' to code them, and (even though that may be before the dawn of time for you) they were not primitives - you could write code using the primitive operations to do them. In the last ten years many of these graphics functions have been implemented in hardware, which makes your comments as to performance differences between languages even less relevant.

> You can't write an effective renderer in a functional language for the reason

> above. It's too slow to iterate over a few thousand/million pixels with recursive fn calls.

Google for 'tail call elimination'. There is no difference in the compiler output between iterative calls and recursive calls that don't allocate stack space; if you can write an iterative algorithm that is constant space, then there is a tail call recursive one that can be compiled to identical machine instructions.
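To show the shape being described, here is a sketch in Java of the same total written both ways. The names (TailCallShape, totalTail, totalLoop) are hypothetical. Note that the earlier "a + total as" version is not a tail call, because the addition happens after the recursive call returns; the accumulator version below is. As far as I know javac and HotSpot do not eliminate tail calls themselves, so in Java the recursive form still uses stack; the code only illustrates the correspondence a compiler with this optimisation can exploit.

```java
class TailCallShape {
    // Tail-recursive form: the recursive call is the very last thing done,
    // so nothing from the current frame is needed once it is made.
    static long totalTail(int[] xs, int i, long acc) {
        if (i == xs.length) {
            return acc;
        }
        return totalTail(xs, i + 1, acc + xs[i]);
    }

    // The loop a tail-call-eliminating compiler can turn the method above into:
    // the parameters become variables that are updated before jumping back.
    static long totalLoop(int[] xs) {
        long acc = 0;
        int i = 0;
        while (i != xs.length) {
            acc += xs[i];
            i++;
        }
        return acc;
    }

    public static void main(String[] args) {
        int[] xs = {1, 2, 3};
        System.out.println(totalTail(xs, 0, 0)); // 6
        System.out.println(totalLoop(xs));       // 6
    }
}
```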

Pete

PeteKirkham at 2007-7-3 20:11:38
# 31

I know loads of Java coders who would have run a mile on day 1 if we had to explicitly use StringBuffer. Luckily, in 5.0, we have varargs methods, so we could have just written a varargs print() method, or perhaps added String.cat() to do the appending giving you syntaxes like:

System.out.print("Name: ", name, " Age: ", age);

System.out.print(String.cat("Name: ", name, " Age: ", age));
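A minimal sketch of what such a varargs helper could look like. java.lang.String has no cat() method, so the static cat below (in a made-up class Cat) is a hypothetical stand-in, and the name and age values are sample data.

```java
class Cat {
    // Hypothetical stand-in for the suggested String.cat():
    // appends every argument to a single StringBuilder.
    static String cat(Object... parts) {
        StringBuilder sb = new StringBuilder();
        for (Object part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "Dave";
        int age = 30; // autoboxed to Integer when passed as Object
        System.out.println(cat("Name: ", name, " Age: ", age)); // Name: Dave Age: 30
    }
}
```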

Personally, I find the String + and += operators to be a major source of performance losses - but then the compiler could be much smarter about how to treat "a" + "b" + "c" I suppose.
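The usual place the cost shows up is += inside a loop, where each pass copies everything built so far. (A chain of literals such as "a" + "b" + "c" is a compile-time constant, so that particular case is already folded by the compiler.) A sketch comparing the two styles, with hypothetical names LoopConcat, withPlusEquals and withBuilder:

```java
class LoopConcat {
    // Builds a fresh String on every iteration, copying the previous contents each time.
    static String withPlusEquals(String[] words) {
        String s = "";
        for (String w : words) {
            s += w;
        }
        return s;
    }

    // Appends into one growing buffer and creates the final String once.
    static String withBuilder(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) {
            sb.append(w);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] words = {"a", "b", "c"};
        System.out.println(withPlusEquals(words)); // abc
        System.out.println(withBuilder(words));    // abc
    }
}
```

The results are identical; the difference is only in how much intermediate copying happens for long inputs.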

> Taking your example:

> System.out.print("Name: "+name+" Age: "+age);

> If I were to use a StringBuffer, I'd do:

> StringBuffer sb = new StringBuffer("Name: ");

> sb.append(name);

> sb.append(" Age: ");

> sb.append(age);

> System.out.println(sb);

> And yes, I would have no particular complaints if that

> was the only way to build up a string, because I don't

> find it particularly painful. Perhaps I'm just odd

> that way. For better or worse we're stuck with the +

> operator for strings, however, and I acknowledge that

> I'll have to make the best of it. Sometimes it's

> convenient.

> Dave.

Matthew_Pocock at 2007-7-3 20:11:38