Hacker News

I dislike Modula-2's all-upper-case reserved word syntax. I understand the reasoning behind it: making language constructs consistent and making them stand out. But upper-case is harder to read.

Make no mistake, I use that style when writing SQL, and it's very fitting there because queries are short and I can't tolerate typos due to the lack of a compilation step. But in the case of a compiled language, especially with the advent of syntax highlighting and edit-time analysis, I find it a nuisance.



Uppercase for keywords was a fairly common style at the time even in languages where they were case-insensitive (e.g. Pascal, BASIC, dBase). Which makes sense when you consider that syntax highlighting was still several years away, and it would take even longer for it to become widespread. Turbo Pascal wouldn't get syntax highlighting until version 7.0 - that's 1992!


Turbo Pascal for Windows got it first.


> [The reasoning behind] Modula-2's full upper-case reserved word syntax [is] to make language constructs consistent and stand out more

I suspect it’s just inherited from ALGOL 68, where you were allowed to use names that coincide with language keywords at the cost of using one of a number of “stropping” conventions[1] to distinguish one from the other: surround each keyword with apostrophes (whence the term) or precede it with a period (if you’re using a single-case character set), type keywords in uppercase (if you’re not), or typeset them in bold (if you’re writing a paper—recall the original ALGOL problem statement was “like pseudocode from our papers but executable”).

[1] https://en.wikipedia.org/wiki/Stropping_(syntax)


That may well be the case, but Wirth designed Pascal eight years before Modula-2, yet the former feels far more modern.


Not really, unless you're talking about Extended Pascal or one of the several dialects.

ISO Pascal as Wirth originally designed it was modern only in the sense of being much easier than ALGOL to implement and to master as a learning language.

Meanwhile, Modula-2 was designed from the start to be a type-safe systems programming language, based on Wirth's experience with Mesa at Xerox PARC.


I could be completely wrong on this one, but Pascal not caring about case would have made it more usable on machines where lower case was unavailable because those code points were taken for other needs (like Cyrillic). I'm not sure whether it made a difference in practice (given COCOM and the Eastern Bloc), but my Apple ][ clone had Cyrillic instead of lower-case letters...

Or maybe my memory serves me wrong and the lower-case Latin letters were not replaced by Cyrillic, but that's how I remember it (it was a long time ago)...


Yes, that's called KOI-7.


Thank you so much. Specifically, I think I had this on my Pravetz 8C (an Apple ][/e clone) - https://en.wikipedia.org/wiki/KOI-7#KOI-7_N2


Personally, I don't use it even in SQL if I am starting from a clean slate...


> to make language constructs consistent and stand out more

That sort of thing also makes parsing easier from the compiler side; you can look at the string token and know without context whether or not it's a keyword.


Nope, compilers couldn't care less - it's much faster to hash keywords into the symbol table and just do a single lookup.

The upper case is all about human readability. I suspect ALGOL's heavy reliance on named keywords (unlike C's more extensive use of symbols) relates to the inconsistencies of character sets at the time of ALGOL's definition.
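To illustrate the single-lookup point, here's a toy sketch (hypothetical Python, not taken from any real compiler) of keyword recognition in a lexer. Whether or not keywords are spelled in upper case, classifying a token costs one hash lookup:

```python
# Toy sketch (hypothetical): keyword recognition is a single hash
# lookup, regardless of how keywords are cased in the source.
KEYWORDS = {"BEGIN", "END", "IF", "THEN", "ELSE", "WHILE", "DO"}  # Modula-2 subset

def classify(token: str) -> str:
    """Classify an alphabetic token as a keyword or an identifier."""
    # One O(1) set lookup; a case-insensitive language would simply
    # normalize first (token.upper()) before the same single lookup.
    return "keyword" if token in KEYWORDS else "identifier"

print(classify("BEGIN"))  # keyword
print(classify("Begin"))  # identifier (Modula-2 keywords are case-sensitive)
```

So the casing convention buys the compiler nothing; the lookup is equally cheap either way.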

IMvhO, Turbo Pascal (and Turbo Modula-2) would have been much nicer with a C-like (or Rust-like) syntax.

Disclaimer: I worked professionally in Turbo Pascal for a few years.


Algol (both 60 and 68) had various representations. The reference one, used in the spec, indicated things such as keywords through formatting, and used the conventional mathematical and engineering symbols where they were appropriate. Here's a bit of Algol-60 grammar:

   <arithmetic operator> ::=
       + | - | × | / | ÷ | ↑

   <relational operator> ::=
       < | ≤ | = | ≥ | > | ≠

   <logical operator> ::=
       ≡ | ⊃ | ∨ | ∧ | ¬
Most of these characters were not in contemporary character sets (although some later charsets would have symbols added specifically for Algol after it became more common). Given how much variation there was at the time, the Algol designers didn't even try to solve that problem - they simply said that specific implementations of Algol would map the reference representation to some hardware-specific form, to be documented by the implementation. Those hardware representations would often use keywords for some of the operators.


Here's a table in a 1966 book showing the different keywords across implementations: https://archive.org/details/breuer-dictionary-for-computer-l...

(It uses a weird glyph for NOT that I've never seen elsewhere, probably a limitation of typesetting.)

Edit: Interesting to note that the English Electric KDF9's Algol used * at the beginning of all keywords, I suppose that's to simplify parsing, or maybe just to avoid reserving words.


Algol-60 didn't actually have the notion of keywords as such in the modern sense. The language grammar treats stuff like "if", "for" etc as fundamental terminal symbols without specifying how they're to be parsed distinctly from identically spelled identifiers. Each representation is supposed to come up with a way to make that distinction; for example, the reference representation uses bolding and/or underlining to distinguish keywords.

So, for hardware representations, which are all linear lists of characters without formatting, they had to use some kind of escape sequence - https://en.wikipedia.org/wiki/Stropping_(syntax). The term itself comes from the most popular syntax for this, which was to put keywords in single quotes / apostrophes, but there were many other variants, including some identical to how we handle keywords today, as seen from this table.

This approach also allowed for Algol programs to be "translated" to a language other than English in a sense that a representation could be defined that used native words for keywords. This was actually used to some extent in Europe, and especially in the USSR.
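To make the stropping idea concrete, here's a toy quote-stropping tokenizer (hypothetical Python, with a made-up minimal vocabulary): a word in apostrophes is a keyword, a bare word is an identifier, so the keyword 'begin' and an identifier spelled begin can coexist:

```python
import re

# Toy quote-stropping scanner (hypothetical, not any actual Algol lexer):
# 'word' is a keyword, a bare word is an identifier, anything else a symbol.
TOKEN = re.compile(r"'([a-z]+)'|([a-z]+)|(\S)")

def tokenize(src):
    tokens = []
    for strop, ident, sym in TOKEN.findall(src):
        if strop:
            tokens.append(("KEYWORD", strop))
        elif ident:
            tokens.append(("IDENT", ident))
        else:
            tokens.append(("SYMBOL", sym))
    return tokens

# The keyword 'begin' and an identifier spelled begin don't clash:
print(tokenize("'begin' begin 'end'"))
```

With uppercase stropping you'd instead branch on whether the word is all upper-case; the lexer-side difference between the conventions is that small.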


That NOT symbol looks like a stand-in for ¬. The nearest thing I can find is https://en.wiktionary.org/wiki/⁊ (Tironian et). If that's the case, it's amusing that it actually means "and".


Looks exactly like U+29A2 (turned angle) to me: https://en.wikipedia.org/wiki/Miscellaneous_Mathematical_Sym...


Early character sets were indeed very limited and inconsistent in many ways, which contributes to the keyword-heavy style of many early languages. (The other factor behind it was the sheer amount of features that were provided by the language itself, as opposed to library code; minimalism was not generally favored. This in turn encouraged "fancy", hard to parse syntax for those custom features.)


It’s not hard to see the reason - bolting a feature onto the compiler is far easier than implementing a language that allows such things to be expressed in a library. I imagine memory limitations also contributed to those decisions.


The very concept of a "library" would take a while to evolve. The first ones were literally libraries: you'd come in, browse to find code for the algorithm you needed, and reuse that code - by manually copying it.

Algol-60 itself didn't really have any separate compilation facilities, either, and some language semantics (e.g. call-by-name) complicate matters if you try to tack it on.



