The most striking feature of occam's syntax for newcomers to the
language is that it uses significant indentation: rather than delimiting
code blocks with {} like C or Perl, the parser simply looks for
changes in indentation level. Since good programmers indent their code
anyway, this reduces redundancy and visual clutter.
occam isn't alone in this: the popular modern languages Python and Haskell are also indentation-based. Their syntaxes have some nice features that occam currently doesn't. I'd like to revise the occam syntax so that it's comparable to more recent indentation-based languages.
I don't want to change the semantics of the language at all; I'm just proposing an alternative syntax for the existing occam-pi language. The intention is to borrow the useful syntactical features from newer languages, and hopefully make occam look a bit less alien to new programmers at the same time.
A few really useful features don't fit into this scheme very well at the
moment: replicated IFs, extended channel inputs, VALOF expressions.
I'll need to think some more about how these could be represented.
I'd also like to come up with an example bit of occam code that uses all the features here, and mutate it as I work through the suggestions.
(One of the suggestions here was to add ASSERT to the language, until Fred
pointed out that KRoC already has it!)
Flexible indentation
In occam, each indentation step must be two spaces.
WHILE foo
  SEQ
    bar ()
    IF
      condition
        baz ()
Python and Haskell don't care how far you indent, provided you're consistent between lines in the same block. Python expands each tab to the next multiple of eight columns; some people have suggested making it complain if you mix tabs and spaces.
WHILE foo
   SEQ
        bar ()
        IF
         condition
           baz ()
(Not that I'd actually want to indent code like that!)
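To make the "consistent within a block" rule concrete, here's a rough sketch in Python of how a lexer might track flexible indentation with a stack of widths. The function name, the token names and the tab_width parameter are my own inventions for illustration; this isn't how any existing occam compiler does it.

```python
def indent_tokens(source, tab_width=8):
    """Turn source text into (kind, text) tokens: INDENT, DEDENT or LINE.

    Any increase in leading whitespace opens a block; a decrease must
    return to a width that is already on the stack.
    """
    stack = [0]          # indentation widths of the currently open blocks
    tokens = []
    for lineno, raw in enumerate(source.splitlines(), start=1):
        if not raw.strip():
            continue     # blank lines don't affect block structure
        # Assumption: tabs advance to the next multiple of tab_width columns.
        expanded = raw.expandtabs(tab_width)
        width = len(expanded) - len(expanded.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            tokens.append(("INDENT", ""))
        else:
            while width < stack[-1]:
                stack.pop()
                tokens.append(("DEDENT", ""))
            if width != stack[-1]:
                raise SyntaxError(f"line {lineno}: inconsistent dedent")
        tokens.append(("LINE", expanded.strip()))
    # Close any blocks still open at end of input.
    tokens.extend(("DEDENT", "") for _ in stack[1:])
    return tokens
```

The parser then only ever sees INDENT and DEDENT tokens, so it doesn't matter whether a step was two spaces, five, or a tab.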
Lowercase keywords
These days, most languages don't make you SHOUT ALL THE TIME. Modern syntax-highlighting editors differentiate keywords by colour, so there's no particular need for them to be capitalised any more.
while foo
  seq
    bar ()
    if
      condition
        baz ()
Simpler IF syntax
The occam IF syntax is elegant for complicated stuff, but for simple
usage it's a bit verbose:
IF
  condition
    do.something ()
  other.condition
    do.something.else ()
  TRUE
    SKIP
A Python-style syntax could write this as:
if condition:
  do_something()
elif other_condition:
  do_something_else()
else:
  skip
Or perhaps even have the compiler insert the skip clause
automatically; when you want the old behaviour, you can always add an
else: stop clause yourself.
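As a sketch of what "insert the skip clause automatically" could mean, here's a small Python function that lowers an if/elif/else chain to the classic occam IF form, adding a TRUE/SKIP guard when no else branch is given. The function names and the data layout (a list of condition/body pairs) are hypothetical, chosen only for this illustration.

```python
def _block(processes, depth, indent):
    """Render one or more processes, wrapping several in a SEQ."""
    if len(processes) == 1:
        return [indent * depth + processes[0]]
    lines = [indent * depth + "SEQ"]
    lines.extend(indent * (depth + 1) + p for p in processes)
    return lines


def lower_if(branches, otherwise=None, indent="  "):
    """Lower an if/elif/else chain to a classic occam IF.

    branches:  list of (condition, [process, ...]) pairs, in order.
    otherwise: processes for the else branch, or None for the
               proposed default of SKIP.
    """
    lines = ["IF"]
    for condition, body in branches:
        lines.append(indent + condition)
        lines.extend(_block(body, 2, indent))
    # The final guard is always TRUE, so the IF can never fall through.
    lines.append(indent + "TRUE")
    lines.extend(_block(otherwise if otherwise is not None else ["SKIP"], 2, indent))
    return "\n".join(lines)
```

Passing otherwise=["STOP"] gives back the old behaviour, where an IF with no matching guard stops.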
Changing what colons mean
occam uses colons to indicate that a declaration is in force for the next block:
INT foo:
BOOL bar:
SEQ
  c ? foo
  do.something (c)
Python uses colons after statements that need an indentation increase after them:
while foo:
    bar()
    if x == 3:
        print("foo")
    else:
        print("bar")
In neither case are the colons actually needed (the Ruby language is pretty similar but has neither, for instance), but I find the Python style to be a bit more readable.
Implicit SEQs
Current occam requires you to insert SEQ or PAR whenever you have
multiple processes:
WHILE condition
  SEQ
    do.one.thing ()
    do.another.thing ()
    IF
      other.condition
        SEQ
          do.third.thing ()
          do.fourth.thing ()
A quick look at the occamnet code shows that I use about three times as
many SEQs as I do PARs. It'd be possible to make occam code rather
more compact by assuming that any set of multiple processes is wrapped
in a SEQ unless it's already wrapped in a PAR. I suspect this'd be a
controversial change, because it could result in programmers thinking
less about opportunities to parallelise their code; we'd have to decide
whether the increased readability makes it worthwhile.
WHILE condition
  do.one.thing ()
  do.another.thing ()
  IF
    other.condition
      do.third.thing ()
      do.fourth.thing ()
You'd still need to have SEQ in the language, so that you can do
replicated SEQs or force an extra internal scope. It may also be
necessary to come up with a different syntax for extended inputs, which
need to have two code blocks specified (without an implicit SEQ).
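Here's a rough Python sketch of the implicit-SEQ rule as a source-to-source pass: it reads the compact form (two-space indentation assumed) and wraps any block of several processes in an explicit SEQ. The function names and the NO_WRAP set are my own assumptions. In particular, I've left IF and ALT alone as well as SEQ and PAR, since their children are guards rather than processes, and extended inputs aren't handled at all.

```python
# Constructs whose children must not be wrapped in an implicit SEQ.
# Assumption: IF and ALT are included because their children are guards.
NO_WRAP = {"SEQ", "PAR", "IF", "ALT"}


def parse(lines, step=2):
    """Build a tree of {"text", "children"} nodes from indented lines."""
    root = {"text": None, "children": []}
    stack = [(-1, root)]                 # (depth, node) of open blocks
    for line in lines:
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // step
        node = {"text": line.strip(), "children": []}
        while stack[-1][0] >= depth:     # close blocks we've dedented out of
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((depth, node))
    return root


def emit(node, depth, out, step=2):
    """Write a node back out, inserting SEQ above multiple processes."""
    out.append(" " * (step * depth) + node["text"])
    children = node["children"]
    if len(children) > 1 and node["text"].split()[0] not in NO_WRAP:
        depth += 1
        out.append(" " * (step * depth) + "SEQ")
    for child in children:
        emit(child, depth + 1, out, step)


def add_implicit_seq(source):
    """Expand the compact form into classic occam with explicit SEQs."""
    out = []
    for node in parse(source.splitlines())["children"]:
        emit(node, 0, out)
    return "\n".join(out)
```

Code that already spells out its SEQs passes through unchanged, so the two styles could coexist in one file.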
Better syntax for INITIAL
occam-pi lets you initialise a variable when you declare it. However,
initialising a variable currently looks more like a VAL declaration:
INT foo:
INITIAL INT bar IS 4:
VAL INT baz IS 5:
I'd prefer to have a syntax that looks more like a variable declaration:
INT foo IS 4:
INT foo := 4:
Underscores in variable names
occam, unlike pretty much every other programming language, allows dots in variable names.
INT my.string:
Many languages use underscores instead for the same purpose.
int my_string
Field access
occam uses array-like syntax to refer to fields in structures.
packet[ip.id] := 3
C-derived languages use dots.
packet.ip_id := 3
C-style assignment and equality operators
occam uses the same scheme as many 70s-era languages for setting and comparing variables.
num := 3
IF
  num = 3
    print ("occam works!")
  num <> 3
    print ("occam doesn*'t work!")
  TRUE
    print ("occam really doesn*'t work!")
Modern languages tend to use a set of operators derived from C instead:
num = 3
if num == 3:
  print("occam works!")
elif num != 3:
  print("occam doesn't work!")
else:
  print("occam really doesn't work!")
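The operator mapping is mechanical enough to sketch in a few lines of Python. This is only an illustration with names I've made up, and it works on raw text, so it would also rewrite operators inside string literals; a real translator would work on tokens instead.

```python
import re

# occam operator -> C-style operator. "<=" and ">=" map to themselves;
# they are listed so that their "=" isn't mistaken for an equality test.
_OPERATORS = {":=": "=", "<>": "!=", "<=": "<=", ">=": ">=", "=": "=="}

# Longer operators come first so that ":=" is never matched as ":" then "=".
_PATTERN = re.compile(r":=|<>|<=|>=|=")


def to_c_operators(line):
    """Rewrite occam assignment and comparison operators in one line."""
    return _PATTERN.sub(lambda m: _OPERATORS[m.group(0)], line)
```

Doing all the substitutions in a single pass matters here: translating := to = first and then = to == would turn every assignment into a comparison.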
C-style string escapes
occam (following Modula?) uses * to escape characters in string
constants (and complains if you don't escape ' characters).
print ("Hello, world!*c*n")
Modern languages follow the C conventions.
print("Hello, world!\r\n")
It's also unusual to use \r\n as a line-ending sequence on the systems
that occam's used on these days; I'd rather just be able to write \n.
(As with C, you could have \n translated to the appropriate
line-ending sequence on platforms that use \r or \r\n.)
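For comparison, here's a minimal Python sketch that converts the body of a C-style string literal into occam's * notation. The function name is hypothetical and only a handful of escapes are covered (\n, \r, \t, \\, \" and \'); anything else raises an error rather than guessing.

```python
# C escape letter -> occam equivalent (occam spells carriage return "*c").
_C_TO_OCCAM = {
    "n": "*n",
    "r": "*c",
    "t": "*t",
    "\\": "\\",   # a backslash is an ordinary character in occam
    '"': '*"',
    "'": "*'",
}


def c_to_occam_string(body):
    """Convert the text between the quotes of a C-style literal."""
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            code = next(chars, None)     # the character after the backslash
            if code not in _C_TO_OCCAM:
                raise ValueError(f"unsupported escape: \\{code}")
            out.append(_C_TO_OCCAM[code])
        elif ch in "*'\"":
            out.append("*" + ch)         # these need escaping in occam
        else:
            out.append(ch)
    return "".join(out)
```

Note that this does a literal translation of \r\n to *c*n; it doesn't attempt the per-platform line-ending translation suggested above.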