May 2020 – Jacob Goldberg

I’ve had some free time lately, and I found a project for the past few days to keep me occupied. It all started in a group chat when someone brought up the idea of building his own programming language where as much as possible is symbols, primarily various types of brackets. Every command is in some sort of brackets (sort of like Lisp), and everything not in some sort of bracket is a comment. An example program (provided by the author) is this:

(#A)(&(A>12)*(A-=3)!(A+=2))(A>>)

To break it down, the program does the following:

(#A) reads A from standard input and requires A to be a number
&(A>12) represents an if statement, where the condition is A>12
- *(A-=3) means that if the condition (A>12) is true, then subtract 3 from A
- !(A+=2) means that if the condition is false (i.e. A <= 12), then add 2 to A
(A>>) means print A to standard out

Arrow

This inspired me to build my own programming language. I was trying to figure out what the most ridiculous way to draw out control structures was, and after thinking for a while, I decided to require the programmer to use text to represent the control flow of the program by drawing everything with arrows. Thus, I called the language Arrow (mostly because that was the first name I could think of that I couldn’t find an existing language for). It is available on my GitHub for anyone who wants to read more about it or maybe try using it. You can look in “primesieve.arw” and “string.arw” for some examples of functions. There are two basic control structures: if and loop.

If statement

/--< false condition
| body
| more body
\-->

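In that case, if the given condition is true, then the body is skipped over, as indicated by the arrow. In other words, the condition you write is the negation of what you would put in a conventional if statement. Here’s a basic example: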

bool even
even = true
/--< x % 2 == 0
| even = false
\-->

In that case, if x is even, then the variable “even” will end up as true. How this works is that if x modulo 2 is zero (i.e. x is even), then the body of the if statement will be skipped and even will retain its original value of true. However, if x is odd, then the body of the if statement will not be skipped over and even will be set to false.
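For readers more used to conventional syntax, here is a rough Python equivalent of the example above. It is only a sketch, assuming (as the example suggests) that the condition on the /--< line is the one that causes the body to be skipped; the function name is_even is made up for illustration.

```python
def is_even(x: int) -> bool:
    even = True
    # Arrow's "/--< x % 2 == 0" jumps past the body when the condition holds,
    # so the body runs only when the condition is false.
    if not (x % 2 == 0):
        even = False
    return even
```

The extra "not" is the whole trick: an Arrow if states when to jump over the body, not when to run it.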

Loop

/-->
| loop body
| more loop body
\--< continue condition

The loop functions as a standard do-while. I couldn’t find a way to satisfactorily represent a standard while with arrows, so I gave up and just did do-while (if you want a real while loop, you can always wrap the loop in an if statement that skips over it when the condition is false). The arrow indicates that if the continue condition is true, go back to the top. A basic example of this:

int counter
counter = 0
int base
base = 2
/-->
| base = base * 2
| counter = counter + 1
\--< counter < 5

That code multiplies base by two at each iteration and loops until the counter reaches five. Since base starts at 2 and is doubled five times, it ends at 64 (two to the sixth power).
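The same do-while shape can be sketched in Python, which has no do-while of its own, so the continue condition is checked at the bottom of an infinite loop. The function name run_loop is hypothetical and only there to make the result easy to inspect.

```python
def run_loop() -> tuple[int, int]:
    counter = 0
    base = 2
    while True:
        # Loop body: runs at least once, like Arrow's /--> ... \--< loop.
        base = base * 2
        counter = counter + 1
        # Continue condition from the "\--<" line: leave when it is false.
        if not (counter < 5):
            break
    return base, counter
```

Tracing it by hand, the body runs five times, so base goes 4, 8, 16, 32, 64 and the function returns (64, 5).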

Functions

I also couldn’t be satisfied with a programming language without adding functions. Functions also use a syntax that involves visualizing the control flow with text. An important nuance here is that functions must only have one return at the very end. Also, functions cannot be void (though I may implement that in a future update if time permits, and you can ignore the return value of a function). A basic example:

function
/--> int five()
|
^ 5

As you might imagine, that’s a function that returns five. The key idea here is how the return is specified (and code body goes above the return). The return (^) points back towards the function name. An important thing to note here is that you can only return from a function in one place (which must be the last instruction in a function body). Similar to how there are no break/continue statements, this requires the control flow to be visually obvious and have no jumping around, keeping the flow visible with the arrows.

Putting this together

With these frameworks, programs can be put together. A simple example of a loop inside a function is a function that calculates an integer square root (or the ceiling if the number is not a perfect square):

function 
/--> int squareRoot(int x)
| int candidate
| candidate = 0
| /--> //repeat the loop until the candidate squared is greater than or equal to the input
| | candidate = candidate + 1
| \--< candidate * candidate < x
^ candidate

You can get more complicated, since control structures can be nested. To make this more fun, here’s a function that detects if a number is prime:

function
/--> bool isPrime(int x)
| bool result
| result = true
| /--< x < 3
| | int factor
| | int ceiling
| | ceiling = squareRoot(x)
| | factor = 2
| | /-->
| | | /--< x % factor != 0
| | | | result = false //if the factor divides the input, then the input is not prime
| | | \-->
| | | factor = factor + 1
| | \--< factor < ceiling + 1 and result //check the ceiling itself too, and bail out as soon as we see a factor
| \-->
^ result

That function (which uses the square root function from earlier) loops from 2 to the square root of a number and sees if any of those values divide the number. If so, it returns false, and if not, returns true.
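To make the nesting easier to follow, here is a Python sketch of the same two functions, with each Arrow if written as "if not (condition)" and each loop as a do-while. The names square_root and is_prime are my own translations. Note that the sketch's loop includes the ceiling itself as a candidate factor, which is needed for perfect squares such as 9 and 25 to be rejected. Like the Arrow version, it treats every input below 3 as prime.

```python
def square_root(x: int) -> int:
    """Smallest positive candidate whose square is >= x."""
    candidate = 0
    while True:
        candidate += 1
        if not (candidate * candidate < x):
            break
    return candidate


def is_prime(x: int) -> bool:
    result = True
    if not (x < 3):  # Arrow: "/--< x < 3" skips the body for small inputs
        ceiling = square_root(x)
        factor = 2
        while True:
            if not (x % factor != 0):
                result = False  # factor divides x, so x is not prime
            factor += 1
            # Keep going through the ceiling itself, stop at the first factor.
            if not (factor < ceiling + 1 and result):
                break
    return result
```

For example, is_prime(9) tries 2 and then 3, finds that 3 divides 9, and returns False.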

Lexer Framework

In the process of developing this, I built a fairly powerful lexer framework that can, given a language spec built with objects, perform the lexical analysis to generate tokens automatically. It’s definitely not the most efficient tool (it uses greedy matching and a ton of string operations), but it was a good exercise in object-oriented design and writing reusable tools. The basic building blocks are as follows, and everything except “fixed string” recursively nests on other token types.

Fixed string
Fixed sequence
Repeated
Multiple options

Given rules using these elements (as well as a few other rules on how to actually get a useful result, such as what type to attach to each token and when to combine underlying tokens), the lexer can generate the tokens for a program. A few simple examples:

Digit is multiple options of the fixed strings “0”, “1”, … , “9”
Unsigned integer is repeated digit at least one time
Identifier is sequence of letter followed by repeated alphanumerics zero or more times
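A minimal sketch of how those four building blocks could fit together, written in Python. The real framework's classes and API are not shown in this post, so the class names (Fixed, Sequence, Repeated, Options), the match method, and the first-match-wins behavior of Options are all assumptions made for illustration. Each rule takes the text and a start position and returns the end position of a greedy match, or None if it fails.

```python
import string
from dataclasses import dataclass


class Rule:
    def match(self, text: str, pos: int):
        raise NotImplementedError


@dataclass
class Fixed(Rule):
    literal: str

    def match(self, text, pos):
        if text.startswith(self.literal, pos):
            return pos + len(self.literal)
        return None


@dataclass
class Sequence(Rule):
    parts: list

    def match(self, text, pos):
        for part in self.parts:
            pos = part.match(text, pos)
            if pos is None:
                return None
        return pos


@dataclass
class Repeated(Rule):
    inner: Rule
    minimum: int = 0

    def match(self, text, pos):
        count = 0
        while True:
            end = self.inner.match(text, pos)
            if end is None or end == pos:  # stop on failure or empty match
                break
            pos, count = end, count + 1
        return pos if count >= self.minimum else None


@dataclass
class Options(Rule):
    choices: list

    def match(self, text, pos):
        for choice in self.choices:  # assumption: first matching option wins
            end = choice.match(text, pos)
            if end is not None:
                return end
        return None


# The three example rules from the list above.
digit = Options([Fixed(d) for d in string.digits])
letter = Options([Fixed(c) for c in string.ascii_letters])
unsigned_integer = Repeated(digit, minimum=1)
identifier = Sequence([letter, Repeated(Options([letter, digit]))])
```

With these rules, unsigned_integer.match("123abc", 0) consumes the three digits and stops, while identifier.match("9lives", 0) fails because the first character is not a letter. A real lexer would also attach a token type to each match, which this sketch leaves out.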

Type System

Given that I am a very strong believer in strong typing, I created a type system for Arrow. The three base types are int, char, and bool, and arrays of any dimension can be created from those types. Everything is copied/passed by value (and arrays are deep copied). Strings are just arrays of characters, with the length representing the length of the string. Every variable starts out uninitialized, and attempts to read an uninitialized variable will result in an error. Arrays also can be either initialized or uninitialized. To initialize an array, dimensions must be specified in the declaration (for example, int[] x[5] would initialize an array of length five). Once an array is initialized, each cell is an uninitialized value.

Strings are also a part of the language, with a string being an array of chars. C uses null-terminated strings because it only has a pointer to the head of the array. Arrays are a well-defined type in Arrow, so the length of the array determines the length of the string.
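The rules about uninitialized values and deep copies can be illustrated with a small Python sketch. This is not how the Arrow interpreter is actually written (that code is not shown here); Cell, UninitializedReadError, and make_array are hypothetical names chosen for the example.

```python
import copy


class UninitializedReadError(Exception):
    """Raised when a value is read before anything was assigned to it."""


class Cell:
    def __init__(self):
        self.initialized = False
        self.value = None

    def write(self, value):
        self.value = value
        self.initialized = True

    def read(self):
        if not self.initialized:
            raise UninitializedReadError("read of uninitialized value")
        return self.value


def make_array(length: int) -> list:
    # Like "int[] x[5]": the array exists, but every cell starts uninitialized.
    return [Cell() for _ in range(length)]


x = make_array(5)
x[0].write(7)
y = copy.deepcopy(x)  # pass/copy by value: y shares nothing with x
y[0].write(9)
```

After this runs, x[0].read() still gives 7, y[0].read() gives 9, and reading x[1] raises UninitializedReadError.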

Scoping and static semantics

Using mostly what I learned in Programming Language Concepts class, I was able to implement static semantics fairly easily. I designed the call stack (and the static symbol stack) to have both a static link and a dynamic link, with the static link being used for scoping and dynamic link being used for a return location. I also was able to implement static type synthesis on the parse tree using techniques I learned in that class. Currently there are no global variables, but with how the scoping system is implemented I could add those with minimal effort.
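As a rough illustration of the two links, here is a Python sketch of a frame that carries both. The Frame class and lookup function are hypothetical, not taken from the Arrow source: the static link points at the enclosing scope and is followed for name lookup, while the dynamic link points at the caller and only says where to return.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    name: str
    static_link: Optional["Frame"] = None   # enclosing scope, used for lookup
    dynamic_link: Optional["Frame"] = None  # caller, used for the return
    variables: dict = field(default_factory=dict)


def lookup(frame: Optional[Frame], name: str):
    # Walk outward through enclosing scopes only; callers are never searched.
    while frame is not None:
        if name in frame.variables:
            return frame.variables[name]
        frame = frame.static_link
    raise NameError(name)


caller = Frame("main", variables={"secret": 42})
callee = Frame("isPrime", dynamic_link=caller, variables={"x": 7})
block = Frame("if-body", static_link=callee, dynamic_link=caller,
              variables={"factor": 2})
```

Here lookup(block, "x") finds 7 through the static link, but lookup(block, "secret") raises NameError, because the caller's variables are reachable only through the dynamic link, which lookup never follows.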

Input/Output

Though it almost came as an afterthought, input/output were added to the language. Output is pretty simple, using the print keyword (note that it’s not a function, but instead a language construct). Multiple things can be printed by separating them with commas. Input is a bit more tricky, and currently only integers and strings can be read as input. To receive input, use the input keyword and then specify a type (for example, “input int” or “input char[]”). Input commands can be used in expressions like a regular function call, but the argument given is a type and not a variable.

Compiler Design

The overall high-level design was taken from what I learned in Compiler Design class. In particular, it has a separate lexer, parser, parse tree, and backend (interpreter).

I may have accidentally created something useful

I intended the Arrow language mostly to be a joke as well as maybe something to keep me occupied. However, as I was developing it I realized that it may actually have some useful value. Certainly it’s not a great tool for developing real-world applications, but the visualized control flow may be a useful tool for teaching control structures or helping visualize algorithms. The control flow graph is almost built into your source code here. The lack of any sort of GOTO statements (throw, return, break, continue) also means that the control flow has to be painfully obvious and can only be shown with the control structures that require the control flow to be directly visible.

If anyone creates something useful, I’d love to see it!

Wow, the past few weeks have been crazy. Quite a lot has happened. This was definitely not the scenario I imagined even a few months ago, but despite quite a whirlwind of things happening, I managed to make it work.

Graduating

Immediately after visiting Chicago to scope out apartments, I had to go back home for the remainder of the semester due to the COVID-19 pandemic. I was lucky enough to be in some of the least-affected classes. For the most part, we just continued doing what we were before (mostly a combination of lectures and projects) online.

One interesting twist was the component where I was teaching. I was a teaching assistant for Software Craftsmanship, and the primary component of my job was leading code reviews every week for students’ programming assignments.

Then, the graduation ceremony was pretty much a complete non-event. Some videos were posted, and we each got a slide. I’ll get my diploma in the mail in a few weeks. I managed to graduate Summa Cum Laude (for CWRU that means top 10% of my class), so that was good to see. They unfortunately removed my submission for my slide. I’m not sure why, since it wasn’t anything inappropriate or rude or anything. It was just a picture of me sitting with a cat in my lap and the line “It’s been a crazy four years”. Oh well.

Moving

I have now moved to Chicago! When on my apartment scouting trip earlier, I found one apartment that I knew was my best option. The apartment had basically everything: a good location close to the L, lots of room, on-site laundry, property management that gave me a good impression of how they handled things, and an additional small room that would be good for an office. As an added bonus, the tenants living there at the time I toured the unit had a cat, so I immediately knew this would be a good apartment to have my own cat (coming soon, I hope!). After I finished with apartment tours, I went to the property management office to apply in person, and a few days later got a call telling me I was approved.

Fast forward to May, with some (a lot of) help from my parents, we loaded my stuff into a van and drove up to Chicago. Before actually moving in, I snapped a few quick photos of the empty apartment to document what it looked like without my stuff:

With that, we got to work unpacking everything. Most of the furniture was fairly easy. The bookshelf just had the shelves taken out, so we had to put those back. We also had to take the drawers out of the dresser to move it up the stairs. My desk required no modification at all. The one somewhat difficult thing was the bed, which we basically completely disassembled to fit into the van. However, since it was an Ikea bed, we were able to put it back together following the original instructions with minimal tools. Then we also unloaded my stuff, and went to Target to buy supplies and food. We finished the day with some Giordano’s pizza (deep dish, of course), a fitting welcome to Chicago. The leftover pizza then lasted me three more days.

With that, I had a place to live! Over the next few days, I got to work making this place look more like home, putting up decorations, cleaning out boxes, etc.

My bedroom with a bed in it (and a Chicago flag on the window)

My office (it was hard to get the lighting right on my phone due to the window)

Network supplies and Raspberry Pi in the living room

This definitely felt more like a home. However, I was still missing some real furniture. My living room table was a card table accompanied by a folding chair, and I had no couch. My grandparents came to the rescue for that problem and got me a table, two chairs, and a couch off Wayfair. After some assembly, I had a real living room:

Living room with a real table, chair, and couch

So with that, I have a home! No doubt this place will evolve as time goes on, but I’m pretty satisfied with how it turned out. I’m also really glad I ended up going for a slightly larger apartment, given that now I’m spending a lot more time here than I expected due to the stay-at-home order in Illinois.

Stuff I’ve been doing in the meantime

I had a fairly long gap between when I moved and when work will start, so I have done a bunch of different things in the meantime with all the free time I had. It felt weird being completely done with school, but nice to have all the free time. I have done the common things like watching TV and playing video games (in particular Wii Sports), but also some other things.

For exercise, I am still avoiding running on pavement due to my Achilles, and there aren’t really any good non-paved places around here to run that I’m aware of. Also, all the gyms are closed as part of the COVID-19 prevention measures. Thus, I settled on biking outside. After spending five months on the elliptical, being outside and moving again was a great feeling. The lakefront trail is closed since the number of people that used it posed a hazard for COVID-19, so I had to find alternate routes. My primary route so far has been the North Shore Channel Trail up to Skokie and Evanston, but I also recently found that going further east into Evanston and Wilmette (via Northwestern) is a good route. I biked downtown once too (via Broadway, which has bike lanes for much of its length) for some photos, to date the only time I’ve been downtown since I moved here.

I have not yet been on the L, which is a bummer given how much of a transit enthusiast I am. Still, I know that the public health of Chicago is more important than my hobbies, so I’m avoiding the L to give more room for the people who really need it. All my supply needs are within walking distance (Jewel-Osco for groceries and Target for most other things), and I haven’t quite started work yet, so there really isn’t much reason for me to be taking public transportation for now. Once things calm down some, I’ll continue my exploration of the city. I have no plans to leave Chicago, so I have plenty of time to do this.

One of my projects in the meantime was the Arrow programming language, available on GitHub, and I will make another post about Arrow and the story behind it soon.

In all, despite the craziness going on everywhere, I think I’m doing as well as I can right now. I can’t wait to start work, and in time as things settle down I’ll get to experience more of the city.


A ridiculous but maybe sort-of useful programming language?