We Need More Silly Projects in Julia

Lessons from Developing and Presenting WMATA.jl

A few years ago, I developed a simple Julia package in an effort to become more familiar with the language. It illustrated to me that we need more projects like this in the Julia community: projects which are simple, easy to follow, and provide a baseline for beginners. In this post, I make the case for doing simple, practical projects to help grow the Julia community.

Category: Julia

Author affiliation: Data Analyst at CollegeVine

Published: April 29, 2023

Introduction

I make no secret of the fact that I’m a big fan of the Julia language, and I’ve been keeping an eye on its development since I first came across it in college. There is a lot to like about Julia in my opinion. For example:

  • it has a nice, high-level syntax that is pleasant to code in
  • it has some really powerful design choices like multiple dispatch, metaprogramming, and a more functional style of programming
  • it has a pretty good package manager
  • it’s fast (most of the time)
  • it’s open source!

In the years since Julia first emerged, it has gained a pretty loyal following in the scientific and numeric programming community, with the SciML ecosystem providing packages for everything from solving differential equations to modeling chemical reaction networks.

Though it’s awesome that Julia is finding a niche in scientific/numeric computing, I would argue that its occupation of this niche is in some ways a detriment to the growth of the language. Anecdotally, I rarely see a project done in Julia that doesn’t require relatively in-depth domain knowledge of the topic in question. That’s not to say there aren’t more general-purpose Julia packages. There are! But overall, there’s little reason to turn to Julia for general-purpose problems when other languages are more popular, better documented, and better supported. And the areas of Julia that are well documented, like SciML, feel like a very advanced place to start using the language.

In fact, according to the 2022 Julia User & Developer Survey, the top 5 non-technical problems with Julia were: [1]

  1. my colleagues, company, or collaborators use other languages
  2. there are not enough Julia users in my field or industry
  3. there are not enough Julia users in general
  4. insufficient documentation
  5. online tutorials and documentation that are outdated

All of these are pain points I have felt when working with Julia–but in particular I am always disheartened by the lack of tutorials or examples for things I would consider to be relatively simple. Frequently, I find myself comparing it to the first language I learned: R.

R does have the advantage of having been around a long time, and will celebrate its 30th birthday this year! R also largely has Hadley Wickham to thank for invigorating interest in the language outside of academia through the development and widespread adoption of the tidyverse. But R’s primary advantage in my comparison is its incredibly active community.

Think of any niche or area of interest in data analytics, and I’d bet you can find an R sub-community for it. These communities host meetups, post tutorials and projects, and write and share code for these problems. #tidytuesday always has an abundance of awesome code, visuals, and more! And there are tons of resources online for learning R, with more emerging every day.

If the Julia community is going to grow beyond its current scale, I would argue that it needs more people writing code and solving problems at all levels, but especially simple ones.

In this post, I’m going to tell a story about a simple package I wrote called WMATA.jl, and how I witnessed firsthand how something so simple could invigorate interest and discussion in Julia.

WMATA.jl

My package, WMATA.jl, is an opinionated wrapper to WMATA’s publicly available API which returns simple dataframes of the API response for things like train arrivals, station information, and more.

I won’t go into an in-depth tutorial here, but I will provide some code and explain how I broke this down into simple steps.

How it Started

A few years ago, WMATA’s system was a bit wonky due to derailment issues. As a result, all the apps which tracked train arrivals and departures suddenly broke as well. This raised some questions for me, like: how do these apps get their data? Can I get this data easily? Is this how people build their own arrival/departure boards?

After playing around with the API response in Python, I thought it would be a good project to tackle in Julia for fun. After all - at the time I was still a novice in Julia and was looking for real-world and practical projects to try it out on that weren’t linear optimization or differential equations.

Building it Out in Julia

The package started out as a simple question: how can I return a dataframe of train arrivals, like the table you see on the metro signs at an actual station?

[Image caption: hopefully the one on the right, and not the left…]

As it turns out, this is a pretty trivial problem if you understand how API requests work. The steps look something like this.

How do I use the API?

I obtained an API key from WMATA’s developer portal, then I stored this as an environment variable in Julia.
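
As a rough sketch of that setup step, a small helper can read the key and fail early with a readable message when it is missing. The function name `wmata_key` is my own invention for illustration and is not part of WMATA.jl; the only thing taken from this post is the `WMATA_KEY` variable name.

```julia
# Hypothetical helper (not part of WMATA.jl): read the API key from the
# WMATA_KEY environment variable and fail early if it is not set.
function wmata_key()
    key = get(ENV, "WMATA_KEY", "")
    isempty(key) && error("Set the WMATA_KEY environment variable to your WMATA API key")
    return key
end
```

For a quick experiment, the variable can also be set for the current session with `ENV["WMATA_KEY"] = "your-key-here"` before calling `wmata_key()`.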

In Julia, there is the HTTP package which is similar to Python’s requests or R’s httr. We can test if the API response is working using the validation endpoint which WMATA provides. If the status code of the response is 200, then your key works. Otherwise…there’s something wrong!

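using HTTP

validation_url = "https://api.wmata.com/Misc/Validate"
subscription_key = ENV["WMATA_KEY"] # the API key stored as an environment variable

# the third positional argument holds the request headers, which carry the key
r = HTTP.request(
    "GET",
    validation_url,
    Dict("api_key" => subscription_key)
);

print(r.status)
200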

Great - the key works. And generally, this setup is how all the requests for this API look, which makes writing code for it a lot easier.
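
Since every request has the same shape, the shared parts can be pulled into one place. The snippet below is a minimal sketch of that idea, not the code WMATA.jl actually uses: `WMATA_BASE_URL`, `wmata_url`, and `wmata_get` are names I made up for illustration. It also assumes you would rather check the status yourself, so it passes `status_exception = false`; by default HTTP.jl throws an error on non-2xx responses.

```julia
using HTTP

# Assumed base URL, taken from the endpoints used in this post.
const WMATA_BASE_URL = "https://api.wmata.com"

# Join the base URL and an endpoint path, tolerating a leading slash.
wmata_url(path::AbstractString) = string(WMATA_BASE_URL, "/", lstrip(path, '/'))

# Hypothetical helper: send an authenticated GET request and return the response.
function wmata_get(path::AbstractString; key::AbstractString = ENV["WMATA_KEY"])
    # status_exception = false returns non-2xx responses instead of throwing,
    # so the status can be checked explicitly below
    r = HTTP.request("GET", wmata_url(path), ["api_key" => key]; status_exception = false)
    r.status == 200 || error("WMATA request to $path failed with status $(r.status)")
    return r
end
```

With something like this in place, the validation check becomes `wmata_get("Misc/Validate")`, and each new endpoint only needs its own path.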

How do I get the train arrivals back for my station?

The next step was to determine how I can hit the API and return the train arrival information; I was interested in my station in particular. At the time, that station was Eisenhower Avenue…but there’s some construction happening now that probably won’t make that response very interesting 😉 so let’s use Silver Spring as an example.

We can get the rail predictions (that is, the estimated arrival times of trains at a station) from the rail predictions endpoint for any given station code. Silver Spring’s code is B08.

station_code = "B08"
rail_prediction_url = "https://api.wmata.com/StationPrediction.svc/json/GetPrediction/$station_code"

# reuse the key read from ENV earlier rather than hardcoding it
r = HTTP.request(
    "GET",
    rail_prediction_url,
    Dict("api_key" => subscription_key)
);

print(r.status)
200

Now, we have to parse the body of the response. This is returned in a JSON format.

using JSON3
trains = JSON3.read(r.body)["Trains"] # each train is contained in this.
4-element JSON3.Array{JSON3.Object, Vector{UInt8}, SubArray{UInt64, 1, Vector{UInt64}, Tuple{UnitRange{Int64}}, true}}:
 {
               "Car": "8",
       "Destination": "Shady Gr",
   "DestinationCode": "A15",
   "DestinationName": "Shady Grove",
             "Group": "2",
              "Line": "RD",
      "LocationCode": "B08",
      "LocationName": "Silver Spring",
               "Min": "7"
}
 {
               "Car": "8",
       "Destination": "Glenmont",
   "DestinationCode": "B11",
   "DestinationName": "Glenmont",
             "Group": "1",
              "Line": "RD",
      "LocationCode": "B08",
      "LocationName": "Silver Spring",
               "Min": "12"
}
 {
               "Car": "6",
       "Destination": "Glenmont",
   "DestinationCode": "B11",
   "DestinationName": "Glenmont",
             "Group": "1",
              "Line": "RD",
      "LocationCode": "B08",
      "LocationName": "Silver Spring",
               "Min": "24"
}
 {
               "Car": "-",
       "Destination": "Shady Gr",
   "DestinationCode": "A15",
   "DestinationName": "Shady Grove",
             "Group": "2",
              "Line": "RD",
      "LocationCode": "B08",
      "LocationName": "Silver Spring",
               "Min": "28"
}

Now, we can parse this to a DataFrame. Here’s how that looks:

using DataFrames
rail_predictions = DataFrame(trains)
4×9 DataFrame
 Row │ Car     Destination  DestinationCode  DestinationName  Group   Line    LocationCode  LocationName   Min
     │ String  String       String           String           String  String  String        String         String
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 8       Shady Gr     A15              Shady Grove      2       RD      B08           Silver Spring  7
   2 │ 8       Glenmont     B11              Glenmont         1       RD      B08           Silver Spring  12
   3 │ 6       Glenmont     B11              Glenmont         1       RD      B08           Silver Spring  24
   4 │ -       Shady Gr     A15              Shady Grove      2       RD      B08           Silver Spring  28

Note: This is a lot simpler than my original code 😄 I was generally inexperienced with Julia at the time and this package is definitely due for a re-write.
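
One thing the DataFrame output above makes visible is that every column comes back as a String, including Min. If you wanted to sort or filter by arrival time, a small conversion step could help. This is only a sketch: `parse_minutes` is a hypothetical helper, and I am assuming that Min can sometimes hold non-numeric markers (such as "ARR" or "BRD"), which is not shown in the response above.

```julia
# Hypothetical helper: convert the API's "Min" strings to integers.
# Anything that is not a plain integer (assumed markers like "ARR" or "BRD",
# a dash, or an empty string) becomes `missing`.
function parse_minutes(min_str::AbstractString)
    parsed = tryparse(Int, strip(min_str))
    return parsed === nothing ? missing : parsed
end

# Broadcasting over a column of strings gives a column of Union{Missing, Int}
minutes = parse_minutes.(["7", "12", "ARR", "-"])
```

On the DataFrame itself, the same idea would be `rail_predictions.Min = parse_minutes.(rail_predictions.Min)`.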

How do I make this easier, and more repeatable?

My solution here was to package my code up, using functions and documentation to make accessing WMATA’s API from Julia painless.

Basically, I tried to write functions that were clear and simple to use. In the case of getting rail predictions, as you might guess, I wrote a function called get_rail_predictions(), which accepts a station code or name and returns a dataframe similar to the one above for a given station. All of the request code is abstracted away from the user.

Using the code from above (which again, is a lot simpler than my original code…), that function would look something like this:

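function get_rail_predictions(; station_code = "All", station_name = "")
    # station_name is accepted here but not used in this simplified version
    r = HTTP.request(
        "GET",
        "https://api.wmata.com/StationPrediction.svc/json/GetPrediction/$station_code",
        Dict("api_key" => ENV["WMATA_KEY"]) # read the key from the environment
    )

    # each train prediction lives under the "Trains" key of the response body
    return DataFrame(JSON3.read(r.body)["Trains"])
end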

get_rail_predictions(station_code = "C13")
6×9 DataFrame
 Row │ Car     Destination   DestinationCode  DestinationName  Group   Line    LocationCode  LocationName      Min
     │ String  String        Union…           String           String  String  String        String            String
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 8       Franconia                      Franconia        2       BL      C13           King St-Old Town  1
   2 │ 8       N Carrollton                   N Carrollton     1       BL      C13           King St-Old Town  2
   3 │ 6       Largo         G05              Downtown Largo   1       BL      C13           King St-Old Town  6
   4 │ 8       Huntington                     Huntington       2       BL      C13           King St-Old Town  7
   5 │ -       N Carrollton                   N Carrollton     1       BL      C13           King St-Old Town  17
   6 │ 8       Franconia                      Franconia        2       BL      C13           King St-Old Town  17

When using the package, the user would only have to do this to get the same result:

using WMATA 
WMATA_auth("API-KEY")

get_rail_predictions(StationName = "King St-Old Town")
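
The package call above takes a station name, while the endpoint itself wants a station code, so somewhere a name has to be translated into a code. Here is one way that lookup could be sketched. It is not how WMATA.jl does it: `STATION_CODES` and `station_code_for` are hypothetical names, and the table only holds the two stations whose codes appear in the outputs in this post. A real version would more likely build the table from the API's station list.

```julia
# Hypothetical lookup table; both entries come from the outputs shown in this post.
const STATION_CODES = Dict(
    "Silver Spring" => "B08",
    "King St-Old Town" => "C13",
)

# Resolve a station name to its code, ignoring case and surrounding whitespace.
function station_code_for(name::AbstractString)
    wanted = lowercase(strip(name))
    for (station, code) in STATION_CODES
        lowercase(station) == wanted && return code
    end
    error("Unknown station name: $name")
end
```

The result can then be passed along as the station code, for example `get_rail_predictions(station_code = station_code_for("King St-Old Town"))` with the simplified function from earlier.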

I continued to make similar functions for other aspects of WMATA’s API, like station opening/closing times, station lists, etc. As I went along, I tried to write documentation for how to use these functions on GitHub and improve them where possible.

Ultimately I saw the development of the package as a learning exercise for writing Julia code, developing and maintaining a package, and writing documentation for that package. If someone found my package and improved on it or even just used it as a baseline for their own project, I saw that as a win.

Presenting on WMATA.jl

In April of last year, I received an invitation to present at a Metro Hack Night - an event put on by the Transportation Techies meetup group. The organizer found my project and asked if I wanted to give a short presentation on it–an invitation which I happily accepted!

Compared to many of the other projects, mine was very simple since it was just an API wrapper. The code was not super complex, nor was the problem that it was solving.

Despite the simplicity of the project, it initiated some great discussions about my use of Julia, and many of the attendees I talked to expressed an interest in applying Julia to problems they were working on. Many were excited to see Julia used at all, simply because it was different from the other projects, which used Python or R.

Lessons Learned

Ultimately, I think the discussions I had following my presentation illustrate a hypothesis that I’ve had for a while: when people see Julia used in a more general way, tackling practical problems you would see day-to-day in a data role, they become more interested in using the language themselves. I’ve made it a goal of mine to try and provide practical examples of using the language, and I hope to make more posts in the future using Julia.

And this is where the broader Julia community can help out! Your projects don’t need to be groundbreaking to make an impact. Simple projects and examples give beginners a baseline for using the language to solve day-to-day problems, like parsing the response of an API into a DataFrame.

So, let’s get out there and make more silly Julia projects!

Footnotes

  1. Beyond the top 5, other concerns were about not knowing how to do things and a lack of teaching and learning resources available online.