Lessons from Developing and Presenting WMATA.jl
April 29, 2023
I make no secret of the fact that I’m a big fan of the Julia language, and I’ve been keeping an eye on its development since I first came across it in college. There is a lot to like about Julia in my opinion. For example:
In the years since Julia first emerged, it has gained a loyal following in the scientific and numeric programming community, with the SciML ecosystem providing packages for everything from solving differential equations to modeling chemical reaction networks.
Though it’s awesome that Julia is finding a niche in scientific/numeric computing, I would argue that its occupation of this niche is in some ways a detriment to the growth of the language. Anecdotally, it is very rare that I see a project done in Julia which doesn’t require relatively in-depth domain knowledge of the topic in question. That’s not to say that there aren’t more general-purpose Julia packages; there are! But overall, there’s little reason to turn to Julia to solve these problems when you can use other languages that are more popular, better documented, and better supported. And the areas of Julia that are well documented, like SciML, feel like a very advanced place to start using the language.
In fact, according to the 2022 Julia User Developer Survey, the top 5 non-technical problems in Julia were:[1]
All of these are pain points I have felt when working with Julia–but in particular I am always disheartened by the lack of tutorials or examples for things I would consider to be relatively simple. Frequently, I find myself comparing it to the first language I learned: R.
R does have the advantage of having been around a long time, and will celebrate its 30th birthday this year! R also largely has Hadley Wickham to thank for invigorating interest in the language outside of academia through the development and widespread adoption of the tidyverse. But R’s primary advantage in my comparison is its incredibly active community.
Think of every niche or area of interest in data analytics you can and I’d bet that you can find an R sub-community for it. These communities host meetups, post tutorials and projects, and write and share code for these problems. #tidytuesday always has an abundance of awesome code, visuals, and more! And there are tons of resources online for learning R, with more emerging every day.
If the Julia community is going to grow beyond its current scale, I would argue that it needs more people writing code and solving problems at all levels, but primarily simple ones.
In this post, I’m going to tell a story about a simple package I wrote called WMATA.jl, and how I witnessed firsthand how something so simple could invigorate interest and discussion in Julia.
My package, WMATA.jl, is an opinionated wrapper around WMATA’s publicly available API which returns simple dataframes of the API response for things like train arrivals, station information, and more.
I won’t go into an in-depth tutorial here, but I will provide some code and explain how I broke this down into simple steps.
A few years ago, WMATA’s system was a bit wonky due to derailment issues. As a result, all the apps which tracked train arrivals and departures suddenly broke as well. This raised some questions for me, like: how do these apps get their data? Can I get this data easily? Is this how people build their own arrival/departure boards?
After playing around with the API response in Python, I thought it would be a good project to tackle in Julia for fun. After all - at the time I was still a novice in Julia and was looking for real-world and practical projects to try it out on that weren’t linear optimization or differential equations.
The package started out as a simple question: how can I return a dataframe of train arrivals, like the table you see on the metro signs at an actual station?
As it turns out, this is a pretty trivial problem if you understand how API requests work. The steps look something like this.
I obtained an API key from WMATA’s developer portal, then I stored this as an environment variable in Julia.
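As a rough illustration of this step, here is a minimal sketch of reading the key back out of the environment with a clear error if it is missing. The helper name `load_wmata_key` is hypothetical (it is not part of WMATA.jl), and the demonstration uses a stand-in dictionary rather than a real key.

```julia
# Hypothetical helper: read the key from the environment, failing with a
# clear message if it is missing. "WMATA_KEY" is the variable name used in
# this post; any name would work.
function load_wmata_key(env = ENV)
    key = get(env, "WMATA_KEY", "")
    isempty(key) && error("WMATA_KEY is not set; add it to your environment first")
    return key
end

# Demonstration with a stand-in dictionary instead of the real environment
demo_env = Dict("WMATA_KEY" => "not-a-real-key")
println(load_wmata_key(demo_env))
```

Keeping the key in the environment means it never has to appear in the source code itself.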
In Julia, there is the HTTP package, which is similar to Python’s requests or R’s httr. We can test if the API response is working using the validation endpoint which WMATA provides. If the status code of the response is 200, then your key works. Otherwise…there’s something wrong!
using HTTP

validation_url = "https://api.wmata.com/Misc/Validate"
subscription_key = ENV["WMATA_KEY"]

# The third positional argument is the request headers, so the key is sent as an `api_key` header
r = HTTP.request(
    "GET",
    validation_url,
    Dict("api_key" => subscription_key)
);
print(r.status)
200
Great - the key works. And generally, this setup is how all the requests for this API look, which makes writing code for it a lot easier.
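Because every request shares the same shape, the common pieces can be pulled into one place. The following is a minimal sketch under that assumption, not the package's actual code: `wmata_request_parts` and `WMATA_BASE_URL` are hypothetical names, and the helper only builds the URL and header dictionary so that it runs without a network connection or a real key.

```julia
# Hypothetical helper (not part of WMATA.jl): builds the URL and headers
# that every request in this post shares.
const WMATA_BASE_URL = "https://api.wmata.com"

function wmata_request_parts(path::AbstractString, subscription_key::AbstractString)
    # Join the base URL and endpoint path with exactly one slash between them
    url = string(WMATA_BASE_URL, "/", lstrip(path, '/'))
    headers = Dict("api_key" => subscription_key)
    return url, headers
end

url, headers = wmata_request_parts("Misc/Validate", "not-a-real-key")
println(url)
```

The returned pair could then be passed along as `HTTP.request("GET", url, headers)`, matching the call shown earlier.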
The next step was to determine how I can hit the API and return the train arrival information; I was interested in my station in particular. At the time, that station was Eisenhower Avenue…but there’s some construction happening now that probably won’t make that response very interesting 😉 so let’s use Silver Spring as an example.
We can get the rail predictions (that is, the estimated arrival times of trains at a station) from the rail predictions endpoint for any given station code. Silver Spring’s code is B08.
station_code = "B08"
rail_prediction_url = "https://api.wmata.com/StationPrediction.svc/json/GetPrediction/$station_code"

r = HTTP.request(
    "GET",
    rail_prediction_url,
    Dict("api_key" => subscription_key)  # reuse the key loaded from ENV above
);
print(r.status)
200
Now, we have to parse the body of the response. This is returned in a JSON format.
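A minimal sketch of that parsing step, assuming the JSON3 package (whose types appear in the output that follows). To keep it runnable without a key, `sample_body` is a hard-coded stand-in for the response body, holding a single record with the same fields as the real response.

```julia
using JSON3

# Hard-coded stand-in for the response body (an assumption for illustration);
# in a real session this would be r.body from the request above.
sample_body = """
{"Trains": [
  {"Car": "8", "Destination": "Shady Gr", "DestinationCode": "A15",
   "DestinationName": "Shady Grove", "Group": "2", "Line": "RD",
   "LocationCode": "B08", "LocationName": "Silver Spring", "Min": "7"}
]}
"""

parsed = JSON3.read(sample_body)   # top-level JSON object
trains = parsed.Trains             # array of train records
println(length(trains))
println(trains[1].DestinationName)
```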
4-element JSON3.Array{JSON3.Object, Vector{UInt8}, SubArray{UInt64, 1, Vector{UInt64}, Tuple{UnitRange{Int64}}, true}}:
{
"Car": "8",
"Destination": "Shady Gr",
"DestinationCode": "A15",
"DestinationName": "Shady Grove",
"Group": "2",
"Line": "RD",
"LocationCode": "B08",
"LocationName": "Silver Spring",
"Min": "7"
}
{
"Car": "8",
"Destination": "Glenmont",
"DestinationCode": "B11",
"DestinationName": "Glenmont",
"Group": "1",
"Line": "RD",
"LocationCode": "B08",
"LocationName": "Silver Spring",
"Min": "12"
}
{
"Car": "6",
"Destination": "Glenmont",
"DestinationCode": "B11",
"DestinationName": "Glenmont",
"Group": "1",
"Line": "RD",
"LocationCode": "B08",
"LocationName": "Silver Spring",
"Min": "24"
}
{
"Car": "-",
"Destination": "Shady Gr",
"DestinationCode": "A15",
"DestinationName": "Shady Grove",
"Group": "2",
"Line": "RD",
"LocationCode": "B08",
"LocationName": "Silver Spring",
"Min": "28"
}
Now, we can parse this to a DataFrame. Here’s how that looks:
Row | Car | Destination | DestinationCode | DestinationName | Group | Line | LocationCode | LocationName | Min |
---|---|---|---|---|---|---|---|---|---|
 | String | String | String | String | String | String | String | String | String |
1 | 8 | Shady Gr | A15 | Shady Grove | 2 | RD | B08 | Silver Spring | 7 |
2 | 8 | Glenmont | B11 | Glenmont | 1 | RD | B08 | Silver Spring | 12 |
3 | 6 | Glenmont | B11 | Glenmont | 1 | RD | B08 | Silver Spring | 24 |
4 | - | Shady Gr | A15 | Shady Grove | 2 | RD | B08 | Silver Spring | 28 |
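For reference, a conversion like this can be sketched as follows, assuming the JSON3 and DataFrames packages. The sample body is hard-coded (two trimmed-down records) so the example runs offline, and building explicit NamedTuples is just one of several ways to do it; a more compact form appears in the function later in this post.

```julia
using JSON3, DataFrames

# Hard-coded stand-in for the response body (an assumption for illustration)
sample_body = """
{"Trains": [
  {"Line": "RD", "DestinationName": "Shady Grove", "Min": "7"},
  {"Line": "RD", "DestinationName": "Glenmont", "Min": "12"}
]}
"""

trains = JSON3.read(sample_body).Trains

# Build one NamedTuple per train so the column names and order are explicit
rows = [(Line = t.Line, Destination = t.DestinationName, Min = t.Min) for t in trains]
df = DataFrame(rows)
println(df)
```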
This is a lot simpler than my original code 😄 I was generally inexperienced with Julia at the time and this package is definitely due for a re-write.
My solution here was to package my code up, using functions and documentation to make accessing WMATA’s API from Julia painless.
Basically, I tried to write functions that were clear and simple to use. In the case of getting rail predictions, as you might guess, I wrote a function called get_rail_predictions(), which accepted a Station Code or Name and returned a dataframe similar to the one above for a given station. All of the request code is abstracted away from the user.
Using the code from above (which again, is a lot simpler than my original code…), that function would look something like this:
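using HTTP, JSON3, DataFrames

# station_name is kept in the signature but is not used in this simplified version
function get_rail_predictions(; station_code = "All", station_name = "")
    r = HTTP.request(
        "GET",
        "https://api.wmata.com/StationPrediction.svc/json/GetPrediction/$station_code",
        Dict("api_key" => ENV["WMATA_KEY"])
    )
    # Parse the JSON body and convert its "Trains" array to a DataFrame
    return DataFrame(JSON3.read(r.body)["Trains"])
end

get_rail_predictions(station_code = "C13")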
Row | Car | Destination | DestinationCode | DestinationName | Group | Line | LocationCode | LocationName | Min |
---|---|---|---|---|---|---|---|---|---|
 | String | String | Union… | String | String | String | String | String | String |
1 | 8 | Franconia | | Franconia | 2 | BL | C13 | King St-Old Town | 1 |
2 | 8 | N Carrollton | | N Carrollton | 1 | BL | C13 | King St-Old Town | 2 |
3 | 6 | Largo | G05 | Downtown Largo | 1 | BL | C13 | King St-Old Town | 6 |
4 | 8 | Huntington | | Huntington | 2 | BL | C13 | King St-Old Town | 7 |
5 | - | N Carrollton | | N Carrollton | 1 | BL | C13 | King St-Old Town | 17 |
6 | 8 | Franconia | | Franconia | 2 | BL | C13 | King St-Old Town | 17 |
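The simplified function above ignores its `station_name` argument. One way a name could be resolved to a code is a plain lookup table; the sketch below is an assumption for illustration, not how WMATA.jl does it. `STATION_CODES` and `resolve_station_code` are hypothetical names, and the table holds only the two stations whose codes appear in this post.

```julia
# Hypothetical lookup table; only the two stations whose codes appear in this post
const STATION_CODES = Dict(
    "Silver Spring" => "B08",
    "King St-Old Town" => "C13",
)

# Hypothetical helper: prefer an explicit name, otherwise fall back to the code
function resolve_station_code(; station_code = "All", station_name = "")
    isempty(station_name) && return station_code
    haskey(STATION_CODES, station_name) ||
        throw(ArgumentError("unknown station name: $station_name"))
    return STATION_CODES[station_name]
end

println(resolve_station_code(station_name = "King St-Old Town"))
```

The resolved code could then be interpolated into the request URL exactly as `station_code` is in the function above.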
When using the package, the user only has to call get_rail_predictions() with a station code or name to get the same result.
I continued to make similar functions for other aspects of WMATA’s API, like station opening/closing times, station lists, etc. As I went along, I tried to write documentation for how to use these functions on GitHub and improve them where possible.
Ultimately I saw the development of the package as a learning exercise for writing Julia code, developing and maintaining a package, and writing documentation for that package. If someone found my package and improved on it or even just used it as a baseline for their own project, I saw that as a win.
In April of last year, I received an invitation to present at a Metro Hack Night - an event put on by the Transportation Techies meetup group. The organizer found my project and asked if I wanted to give a short presentation on it–an invitation which I happily accepted!
Compared to many of the other projects, mine was very simple since it was just an API wrapper. The code was not super complex, nor was the problem that it was solving.
Despite the simplicity of the project, it initiated some great discussions about my use of Julia–and many of the attendees that I talked to expressed an interest in applying Julia to problems they were working on. Generally, many were excited to see that I had used Julia solely because it was different than the other projects which utilized Python or R.
Ultimately, I think the discussions I had following my presentation illustrate a hypothesis that I’ve had for a while: when people see Julia used in a more general way, tackling practical problems you would see day-to-day in a data role, they become more interested in using the language themselves. I’ve made it a goal of mine to try and provide practical examples of using the language, and I hope to make more posts in the future using Julia.
And this is where the broader Julia community can help out! Your projects don’t need to be groundbreaking to make an impact–simple projects and examples provide a baseline to beginners to get started just using the language to solve day-to-day problems like parsing the response of an API into a DataFrame.
So, let’s get out there and make more silly Julia projects!
[1] Beyond the top 5, other concerns were about not knowing how to do things and a lack of teaching and learning resources available online.