Introducing watcher

we watch the watches!

watcher is a data scraping and data visualiztion projects intended to provide insight into the watch grey market. In this post, I introdce v0.0.1 of watcher and do some light analysis in Julia of the dataset which powers it.
R
Python
Julia
Author
Affiliation

Data Analyst at CollegeVine

Published

November 29, 2023

Introduction

I’m excited to officially announce v0.0.1 of watcher, the watch-watching site 😄

watcher is a data visualization tool intended to provide insight into the trends of grey market watch prices. For those not familiar, grey market watch prices can be absolutely insane. At some point, I realized that I could probably turn all these private sales into data which could provide a snapshot of the grey watch market, and even potentially provide the basis of a dataset for predicting the prices of watches.

Right now, it’s pretty barebones…but I hope that in the future I can flesh it out more!

You can find watcher here.

How does it work?

watcher scrapes watch prices from grey market websites like forums and consolidates them into a single dataset. The dataset contains posts from both private sellers and authorized dealers and resellers.

On the technical side, watcher relies primarily on a small Python script deployed to ec2 to scrape data on a cron job. This data is pushed to s3, and then served via a Shiny app.

Right now, data is sourced primarily from a single forum in order to deliver an MVP. In the future, I hope to expand the data sources to places like Reddit, Chrono24, and more.

Some Analysis of the Dataset in Julia

Since we’re dealing with a new dataset, I thought this would also be a perfect opportunity to demonstrate a data analysis workflow in Julia that I’ve wanted to do for some time.

This is not meant to be an exhaustive analysis by any means, but more of a high-level walkthrough of some elements of the analysis that I’ve already done when I was preparing to build watcher.

Initial Setup

First, I’m going to import some packages that I’ll be using as well as define a few convenience functions. For the most part, any data analysis workflow I do in Julia is going to require at least DataFrames and DataFramesMeta.

For plotting, I’ve recently been adapting a combination of Makie and AlgebraOfGraphics in Julia. More on that later!

The convenience functions I’m defining here are used to connect to a database (in this case, a local postgres server) and to query that database using SQL.

Note

Note that I’m using Julia’s ENV to store credentials. I put these in my startup.jl file so that they persist through sessions and I’m never putting secrets in code!

Code
using AlgebraOfGraphics
using CairoMakie
using Colors
using DataFrames
using DataFramesMeta
using Dates
using Formatting
using LibPQ
using Statistics: mean, median

update_theme!(theme_minimal()) # set the plot theme

#=
generally, I'd put something like the following into an internal package.
    the gist here is that the following functions:
        - abstract away connecting to postgres
        - queries our database and returns that query as a dataframe.
=#
function connect_to_postgres()
    connection_string = [
        "dbname=postgres",
        "host=localhost",
        "user=$(ENV["DB_USER"])",
        "password=$(ENV["DB_PASSWORD"])",
    ]

    return LibPQ.Connection(join(connection_string, " "))
end

function query_db(conn, query)
    return DataFrame(LibPQ.execute(conn, query))
end

Here are those convenience functions in action, and a snippet of the DataFrame which is returned.

conn = connect_to_postgres();

watches = query_db(
    conn,
    "
    SELECT
        *
    FROM watcher.raw_watcher_posts
    "
);

first((@select watches :title :currency :price :post_date :num_replies), 5)
5×5 DataFrame
Row title currency price post_date num_replies
String? String? Float64? DateTime? Int32?
1 Marathon Navigator X J Crew Limited Edition USD 225.0 2023-11-12T22:41:39 0
2 70s Squale Master 1000m Orange/Black Bezel for Tissier USD 1800.0 2023-11-12T20:55:50 0
3 FS Only - Squale 1521 50 Atmos Blue Dial Polished Case ~ $695 shipped USD 695.0 2023-11-12T18:00:32 1
4 SM x Watch Dives Milan Blue USD 150.0 2023-11-12T16:19:38 0
5 FS:Casio Protrek 1300YJ USD 159.0 2023-11-12T15:37:25 0

As part of our initial setup, I’d also like to define some approximate currency conversions to USD in order to keep prices consistent in the analysis. We’ll create a dictionary of these and use that dictionary to easily convert the price with a DataFrames macro.

currency_conversions = Dict(
    "USD" => 1,
    "CAD" => .80,
    "EUR" => 1.15,
    "GBP" => 1.35,
    "AUD" => .66
)

#=
note that this function mutates the original dataframe without
    the need to assign a new object.
=#
@transform! watches @byrow :value_in_usd = currency_conversions[:currency] * :price;

first((@select watches :title :currency :price :value_in_usd), 5)
5×4 DataFrame
Row title currency price value_in_usd
String? String? Float64? Float64
1 Marathon Navigator X J Crew Limited Edition USD 225.0 225.0
2 70s Squale Master 1000m Orange/Black Bezel for Tissier USD 1800.0 1800.0
3 FS Only - Squale 1521 50 Atmos Blue Dial Polished Case ~ $695 shipped USD 695.0 695.0
4 SM x Watch Dives Milan Blue USD 150.0 150.0
5 FS:Casio Protrek 1300YJ USD 159.0 159.0

Great, now we’re ready to actually do some analysis!

Wrapping Up

Thanks for reading through my analysis, or thanks for skimming through my website and skipping to the end! Regardless, I hope that this was an interesting peak into my newest project and a data analysis workflow in Julia!

Footnotes

  1. At the moment, I’m not entirely sure what the future of plotting in Julia looks like–but I am pretty confident that it’ll be Makie, and for those of us who are coming from R it’ll probably involve AlgebraOfGraphics as well. See this Julia discourse post for more details.↩︎