Editor contribution timeline

I wanted to try to display a github-like timeline of editor contributions. For this purpose I'm going to use the calendar-heatmap javascript library, and of course I need to load all my edit history. Another candidate I tested is the cal-heatmap library which gives similar results.

Setup

In [2]:
%run startup.ipy
Last notebook update: 2018-07-14
Git repo: git@bitbucket.org:loujine/musicbrainz-dataviz.git
Importing libs
Defining database parameters

Defining *sql* helper function
Last database update: 2018-06-02

Python packages versions:
numpy       1.14.5
pandas      0.23.3
sqlalchemy  1.2.9
CPython 3.7.0b5
IPython 6.4.0

 Load edit history

Let's choose an editor randomly:

In [3]:
editor_name = 'loujin'

We can extract the number of edits made every day with:

In [105]:
edits_count = sql("""
SELECT date_trunc('day', edit.open_time) AS date,
       COUNT(*) AS cnt
  FROM edit
  JOIN editor ON editor.id = edit.editor
 WHERE editor.name = %(editor_name)s
GROUP BY date
ORDER BY cnt DESC
;
""", editor_name=editor_name)

We need to convert the date to seconds for the javascript library:

In [111]:
edits_count.index = edits_count.date.astype('int')*1e-6
In [112]:
edits_count.head()
Out[112]:
date cnt
date
1.356998e+12 2013-01-01 00:00:00+00:00 4845
1.475798e+12 2016-10-07 00:00:00+00:00 4423
1.464307e+12 2016-05-27 00:00:00+00:00 4178
1.444954e+12 2015-10-16 00:00:00+00:00 4049
1.452470e+12 2016-01-11 00:00:00+00:00 4032

Plot the timeline using calendar-heatmap

Now we can transmit the data to javascript by replacing the data in an HTML template:

In [132]:
# Transform into JSON
data = edits_count.cnt.to_json()
In [136]:
from jinja2 import Template
template = Template("""
<!doctype html>

<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>MusicBrainz Editor edits timeline</title>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.12.0/moment.min.js" charset="utf-8"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.17/d3.min.js" charset="utf-8"></script>
    <script src="https://cdn.rawgit.com/DKirwan/calendar-heatmap/337b4314/src/calendar-heatmap.js"></script>
    <link rel="stylesheet" href="https://cdn.rawgit.com/DKirwan/calendar-heatmap/337b4314/src/calendar-heatmap.css" />
    <style>
      svg.calendar-heatmap {
        box-sizing: initial;
        overflow: visible;
      }
    </style>
    </head>

  <body>
    <h1>{{ editor_name }}'s MusicBrainz edits timeline
    <div id="heatmap"></div>
    <script type="text/javascript">
      const data = Object.entries({{ data }}).map(
        function (t,cnt) {return {date: new Date(parseInt(t)),
                                  count: parseInt(cnt)}}
      );
      const range = (start, end) => [...Array(end - start + 1)].map((_, i) => start + i);
      for (let year of range(2012, 2018)) {
        document.getElementById('heatmap').insertAdjacentHTML(
          'beforeend', `
          <h3>${year}</h3>
          <div id="y${year}"></div>
        `);
        calendarHeatmap().data(data)
                         .startDate(new Date(year,1,1))
                         .selector(`#y${year}`)
                         .tooltipEnabled(true)
                         .colorRange(['#ffffff', '#ff0000'])
                         .onClick(function () {
                           console.log(data);
                         })();
      }
    </script>
  </body>
</html>
""")

with open('docs/editor_edits_timeline.html', 'w') as f:
    f.write(template.render(**globals()))

The result can be seen on this page

Editor meaningful stats

Number of artists created:

In [25]:
# lib/MusicBrainz/Server/Constants.pm
types = dict(
EDIT_ARTIST_CREATE = 1,
EDIT_LABEL_CREATE = 10,
EDIT_RELEASE_CREATE = 31,
EDIT_WORK_CREATE = 41,
EDIT_PLACE_CREATE = 61,
EDIT_SERIES_CREATE = 140,
EDIT_EVENT_CREATE = 150,
)
In [26]:
case = ' '.join(["WHEN edit.type={1} THEN '{0}'".format(*it) for it in types.items()])
edits_count = sql("""
SELECT CASE {} END, COUNT(*)
  FROM edit
  JOIN editor ON editor.id = edit.editor
 WHERE editor.name = %(editor_name)s
   AND edit.type IN ({})
 GROUP BY edit.type
 ORDER BY edit.type
;
""".format(case, ', '.join(str(t) for t in types.values())), case=case, editor_name=editor_name)
In [27]:
edits_count
Out[27]:
case count
0 EDIT_ARTIST_CREATE 343
1 EDIT_LABEL_CREATE 9
2 EDIT_RELEASE_CREATE 587
3 EDIT_WORK_CREATE 3375
4 EDIT_PLACE_CREATE 66
5 EDIT_SERIES_CREATE 18
6 EDIT_EVENT_CREATE 12