Advent of Code 2018

A Sunday morning with the Advent of Code

I happened to stumble upon the very interesting site:; where there are some challenges posted every day until Christmas. It felt more adventurous than CodeForces, SPOJ or CodeChef – so I took to it.

Day 01

(UPDATE: After authoring this post, I also came across, which offers a new and interesting perspective. It is worth a read.) There are two parts to day 01, and solving each will give you exactly 1 gold star. Obtaining the first gold star is perhaps too trivial to write about. The second part is a classic data structures 101 problem, which I enjoyed. The problem is well described in the site. In summary, you have a stream of numbers, which you have to continuously accumulate until you find a duplicated state.

My first attempt was to brute force it, something like this:

def find_duplicate():
    register, counter, memory = 0, 0, []
    while 1:
        print("epoch: %s" % counter)
        counter = counter + 1
        for freq in freqs:
            register = register + freq
            if register in memory:
                return register

but that took a significant amount of time… more than 2 minutes, on my MacBook Air 2015.

python  133.25s user 0.51s system 99% cpu 2:14.39 total

Running a profiler confirms that the time hog is the step where we check for item existence in list.

Total time: 16.9317 s
Function: find_duplicate at line 3

Line #      Hits         Time  Per Hit   % Time  Line Contents
     3                                           @profile
     4                                           def find_duplicate():
     5         1        262.0    262.0      0.0      register, counter, memory = 0, 0, []
     6         1          2.0      2.0      0.0      while 1:
     7        51       2223.0     43.6      0.0          print("epoch: %s" % counter)
     8        51         51.0      1.0      0.0          counter = counter + 1
     9     52111      26479.0      0.5      0.2          for freq in freqs:
    10     52061      29430.0      0.6      0.2              register = register + freq
    11     52061   16822038.0    323.1     99.4              if register in memory:
    12                                                           return register
    13     52060      51251.0      1.0      0.3              memory.append(register)

From the CPython implementation of in (, we know that it is an O(n) operation. Below is the list_contains function:

static int
list_contains(PyListObject *a, PyObject *el)
    Py_ssize_t i;
    int cmp;

    for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
        cmp = PyObject_RichCompareBool(el, PyList_GET_ITEM(a, i),
    return cmp;

Bloom Filters

We can do much better… with probabilistic data structures. We could try this problem with a Bloom Filter. For the sake of it, I also switched the language to Golang.

I’m using an already written implementation of Bloom Filters in Golang that I Googled and found. It is here: Below is the core of the implementation.

func duplicatedFrequency(frequencies []string) string {
    var register int
    var registerStr string
    bf := bbloom.New(float64(1<<18), float64(0.001))

    for {
        for _, freq := range frequencies {
            val, _ := strconv.Atoi(freq)
            register = register + val
            registerStr = strconv.Itoa(register)

            if bf.Has([]byte(registerStr)) {
                return registerStr


This runs almost 400x faster than the brute force method:

go run 01.go  0.32s user 0.16s system 94% cpu 0.511 total

The full gist can be found here… and now off to a nice Sunday breakfast.

Share article

Also recommended

Styling And Building Custom Maps
Notes on building a data-styled map with TileMill

In this post, I will assume that you are familiar with shapefiles. First, you’ll need to install TileMill (the now …

Read now ▶
Parsing Python Abstract Syntax Trees
Featuring my talk at Pycon India 2016

From the below file, we’ll extract all the function calls as a list. # import foo data = …

Read now ▶