Deconstructing the Gävle Goat

The Gävle goat was disassembled in late 2014. Long live the old goat. Photo borrowed from Sverigesradio.se.

In Sweden we have this weird Yule goat made of straw that is erected each year just before the 1st of December for people to look at. In the late nineties they even put up a web cam, so people could stare at it via the internet. It is fascinating ;-)

This past December I figured I would save the images from the web cam and do something with them. I thought about it, and then I almost forgot about the goat. Recently, however, I got inspired to finish this project, so I did. This article is about how I deconstructed the Gävle Goat like I had nothing better to do with my life.

I believe the reason it is so popular is that, ever since it was first erected in 1966, it has been a victim of arson. You could say that people have a burning desire to set it on fire. The goat has been destroyed a total of 27 times over the years, so you may say it is tradition. Read more about it on Wikipedia.

How can we sleep while the goat is burning.

This December I played around with Python and I thought I could use it to store the web cam images of the goat. I expected it to burn to the ground in no time at all.

The setup

First of all, it is not trivial to save the images from the Gävle webpage. The site runs JavaScript code that updates the image every five seconds. The filename changes each time, and it seems to me to be impossible to predict the filename of the next web cam image. I solved it by using a Python package called Selenium.
This allows me to save a screenshot of the page, as rendered in a web browser, to a file. That is all I need. Sometimes the script would stop and I had to restart it. This was probably because of bandwidth limits on the server or something similar. I added a “refresh” to my script, which seemed to help. The Selenium package saves to .png, but the web cam has already compressed the image to .jpeg, so the lossless .png is wasted on an already degraded image. The month of December passed and nobody dared to set the goat of straw ablaze, at least not now that I was keeping a watchful eye on it ;-)
I now had 381 248 images (25 GB) of the goat. OK, what am I supposed to do with this data?
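For reference, the capture loop described above could look roughly like the Python sketch below. It is not the original script: the function name, the refresh interval and the .png file naming are assumptions made for illustration. The driver is passed in, so any object with Selenium's `get`, `refresh` and `save_screenshot` methods will do.

```python
import time
from pathlib import Path


def capture_loop(driver, url, out_dir, n_frames,
                 interval_s=5.0, refresh_every=100, sleep=time.sleep):
    """Save n_frames browser screenshots as Bocken_<i>.png and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    driver.get(url)
    saved = []
    for i in range(1, n_frames + 1):
        if i % refresh_every == 0:
            driver.refresh()  # periodic page reload (assumed interval)
        path = out_dir / f"Bocken_{i}.png"
        driver.save_screenshot(str(path))  # Selenium writes PNG data
        saved.append(path)
        sleep(interval_s)  # the page updates roughly every five seconds
    return saved


# Usage with a real browser (needs the third-party selenium package):
#   from selenium import webdriver
#   driver = webdriver.Firefox()
#   capture_loop(driver, WEBCAM_URL, "goat", n_frames=1000)  # WEBCAM_URL: your own value
#   driver.quit()
```

Injecting the driver and the sleep function keeps the loop easy to try out without a browser.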

During the past months I had a couple of ideas:

  1. Traffic lights statistics
  2. Estimating day cycle (sunrise/sunset)
  3. Ambient light estimation
  4. Traffic flow estimation
  5. Wall of timelapse clips

The script I wrote saved the images “Bocken_1.jpg” through “Bocken_381248.jpg” into ONE directory. Lesson: Windows’ file explorer can’t handle that many files in one directory, so I needed to move the images into smaller directories. I moved all the files from one day into their own directory. This was simple to do: all I had to do was step through the images and start a new directory whenever I encountered a new day. I could do this because metadata such as the date and time is saved with each image when the script writes it to disk.
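The same per-day sorting can be sketched in Python. This version assumes that the "metadata" is the file's modification time and that day directories are named YYYY-MM-DD; both are assumptions for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path


def move_into_day_dirs(src_dir, pattern="Bocken_*"):
    """Move each image into src_dir/YYYY-MM-DD/ based on its modification time."""
    src_dir = Path(src_dir)
    counts = {}
    # sorted() materialises the listing before any file is moved
    for path in sorted(src_dir.glob(pattern)):
        if not path.is_file():
            continue
        day = datetime.fromtimestamp(path.stat().st_mtime).strftime("%Y-%m-%d")
        day_dir = src_dir / day
        day_dir.mkdir(exist_ok=True)
        shutil.move(str(path), str(day_dir / path.name))
        counts[day] = counts.get(day, 0) + 1
    return counts  # number of images moved per day
```

The returned counts also give a quick hint about which days have large gaps.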

I wrote the image-moving code in Matlab. It took about an hour for the code to do the deed, and the file explorer was happy again. Besides, 25 GB is too much data, and I simply wanted to watch the images as videos. First I had to find a way to compress the videos well. One thing that hurts MPEG compressibility is noise, and there was noise due to JPEG artefacts. I figured I could easily suppress it by adding a temporal smoothing filter that smooths between frames. The goat is very still and covers about half of each frame, so this should give a nice compression factor. I also added a blur filter.
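To illustrate what temporal smoothing does, here is a small Python sketch that uses an exponential moving average over frames. The actual filter used in VirtualDub is not specified above, so the choice of filter and the `alpha` value are assumptions. Frames are plain flat lists of pixel values to keep the example dependency-free.

```python
def temporal_smooth(frames, alpha=0.25):
    """Exponential moving average over time.

    frames: list of frames, each a flat list of pixel values.
    alpha:  weight of the newest frame (smaller means stronger smoothing).
    """
    smoothed = []
    state = None
    for frame in frames:
        if state is None:
            state = [float(v) for v in frame]  # first frame passes through
        else:
            state = [(1 - alpha) * s + alpha * v for s, v in zip(state, frame)]
        smoothed.append(list(state))
    return smoothed
```

For a static scene such as the goat, frame-to-frame flicker is averaged out while the image itself is unchanged, which is what helps the video encoder.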

I created one clip per “goat day” using VirtualDub. Tip: if you use Faststone Image viewer batch convert, untick progressive and use color coding YCbCr.
The first thing I could see was that many days had gaps where the script had stopped working, so I could use at most maybe 15-17 days. In total, the 17 compressed videos took 1.7 GB, averaging 100 MB per clip. The videos are only for viewing; for my analysis I prefer to use the raw images. On to the experiments.

The experiments

The first observation I made after creating the videos was that the closest traffic light seemed to switch in a very predictable way, almost as if it didn’t matter if it was there or not. If you look closely, several traffic lights can be seen around the goat. I focus on two in particular (see the figure below). The first one is on the right side of the image, and I fittingly call it traffic light #1.

The two traffic lights highlighted.

The second traffic light is called #2 and is much harder to capture, because it covers fewer pixels, because of noise, and because the camera moves slightly each day. To solve this, I calculate the average pixel intensity over an area around the traffic light. Unfortunately, the area around the light is very noisy, so I have to remove the pixels that may affect the result but are too dark to actually be part of the light. To make the colors easy to capture, I convert RGB to a single value (rgb2ind). Even then, orange is sometimes confused with red, because they share some color values. After trying several methods (max, median, mean, …) I realised that I had to find the threshold first.

I tried all thresholds from 50 to 255 (8-bit colors) over one typical day and plotted the results. I chose the threshold by looking at the frequency of red/orange/green. The tests were successful: my threshold seemed to work fine. For light #1 my scheme is very accurate. I tested the code on 30 frames and it classified all of them correctly (see below). For light #2, the accuracy suffers because of noise, the smaller light and intermittent fog, but it seemed OK.
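The idea of dropping dark pixels, averaging the rest and then deciding on a color can be sketched in Python as follows. This is a stand-in and not the Matlab code: it classifies by hue instead of using rgb2ind, and the brightness threshold and hue boundaries are assumed values that would need the same kind of tuning described above.

```python
import colorsys


def classify_light(pixels, min_brightness=128):
    """Classify a traffic-light region given its pixels as (r, g, b) tuples, 0-255."""
    # Ignore pixels that are too dark to be part of the lit lamp
    bright = [p for p in pixels if max(p) >= min_brightness]
    if not bright:
        return "off"
    n = len(bright)
    r = sum(p[0] for p in bright) / n
    g = sum(p[1] for p in bright) / n
    b = sum(p[2] for p in bright) / n
    hue, _, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # Hue boundaries below are assumptions, not measured values
    if hue < 0.04 or hue > 0.92:
        return "red"
    if hue < 0.20:
        return "orange"
    if hue < 0.50:
        return "green"
    return "unknown"
```

Counting the returned labels over all frames of a day gives the red/orange/green frequencies discussed here. The narrow gap between the red and orange hue ranges also shows why those two are the easiest to confuse.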

Traffic light #1 test.

The traffic light closest to the goat is very predictable on all days except the 20th to the 23rd of December. In the figure below I show the 23rd of December. On that day, red occurs much more often than green. On a normal day we have more green than red, and less orange than red.

Odd Dec 23 traffic light #1.

We will get back to this situation, after we check on light #2.

Assuming that my traffic light calculator is correct, we have three types of traffic light behaviour:
Dominating orange/yellow: Dec 2, 6
Normal (green, red, orange): Dec 7, 8, 9, 12, 18, 26, 27, 28
Dominating red: Dec 20, 21, 22, 23

Worth noting is that the orange/yellow case only occurred on the 2nd and 6th of December. This is probably due to an error in the way my code sees the colors. But how come the 20th to the 23rd of December had such odd traffic light statistics? Was the traffic particularly heavy on those days? Was it because they were the days before Christmas (in Sweden, we celebrate Christmas on Dec 24 as opposed to Dec 25)? Yet another question: how can we estimate traffic flow using images? I had an idea: why not take the difference between consecutive frames, threshold it and sum the pixel values? For heavy traffic flow, the sum will be larger than for slow traffic flow.
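A minimal Python sketch of this frame-differencing idea, with frames as flat lists of grayscale values. The threshold of 25 is an assumed value, and in practice the difference would be restricted to the road area.

```python
def motion_score(prev_frame, frame, threshold=25):
    """Sum the absolute pixel differences that exceed the threshold."""
    total = 0
    for a, b in zip(prev_frame, frame):
        diff = abs(a - b)
        if diff > threshold:  # small changes are treated as noise
            total += diff
    return total


def traffic_flow(frames, threshold=25):
    """One motion score per pair of consecutive frames."""
    return [motion_score(p, f, threshold) for p, f in zip(frames, frames[1:])]
```

Plotting the scores against the capture time gives a rough traffic flow curve for a day.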

I looked at all the traffic flow plots, and they all seem very normal. There is nothing out of the ordinary in the traffic flow for the 23rd of December. On most days the traffic picks up just before eight in the morning, peaks at 16:30 and decreases until 18:40.

Totally unrelated, but interesting nevertheless: how can we estimate sunrise and sunset from the captures? The answer is related to the estimation of ambient light. To get a good estimate of the ambient light, we need sample points that are close to white during daytime. So the problem is to find a white point. Because it must not be obscured by people walking by at any time, it has to be high up. I found the perfect spots, too.

Sample areas for the ambient color.

The spots I found are on top of two lamp posts. They are perfect for estimating the color of the sky. I select a small area on top of each of them and average the pixel values. The results can be seen in the animation below.

To estimate sunrise/sunset I matched my data for the 2nd of December against data from timeanddata.com. Because the color channels increase in intensity unevenly, I use a threshold value of 108 (found through experimenting) and simply determine sunrise and sunset based on which color channel first hits the threshold, which is a different channel for sunrise than for sunset. For the 2nd of December this estimate is fairly close: my estimate is sunrise 8:16 and sunset 14:46, while the real times are 8:32 and 14:49. Of course, when we try to extrapolate to other days, it is not likely to be as accurate.

This is the code for my sunrise/sunset estimation:

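thres = 108;   % intensity threshold, found by experimenting
% Frame indices where each color channel is above the threshold
rrr = find(y_color(:, 1, 1) > thres);
ggg = find(y_color(:, 1, 2) > thres);
bbb = find(y_color(:, 1, 3) > thres);
% Sunrise: first frame where any channel exceeds the threshold
[ttt(1), idmin] = min([rrr(1), ggg(1), bbb(1)]);
% Sunset: last above-threshold frame of the channel that drops out first
[ttt(2), idmax] = min([rrr(end), ggg(end), bbb(end)]);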

where y_color is the RGB-matrix. Below we can see an animation of the plots. Please note: The data is smoothed slightly.

Animation of a couple of sunrises/sunsets.

If we look at Dec 28, the sunrise and sunset should be shifted to 09:00 and 14:48, respectively. My estimates for the 28th of December are 8:42 and 14:31, so they are not very accurate. Besides, I don’t really know how accurate the website is either. Another reason I want to estimate the sunrise and sunset is to get the color of the sky. This is important when rendering 3D objects into live-action video, for instance.
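To put numbers on "not very accurate", the errors for both days can be worked out with a few lines of Python. The helper names are made up for this example; the times are the ones quoted in the text.

```python
def minutes(hhmm):
    """Convert 'H:MM' or 'HH:MM' to minutes since midnight."""
    h, m = hhmm.split(":")
    return int(h) * 60 + int(m)


def error_minutes(estimated, reference):
    """Signed error in minutes; negative means the estimate is too early."""
    return minutes(estimated) - minutes(reference)


# (estimated, reference) pairs taken from the text
dec_2 = {"sunrise": ("8:16", "8:32"), "sunset": ("14:46", "14:49")}
dec_28 = {"sunrise": ("8:42", "09:00"), "sunset": ("14:31", "14:48")}

errors_dec_2 = {k: error_minutes(*v) for k, v in dec_2.items()}
errors_dec_28 = {k: error_minutes(*v) for k, v in dec_28.items()}
```

With these numbers the estimate is 16 and 3 minutes early on Dec 2, and 18 and 17 minutes early on Dec 28. The sunrise error is similar on both days, while the sunset error grows from 3 to 17 minutes.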

Note the bluish tint at sunset and sunrise. Should it not be reddish because of Rayleigh scattering? It is bluish because we are not looking at the horizon, but directly at a reflection of the whole sky.

Because of some missing frames, each sample has to be mapped to a specific element, so that we can easily compare the day cycles side by side. To do this I placed the samples in a master matrix. Over the days, some frames were acquired at 10:00:00 and the next day maybe at 10:00:01. These two samples map to the same element in the master matrix. I also had to fill in the missing data with appropriate color values. The solution was to simply scan each row of the matrix, take the last known color that is not black, and copy it forward until I found a new color. For comparison, if we simply take the average pixel value of the whole image instead of the sample area, we get a very noisy result; see below.
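The mapping and gap filling can be sketched in Python like this. The 5-second slot width (taken from the web cam's update interval) and the one-row-per-day layout are assumptions; the original Matlab matrix may be organised differently.

```python
SLOT_SECONDS = 5                           # assumed slot width
SLOTS_PER_DAY = 24 * 3600 // SLOT_SECONDS  # 17280 slots per day
BLACK = (0, 0, 0)                          # marks "no sample yet"


def slot_index(hour, minute, second, slot_seconds=SLOT_SECONDS):
    """Map a time of day to its slot in the day's row."""
    return (hour * 3600 + minute * 60 + second) // slot_seconds


def build_row(samples, n_slots=SLOTS_PER_DAY, slot_seconds=SLOT_SECONDS):
    """samples: iterable of ((h, m, s), (r, g, b)) for one day."""
    row = [BLACK] * n_slots
    for (h, m, s), color in samples:
        row[slot_index(h, m, s, slot_seconds)] = color
    return row


def forward_fill(row):
    """Replace black (missing) entries with the last known color."""
    filled = []
    last = BLACK
    for color in row:
        if color != BLACK:
            last = color
        filled.append(last)
    return filled
```

With this layout, a frame taken at 10:00:00 on one day and one taken at 10:00:01 on another day land in the same slot, so the rows can be compared side by side.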

Global average pixel intensity to estimate ambient light.

Another problem is glare, which just happens to be very important here. The glare not only affects the pixel intensities; the whole capture shifts during glare. After the glare, the pixels shift back into position (not really, I will talk more about that later). I’m not sure what the camera is doing during the glare events. Maybe it is changing ISO or f-stop, which affects the lens and the positions of the pixels. To estimate ambient light we need many samples, because of noise, but the quality of the samples also needs to be good. So we cannot simply pick one pixel that happens to be bright for all frames and days. We need to find the lamp. My solution is simple: move the area of interest toward the more intense pixels until the sum of the pixels is greater than a set threshold.
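One way to sketch "move the area toward the more intense pixels" in Python is a small hill climb on the window sum. The square window, the one-pixel steps in four directions and the step limit are assumptions for illustration. The image is a 2D list of intensities.

```python
def window_sum(image, top, left, size):
    """Sum of a size x size window whose top-left corner is (top, left)."""
    return sum(image[y][x]
               for y in range(top, top + size)
               for x in range(left, left + size))


def track_bright_area(image, top, left, size, target_sum, max_steps=20):
    """Shift the window one pixel at a time toward brighter pixels.

    Stops when the window sum exceeds target_sum, when no neighbouring
    position is brighter, or after max_steps moves.
    """
    height, width = len(image), len(image[0])
    best = window_sum(image, top, left, size)
    for _ in range(max_steps):
        if best > target_sum:
            break
        candidates = []
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = top + dy, left + dx
            if 0 <= ny <= height - size and 0 <= nx <= width - size:
                candidates.append((window_sum(image, ny, nx, size), ny, nx))
        if not candidates:
            break
        score, ny, nx = max(candidates)
        if score <= best:
            break  # local maximum: nothing brighter nearby
        best, top, left = score, ny, nx
    return top, left, best
```

This only works when the window starts close to the lamp, which is the situation here: the camera shift is only a few pixels.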

The glare will give us unwanted coloring and noise, but we are really only interested in the sunrise here. The color of the sky changes very little during the day until the sunset, and because the glare only causes a temporary shift, we don’t have to find the lamp again at sunset. There is another problem: on some days it seems as though the camera has shifted regardless of glare. Now, if we assume that the sky is basically the same color throughout the daytime, we can avoid the glare altogether: we simply take the color we have and fill the day with it. We know approximately when sunrise is, and the area is already close to the lamp, so we only need to move it a couple of pixels. By doing this, the area samples the highest-quality pixels available at all times.

The camera shifts focus, causing the lamp post to move in relation to the sample area.

By moving the sample area, and only doing this during sunrise, we get less noise (because of a larger sample area) and more accurate results (because we avoid low-intensity pixels). The images below show the raw data, where I applied the corrected mapping to compensate for missing frames. Further down is the smoothed data. In the bottom image we calculate the average from 00:00 to 08:00 (close to sunrise), 09:00 to 14:00 (close to sunset) and 16:00 to 23:59.

Comparison between ambient “smoothed” version (top) and the smoothed version (below).

The ambient colors zoomed in.

The weather didn’t give us any magic answers, but I believe that the mere fact that the traffic is flowing smoothly is evidence that the traffic light is efficient and does its work well. I also made some observations.

Observations

The camera lens seems to change when the sun is glaring. This affects the pixel values and also offsets the pixels. This matters especially for traffic light #2, which is affected by the refocusing of the camera as well as by the glare itself.

The culmination of the project can be seen in a “wall of video clips” I created, where all the usable clips are showcased. I made it in 3ds Max.

Some things to note from the video: The snow on the goat fell on Dec 22. The goat was disassembled on 28th of Dec.

Conclusions

It is surprising how much information one can milk out of a web cam. The experiments here were very simple, but still useful. My investigation led me on a quest to figure out why the traffic light suddenly changed its pattern. The quest led to other questions being answered, but not the question I asked. My conclusion, albeit trivial-sounding, is that perhaps the smooth flow of traffic, rather than the view from the traffic light itself, is the indication that the traffic lights are doing their job well. I should add extra frames so the clips show the sunrise at the same time, but I’m just not going to. Maybe I will do that later on. Anyway, I got my traffic light statistics, day cycle, ambient light, traffic flow estimation and the wall of clips.
I really hope for the goat to burn this winter, it would make for a nice animation.

Extra videos

Dec 6:


Dec 18, the day it snowed:
