
An Equation-Free Explanation of Machine Learning

Over the past few years there’s been a growing interest in artificial intelligence, with many articles about its accomplishments, its potential and its implications for the future. These articles often reference how the software “learns” or how it “chooses” the best solution, but few of us understand what these terms actually mean when applied to a machine. To many people, machine learning is just a magical black box.

My interest in AI grew substantially after reading the Artificial Intelligence Revolution article on Wait But Why.  I then looked further into DeepMind and found some of the amazing things machine learning can do today. It was all very interesting but I needed to know what was happening inside the black box. I knew that you could feed in a bunch of data at one end and then answers start popping out of the other but at the time I had no idea how it worked.

I’ve just finished week 8 of 11 of a Machine Learning course from Coursera and I feel I have a good understanding of at least some machine learning algorithms. This is probably still nothing compared to highly developed AIs like DeepMind or Watson, but it’s definitely a start.

So, now that I have this introductory knowledge, I hope to show others a peek inside the black box. The explanation below is my attempt to explain logistic regression, one type of machine learning, without using any mathematical equations. With this simplification I will be glossing over some elements, but I hope to retain the most important core concepts.

On with the learning!

Logistic regression is used to group data into pre-defined categories. It first learns from an initial set of data with group labels and then applies that knowledge to more unlabelled examples. One common example would be grouping emails into spam or not spam. The example below (definitely not a typical use case) will hopefully help you understand how that learning takes place.

Imagine a large park full of people. Each of these people supports either the Red Team or the Blue Team. These teams could represent football clubs, Trump/Clinton supporters, pick your poison, but I’ll just stick to calling them Red and Blue. A few of the supporters on the Red Team are wearing Red Hats and a few of the supporters on the Blue Team are wearing Blue Hats, but most of the supporters are hatless. Each team has a favourite pub, the Red Pub and the Blue Pub. The pubs are located at opposite ends of the park, and for the most part the Red and Blue teams will flock towards their favourite pub.

red_blue_01_b

We have the task of distributing Red or Blue Hats to every hatless person. Each hatless person is already a Red or Blue Team supporter, but we have to guess which team they support before giving out the hats. To do this we send out an autonomous drone to fly over the park with a giant curtain.

drone_01

The drone will need to learn on its own where the Red and Blue Team supporters are standing and then use the curtain to divide the two groups. The drone has no data about the pubs or the layout of the park so it has no way of knowing how the supporters will be distributed. It also has no cameras so it can’t even see the lay of the land. It does however have a GPS and it knows the location of the park so it starts out by just hovering in a random location above the supporters.

drone_position_010001

Even though the drone cannot see the supporters it can hear them. Each hat is equipped with a microphone that can record only the sounds of the person wearing it. The supporters are very vocal so if they are on the wrong side of the curtain they will yell and complain. The microphones pick up only the complaints of the supporters with hats (our labelled data in machine learning terms) while the hatless supporters (unlabelled data) remain silent to our drone.

drone_position_010003

Conveniently the volume of each yelling supporter will be directly related to how far away they are from the curtain: the further they are from the curtain, the louder they will yell. The total volume from all of the hat-wearing, yelling supporters is a measurement of the amount of error in the drone’s location and orientation.
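For anyone who would like to see this step as code rather than as a park, here is a minimal Python sketch of the "total volume" measurement. It follows the analogy literally (only hat-wearers on the wrong side yell, in proportion to their distance), which is a simplification: real logistic regression uses a smoother error where even correctly placed supporters murmur a little. The names `total_volume`, `curtain` and `hats`, and all the coordinates, are made up for illustration.

```python
import math

def total_volume(curtain, hats):
    """Sum the yelling of every hat-wearer standing on the wrong side.

    curtain: (x, y, angle), a point on the curtain and its direction in radians.
    hats: list of (x, y, team), where team is +1 for Red and -1 for Blue.
    """
    cx, cy, angle = curtain
    # Unit vector at right angles to the curtain, pointing at the side
    # we are currently assuming belongs to the Red Team.
    nx, ny = -math.sin(angle), math.cos(angle)
    volume = 0.0
    for x, y, team in hats:
        # Positive on the assumed Red side, negative on the assumed Blue side.
        signed_distance = (x - cx) * nx + (y - cy) * ny
        if team * signed_distance < 0:       # standing on the wrong side
            volume += abs(signed_distance)   # further away means louder
    return volume

# Hypothetical park: Red hats to the north, Blue hats to the south.
hats = [(0.0, 2.0, +1), (1.0, 3.0, +1), (0.0, -2.0, -1), (1.0, -1.0, -1)]

quiet = total_volume((0.0, 0.0, 0.0), hats)      # curtain between the groups
noisy = total_volume((0.0, 0.0, math.pi), hats)  # same line, sides swapped
print(quiet, noisy)
```

With the curtain between the two groups nobody yells, and with the sides swapped everybody does.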

drone_position_010004

Next the drone makes a few very small movements. First it moves a tiny amount east and then west and records the difference in the error (the volume of the angry supporters). It then moves north and south and tests again. Finally it rotates clockwise and counter-clockwise and makes further measurements. Each test helps to identify a direction to move that will lower the overall error. The drone then raises the curtain, moves to a new position based on its findings, lowers the curtain again and repeats the test.
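The probing moves described above can be sketched in Python as follows. This is only an illustration under my own assumptions: the curtain is described by a position and an angle, the error is the simplified "wrong-side distance" volume from the analogy, and the function names (`volume`, `probe`, `move`) and the `nudge` and `rate` values are invented for this example.

```python
import math

def volume(curtain, hats):
    """Total yelling: distance to the curtain for each hat on the wrong side."""
    cx, cy, angle = curtain
    nx, ny = -math.sin(angle), math.cos(angle)  # points at the assumed Red side
    total = 0.0
    for x, y, team in hats:
        d = (x - cx) * nx + (y - cy) * ny
        if team * d < 0:
            total += abs(d)
    return total

def probe(curtain, hats, nudge=1e-3):
    """Nudge east/west, north/south and clockwise/counter-clockwise,
    and record how quickly the volume changes for each kind of move."""
    slopes = []
    for i in range(3):  # 0: east-west, 1: north-south, 2: rotation
        forward = list(curtain)
        backward = list(curtain)
        forward[i] += nudge
        backward[i] -= nudge
        difference = volume(forward, hats) - volume(backward, hats)
        slopes.append(difference / (2 * nudge))
    return slopes

def move(curtain, hats, rate=0.05):
    """Shift the curtain a little in whichever directions lowered the volume."""
    slopes = probe(curtain, hats)
    return tuple(c - rate * s for c, s in zip(curtain, slopes))

# Hypothetical park: Red hats to the north, Blue hats to the south.
hats = [(0.0, 2.0, +1), (1.0, 3.0, +1), (0.0, -2.0, -1), (1.0, -1.0, -1)]
start = (0.0, 2.5, 0.0)   # curtain too far north: one Red hat is yelling
moved = move(start, hats)
print(volume(start, hats), volume(moved, hats))
```

In this made-up park only the north-south test changes the volume, so the drone drifts a little south and the yelling gets quieter.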

drone_position_010007

With each of the drone’s moves, the error drops…

drone_position_010010

…until the lowest level of error is found and no movement in any direction would improve it.

drone_position_010012

At this point the groups have been split, the hats are distributed to each side and hopefully most supporters get the correct coloured hat. The drone has successfully used a logistic regression algorithm to find the best straight line placement to separate the data (Red and Blue Team supporters).
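Putting the whole story together, a rough Python sketch of the drone's routine might look like the block below. It is a toy version built on my own assumptions, not a production logistic regression: it uses the analogy's simplified "wrong-side distance" error, it halves the move size whenever a move fails to help, and every name (`learn`, `hat_for`, `rate`, `nudge`, `max_moves`) and coordinate is hypothetical.

```python
import math

def volume(curtain, hats):
    """Total yelling: distance to the curtain for each hat on the wrong side."""
    cx, cy, angle = curtain
    nx, ny = -math.sin(angle), math.cos(angle)  # points at the assumed Red side
    total = 0.0
    for x, y, team in hats:
        d = (x - cx) * nx + (y - cy) * ny
        if team * d < 0:
            total += abs(d)
    return total

def learn(hats, start, rate=0.05, nudge=1e-3, max_moves=1000):
    """Keep moving while the yelling gets quieter; stop when nothing helps."""
    curtain = tuple(start)
    current = volume(curtain, hats)
    for _ in range(max_moves):
        if current == 0.0 or rate < 1e-6:
            break
        # Test tiny moves east-west, north-south and in rotation.
        slopes = []
        for i in range(3):
            forward = list(curtain)
            backward = list(curtain)
            forward[i] += nudge
            backward[i] -= nudge
            slopes.append((volume(forward, hats) - volume(backward, hats)) / (2 * nudge))
        candidate = tuple(c - rate * s for c, s in zip(curtain, slopes))
        candidate_volume = volume(candidate, hats)
        if candidate_volume < current:
            curtain, current = candidate, candidate_volume
        else:
            rate /= 2  # the move did not help, so try a smaller one
    return curtain, current

def hat_for(curtain, x, y):
    """Pick a hat colour for a hatless supporter from their side of the curtain."""
    cx, cy, angle = curtain
    d = (x - cx) * -math.sin(angle) + (y - cy) * math.cos(angle)
    return "Red" if d >= 0 else "Blue"

# Hypothetical park: Red hats to the north, Blue hats to the south.
hats = [(0.0, 2.0, +1), (1.0, 3.0, +1), (0.0, -2.0, -1), (1.0, -1.0, -1)]
start = (0.0, 2.5, 0.0)
final_curtain, final_volume = learn(hats, start)

print(final_volume)
print(hat_for(final_curtain, 0.5, 4.0), hat_for(final_curtain, 0.5, -3.0))
```

Nothing in `learn` mentions pubs or the layout of the park. It only ever listens to the volume, which is why the same routine could be reused on a different arrangement of supporters.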

drone_position_01_end_animation_01

The key thing to note is that the drone was never given any specific instructions; it just listens to the labelled data (the people with the hats) and makes adjustments until it fits.  Because of this simple ruleset (continue moving in a direction that lowers the volume of yelling supporters until the lowest level is reached), the drone could be dropped in any number of different parks with different arrangements of people and each time it would learn the best way to divide the groups.

This example only explains a very simple problem where the data can be separated into 2 groups by a straight line, but logistic regression can be used to split multiple groups with much more complex curves. If you’re interested to learn more, let me know in the comments and I might extend this analogy in another post.

If you would like to read more about machine learning, I would recommend this article which needs to be read on a large screen to take full advantage of the interactive animations.


Interesting Problems

Back in March, after 9 long and wonderful years, I worked my last day at Squint/Opera.

A few years before I had read David Allen’s Getting Things Done and started evaluating all areas of my life, especially my goals over the next 5, 10, 15 years. I had several ideas but I felt I could only pursue some of them effectively if I went out on my own. Squint was a growing company though, and I had many responsibilities on my shoulders. I was also excited about the challenges yet to come, so at the time I decided to stay, but not without a plan.

Squint transformed over the years that followed. A new division for installations and museums was growing, bringing more interactive work, and Messy Goes to Okido went into full production. I built a support team including IT, pipeline, UX and several production team leads. I mentored these new leaders and handed over responsibilities, helping to grow the company and ensure I was no longer a bottleneck. Once things had stabilised I gave myself room to step back and re-evaluate both my role at Squint and my future plans. Squint had many intriguing challenges on the horizon, but I knew that some of my goals would never quite fit their business model, so I had to say goodbye.

Merlin Mann (the inbox zero guy) once said “try to be scared of more interesting problems”. This is exactly what I’m planning to do. I’m scared of tackling problems in neurotech and healthtech, but I also find those areas immensely interesting. I’m scared of building a company from scratch and going all in on a new app or game, but the challenge also excites me. I’m of course scared of what the future of AI means for humanity, but a problem of that scale is fascinating.

So I’m giving it a shot.

Already in the last few months I’ve had the opportunity to work on a neurotech project, designing and building a keyboard app for the ici·bici (formerly Smartphone BCI). I’ve also gone back to school! I’m in week 5 of a Machine Learning course from Coursera, and I’m beginning to program basic neural nets. (I had forgotten how much I love math.) If you can’t tell from the references to GTD and Merlin Mann, I’m really into productivity and efficient systems. I have developed many workflows for myself, friends and co-workers, and I’m working on better ways to share these to benefit others.

In between these projects I’m also available for part-time consulting work. I’m eager to meet teams who are facing new challenges and see if I can help them out. If you have a studio that needs to grow or you just want to make things run more efficiently then please get in touch.

To help explain a bit more about my work both old and current, I’ve built this website. My portfolio currently has a selection of work from my time at Squint and I hope to post more about my new projects soon.

The first step was finding interesting problems that scared me, but now I’m on a faster path to solving them.