Last week marked 5 years of Appoly, and to celebrate the occasion we held our annual “hackathon”. This is a day which gives the team a chance to innovate and come up with a technical solution to one of a list of challenges, set by Appoly’s managing director James Merrix.
One of these challenges was “Intelligent stock control and ordering for the beer fridge”. Being big fans of the office beer fridge, and knowing the disappointment when it is running low, the choice was a no-brainer.
There are several ways to approach this, and we considered a lot of options, from weighing the whole fridge with some sort of smart scale to calculate its “fullness”, to attaching RFID tags to each bottle to get a granular idea of the fridge’s contents. None of these felt innovative enough though: scales have been around for thousands of years, and no one wants to attach a tag to bottles when loading or unloading!
Thankfully though, our fridge has a glass front, which opened up a different option. If we pointed a camera at the fridge, surely we would be able to work out what’s inside and know whether it needs refilling.
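This raises two important questions: how do we get a photo of the fridge, and, given that photo, how do we decide whether the fridge is full or nearly empty?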
As Wikipedia helpfully says, “ESP32 is a series of low-cost, low-power system on a chip microcontrollers with integrated Wi-Fi and dual-mode Bluetooth”. For the purposes of this article, an ESP32 board is a really tiny computer which has built-in WiFi and – really helpfully – can also come with a built-in camera.
Now, this isn’t just any computer. You can’t connect a keyboard, mouse or monitor, or install Windows. Instead, it is a microcontroller: a computer so low-powered that it only runs small C or Python programs, which are loaded onto it through a USB cable. It can also be bought for under £15 at the time of writing, which is great.
We decided to code this using MicroPython, a cut-down implementation of the Python language which is suited to this sort of board.
Although neither of us had experience programming ESP32 boards, the code for this was relatively simple. When booting, it would try to connect to our office WiFi, wait until midday each day, take a photo and make a web request uploading the photo to a web service. Remember that these boards really don’t have much power at all, so trying to process the image on the device is a non-starter!
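The boot sequence described above can be sketched roughly as follows. This is not the code we ran on the day: `connect_wifi`, `take_photo` and `upload` are hypothetical stand-ins for the board-specific WiFi, camera and HTTP calls, and the sketch assumes the board's clock has already been set to the correct local time.

```python
import time

MIDDAY_SECONDS = 12 * 60 * 60
DAY_SECONDS = 24 * 60 * 60


def seconds_until_midday(hour, minute, second):
    """Return how long to sleep, in seconds, until the next 12:00:00."""
    now = hour * 3600 + minute * 60 + second
    # The modulo wraps afternoon times round to tomorrow's midday.
    wait = (MIDDAY_SECONDS - now) % DAY_SECONDS
    # At exactly midday, wait a full day so one photo is not taken twice.
    return wait or DAY_SECONDS


def run_forever(connect_wifi, take_photo, upload,
                sleep=time.sleep, clock=time.localtime):
    """Daily capture loop; the three callables are hypothetical helpers."""
    connect_wifi()
    while True:
        now = clock()  # (year, month, day, hour, minute, second, ...)
        sleep(seconds_until_midday(now[3], now[4], now[5]))
        # No image processing happens on the board: the photo is only sent on.
        upload(take_photo())
```

Keeping the time arithmetic in its own function means it can be checked on a laptop before anything is flashed to the board.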
So, to the question: “Given a picture of a fridge, how can I give it a score of being empty or full using only code?” Our solution, in short, is a web API which takes a photo, uses FastAI machine learning (a wrapper for PyTorch) to score the image as either “full” or “nearly empty”, and notifies us when the fridge is nearly empty.
This could do so much more than just decide between two options, but given that we also had the hardware to contend with, this is ambitious enough for a 6-hour deadline. It does mean that in future we have the option to further improve this back-end to recognise exactly which drinks the fridge is running low on, or maybe even predict the fridge becoming empty before it does.
We use the same premise as an existing example which aims to tell you, given a photo, whether or not it is of a bird. Obviously, we’re not expecting any birds in our photos of the fridge, but we hoped to adopt the same ideas to produce a binary decision of either “the fridge is full” or “the fridge is nearly empty”.
Our web API was written using Laravel, which allows us to quickly build nice interfaces around the data coming in. The uploaded images are passed over to a Python script which makes the prediction and returns it to the Laravel application, which then decides what to do with the information. For example, when a picture of an empty fridge is uploaded, the Python script returns a prediction of “empty” and the Laravel application can then send a notification to someone.
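As a rough illustration of that hand-off, the Python side could be a small script which receives an image path and prints its prediction as JSON for the web application to read. The file name `fridge.pkl`, the JSON field names and the command-line interface are assumptions made for this sketch, not details of our actual script.

```python
import json
import sys


def format_result(label, probability):
    """Build the JSON string handed back to the calling web application."""
    return json.dumps({"label": label, "confidence": round(float(probability), 4)})


def predict(image_path, model_path="fridge.pkl"):
    """Classify one image with an exported fastai learner.

    model_path is a hypothetical file name for the exported model.
    """
    # Imported lazily so the helpers in this file work without fastai installed.
    from fastai.vision.all import PILImage, load_learner

    learner = load_learner(model_path)
    label, index, probabilities = learner.predict(PILImage.create(image_path))
    return str(label), float(probabilities[index])


def main(argv):
    """Entry point: expects the script name followed by one image path."""
    if len(argv) != 2:
        print("usage: predict.py <image-path>", file=sys.stderr)
        return 2
    print(format_result(*predict(argv[1])))
    return 0
```

A script entry point would call `sys.exit(main(sys.argv))`, and the calling application would then parse the single line of JSON written to standard output.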
First, though, we needed to actually create a model. This is basically a file which has been trained to recognise certain patterns. To do this we’d need lots of photos of our empty fridge, and lots of photos of our full fridge, all labelled to say whether they are full or empty. This is what took the bulk of our time – we first took 80 photos of a full fridge, moving the drinks around as they would be during normal use over the months, and then 80 photos of a nearly empty fridge, again moving any remaining drinks around to avoid any biases in the model. Examples of some of the photos below for each dataset, the first of a full fridge, and the second of a nearly empty fridge:
Splitting these into 2 different directories, we were then able to use FastAI to train a model:
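The training step might look something like the sketch below, which assumes one sub-directory per label (for example `full` and `nearly_empty`, names chosen here for illustration). The choice of `resnet18`, the 20% validation split, the image size and the number of epochs are also assumptions rather than the settings we used.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(dataset_dir):
    """Return {label: image count} for each sub-directory of dataset_dir."""
    counts = {}
    for label_dir in sorted(Path(dataset_dir).iterdir()):
        if label_dir.is_dir():
            counts[label_dir.name] = sum(
                1 for f in label_dir.iterdir()
                if f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def train(dataset_dir, export_name="fridge.pkl", epochs=3):
    """Fine-tune a pretrained network on the labelled folders and export it."""
    # Imported lazily so count_images() works without the ML libraries installed.
    from fastai.vision.all import ImageDataLoaders, Resize, error_rate, vision_learner
    from torchvision.models import resnet18

    # Labels are taken from the parent folder name of each image.
    dls = ImageDataLoaders.from_folder(
        dataset_dir, valid_pct=0.2, seed=42, item_tfms=Resize(224)
    )
    learner = vision_learner(dls, resnet18, metrics=error_rate)
    learner.fine_tune(epochs)
    # export() writes relative to the learner's path, here the dataset directory.
    learner.export(export_name)
    return learner
```

Calling `count_images` first is a cheap way to confirm both classes are roughly balanced (80 photos each in our case) before spending time on training.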
Feeding images from our API into the Python script, we could see a prediction against each one in our interface. The results were very positive, even between relatively nuanced images.
To allow the model to be improved when we get wrong predictions, and as the size of the dataset increases, we added buttons in the interface to flag any new image as “full” or “nearly empty”. Flagging an image files it under the correct category and retrains the model, making future predictions more accurate. Some were hard to guess, though!
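One simple way to implement the flagging step, assuming the dataset is stored as one directory per label, is to move the image into the directory for its corrected label before retraining. The label names and the `relabel` helper are hypothetical.

```python
import shutil
from pathlib import Path

LABELS = ("full", "nearly_empty")  # hypothetical directory names


def relabel(image_path, correct_label, dataset_dir):
    """Move a flagged image into the directory for its correct label."""
    if correct_label not in LABELS:
        raise ValueError(f"unknown label: {correct_label!r}")
    target_dir = Path(dataset_dir) / correct_label
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(image_path).name
    shutil.move(str(image_path), str(target))
    return target
```

Once the image has been moved, retraining is just a matter of running the training step again over the updated directories.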
To make this actually useful, we’d need to make this information available somewhere where it would be seen. Slack is used widely at Appoly, so we decided to integrate with Slack to send a notification when the fridge is nearly empty – meaning the necessary person would be made aware and order more beer.
The same buttons “Looks nearly empty” and “Looks full” were implemented here, which means the model can be retrained if it incorrectly sends a notification to Slack.
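For illustration, a notification like this can be sent to Slack through an incoming webhook, with the two buttons described using Block Kit. Our integration lived in the Laravel application, so the Python below is only a sketch: the `action_id` values are made up, and for the buttons to do anything the Slack app needs interactivity enabled and an endpoint to receive the clicks.

```python
import json
import urllib.request


def button(text, value):
    """Describe one Block Kit button; action_id is a hypothetical identifier."""
    return {
        "type": "button",
        "action_id": f"label_{value}",
        "value": value,
        "text": {"type": "plain_text", "text": text},
    }


def build_message(confidence):
    """Return a Slack payload announcing a nearly empty fridge."""
    text = f"The beer fridge looks nearly empty ({confidence:.0%} confident)."
    return {
        "text": text,  # plain-text fallback shown in notifications
        "blocks": [
            {"type": "section", "text": {"type": "mrkdwn", "text": text}},
            {"type": "actions", "elements": [
                button("Looks nearly empty", "nearly_empty"),
                button("Looks full", "full"),
            ]},
        ],
    }


def send(webhook_url, payload):
    """POST the payload as JSON to a Slack incoming webhook URL."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

The `value` on each button carries the corrected label, so whichever handler receives the click knows how to re-file the image.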
Hopefully, we’ll never go thirsty again.