I woke up this morning pondering how hard it would be to use Microsoft’s Custom Vision service to build a service that could extract number plates from an image. In this post I’m just going to cover the object detection phase of this work, which identifies where in an image a number plate exists. A subsequent phase would be to use the OCR service that’s part of the Computer Vision services to extract the registration number itself.
I decided to see how far I could get using the out-of-the-box offering at CustomVision.AI. After signing in (you’ll need an account linked to Azure) you’re presented with the option to create a new project:
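If you’d rather script this step than click through the portal, the same project can be created with the Custom Vision training SDK. Here’s a minimal sketch in Python, assuming the azure-cognitiveservices-vision-customvision package and placeholder endpoint/key values:

```python
# pip install azure-cognitiveservices-vision-customvision
from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient
from msrest.authentication import ApiKeyCredentials

# Placeholder values -- use the endpoint and training key from your own Azure resource.
ENDPOINT = "https://<your-region>.api.cognitive.microsoft.com"
credentials = ApiKeyCredentials(in_headers={"Training-key": "<your-training-key>"})
trainer = CustomVisionTrainingClient(ENDPOINT, credentials)

# Pick the object detection domain, then create the project against it.
obj_detection_domain = next(
    d for d in trainer.get_domains()
    if d.type == "ObjectDetection" and d.name == "General"
)
project = trainer.create_project("NumberPlates", domain_id=obj_detection_domain.id)
```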
Clicking New Project, I needed to fill in some basic information about my project and the type of project I’m building. In this case we’re going to use object detection in order to identify number plates within an image.
After hitting Create Project I’m dropped into the project dashboard which essentially covers three areas: Training, Performance and Predictions. Rather than this being a strict sequence, the idea is that you’ll go back and forth between these areas gradually refining and improving the model. As we don’t already have a model, we need to start by adding some images to train with.
As I don’t have an archive of photos with number plates, I decided to grab a selection of images from Google. You’ll notice that I included “with car” in my image search – we’ll talk about why this is important in a minute.
I downloaded around 30 of these images (you’ll need at least 15 to train the model, but the more images the better). Clicking on Add images gives me the ability to upload the images I downloaded from Google image search.
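The portal upload works fine for a handful of images, but the same batch can also be pushed up through the SDK. A sketch, reusing the trainer client from the earlier snippet and assuming the images were saved to a hypothetical local folder:

```python
import os
from azure.cognitiveservices.vision.customvision.training.models import (
    ImageFileCreateBatch, ImageFileCreateEntry,
)

# Hypothetical folder holding the ~30 images saved from the image search.
image_dir = "plate_images"
entries = []
for file_name in os.listdir(image_dir):
    with open(os.path.join(image_dir, file_name), "rb") as image_file:
        entries.append(ImageFileCreateEntry(name=file_name, contents=image_file.read()))

# Uploads are batched; the service caps each batch at 64 images.
upload_result = trainer.create_images_from_files(
    project.id, ImageFileCreateBatch(images=entries)
)
print("Upload succeeded:", upload_result.is_batch_successful)
```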
The images I uploaded appeared as “untagged” – essentially I haven’t yet identified what we’re looking for in each photo. To proceed, I need to go through each image and select and tag any areas of interest.
Rather than selecting each individual image, if you hit Select All and then click on the first image, you can step through the images in turn.
If you hover over the image, you’ll see some suggested areas appear with a dotted outline.
You can either click a suggested area, or simply click and drag to define your own area.
In my first attempt I assumed that I should be marking just the area that includes the text, because the registration number is what I want as the eventual output. However, this didn’t give very accurate results. What the service is great at is identifying objects, so rather than learning what a number plate looks like, the model was just learning to recognise text – any text. In my second attempt I defined regions that bounded the whole number plate, which gave much better results.
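Tagging in the portal is interactive, but if you already know the bounding boxes you can upload images with their regions attached via the SDK. A sketch with hypothetical file names and coordinates – regions are expressed as fractions of the image dimensions, and note the box covers the whole plate, not just the text:

```python
from azure.cognitiveservices.vision.customvision.training.models import (
    ImageFileCreateBatch, ImageFileCreateEntry, Region,
)

plate_tag = trainer.create_tag(project.id, "number plate")

# Regions are normalised to the image size: left/top/width/height all fall
# in [0, 1]. These numbers are illustrative -- in practice they'd come from
# wherever you recorded each bounding box.
region = Region(tag_id=plate_tag.id, left=0.35, top=0.68, width=0.30, height=0.12)

with open("plate_images/car01.jpg", "rb") as image_file:
    entry = ImageFileCreateEntry(
        name="car01.jpg", contents=image_file.read(), regions=[region]
    )
trainer.create_images_from_files(project.id, ImageFileCreateBatch(images=[entry]))
```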
After going through all of the images and tagging them, all the images should appear as tagged and you should see a summary of the number of images for each tag.
Now to hit the Train button (the green button with cogs in the top-right corner). Once training is done, you can see some key metrics on the quality of this iteration of your model. In general terms, the higher the percentages the better; and the more training images you provide and tag, the better the model will get.
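Training can also be kicked off from the SDK; with the current library you then publish the finished iteration so it can be called from the prediction endpoint. A sketch – the published model name and the prediction resource ID below are placeholders:

```python
import time

# Start training and poll until the iteration completes.
iteration = trainer.train_project(project.id)
while iteration.status != "Completed":
    time.sleep(5)
    iteration = trainer.get_iteration(project.id, iteration.id)

# Publishing makes the iteration callable from the prediction endpoint.
trainer.publish_iteration(
    project.id, iteration.id, "numberPlateModel",
    "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/"
    "Microsoft.CognitiveServices/accounts/<prediction-resource>",
)
```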
After you’ve run Train the first time, you actually have a model that you can use. From the Predictions tab you can see information about the endpoint that’s available for your app to call in order to invoke the service.
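Calling that endpoint from code is straightforward with the prediction client. A sketch, assuming the model was published as numberPlateModel (as above) and using a placeholder prediction key:

```python
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from msrest.authentication import ApiKeyCredentials

prediction_credentials = ApiKeyCredentials(in_headers={"Prediction-key": "<your-prediction-key>"})
predictor = CustomVisionPredictionClient(ENDPOINT, prediction_credentials)

with open("test_car.jpg", "rb") as image_file:
    results = predictor.detect_image(project.id, "numberPlateModel", image_file.read())

# Each prediction carries a tag, a probability and a normalised bounding box.
for p in results.predictions:
    if p.probability > 0.5:
        box = p.bounding_box
        print(f"{p.tag_name}: {p.probability:.1%} at "
              f"({box.left:.2f}, {box.top:.2f}, {box.width:.2f}, {box.height:.2f})")
```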
What’s really great is that you can click on Quick Test and then supply an image to see how your service performs.
In this case the service identified one area that it thinks is a number plate with a probability of 92.4%. The next step would be to pass the number plate through an OCR service in order to extract the registration number.
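As a rough sketch of that next step, you could crop the detected region (the bounding box comes back normalised to the image size) and post it to the Computer Vision OCR endpoint. The endpoint URL and key here are placeholders, and the newer Read API would work just as well:

```python
import io
import requests
from PIL import Image

def read_plate(image_path: str, box) -> str:
    """Crop the detected region and run it through the Computer Vision OCR API.

    `box` is the normalised bounding box returned by the detector; the
    endpoint and key below are placeholders for your own resource.
    """
    image = Image.open(image_path)
    w, h = image.size
    crop = image.crop((
        int(box.left * w), int(box.top * h),
        int((box.left + box.width) * w), int((box.top + box.height) * h),
    ))

    buffer = io.BytesIO()
    crop.save(buffer, format="JPEG")

    response = requests.post(
        "https://<your-region>.api.cognitive.microsoft.com/vision/v3.2/ocr",
        headers={
            "Ocp-Apim-Subscription-Key": "<your-computer-vision-key>",
            "Content-Type": "application/octet-stream",
        },
        data=buffer.getvalue(),
    )
    response.raise_for_status()

    # Flatten the OCR result (regions -> lines -> words) into one string.
    words = [
        word["text"]
        for region in response.json().get("regions", [])
        for line in region["lines"]
        for word in line["words"]
    ]
    return " ".join(words)
```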
All of this was set up, trained and made available as a service in under 5 minutes (excluding the time spent re-tagging the training images to include the whole number plate, rather than just the text).