Computer Vision and Deep Learning to Identify Bikelane Hazards

| Dec 12, 2018 min read

At a meeting of the Transportation Techies Meet Up in early 2018, another Techie announced that he had gathered images from traffic cameras around Arlington, VA at intersections with bike lanes. The data is available at here. He suggested that this data would likely be well-suited for computer vision (CV) to automate the process of classifying the images based on whether the bike lane was blocked or not.

I recently decided to learn about computer vision to tackle this challenge. Recognizing that CV is not well-implemented in R, I read up on the major differences between R and Python, and then worked through the following texts:

While the first text gave a good conceptual understanding of computer vision, this task requires a more advanced analytical approach to annotate images and greater computational strength to power it. Another Techie has worked through this same problem using deep learning algorithms powered by Google’s Colaboratory cloud environment to achieve good results. Using his work as a guide, I will develop similar algorithms with more of a focus on discussing how various techniques can be used to improve deep learning model results. You can see my work here.

In working through this problem, I am guided by the “typical machine learning workflow” that Francois Chollet outlines in the second text:

  • First, define the problem: what data is available, and what are you trying to predict? Will you need to collect more data, to hire people to manually label a dataset?

  • Identify a way to reliably measure success on your goal. For simple tasks, this may be prediction accuracy, but in many cases it will require sophisticated domain-specific metrics.

  • Prepare the validation process that you will use to evaluate your models. In particular, you should define a training set, validation set, and test set. Your validation and test set labels should not “leak” into your training data: for instance, with temporal prediction, the validation and test data should be posterior to the training data.

  • Vectorize your data, by turning it into vectors and preprocessing it in a way that makes it more easily approachable by a neural network (normalization, etc).

  • Develop a first model that beats a trivial common-sense baseline—thus demonstrating that machine learning can work on your problem. This might not always be the case!

  • Gradually refine your model architecture by tuning hyperparameters and adding regularization. Make changes based on your performance on the validation data only, not the test data nor the training data. Remember that you should manage to get your model to overfit (thus identifying a model capacity level that is above that you need), and only then start adding regularization or start downsizing your model.

  • Mind “validation set overfitting” when turning hyperparameters, i.e. the fact that your hyperparameters might end up being over-specialized to your validation set. Avoiding this is precisely the purpose of having a separate test set!

Updates:

Deep Learning Neural Network Approach: My first approach was to use deep learning through Keras to recognize blocked bike lanes and properly classify them. It was difficult to improve the model from a baseline performance of about 60% accuracy, despite hyperparameter tuning, which can often provide small boosts in performance. I then attempted to better understand what patterns the model was detecting through the pattern detection stage (convolutional neural network layers) and the classification stage (fully-connected layers), this work determined that the model was activating appropriately in the presence or absence of a bike lane obstacle.

Object Detection Approach: This image classification problem is very complex; it seems difficult to image that a computer to come to recognize the relative position of cars within bike lanes at different camera image depths. For this reason, I have tried a different approach that leverages the TensorFlow objection detection API to automatically identify objects within an image. After doing so, I take the resulting bounding boxes (see image) and tried to find a reliable and simple method for determining when an object is obstructing the bike lane. Due to the camera angle, objects can appear to “block” the bike lane when they are just sitting in the car lane but the perspective gives the appearance of the object being in the bike lane. To investigate this, I tried three methods for classifying an object that blocked the bike lane: 1) if the center of the bounding box was inside the bike lane polygon, 2) whether the overlapping area between the bounding box and the bike lane was at least 30% of the total object area, and 3) whether the overlapping area was greater than 50% of the total object area. From the results in the colab script, we can see that the first two approaches performed better than the last; the 50% approach gave many false negatives, failing to identify an object obstructing a bike lane when present. The other two approaches both correctly identified blocked bike lanes in approximately 3100 out of 3800 training images, over 80%. This approach outperforms the best results from the deep learning neural network approach; though there is no reason why a neural network could not outperform object detection with proper training and fine tuning.