Cost Effective Machine Vision with Raspberry Pi and OpenCV

Preface

Machine Vision has become an integral part of a roboticist’s arsenal. No other sensor captures information densely enough for robots to gain a thorough semantic understanding of their environments.

Its rise can be attributed to a single, sweeping factor: price-to-performance ratio. Today, hobbyists have a great variety of solutions they can use for their Machine Vision applications. Gaining in popularity are all-in-one solutions which have minimal configurability, instead running a set of popular algorithms.

Pixy 2 CMUcam 5 (USD59.95)

The CMUcam series is probably the product that created this category. The Pixy 2 gains additional features such as Line Tracking and Road Sign detection algorithms, making it more capable than the original Pixy.

While it is a great product for a beginner (it even has compatibility with the LEGO Mindstorms system!), more advanced users will find the lack of customization options limiting.

HUSKYLENS (USD54.90)

The HuskyLens differentiates itself by utilizing machine learning to enable functionality such as facial and object recognition, learning continually to improve accuracy. Paired with a more traditional suite of features such as line tracking and color recognition, this product brings to the end user the power of select machine learning algorithms without too much in the way of code.

OpenMV (USD65)

The OpenMV family has been a popular option as it offers great performance (40-80 FPS at 0.8MP) in a small form factor (45mm L, 36mm W, 30mm H). It is programmable, with supported applications including frame differencing, color tracking, marker tracking etc.

This camera even has some GPIO options, including an I2C and Serial Bus, and a 12-bit ADC and 12-bit DAC.

This piece is for the hobbyist who is looking to go beyond these all-in-one (AIO) boards. While doing so is certainly a step away from the plug-and-play realm, one might be surprised that an equally performant system can cost less than some of these integrated boards. Additionally, by using a Single Board Computer, the shackles of proprietary hardware and software are removed, leaving the user free to move beyond the limited algorithms these AIOs offer, with possibilities bounded only by their creativity.


Hardware (USD50)

  1. Raspberry Pi 3A+ (USD25)
  2. 0.3MP USB Camera (USD8.50)
  3. Cooling Fan (USD5)
  4. MicroSD Card (USD8)
  5. MicroUSB Cable (USD3.50)
  6. Structural (Assorted M2.5 Nylon Standoffs, M3 Bolts and Nuts, 3D Printed Mounts etc.)

Raspberry Pi 3A+

The Raspberry Pi 3A+ comes in a smaller form factor than the B+ boards. It uses the same 1.4GHz 64-bit quad-core processor as the 3B+, giving us more computational power than boards in the smallest Zero form factor.

While it does have less RAM than the B+ variant (USD45), and is handily beaten in all regards by the 4 (USD85), it can still achieve a very usable >100FPS on tasks like blob detection by color and Hough Line detection.

Of course, if cost and form factor aren’t concerns, go ahead and make the jump to a more performant SBC!

0.3MP USB Camera

Megapixel count is often used (inappropriately) as a proxy for camera performance. While higher MP counts may give you more detail in certain situations, most Machine Vision algorithms today do not benefit from the added resolution, instead suffering from long loop times.

This is especially true for lightweight learnt models. In their Object detection example, TensorFlow assumes an expected input image size of just 300 x 300, or 0.09MP.
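The arithmetic behind these figures is easy to check. The snippet below is a small sketch; the helper name `megapixels` is my own, and I am assuming that a typical 0.3MP camera outputs VGA (640 x 480) frames.

```python
def megapixels(width: int, height: int) -> float:
    """Return the pixel count of a width x height image in megapixels."""
    return width * height / 1_000_000


# Resolutions discussed in this article. The VGA entry is an assumption
# about what a typical "0.3MP" USB camera outputs.
resolutions = {
    "TensorFlow object detection input": (300, 300),
    "VGA (assumed 0.3MP camera)": (640, 480),
}

for name, (w, h) in resolutions.items():
    print(f"{name}: {w} x {h} = {megapixels(w, h):.2f}MP")
```

Every pixel has to be touched at least once per frame, so the pixel count is a rough lower bound on the work each loop iteration has to do.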

Of course, if your application demands it, there is an assortment of cameras available, from this 12.3MP module (USD59) to this 5MP Night Vision one (USD36.90).

Cooling Fan

A cost effective way of pushing your Single Board Computer to the limit is to invest in some active cooling. By keeping the temperature of critical components lower, we reduce incidence of thermal throttling, which reduces processor clock speeds to prevent damage. With sufficient cooling, we may even be able to achieve a stable overclock, eking out the last ounces of performance from the board.
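To see whether your board has actually been throttling, you can decode the bitmask reported by `vcgencmd get_throttled`. The sketch below assumes `vcgencmd` is available on your image; the function names are my own, and the bit meanings follow the Raspberry Pi documentation.

```python
import subprocess
from typing import List

# Bit positions reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(value: int) -> List[str]:
    """Return a description for every flag bit set in the bitmask."""
    return [label for bit, label in THROTTLE_FLAGS.items() if value & (1 << bit)]


def read_throttled() -> int:
    """Run `vcgencmd get_throttled` (Pi only) and return the bitmask."""
    out = subprocess.run(
        ["vcgencmd", "get_throttled"],
        capture_output=True, text=True, check=True,
    ).stdout
    # The command prints a line of the form "throttled=0x50005".
    return int(out.strip().split("=")[1], 16)
```

On the Pi, `decode_throttled(read_throttled())` returning an empty list means no throttling or under-voltage has been recorded since boot.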

Should you choose to use a board that is known to run hot, such as the Raspberry Pi 4, you may want to invest in a fancier option with heat pipes and a larger heatsink (USD19.90), or use a 2 fan solution (USD6.50).

MicroSD Card

One often overlooked aspect of performance comes from the MicroSD Card used. To that end, we have Jeff Geerling to thank for his superb comparison.

As for capacity, use a card of at least 16GB. This gives us sufficient space for the OS, with enough to spare to act as swap (covered in software).

Users who deal with large machine learning models, or who have specific requirements such as long recording times, should definitely invest in larger cards.

MicroUSB Cable

If you are not intending to power the Raspberry Pi with the official power supply, you’ll need one of these.

While innocent looking, some cheaper cables are not capable of carrying the current needed by the Pi at load. For instance, the 3A+ draws over 4W under load, which would require the cable to carry >0.8A at 5V.
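The current figure above comes straight from I = P / V. As a quick sketch (the helper name `required_current` is my own), you can redo the sum for whichever board and load you are planning for:

```python
def required_current(power_w: float, voltage_v: float = 5.0) -> float:
    """Return the current in amps needed to deliver power_w at voltage_v."""
    return power_w / voltage_v


# The 3A+ example from the text: 4W at 5V.
print(f"{required_current(4.0):.1f}A")
```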

Similarly, do ensure that your power source is capable of supplying the power required to prevent throttling.


Software (Raspberry Pi)

  1. DietPi ARMv8 64-bit (Free)
  2. OpenCV 4.5.4 (Free)

DietPi

DietPi is a lightweight distro, just what we need for a resource constrained platform like the 3A+. With official benchmarks showing that it uses 42% less RAM and 59% less disk space than Raspberry Pi OS Lite (64-bit), we will have much more headroom when running intensive machine vision algorithms.

OpenCV

OpenCV has an amazing collection of free computer vision algorithms. For full functionality, you will want to also get the “extras” which can be found in the opencv_contrib repository. Be warned that these modules “quite often do not have stable API, and … are not well-tested”, as per the readme.


Software (PC)

  1. PuTTY (Free)
  2. Xming (Free)
  3. Visual Studio Code (Free)

PuTTY

PuTTY is a popular, free SSH Client for Windows, providing a nice interface for configuring settings. For our application, we will be most concerned with using PuTTY for configuring X11 Forwarding.

Xming

Xming is an X11 display server for Windows. This will greatly aid the development of our Machine Vision algorithms by allowing us to show images with imshow over SSH.

Visual Studio Code

Visual Studio Code is the Integrated Development Environment ranked most popular in the Stack Overflow 2021 Developer Survey. With the Remote Explorer extension, we are able to read and write to files easily.


Setup

We start off by performing a headless install of DietPi. To do so, simply follow along with this great video by Jeff’s TechCorner (8m11s). Note that you will need software not mentioned above to flash the image onto the MicroSD card. The software of choice here is Balena Etcher, although you will find various free alternatives should you prefer.

Headless install of DietPi (Jeff’s TechCorner)

In this tutorial, Jeff also uses SSH to access his Pi via Windows PowerShell. Picking up from where he leaves off, we will proceed to tweak some settings via DietPi Config (launched with “dietpi-config”).

  1. Performance Options
    1. Overclocking: safe (1450MHz)
      While users have reported that they have been able to achieve stable clocks of above 1500MHz, we will leave it up to you to decide if this is required after you have your vision algorithms running. This will require some experimentation as the ability to overclock has a lot to do with the capabilities of the silicon out of the factory.
    2. Governor: performance
      If your application is not power constrained, there should be no harm in running at maximum clock all the time!
  2. Advanced Options
    1. Swap (enter Drive Manager) -> “/” -> Swap file: Maximum manual size
      With a 16GB MicroSD card, you should be able to assign 7984MB to swap. Swap is space allocated on the MicroSD card that serves as a virtual memory extension of the RAM. On the 512MB 3A+, this is needed to build OpenCV successfully. Do note that using your MicroSD card as swap could lead to a reduction in its lifespan. Should you be concerned about this, reset the swap once everything is up and running.
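Once you have rebooted, it is worth confirming that the settings above took effect. The following is a minimal sketch that reads the standard Linux `/proc/meminfo` and cpufreq sysfs files; the function names are my own, and the governor file may not exist on every kernel.

```python
from pathlib import Path

# Standard Linux cpufreq location for the first core's governor.
GOVERNOR_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: integer value}."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def report() -> None:
    """Print RAM, swap and the active CPU governor (run this on the Pi)."""
    mem = parse_meminfo(Path("/proc/meminfo").read_text())
    # MemTotal and SwapTotal are reported in kB.
    print(f"RAM:  {mem['MemTotal'] / 1024:.0f} MiB")
    print(f"Swap: {mem['SwapTotal'] / 1024:.0f} MiB")
    if GOVERNOR_PATH.exists():
        print(f"Governor: {GOVERNOR_PATH.read_text().strip()}")
```

Calling `report()` on the Pi should show a swap total close to what you assigned and, if the governor setting stuck, `performance`.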

Next up, we will be installing OpenCV. To do so, we will be following this tutorial from pyimagesearch.com with a few caveats. Most importantly, ensure that the VFPV3 flag (ENABLE_VFPV3) is not enabled during the build. Along the way, you may find that certain packages are outdated, as the article was written in 2017. Since this article will eventually suffer the same fate, I would recommend searching for the latest versions (and potentially, replacements) of each package. Note that swap space is crucial for the completion of the make process. You may have to rerun it several times if you run out of space. It is also worth noting that this process could take over half a day depending on the options selected.

Once you have validated that the install works with a Hello World, do check out this other tutorial, also from pyimagesearch.com, which will help with increasing the FPS of your webcam through threading.
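To give a feel for what threading buys you before you read that tutorial, here is a minimal sketch of the idea. It is not the tutorial’s code: the class name `ThreadedStream` is my own, and it accepts any object with `read()` and `release()` methods, such as a `cv2.VideoCapture`.

```python
import threading


class ThreadedStream:
    """Grab frames on a background thread; read() returns the newest one."""

    def __init__(self, capture):
        # `capture` is anything with read() -> (ok, frame) and release(),
        # for example cv2.VideoCapture(0).
        self.capture = capture
        self.ok, self.frame = capture.read()
        self._lock = threading.Lock()
        self._stopped = threading.Event()
        self._thread = threading.Thread(target=self._update, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def _update(self):
        # Keep replacing the stored frame until stop() is called.
        while not self._stopped.is_set():
            ok, frame = self.capture.read()
            with self._lock:
                self.ok, self.frame = ok, frame

    def read(self):
        # Returns immediately with the most recent frame.
        with self._lock:
            return self.ok, self.frame

    def stop(self):
        self._stopped.set()
        if self._thread.is_alive():
            self._thread.join()
        self.capture.release()


# Usage on the Pi (requires OpenCV and a camera):
#   import cv2
#   stream = ThreadedStream(cv2.VideoCapture(0)).start()
#   ok, frame = stream.read()
#   stream.stop()
```

Because the main loop no longer waits on the camera, it can run faster than the camera delivers frames, which means the same frame may be processed more than once.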

For some quality of life improvements while doing so, I would suggest using a combination of PuTTY, Xming, and Visual Studio Code for development. While PowerShell does have X11 forwarding, I could not get it to work reliably. As such, I installed Xming, got that running, and used PuTTY to connect to the 3A+ with X11 forwarding enabled (Connection -> SSH -> X11 -> Enable X11 forwarding). If you are having errors with the imshow command in OpenCV, X11 forwarding is the likely cause. To edit code, simply install the Remote Explorer extension.
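A quick way to tell whether forwarding is active is to look at the `DISPLAY` environment variable, which the SSH server sets for forwarded sessions. The sketch below uses helper names of my own (`x11_display`, `show_or_save`) and falls back to writing the frame to disk when no display is available.

```python
import os


def x11_display(env=None):
    """Return the X11 display string if one is set, else None."""
    env = os.environ if env is None else env
    return env.get("DISPLAY") or None


def show_or_save(frame, window="debug", fallback_path="debug.png"):
    """Show the frame with imshow if a display exists, otherwise save it."""
    import cv2  # imported here so x11_display() works without OpenCV

    if x11_display():
        cv2.imshow(window, frame)
        cv2.waitKey(1)  # lets the window refresh
    else:
        cv2.imwrite(fallback_path, frame)
```

If `x11_display()` returns `None` inside your SSH session, forwarding was not negotiated, so check the PuTTY setting and that Xming is running.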

Results

Running at 160 x 120 (QQVGA), we are able to achieve up to 150FPS equivalent loop time on a Hough Line test when no lines are drawn. Performance expectedly degrades with the number of lines drawn. This is a very respectable result, and will work well for applications where visuals are not important past the debugging phase. At USD50, this combination is undoubtedly an excellent one to get started with implementing your own Machine Vision pipelines.
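If you would like to run a similar measurement yourself, the following is a rough sketch and not the exact test used for the figure above. The function names and the Canny and Hough parameters are assumptions of mine, and because it reads from the camera directly, the result will be capped by the camera’s frame rate unless you swap in a threaded reader.

```python
import time


def loop_fps(step, iterations=100):
    """Call step() `iterations` times and return iterations per second."""
    start = time.perf_counter()
    for _ in range(iterations):
        step()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def hough_benchmark(device=0, width=160, height=120, iterations=300):
    """Time a capture + Canny + probabilistic Hough loop (run on the Pi)."""
    import cv2
    import numpy as np

    cap = cv2.VideoCapture(device)
    # The camera may ignore these if it does not support the resolution.
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)

    def step():
        ok, frame = cap.read()
        if not ok:
            raise RuntimeError("camera read failed")
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, 50, 150)  # thresholds are illustrative
        cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=40,
                        minLineLength=20, maxLineGap=5)

    try:
        return loop_fps(step, iterations)
    finally:
        cap.release()
```

Calling `hough_benchmark()` on the Pi returns the measured loops per second; nothing is drawn, which corresponds to the "no lines drawn" case.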

Further Reading

A lot of machine vision applications, especially those in robotics, favor high refresh rates over high resolutions. If you are able to work with an even lower resolution, you may be able to squeeze a lot more FPS out of your camera. Here’s a video to give you a taster!

660FPS on a Raspberry Pi Camera by Robert Elder Software