HOWTO: Smooth Noise Pixels with Microsoft Kinect


Overview

This post will demonstrate how to smooth depth images from the Microsoft Kinect and how to build a historical noise model using OpenCV and openFrameworks.

When using OpenCV in public spaces, certain things need to be accounted for, such as lighting conditions, camera imperfections, unexpected objects and rigging instability. There are techniques and hardware solutions that can help solve some of these issues. For example, in Shadowing we removed the infrared filter from a generic web camera and shone infrared light onto the participants, so our system could ignore the projected images behind them.

We had to implement a similar system for Urbanimals, winner of the 2015 Playable City Award.

In Urbanimals, the public were invited to interact with a set of projected origami animals (Kangaroo, Dolphin, Rabbit and Beetle) that responded to their movements and actions.

Unlike in Shadowing, we needed a relatively clean image of the interaction area to get reliable tracking data. We decided to use the Microsoft Kinect, as its depth camera would not be affected by the projected images and, importantly, would remain unaffected by exterior lighting sources. The Kinect’s depth system works in two stages. First, an infrared source emits a special pattern of dots onto whatever surface is in the emitter’s path.

[Image: Infrared Pattern]

Then the depth camera collects the reflected pattern and processes it into a machine-readable form. In the image above you can see that the emitted light appears brighter the closer a surface is to the Kinect.

Problem

For the hardware to be viable we needed to extend the maximum viewing depth from 3.5m to around 6-10m. By default, ofxKinect clips the depth values to the practical use limits (0.5m to 3.5m). By adding this line after initialisation you can control the clipping planes yourself:

 kinect.setDepthClipping(minDepth, maxDepth);
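
For example, to extend the far plane to 10m you might write the following (a sketch with illustrative values; ofxKinect expects these distances in millimetres):

 kinect.setDepthClipping(500, 10000); // near plane 0.5m, far plane 10m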



However, the Kinect’s depth camera is susceptible to small fluctuations in the infrared signal, which result in invalid pixels or noise. This effect is exacerbated when the far plane clipping distance is increased. In the image below you can see this increase in noise, especially around the checkerboard area and the object on the right-hand side of the image.

[Image: Processed Depth Image]

Our solution was to build a model of the environment and separate the foreground objects.

Solution

We are using openFrameworks with the ofxCv and ofxKinect addons.
You’ll need three cv::Mat objects (kinectImage, backgroundModel and foreground) and an integer variable called frameCount, initialised to 0.
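
As a rough sketch, the declarations might look like this in your header (the threshold values are illustrative; _nearThreshold and _farThreshold are used later when we segment the foreground):

	ofxKinect kinect;
	cv::Mat kinectImage, backgroundModel, foreground;
	int frameCount = 0;

	// greyscale depth thresholds, used later when segmenting the foreground
	unsigned char _nearThreshold = 230; // brighter than this = too close
	unsigned char _farThreshold = 40;   // darker than this = too far away

In setup(), initialise and open the Kinect, then set the clipping planes as above:

	kinect.init();
	kinect.open();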

kinect.update();

if (kinect.isConnected())
{
	if (kinect.isFrameNewDepth())
	{
		// wrap the Kinect's depth buffer in a single-channel, 8-bit Mat header
		kinectImage = Mat(kinect.height, kinect.width, CV_8UC1, kinect.getDepthPixels(), 0);

In the update function, check the connection to the Kinect and whether we have a new depth frame, then wrap the depth pixels in the kinectImage Mat (this constructor points the Mat at the Kinect’s pixel buffer rather than copying it).
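
As an aside, ofxCv’s toCv() helper performs the same non-copying wrap; a sketch, assuming a version of ofxKinect whose getDepthPixels() returns an ofPixels reference:

	// equivalent wrap using the ofxCv helper (no pixel copy)
	kinectImage = ofxCv::toCv(kinect.getDepthPixels());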

//! 1
if (frameCount < 300)
{
	//! 2
	if (frameCount == 0)
	{
		backgroundModel = Mat(kinectImage.rows, kinectImage.cols, CV_8UC1);

		// zero every pixel so the model starts empty
		for (int y = 0; y < kinectImage.rows; y++)
		{
			for (int x = 0; x < kinectImage.cols; x++)
			{
				backgroundModel.at<unsigned char>(y, x) = 0;
			}
		}
	}

	//! 3
	for (int y = 0; y < kinectImage.rows; y++)
	{
		for (int x = 0; x < kinectImage.cols; x++)
		{
			//! 4
			unsigned char backPixel = backgroundModel.at<unsigned char>(y, x);
			unsigned char currentPixel = kinectImage.at<unsigned char>(y, x);

			// ignore invalid (zero) readings; otherwise keep the
			// lowest (furthest) value seen so far for this pixel
			if (currentPixel == 0) {}
			else if (backPixel == 0)
			{
				backPixel = currentPixel;
			}
			else if (currentPixel < backPixel)
			{
				backPixel = currentPixel;
			}

			//! 5
			backgroundModel.at<unsigned char>(y, x) = backPixel;
		}
	}

	//! 6
	if (frameCount == 299)
	{
		backgroundModel = backgroundModel + Scalar(6);
	}

	//! 7
	frameCount++;
}

  1. Evaluate the frameCount conditional. While we are still below the limit, keep capturing (the bigger the number, the longer the capture time).
  2. If we are at frame 0, create a new Mat with the same resolution as the kinectImage Mat and set all of its pixels to 0 (see the one-line alternative after this list).
  3. Loop through the pixels in the kinectImage.
  4. Get the pixel values from both the backgroundModel and the kinectImage.
  5. Compare the two values, ignoring invalid (zero) readings, and store the lower (i.e. furthest) value back in the backgroundModel.
  6. If the frameCount is one frame below the limit, resolve the backgroundModel by adding Scalar(6) to every pixel, which brightens the model slightly so that small fluctuations around the background level are not mistaken for foreground later.
  7. Increment frameCount each frame.
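
Incidentally, the manual zeroing loop in step 2 can be replaced by OpenCV’s Mat::zeros, which allocates and zero-fills in one call:

	// one-line alternative for step 2
	backgroundModel = cv::Mat::zeros(kinectImage.rows, kinectImage.cols, CV_8UC1);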
[Image: Capture Loop]

Now that we have captured and resolved our model, we can process incoming frames against it.

else {
	//! 1
	for (int y = 0; y < kinectImage.rows; y++)
	{
		for (int x = 0; x < kinectImage.cols; x++)
		{
			//! 2
			unsigned char backPixel = backgroundModel.at<unsigned char>(y, x);
			unsigned char currentPixel = kinectImage.at<unsigned char>(y, x);

			//! 3
			if (currentPixel > _nearThreshold)
			{
				currentPixel = 0;
			}
			else if (currentPixel < _farThreshold)
			{
				currentPixel = 0;
			}
			else if (currentPixel < backPixel)
			{
				currentPixel = 0;
			}

			//! 4
			kinectImage.at<unsigned char>(y, x) = currentPixel;
		}
	}

	//! 5
	foreground = kinectImage - backgroundModel;

	//! 6
	cv::blur(foreground, foreground, cv::Size(13, 13));

	//! 7
	threshold(foreground, 100);
}

  1. Loop through the pixels in the kinectImage.
  2. Get the pixel values from both the backgroundModel and the kinectImage.
  3. This is an easy way to threshold the near and far values out of the image. If the currentPixel is brighter than the _nearThreshold (i.e. too close), turn it black. Likewise, if the currentPixel is darker than the _farThreshold (too far away), turn it black. The final condition turns the currentPixel black if it is darker than the backPixel, meaning it sits behind the background model.
  4. Set the kinectImage pixel at the coordinate to the value of the currentPixel.
  5. We can use mathematical operators on the Matrices, so we subtract the backgroundModel from the current kinectImage to create the foreground.
  6. Blur the resulting pixels to smooth out the remaining speckle.
  7. Then threshold the foreground, which should give you a smoother blob to track.

Now you should have a smoothed image.
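
From here you can track the blob. As a sketch using ofxCv’s ContourFinder (the radius values are illustrative and will need tuning for your space):

	// ofApp.h
	ofxCv::ContourFinder contourFinder;

	// in update(), after thresholding the foreground
	contourFinder.setMinAreaRadius(10);  // ignore tiny specks
	contourFinder.setMaxAreaRadius(300); // ignore implausibly large blobs
	contourFinder.findContours(foreground);

	// in draw()
	contourFinder.draw();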

The source code for this project is available on Watershed’s GitHub page.

[Image: Contours]