
Sunday, May 13, 2018

Posture recognition with Kinect III

In the previous article of this series I showed the structures, enumerations and classes that the application uses to be independent of the sensor version and the Kinect SDK. In this third article I will show a possible implementation of a class that deals with reading and converting the skeletons using version 2.0 of the SDK, for the Xbox One sensor.

In this link you can find the first article in this series. In this other one you can download the source code of the KinectGestures solution, written in C# with Visual Studio 2015.

The project is named KTSensor20, and it generates an assembly named KTSensor.dll. It contains a single class, Sensor, which captures Body objects, specific to the Kinect SDK, and converts them into BodyVector objects, specific to the application.

The idea is that, to build an application that works interchangeably with several versions of the sensor, you only need to modify this Sensor class, always generating an assembly named KTSensor.dll. The application then works with any sensor version simply by replacing this class library with the appropriate one.
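
The article does not show how the main application obtains its Sensor instance. Since the application cannot reference the concrete assembly, one possible approach is to load it by file name at run time. This is only a sketch, and the full type name KTSensor.Sensor is an assumption:

using System.Reflection;
using KinectInterface.Interfaces;

public static class SensorLoader
{
    // Loads whichever KTSensor.dll was copied next to the executable,
    // so the library can be swapped per sensor version without recompiling.
    public static ISensor Load()
    {
        Assembly asm = Assembly.LoadFrom("KTSensor.dll");
        // "KTSensor.Sensor" is an assumed full type name for the Sensor class.
        return (ISensor)asm.CreateInstance("KTSensor.Sensor");
    }
}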

The ISensor interface

In the KinectInterface project, the ISensor interface is defined in the namespace KinectInterface.Interfaces. This interface must be implemented by all classes that encapsulate the access to the Kinect sensor.

public class KinectSensorEventArgs
{
    public KinectSensorEventArgs(bool av)
    {
        IsAvailable = av;
    }
    public bool IsAvailable { get; private set; }
}
public delegate void KinectSensorEventHandler(object sender, KinectSensorEventArgs e);
public interface ISensor
{
    event KinectSensorEventHandler AvailableChanged;
    bool IsOpen { get; }
    BodyVector NextBody { get; }
    void Start();
    void Stop();
}

This interface defines an AvailableChanged event to notify the application when sensor availability changes.

The IsOpen property is used to check whether the sensor is active or inactive.

The NextBody property returns the next skeleton read from the sensor, as a BodyVector object.

The Start and Stop methods are used to start or stop capturing.
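
Putting the members together, a client of the interface could look like this minimal sketch (SensorLoader comes from the sketch above; the handler body is only illustrative):

using System;
using KinectInterface.Interfaces;

class SensorClient
{
    public void Run()
    {
        ISensor sensor = SensorLoader.Load();
        // Get notified when the physical sensor is plugged in or removed.
        sensor.AvailableChanged += (s, e) =>
            Console.WriteLine("Sensor available: " + e.IsAvailable);
        if (!sensor.IsOpen)
        {
            sensor.Start();
        }
        // ...consume sensor.NextBody while capturing (see below)...
        sensor.Stop();
    }
}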

The Sensor class

This class implements the ISensor interface. To completely isolate this class library from the application, instead of adding a reference to the assembly in the main application, two post-build events are defined in the KTSensor20 project: one copies the generated class library, and the other copies the Microsoft.Kinect.dll assembly required to access the sensor. The version of the latter library depends on the version of the SDK being used.

copy $(TargetPath) $(SolutionDir)KinectGestures\$(OutDir)
copy $(TargetDir)Microsoft.Kinect.dll $(SolutionDir)KinectGestures\$(OutDir)

In the class constructor, we get an instance of the KinectSensor class in the variable _kSensor and then subscribe to the IsAvailableChanged event, which notifies us when the sensor becomes available for use or is no longer available.

public Sensor()
{
    _kSensor = KinectSensor.GetDefault();
    _kSensor.IsAvailableChanged +=
        new EventHandler<IsAvailableChangedEventArgs>(
            Sensor_IsAvailableChanged);
}

The Sensor class in turn exposes the AvailableChanged event to forward these same notifications to the application.
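
The handler for the SDK event is not listed in the article; a plausible implementation simply forwards the notification through the generic event:

private void Sensor_IsAvailableChanged(object sender,
    IsAvailableChangedEventArgs e)
{
    // Forward the SDK notification through the version-independent event.
    AvailableChanged?.Invoke(this, new KinectSensorEventArgs(e.IsAvailable));
}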

To start the sensor, call the Start method:

public void Start()
{
    if (!_kSensor.IsOpen)
    {
        // Discard any data left over from a previous capture session.
        _bodies.Clear();
        _bodiesV.Clear();
        _kSensor.Open();
        _bReader = _kSensor.BodyFrameSource.OpenReader();
        _bReader.FrameArrived +=
            new EventHandler<BodyFrameArrivedEventArgs>(
                Body_FrameArrived);
        // Launch the conversion task with a token so it can be cancelled.
        _cancel = new CancellationTokenSource();
        _convertTask = Task.Run(() => { ConvertBodies(); }, _cancel.Token);
    }
}

This class splits the work between two concurrent activities. The skeletons are captured from the sensor by means of a BodyFrameReader object and a subscription to its FrameArrived event, which is triggered each time the sensor has a new Body object available. The conversion of these Body objects into the generic BodyVector objects is performed by a separate task launched with the Task class, to which a cancellation token is assigned so that it can be stopped.

The captured Body objects are stored in a Queue<BodyTM> object, where BodyTM is a struct containing the Body object and a DateTime with the date and time of the reading. The conversion task consumes the elements of this queue and stores the resulting BodyVector objects in another Queue<BodyVector> object, which the application in turn consumes through the NextBody property.
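
The private members of the Sensor class are not listed in the article; judging from the code that follows, they are presumably declared inside the class along these lines (a sketch, with names taken from the snippets):

// Pairs a captured Body with its capture timestamp.
private struct BodyTM
{
    public readonly Body Body;
    public readonly DateTime Time;
    public BodyTM(Body body, DateTime time)
    {
        Body = body;
        Time = time;
    }
}

private KinectSensor _kSensor;                 // SDK sensor instance
private BodyFrameReader _bReader;              // skeleton frame reader
private readonly Queue<BodyTM> _bodies = new Queue<BodyTM>();          // captured bodies
private readonly Queue<BodyVector> _bodiesV = new Queue<BodyVector>(); // converted bodies
private readonly object _bLock = new object(); // guards _bodies
private readonly object _vLock = new object(); // guards _bodiesV
private CancellationTokenSource _cancel;       // stops the conversion task
private Task _convertTask;                     // runs ConvertBodies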

public BodyVector NextBody
{
    get
    {
        BodyVector bv = null;
        if (_bodiesV.Count > 0)
        {
            lock (_vLock)
            {
                bv = _bodiesV.Dequeue();
            }
        }
        return bv;
    }
}

We use the lock statement to prevent the application and the conversion task from simultaneously accessing the queue containing the bodies. Keep in mind that this property can return null if no body is available.
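
On the application side, the queue can be drained periodically, for example from a timer tick. A minimal sketch (ProcessBody is a hypothetical application method):

private void DrainBodies(ISensor sensor)
{
    BodyVector bv;
    // NextBody returns null once the queue is empty.
    while ((bv = sensor.NextBody) != null)
    {
        ProcessBody(bv); // draw the skeleton, feed the gesture recognizer, etc.
    }
}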

To stop capturing, the Stop method is used:

public void Stop()
{
    if (_kSensor.IsOpen)
    {
        // Unsubscribe from the reader before closing the sensor.
        _bReader.FrameArrived -=
            new EventHandler<BodyFrameArrivedEventArgs>(Body_FrameArrived);
        _bReader = null;
        _kSensor.Close();
        try
        {
            // Cancel the conversion task and wait for it to finish.
            _cancel.Cancel();
            Task.WaitAll(new Task[1] { _convertTask });
        }
        catch
        {
            // WaitAll throws when the task is cancelled; ignore it here.
        }
        finally
        {
            _convertTask.Dispose();
            _cancel.Dispose();
            _convertTask = null;
            _cancel = null;
        }
    }
}

This method simply stops the sensor and cancels the skeleton conversion task.

The FrameArrived event handler is implemented as follows:

private void Body_FrameArrived(object sender,
    BodyFrameArrivedEventArgs e)
{
    try
    {
        using (BodyFrame bf = e.FrameReference.AcquireFrame())
        {
            if (bf != null)
            {
                Body[] bodies =
                    new Body[_kSensor.BodyFrameSource.BodyCount];
                bf.GetAndRefreshBodyData(bodies);
                foreach (Body b in bodies)
                {
                    if (b.IsTracked)
                    {
                        // Queue the first tracked body with its timestamp.
                        lock (_bLock)
                        {
                            _bodies.Enqueue(
                                new BodyTM(b, DateTime.Now));
                        }
                        break;
                    }
                }
            }
        }
    }
    catch
    {
    }
}

First, we obtain a BodyFrame object using the AcquireFrame method. This object contains all the bodies captured by the sensor, which with this version of the SDK can be up to six; this count is obtained from the BodyCount property of the BodyFrameSource object. The bodies are retrieved into the bodies array with the GetAndRefreshBodyData method of the BodyFrame object. Although the array always has one slot per possible body, only the slots corresponding to a person the sensor is actually tracking contain complete skeletons. To identify them, the IsTracked property of the Body object is used, which is true only for complete skeletons. In this case we take the first one we find.

Finally, this body is encapsulated in a BodyTM structure, together with the current date and time, and added to the corresponding queue.

Regarding the conversion of the Body object into a BodyVector, the ConvertBodies method carries out this work:

private void ConvertBodies()
{
    while (!_cancel.Token.IsCancellationRequested)
    {
        if (_bodies.Count > 0)
        {
            BodyTM b;
            lock (_bLock)
            {
                b = _bodies.Dequeue();
            }
            BodyPoint[] vector = new BodyPoint[25];
            vector[(int)KinectInterface.Joint.Head] =
                TranslateBodyJoint(JointType.Head, b.Body);
            vector[(int)KinectInterface.Joint.Neck] =
                TranslateBodyJoint(JointType.Neck, b.Body);
            vector[(int)KinectInterface.Joint.SpineShoulder] =
                TranslateBodyJoint(JointType.SpineShoulder, b.Body);
            ...
            vector[(int)KinectInterface.Joint.AnkleRight] =
                TranslateBodyJoint(JointType.AnkleRight, b.Body);
            vector[(int)KinectInterface.Joint.FootRight] =
                TranslateBodyJoint(JointType.FootRight, b.Body);
            BodyVector bv = new BodyVector(vector,
                HState(b.Body.HandLeftState,
                    b.Body.HandLeftConfidence),
                HState(b.Body.HandRightState,
                    b.Body.HandRightConfidence),
                new ClippedEdges(
                    b.Body.ClippedEdges.HasFlag(FrameEdges.Top),
                    b.Body.ClippedEdges.HasFlag(FrameEdges.Right),
                    b.Body.ClippedEdges.HasFlag(FrameEdges.Bottom),
                    b.Body.ClippedEdges.HasFlag(FrameEdges.Left)),
                new SizeF(_kSensor.DepthFrameSource.FrameDescription.Width,
                    _kSensor.DepthFrameSource.FrameDescription.Height),
                b.Time);
            lock (_vLock)
            {
                _bodiesV.Enqueue(bv);
            }
        }
    }
}

First, we obtain the next Body from the corresponding queue and build an array of BodyPoint structures from the Joint objects that represent the joints in the form provided by the SDK. The TranslateBodyJoint method is used for this conversion:

private BodyPoint TranslateBodyJoint(JointType bj, Body b)
{
    BodyPoint v;
    CameraSpacePoint position = b.Joints[bj].Position;
    // Avoid coordinates behind the camera, which cannot be mapped.
    if (position.Z < 0)
    {
        position.Z = 0.1f;
    }
    DepthSpacePoint dp =
        _kSensor.CoordinateMapper.MapCameraPointToDepthSpace(position);
    if (!(float.IsInfinity(dp.X) || float.IsInfinity(dp.Y)))
    {
        v = new BodyPoint(new Vector3D(position.X, position.Y, position.Z),
            new System.Drawing.PointF(dp.X, dp.Y),
            TrackingAccuracy(b.Joints[bj].TrackingState));
    }
    else
    {
        // No valid 2D projection; store only the 3D coordinates.
        v = new BodyPoint(new Vector3D(position.X, position.Y, position.Z),
            TrackingAccuracy(b.Joints[bj].TrackingState));
    }
    return v;
}

With the MapCameraPointToDepthSpace method we obtain a point that represents the projection of the joint in two dimensions. This is the point we will use to draw the skeleton in the application. The BodyPoint structure is built from a Vector3D with the three-dimensional coordinates provided by the sensor, the two-dimensional projection in the form of a PointF, and the accuracy of the measurement obtained with the TrackingAccuracy method. This method returns -1 if the position of the joint could not be determined, 0 if the value is inferred, or 1 if it is an exact measurement.
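
TrackingAccuracy is not shown in the article either; given that description, it presumably maps the SDK's TrackingState enumeration as follows (the int return type is an assumption):

// -1 = joint not tracked, 0 = inferred position, 1 = exact measurement.
private int TrackingAccuracy(TrackingState state)
{
    switch (state)
    {
        case TrackingState.Tracked:
            return 1;
        case TrackingState.Inferred:
            return 0;
        default:
            return -1;
    }
}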

The BodyVector object generated by the ConvertBodies method is built from this array of BodyPoint joints, the state of the two hands obtained with the HState method, a ClippedEdges structure that indicates whether the skeleton is clipped by any edge of the sensor frame, and the date and time at which the body was captured, which makes it possible to process composite movements in addition to static postures. Finally, the BodyVector is placed in the corresponding queue to be consumed by the application.
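
The HState method is not listed; its return type is one of the application types from the previous article, so the following is only a rough sketch, with HandStateValue standing in as an assumed application enumeration:

// Translates the SDK hand state into the application's representation,
// discarding low-confidence readings. HandStateValue is an assumed name;
// the real type comes from the previous article in this series.
private HandStateValue HState(HandState state, TrackingConfidence confidence)
{
    if (confidence != TrackingConfidence.High)
    {
        return HandStateValue.Unknown;
    }
    switch (state)
    {
        case HandState.Open: return HandStateValue.Open;
        case HandState.Closed: return HandStateValue.Closed;
        case HandState.Lasso: return HandStateValue.Lasso;
        default: return HandStateValue.Unknown;
    }
}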

That's all regarding the implementation of the Kinect sensor access. In the next article I will show the classes dedicated to the normalization of the different parts of the body.
