Markerless Tracking In Motion Capture. What is it and Why it is important? – Sports Biometrics Conference
AI and machine learning (AI/ML) have received much attention in the last five years. Convolutional Neural Networks (CNNs) in particular seem well suited to extracting fine detail from video images. The process is essentially one of taking a set of inputs and mapping them to a set of outputs. But unlike the direct mapping of linear regression techniques, CNN mapping includes several hidden layers that apply non-linear transformations between input and output. As Gary Marcus of New York University points out in a critical appraisal of deep learning, the methods for achieving that mapping are complex, and the results are often not fully understood even by the team that created the network. While beyond the scope of this post, for those wanting more insight into CNNs and machine learning, see Google’s developer section for an introduction to feature identification.
Training a CNN requires running large data sets through the fitting routines. For tracking human motion, one needs i) large numbers of images with the relevant features identified, ii) images of people in the various poses consistent with the activities to be tracked, and iii) subjects and backgrounds diverse enough that the models can track in any environment.
Validating the trained CNN is equally daunting. Typically, a model is trained on, say, 90% of the dataset and then tested on its ability to accurately identify features in the remaining 10%. Randomly re-drawing the 90% over repeated trials turns this into a much larger validation test.
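The repeated 90/10 split described above can be sketched in a few lines of Python/NumPy. This is a minimal illustration, not the validation code of any particular vendor; the function name and parameters are our own.

```python
import numpy as np

def repeated_holdout_splits(n_samples, train_frac=0.9, n_repeats=5, seed=0):
    """Yield (train_idx, test_idx) index pairs for repeated random
    90/10 hold-out splits of a labeled image set (names illustrative)."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train_frac * n_samples))
    for _ in range(n_repeats):
        perm = rng.permutation(n_samples)  # fresh random shuffle each repeat
        yield perm[:n_train], perm[n_train:]

# Five different random 90/10 partitions of 1000 labeled images
splits = list(repeated_holdout_splits(1000))
```

Because each repeat draws a fresh random 90%, the union of test sets over many repeats covers far more of the data than a single fixed 10% hold-out would.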
With the features identified in images from multiple digital video cameras positioned at different viewpoints, standard Direct Linear Transformation (DLT) techniques can be used to locate those features in 3D space.
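The DLT triangulation step can be sketched as follows: each camera contributes two linear constraints on the homogeneous 3D point, and the least-squares solution comes from an SVD. This is a generic textbook formulation, assuming calibrated 3x4 projection matrices are available, not the code of any specific product.

```python
import numpy as np

def triangulate_dlt(projections, points_2d):
    """Locate one feature in 3D from two or more camera views.

    projections : list of 3x4 camera projection matrices (assumed calibrated)
    points_2d   : list of (u, v) pixel coordinates of the same feature
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        # Each view gives two linear equations in the homogeneous point X
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # Least-squares solution: right singular vector of the smallest singular value
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize to (x, y, z)
```

With accurate calibration, adding more viewpoints over-determines the system and improves robustness to 2D localization noise.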
As mentioned, the shape/silhouette approach offers fast processing and reasonable accuracy for gross motion. But its methods are limited in the resolution that can be achieved.
AI/ML, on the other hand, enables extracting the position of individual features such as eyes, joint centers and condyles. But the processing of video files is time consuming and not yet real time, requiring several minutes to process even short videos. The computational requirements are also more expensive, involving high-end graphics cards and parallel processing on the GPU.
The Future
Our first experience with markerless motion tracking used shape/silhouette technology. However, our current belief is that AI and CNNs are especially suited to tracking human motion and are, in fact, the future of motion tracking. The approach’s primary shortfall is the time it takes to process video data, and that will be addressed as computing power and programming techniques evolve.
With AI/ML, many of the traditional biomechanical analyses remain in place. For example, rigid body analyses, in which three non-collinear markers are used to track the orientation of a body segment, find direct parallels in markerless tracking. Feature identification that includes a proximal joint center, lateral condyle and distal joint center could come straight out of a traditional biomechanical marker set.
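The rigid-body parallel can be made concrete: given three non-collinear landmarks such as those named above, one can construct an orthonormal segment frame (a 3x3 rotation matrix) exactly as with a traditional marker set. The axis conventions below are illustrative only, not a specific clinical or ISB standard.

```python
import numpy as np

def segment_frame(p_prox, p_lat, p_dist):
    """Build a right-handed orthonormal frame for a body segment from
    three non-collinear landmarks (e.g. proximal joint center, lateral
    condyle, distal joint center). Axis conventions are illustrative.
    """
    z = p_prox - p_dist                # long axis of the segment
    z = z / np.linalg.norm(z)
    tmp = p_lat - p_dist               # second vector lying in the segment plane
    x = np.cross(tmp, z)               # perpendicular to the segment plane
    x = x / np.linalg.norm(x)
    y = np.cross(z, x)                 # completes a right-handed triad
    return np.column_stack([x, y, z])  # columns are the segment axes
```

Tracking this frame over successive video frames yields segment orientation, from which joint angles follow just as in marker-based analysis.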
The high resolution of AI/ML also enables monitoring of details such as how a ball is held during a pitching motion.
AI/ML approaches are being commercialized by several companies including Theia, KinaTrax, Intel 3Dat, and SwRI. While we are actively evaluating all, Theia and KinaTrax have already demonstrated acceptance in their respective markets.
Thus far we have created tight integrations with Theia and KinaTrax. From an “ease of use” perspective, it is important that the steps of recording video, processing video, applying AI modeling, and generating the appropriate analytical output proceed without intervention after clicking the record button. The MotionMonitor’s existing structure is used to collect digital video synchronously with standard laboratory peripherals such as forceplates and EMG. This greatly simplifies and expands the usability of Theia and KinaTrax markerless tracking.
In addition, The MotionMonitor’s unique design supports fast, special-purpose application development on multiple platforms. The In Game Baseball application developed for KinaTrax, shown in the following video, is a good example of this capability.