While down at MLB Instructs last week, a Front Office executive asked my thoughts on a new wearable they just got. Many of these devices collect an enormous amount of data for tens or even hundreds of different variables. The Front Office executive voiced concern over how long it would take to collect enough data to be meaningful and actionable. His concerns were well placed.
Particularly in the United States, job contracts don’t last long enough to allow us to collect data for a year or two before we can figure out what is actionable. Unfortunately, teams need to win, and win NOW. Teams need to do everything they can to improve athlete resiliency and improve it right NOW.
So if teams need to use data to make better decisions today, they need to be able to take advantage of volumes of data they simply can’t build themselves. While many of today’s technologies bring in huge volumes of data, the volume comes from width, not depth. Modern wearables provide hundreds of variables for each athlete, but just because you can measure 1000 different variables doesn’t mean you should. Instead, build depth by collecting a few meaningful variables frequently.
Even if a single team or organization has enough data (GPS, Accelerometry, HRV, force plate assessments), we need outcome metrics like injuries to make meaning out of it. Hopefully a single organization never suffers enough injuries to build its own injury risk models, but this is where the database can help! Having a depth of frequent, reliable, longitudinal data along with outcome metrics like injuries allows you to begin the pursuit of predictability.
The power of a database is truly demonstrated when the same standardized assessment is performed frequently across dozens of organizations with people of all shapes and sizes. The most important thing this allows us to do is identify “norms”. When you go to the doctor to get blood work done, the results are interpreted based on where the patient’s numbers lie compared to the gender norms. Is the white blood cell count high or low? What are the inherent risks with a low white blood cell count? What is the ideal range? Without this context, the tests don’t mean much.
For example: Running Back “A” has a concentric impulse of 5.39724 N*s/kg. To most people, this number doesn’t mean anything. With the context of norms, you can see this would be considered normal for a running back. But just because he is normal for his population doesn’t mean it is ideal. When we compare him to the entire population, he falls in an extremely low range, and we can see that he has increased odds of suffering a muscular strain. That meaningful insight only comes to light when he is compared to our entire database, not just people who play his sport or position.
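To make the comparison concrete, here is a minimal Python sketch of ranking one athlete against two reference groups. Apart from Running Back “A”'s 5.39724 N*s/kg, every number below is made up for illustration, and the `percentile_rank` helper and the group names are assumptions, not our actual database or scoring method.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value, reference):
    """Percent of the reference sample below `value`, counting ties as half."""
    ordered = sorted(reference)
    if not ordered:
        raise ValueError("reference sample is empty")
    below = bisect_left(ordered, value)
    ties = bisect_right(ordered, value) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical concentric impulse values (N*s/kg), invented for this sketch.
running_backs = [5.0, 5.2, 5.3, 5.4, 5.5, 5.6, 5.8]
other_males = [round(5.9 + 0.05 * i, 2) for i in range(53)]
all_males = running_backs + other_males

athlete = 5.39724  # Running Back "A" from the example above

rank_in_position = percentile_rank(athlete, running_backs)
rank_in_population = percentile_rank(athlete, all_males)

print(f"vs. running backs: {rank_in_position:.1f}th percentile")
print(f"vs. all males:     {rank_in_population:.1f}th percentile")
```

With these invented numbers, the same athlete sits near the middle of his position group but near the bottom of the full population. That is the pattern described above: the choice of reference group changes what the number means.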
Instead of comparing running backs to running backs, we analyze movement in the context of the entire male or female population. We get asked a lot why we haven’t created separate databases for each sport, position, or population. It turns out much more meaningful findings are possible when utilizing only gender norms. Similar questions have come up in creating norms for blood work. Should we use separate norms based on ethnicity? Age? Just like what they discovered in the blood work results, we found there is more similarity between comparable populations in different sports (an Offensive Lineman, a Front Row in rugby, or a Catcher) than between positions within a single sport (Offensive Lineman and Quarterback). Gender alone has proven the most reliable and valid grouping for creating actionable information.
Building a database of depth takes TIME. Data depth > data width.
Norms give meaning to the depths of data.
Movement is universal, not sport specific.