April 10, 2018

    Why More Data Isn’t Always Better

    At a conference this past week, I had the opportunity to listen to and learn from some great professionals in the field. Many of the presentations I attended discussed best practices for collecting and utilizing data within athletic and military populations. It was great to hear some of the speakers preaching many of the same concepts we discuss internally, as well as with our partner organizations: the importance of reliability and standardization, the value of context when interpreting data, and the need for practicality and simplicity in order for this data to become useful information.

    None of these concepts are new or groundbreaking, but they are so crucial that they bear repeating time and time again. Another concept came up that is perhaps a bit counterintuitive in the era of Big Data: practitioners looking to utilize data need to understand that more isn’t always better.

    “No data is better than bad data”

    As always at these conferences, there were a few great quotable lines, but I believe this concept may be the most important takeaway for practitioners who are hoping to utilize data to guide their decisions. We often assume that new technologies are inherently reliable and scientific purely because they are digital, wireless, or sleek, but that is often not the case. Organizations will often choose more “budget friendly” solutions without understanding that what they save in money on the front end, they can end up losing in time, buy-in, and insight by collecting unreliable and unusable data. As the saying goes, garbage in, garbage out.

    This is not to say that there are no viable cheap or free solutions. For example, there is a growing body of research showing value in utilizing rating of perceived exertion (RPE) and wellness questionnaires, which can be collected using a variety of free applications and software. Whatever the price, the very first question to ask when looking to collect data is this: is the data reliable?
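
    One common way to put a number on that question is a test-retest check: measure the same people twice under the same conditions and see how much the results move. The sketch below is a minimal illustration rather than a prescribed method. It assumes two repeated trials per person and uses the typical error (the standard deviation of the trial-to-trial differences divided by the square root of 2). The function names and the jump-height numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def typical_error(trial_1, trial_2):
    """Typical error: SD of the trial-to-trial differences divided by sqrt(2)."""
    if len(trial_1) != len(trial_2) or len(trial_1) < 2:
        raise ValueError("need two trials of equal length with at least 2 people")
    diffs = [second - first for first, second in zip(trial_1, trial_2)]
    return stdev(diffs) / sqrt(2)


def typical_error_percent(trial_1, trial_2):
    """Typical error expressed as a percentage of the overall mean."""
    grand_mean = mean(list(trial_1) + list(trial_2))
    return 100 * typical_error(trial_1, trial_2) / grand_mean


# Hypothetical data: jump height (cm) for four athletes, tested twice.
day_1 = [30.0, 35.0, 40.0, 45.0]
day_2 = [31.0, 34.0, 41.0, 44.0]

print(f"typical error: {typical_error(day_1, day_2):.2f} cm")
print(f"as a percentage of the mean: {typical_error_percent(day_1, day_2):.1f}%")
```

    If the typical error is as large as the changes you hope to detect, the tool cannot tell a real change from noise, no matter how sleek it looks.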

    Just because you can doesn’t mean you should

    Countless devices and measurement tools today can give us thousands of variables to look at. Some of these variables can be meaningful, but many of them will simply add complexity and confusion. Deciding on your own which of these variables to utilize can easily become an exercise in futility. While it is true that a large volume of data is necessary to truly uncover what is meaningful, it is longitudinal depth (the same measures tracked over time), not simply breadth across many different variables, that allows this to occur. Collecting “too much” data will often cause paralysis by analysis, as those looking at the data are unsure what they are supposed to do next. Many organizations feel as though they are behind the curve if they are not collecting large amounts of data, even if they have no intention of using it.
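
    To make the point about depth concrete, here is a minimal sketch of what a single variable tracked over time makes possible: comparing today’s value against the individual’s own history. Everything in it is an assumption for illustration, including the function name, the 7-entry minimum, and the wellness scores.

```python
from statistics import mean, stdev


def deviation_from_baseline(history, today, min_entries=7):
    """Return how many standard deviations today's value sits from the
    individual's own historical mean, or None if there is too little
    history (or no variation) to judge. The 7-entry minimum is arbitrary."""
    if len(history) < min_entries:
        return None
    spread = stdev(history)
    if spread == 0:
        return None
    return (today - mean(history)) / spread


# Hypothetical daily wellness scores (1-10) for one athlete.
past_week = [6, 7, 8, 7, 6, 8, 7]

print(deviation_from_baseline(past_week, today=5))      # well below this athlete's norm
print(deviation_from_baseline(past_week[:3], today=5))  # None: not enough history yet
```

    With only a few days of data the function has nothing to say, which is the point: one variable followed consistently tells you more than fifty variables collected once.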

    In the academic world, before researchers begin data collection, they clearly define the question they are trying to answer and perform an in-depth review of the current literature. Practitioners in the field need to understand these concepts and take the time to do the same. Before we even begin, we must ask: what problem are we trying to solve?

    Are you a researcher or a practitioner?

    In an age of Big Data, it seems we are being swayed into the simple mindset that more is always better. In the applied world, we need to be aware that the individuals we work with are not test subjects, but human beings. The growing collaboration between academia and the applied setting is a great thing, but the goal of data collection on our end is clear: to improve outcomes for these individuals. Building trust between athlete and coach, or patient and caregiver, is a prerequisite for data to actually be utilized. Simply having a large amount of data doesn’t in itself guarantee improved outcomes.

    Scientists and researchers will hopefully continue to perform more controlled studies in order to give practitioners insight, as well as validate what they are seeing in the applied setting. For an individual, however, data collection is often invasive and can itself be a stressor. Is the data we are collecting able to give us the information we need, or are we simply satisfying a selfish curiosity? Data collection is a means to an end; it isn’t about you.

    To simplify is complicated

    Technology today has helped to make data collection more seamless than ever. In the real world, this ability is both a blessing and a curse. When you can collect anything you want, the challenge is deciding what you should collect.

    • Ensure the data you are collecting is reliable

    • Don’t collect data simply because you can or you feel you should

    • Understand your goal: what problem are you trying to solve?
