“It’s what you learn after you know it all that counts”
– John Wooden
Dissecting research can be laborious, frustrating, and confusing. It is extremely tempting to review the abstract, file key takeaways, and move on. If you feel a strong desire to overachieve, you may skim the discussion section. This was how I approached literature reviews when in PT School and through much of residency (this is not a license for current residents to do the same, learn from the error of my ways). While I slowly came around to the value of reading the methods and results sections thoroughly, it wasn’t until I had a greater grasp of the foundations of research and statistics – still developing this foundation – that my desire and understanding flourished.
Chances are a few of you have wide eyes right now and are thinking “please do not tell me you are about to ramble on for 10 minutes on research methods sections!” To put your mind at ease, I am not…well not entirely. The intent is not to drag you along a boring review of how studies are made. The intent is to highlight key areas that are often either overlooked or are simply invisible to most readers of literature, regardless of what some of my colleagues may think (plus you receive the added bonus of my hilarious and well-timed memes). I’ll use various relevant topics to showcase the impact poor research design can have on deciphering and implementing literature into practice. This post is a ‘dipping your toe in the water’ guide to understanding research. Nothing too serious. Part two will get into the crux of the primary issues within the realm of understanding the results and translational science. Prepare yourself!
We will start by considering the placebo effect. For many people, the first thing that comes to mind is the sugar pill. The basic premise of a placebo is that your mind believes you received the active treatment, and consequently, you receive the expected benefits, despite having received an inactive control. Placebo assessment is vital when determining the effectiveness of an intervention in medicine. For example, let us say you are part of a new drug trial aimed at quickly reducing exercise-induced muscle soreness.
CONTROLS AND PLACEBOS
On the first day of the study, you complete a protocol of seated knee extension exercises at 75% of your 1 rep max to failure for 6 sets. You are asked to return in 48 hours. When you waddle back into the clinic due to the massive amount of delayed onset muscle soreness (DOMS), you are provided a pill that has been approved for human testing. The medicine shows great promise in reducing DOMS severity following physical exertion, which will allow for a more rapid return to activity. You have been informed of the risks and are a willing participant. You take the medicine and wait. The researchers check in with you after 20 minutes, 1 hour, and 4 hours to assess your pain levels. Remarkably, you feel much better at the 4-hour mark and quickly abandon the waddle. You feel so good, in fact, that you are ready to take on another set of leg extensions to prepare for beach season and justify your recent purchase of debatably too-short swim trunks. Alas, the researchers ask you to hold off and report pain scores at 24 and 72 hours post administration of the medicine.
You, as the reviewer, later find out the participant took a sugar pill, thus displaying the elevated power of the placebo effect. This warrants a closer look, however.
What if both the experimental group and the control group improved but the magnitude for the experimental group was greater?
How much greater would the difference have to be to matter?
What if the designed study did not include a control at all but the outcomes were outrageously positive?
Is the study still valuable and can you conclude the pill created the effect?
Would the placebo effect have been present if no information about the drug’s efficacy in previous trials was disclosed?
These questions are vital for clinicians to thoroughly understand.
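To make the "how much greater would the difference have to be" question concrete, here is a minimal sketch in Python. Every number in it is invented for illustration: the pain-change scores are hypothetical, and the 2-point threshold for a clinically meaningful difference is an assumption for a 0-10 pain scale, not a value taken from any trial discussed here.

```python
from statistics import mean, stdev

# Hypothetical improvement in pain (baseline minus 4-hour score, 0-10 scale).
# These values are made up for illustration only.
drug_change = [4, 5, 3, 6, 4, 5, 4, 5]
placebo_change = [3, 4, 2, 4, 3, 3, 4, 3]

# Assumed threshold for a clinically meaningful between-group difference.
MCID = 2.0


def between_group_difference(treated, control):
    """Mean improvement in the treated group beyond the control group."""
    return mean(treated) - mean(control)


def cohens_d(treated, control):
    """Standardized mean difference using a pooled standard deviation."""
    n1, n2 = len(treated), len(control)
    pooled_var = (
        (n1 - 1) * stdev(treated) ** 2 + (n2 - 1) * stdev(control) ** 2
    ) / (n1 + n2 - 2)
    return (mean(treated) - mean(control)) / pooled_var ** 0.5


diff = between_group_difference(drug_change, placebo_change)
effect_size = cohens_d(drug_change, placebo_change)

print(f"Drug group mean improvement:    {mean(drug_change):.2f}")
print(f"Placebo group mean improvement: {mean(placebo_change):.2f}")
print(f"Between-group difference:       {diff:.2f}")
print(f"Cohen's d:                      {effect_size:.2f}")
print(f"Exceeds assumed MCID of {MCID}?   {diff >= MCID}")
```

In this made-up data set both groups improve and the drug group improves more, yet the extra benefit over placebo falls short of the assumed threshold. Without the control group, the full improvement in the drug group would have been credited to the pill.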
BEWARE OF THE SELECTION BIAS
Even if you decide research is for nerds – proud nerd here! – and you determine you will never need to know how to set up a study, understanding the methodology of study design will allow you to critically appraise both literature and individuals who claim they’ve read the literature and are attempting to synthesize relevant information to regurgitate to the masses. Some do this far better than others (i.e. professors/researchers > journalists…most of the time) but how do you know when the reported information is accurate?
When looking at a study, the methods and results sections are by far the most important. In the methods, you immediately want to know about the setup and potential sources of bias, as these can muddy the waters of your results. For example, if the study above included participants ranging from 12 to 92 years old and did not control for training experience, then there is a high likelihood of volatility in the results due to the variability in the participants. Conversely, if the population consists of healthy, trained college athletes ranging from age 18 to 22, the results are more predictable, but you can only translate them to people within those specific parameters (i.e. other healthy, trained college athletes aged 18 to 22), not to the general population. DOMS would likely be significantly dampened for someone with a history of training, which stimulates the protective phenomenon of the repeated bout effect, compared to a novice individual.
This highlights how recruitment of individuals and the inclusion and exclusion criteria of studies can dramatically impact the results. We need to determine if the researchers obtained a representative population. Did they recruit participants similar to the target group the pain-relieving medicine is intended for upon release, or did they perhaps lure a bunch of people seeking a quick ‘fix’ for their addiction and the opportunity to try a new pain drug? Selection bias is a primary concern when evaluating the quality of a trial and determining the applicability of the results. Selection bias essentially means the researchers did not obtain a randomized group of participants and thus the sample is not representative of the population. This is important as the end goal is to be able to apply the results to future individuals with a reasonable amount of predictability.
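A small simulation can show why an unrepresentative sample matters. The sketch below is built entirely on assumptions chosen for illustration: that 20% of the population trains regularly, that trained people average 4/10 soreness after the protocol, and that untrained people average 7/10. None of these figures come from the literature.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def simulate_person():
    """Return (trained, soreness) for one hypothetical person."""
    trained = random.random() < 0.2       # assumed: 20% train regularly
    base = 4.0 if trained else 7.0        # assumed average soreness (0-10)
    soreness = min(10.0, max(0.0, random.gauss(base, 1.0)))
    return trained, soreness


def mean_soreness(people):
    """Average soreness across a list of (trained, soreness) pairs."""
    return sum(s for _, s in people) / len(people)


population = [simulate_person() for _ in range(10_000)]

# Representative recruitment: every person has an equal chance of selection.
random_sample = random.sample(population, 200)

# Biased recruitment: only trained athletes are enrolled.
athletes_only = [p for p in population if p[0]][:200]

print(f"Population mean soreness:    {mean_soreness(population):.2f}")
print(f"Random sample mean soreness: {mean_soreness(random_sample):.2f}")
print(f"Athletes-only mean soreness: {mean_soreness(athletes_only):.2f}")
```

Under these assumptions the random sample lands close to the population average, while the athletes-only sample reports far less soreness. A drug tested only in the second group could look like it is working on a problem that group barely has.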
Let’s say the recruitment was correctly carried out and you have a representative population who are appropriate for the trial. Next we must determine the set-up for administering the treatment.
What was the education provided to the patient? Were they told to expect pain relief or not?
Were the experimenters blinded to who received the treatment and who received the placebo?
You may be surprised, but the latter can have a substantial impact on the outcomes. How is that, you ask? Great question.
I KNOW SOMETHING YOU DON’T KNOW
Studies have demonstrated that experimenters or providers act differently when they believe the intervention is a sham rather than the real deal. This is a critical point for the world of rehabilitation. While we may be able to blind a patient to the sham intervention, such as setting up a thrust joint manipulation while only providing a light shear force, a clinician will know whether they provided a legitimate joint manipulation or not. Subtle changes in body language, effort, communication, and facial expressions can cue the patient that they are not receiving the experimental treatment and thus the potential improvement will be dampened. Let’s take it one step further.
The type of sham intervention matters as well. In many cases, a sham intervention may be a treatment. While a thrust did not occur during the sham manual technique, what was the effect of placing your hands on a patient? Was the hand placement on an area that was painful? Perhaps the patient experienced relief from the light touch acknowledgement of their pain. The sham can potentially highlight what components of the intervention are actually creating an effect. This leads to a lot of debate and controversy among treatments provided. In physical therapy, dry needling comes to the forefront regarding methodological quality and sham treatments.
The intent of this post is not to dive into the efficacy of dry needling but instead to demonstrate the need to evaluate sham procedures. At Combined Sections Meeting this year, I gave platform presentations on my two recent publications.[3, 4] In a platform presentation, 8 presenters are each given a 15-minute block to present their research and answer a couple of audience questions. In my second session, one of the presentations, “Short-Term Outcomes of Dry Needling for Patients with Mechanical Neck Pain: Randomized Clinical Trial” by Gattie et al, used sham dry needling.
The study utilized a sham needle device that blocked the needle from penetrating the skin (thin rubber material). The clinician was still able to replicate the tapping of the needle (when you insert it) and its manipulation (such as twisting), but the point would not penetrate the skin. The authors compared the pain and outcome scores of two statistically similar groups presenting with shoulder pain following physical therapy treatment. Each group received the same exercise protocol, but one received sham periscapular dry needling and the other received actual dry needling. The results demonstrated statistically similar improvements for both sham and experimental groups. Unfortunately, they did not have a control group receiving exercise only. The most fascinating outcome, however, pertains to the assessment of the sham procedure.
At the end of the interventions, the participants were provided a questionnaire asking whether they believed they received the sham, the actual needle, or were unsure. No participant in either group (total n of 77) believed they received the sham, with only a couple more in the sham group stating they were unsure compared to the experimental group. Similar findings have been reported with other types of sham needles. Again, this research does not determine the efficacy of dry needling, but it is something to consider when attempting to determine what is contributing to an observed effect.
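Responses to a "which treatment do you think you received?" questionnaire can be summarized numerically. One option is Bang's blinding index, sketched below as I understand it: for each arm, the number of correct guesses minus the number of incorrect guesses, divided by all responses including "unsure". Values near 0 suggest guessing at random, values near +1 suggest participants identified their arm, and values near -1 suggest they believed they were in the other arm. The counts are hypothetical and are not the figures from the presentation described above.

```python
def blinding_index(correct, incorrect, unsure):
    """Per-arm blinding index: (correct - incorrect) / all responses."""
    total = correct + incorrect + unsure
    if total == 0:
        raise ValueError("at least one response is required")
    return (correct - incorrect) / total


# Hypothetical counts, invented for illustration only.
# Real-needle arm: 36 guessed "real" (correct), 0 guessed "sham", 4 unsure.
real_arm = blinding_index(correct=36, incorrect=0, unsure=4)

# Sham arm: 0 guessed "sham" (correct), 34 guessed "real", 6 unsure.
sham_arm = blinding_index(correct=0, incorrect=34, unsure=6)

print(f"Real-needle arm index: {real_arm:+.2f}")
print(f"Sham arm index:        {sham_arm:+.2f}")
```

A strongly positive index in the real arm paired with a strongly negative index in the sham arm is the pattern you would expect when nearly everyone believes they received the real treatment, which points toward a convincing sham.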
For the past two years, I have worked with Emory DPT students to develop and write systematic reviews (SR). It has been a steep learning curve and a challenge; however, it has taught me a great deal about the assessment of literature. One of the tools used when conducting an SR is an assessment of rigor. In essence, these charts provide a framework for evaluating the quality of evidence through assessment of potential sources of bias and the soundness of the methodology.
For example, the Medlicott & Harris Evaluation of Methodological Rigor assesses the following: randomization, presence of inclusion/exclusion criteria, whether the subject pools were similar at baseline, repeatability of the treatment protocol, whether outcome measure reliability and validity are included, whether the assessment was blinded (could be broken down to single vs. double blind), presence of long-term follow-up, and mention of adherence to home program (obviously, specific to rehab). When you scan this list, how many of these qualities do you look for when reading a study? This list is not exhaustive either (e.g. selection bias and conflicts of interest).
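If it helps to keep the criteria listed above in front of you while reading, they can be written down as a simple checklist. The sketch below is only a personal note-taking aid: the field names are my own shorthand, the example study is hypothetical, and counting the items met is not the published scoring method for the tool.

```python
from dataclasses import dataclass, fields


@dataclass
class RigorChecklist:
    """One yes/no entry per criterion named in the paragraph above."""
    randomized: bool
    inclusion_exclusion_criteria: bool
    similar_at_baseline: bool
    protocol_repeatable: bool
    outcome_reliability_validity: bool
    blinded_assessment: bool
    long_term_follow_up: bool
    home_program_adherence: bool

    def met(self):
        """Names of the criteria the study satisfies."""
        return [f.name for f in fields(self) if getattr(self, f.name)]

    def missing(self):
        """Names of the criteria the study does not satisfy."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical study, filled in while reading its methods section.
example_study = RigorChecklist(
    randomized=True,
    inclusion_exclusion_criteria=True,
    similar_at_baseline=True,
    protocol_repeatable=True,
    outcome_reliability_validity=False,
    blinded_assessment=True,
    long_term_follow_up=False,
    home_program_adherence=False,
)

print(f"Criteria met: {len(example_study.met())} of {len(fields(example_study))}")
print("Not addressed:", ", ".join(example_study.missing()))
```

The list of missing items is the more useful output, since it tells you which questions to carry into the results section.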
There are additional subtleties to consider depending on the type of study being conducted – randomized controlled trial, cohort, quasi-experimental – but it should highlight the importance of analyzing how the methods are developed and carried out.
Another important factor to assess is the controlling of variables. Whether it is the pain pill experiment or assessing a real vs. sham manipulation, we unfortunately cannot control for all the variables that may impact the outcome of a treatment. In the case of the DOMS-reducing drug, we certainly have more control though. The intervention itself is pretty straightforward: swallow the pill. For the manipulation, however, we have more to consider.
How skilled is the practitioner? Aside from the difference in the delivery of the techniques, what is the patient’s belief about manipulation and cavitations? If they hate the sound of cracking knuckles or do not like being up close and personal with someone, they may tense or simply respond poorly as the experience itself was negative. How about the state of mind and body of the patient? This can affect any type of experiment and intervention. If they came into the experiment and had a rough night of sleep, had a fight with a significant other, were recently laid off at work, or they are a Browns fan and the NFL season just started, chances are their foul mood could result in a dampened experience; now is not the best time to conduct many types of trials.
Conversely, the subject who recently received a raise, went on a great first date, or slept gloriously the previous night will be in a far better state of mind and more likely to respond well to the treatment. Again, this list is not exhaustive – their diet, past experiences, connection with the clinician, temperature and comfort of the treatment room, the number of times they have watched the current episode of Fixer Upper playing on the TVs in the clinic – but it underscores the inability to control all variables. We can draw incomplete and inaccurate conclusions if we simply look at the outcome following an intervention and ascribe all the credit to said intervention.
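The risk of ascribing all the credit to the intervention can be shown with a short simulation. The sketch below assumes, purely for illustration, that the observed improvement is the sum of natural recovery, a placebo response, a small true drug effect, and random person-to-person noise. The sizes of those components are invented.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Assumed components of improvement on a 0-10 pain scale (all hypothetical).
NATURAL_RECOVERY = 2.0   # soreness fades with time regardless of treatment
PLACEBO_RESPONSE = 1.5   # benefit from expecting the pill to work
DRUG_EFFECT = 0.5        # true pharmacological effect


def observed_improvement(got_drug):
    """Improvement for one participant, including uncontrolled noise."""
    noise = random.gauss(0, 1.0)  # sleep, mood, diet, room comfort, ...
    drug = DRUG_EFFECT if got_drug else 0.0
    return NATURAL_RECOVERY + PLACEBO_RESPONSE + drug + noise


drug_group = [observed_improvement(True) for _ in range(500)]
placebo_group = [observed_improvement(False) for _ in range(500)]

# Uncontrolled reading: credit the whole pre-post change to the drug.
naive_estimate = mean(drug_group)

# Controlled reading: only the improvement beyond the placebo group.
controlled_estimate = mean(drug_group) - mean(placebo_group)

print(f"Naive pre-post estimate:      {naive_estimate:.2f}")
print(f"Placebo-controlled estimate:  {controlled_estimate:.2f}")
print(f"True drug effect (assumed):   {DRUG_EFFECT:.2f}")
```

Under these assumptions the pre-post change overstates the drug's contribution several times over. Note that even the placebo-controlled comparison cannot separate natural recovery from the placebo response; that would take a no-treatment arm as well.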
In my next post, I will discuss the other vital section to read in a paper: the results section. In the meantime, I encourage you to dive deeper into the methods section of papers you are reading. There will legitimately be papers that warrant no further consideration after the methods section as you determine the paper is essentially useless. It is frustrating but, unfortunately, often the case. Your ability to translate evidence into clinical practice will improve as you develop your critical thinking skills. You will better understand the reasoning for treatment choices you may not currently utilize in the clinic. It will take practice – I am still learning more about this every day – and you will have to research about research to have a greater grasp on the concepts (yes, it is as fun as it sounds). You will also start to note that abstracts are heavily biased and only tell a portion of the story. This will become even more apparent after the next post. I promise it is worth the added effort and will help you become a more skilled and evidence-informed clinician.
1. Blasini, M., et al., The Role of Patient-Practitioner Relationships in Placebo and Nocebo Phenomena. Int Rev Neurobiol, 2018. 139: p. 211-231.
2. Gattie, E., J.A. Cleland, and S. Snodgrass, The Effectiveness of Trigger Point Dry Needling for Musculoskeletal Conditions by Physical Therapists: A Systematic Review and Meta-analysis. J Orthop Sports Phys Ther, 2017. 47(3): p. 133-149.
3. Walston, Z. and C. McLester, Importance of Early Improvement In The Treatment of Low Back Pain With Physical Therapy. Spine (Phila Pa 1976), 2019.
4. Walston, Z. and J. McLester, Impact of low back pain chronicity on patient outcomes treated in outpatient physical therapy: a retrospective observational study. Arch Phys Med Rehabil, 2019.
5. Mitchell, U.H., et al., The Construction of Sham Dry Needles and Their Validity. Evid Based Complement Alternat Med, 2018. 2018: p. 9567061.
6. Medlicott, M.S. and S.R. Harris, A systematic review of the effectiveness of exercise, manual therapy, electrotherapy, relaxation training, and biofeedback in the management of temporomandibular disorder. Phys Ther, 2006. 86(7): p. 955-73.