Initial Severity and Antidepressant Benefits: Author Replies to Commentaries
Posted by plosmedicine on 31 Mar 2009 at 00:23 GMT
Author: Blair T. Johnson
Position: Professor of Psychology
Institution: University of Connecticut, Storrs, Connecticut, United States of America
Additional Authors: Tania B. Huedo-Medina, Irving Kirsch, Brett J. Deacon
Submitted Date: March 14, 2008
Published Date: March 14, 2008
This comment was originally posted as a “Reader Response” on the publication date indicated above. All Reader Responses are now available as comments.
The article we published in February reports an analysis of evidence submitted to the U.S. Food and Drug Administration (FDA) for approval of new-generation anti-depressants (Kirsch, Deacon, Huedo-Medina, Scoboria, Moore, & Johnson 2008). We found that anti-depressants have a limited impact relative to placebo except in samples with higher levels of depression, where a clinically significant difference emerged. We are grateful that our article has generated so much spirited discussion and are happy to address not only the PLoS commentaries but also many that have appeared elsewhere. In this response, we address representative samples of these commentaries.
**How Well Do Anti-Depressants Work?**
A first class of replies characterized our review as reaching conclusions that do not square with clinical practice. Representative of these statements are comments from Vargas that “Doctors and patients know what works and what does not” and Werner that “Clinical practice plus millions of content patients can’t be that wrong” (Jewett and Dales provide similar comments). Contrary to these claims, we can imagine similar statements being said in centuries past about bloodletting, lizard’s blood, crocodile dung, pig’s teeth, putrid meat, fly specks, frog’s sperm, powdered stone, human sweat, worms, spiders, furs, feathers, and all of the other treatments that were once widely used but whose success, if any, is now considered to have been entirely due to placebo effects (Honigfeld 1964). As Lenzer and Brownlee put it, “overwhelming expert consensus and clinical observation have been proven wrong time and time again.”
Because clinical depression is a serious disorder with major implications for functioning, we are sympathetic with patient reports of improvement on anti-depressants (Dales, Roberts, Thirlwell, Clark) and wish to allay patient (Dales) and scholarly concerns (e.g., Cowen) that our conclusions about anti-depressants somehow discredit or discriminate against people who suffer from depression. The fact that the Hamilton Rating Scale for Depression (HRSD) is not a self-report but results from the doctor’s observations of the patient should provide some solace that the large observed changes in response to both placebo and drug are not illusory (American Psychiatric Association 2000; Furukawa et al. 2007; Hamilton 1960).
In actuality, there is no discrepancy between our results and clinical experience. In clinical practice, patients are given antidepressant medication and improve, consistent with the results we reported. The important point of our article is that substantial improvement also appears in the placebo groups. Presumably, no clinician prescribes patients placebos, but clinical trials take this exact option in order to provide a rigorous control, using procedures intended to blind both provider and patient as to whether drug or placebo was administered. As our article states, overall mean improvement was 7.80 points for the placebo condition, compared with 9.60 points for the drug condition (difference = 1.80, 95% CI = 1.35, 2.25); thus overall mean improvement on placebo is 81% of that for anti-depressants. As our Figure 2 shows, for samples with lower initial severity of depression, placebo effects overlapped more with drug effects; they overlapped less as severity increased.
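The 81% figure follows directly from the two mean improvements quoted above. A minimal sketch of that arithmetic, using only the numbers in the paragraph (the variable names are illustrative, not from the article):

```python
# Mean HRSD improvement in points, as quoted in the text above.
placebo_improvement = 7.80
drug_improvement = 9.60

# Drug-placebo difference, and the share of the drug improvement
# that is also seen in the placebo condition.
difference = drug_improvement - placebo_improvement
placebo_share = placebo_improvement / drug_improvement

print(f"difference = {difference:.2f} points")   # difference = 1.80 points
print(f"placebo share = {placebo_share:.0%}")    # placebo share = 81%
```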
Because neither clinicians nor patients can see what might occur in response to a mere placebo pill, it is no wonder that both the patients and their physicians attribute the patient’s improvement to a particular prescribed treatment. Indeed, the same factors that create improvement in patients who receive placebo logically can be expected to create improvement in patients who receive drugs. The mere fact that people routinely expect medications to work (Gomory) is ample reason for large placebo effects in depression (Walsh, Seidman, Sysko, & Gould 2002), consistent with Tucker’s speculations. Magnetic resonance imaging research has shown that patients responding to placebo exhibit similar neurological changes to patients responding successfully to an anti-depressant (Benedetti, Mayberg, Wager, Stohler, & Zubieta 2005). The implication is that, especially for patients with less severe depression, the expectancy of improvement may be a bigger motivator of change than any pharmaceutical action of the anti-depressant.
**Sampling of Studies and the Potential for Differential Drug Efficacy**
Wager drew attention to the fact that focusing purely on published trials amounts to use of “selective information.” As we stated in our Methods, we filed a Freedom of Information Act (FOIA) request to the FDA for all releasable information about the clinical trials for efficacy conducted for marketing approval of fluoxetine, venlafaxine, nefazodone, paroxetine, sertraline, and citalopram, drugs that were each approved between 1987 and 1999 (our presented analyses omitted sertraline and citalopram trials because these study reports were incomplete). Consequently, our review does not pertain to trials performed after FDA approval for each drug or for drugs reviewed after our FOIA request. Although, as Corruble and Russell mentioned, certain categories of antidepressants (e.g., tri-cyclic anti-depressants) were not included, the results of the meta-analysis are germane to the medications that were most popularly prescribed at the time of our FOIA request (and still are popular). Furthermore, previous meta-analyses indicate that beneficial effects of different classes of antidepressants are comparable (e.g., Kirsch & Sapirstein 1998). Turner and colleagues (2008) also showed negligible differences between drugs, suggesting that more recent drugs are no more efficacious than the set we reviewed.
The fact that the FDA mandates that all trials be provided for review implies that the FDA’s data will not suffer reporting biases such as have been reported elsewhere. Of the 35 trials available for the 4 drugs that we fully analyzed (see Table 1), we were unable to match 11 (31%) with a published counterpart, a rate identical to Turner and colleagues’ (2008) examination of 74 FDA-registered trials for anti-depressants. The implication is that our meta-analytic results rest on relatively complete evidence. Moreover, when we included all the available studies’ results (whether complete or not), the patterns we presented in the article remained intact. Many have drawn attention to the fact that a compelling picture of anti-depressants’ efficacy emerged only following our use of a FOIA request, thus making complete vital public health information available. As Nature (2008), Turner and Rosenthal (2008), Lenzer and Brownlee (2008) and Wager concluded, public archives for trial registration and compulsory results testing would go a long way to ensuring that the complete data can inform policy and practice.
Some readers (e.g., Watson; Steel) took issue with the PLoS Medicine editorial summary’s statement that all the medications we studied (fluoxetine, venlafaxine, nefazodone, and paroxetine) are selective serotonin reuptake inhibitors (SSRIs). Indeed, according to Goldberg (1995), “in addition to inhibition of serotonin reuptake, nefazodone exhibits 5-HT2 antagonism. Venlafaxine inhibits the reuptake of both norepinephrine and serotonin” (p. 591). Because all of the drugs at least may inhibit serotonin reuptake, they may reasonably be labeled SSRIs. In any event, as we say in our paper, our intent was to examine the efficacy of new-generation antidepressants. The fact that the different drugs may have different underlying psychopharmacology led Watson to inspect trial results for the drugs and to observe that “the mean changes for nefazodone appear to be much closer to placebo than the other drugs examined.” Yet, as we reported in our results, this difference was more apparent than real, disappearing when we controlled for baseline severity. It is worth noting that Turner et al. (2008) found between-group effect size (d) estimates of 0.40 for venlafaxine and 0.26 for nefazodone, both of which are close to the mean of 0.40 for all 12 newer antidepressants and are identical to those for fluoxetine (0.26) and paroxetine (0.42).
**Generalizability and Analytic Issues**
Leonard took the trouble of re-analyzing the data from our Table 1 and concluded that a clinically significant difference emerged at a lower point of severity than we concluded in our article (i.e., 26 vs. 28). We are grateful that his work confirms our major conclusion, which is that the efficacy of anti-depressants depends on the initial severity of depression. Unfortunately, however, his estimates of the standard deviation underlying each effect size relied on between-subjects’ rather than within-subjects’ formulations. In examining improvement in response to drug or placebo, individual trials conventionally control for the correlation between the HRSD scores at baseline. We adopted this convention in our analyses of drug and placebo improvement.
Reassuringly, the analyses at the end of our Results section pertaining to each trial’s drug vs. placebo comparison also used a between-subjects variance formulation and confirmed that clinical significance emerges in the vicinity of an HRSD score of 28.
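The practical difference between the two variance formulations can be illustrated with the textbook identity for the standard deviation of a change score. This is only a sketch: the standard deviations and the correlation below are hypothetical values chosen for illustration, not figures from the trials, and the article's exact computation may differ.

```python
import math

def sd_of_change(sd_pre, sd_post, r):
    """SD of (pre - post) change scores when pre and post correlate at r."""
    return math.sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)

# Hypothetical values, NOT taken from the trials.
sd_pre, sd_post = 8.0, 8.0

# Within-subjects formulation: uses the pre-post correlation (assumed 0.5 here).
within = sd_of_change(sd_pre, sd_post, r=0.5)

# Treating the two measurements as if they were independent (r = 0).
independent = sd_of_change(sd_pre, sd_post, r=0.0)

print(f"within-subjects SD of change: {within:.2f}")
print(f"SD of change assuming r = 0:  {independent:.2f}")
```

With any positive correlation, ignoring it yields a larger standard deviation and therefore a smaller standardized improvement for the same raw change.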
As Hoschl noted, because our analyses examined change scores and comparisons of change scores, our presentation did not make it easy to visualize how depressed participants may be after treatment with either drug or placebo. To illustrate, consider that the HRSD may vary between 0 and 52 in its standard usage. Nearly all of the trials’ samples had initial mean scores that ranged between 22.8 and 31, in the section of the scale that the American Psychiatric Association (APA) classifies as very severe depression (the most severe category they use). The weighted mean baseline score for the drug groups was 25.28; after treatment with medication, the score was 14.98; for placebo groups, these values are 25.36 and 17.41. Both of these values fall in the portion of the HRSD scale that the APA considers moderate depression. The fact that the post-treatment drug and placebo scores are so close confirms our conclusion that clinical significance has not been achieved for the overall difference between drug and placebo.
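For readers who want to see the post-treatment picture numerically, the sketch below simply subtracts the weighted means quoted above. The variable names are illustrative, and simple subtraction of weighted means need not reproduce the meta-analytic improvement estimates reported elsewhere in this reply.

```python
# Weighted mean HRSD scores quoted in the paragraph above.
drug_baseline, drug_post = 25.28, 14.98
placebo_baseline, placebo_post = 25.36, 17.41

# Raw change within each condition (baseline minus post-treatment).
drug_change = round(drug_baseline - drug_post, 2)
placebo_change = round(placebo_baseline - placebo_post, 2)

# Gap between the two conditions after treatment.
post_gap = round(placebo_post - drug_post, 2)

print(f"drug change:    {drug_change:.2f}")     # drug change:    10.30
print(f"placebo change: {placebo_change:.2f}")  # placebo change: 7.95
print(f"post-treatment gap: {post_gap:.2f}")    # post-treatment gap: 2.43
```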
We found a nonsignificant benefit of drug compared to placebo for moderately depressed patients. Yet, consistent with our other conclusions, the difference between drug and placebo grows at higher levels of depression. Davies commented on the fact that there were few samples with scores below the category of very severe depression on the HRSD, a limitation that our Discussion mentioned. There were also very few samples with scores over 30, even though individual patients can score higher. The fact that our database suffered from some restriction of range means that the full impact of initial severity may be more marked than our relatively small database was able to determine.
Young suspected that regression to the mean explained the results, arguing that “(1) that treatment is more effective in patients who are more depressed at the start of a trial; (2) that there is a placebo effect but this effect is no greater in patients who are more depressed at the start of a trial; and (3) that measurement of depression is subject to random error. Figure 3 is entirely consistent with this scenario – a scenario that is very different from the authors’ conclusion.” In fact we carefully considered but rejected the possibility that a regression artefact explained our primary findings because regression should be a constant influence on the drug and placebo groups, who were after all randomly assigned to condition. Moreover, one would expect more extreme groups to regress more toward the mean (less depressed) than less extreme groups, but in fact there was no such tendency in the data we reviewed.
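The first part of this argument, that regression toward the mean affects randomly assigned groups equally, can be illustrated with a small simulation. Everything in this sketch is hypothetical: the severity distribution, measurement error, and entry cutoff are invented for illustration, and no drug or placebo effect is simulated at all, so any change is pure regression.

```python
import random

random.seed(0)

TRUE_MEAN, TRUE_SD = 20.0, 5.0   # hypothetical "true" severity distribution
ERROR_SD = 4.0                   # hypothetical random measurement error
CUTOFF = 25.0                    # hypothetical entry criterion on observed baseline

changes = {"drug": [], "placebo": []}
for _ in range(200_000):
    true_score = random.gauss(TRUE_MEAN, TRUE_SD)
    baseline = true_score + random.gauss(0.0, ERROR_SD)
    if baseline < CUTOFF:
        continue  # not enrolled: observed baseline below the cutoff
    arm = random.choice(["drug", "placebo"])  # random assignment to condition
    # Follow-up is the same true score plus fresh error: no treatment effect.
    followup = true_score + random.gauss(0.0, ERROR_SD)
    changes[arm].append(baseline - followup)

mean_change = {arm: sum(v) / len(v) for arm, v in changes.items()}
gap = mean_change["drug"] - mean_change["placebo"]

print(mean_change)
print(f"between-arm gap: {gap:.3f}")
```

Under these assumptions both arms show a clearly positive apparent improvement, but the gap between them stays near zero, which is why a regression artefact alone would not be expected to produce a drug-placebo difference.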
Future studies should further explore how settings, side effects, and patient categories impact results, in addition to the duration of time on drug or placebo. Corruble rightly argued that “generalizability to clinical practice is a matter of concern because patients included in clinical trials are different from daily practice patients” and that omission of patients at high risk for suicide from the trials might mute the trials’ observed efficacy. Drug approval agencies explicitly take these considerations into account and often request additional trials when they perceive a threat. In this case, there appears to be evidence from RCTs that included patients with high suicidal risk. Specifically, a meta-analysis found a “more than twofold increase in the rate of suicide attempts in patients receiving SSRIs compared with placebo or therapeutic interventions other than tricyclic antidepressants” (Fergusson et al. 2005); SSRI patients also had more fatal suicides. Many (Clark; Beezhold; Goudsmit; Dales; Gill) expressed surprise that the trials took only relatively short-term assessments. Note that 8 weeks was the longest duration of efficacy trials considered by the FDA, largely because attrition rates become too high beyond that trial length; in the trials we reviewed, there was no tendency for duration to relate to efficacy (see also Kirsch et al. 2002). Finally, an examination of trials’ side effect data could be informative (Sieswerda, Roberts).
**Alternatives to Anti-Depressant Medications**
Several readers (e.g., Blackburn, Gomory, Russell) took issue with the editorial summary’s description of depression as “caused by imbalances in the brain chemicals that regulate mood.” The problem is of course that a mere “chemical imbalance” might seem amenable only to drugs rather than alternative therapies. In reality, all actions, whether normal or abnormal, may be regulated by, and in turn regulate, brain chemistries. Individuals can change how they act with regard to themselves and in turn affect how others relate to them, all of which may alleviate depression. To date, no definitive evidence has established key differences in the brain chemistry of depressed individuals and those who are not depressed (see Lacasse & Leo 2005), and a simple “chemical imbalance in the brain” theory of depression remains debatable (see Castrén 2005). Unfortunately, this common misconception of depression is, as pointed out by Lacasse and Leo, fueled by drug company marketing campaigns. Our study also demonstrates the problems inherent in reasoning backward from a treatment to infer the cause of a problem.
Among others (e.g., Turner & Rosenthal 2008), Beezhold criticised our conclusion that “there seems to be little evidence to support the prescription of antidepressant medication to any but the most severely depressed patients, unless alternative treatments have failed to provide benefit.” It is true that our meta-analysis considered no alternative treatments except for the placebo effect, yet the fact that so much improvement could be seen for placebo in very depressed samples strongly supports the conclusion that alternatives to anti-depressants should be the first line of defence against depression, especially when depression is not extremely severe. It is currently difficult to evaluate the extent to which alternative therapies may be combined or sequenced (e.g., STAR*D, see Stewart) to create greater reduction of depression. Nonetheless, an added benefit of alternative treatments is avoiding the common side effects associated with antidepressants, which in addition to the suicide risk cited above, commonly include dry mouth, urinary retention, blurred vision, constipation, sedation, sleep disruption, weight gain, headache, nausea, gastrointestinal disturbance or diarrhoea, abdominal pain, inability to achieve an erection, inability to achieve an orgasm (men and women), loss of libido, agitation, and anxiety.
Several readers drew attention to alternative treatments ranging from psychotherapy, to subliminal disruptions, to omega-3 fatty acid supplements. Although we find creativity in Tucker’s suggestion that excessive startle reflexes trigger depression, we will reserve judgment until compelling demonstrations are made. In any event, those experiencing such disruptions would have been equally present in both placebo and drug conditions of the trials; therefore this possibility is not a threat to the conclusions we reached. In contrast, we found Sieswerda and Ross’s evidence regarding omega-3 fatty acid supplements much more compelling, and we are grateful to them for sharing their evidence that improvement was larger for those who initially scored higher in depression, parallel to our results.
Cost considerations also bolster the conclusion that alternative therapies be considered. As Cowen stated, “there seems little point in spending large sums of money funding psychological treatments when all that is needed for the management of severe depression is the prescription of sugar pills.” As we alluded to above, it is not in clinicians’ powers or ethics merely to prescribe placebo, but the fact is that sufferers of depression routinely remain on these drugs for many months and years (Roberts, Clark). Especially in their non-generic forms, these drugs can be quite expensive for most patients. Dobson and colleagues (in press) examined the relative costs of psychotherapy and drug therapy for depression in a randomized controlled trial of adults with major depression. Their results indicated that psychotherapy became less expensive than drug therapy after 9 months and was more effective at preventing relapse into major depression at later assessments. Whether individuals or their health care plans must pay these expenses, it would seem the choice is clear.
Blair T. Johnson, PhD, University of Connecticut, Storrs, Connecticut, United States of America
Tania B. Huedo-Medina, PhD, University of Connecticut, Storrs, Connecticut, United States of America
Irving Kirsch, PhD, University of Hull, Hull, United Kingdom
Brett J. Deacon, PhD, University of Wyoming, Laramie, Wyoming, United States of America
American Psychiatric Association (2000). Hamilton Rating Scale for Depression (HRS-D), in Handbook of Psychiatric Measures. Washington, DC, pp 526–529.
Benedetti F, Mayberg HS, Wager TD, Stohler CS, Zubieta J-K. (2005). Neurobiological mechanisms of the placebo effect. The Journal of Neuroscience, 25, 10390-10402.
Castrén E. (2005). Is mood chemistry? Nature Reviews Neuroscience, 6, 241-246.
Dobson KS, et al. (in press). Randomized trial of behavioral activation, cognitive therapy, and antidepressant medication in the prevention of relapse and recurrence in major depression. Journal of Consulting and Clinical Psychology.
Fergusson D, Doucette S, Glass KC, Shapiro S, Healy D, Hebert P, Hutton B. (2005). Association between suicide attempts and selective serotonin reuptake inhibitors: Systematic review of randomised controlled trials. BMJ, 330, 396-399.
Freedom of Information Act (FOIA). 5 U.S.C. § 552 (1994 & Supp. II 1996).
Furukawa TA, Cipriani A, Barbui C, Geddes JR. (2007). Long-term treatment of depression with antidepressants: A systematic narrative review. Canadian Journal of Psychiatry, 52, 545-552.
Goldberg, RJ. (1995). Nefazodone and venlafaxine: Two new agents for the treatment of depression. Journal of Family Practice, 41, p591(4).
Honigfeld G. (1964). Non-specific factors in treatment: I. Review of placebo reactions and placebo reactors. Diseases of the Nervous System, 25, 145-156.
Kirsch I, Moore TJ, Scoboria A, & Nicholls SS. (2002). The emperor’s new drugs: An analysis of antidepressant medication data submitted to the FDA. Prevention and Treatment, 5, Article 23.
Kirsch I, Sapirstein G. (1998). Listening to Prozac but hearing placebo: A meta-analysis of antidepressant medication. Prevention and Treatment, 1, Article 2.
Lacasse JR, Leo J (2005) Serotonin and Depression: A Disconnect between the Advertisements and the Scientific Literature. PLoS Med 2(12): e392 doi:10.1371/journal.pmed.0020392.
Lenzer J, Brownlee S. (2008). Antidepressants: An untold story? BMJ, doi:10.1136/bmj.39504.662685.0F (published 27 February 2008). Available on the World Wide Web: http://www.bmj.com/cgi/co...
Nature. (2008). Editorial: No more scavenger hunts. Nature 452, 1 (6 March 2008), doi:10.1038/452001a. Available on the World Wide Web: http://www.nature.com/nat...
Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. (2008). Selective publication of antidepressant trials and its influence on apparent efficacy. The New England Journal of Medicine, 358, 252-260.
Turner EH, Rosenthal R. (2008). Editorial: Efficacy of antidepressants. BMJ, 336, 516-517 (8 March), doi:10.1136/bmj.39510.531597.80. Available on the World Wide Web: http://www.bmj.com/cgi/co...
Walsh BT, Seidman SN, Sysko R, Gould M. (2002). Placebo response in studies of major depression: Variable, substantial, and growing. JAMA, 287, 1840-1847.