The Replicability Crisis in Cancer Research

The cancer researchers Glenn Begley and Lee Ellis made a rather remarkable claim last year. In a commentary that analyzed the dearth of efficacious novel cancer therapies, they revealed that scientists at the biotechnology company Amgen were unable to replicate the vast majority of published pre-clinical research studies. Only 6 out of 53 landmark cancer studies could be replicated, a dismal success rate of 11%! The Amgen researchers had deliberately chosen highly innovative cancer research papers, hoping that these would form the scientific basis for future cancer therapies that they could develop. It should not come as a surprise that progress in developing new cancer treatments is so sluggish. New clinical treatments are often based on innovative scientific concepts derived from pre-clinical laboratory research. However, if the pre-clinical scientific experiments cannot be replicated, it would be folly to expect that clinical treatments based on these questionable scientific concepts would succeed.

Cancer-Detecting Nanoparticles. Here, when cancer cells (cell nuclei in blue) were treated with antibody-conjugated nanoparticles, the antibodies (red) and the nanoparticle cores (green) separated into different cellular compartments. Source: National Cancer Institute \ M.D. Anderson Cancer Center. Creator: Sangheon Han, Konstantin Sokolov, Tomasz Zal, Anna Zal

Reproducibility of research findings is the cornerstone of science. Peer-reviewed scientific journals generally require that scientists conduct multiple repeat experiments and report the variability of their findings before publishing them. However, it is not uncommon for researchers to successfully repeat experiments and publish a paper, only to learn that colleagues at other institutions can’t replicate the findings. This does not necessarily indicate foul play. Intentional fraud and misconduct are among the reasons for the lack of reproducibility, yes, but more often the cause is negligence, inadvertent errors, imperfectly designed experiments, the subliminal biases of the researchers or other uncontrollable variables.

Clinical studies of new drugs, for example, are often plagued by the biological variability found in study participants. A group of patients in a trial may exhibit different responses to a new medication compared to patients enrolled in similar trials at different locations. In addition to genetic differences between patient populations, factors like differences in socioeconomic status, diet, access to healthcare, criteria used by referring physicians, standards of data analysis by researchers or the subjective nature of certain clinical outcomes – as well as many other uncharted variables – might all contribute to different results.

The claims of low reproducibility made by Begley and Ellis, however, did not refer to clinical cancer research but to pre-clinical science. Pre-clinical scientists attempt to reduce the degree of experimental variability by using well-defined animal models and standardized outcomes such as cell division, cell death, cell signaling or tumor growth. Without the variability inherent in patient populations, pre-clinical research variables should in theory be easier to control. The lack of reproducibility in pre-clinical cancer research has a significance that reaches far beyond just cancer research. Similar or comparable molecular and cellular experimental methods are also used in other areas of biological research, such as stem cell biology, neurobiology or cardiovascular biology. If only 11% of published landmark papers in cancer research are reproducible, it raises questions about how published papers in other areas of biological research fare.

Following the publication of Begley and Ellis’ commentary, cancer researchers wanted to know more details. Could they reveal the list of the irreproducible papers? How were the experiments at Amgen conducted to assess reproducibility? What constituted a successful replication? Were certain areas of cancer research or specific journals more prone to publishing irreproducible results? What was the cause of the poor reproducibility? Unfortunately, the Amgen scientists were bound by confidentiality agreements that they had entered into with the scientists whose work they attempted to replicate. They could not reveal which papers were irreproducible or specific details regarding the experiments, thus leaving the cancer research world in a state of uncertainty. If so much published cancer research cannot be replicated, how can the field progress?

Lee Ellis has now co-authored another paper to delve further into the question. In the study, published in the journal PLOS One, Ellis teamed up with colleagues at the renowned University of Texas MD Anderson Cancer Center to survey faculty members and trainees (PhD students and postdoctoral fellows) at the center. Only 15-17% of their colleagues responded to the anonymous survey, but the responses confirmed that reproducibility of papers in peer-reviewed scientific journals is a major problem. Two-thirds of the senior faculty respondents revealed they had been unable to replicate published findings, and the same was true for roughly half of the junior faculty members as well as trainees. Seventy-eight percent of the scientists had attempted to contact the authors of the original scientific paper to identify the problem, but only 38.5% received a helpful response. Nearly 44% of the researchers encountered difficulties when trying to publish findings that contradicted the results of previously published papers.

The list of scientific journals in which some of the irreproducible papers were published includes the “elite” of scientific publications: The prestigious Nature tops the list with ten mentions, but one can also find Cancer Research (nine mentions), Cell (six mentions), PNAS (six mentions) and Science (three mentions).

Does this mean that these high-profile journals are the ones most likely to publish irreproducible results? Not necessarily. Researchers typically choose to replicate the work published in high-profile journals and use that as a foundation for new projects. Researchers at MD Anderson Cancer Center may not have been able to reproduce the results of ten cancer research papers published in Nature, but the survey did not provide any information regarding how many cancer research papers in Nature were successfully replicated.

The lack of data on successful replications is a major limitation of this survey. We know that more than half of all scientists responded “Yes” to the rather opaque question “Have you ever tried to reproduce a finding from a published paper and not been able to do so?”, but we do not know how often this occurred. Researchers who successfully replicated nine out of ten papers and researchers who failed to replicate four out of four published papers would have both responded “Yes.” Other limitations of this survey include that it does not list the specific irreproducible papers or clearly define what constitutes reproducibility. Published scientific papers represent years of work and can encompass five, ten or more distinct experiments. Does successful reproducibility require that every single experiment in a paper be replicated or just the major findings? What if similar trends are seen but the magnitude of effects is smaller than what was published in the original paper?
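The ambiguity of the yes/no question can be made concrete with a short sketch. It uses the two hypothetical researchers described above (nine out of ten versus zero out of four successful replications); the `Researcher` class and its fields are illustrative assumptions, not part of the survey.

```python
from dataclasses import dataclass


@dataclass
class Researcher:
    attempts: int   # hypothetical number of replication attempts
    successes: int  # hypothetical number of attempts that succeeded

    def answered_yes(self) -> bool:
        # "Yes" only means that at least one attempt failed.
        return self.successes < self.attempts

    def success_rate(self) -> float:
        # The information the survey question does not capture.
        return self.successes / self.attempts


mostly_successful = Researcher(attempts=10, successes=9)
never_successful = Researcher(attempts=4, successes=0)

for r in (mostly_successful, never_successful):
    print(r.answered_yes(), f"{r.success_rate():.0%}")
```

Both researchers count as a "Yes" even though their replication success rates are 90% and 0%, which is why the share of "Yes" answers alone says little about how often replication fails.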

Due to these limitations, the survey cannot provide definitive answers about the magnitude of the reproducibility problem. It only confirms that lack of reproducibility is a potentially important problem in pre-clinical cancer research, and that high-impact peer-reviewed journals are not immune. While Begley and Ellis have focused on questioning the reproducibility of cancer research, it is likely that other areas of biological and medical research are also struggling with the problem of reproducibility. Some of the most highly cited papers in stem cell biology cannot be replicated, and a recent clinical trial using bone marrow cells to regenerate the heart did not succeed in improving heart function after a heart attack despite earlier trials demonstrating benefits.

Does this mean that cancer research is facing a crisis? If only 11% of pre-clinical cancer research is reproducible, as originally proposed by Begley and Ellis, then it might be time to sound the alarm bells. But since we don’t know how exactly reproducibility was assessed, it is impossible to ascertain the extent of the problem. The word “crisis” also has a less sensationalist meaning: the time for a crucial decision. In that sense, cancer research and perhaps much of contemporary biological and medical research needs to face up to the current quality control “crisis.” Scientists need to wholeheartedly acknowledge that reproducibility is a major problem and crucial steps must be taken to track and improve the reproducibility of published scientific work.

First, scientists involved in biological and medical research need to foster a culture that encourages the evaluation of reproducibility and develop the necessary infrastructure. When scientists are unable to replicate results of published papers and contact the authors, the latter need to treat their colleagues with respect and work together to resolve the issue. Many academic psychologists have already recognized the importance of tracking reproducibility and initiated a large-scale collaborative effort to tackle the issue; the Harvard psychologists Joshua Hartshorne and Adena Schachner also recently proposed using a formal approach to track the reproducibility of research. Biological and medical scientists should consider adopting similar infrastructures for their research, because reproducibility is clearly not just a problem for psychology research.

Second, grant-funding agencies should provide adequate research funding for scientists to conduct replication studies. Currently, research grants are awarded to those who propose the most innovative experiments, but few — if any — funds are available for researchers who want to confirm or refute a published scientific paper. While innovation is obviously important, attempts to replicate published findings deserve recognition and funding because new work can only succeed if it is built on solid, reproducible scientific data.

In the U.S., it can take 1-2 years from when researchers submit a grant proposal to when they receive funding to conduct research. Funding agencies could consider an alternate approach, one that allows for rapid approval of small-budget grant proposals so that researchers can immediately start evaluating the reproducibility of recent breakthrough discoveries. Such funding for reproducibility testing could be provided to individual laboratories or teams of scientists such as the Reproducibility Initiative or the recent efforts of chemistry bloggers to document reproducibility.

The U.S.-based NIH (National Institutes of Health) is the largest source of funding for medical research in the world and is now considering the implementation of new reproducibility requirements for scientists who receive funding. However, not even the NIH has a clear plan for how reproducibility testing should be funded.

Lastly, it is also important that scientific journals address the issue of reproducibility. One of the most common and also most heavily criticized metrics for the success of a scientific journal is its “impact factor,” an indicator of how often an average article published in the journal is cited. Even irreproducible scientific papers can be cited thousands of times and boost a journal’s “impact.”

If a system tracked the reproducibility of scientific papers, one could conceivably calculate a reproducibility score for any scientific journal. That way, a journal’s reputation would not only rest on the average number of citations but also on the reliability of the papers it publishes. Scientific journals should also consider supporting reproducibility initiatives by encouraging the publication of papers that attempted to replicate previous papers — as long as the reproducibility was tested in a rigorous fashion and independent of whether or not the replication attempts were successful.
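As a rough illustration of what such a score might look like, the sketch below computes, for each journal, the fraction of rigorous replication attempts that succeeded. The function name, the record format and the two placeholder journals with made-up outcomes are all assumptions for illustration; no such tracking system or dataset is described here.

```python
from collections import defaultdict


def reproducibility_scores(attempts):
    """Map each journal to the fraction of its replication attempts that succeeded.

    `attempts` is an iterable of (journal, replicated) pairs, where
    `replicated` is True if the replication attempt succeeded.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for journal, replicated in attempts:
        totals[journal] += 1
        if replicated:
            successes[journal] += 1
    return {journal: successes[journal] / totals[journal] for journal in totals}


# Made-up records for two placeholder journals (not real data).
records = [
    ("Journal A", True), ("Journal A", False),
    ("Journal A", True), ("Journal A", True),
    ("Journal B", False), ("Journal B", True),
]
scores = reproducibility_scores(records)
print(scores)
```

A real score would also have to settle the questions raised earlier, such as what counts as a successful replication and how many attempts are needed before a journal's score is meaningful.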

There is no need to publish the 20th replication study that merely confirms what 19 earlier studies have already found, but publication of replication attempts is sorely needed before a consensus is reached regarding a scientific discovery. The journal PLOS One has partnered with the Reproducibility Initiative to provide a forum for the publication of replication studies, and there is no reason why other journals should not follow suit.

While PLOS One publishes many excellent papers, current requirements for tenure and promotion at academic centers often require that researchers publish in certain pre-specified scientific journals, including those that are affiliated with certain professional societies and carry prestige in a designated field of research. If these journals also encouraged the publication of replication attempts, more researchers would conduct them and contribute to the post-publication quality control of scientific literature.

The recent questions raised about the reproducibility of biological and medical research findings are forcing scientists to embark on a soul-searching mission. It is likely that this journey will shake up many long-held beliefs. But this reappraisal will ultimately lead to a more rigorous and reliable science.


Note: An earlier version of this article was first published on


Immune Cells Can Remember Past Lives

The generation of induced pluripotent stem cells (iPSCs) is one of the most fascinating discoveries in the history of stem cell biology. John Gurdon and Shinya Yamanaka received the 2012 Nobel Prize for showing that adult cells could be induced to become embryonic-like stem cells (iPSCs). Many stem cell laboratories now routinely convert skin cells or blood cells from an adult patient into iPSCs. The stem cell properties of the generated iPSCs then allow researchers to convert them into a desired cell type, such as heart cells (cardiomyocytes) or brain cells (neurons), which can then be used for cell-based therapies or for the screening of novel drugs. The initial conversion of adult cells to iPSCs is referred to as “reprogramming” and is thought to represent a form of rejuvenation, because the adult cell appears to lose its adult cell identity and reverts to an immature embryonic-like state. However, we know surprisingly little about the specific mechanisms that allow adult cells to become embryonic-like. For example, how does a blood immune cell such as a lymphocyte lose its lymphocyte characteristics during the reprogramming process? Does the lymphocyte that is converted into an immature iPSC state “remember” that it used to be a lymphocyte? If yes, does this memory affect what types of cells the newly generated iPSCs can be converted into, i.e. are iPSCs derived from lymphocytes very different from iPSCs that are derived from skin cells?

There have been a number of recent studies that have tried to address the question of the “memory” in iPSCs, but two recent papers published in the January 3, 2013 issue of the journal Cell Stem Cell provide some of the most compelling evidence of an iPSC “memory” and also show that this “memory” could be used for therapeutic purposes. In the paper “Regeneration of Human Tumor Antigen-Specific T Cells from iPSCs Derived from Mature CD8+ T Cells”, Vizcardo and colleagues studied the reprogramming of T-lymphocytes derived from the tumor of a melanoma patient. Mature T-lymphocytes are immune cells that can recognize specific targets, depending on what antigen they have been exposed to. The tumor-infiltrating cells used by Vizcardo and colleagues have been previously shown to recognize the melanoma tumor antigen MART-1. The researchers were able to successfully generate iPSCs from the T-lymphocytes, and they then converted the iPSCs back to T-lymphocytes. What they found was that the newly generated T-lymphocytes expressed a receptor that was specific for the MART-1 tumor antigen. Even though the newly generated T-lymphocytes had not been exposed to the tumor, they had retained their capacity to respond to the melanoma antigen. The most likely explanation for this is that the generated iPSCs “remembered” their previous exposure to the tumor in their past lives as T-lymphocytes before they had been converted to embryonic-like iPSCs and then “reborn” as new T-lymphocytes. The iPSC reprogramming apparently did not wipe out their “memory”.

This finding has important therapeutic implications. One key problem that the immune system faces when fighting a malignant tumor is that the demand for immune cells outpaces their availability. The new study suggests that one can take activated immune cells from a cancer patient, convert them to the iPSC state, differentiate them back into rejuvenated immune cells, expand them and inject them back into the patient. The expanded and rejuvenated immune cells would retain their prior anti-tumor memory, be primed to fight the tumor and thus significantly augment the ability of the immune system to slow down the tumor growth.

The paper by Vizcardo and colleagues did not actually show the rejuvenation and anti-tumor efficacy of the iPSC-derived T-lymphocytes, and this needs to be addressed in future studies. However, the paper “Generation of Rejuvenated Antigen-Specific T Cells by Reprogramming to Pluripotency and Redifferentiation” by Nishimura and colleagues, in the same issue of Cell Stem Cell, did address the rejuvenation question, albeit in a slightly different context. This group of researchers obtained T-lymphocytes from a patient with HIV, then generated iPSCs and re-differentiated the iPSCs back into T-lymphocytes. Similar to what Vizcardo and colleagues had observed, Nishimura and colleagues found that their iPSC-derived T-lymphocytes retained an immunological memory against HIV antigens. Importantly, the newly derived T-lymphocytes were highly proliferative and had longer telomeres. The telomeres are chunks of DNA that become shorter as cells age, so the lengthening of telomeres and the high growth rate of the iPSC-derived T-lymphocytes were both indicators that the iPSC reprogramming process had made the cells younger while also retaining their “memory” or ability to respond to HIV.

Further studies are now needed to test whether adding the rejuvenated cells back into the body does actually help prevent tumor growth and can treat HIV infections. There is also a need to ensure that the cells are safe and the rejuvenation process itself did not cause any harmful genetic changes. Long telomeres have been associated with the formation of tumors and one has to make sure that the iPSC-derived lymphocytes do not become malignant. These two studies represent an exciting new development in iPSC research. They not only clearly document that iPSCs retain a memory of the original adult cell type they are derived from but they also show that this memory can be put to good use. This is especially true for immune cells, because retaining an immunological memory allows rejuvenated iPSC-derived immune cells to resume the fight against a tumor or a virus.


Image credit: “Surface of HIV infected macrophage” by Sriram Subramaniam at the National Cancer Institute (NCI) via National Institutes of Health Image Bank