The ENCODE Controversy And Professionalism In Science

The ENCODE (Encyclopedia Of DNA Elements) project received quite a bit of attention when its results were publicized last year. This project involved a very large consortium of scientists with the goal of identifying all the functional elements in the human genome. In September 2012, 30 papers were published in a coordinated release, and their extraordinary claim was that roughly 80% of the human genome was “functional”. This was in direct contrast to the prevailing view among molecular biologists that the bulk of human DNA was just “junk DNA”, i.e. sequences of DNA to which one could not assign any specific function. The ENCODE papers contained huge amounts of data, collating the work of hundreds of scientists who had worked on this for nearly a decade. But what garnered the most attention among scientists, the media and the public was the “80%” claim and the supposed “death of junk DNA”.

Soon after the discovery of DNA, the primary function ascribed to DNA was its role as a template from which messenger RNA could be transcribed and then translated into functional proteins. Using this definition of “function”, only 1-2% of human DNA would be functional, because only that fraction actually codes for proteins. The term “junk DNA” was coined to describe the 98-99% of non-coding DNA which appeared to primarily represent genetic remnants of our evolutionary past without any specific function in our present day cells.

However, in the past decades, scientists have uncovered more and more functions for the non-coding DNA segments that were previously thought to be merely “junk”. Non-coding DNA can, for example, act as a binding site for regulatory proteins and exert an influence on protein-coding DNA. There has also been an increasing awareness of the presence of various types of non-coding RNA molecules, i.e. RNA molecules which are transcribed from the DNA but not subsequently translated into proteins. Some of these non-coding RNAs have known regulatory functions, others may not have any or their functions have not yet been established.

Despite these discoveries, most scientists were in agreement that only a small fraction of DNA was “functional”, even when all the non-coding pieces of DNA with known functions were included. The bulk of our genome was still thought to be non-functional. The term “junk DNA” was used less frequently by scientists, because it was becoming apparent that we were probably going to discover even more functional elements in the non-coding DNA.

In September 2012, everyone was talking about “junk DNA” again, because the ENCODE scientists claimed their data showed that 80% of the human genome was “functional”. Most scientists had expected that the ENCODE project would uncover some new functions for non-coding DNA, but the 80% figure was far beyond what anyone had expected. The problem was that the ENCODE project used a very low bar for “function”. Binding to the DNA or any kind of chemical DNA modification was already taken as a sign of “function”, without necessarily proving that these pieces of DNA had any significant impact on the function of a cell.

The media hype with the “death of junk DNA” headlines and the lack of discussion about what constitutes function were appropriately criticized by many scientists, but the recent paper by Dan Graur and colleagues, “On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE”, has grabbed everyone’s attention. This is not so much because it criticizes the claims made by the ENCODE scientists, but because of the sarcastic tone it uses to ridicule ENCODE.

There have been so many blog posts and articles that either praise or criticize the Graur paper that I decided to list some of them here:

1. PZ Myers writes “ENCODE gets a public reaming” and seems to generally agree with Graur and colleagues.

2. Ashutosh Jogalekar says Graur’s paper is a “devastating takedown of ENCODE in which they pick apart ENCODE’s claims with the tenacity and aplomb of a vulture picking apart a wildebeest carcass.”

3. Ryan Gregory highlights some of the “zingers” in the Graur paper.

Other scientists, on the other hand, agree with some of the conclusions of the Graur paper and its criticism of how the ENCODE data was presented, but disagree with the sarcastic tone:

1. OpenHelix reminds us that this kind of “spanking” should not distract from all the valuable data that ENCODE has generated.

2. Mick Watson shows how Graur and colleagues could have presented their key critiques in a non-confrontational manner and fostered a constructive debate.

3. Josh Witten points out the irony of Graur accusing ENCODE of seeking hype, even though Graur and his colleagues seem to use sarcasm and ridicule to also increase the visibility of their work. I think Josh’s blog post is an excellent analysis of the problems with ENCODE and the problems associated with Graur’s tone.

On Twitter, I engaged in a debate with Benoit Bruneau, my fellow Scilogs blogger Malcolm Campbell and Jonathan Eisen, and I thought it would be helpful to share the Storify version here. There was a general consensus that even though some of the points made by Graur and colleagues are indeed correct, their sarcastic tone was uncalled for. Scientists can be critical of each other, but they can and should be so in a respectful and professional manner, without resorting to insults or mockery.

[View the story “ENCODE controversy and professionalism in scientific debates” on Storify]

Graur D, Zheng Y, Price N, Azevedo RB, Zufall RA, & Elhaik E (2013). On the immortality of television sets: “function” in the human genome according to the evolution-free gospel of ENCODE. Genome Biology and Evolution. PMID: 23431001

5 thoughts on “The ENCODE Controversy And Professionalism In Science”

  1. TheOtherJim

    You stated, ” The term “junk DNA” was coined to describe the 98-99% of non-coding DNA which appeared to primarily represent genetic remnants of our evolutionary past without any specific function in our present day cells.”

    I understand you were trying to be brief, but that sentence is one of the problems. One reason for the frustration is that what you wrote is the official ENCODE / popular media story of Junk DNA. And it is false.

    As outlined in the Graur et al. paper, there is a lot of evidence to support this assertion, not just “we didn’t know so called it junk”. Pseudogenes, LINEs, SINEs, genetic load observations, massive genome content differences between sister taxa, megabase transgenic knockouts, etc. are all informing this hypothesis.


    1. Dear Jim,

      Thank you very much for your comment and the great links you sent. I also received similar comments on Twitter, echoing exactly what you said.

      I think your comment clarifies another problem. The term “junk DNA” is used very differently by scientists, the media and the public. I was referring to “junk DNA” as it was originally used by some scientists.

      I am going to quote from one of the links you posted (which is a great read, by the way!):

      “Junk DNA” had a specific meaning when it first was formulated. It was meant to describe the loss of protein-coding function by deactivated gene duplicates, which in turn were believed to constitute the bulk of eukaryotic genomes. As different types of non-coding DNA were identified, the concept of gene duplication as their source – and therefore “junk DNA” as their descriptor – found new and broader application. However, it is now clear that most non-coding DNA is not produced by this mechanism, and is therefore not accurately described as “junk” in the original sense.

      As Ryan Gregory points out, this definition of “junk DNA” itself evolved. This is possibly the reason why not all scientists use it in a uniform manner and why perhaps there is even more confusion in the media and the non-specialist public about what really constitutes “junk DNA”.

      The Graur et al. paper devotes a whole section to discussing this point and I am going to quote from their paper, where they distinguish between “junk” and “garbage”:

      To deal with the confusion in the literature, we propose to refresh the memory of those objecting to “junk DNA” by repeating a 15-year old terminological distinction made by Sydney Brenner, who astutely differentiated between “junk DNA,” on the one hand, and “garbage DNA,” on the other: “Some years ago I noticed that there are two kinds of rubbish in the world and that most languages have different words to distinguish them. There is the rubbish we keep, which is junk, and the rubbish we throw away, which is garbage. The excess DNA in our genomes is junk, and it is there because it is harmless, as well as being useless, and because the molecular processes generating extra DNA outpace those getting rid of it. Were the extra DNA to become disadvantageous, it would become subject to selection, just as junk that takes up too much space, or is beginning to smell, is instantly converted to garbage…” (Brenner 1998).

      I think that the ENCODE controversy highlights the importance of language in science. The definitions of scientific words vary depending on who uses them. The definitions of “junk DNA” and of “function” in the ENCODE controversy are perhaps two examples of expressions that are being used differently by molecular biologists and by non-specialists.


  2. TheOtherJim

    I agree with the “different use of the term” idea, but think it is worse than that. Each sub-discipline within the biosciences appears to have its own definition at the moment, creating a lot of confusion. And for half of them, the current definition seems to be the “stuff with no defined function = junk DNA” strawman argument. One thing that junk DNA never was was “undefined DNA”.

    But there is something else going on. I come from the molecular evolution side of things, and, like Graur, I get annoyed at being accused (indirectly) of calling all things we have not yet assigned a function to “Junk DNA”. This is a backhanded insult to an entire field of research (evolutionary comparative genomics) that has taken 40 years to get to where it is today. If they are going to attack the theory, at least they should understand what it is and stop using a strawman argument.


