Are American Professors More Responsive to Requests Made by White Male Students?

Fewer than one in five PhD students in the United States will be able to pursue tenure-track academic faculty careers once they graduate from their program. Reduced federal funding for research and dwindling institutional support for tenure-track faculty are among the major reasons why there is such an imbalance between the large number of PhD graduates and the limited availability of academic positions. Upon completing the program, PhD graduates have to consider non-academic job opportunities in industry, government agencies and non-profit foundations, but not every doctoral program is equally well-suited to prepare its graduates for such alternate careers. It is therefore essential for prospective students to carefully assess the doctoral program they want to enroll in and the primary mentor they would work with. The best approach is to proactively contact prospective mentors, meet with them, learn about the research opportunities in their group and also discuss how completing the doctoral program would prepare them for their future careers.


The vast majority of professors will gladly meet a prospective graduate student and discuss research opportunities as well as long-term career options, especially if the student requesting the meeting clarifies its purpose. However, there are cases when students wait in vain for a response. Did the email never reach the professor because it got lost in the internet ether or a spam folder? Was the professor simply too busy to respond? A research study headed by Katherine Milkman from the University of Pennsylvania suggests that the lack of response from the professor may in part be influenced by the perceived race or gender of the student.


Milkman and her colleagues conducted a field experiment in which 6,548 professors at the leading US academic institutions (covering 89 disciplines) were contacted via email to meet with a prospective graduate student. Here is the text of the email that was sent to each professor.

Subject Line: Prospective Doctoral Student (On Campus Next Monday)

Dear Professor [surname of professor inserted here],

I am writing you because I am a prospective doctoral student with considerable interest in your research. My plan is to apply to doctoral programs this coming Fall, and I am eager to learn as much as I can about research opportunities in the meantime.

I will be on campus next Monday, and although I know it is short notice, I was wondering if you might have 10 minutes when you would be willing to meet with me to briefly talk about your work and any possible opportunities for me to get involved in your research. Any time that would be convenient for you would be fine with me, as meeting with you is my first priority during this campus visit.

 Thank you in advance for your consideration.

Sincerely,

[Student’s full name inserted here]

As a professor who frequently receives emails from people who want to work in my laboratory, I feel that the email used in the research study was extremely well-crafted. The student only asks for a brief meeting to explore potential opportunities without trying to extract any specific commitment from the professor. The email clearly states the long-term goal of applying to doctoral programs. The tone is very polite, and the prospective student expresses a willingness to accommodate the professor's schedule. Each email was also personally addressed with the name of the contacted faculty member.

Milkman’s research team then assessed whether the willingness of the professors to respond depended on the gender or ethnicity of the prospective student.  Since this was an experiment, the emails and student names were all fictional but the researchers generated names which most readers would clearly associate with a specific gender and ethnicity.

Here is a list of the names they used:

White male names:  Brad Anderson, Steven Smith

White female names:  Meredith Roberts, Claire Smith

Black male names: Lamar Washington, Terell Jones

Black female names: Keisha Thomas, Latoya Brown

Hispanic male names: Carlos Lopez, Juan Gonzalez

Hispanic female names: Gabriella Rodriguez, Juanita Martinez

Indian male names: Raj Singh, Deepak Patel

Indian female names: Sonali Desai, Indira Shah

Chinese male names: Chang Huang, Dong Lin

Chinese female names: Mei Chen, Ling Wong

The researchers assessed whether the professors responded at all (either by agreeing to meet or by explaining why they could not meet) or simply ignored the email, and whether the rate of response depended on the ethnicity and gender of the student.

The overall response rate of the professors ranged from about 60% to 80%, depending on the research discipline as well as the perceived ethnicity and gender of the prospective student. When the emails were signed with names suggesting a white male background of the student, professors were far less likely to ignore the email when compared to those signed with female names or names indicating an ethnic minority background. Professors in the business sciences showed the strongest discrimination in their response rates. They ignored only 18% of emails when it appeared that they had been written by a white male and ignored 38% of the emails if they were signed with names indicating a female gender or ethnic minority background. Professors in the education disciplines ignored 21% of emails with white male names versus 35% with female or minority names. The discrimination gaps in the health sciences (33% vs 43%) and life sciences (32% vs 39%) were smaller but still significant, whereas there was no statistical difference in the humanities professor response rates. Doctoral programs in the fine arts were an interesting exception where emails from apparent white male students were more likely to be ignored (26%) than those of female or minority candidates (only 10%).
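To make these gaps easier to compare across disciplines, here is a minimal Python sketch (my own tabulation, not code from the study) that lists the ignore rates quoted above and computes the difference in percentage points; a positive gap means emails signed with female or minority names were ignored more often.

```python
# Ignore rates quoted in the text, by discipline, in percent.
ignore_rates = {
    # discipline: (white male names, female or minority names)
    "business": (18, 38),
    "education": (21, 35),
    "health sciences": (33, 43),
    "life sciences": (32, 39),
    "fine arts": (26, 10),
}

for discipline, (white_male, female_minority) in ignore_rates.items():
    gap = female_minority - white_male  # percentage-point difference
    print(f"{discipline:15s} white male: {white_male}%  "
          f"female/minority: {female_minority}%  gap: {gap:+d} pp")
```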

The discrimination primarily occurred at the initial response stage. When professors did respond, there was no difference in terms of whether they were able to make time for the student. The researchers also noted that responsiveness discrimination in any discipline was not restricted to one gender or ethnicity. In business doctoral programs, for example, professors were most likely to ignore emails with black female names and Indian male names. Significant discrimination against white female names (when compared to white male names) predicted an increase in discrimination against other ethnic minorities. Surprisingly, the researchers found that having higher representation of female and minority faculty at an institution did not necessarily improve the responsiveness towards requests from potential female or minority students.

This carefully designed study with a large sample size of over 6,500 professors reveals the prevalence of bias against women and ethnic minorities at top US institutions. This bias may be so entrenched and subconscious that it cannot be remedied simply by increasing the percentage of female or ethnic minority professors in academia. Instead, it is important that professors understand that they may harbor these biases without being aware of it. Something as simple as deleting an email from a prospective student because we think we are too busy to respond may be indicative of an insidious gender or racial bias that we need to understand and confront. Increased awareness and introspection, as well as targeted measures by institutions, are the important first steps to ensure that students receive the guidance and mentorship they need, independent of their gender or ethnic background.

Reference:

Milkman KL, Akinola M, Chugh D. (2015). What Happens Before? A Field Experiment Exploring How Pay and Representation Differentially Shape Bias on the Pathway Into Organizations. Journal of Applied Psychology, 100(6), 1678–1712.

Note: An earlier version of this post was first published on the 3Quarksdaily Blog.


Feel Our Pain: Empathy and Moral Behavior

“It’s empathy that makes us help other people. It’s empathy that makes us moral.” The economist Paul Zak casually makes this comment in his widely watched TED talk about the hormone oxytocin, which he dubs the “moral molecule”. Zak quotes a number of behavioral studies to support his claim that oxytocin increases empathy and trust, which in turn increases moral behavior. If all humans regularly inhaled a few puffs of oxytocin through a nasal spray, we could become more compassionate and caring. It sounds too good to be true. And recent research now suggests that this overly simplistic view of oxytocin, empathy and morality is indeed too good to be true.


Many scientific studies support the idea that oxytocin is a major biological mechanism underlying the emotions of empathy and the formation of bonds between humans. However, inferring that these oxytocin effects in turn make us more moral is a much more controversial claim. In 2011, the researcher Carsten De Dreu and his colleagues at the University of Amsterdam in the Netherlands published the study Oxytocin promotes human ethnocentrism, in which native Dutch male subjects self-administered, in a blinded fashion, either nasal oxytocin or a placebo spray. The subjects then answered questions and performed word association tasks after seeing photographic images of Dutch males (the “in-group”) or images of Arabs and Germans (the “out-group”), because prior surveys had shown that the Dutch public holds negative views of both Arabs/Muslims and Germans. To ensure that the subjects understood the distinct ethnic backgrounds of the people shown in the images, the images were paired with typical Dutch male names, German names (such as Markus and Helmut) or Arab names (such as Ahmed and Youssef).

Oxytocin increased favorable views and word associations, but only towards in-group images of fellow Dutch males. The oxytocin treatment even had the unexpected effect of worsening the views regarding Arabs and Germans, but this latter effect was not quite statistically significant. Far from being a “moral molecule”, oxytocin may actually increase ethnic bias in society because it selectively enhances certain emotional bonds. In a subsequent study, De Dreu addressed another aspect of the purported link between oxytocin and morality by testing the honesty of subjects. The study Oxytocin promotes group-serving dishonesty showed that oxytocin increased cheating in study subjects if they were under the impression that dishonesty would benefit their group. De Dreu concluded that oxytocin does make us less selfish and care more about the interests of the group we belong to.

These recent oxytocin studies not only question the “moral molecule” status of oxytocin but raise the even broader question of whether more empathy necessarily leads to increased moral behavior, independent of whether or not it is related to oxytocin. The researchers Jean Decety and Jason Cowell at the University of Chicago recently analyzed the scientific literature on the link between empathy and morality in their commentary Friends or Foes: Is Empathy Necessary for Moral Behavior?, and found that the relationship is far more complicated than one would surmise. Judges, police officers and doctors who exhibit great empathy by sharing in the emotional upheaval experienced by the oppressed, persecuted and severely ill always end up making the right moral choices – in Hollywood movies. But empathy in the real world is a multi-faceted phenomenon, and we use this term loosely, as Decety and Cowell point out, without clarifying which aspect of empathy we are referring to.

Decety and Cowell distinguish at least three distinct aspects of empathy:

1. Emotional sharing, which refers to how one’s emotions respond to the emotions of those around us. Empathy enables us to “feel” the pain of others and this phenomenon of emotional sharing is also commonly observed in non-human animals such as birds or mice.

2. Empathic concern, which describes how we care for the welfare of others. Whereas emotional sharing refers to how we experience the emotions of others, empathic concern motivates us to take actions that will improve their welfare. As with emotional sharing, empathic concern is not only present in humans but also conserved among many non-human species and likely constitutes a major evolutionary advantage.

3. Perspective taking, which – according to Decety and Cowell – is the ability to put oneself into the mind of another and thus imagine what they might be thinking or feeling. This is a more cognitive dimension of empathy and essential for our ability to interact with fellow human beings. Even if we cannot experience the pain of others, we may still be able to understand or envision how they might be feeling. One of the key features of psychopaths is their inability to experience the emotions of others. However, this does not necessarily mean that psychopaths are unable to cognitively imagine what others are thinking. Instead of labeling psychopaths as having no empathy, it is probably more appropriate to specifically characterize them as having a reduced capacity to share in the emotions while maintaining an intact capacity for perspective-taking.

In addition to the complexity of what we call “empathy”, we also need to understand that empathy is usually directed towards specific individuals and groups. De Dreu’s studies demonstrated that oxytocin can make us more pro-social as long as it benefits those who we feel belong to our group, but not necessarily those outside of our group. The study Do you feel my pain? Racial group membership modulates empathic neural responses by Xu and colleagues at Peking University used fMRI brain imaging in Chinese and Caucasian study subjects and measured their neural responses to watching painful images. The study subjects were shown images of either a Chinese or a Caucasian face. In the control condition, the depicted face was being poked with a cotton swab. In the pain condition, study subjects were shown the face of a person being poked with a needle attached to a syringe. When the researchers measured the neural responses with fMRI, they found significant activation in the anterior cingulate cortex (ACC), which is part of the neural pain circuit activated both by pain we experience ourselves and by empathic pain we feel when we see others in pain. The key finding in Xu’s study was that ACC activation in response to seeing the painful image was much more pronounced when the study subject and the person shown in the image belonged to the same race.

Once we realize that the neural circuits and hormones which form the biological basis of our empathy responses are so easily swayed by group membership, it becomes apparent why increased empathy does not necessarily result in behavior consistent with moral principles. In his essay “Against Empathy”, the psychologist Paul Bloom also opposes the view that empathy should form the basis of morality and that we should unquestioningly elevate empathy to a virtue for all:

“But we know that a high level of empathy does not make one a good person and that a low level does not make one a bad person. Being a good person likely is more related to distanced feelings of compassion and kindness, along with intelligence, self-control, and a sense of justice. Being a bad person has more to do with a lack of regard for others and an inability to control one’s appetites.”

I do not think that we can dismiss empathy as a factor in our moral decision-making. Bloom makes a good case for distanced compassion and kindness that does not arise from the more visceral emotion of empathy. But when we see fellow humans and animals in pain, our initial biological responses are guided by empathy and anger, not the more abstract concept of distanced compassion. What we need is a better scientific and philosophical understanding of what empathy is. Empathic perspective-taking may be a far more robust and reliable guide for moral decision-making than empathic emotions. Current scientific studies on empathy often treat it as an aggregate measure without teasing out its various components. They also tend to overlook the fact that the relative contributions of the empathy components (emotion, concern, perspective-taking) can vary widely among cultures and age groups. We need to replace overly simplistic notions such as oxytocin = moral molecule or empathy = good with a more refined view of the complex morality-empathy relationship, guided by rigorous science and philosophy.

 

References:

De Dreu, C. K., Greer, L. L., Van Kleef, G. A., Shalvi, S., & Handgraaf, M. J. (2011). Oxytocin promotes human ethnocentrism. Proceedings of the National Academy of Sciences, 108(4), 1262-1266.

Decety, J., & Cowell, J. M. (2014). Friends or Foes: Is Empathy Necessary for Moral Behavior? Perspectives on Psychological Science, 9(5), 525-537.

Shalvi, S., & De Dreu, C. K. (2014). Oxytocin promotes group-serving dishonesty. Proceedings of the National Academy of Sciences, 111(15), 5503-5507.

Xu, X., Zuo, X., Wang, X., & Han, S. (2009). Do you feel my pain? Racial group membership modulates empathic neural responses. The Journal of Neuroscience, 29(26), 8525-8529.

 

*****************************

Note: An earlier version of this article was first published on the 3Quarksdaily blog.

 


Murder Your Darling Hypotheses But Do Not Bury Them

“Whenever you feel an impulse to perpetrate a piece of exceptionally fine writing, obey it—whole-heartedly—and delete it before sending your manuscript to press. Murder your darlings.”

Sir Arthur Quiller-Couch (1863–1944). On the Art of Writing. 1916

 

Murder your darlings. The British writer Sir Arthur Quiller-Couch shared this piece of writerly wisdom when he gave his inaugural lecture series at Cambridge, asking writers to consider deleting words, phrases or even paragraphs that are especially dear to them. The minute writers fall in love with what they write, they are bound to lose their objectivity and may not be able to judge how their choice of words will be perceived by the reader. But writers aren’t the only ones who can fall prey to the Pygmalion syndrome. Scientists often find themselves in a similar situation when they develop “pet” or “darling” hypotheses.


How do scientists decide when it is time to murder their darling hypotheses? The simple answer is that scientists ought to give up scientific hypotheses once the experimental data is unable to support them, no matter how “darling” they are. However, the problem with scientific hypotheses is that they aren’t just generated based on subjective whims. A scientific hypothesis is usually put forward after analyzing substantial amounts of experimental data. The better a hypothesis is at explaining the existing data, the more “darling” it becomes. Therefore, scientists are reluctant to discard a hypothesis because of just one piece of experimental data that contradicts it.

In addition to experimental data, a number of other factors can play a major role in determining whether scientists will discard or uphold their darling scientific hypotheses. Some scientific careers are built on specific scientific hypotheses which set apart certain scientists from competing rival groups. Research grants, which are essential to the survival of a scientific laboratory by providing salary funds for the senior researchers as well as the junior trainees and research staff, are written in a hypothesis-focused manner, outlining experiments that will lead to the acceptance or rejection of selected scientific hypotheses. Well-written research grants always consider the possibility that the core hypothesis may be rejected based on the future experimental data. But if the hypothesis has to be rejected, the scientist has to explain the discrepancies between the preferred hypothesis that is now falling into disrepute and all the preliminary data that had led her to formulate the initial hypothesis. Such discrepancies could endanger the renewal of the grant funding and the future of the laboratory. Last but not least, it is very difficult to publish a scholarly paper describing a rejected scientific hypothesis without providing an in-depth mechanistic explanation for why the hypothesis was wrong and proposing alternate hypotheses.

For example, it is quite reasonable for a cell biologist to formulate the hypothesis that protein A improves the survival of neurons by activating pathway X based on prior scientific studies which have shown that protein A is an activator of pathway X in neurons and other studies which prove that pathway X improves cell survival in skin cells. If the data supports the hypothesis, publishing this result is fairly straightforward because it conforms to the general expectations. However, if the data does not support this hypothesis then the scientist has to explain why. Is it because protein A did not activate pathway X in her experiments? Is it because pathway X functions differently in neurons than in skin cells? Is it because neurons and skin cells have a different threshold for survival? Experimental results that do not conform to the predictions have the potential to uncover exciting new scientific mechanisms, but chasing down these alternate explanations requires a lot of time and resources which are becoming increasingly scarce. Therefore, it shouldn’t come as a surprise that some scientists may consciously or subconsciously ignore selected pieces of experimental data which contradict their darling hypotheses.

Let us move from these hypothetical situations to the real world of laboratories. There is surprisingly little data on how and when scientists reject hypotheses, but John Fugelsang and Kevin Dunbar at Dartmouth conducted a rather unusual study, “Theory and data interactions of the scientific mind: Evidence from the molecular and the cognitive laboratory” (2004), in which they researched researchers. They sat in on the laboratory meetings of three renowned molecular biology laboratories and carefully recorded how scientists presented their data and how they handled results which contradicted the predictions of their hypotheses and models.

In their final analysis, Fugelsang and Dunbar included 417 scientific results that were presented at the meetings, of which roughly half (223 out of 417) were not consistent with the predictions. Only 12% of these inconsistencies led to a change of the scientific model (and thus a revision of hypotheses). In the vast majority of cases, the laboratories decided to follow up by repeating and modifying the experimental protocols, assuming that the fault lay not with the hypotheses but with the manner in which the experiment was conducted. In the follow-up experiments, 84 of the inconsistent findings could be replicated, and this in turn resulted in a gradual modification of the underlying models and hypotheses in the majority of cases. However, even when the inconsistent results were replicated, only 61% of the models were revised, which means that 39% of the cases did not lead to any significant changes.
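For readers who want the raw arithmetic behind these percentages, here is a brief back-of-the-envelope sketch; it simply restates the numbers quoted above and is my own arithmetic, not the authors' analysis.

```python
# Counts and proportions reported by Fugelsang and Dunbar, as quoted in the text.
total_results = 417                 # findings presented at the lab meetings
inconsistent = 223                  # findings that contradicted predictions
revised_immediately = 0.12          # share of inconsistencies that changed the model right away
replicated_follow_ups = 84          # inconsistent findings reproduced in follow-up experiments
revised_after_replication = 0.61    # share of replicated inconsistencies that led to model revision

print(f"Inconsistent findings: {inconsistent / total_results:.0%} of all presented results")
print(f"Models revised immediately: roughly {revised_immediately * inconsistent:.0f} of {inconsistent} inconsistencies")
print(f"Models revised after replication: roughly {revised_after_replication * replicated_follow_ups:.0f} of {replicated_follow_ups} replicated findings")
```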

The study did not provide much information on the long-term fate of the hypotheses and models and we obviously cannot generalize the results of three molecular biology laboratory meetings at one university to the whole scientific enterprise. Also, Fugelsang and Dunbar’s study did not have a large enough sample size to clearly identify the reasons why some scientists were willing to revise their models and others weren’t. Was it because of varying complexity of experiments and models? Was it because of the approach of the individuals who conducted the experiments or the laboratory heads? I wish there were more studies like this because it would help us understand the scientific process better and maybe improve the quality of scientific research if we learned how different scientists handle inconsistent results.

In my own experience, I have also struggled with results which defied my scientific hypotheses. In 2002, we found that stem cells in human fat tissue could help grow new blood vessels. Yes, you could obtain fat from a liposuction performed by a plastic surgeon and inject these fat-derived stem cells into animal models of low blood flow in the legs. Within a week or two, the injected cells helped restore the blood flow to near normal levels! The simplest hypothesis was that the stem cells converted into endothelial cells, the cell type which forms the lining of blood vessels. However, after several months of experiments, I found no consistent evidence of fat-derived stem cells transforming into endothelial cells. We ended up publishing a paper which proposed an alternative explanation that the stem cells were releasing growth factors that helped grow blood vessels. But this explanation was not as satisfying as I had hoped. It did not account for the fact that the stem cells had aligned themselves alongside blood vessel structures and behaved like blood vessel cells.

Even though I “murdered” my darling hypothesis of fat-derived stem cells converting into blood vessel endothelial cells at the time, I did not “bury” the hypothesis. It kept simmering in the back of my mind until roughly one decade later, when we were again studying how stem cells improve blood vessel growth. The difference was that this time, I had access to a live-imaging confocal laser microscope which allowed us to take images of cells labeled with red and green fluorescent dyes over long periods of time. Below, you can see a video of human bone marrow mesenchymal stem cells (labeled green) and human endothelial cells (labeled red) observed with the microscope overnight. The short movie compresses images obtained throughout the night and shows that the stem cells indeed do not convert into endothelial cells. Instead, they form a scaffold and guide the endothelial cells (red) by allowing them to move alongside the green scaffold and thus construct their network. This work was published in 2013 in the Journal of Molecular and Cellular Cardiology, roughly a decade after I had been forced to give up on the initial hypothesis. Back in 2002, I had assumed that the stem cells were turning into blood vessel endothelial cells because they aligned themselves in blood vessel-like structures. I had never considered the possibility that they were a scaffold for the endothelial cells.

This and other similar experiences have led me to reformulate the “murder your darlings” commandment to “murder your darling hypotheses but do not bury them”. Instead of repeatedly trying to defend scientific hypotheses that cannot be supported by emerging experimental data, it is better to give up on them. But this does not mean that we should forget and bury those initial hypotheses. With newer technologies, resources or collaborations, we may find ways to explain inconsistent results years later that were not previously available to us. This is why I regularly peruse my cemetery of dead hypotheses on my hard drive to see if there are ways of perhaps resurrecting them, not in their original form but in a modification that I am now able to test.

 

Reference:


Fugelsang, J., Stein, C., Green, A., & Dunbar, K. (2004). Theory and Data Interactions of the Scientific Mind: Evidence From the Molecular and the Cognitive Laboratory. Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale, 58(2), 86-95. DOI: 10.1037/h0085799

 

Note: An earlier version of this article first appeared on 3Quarksdaily.

“She’s strong for a girl”: The Negative Impact of Stereotypes About Women

This is a guest blog post by Ulli Hain (Twitter: @ulli_hain, Email: hain.ulli[at]gmail.com). Ulli is a postdoctoral researcher in the field of autophagy and also a science writer/blogger. Her blog Bench and Beyond reports on interesting scientific studies and explores life as a scientist including issues of gender and science.

We have all heard the stereotypes: women can’t drive, they don’t understand computers, and how many blondes does it take to screw in a light bulb? Those are all in good fun, right? But what if gender stereotypes actually bring about the observed differences between men and women that supposedly underlie these stereotypes? A recent study by the psychologist Marina Pavlova at the University of Tübingen tested this idea.

While previous studies have supported the idea that negative stereotypes hinder women’s athletic and cognitive performance on a range of tests, those studies all looked at tasks with preexisting stereotypes. For example, women score worse on math tests when reminded of old “adages” about women and math.

 

Pavlova and her colleagues instead wanted to see how stereotype impacts an area where no gender difference exists. Could a fabricated stereotype change the way women and men perform on a test?

 

They chose the event arrangement (EA) test, used on certain modern IQ tests to measure nonverbal reasoning skills. Participants arrange cards depicting scenes, such as a man fishing, cooking over a campfire, and preparing for a trip, in a logical order to create a story. Scores are based on the number of correct sequences and amount of time required.

 

A total of 117 college students were split into three groups and given different instructions for the test. The first group received only the standard instructions for the task. A second group was additionally told that “females usually perform worse on this task,” while the third group was told that “males usually perform worse on this task.”

 

Men and women performed equally well when no stereotyped messages were given. When the group was told that women usually perform worse, women’s scores on the test decreased. In contrast, men’s scores actually increased, perhaps because their confidence was boosted by the perceived weakness of women.

 

The most surprising findings came from the group that was told that men usually do worse on the test. Men’s performance was diminished as expected, but women’s scores, instead of improving, dropped just as much as men’s.

 

Pavlova and her colleagues also looked at positive messages. Telling participants that women are usually better at the EA test modestly improved women’s scores without affecting men. However, the opposite was not true. Women’s performance was hurt even more by the message that men are better at the test than by the more explicit message that women are worse.

What clearly emerges from the study is that women are more susceptible to stereotyping than men. The only time men’s performance declined was when they were given the explicitly negative message about men.

 

Why are women more impacted by the stereotypes than men? Although the researchers controlled for preexisting stereotypes on this specific test, they cannot escape society’s influence on women, which begins at an incredibly early age. Women are constantly under the threat of stereotype. And women who break stereotypes face harsh criticism not faced by men, such as criticism of working mothers who use daycare or the perception that women who speak up are aggressive or bossy rather than leaders.

 

The researchers suggest that since women have a history of being typecast, they may misinterpret the message “males are usually worse” to mean that if men have a hard time with the test, women will have an even harder time.

 

More and more studies confirm the existence of subtle forms of bias against women at all levels of society. It is a major finding that these subtle biases can have even greater psychological consequences than more blatant and bygone forms of sexism. Interventions are needed to combat existing stereotypes at an early age.

 

Reference:

Pavlova, M., Weber, S., Simoes, E., & Sokolov, A. (2014). Gender Stereotype Susceptibility. PLoS ONE, 9(12). DOI: 10.1371/journal.pone.0114802

The Replicability Crisis in Cancer Research

The cancer researchers Glenn Begley and Lee Ellis made a rather remarkable claim last year. In a commentary that analyzed the dearth of efficacious novel cancer therapies, they revealed that scientists at the biotechnology company Amgen were unable to replicate the vast majority of published pre-clinical research studies. Only 6 out of 53 landmark cancer studies could be replicated, a dismal success rate of 11%! The Amgen researchers had deliberately chosen highly innovative cancer research papers, hoping that these would form the scientific basis for future cancer therapies that they could develop. It should not come as a surprise that progress in developing new cancer treatments is so sluggish. New clinical treatments are often based on innovative scientific concepts derived from pre-clinical laboratory research. However, if the pre-clinical scientific experiments cannot be replicated, it would be folly to expect that clinical treatments based on these questionable scientific concepts would succeed.

Cancer-Detecting Nanoparticles. Here, when cancer cells (cell nuclei in blue) were treated with antibody-conjugated nanoparticles, the antibodies (red) and the nanoparticle cores (green) separated into different cellular compartments. Source: National Cancer Institute \ M.D. Anderson Cancer Center. Creator: Sangheon Han, Konstantin Sokolov, Tomasz Zal, Anna Zal

Reproducibility of research findings is the cornerstone of science. Peer-reviewed scientific journals generally require that scientists conduct multiple repeat experiments and report the variability of their findings before publishing them. However, it is not uncommon for researchers to successfully repeat experiments and publish a paper, only to learn that colleagues at other institutions can’t replicate the findings. This does not necessarily indicate foul play. The reasons for the lack of reproducibility include intentional fraud and misconduct, yes, but more often it’s negligence, inadvertent errors, imperfectly designed experiments and the subliminal biases of the researchers or other uncontrollable variables.

Clinical studies of new drugs, for example, are often plagued by the biological variability found in study participants. A group of patients in a trial may exhibit different responses to a new medication compared to patients enrolled in similar trials at different locations. In addition to genetic differences between patient populations, factors like differences in socioeconomic status, diet, access to healthcare, criteria used by referring physicians, standards of data analysis by researchers or the subjective nature of certain clinical outcomes – as well as many other uncharted variables – might all contribute to different results.

The claims of low reproducibility made by Begley and Ellis, however, did not refer to clinical cancer research but to pre-clinical science. Pre-clinical scientists attempt to reduce the degree of experimental variability by using well-defined animal models and standardized outcomes such as cell division, cell death, cell signaling or tumor growth. Without the variability inherent in patient populations, pre-clinical research variables should in theory be easier to control. The lack of reproducibility in pre-clinical cancer research has a significance that reaches far beyond just cancer research. Similar or comparable molecular and cellular experimental methods are also used in other areas of biological research, such as stem cell biology, neurobiology or cardiovascular biology. If only 11% of published landmark papers in cancer research are reproducible, it raises questions about how published papers in other areas of biological research fare.

Following the publication of Begley and Ellis’ commentary, cancer researchers wanted to know more details. Could they reveal the list of the irreproducible papers? How were the experiments at Amgen conducted to assess reproducibility? What constituted a successful replication? Were certain areas of cancer research or specific journals more prone to publishing irreproducible results? What was the cause of the poor reproducibility? Unfortunately, the Amgen scientists were bound by confidentiality agreements that they had entered into with the scientists whose work they attempted to replicate. They could not reveal which papers were irreproducible or specific details regarding the experiments, thus leaving the cancer research world in a state of uncertainty. If so much published cancer research cannot be replicated, how can the field progress?

 Lee Ellis has now co-authored another paper to delve further into the question. In the study, published in the journal PLOS One, Ellis teamed up with colleagues at the renowned University of Texas MD Anderson Cancer Center to survey faculty members and trainees (PhD students and postdoctoral fellows) at the center. Only 15-17% of their colleagues responded to the anonymous survey, but the responses confirmed that reproducibility of papers in peer-reviewed scientific journals is a major problem. Two-thirds of the senior faculty respondents revealed they had been unable to replicate published findings, and the same was true for roughly half of the junior faculty members as well as trainees. Seventy-eight percent of the scientists had attempted to contact the authors of the original scientific paper to identify the problem, but only 38.5% received a helpful response. Nearly 44% of the researchers encountered difficulties when trying to publish findings that contradicted the results of previously published papers.

The list of scientific journals in which some of the irreproducible papers were published includes the “elite” of scientific publications: the prestigious Nature tops the list with ten mentions, but one can also find Cancer Research (nine mentions), Cell (six mentions), PNAS (six mentions) and Science (three mentions).

Does this mean that these high-profile journals are the ones most likely to publish irreproducible results? Not necessarily. Researchers typically choose to replicate the work published in high-profile journals and use that as a foundation for new projects. Researchers at MD Anderson Cancer Center may not have been able to reproduce the results of ten cancer research papers published in Nature, but the survey did not provide any information regarding how many cancer research papers in Nature were successfully replicated.

The lack of data on successful replications is a major limitation of this survey. We know that more than half of all scientists responded “Yes” to the rather opaque question “Have you ever tried to reproduce a finding from a published paper and not been able to do so?”, but we do not know how often this occurred. Researchers who successfully replicated nine out of ten papers and researchers who failed to replicate four out of four published papers would have both responded “Yes.” Other limitations of this survey include that it does not list the specific irreproducible papers or clearly define what constitutes reproducibility. Published scientific papers represent years of work and can encompass five, ten or more distinct experiments. Does successful reproducibility require that every single experiment in a paper be replicated or just the major findings? What if similar trends are seen but the magnitude of effects is smaller than what was published in the original paper?

Due to these limitations, the survey cannot provide definitive answers about the magnitude of the reproducibility problem. It only confirms that lack of reproducibility is a potentially important problem in pre-clinical cancer research, and that high-impact peer-reviewed journals are not immune. While Begley and Ellis have focused on questioning the reproducibility of cancer research, it is likely that other areas of biological and medical research are also struggling with the problem of reproducibility. Some of the most highly cited papers in stem cell biology cannot be replicated, and a recent clinical trial using bone marrow cells to regenerate the heart did not succeed in improving heart function after a heart attack, despite earlier trials demonstrating benefits.

Does this mean that cancer research is facing a crisis? If only 11% of pre-clinical cancer research is reproducible, as originally proposed by Begley and Ellis, then it might be time to sound the alarm bells. But since we don’t know how exactly reproducibility was assessed, it is impossible to ascertain the extent of the problem. The word “crisis” also has a less sensationalist meaning: the time for a crucial decision. In that sense, cancer research and perhaps much of contemporary biological and medical research needs to face up to the current quality control “crisis.” Scientists need to wholeheartedly acknowledge that reproducibility is a major problem and crucial steps must be taken to track and improve the reproducibility of published scientific work.

First, scientists involved in biological and medical research need to foster a culture that encourages the evaluation of reproducibility and develop the necessary infrastructure. When scientists are unable to replicate results of published papers and contact the authors, the latter need to treat their colleagues with respect and work together to resolve the issue. Many academic psychologists have already recognized the importance of tracking reproducibility and initiated a large-scale collaborative effort to tackle the issue; the Harvard psychologists Joshua Hartshorne and Adena Schachner also recently proposed using a formal approach to track the reproducibility of research. Biological and medical scientists should consider adopting similar infrastructures for their research, because reproducibility is clearly not just a problem for psychology research.

Second, grant-funding agencies should provide adequate research funding for scientists to conduct replication studies. Currently, research grants are awarded to those who propose the most innovative experiments, but few — if any — funds are available for researchers who want to confirm or refute a published scientific paper. While innovation is obviously important, attempts to replicate published findings deserve recognition and funding because new work can only succeed if it is built on solid, reproducible scientific data.

In the U.S., it can take 1-2 years from when researchers submit a grant proposal to when they receive funding to conduct research. Funding agencies could consider an alternate approach, one that allows for rapid approval of small-budget grant proposals so that researchers can immediately start evaluating the reproducibility of recent breakthrough discoveries. Such funding for reproducibility testing could be provided to individual laboratories or teams of scientists such as the Reproducibility Initiative or the recent efforts of chemistry bloggers to document reproducibility.

The U.S.-based NIH (National Institutes of Health) is the largest source of funding for medical research in the world and is now considering the implementation of new reproducibility requirements for scientists who receive funding. However, not even the NIH has a clear plan for how reproducibility testing should be funded.

Lastly, it is also important that scientific journals address the issue of reproducibility. One of the most common and also most heavily criticized metrics for the success of a scientific journal is its “impact factor,” an indicator of how often an average article published in the journal is cited. Even irreproducible scientific papers can be cited thousands of times and boost a journal’s “impact.”

If a system tracked the reproducibility of scientific papers, one could conceivably calculate a reproducibility score for any scientific journal. That way, a journal’s reputation would not only rest on the average number of citations but also on the reliability of the papers it publishes. Scientific journals should also consider supporting reproducibility initiatives by encouraging the publication of papers that attempted to replicate previous papers — as long as the reproducibility was tested in a rigorous fashion and independent of whether or not the replication attempts were successful.
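As a purely hypothetical illustration of what such a metric might look like, here is a minimal sketch; the scoring rule and all of the replication outcomes below are assumptions of mine, since no such registry of replication attempts currently exists.

```python
def reproducibility_score(replication_outcomes):
    """Fraction of tracked, independent replication attempts that succeeded."""
    if not replication_outcomes:
        return None  # no attempts tracked yet, so no score can be assigned
    return sum(replication_outcomes) / len(replication_outcomes)

# Hypothetical outcomes (True = replicated) for papers published in one journal.
journal_outcomes = [True, False, True, True, False, True, False, True]
score = reproducibility_score(journal_outcomes)
print(f"Reproducibility score: {score:.0%} of {len(journal_outcomes)} tracked attempts replicated")
```

A journal's reputation could then rest not only on the average number of citations but also on how often its papers hold up when others try to repeat them.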

There is no need to publish the 20th replication study that merely confirms what 19 previous studies have previously found, but publication of replication attempts is sorely needed before a consensus is reached regarding a scientific discovery. The journal PLOS One has partnered up with the Reproducibility Initiative to provide a forum for the publication of replication studies, but there is no reason why other journals should not follow.

While PLOS One publishes many excellent papers, current requirements for tenure and promotion at academic centers often require that researchers publish in certain pre-specified scientific journals, including those affiliated with certain professional societies and which carry prestige in a designated field of research. If these journals also encouraged the publication of replication attempts, more researchers would conduct them and contribute to the post-publication quality control of scientific literature.

The recent questions raised about the reproducibility of biological and medical research findings are forcing scientists to embark on a soul-searching mission. It is likely that this journey will shake up many long-held beliefs. But this reappraisal will ultimately lead to a more rigorous and reliable science.

 

Note: An earlier version of this article was first published on Salon.com.

“Citizen Science”: Scientific Consensus On Global Warming

I came across an interesting study about the consensus in the scientific community on anthropogenic global warming (AGW), i.e. the idea that human activity is very likely causing most of global warming. What makes this study so interesting is the fact that it involved a “citizen science” approach. Volunteers who contributed to the Skeptical Science website were asked to grade the abstracts of 11,944 scientific papers on global climate change that were published in the years 1991-2011.  These volunteers assessed whether the abstracts explicitly or implicitly endorsed AGW, were neutral on this question or whether they explicitly or implicitly rejected the idea that human activity is the main cause of global warming.

The study entitled “Quantifying the consensus on anthropogenic global warming in the scientific literature” was published by John Cook and colleagues as an open access paper in the journal Environmental Research Letters. The results are no surprise to anyone who has been following the scientific literature on AGW. Of the abstracts that expressed an opinion on AGW, 97.1% explicitly or implicitly stated that humans were the primary cause of global warming. This high level of consensus on the primary human role in causing global warming is very consistent with prior publications in the field. When Cook and colleagues contacted the authors of the papers to obtain their own opinion on the matter, they found that 98% of the authors who had a clear position on climate change agreed on human activity being the major cause of global warming.
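To clarify how a headline figure like 97.1% is derived, here is a minimal sketch of the calculation: abstracts that take no position are excluded, and the percentage refers only to abstracts that express an opinion on AGW. The counts below are illustrative assumptions of mine rather than the actual tallies from Cook et al.; only the resulting percentage is meant to mirror the published figure.

```python
# Hypothetical grading tallies for illustration (not the paper's actual counts).
graded_abstracts = {
    "endorse AGW": 3900,   # abstracts that explicitly or implicitly endorse AGW
    "no position": 7900,   # abstracts that are neutral on the question
    "reject AGW": 118,     # abstracts that explicitly or implicitly reject AGW
}

# The consensus figure is computed only over abstracts that take a position.
with_position = graded_abstracts["endorse AGW"] + graded_abstracts["reject AGW"]
consensus = graded_abstracts["endorse AGW"] / with_position
print(f"{consensus:.1%} of the {with_position} position-taking abstracts endorse AGW "
      f"(out of {sum(graded_abstracts.values())} graded abstracts)")
```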


Even though I think that the conclusions of the study are correct and that there is indeed a 97-98% consensus among scientists on AGW, I feel that the study highlights the potential for bias in “citizen science”.

The idea of using “citizens”, i.e. volunteers who are not necessarily trained as scientists, to help obtain data is very intriguing. However, I do not believe that the authors of the study adequately addressed the issue of potential bias among these volunteers. The paper mentions that the volunteers contributed to the website Skeptical Science, which is managed by John Cook and attempts to convert climate change skeptics, i.e. people who deny the primary role of humans in global warming. I suspect that the volunteers who contribute to the website are probably all strongly convinced that AGW is very real. This could introduce a bias in the grading of the abstracts by these volunteers. I could not find any part of the paper that discussed this potential bias or whether the authors also considered using “citizens” who did not believe in AGW, or who felt neutral about it, as abstract evaluators. Such “citizens” would have been good control groups to test whether a pre-existing opinion among volunteers can bias their interpretation of the scientific literature.

These concerns about potential “citizen science” bias should be addressed not only in the context of global warming research, but also in other areas of science that are associated with controversy and strongly held beliefs. A “citizen science” assessment of the risks and benefits of gene therapy or of embryonic stem cells in the scientific literature might likewise be influenced by the volunteers’ beliefs. As excited as many of us are about “citizen science”, we need to consider the potential biases that “citizens” can introduce, just as we take into account the biases of the professional scientists who conduct experiments when we evaluate a scientific paper.

 

Image credit: Annual average global warming by the year 2060, simulated and plotted via NASA.