Testosterone, performance & intersex athletes: Will the IAAF evidence be enough?


Early this week, a paper by Bermon and Garnier was published in the British Journal of Sports Medicine, called “Serum androgen levels and their relation to performance in track and field: mass spectrometry results from 2127 observations in male and female elite athletes”.

The paper is significant because it will be a key item in the IAAF’s submission to the Court of Arbitration for Sport (CAS) regarding its policy on hyperandrogenism (high testosterone), which CAS set aside for two years in 2015.

That policy, and its subsequent temporary removal, is known by most because it affected Caster Semenya, who prior to the 2015 ruling was running outside 2:01, and who has since the ruling been unbeatable over 400m and 800m, and is now the Olympic champion.

There are, of course, other athletes with the same condition.  Semenya, thanks to a set of almighty screw-ups back in 2009 when she won the World title in Berlin as an 18-year-old, is the one who is known and is thus unfairly the focal point of this issue.  Research at the 2011 World Champs revealed 6 athletes with these so-called “Disorders of Sex Development” (DSDs), and there are certainly others since then.

A good story, however, likes a focal point, and so as soon as this latest Bermon paper was published, media were speculating: “Would Caster Semenya be forced to take hormone therapy to lower her testosterone levels?”

That, of course, is up to CAS, who later this month will hear the IAAF’s evidence, having given them two years to come back with good reasons to re-instate the previous policy.

Here are my thoughts on this latest research, and whether it will be enough for the IAAF to achieve its goal of bringing back that policy and the upper limit for testosterone.


Requisite (minimal) background

Remember that this is an 8-year long story, if you joined it at the time of Semenya’s Berlin emergence.  It goes back 70 years before that, if you go to Stella Walsh and Dora/Herman Ratjen in Berlin in 1936.  Point is, there’s a lot of history, science, ethics, politics and confusion in this issue.  I’ve been writing on the science of the conditions since the day before Semenya’s Berlin win, and I’d encourage people to have at least some understanding of those issues before wading in (a few SA journalists, this is for you).

With respect to Semenya, the background is this:

After her 2009 World Championship win, and “outing”, the IAAF response was to create a hyperandrogenism policy, which basically required that any athlete competing in women’s events had a testosterone level below 10 nmol/L.  They did away with gender verification testing as it existed then, and replaced it with a hormone requirement in an attempt to find a best possible compromise.

Their upper limit was set based on studies of women with a condition called Polycystic Ovary Syndrome (PCOS).  Women with this condition have an average T level of 4.5 nmol/L.  Adding 3 SD (which generally represents extreme outliers) would give a T level of 7.5 nmol/L.

However, research in elite female T&F athletes had shown that this limit would apply to 16 in 1000 women.  The IAAF then added a further 2SD to the PCOS average, which took the level up to 10nmol/L.  This level was just below the bottom end of the normal male range (10.5 nmol/L).
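The arithmetic behind that limit can be checked in a few lines. This is only a back-of-envelope sketch: it assumes the standard deviation can be inferred from the two figures quoted above (4.5 nmol/L average, 7.5 nmol/L at +3 SD), and the variable names are mine, not the IAAF’s.

```python
# Back-of-envelope check of the PCOS-based limit described above.
# Assumption: the SD is inferred from the two figures quoted in the text.
pcos_mean = 4.5   # nmol/L, average T in women with PCOS
plus_3sd = 7.5    # nmol/L, quoted value at mean + 3 SD

implied_sd = (plus_3sd - pcos_mean) / 3   # nmol/L per SD
plus_5sd = pcos_mean + 5 * implied_sd     # a further 2 SD on top of the 3 SD

print(f"Implied SD: {implied_sd:.1f} nmol/L")
print(f"Mean + 5 SD: {plus_5sd:.1f} nmol/L")
```

On these rounded inputs, mean plus 5 SD comes out at about 9.5 nmol/L, which is close to the 10 nmol/L cut-off; the small gap presumably reflects rounding in the quoted figures.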

The IAAF research conducted in 2011 had further shown that a) the 99th percentile among female T&F athletes was 3.08 nmol/L (meaning that 99% of women had values below 3.08 nmol/L), and b) not a single woman had a T level above 10 nmol/L, unless she had a DSD.  So the IAAF were satisfied that this cut-off of 10 nmol/L would prevent false positives, and would make for a legitimate upper level in their hyperandrogenism policy.

The result was that any intersex athlete would now have to suppress their T levels (hormone therapy), get below this level, and then compete.  This was what Semenya, and others, would have been subject to from 2010 onwards.

This policy was challenged by an Indian sprinter, Dutee Chand, who took the case to CAS, and it was that hearing, in July 2015, that ruled in her favour, setting aside the IAAF’s policy for two years, and asking them to return with evidence.  The main reason for setting it aside, and we shall return to this later, is seen in points 533 and 534.  These are very important concepts:

Consider also point 522 of that Chand decision:

And finally, point 517, part of CAS building towards their conclusion:

The point is, CAS were clear in their explanations that a) there was insufficient evidence to argue that elevated T was the source of a competitive advantage; b) the size of the advantage is key to them, because it has to be greater than any other normal genetic advantage (see point 517); and c) the “degree of competitive advantage over non-hyperandrogenic females” had to be of “commensurate significance to the competitive advantage that male athletes enjoy over female athletes” (From point 533).

Now, this does not necessarily mean that the advantage must be the same as the male advantage over females – it needs to be “of commensurate significance”, but it does lead CAS to refer, often, to this 10% to 12% male vs female difference, which I think compromises the issue.

Wrongly, in my opinion, CAS has been led into “anchoring” the difference between men and women at between 10% and 12%, because that is the typical difference between men’s and women’s performances. This should never have been an issue of men vs women, however. Rather, it should be about whether women who possess a Y-chromosome, and who produce T in the male range, have an unnaturally large advantage over women who do not have those male-level T values (more in line with the language used in Point 517, shown above).

In the CAS decision, they are aware that it should be women with high T vs women without, but as you see in Points 522 and 527, they keep coming back to this 10% to 12% comparison.  Quite how strongly CAS wish to hold to that comparison is going to determine how effective the IAAF research is in swaying them to reinstate the policy.

The question is: At what point does the advantage become small enough that it starts to resemble all other advantages?  It’s at that point that CAS will dismiss any evidence as insufficient to reinstate the policy, and that’s where the latest IAAF research comes in.


The IAAF research: A performance advantage of Testosterone

So what does the IAAF have?  The study was done at the 2011 and 2013 World Championships in Daegu and Moscow.  The IAAF obtained blood samples of 2127 athletes – 795 male, and 1332 female – and tested their T levels as well as a range of other hormones (see Tables later).

The findings.  Some highlights below:

  • Among the 1332 females, 24 had T levels above 3.08 nmol/L (remember this is the 99th percentile in a previous study).
  • Of the 24, nine were diagnosed as having a DSD.  Nine were found to have been doping, and six were “impossible to classify”.
  • An almost ‘sideline’ finding is that among the male athletes, the throwers have lower T values than marathon runners and race walkers, which the authors call “an unexpected result”. They attribute it to doping, which I think is reasonable – the athletes are using steroids up to the competition, and when they stop, so as to avoid detection by doping tests, they have a typical withdrawal response, and lower T levels at the event.

Here’s a table from the study, showing what was measured, and the levels in females (top) and males (bottom).  Note the approximately 20- to 25-fold difference in the T values of men compared to women (for mean, and the 25th and 75th percentiles).


The performance comparison – High T vs Low T

What the IAAF then did was to divide the female athletes into ‘tertiles’ – that is, three groups, a top third, a middle third and a bottom third. They then compared the performances of the top tertile and the bottom tertile.  This table shows the result:

It’s a lot to take in, but I’ve highlighted in yellow the events where the women in the high T group significantly outperform women in the low T group.  For example, look at the 800m, middle left – the high T group have an average time of 2:00.50, and the low T group run 2:02.68.
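For anyone unfamiliar with a tertile comparison, here is a minimal sketch of the idea. The data are made up purely for illustration (they are NOT values from the study), the function name is my own, and the paper’s actual procedure may well differ in details such as how ties are handled or which statistical test is applied.

```python
from statistics import mean

def tertile_means(athletes):
    """Return (low_mean, high_mean): the mean performance of the lowest
    and highest thirds of athletes when ranked by T level.

    `athletes` is a list of (t_level, performance) tuples.
    """
    if len(athletes) < 3:
        raise ValueError("need at least three athletes to form tertiles")
    ranked = sorted(athletes, key=lambda a: a[0])  # ascending T level
    third = len(ranked) // 3
    low = ranked[:third]     # bottom tertile
    high = ranked[-third:]   # top tertile
    return mean(p for _, p in low), mean(p for _, p in high)

# Hypothetical 800m field: (T level in arbitrary units, time in seconds)
sample = [
    (0.2, 123.1), (0.3, 122.5), (0.4, 122.9),
    (0.5, 121.8), (0.6, 122.0), (0.7, 121.5),
    (0.9, 120.9), (1.1, 120.2), (1.4, 120.6),
]

low_mean, high_mean = tertile_means(sample)
print(f"Low T tertile mean:  {low_mean:.2f} s")
print(f"High T tertile mean: {high_mean:.2f} s")
```

The middle third is simply left out of the comparison, which is why the contrast is between the extremes of the group rather than between “high” and “everyone else”.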

The significant differences existed for 400m (1.34 s faster for High T), 400m hurdles (1.60 s faster in High T), 800m (2.18 s faster for High T), Hammer Throw (3.07m further in High T) and Pole Vault (13cm higher in High T).

As percentages, the range is 1.8% (800m) to 4.5% (Hammer throw).
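The 800m figure is easy to verify from the two average times quoted above. This sketch assumes the percentage is taken relative to the Low T group’s time; the paper may define it slightly differently, though at this precision the choice of denominator does not change the rounded answer.

```python
def to_seconds(mark):
    """Convert an 'M:SS.ss' time string to seconds."""
    minutes, seconds = mark.split(":")
    return int(minutes) * 60 + float(seconds)

high_t = to_seconds("2:00.50")  # High T group average, from the text
low_t = to_seconds("2:02.68")   # Low T group average, from the text

gap = low_t - high_t            # seconds faster for the High T group
pct = gap / low_t * 100         # assumption: relative to the Low T time

print(f"Gap: {gap:.2f} s ({pct:.1f}%)")
```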

Looking ahead to the CAS hearing, the IAAF position will be that the effect of high T is between 1.8% and 4.5%, and thus provides a significant advantage that meets the criteria identified by CAS (one example being shown in Point 517 above).

Note that the paper does not isolate the performances of the nine athletes with DSDs, nor for that matter the 24 female athletes with T levels above the 99th percentile.  Instead, they have been included in the “highest fT tertile” group shown in the table above.  They are significant outliers for their elevated T level (remember that these athletes with DSDs have T levels in the male range – they are the top percentile, literally, not the top 33%).  It will be the IAAF position that if a group that includes women with normal T has a large advantage, then women with hyperandrogenism, who have T levels between 15 nmol/L and 30 nmol/L, will have an even greater performance advantage, provided they are sensitive to T effects.


Is the evidence strong enough?

Now for the big questions.  The IAAF have now got evidence that goes some of the way to addressing what CAS had identified as a lack of evidence for a performance advantage associated with high T levels.  That’s a step in the right direction, not only for the IAAF’s case, but for our general understanding of performance physiology.

The IAAF will present this study, plus a few things not included in it, and say “Here is the evidence that T is a suitable candidate with which to manage the complicated issue of intersex conditions, because high T levels confer an advantage that is between 1.8% and 4.5% within normal T ranges.  We propose that even higher T levels provide even greater advantages, and thus our policy to set a cut-off at 10nmol/L should be reinstated.”

Is it going to be enough? Before I give my views on this study, I just want to emphasize that my own opinion on this issue is that the IAAF’s policy and 10nmol/L cut-off was the best solution to an impossible problem.  I thought that CAS was wrong to set it aside (though I understand some of the rationale), and I think the best solution for the sport, and for the most athletes, would be to reinstate the policy.

At one extreme, you have a situation where all intersex athletes are banned, and I don’t think this is fair.  At the other, you have no restriction or regulation at all, which is what we have had since 2015 thanks to the CAS verdict.  I believe the compromise to be a point in the middle, where a pretty generous upper limit of 10nmol/L is set, and athletes with Y chromosomes and high T levels can still compete, provided they bring their T levels down to 10nmol/L, from say 25nmol/L.

Last year this time, I wrote two articles discussing this issue.  In the first, I interviewed Joanna Harper, and in the second, I addressed some of the common issues and questions on this issue.  I will not rehash those views here, but would steer you towards those for more discussion.  I stand by all those viewpoints, mine and Joanna’s, and this latest IAAF evidence does little to change that.


Opinion: The IAAF evidence does not go far enough

That said, I don’t think that the IAAF evidence is enough, and if I were a betting man, I’d say that CAS will dismiss their appeal and that the status quo will remain, much as I wish it would change.  I say this for a few reasons:

  1. The IAAF evidence does not go far enough, either in terms of the depth or the range.  By depth, I mean the magnitude of the difference.  An advantage of between 2% and 5% is very big, yes. It’s the difference between a decent club runner and an international athlete in most events.  If one uses Semenya as an illustration, prior to the 2015 CAS decision, she was running 2:01.  Within a year, she’s running 1:56, regularly, with capabilities to go faster.  That’s 4%, and it’s taken her from non-qualifier for World Champs to untouchable, dominant Olympic champion.

    Point is, 4% is huge.  5% is massive.  Even 2% is a big, big difference, the kind that a lifetime of training won’t overcome at elite level.  However, I don’t know if it will be enough for this context.  Remember those points I highlighted from the CAS decision.  Refresh your memories: Points 533 and 534.

    This does not mean that CAS are explicitly looking for 10% to 12% – that is not a required standard. But it does point strongly towards a mindset that if the IAAF can’t show a difference that is commensurate with male advantages over females, then they’re not showing a big enough advantage.

    As I said earlier, the big question is how rigid CAS will be with respect to the male-female difference, as opposed to a concept of “unnatural advantage”. I think the IAAF are a few percent short of a convincing case, even in those events that are statistically affected.
  2. By range, I mean the spread of events.  The IAAF have found significant differences between High T and Low T in five events – 400m, 400m hurdles, 800m, hammer and pole vault.  What of the other 16 events?  In the short sprints, for instance, the High T group is generally slower, though the difference is not statistically significant – check the table above.

    Now, there are some very plausible explanations for why this might be – as with the male throwers, you may be seeing the effect of doping, but that’s speculative.  For the IAAF case to be really strong, the effect of T really needed to exist across the spectrum of events, and in particular the sprints and power based events, as well as the long sprints and middle distance events.
  3. The argument of “it’s a normal genetic advantage” remains. This argument has frustrated me from the very beginning of the debate, because qualitatively, a woman with a Y chromosome and high T levels is clearly not the same thing as a Jamaican sprinter with fast twitch muscle fibers, or an NBA player who is 208cm tall.  Why?  Because we have recognized that men and women are different, and created separate categories for them to compete in.  We have NOT created categories for height in basketball, for fiber type in running, for foot size or arm length in swimming.  We have a female category for a reason, and it’s to protect the integrity of the sport and women’s competition against the most powerful genetic influence known to performance – the male chromosome.

    There is, in a very reductionist approach, an argument to be made for saying that anyone with a Y chromosome must compete as a male.  But that’s problematic because of physiological conditions where the Y chromosome confers no advantage at all.  It’s in attempting to accommodate this complexity that we produce the impossible dilemma we are now faced with.

    But the idea that a “genetic advantage” that comes from having a male chromosome and male levels of Testosterone is in any way the same as that of having height, or reach, or muscle biochemistry, is completely bogus, in my opinion.  The logical extension of this argument is that we should accept that winning in sport is all about genetic advantages, and then let everyone compete in one category, and see who has the advantages.  Such is life, and you win the genetic lottery if you are male.  Only men would qualify for World Championships, with 0% female representation, let alone winning.

    However, we don’t do this, because we have recognized the need to regulate some genetic advantages.  We do the same for size in boxing, rowing and contact sports, by the way – would you make an exception to allow a 95kg man, born that way, through no fault of his own, to fight against middleweights?  No, of course not – if you have a category, you protect it.  People trying to argue that these intersex athletes are females with a normal genetic advantage, just like LeBron James, Usain Bolt and Michael Phelps, are basically arguing for the dissolution of female competition; they just don’t realize it yet.

    However, because this argument is so pervasive, it will come up in the CAS hearing.  And here, the IAAF evidence is not quite strong enough to refute it.  Had the IAAF discovered a performance advantage of 6% to 8%, then it’s different, because that’s close enough to the 10% male-female difference that I think you could make a case for it being an advantage that “outranks the influence of any other single genetic or biological factor” (see point 517 from CAS’ decision, above).

    2% to 5%, in my opinion, DOES still meet that criterion, and I believe it to be even larger in virilized women with DSDs. However, I don’t know that it will be enough for CAS, given how often they referred to the 10% to 12% male-female difference in their conclusion previously.
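To put the numbers from these three points side by side, here is a quick sketch. It uses only figures quoted in this post (Semenya’s approximate 800m times, the study’s 1.8% to 4.5% range, and the 10% to 12% male-female gap); the variable names are my own, and measuring the “shortfall” against the lower end of the male-female gap is my framing, not a standard that CAS has set.

```python
# Semenya's 800m times as quoted in the text (approximate marks)
before_s = 2 * 60 + 1   # 2:01, before the 2015 CAS decision
after_s = 1 * 60 + 56   # 1:56, within a year of it

improvement_pct = (before_s - after_s) / before_s * 100

study_range_pct = (1.8, 4.5)     # significant High T vs Low T differences
male_female_pct = (10.0, 12.0)   # typical male vs female performance gap

# Assumption: compare the top of the study range with the bottom of the gap
shortfall = male_female_pct[0] - study_range_pct[1]

print(f"Semenya's improvement: {improvement_pct:.1f}%")
print(f"Study range: {study_range_pct[0]}% to {study_range_pct[1]}%")
print(f"Gap to the 10% benchmark: {shortfall:.1f} percentage points")
```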

So, for the reasons outlined above, I think the IAAF have some evidence of advantage, which confirms physiological theory, but I don’t know that it will be enough to clear the bar that was set by the language and phrasing used by CAS in their conclusion in 2015.

Were I a betting man, as mentioned, I’d say CAS will acknowledge that this evidence goes some of the way to understanding the issues, but is insufficient in magnitude and breadth, and we will go on with the current situation.

That said, maybe I’m totally wrong.  One final point – the IAAF will have other arguments.  It won’t all hinge on this study. For instance, they now have a few cases of DSD athletes who have had hormone therapy (between 2010 and 2015), then come off, so there’s a “quasi-experiment” there to show the effects too.  There’ll be other submissions.  But I just don’t think this evidence alone quite gets there.

That decision comes at the end of the month.

Ross


The Science of Sport - Scientific comment and analysis of sports and sporting performance