Weekly: April 4, 2026: SRY gene tests, LLM model collapse, and other links
SRY Gene Tests Not Suitable for Gender Testing
The International Olympic Committee has reinstated SRY gene testing, a practice it abandoned in 1999 as unsuitable for the purpose. An earlier statement on World Athletics adopting the same standard:
In advance of the 2025 World Athletics Championships in Tokyo, World Athletics (WA) announced that passing a genetic test, specifically for the absence of the SRY gene, is a prerequisite for competing in the women’s category in all athletics World Rankings Competitions as of 1 September 2025 (online supplemental table 1). Athletes who do not pass the test will be considered ‘biological males’ and required to prove that they have complete androgen insensitivity syndrome (CAIS) in order to compete. This requirement recycles an intentionally discontinued practice from the last century. After ending the discredited chromosome (Barr body) tests that had been used since the 1960s, the International Olympic Committee briefly used SRY gene tests from 1992 to 1999 before they were also withdrawn because of inaccuracy, lack of evidence of performance advantage, excessive cost including the cost of counselling, and the trauma and stigmatisation experienced by athletes and their families.
-- Harmful anachronism: World Athletics reinstates gene testing to participate in women’s competitions
LLM Model Collapse
Model Collapse Is Already Happening, We Just Pretend It Isn’t has given me some food for thought. LLMs are biased toward the central tendency (the most common elements) of the language they model. When a model's output is fed into the training data for the next generation of models, that bias compounds: each generation's output is narrower than the last. The article argues this is already happening.
In my day job, we're trying to incrementally upgrade systems built from layers and layers of production web-application code. Some of the frameworks we depend on are no longer maintained. I suspect some of the problems I've experienced with LLM analysis of this codebase come from an inability to recognize it as a distinct edge case. Feeding the LLM information from two different languages and frameworks might be poisoning the results.
It also might explain some of the uncanny-valley feeling of LLM-generated writing. The language used by individual humans has a statistical "fingerprint" unique to each writer or speaker. An LLM that delivers the most probable continuation lacks the tell-tale quirks that identify an individual writer's style, which may be why LLM-generated text so often reads as bland and corporate.
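A minimal sketch of why the fingerprint disappears (my illustration, with made-up word preferences): model two "writers" as unigram distributions over near-synonyms. A model fit to the pooled corpus averages them, and greedy most-probable decoding then emits only the shared mode, so neither writer's signature word ever surfaces:

```python
# Hypothetical writer profiles: each prefers a different signature word,
# but both share a common fallback ("use").
writer_a = {"use": 0.45, "utilise": 0.40, "leverage": 0.15}
writer_b = {"use": 0.45, "utilise": 0.15, "leverage": 0.40}

# A model trained on both corpora in equal proportion averages the styles.
pooled = {w: (writer_a[w] + writer_b[w]) / 2 for w in writer_a}

# Greedy decoding always picks the corpus-wide mode.
greedy_choice = max(pooled, key=pooled.get)
print(pooled)         # {'use': 0.45, 'utilise': 0.275, 'leverage': 0.275}
print(greedy_choice)  # use
```

Each writer's signature word was the single most likely choice in their own text, yet after pooling, the bland shared word wins every time.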