Recent news about a possible ‘gene for homosexuality’ on the X chromosome made me think a bit about the special factors influencing selection on the X and Y chromosomes. I may post more about that in future, but, as usual, I got sidetracked by a subsidiary issue.
There are more X than Y chromosomes in the population, and this has been claimed as relevant to their evolution. For example, Matt Ridley’s Genome says: ‘because females have two X chromosomes while males have an X and a Y, three-quarters of all sex chromosomes are Xs; one-quarter are Ys. Or, to put it another way, an X chromosome spends two-thirds of its time in females, and only one-third in males’. Ridley goes on to argue that this affects the competition between X and Y chromosomes. A weightier authority, W. D. Hamilton, also finds it important that an X chromosome spends only one-third of its time in males (Narrow Roads of Gene Land, vol. 1, p.146).
In some sense it is obviously true that on average an X chromosome spends two-thirds of its time in females (assuming equal numbers of males and females, and ignoring rarities like XYY males), but it wasn’t clear to me how the dynamics of chromosome transfer would work. For example, if in a given generation a particular allele on an X chromosome is by chance only found in females, will the frequency in subsequent generations tend towards two-thirds, and if so how?
Anyway, I couldn’t find anything on this point on a cursory look through my books, so I tried working it out myself….
First, suppose there is a newly arisen mutation on an X chromosome. If it arises first in a male, we can be certain that in the next generation, if it survives at all, it will be in a female, since any sperm containing an X chromosome causes the egg it fertilises (if any) to develop as a female.
If on the other hand it arises in a female (or has just entered one), it has a ½ chance of going into a male in the next generation, and a ½ chance of staying in a female. If it stays in a female, there is a ½ chance that it will go into a male in the third generation, and so on. Cumulatively there is a ½ chance that it will spend just one generation in a female, a ¼ chance that it will have just two consecutive generations in a female, a ⅛ chance that it will have three, and so on. The expected number of unbroken consecutive generations in a female body is therefore 1/2 + 2/4 + 3/8 + …, where the nth term is n times the 1/2^n chance of a run of exactly n generations. The value of this sum tends towards 2 as the number of generations considered increases [see Note]. It is therefore correct to say that on average a gene on an X chromosome will spend two generations in a female for every one in a male. This is what we would expect given the proportions in the population, but it is nice to be able to confirm it by a more direct line of argument. It also brings out the fact that the 2:1 ratio is only an average, and that from time to time an X gene may spend many consecutive generations in females, whereas it can never stay in a male for more than one. Whether this makes any difference in practice I don’t know, but in theory it might, e.g. because an X gene in a male has no kin-selective interest in that male’s sons.
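As a rough check on the run-length argument, here is a minimal simulation sketch in Python. It assumes the X copy is passed on in every generation (so it ignores lineages that die out) and that each child of a female carrier is equally likely to be a son or a daughter. The function names and the seed are purely illustrative.

```python
import random

def female_run_length(rng):
    """Generations an X copy stays in females before it passes to a son."""
    generations = 1              # the copy is in a female now
    while rng.random() < 0.5:    # half the time the next carrier is a daughter
        generations += 1
    return generations

def mean_female_run(trials=100_000, seed=1):
    """Average run length over many simulated lineages (seed is arbitrary)."""
    rng = random.Random(seed)
    return sum(female_run_length(rng) for _ in range(trials)) / trials

print(mean_female_run())
```

The printed average should sit close to 2, against exactly one generation for each visit to a male.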
So far this deals only with a single gene. But what about the frequency of all genes of a given type on X chromosomes among males and females in the population?
First, suppose that there are two gene variants (alleles) at a particular locus on the X chromosome. Call the original form of the gene A, and a recent variant B. Suppose that B is still very rare in the population, so that matings between B-males and B-females can be neglected. The genotypes in the population will therefore be AA, AB, AY and BY (where AY and BY are the genotypes of males with A and B alleles respectively on their X chromosome). For simplicity I make the usual assumption of random mating and separate generations. I use the term ‘B-bearer’ for any individual with a B allele.
Suppose now that in Generation 1 all the B-bearers are male. (Maybe the females all died.) The males have the genotype BY and mate with AA females, producing offspring of genotypes AY (male) and AB (female) in equal proportions. Therefore in Generation 2 the B-bearers are all AB females. These mate with AY males, and produce offspring of genotypes AA, AB, AY and BY in equal proportions. In Generation 3 there are therefore equal numbers of male and female B-bearers. These mate respectively with AA females and AY males (still neglecting the rare possibility of matings between B-bearers). The AA-BY matings produce ½ AB females and ½ AY males. The AB-AY matings produce offspring of genotypes AA, AB, AY and BY in equal proportions. In Generation 4 there are thus 3 AB females to every 1 BY male. Pursuing the same method of analysis, in Generation 5 there are 5 AB females to every 3 BY males, and so on. I have taken this to the 7th generation, with the following results:
Ratio    AB females : BY males
Gen 1     0 : 1
Gen 2     1 : 0
Gen 3     1 : 1
Gen 4     3 : 1
Gen 5     5 : 3
Gen 6    11 : 5
Gen 7    21 : 11
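The table can be reproduced with a few lines of Python. This is only a sketch under the same assumptions as the text (B rare, random mating, separate generations); the names step_rare_allele and ratios are made up for illustration.

```python
from math import gcd

def step_rare_allele(females, males):
    """One generation of B-bearers, returned as a reduced (AB females, BY males) ratio."""
    # Half of a BY male's children are AB daughters; a quarter of an AB
    # female's children are AB daughters and a quarter are BY sons.
    # Multiplying through by 4 keeps everything in whole numbers.
    new_females = 2 * males + females
    new_males = females
    divisor = gcd(new_females, new_males)
    return new_females // divisor, new_males // divisor

def ratios(generations, start=(0, 1)):
    """Ratios for the given number of generations, starting with all-male B-bearers."""
    result = [start]
    for _ in range(generations - 1):
        result.append(step_rare_allele(*result[-1]))
    return result

for gen, (f, m) in enumerate(ratios(7), start=1):
    print(f"Gen {gen}: {f} : {m}")
```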
It is evident that the ratio of females to males among B-bearers is oscillating around the expected 2:1 ratio, but converging on it quite rapidly. The result is the same whether we start with a surplus of males or females. I do not have a formal proof of the convergence, but the general explanation is fairly clear. Any surplus of males (relative to the 2:1 expected ratio) in a given generation is immediately converted into a surplus of females in the next generation. The surplus of females is then converted into a surplus of males, but as AB females produce equal numbers of AB female and BY male offspring, only half of the female surplus is transferred to males at each stage. This acts as a ‘damping’ influence on the oscillation. If the surplus of one sex in the first generation is Q (where Q is a proportion of the whole population) then the surplus in succeeding generations will be -Q/2, Q/4, -Q/8, Q/16, -Q/32, and so on. At the limit, when the ratio reaches 2:1, the proportions are in equilibrium, because the number of BY males created by AB females equals the number of AB females created by BY males. If the absolute numbers are small, there will of course be some random fluctuation around this ratio.
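The halving of the surplus can be checked with exact fractions. The sketch below tracks the female share of B-bearers and measures the surplus as the deviation of that share from 2/3, which is one convenient way of expressing Q and is my assumption rather than anything fixed by the argument above. The names are illustrative.

```python
from fractions import Fraction

TARGET = Fraction(2, 3)  # female share of B-bearers at the 2:1 ratio

def next_female_share(x):
    """Female share of B-bearers next generation, given share x now."""
    # B-bearing daughters: x/4 from AB mothers plus (1 - x)/2 from BY fathers.
    females = x / 4 + (1 - x) / 2
    # B-bearing sons: x/4, all from AB mothers.
    males = x / 4
    return females / (females + males)

def surpluses(start_share, generations):
    """Deviation of the female share from 2/3 in each generation."""
    share = Fraction(start_share)
    out = []
    for _ in range(generations):
        out.append(share - TARGET)
        share = next_female_share(share)
    return out

print(surpluses(0, 6))  # all-male start
print(surpluses(1, 6))  # all-female start
```

Each deviation is minus one half of the one before it, whichever sex is in surplus at the start.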
If the B allele is under positive selection, its frequency in the population will rise, and it will no longer be possible to neglect matings between B-bearing males and females. There will be 5 different genotypes in the population (AA, AB, BB, AY, and BY) and 6 different mating combinations (AA-AY, AA-BY, AB-AY, AB-BY, BB-AY, and BB-BY). This makes analysis more laborious, but I have checked that the system is in equilibrium when the distribution of B genes between females and males is in the ratio 2:1, and the genotypes of the females are in the Hardy-Weinberg proportions p^2:2pq:q^2, where p is the frequency of A, and q = 1-p is the frequency of B. Of course, since BB females have two copies of B, a ratio of 2:1 among genes no longer implies a ratio of 2:1 among individuals. At the extreme, when B has gone to fixation under selection, there are equal numbers of BB females and BY males. Matings between BB females and BY males preserve the equilibrium ratio of 2:1 among genes.
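The equilibrium claim can also be checked mechanically by running all 6 mating combinations through one generation. This is a sketch assuming random mating, no selection within the generation being checked, and frequencies measured separately within each sex; the dictionaries and function names are my own illustration.

```python
from fractions import Fraction
from itertools import product

# X-bearing eggs produced by each female genotype: {allele: probability}.
EGGS = {
    "AA": {"A": Fraction(1)},
    "AB": {"A": Fraction(1, 2), "B": Fraction(1, 2)},
    "BB": {"B": Fraction(1)},
}
X_SPERM = {"AY": "A", "BY": "B"}  # the X a father passes to his daughters

def next_generation(females, males):
    """Genotype frequencies within each sex after one round of random mating."""
    daughters = {"AA": Fraction(0), "AB": Fraction(0), "BB": Fraction(0)}
    sons = {"AY": Fraction(0), "BY": Fraction(0)}
    for (mother, f_freq), (father, m_freq) in product(females.items(), males.items()):
        pair = f_freq * m_freq  # frequency of this mating combination
        for egg, prob in EGGS[mother].items():
            genotype = "".join(sorted(egg + X_SPERM[father]))
            daughters[genotype] += pair * prob  # daughter: mother's X plus father's X
            sons[egg + "Y"] += pair * prob      # son: mother's X plus father's Y
    return daughters, sons

def equilibrium(q):
    """Hardy-Weinberg females and males with B at frequency q in both sexes."""
    q = Fraction(q)
    p = 1 - q
    return {"AA": p * p, "AB": 2 * p * q, "BB": q * q}, {"AY": p, "BY": q}

females, males = equilibrium(Fraction(3, 10))
print(next_generation(females, males) == (females, males))
```

With equal numbers of each sex, the B copies in females per female are AB + 2·BB = 2q against q per male, which is the 2:1 ratio among genes.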
I dare say that to someone like G. H. Hardy this would all be obvious at a glance, but I’m not G. H. Hardy, and I found it quite intriguing to work out!
*a Waltz goes one-two-three, one-two-three….
Note
We wish to prove that the limit of the sum 1/2 + 2/4 + 3/8 + 4/16 + 5/32 + … is 2. I found the following way of proving this. First, arrange the terms of the series in a column as on the left-hand side of the following series of equations:
1/2 = 1/2
2/4 = 1/4 + 1/4
3/8 = 1/8 + 1/8 + 1/8
4/16 = 1/16 + 1/16 + 1/16 + 1/16
5/32 = 1/32 + 1/32 + 1/32 + 1/32 + 1/32
6/64 = 1/64 + 1/64 + 1/64 + 1/64 + 1/64 + 1/64
[Etc.]
Limits: 1 + 1/2 + 1/4 + 1/8 + 1/16 + 1/32 + … [limit of sum = 2]
The nth term of the series on the left-hand side has the form n/2^n, while the equations on the right-hand side expand each term in the manner shown. Now look down each column on the right-hand side and note that it is itself a regular series, with its sum tending to the limit 1/2^(c-1), where c is the column number (numbered from left to right, on the right-hand side of the equations). [Added: the columns come out ragged in this text format, so this is not as clear visually as I hoped.] The limits are shown on the bottom line. Moreover, these limits themselves form a series, the sum of which has the limit 2. Since all the terms on the right-hand side are positive, their sum must increase with increasing n, but it can never exceed the overall limit to the sum of the columns, i.e. 2. The sum 1/2 + 2/4 + 3/8 + 4/16 + 5/32 + … therefore has a limit not greater than 2.
To prove that the limit actually is 2, note that for any chosen finite value of n the sum of the terms in each column down to that value of n falls short of its limit by 1/2^n. There are n of these columns, so their total ‘remainder’ is n/2^n. We must also allow for the columns that would appear further to the right if the series were extended beyond this choice of n. The total value of the terms of these ‘invisible’ columns cannot exceed the limit of the sum of the last ‘visible’ column. Since the column number of the last ‘visible’ column is n, this limit is 1/2^(n-1).
For any given finite n the sum of all the terms on the right-hand side, down to this value of n, therefore falls short of the upper limit 2 by not more than n/2^n + 1/2^(n-1). But it is evident that as n increases this maximum ‘remainder’ tends to zero, therefore the series 1/2 + 2/4 + 3/8 + 4/16 + 5/32 + … tends to the limit 2. Q.E.D.
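The remainder bound in the proof can be checked numerically with exact fractions. This is a sketch only, with illustrative function names; it compares the shortfall of each partial sum from 2 with the bound n/2^n + 1/2^(n-1).

```python
from fractions import Fraction

def partial_sum(n):
    """Sum of k / 2**k for k = 1..n, as an exact fraction."""
    return sum(Fraction(k, 2**k) for k in range(1, n + 1))

def shortfall_bound(n):
    """The proof's upper bound on how far the nth partial sum falls short of 2."""
    return Fraction(n, 2**n) + Fraction(1, 2**(n - 1))

for n in (1, 2, 5, 10, 20):
    gap = 2 - partial_sum(n)
    print(n, gap, shortfall_bound(n), float(gap))
```

For the values tried here the shortfall equals the bound, which is consistent with the closed form 2 - (n + 2)/2^n for the nth partial sum.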
Posted by David B at 04:21 AM