The Faking of Genetical Results

BY PROFESSOR J. B. S. HALDANE

MY FATHER published a number of papers on blood analysis. In the proofs of one of them the following sentence, or something very like it, occurred: "Unless the blood is very thoroughly faked, it will be found that duplicate determinations rarely agree." Every biochemist will sympathise with this opinion. I may add that the verb "to lake," when applied to blood, means to break up the corpuscles so that it becomes transparent.

In genetical work also, duplicates rarely agree unless they are faked. Thus I may mate two brother black mice, both sons of a black father and a white mother, with two white sisters, and one will beget 10 black and 15 white young; the other 15 black and 10 white. To the ingenuous biologist this appears to be a bad agreement. A mathematician will tell him that where the same ratio of black to white is expected in each family, so large a discrepancy (though how best to compare discrepancies is not obvious) will occur in about 26 per cent. of all cases. If the mathematician is a rigorist he will say the same thing a little more accurately in a great many more words.

A biologist who has no mathematical knowledge, and, what is vastly
more serious, no scientific honour, will be tempted to fake his
results, and say that he got 12 black and 13 white in one family, and
13 black and 12 white in the other. The temptation is generally more
subtle. In one of a number of families where equality is expected he
gets 19 black and 6 white mice. It looks much more like a ratio of 3
black to 1 white. How is he to explain it? Wasn't that the cage
whose door once seemed to be insecurely fastened? Perhaps the female
got out for a while or some other mouse got in. Anyway he had better
reject the family. The total gives a better fit to expectation if he
does so, by the way. Our poor friend has forgotten the binomial
theorem. A study of the expansion of $(\tfrac{1}{2} + \tfrac{1}{2})^{25}$ ...

He gets his Ph.D. He wants a fellowship, and time is short. But he has been reading Nature and noticed two letters* to that journal of which I was joint author, in which I might appear to have hinted at faking by my genetical colleagues. Thoroughly alarmed, he goes to a venal mathematician. Cambridge is full of mathematicians who have been so corrupted by quantum mechanics that they use series which are clearly divergent, and not even proved to be summable. Interrupting such a one in the midst of an orgy of Bhabha and benzedrine, our villain asks for a treatise on faking. "I am trying to reconcile Milne, Born and Dirac, not to mention some facts which don't seem to agree with any of them, or with Eddington," replies the debauchee, "and I feel discontinuous in every interval; but here goes.

* U. Philip and J. B. S. Haldane (1939). Nature, 143, p. 334.

"I suppose you know the hypothesis you want to prove. It wouldn't be a bad thing to grow a few mice or flies or parrots or cucumbers or whatever you're supposed to be working on, to see if your hypothesis is anywhere near the facts. Suppose in a given series of families you expect to get four classes of hedgehogs or whatnot with frequencies $p_1$, $p_2$, $p_3$, $p_4$, and your total is $S$, I shouldn't advise you to say you got just $Sp_1$, $Sp_2$, $Sp_3$ and $Sp_4$, or even the nearest whole number. Here is what you'd better do. Say you got $A_1$, $A_2$, $A_3$ and $A_4$, and evaluate

$$\chi^2 = \sum_{i=1}^{4} \frac{(A_i - S p_i)^2}{S p_i}$$

Your ...

"Your second order faking is the same sort of thing. Supposing your total is made up of $n$ families, and you say the $r$th consisted of $a_{r1}$, $a_{r2}$, $a_{r3}$, $a_{r4}$ members of the four classes, $s_r$ in all, you take

$$\sum_{i=1}^{4} \frac{(a_{ri} - s_r p_i)^2}{s_r p_i}$$

and sum for all values of $r$. Your total ought to be somewhere near $3n$. The standard error is $\sqrt{6n}$.

"There is also third order faking. The $4n$ different
components of ...

Man is an orderly animal. He finds it very hard to imitate the disorder of nature. In fact the situation is the exact opposite of what the reader of Paley's Evidences might expect. But the problem is an interesting one, because it raises in a sharp and concrete way the question of what is meant by randomness, a question which, I believe, has not been fully worked out. The number of independent numerical criteria of randomness which can be applied increases with the number of observations, but much more slowly, perhaps as its logarithm.

The criteria now in use have been developed to search for excessive irregularity, that is to say, unduly bad fit between observation and hypothesis. It does not follow that they are so well adapted to a search for an unduly good fit. Here, I believe, is a real problem for students of probability. Its solution might lead to a better set of axioms for that very far from rigorous but none the less fascinating branch of mathematics.

Eureka, 6. Reproduced from Eureka 27 pages 21-24.
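
The arithmetic behind the essay's examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the two-sided binomial tail as the measure of "as bad a fit" for the family of 19 black and 6 white mice, and it uses the usual chi-squared sum of (observed - expected)^2 / expected for the single-total and per-family checks that the mathematician describes. The function names are invented here and appear nowhere in the original.

```python
from math import comb, sqrt


def binomial_two_sided_tail(n, k):
    """Chance that a Binomial(n, 1/2) count lies at least as far from n/2 as k does.

    Assumption: "as bad a fit" is read as a two-sided tail probability.
    """
    deviation = abs(k - n / 2)
    favourable = sum(comb(n, x) for x in range(n + 1)
                     if abs(x - n / 2) >= deviation)
    return favourable / 2 ** n


def chi_squared(observed, expected_freqs):
    """Sum of (observed - expected)^2 / expected over the classes of one total."""
    total = sum(observed)
    return sum((a - total * p) ** 2 / (total * p)
               for a, p in zip(observed, expected_freqs))


def summed_chi_squared(families, expected_freqs):
    """Per-family chi-squared values added up, with the mean and standard
    error expected by chance: df * n and sqrt(2 * df * n), where
    df = (number of classes - 1) and n = number of families.
    """
    n = len(families)
    df = len(expected_freqs) - 1
    total = sum(chi_squared(family, expected_freqs) for family in families)
    return total, df * n, sqrt(2 * df * n)


if __name__ == "__main__":
    # The tempting rejection: 19 black and 6 white where equality is expected.
    print("tail probability for 19:6 =", binomial_two_sided_tail(25, 19))

    # Honest-looking versus suspiciously tidy pairs of families (two classes).
    equal = [0.5, 0.5]
    print("10:15 and 15:10 ->", summed_chi_squared([[10, 15], [15, 10]], equal))
    print("12:13 and 13:12 ->", summed_chi_squared([[12, 13], [13, 12]], equal))
```

With four classes the expected total and standard error returned by `summed_chi_squared` are 3n and the square root of 6n. A reported total lying far below the expected value is the "unduly good fit" that the closing paragraphs say the usual criteria were not designed to detect.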