IQ tests don’t measure what’s most valuable. By Devon Eriksen.
Mensa et al:
When I was young, some time in the 20th century, someone informed me that my score on a certain academic admissions test qualified me for membership in something called the “Pi Society”.
I believe it no longer exists, but it was something like Mensa.
Being skeptical, but curious, I went down a very deep early-internet rabbit hole of “high-IQ societies”. There were quite a few of them. …
After some sniffing around, I had a general impression of what these groups mostly did, which was… make IQ tests, and take the IQ tests that other members made.
At which point, I had a key realization. These people weren’t the best and brightest on the planet. They were logic puzzle hobbyists. Which explained why I had never heard of any of them outside of the high-IQ society context. Because making logic puzzles isn’t exactly a trillion dollar global industry.
What do you want to measure?
Now, none of this implies that IQ isn’t real (it is), or that it isn’t important (it is), or that IQ tests don’t measure it effectively (they usually do).
The point is, they don’t measure it directly. They can’t. They measure the performance of skills which highly correlate to it. Like solving logic puzzles.
IQ is an invisible beast that casts a shadow in sunlight. You can’t see it, but the shadow proves it is there. You can measure the one by measuring the other, but your results can always be distorted by lighting conditions.
Solving logic puzzles correlates very well to IQ, but it’s also a skill on its own, which can be practiced. Or influenced by other skills and knowledge.
Example:

Whether you can solve it or not correlates to intelligence, yes.
But since I have a degree in computer science, it’s very easy for me to say “XOR”, and just be done with it.
Someone just as intelligent as me, but with no such background, would have to go through more steps. He would have to invent the concept of “XOR” on the spot before solving the puzzle. The question would be harder for him, and easier for me.
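For readers without that background, here is a minimal sketch of what “XOR” (exclusive or) does. The puzzle itself is not reproduced here, so the two 3x3 patterns below are hypothetical and only illustrate the operation: a cell ends up filled when exactly one of the two inputs has it filled.

```python
# Hypothetical 3x3 cell patterns, one bit per cell (1 = filled, 0 = empty).
# They are illustrative only and are not taken from the original puzzle.
first = 0b110_010_011
second = 0b011_010_110

# XOR keeps a cell filled when exactly one of the two inputs has it filled.
combined = first ^ second


def render(pattern: int) -> str:
    """Draw a 9-bit pattern as three rows of '#' (filled) and '.' (empty)."""
    bits = format(pattern, "09b")
    rows = [bits[i:i + 3] for i in range(0, 9, 3)]
    return "\n".join(row.replace("1", "#").replace("0", ".") for row in rows)


print(render(first))
print()
print(render(second))
print()
print(render(combined))
```

Someone who already knows the operation sees the third pattern as one step. Someone who does not has to work out the cell-by-cell rule first.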
So, these super-IQs of 200 or more, attained on tests specially designed to make such results possible, might have a limited amount to do with inherent IQ as we understand it, and more to do with a practiced skill at taking IQ tests.
A test of this sort might accurately rate (or underrate) someone who comes in cold, and vastly overrate someone who enjoys logic puzzles instead of, say, designing and building rockets.
All the tests are flawed (e.g. the Flynn effect):
Now, there is a field of psychometrics, which is devoted to minimizing these sorts of effects, but having gone down that rabbit hole, too, I can tell you that success in that area is much more limited than those working on it would have you believe.
In other words, any metric can become a target, and when it is a target, it begins to lose effectiveness as a metric. …
Actionable takeaway:
Because there is a point, somewhere up in the IQ stratosphere, where those aspects of IQ which are easily measured (logic puzzles) diverge from those aspects of IQ which we care about (rockets).
A logic puzzle is ultimately a question of intent. What is the intended answer by the human puzzle designer?
But we don’t care about intelligence in order to solve logic puzzles. We care about intelligence in order to solve natural problems. And natural problems have no intent. They weren’t designed by a human.
So when we give Alan and Bob an IQ test, and Alan scores 115 while Bob scores 130, we know that Bob should get the rocket-design job, and Alan should go be a doctor or a lawyer or something.
But the same is not necessarily true between 145 and 160, and even less necessarily true between 160 and 175 (which is beyond the range of standard tests anyway).
The conclusion is that there’s no point in giving Elon Musk an IQ test. And don’t try to replace him with Marilyn vos Savant [highest recorded IQ in the Guinness Book of World Records], ’cause you won’t like the results.
This is why IQ tests don’t work too well on really smart people. Because sorta smart people tend to give the expected answer.
And really smart people tend to point out that the question is wrong, and start arguing with the test, or trying to correct it, thereby making the test impossible to grade and annoying everyone.

The expected answer to this puzzle (2 and 2 give 8, 5 and 5 give 50, so what do 6 and 6 give?) is 72. Because 2*2*2 = 8 and 5*5*2 = 50, so 6*6*2 = 72.
But the (really) correct answer is “I don’t know.” Because what you have is two points on a three-dimensional graph (x,y) -> z. … An infinite number of continuous surfaces can be drawn in three dimensions that pass through these points (2,2,8) and (5,5,50).
Each of these surfaces can be described by its own formula. Some of them will also touch (6,6,72). But others of them will touch (6,6, {something else entirely}) instead.
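To make that concrete, here is a small sketch with three formulas that all pass through (2,2,8) and (5,5,50) but disagree at (6,6). Only the first rule comes from the puzzle's expected answer; the function names and the other two formulas are my own illustrative choices, picked from the infinitely many that fit.

```python
def expected_rule(x, y):
    """The rule behind the expected answer: z = 2 * x * y."""
    return 2 * x * y


def linear_rule(x, y):
    """A flat plane through the same two points: z = 7x + 7y - 20."""
    return 7 * x + 7 * y - 20


def bent_rule(x, y, k=1):
    """The expected rule plus a term that is zero at x = 2 and x = 5.

    Any value of k gives yet another surface through both known points.
    """
    return 2 * x * y + k * (x - 2) * (x - 5)


known_points = [(2, 2, 8), (5, 5, 50)]

for rule in (expected_rule, linear_rule, bent_rule):
    # Every rule reproduces both data points exactly...
    fits = all(rule(x, y) == z for x, y, z in known_points)
    # ...yet each one predicts something different at (6, 6).
    print(f"{rule.__name__}: fits known points = {fits}, value at (6, 6) = {rule(6, 6)}")
```

Under these assumptions the three rules give 72, 64, and 76 at (6,6), and changing `k` in `bent_rule` produces as many other values as you like.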
This might sound really, really pedantic. But it’s not.
Everyone knows that the expected answer is the simple one, but that’s only on a test… a fake artificial made up problem.
The real world:
When we start trying to do this in the real world, which, after all, is what this “IQ” thing is actually for, then using the same kind of “IQ test thinking” can get you in trouble. …
But Devon, I hear some of you ask, doesn’t the principle of Occam’s Razor demand that we fit the simplest curve?
No. No, it does not. It does not require that we select the simplest possible answer, given what we have currently seen. It requires that we prefer hypotheses that make fewer assumptions to those that make more.
These are two different things entirely.
If I see one black sheep, the simplest hypothesis is that all sheep are black.
The hypothesis requiring the fewest assumptions is that at least one sheep is black on at least one side.
You will note which of these is correct.
All of this is, of course, irrelevant to questions on an IQ test. But questions on an IQ test only matter as much as they are relevant to the actual universe... Where ideas like this are very relevant indeed.