Dichotomization and cut-off values

Dichotomization means measuring or classifying something using only two values. Usually this is done because a yes-or-no decision has to be made, for instance whether or not to treat someone for a medical or mental problem. Other times it is because people are using a theory which claims that entities can be divided into two (or more) clear-cut groups. Such theories are said to be typological (e.g. Myers-Briggs). The alternative is trait theories, in which traits take values along a continuum. Generally, typological theories about human psychology are inconsistent with what we know from behavior genetics and should be discarded.

If we measure a continuous trait using only two values, we lose precision. If we then correlate the obtained data with some other variable, the resulting correlation will generally be weaker (closer to zero) than it would have been had we used a continuous measurement. The issue is further complicated by the fact that we can vary the centile at which we set the cut-off point between the two values, and the size of the loss depends on that choice. Lack of understanding of this phenomenon leads to confusion and to underestimation of the strength of relationships between traits that are routinely measured as dichotomous values, such as mental illnesses.
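The effect can be illustrated with a small simulation. The following is a minimal sketch in Python using NumPy, not the source code of the app itself. The function names (`simulate`, `dichotomize`, `expected_attenuation`), the true correlation of 0.5, the sample size, and the three cut-off centiles are all assumptions chosen for illustration. The theoretical shrink factor used for comparison holds only when the two variables are bivariate normal and just one of them is dichotomized.

```python
import numpy as np
from statistics import NormalDist

RHO = 0.5  # assumed true correlation between the two continuous variables


def simulate(rho=RHO, n=200_000, seed=1):
    """Draw n pairs from a standard bivariate normal with correlation rho."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    return x, y


def dichotomize(values, centile):
    """Code values above the given centile (0-100) as 1, the rest as 0."""
    cutoff = np.percentile(values, centile)
    return (values > cutoff).astype(float)


def expected_attenuation(centile):
    """Shrink factor when ONE of two bivariate-normal variables is split.

    The dichotomized correlation is rho * pdf(z) / sqrt(p * (1 - p)),
    where p is the proportion below the cut-off and z the matching z-score.
    """
    p = centile / 100
    z = NormalDist().inv_cdf(p)
    return NormalDist().pdf(z) / (p * (1 - p)) ** 0.5


x, y = simulate()
r_continuous = np.corrcoef(x, y)[0, 1]
print(f"continuous: r = {r_continuous:.3f}")

results = {}
for centile in (50, 80, 95):
    # One variable dichotomized, the other left continuous.
    r_one = np.corrcoef(dichotomize(x, centile), y)[0, 1]
    # Both variables dichotomized at the same centile.
    r_both = np.corrcoef(dichotomize(x, centile), dichotomize(y, centile))[0, 1]
    results[centile] = (r_one, r_both)
    print(
        f"cut-off at centile {centile}: one dichotomized r = {r_one:.3f} "
        f"(theory {RHO * expected_attenuation(centile):.3f}), "
        f"both dichotomized r = {r_both:.3f}"
    )
```

Under these assumptions, a split at the median keeps about 80% of the original correlation when one variable is dichotomized, and the loss grows as the cut-off moves toward either extreme. Dichotomizing both variables shrinks the observed correlation further.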

Below we see a scatter plot of two variables. You can use the settings to the left to make one or both of the variables dichotomous. You can also decide at which cut-off centile the data are to be dichotomized. When you have made some changes and want to see the effect, click the 'Update!' button below.


Made by Emil O. W. Kirkegaard using Shiny for R. Source code available on GitHub.