Using the right terms matters
From 50 psychological and psychiatric terms
to avoid to 10 cogneuro/stats terms to use.
There is a great review of the misuse of psychological terms titled ‘Fifty psychological and psychiatric terms to avoid: a list of inaccurate, misleading, misused, ambiguous, and logically confused words and phrases’ by Lilienfeld et al. 2015 (doi: 10.3389/fpsyg.2015.01100). The review points out ‘errors’ that are often found in the literature and is definitely worth reading. Here are 10 terms I took from this paper and for which I think we have good alternatives.
Biomedical
(1) A gene for. As the authors point out, it is extremely rare for a single gene to have a causal relationship to behaviour. When testing for the role or involvement of a gene, we could use something like: ‘We found that gene X participates in the control
of ’
(2) Genetically determined. When it comes to brain morphology, patterns of activation or behaviour, there is no genetic determination. It is best to stick to a statistical conclusion like: ‘Between X and Y % of the variance of Z was genetically explained.’
(3) Comorbidity. Here I think we should stick to the standard definition: the simultaneous presence of diseases or conditions within individuals. As the authors point out, it is sometimes used to refer to covariation within a sample, but this is simply wrong.
Imaging
(4) Brain region X lights up. This is a typical trait of activation studies: areas light up, but only if you use hot colours! I can only concur with the authors of the article that we need to state the contrast under which an area ‘lights up’; without that context, ‘activations’ become meaningless. Better alternatives: ‘area X shows significant activation in condition A relative to baseline’ or ‘area X shows a significantly stronger signal for condition A than for condition B’.
(5) Neural signature. As the authors put it, ‘identifying a genuine neural signature would necessitate the discovery of a specific pattern of brain responses that possesses nearly perfect sensitivity and specificity for a given condition or other phenotype’, so simply never use that term. Most of the time we don’t know how sensitive a pattern of activation is, but if a pattern is seen in one condition only, then we can say we found a specific neural pattern.
Statistics
(6) No difference. Most often, biomedical/neuroscience papers use null hypothesis testing. In this framework we can only fail to reject the null hypothesis, and it is therefore impossible to conclude that there is no difference. To conclude this, alternative analyses (whether frequentist or Bayesian) must be used.
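One frequentist alternative is an equivalence test. The sketch below is not taken from the review: it is a minimal illustration of the two one-sided tests (TOST) idea for two independent groups, and it makes some assumptions. The function name `tost_equivalence` and the `margin` parameter are hypothetical, and a large-sample normal approximation is used so that only the Python standard library is needed (with small samples a t distribution would be more appropriate).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def tost_equivalence(a, b, margin):
    """TOST p value for equivalence of two independent means.

    Assumption: large-sample normal approximation (illustrative only).
    A small p value supports |mean(a) - mean(b)| < margin, i.e. the
    difference is smaller than the smallest effect we care about.
    """
    diff = mean(a) - mean(b)
    # Standard error of the difference between two independent means
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = NormalDist()
    # Test 1, H0: diff <= -margin (reject when diff is well above -margin)
    p_lower = 1 - z.cdf((diff + margin) / se)
    # Test 2, H0: diff >= +margin (reject when diff is well below +margin)
    p_upper = z.cdf((diff - margin) / se)
    # Equivalence is claimed only if both one-sided tests reject
    return max(p_lower, p_upper)


# Hypothetical data: two groups with identical means
group_a = [i % 10 for i in range(200)]
group_b = [(i + 3) % 10 for i in range(200)]
p_equiv = tost_equivalence(group_a, group_b, margin=1.0)
```

The key design choice is that the margin must be set in advance: the test can then support ‘the difference is smaller than the margin’, which is a much weaker and more defensible statement than ‘there is no difference’.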
(7) p = 0.000. Clearly we cannot write a string of zeros as decimals, since that is simply equivalent to writing p=0, and the probability, under the null, of observing an effect at least as large as the one observed cannot be 0 for a given sample. The authors suggest using p < 0.01 or p < 0.001. It is, however, possible to observe p=0 in a sample when using randomization procedures; in that case we can write p~0.
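To see how a randomization procedure can return exactly 0, here is a minimal sketch of a two-sample permutation test on the difference of means. The function name, the default number of permutations and the data are hypothetical. It returns the raw proportion, which is 0 whenever no permuted statistic reaches the observed one, alongside a commonly used estimate that counts the observed data as one of the permutations. That second estimate is a convention I am adding for illustration, not something recommended in the review.

```python
import random


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference of means.

    Returns (raw, corrected):
      raw       = hits / n_perm, which can be exactly 0
      corrected = (hits + 1) / (n_perm + 1), never below 1 / (n_perm + 1)
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(a)
    observed = abs(sum(a) / n_a - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the observations
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        stat = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if stat >= observed:
            hits += 1
    return hits / n_perm, (hits + 1) / (n_perm + 1)


# Hypothetical, completely separated groups
raw_p, corrected_p = permutation_p_value(list(range(100, 130)), list(range(30)))
```

With such well separated groups the raw value will almost surely be 0, which only tells us that p is smaller than the resolution of the procedure, roughly 1 divided by the number of permutations.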
(8) Interaction. The point made by the authors is that interaction in the general sense and in the statistical sense are different. While the former implies that multiple factors play a role in something, a statistical interaction implies that the effect of one factor depends on the level of another. I guess the simple solution is to systematically use ‘statistical interaction’ if this is what we mean.
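The distinction can be made concrete with the cell means of a 2x2 design. In the sketch below (hypothetical function name and made-up numbers), the interaction contrast is the difference between the effect of factor B at the two levels of factor A. Two factors can both ‘play a role’ while this contrast is zero.

```python
def interaction_contrast(cell_means):
    """Interaction contrast for a 2x2 design.

    cell_means[a][b] is the mean for level a of factor A and
    level b of factor B. A value of 0 means purely additive effects.
    """
    effect_b_at_a0 = cell_means[0][1] - cell_means[0][0]
    effect_b_at_a1 = cell_means[1][1] - cell_means[1][0]
    return effect_b_at_a1 - effect_b_at_a0


# Both factors change the outcome, but their effects simply add up
additive = [[10.0, 12.0],
            [15.0, 17.0]]

# The effect of B reverses depending on the level of A
crossover = [[10.0, 12.0],
             [15.0, 11.0]]

no_interaction = interaction_contrast(additive)     # 0.0
with_interaction = interaction_contrast(crossover)  # -6.0
```

In the first table one could loosely say that A and B ‘interact’ to produce the outcome, yet there is no statistical interaction; only the second table shows one.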
(9) Validity. It is pointed out in the paper that the term is often used in phrases like ‘validity of the hypotheses’ or ‘validity of the test’. Validity refers to the accuracy of measures, and it thus should not be used otherwise. I concur with the authors that we should use alternatives: when describing results and hypotheses, say they are ‘empirically supported’, and when referring to a behavioural test, say there is ‘evidence for construct validity’.
(10) Reliability. The most common and horrific misuse of that term appears in sentences like ‘our results were very reliable with a p value of 0.001’. I am not going to explain here what p values are, but they have nothing to do with reliability. The term must be used only in the specific contexts of i) inter-rater reliability (a measure is reliable if different raters give the same result), ii) test–retest reliability (a measure is reliable if replications give the same result), and iii) internal consistency (a test is reliable if the different items measure the same construct).
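As an illustration of the third sense, internal consistency is often summarised with Cronbach's alpha. The index is my choice of example and is not discussed in this post; the function name and the scores are made up. The sketch uses the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a set of test items.

    items: list of k lists, each holding one item's scores for the
    same respondents, in the same order.
    """
    k = len(items)
    # Total score per respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Hypothetical questionnaire: 3 items answered by 5 respondents
scores = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(scores)
```

Note that nothing here involves a p value: the index depends only on how the items covary across respondents.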