Essay Sample: Different Methods of Establishing the Reliability of a Psychological Instrument

Published: 2022-05-17
Type of paper:  Essay
Categories:  Psychology

The reliability of a test is the degree of precision with which the test measures a psychological trait, regardless of whether it actually measures the trait it is intended to measure (validity). That is, a test is said to be reliable when it measures well whatever it measures. Reliability refers to the constancy of the measure: the extent to which an instrument of psychological measurement does not distort the result of a measurement through changes, fluctuations or variations in the instrument itself.


Reliability has two major components:

Internal consistency: refers to the degree to which different items, parts or pieces of a test measure the same thing. It means that the items consistently operate on the same psychological construct in an analogous way.

Temporal stability: refers to the degree to which a measuring instrument yields the same result across repeated measurements of an object or subject that has remained unchanged. A totally reliable test would be one that could place an individual on the scale without any error. In practice, however, no measuring instrument is totally reliable, not even those that measure physical characteristics. That is, if we measure the same object repeatedly with the same instrument, we get slightly different measures. Therefore, any observed score (X) is composed of the true score (V) plus the error committed (E), that is:

X = V + E

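In this way, we can define reliability as the proportion of the variance of the test scores that is true variance, which means that reliability decreases as the error variance increases:

rtt = S_V^2 / S_X^2 = 1 - S_E^2 / S_X^2

Here S_V^2, S_E^2 and S_X^2 are the variances of the true scores, the errors and the observed scores. When the errors are uncorrelated with the true scores, S_X^2 = S_V^2 + S_E^2.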
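As a rough illustration of reliability as the proportion of observed-score variance that is true-score variance, the sketch below uses invented numbers. In real testing the true scores and errors are never observed directly; the lists `true_scores` and `errors` are hypothetical and were chosen so that the errors are uncorrelated with the true scores.

```python
from statistics import pvariance

# Hypothetical true scores (V) and measurement errors (E) for four subjects.
# The values are invented and chosen so that V and E are uncorrelated.
true_scores = [10.0, 12.0, 14.0, 16.0]
errors = [1.0, -1.0, -1.0, 1.0]

# Each observed score is the true score plus the error: X = V + E.
observed = [v + e for v, e in zip(true_scores, errors)]

var_true = pvariance(true_scores)    # variance of the true scores
var_error = pvariance(errors)        # variance of the errors
var_observed = pvariance(observed)   # variance of the observed scores

# Reliability: share of the observed variance that is not error variance.
reliability = 1 - var_error / var_observed
print(round(reliability, 3))
```

With these numbers the observed variance equals the true variance plus the error variance, so computing `var_true / var_observed` gives the same value as `reliability`.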

Methods for measuring Reliability

The concept of reliability has been operationally defined in different ways:

Parallel-forms reliability

Test-retest reliability

Internal consistency reliability

Reliability between raters or evaluators

Method of parallel tests or parallel forms of a test

This method consists of:

Elaborating two parallel forms of the same test, that is, two parallel tests.

Applying one form of the test to the sample of interest and, after a period of time, applying the second form to the same sample.

Calculating the correlation coefficient between the empirical scores obtained by the subjects on the two occasions. If the forms are parallel, that correlation is the reliability coefficient of the test.
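The last step can be sketched in a few lines of Python. This is a minimal illustration assuming the Pearson product-moment correlation is the coefficient used; the function name `pearson_r` and the scores in `form_a` and `form_b` are hypothetical. The same calculation would serve for the test-retest method, with the two lists holding the scores from the first and second administration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sp_xy / sqrt(ss_x * ss_y)

# Invented scores of five subjects on form A and, later, on form B.
form_a = [12, 15, 9, 20, 17]
form_b = [11, 16, 10, 19, 18]

reliability_coefficient = pearson_r(form_a, form_b)
print(round(reliability_coefficient, 3))
```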

Test-retest method

It is suited to estimating the reliability of a test for which only one form is available. It consists of:

Administering the same test to the same sample of subjects on two occasions separated by a certain period of time.

Calculating the correlation coefficient between the scores obtained by the subjects on both occasions.

The method evaluates the stability of the results over a period of time. For this reason, the reliability coefficient obtained is called the coefficient of temporal stability. Regarding the time that must pass between the two administrations:

The shorter the interval, the greater the effect of memory for the answers given, of learning due to the test itself and, if the second measurement happens more or less immediately, of the fatigue produced by the test.

The longer the interval, the more likely it is that the subjects have actually changed in the variable of interest due to multiple permanent or circumstantial factors: learning, developmental changes, emotional experiences, illness, environmental and social conditions, etc.

For all these reasons, estimates obtained by the test-retest method are most appropriate for tests that measure traits that are little affected by practice effects and that are stable over the time interval elapsed, such as tests of perceptual speed, sensory discrimination, rapid verification of numerical calculations, etc.

Internal consistency of a test

In many situations it is not possible to carry out two applications of the test. The objective here is to establish to what extent one can generalize from the specific set of items to the domain or universe of content. One way to carry out this estimation is to assess the degree of consistency with which the examinees respond to the items or subsets of items in a single application of the test. When the subjects perform consistently across the different items, we say that the test has item homogeneity. For a group of items to be homogeneous, they must measure the same construct or the same content domain.

Method of the two halves (split-half method)

Using the Spearman-Brown correction formula

Administer the test to a sample of subjects only once.

Divide the test into two parts that have the same number of items and can be considered parallel.

Calculate the total score on each of these parts (it is common to compare the first half of the test with the second, or the odd items with the even ones), and then the correlation r between the two sets of scores.

Apply the Spearman-Brown correction for length to that correlation:

rxx = 2r / (1 + r)

This correction estimates the correlation that would have been obtained between the two parts if each had had the same number of items as the complete test.
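The split-half procedure can be sketched as follows. This is a minimal illustration assuming an odd-even split and the Pearson correlation between the two half scores; the function names and the item responses in `responses` (five subjects, six items scored 0 or 1) are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / sqrt(ss_x * ss_y)

def spearman_brown(r_halves):
    """Step up a half-test correlation to the full test length."""
    return 2 * r_halves / (1 + r_halves)

def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per subject."""
    # Items 1, 3, 5, ... form one half; items 2, 4, 6, ... the other.
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_totals, even_totals))

# Invented responses: each row is one subject, each column one item.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]

print(round(split_half_reliability(responses), 3))
```

Because the corrected value is always at least as large as a positive half-test correlation, the result should be read as an estimate for the full-length test, not for either half.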

Reliability among raters or evaluators

In unstructured tests, although not exclusively in them, it is necessary to determine whether two or more results obtained by two or more different evaluators, or by the same evaluator at different times, coincide. In these cases we speak of inter-judge reliability or intra-judge reliability, respectively.

It is calculated through an index of agreement between evaluators; the most widely used is the Kappa index:

K = (Po - Pc) / (1 - Pc)


Po = proportion of agreement observed (sum of the agreements reached in each category divided by the number of records)

Pc = proportion of agreement expected at random (sum of the probability of agreement by chance of each category).
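A minimal sketch of the Kappa index for the two-evaluator case (commonly known as Cohen's kappa) is given below. The function name `kappa_index` and the ratings are hypothetical; the chance agreement for each category is taken as the product of the two evaluators' proportions of records in that category.

```python
from collections import Counter

def kappa_index(ratings_a, ratings_b):
    """Kappa for two evaluators who each classified the same records."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Po: proportion of records on which the two evaluators agree.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Pc: sum over categories of the probability of agreeing by chance.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    p_c = sum(count_a[cat] * count_b[cat] for cat in count_a) / (n * n)
    if p_c == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_c) / (1 - p_c)

# Invented classifications of ten records by two evaluators.
rater_a = ["yes"] * 6 + ["no"] * 4
rater_b = ["yes"] * 5 + ["no"] * 4 + ["yes"]

print(round(kappa_index(rater_a, rater_b), 3))
```

In this invented example the evaluators agree on 8 of 10 records (Po = 0.8) and the agreement expected by chance is Pc = 0.52, so the index is well below the raw agreement rate.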


