Nursing Education and Process of Evaluation Step by Step: Types of Instruments Part-II
Types of Evaluation Instruments In Nursing Education
Selecting an Evaluation Instrument
After a model has been selected and
the variables to be evaluated and their relationship to each other have been
identified, the evaluator then selects evaluation instruments that can be used
most easily to obtain the necessary data. The selection of evaluation
instruments is determined by the evaluation question and the evaluation model.
Types of Instruments
Many instruments are available for
measurement and can be found by doing a literature review. To use a published
instrument, faculty must contact the publisher and obtain permission.
Questionnaire
A questionnaire is a method in which a person answers questions in writing on a form. The questionnaire is usually self-administered. The person reads the question and then answers as instructed. Questionnaires are cost effective but often lack substance. Questions must be clear, concise, and simple (Polit & Beck, 2013).
This
type of instrument is often used to measure qualitative variables, such as
feelings and attitudes. Questionnaires could be used to measure a student's
level of confidence in the clinical setting or to determine students'
satisfaction with the nursing program after graduation.
Interview
An interview involves direct contact with individuals participating in the evaluation. Exit interviews, for example, are often conducted as a faculty member leaves the school of nursing or as students graduate. Interviews can be used to elicit both qualitative and quantitative data. Interviews can be conducted individually or in focus groups. Students or external evaluators may be assigned to collect the data.
The interview should be scheduled at a time that is convenient for both the interviewer and the participant. The interviewer should provide a quiet, private room or office to allow the participant to speak in privacy. A participant may open up more if he or she feels that the conversation will be private and confidential.
An objective outline should be created and followed during the interview, and notes should be kept in a file. Great care must be taken to avoid personalizing the information. One negative aspect of interviews is that they are time intensive (Polit & Beck, 2013). Sanders and Sullins (2006) define the guidelines for interviews as follows:
1. Keep the language pitched to the level of the respondent.
2. Clearly explain the purpose of the interview, who has access to the recordings or transcripts, and how it will be kept confidential.
3. Encourage honesty, but let people know they can refuse to answer a question if they choose. Establish rapport by asking easy, impersonal questions first. Avoid long questions.
4. Avoid ambiguous wording. Avoid leading questions.
5. Limit questions to a single idea. Do not assume too much knowledge (Sanders & Sullins, 2006, p. 31).
Rating Scale
A rating scale is used to measure
an abstract concept on a descriptive continuum. The rating scale is designed to
increase objectivity in the evaluation process. Rating scales work well with
norm-referenced evaluation, although they are not the best tools to use for
this type of evaluation. Grades can easily be assigned to the ratings.
Checklist
A checklist is two-dimensional in that the expected behavior or competence is listed on one side and the degree to which this behavior meets the level of expectation is listed on the other side. With a detailed checklist of items and well-defined criteria being measured, the evaluator can easily identify expected behavior or acceptable competence.
This type of instrument is useful for formative and summative
evaluations. A checklist can be used to evaluate a student's performance of
clinical procedures. The steps to be followed can be placed in sequential order
and the observer can then check off each action that is taken or not taken.
Attitude Scale
An attitude scale measures how the participant (usually a student) feels about a subject at the moment when he or she answers the question. Several popular types of attitude scales are used in nursing education evaluation. The most popular is the Likert scale. In a Likert scale, several items in the form of statements (10 to 15 statements are recommended) are used to express an opinion on a particular issue.
Each item represents a construct of that issue. For example, a particular item may express an opinion about Latino students in nursing when the theme of the survey is diversity. Participants are asked to indicate the degree to which they agree or disagree with each statement. Equal numbers of positively and negatively worded items should be used to prevent bias in the responses.
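When positively and negatively worded items are mixed, the negatively worded items are usually reverse-coded before any scoring so that a higher number always points in the same direction. The following is a minimal Python sketch of that step. The 1-to-5 coding, the category labels (SD, D, N, A, SA), and the function names are assumptions made for illustration and are not taken from the source.

```python
# Hypothetical 5-point coding (an assumption for illustration):
# SD = strongly disagree ... SA = strongly agree.
CODES = {"SD": 1, "D": 2, "N": 3, "A": 4, "SA": 5}


def score_item(response: str, negatively_worded: bool = False) -> int:
    """Return the numeric code, flipped for negatively worded items."""
    value = CODES[response]
    if negatively_worded:
        # Reverse-coding on a 1-5 scale: 5 becomes 1, 4 becomes 2, and so on.
        return max(CODES.values()) + min(CODES.values()) - value
    return value


def score_respondent(responses, negative_flags):
    """Score one respondent's answers, item by item."""
    return [score_item(r, neg) for r, neg in zip(responses, negative_flags)]


# Items 2 and 4 are negatively worded in this made-up survey.
responses = ["SA", "SD", "A", "D"]
negative_flags = [False, True, False, True]
scored = score_respondent(responses, negative_flags)
print(scored)
```

After reverse-coding, strong disagreement with a negatively worded statement counts the same as strong agreement with a positively worded one.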
Semantic differential is another scale used to measure attitude. Bipolar scales are used to measure the reaction of the participant. Each item on the scale is followed by bipolar adjectives such as good–bad, active–passive, or positive–negative. The number of intervals between each adjective is usually odd so that the middle interval is neutral.
A list of five to seven intervals is sufficient. Analysis is performed by adding values for each item, which is similar to what is done with the Likert scale (Polit & Beck, 2013). For analysis of Likert scale data and semantic differential scale data, it is recommended to refrain from treating the data as interval data and to use the Rasch model for analysis.
By applying the Rasch model, a more appropriate analysis of the tool and data is accomplished. Typically, Likert data are treated as interval data even though the individual responses are scaled as ordinal. As Bond and Fox (2007) illustrate:
Five endorsements of the SA code of a Likert-type scale (SD D N A SA) by a respondent result in a satisfaction score of 25, five times the amount of satisfaction indicated by the respondent who endorses the five SD categories (5 × 1 = 5), or almost exactly twice the satisfaction of someone who endorses two N's and three D's (2 × 3 + 3 × 2 = 12).
Whenever scores are added in this manner, the ratio, or at least the interval nature of the data, is being presumed. That is, the relative value of each response category across all items is treated as being the same, and the unit increases across the rating scale are given equal value. The data are subsequently analyzed in a rigidly prescriptive and inappropriate statistical way.
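The summed-score arithmetic behind this illustration can be reproduced in a few lines of Python. This is only a sketch under the usual assumption that the categories are coded SD = 1, D = 2, N = 3, A = 4, SA = 5. The function name is hypothetical, and the mixed pattern shown (two N and three D responses) is one that sums to 12 under that coding.

```python
# Assumed coding of the five Likert categories (SD = 1 ... SA = 5).
CODES = {"SD": 1, "D": 2, "N": 3, "A": 4, "SA": 5}


def summed_score(responses):
    """Add up the numeric codes, as the 'then add them up' approach does."""
    return sum(CODES[r] for r in responses)


all_sa = summed_score(["SA"] * 5)                 # 5 x 5
all_sd = summed_score(["SD"] * 5)                 # 5 x 1
mixed = summed_score(["N", "N", "D", "D", "D"])   # 2 x 3 + 3 x 2

print(all_sa, all_sd, mixed)
# Ratios like these are meaningful only if the codes are interval or ratio data.
print(all_sa / all_sd, all_sa / mixed)
```

The code treats the distance from SD to D as identical to the distance from A to SA, which is exactly the presumption the passage questions.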
The Rasch model offers a more mathematically justifiable treatment of Likert scale data than the “code the categories 1 to 5, then add them up” approach. Rasch recognizes the coding as ordered categories only, in which each category represents a higher value than the previous category but by an unspecified amount.
Likert-type scale data need to be regarded as ordinal data, whereas the Rasch model transforms the counts of items into interval scales based on empirical evidence as opposed to an assumption. The empirical evidence is calculated using log transformations of raw data odds and abstraction is accomplished through probabilistic equations.
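As a rough illustration of the log transformation of odds, the Python sketch below converts raw proportions into logits and evaluates the simplest (dichotomous) form of the Rasch probability equation. It is a sketch only: the function names and the sample proportions are invented for illustration, and a real analysis of rating scale data would rely on dedicated Rasch software and a polytomous extension of the model rather than this two-category form.

```python
import math


def log_odds(p: float) -> float:
    """Natural log of the odds p / (1 - p), for a proportion 0 < p < 1."""
    return math.log(p / (1.0 - p))


def rasch_probability(theta: float, b: float) -> float:
    """Dichotomous Rasch model: probability of endorsing an item,
    given person measure theta and item difficulty b (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Made-up raw proportions, spaced equally (0.20 apart).
raw_proportions = [0.50, 0.70, 0.90]
logits = [log_odds(p) for p in raw_proportions]

for p, logit in zip(raw_proportions, logits):
    print(f"proportion {p:.2f} -> {logit:.3f} logits")

# When person measure equals item difficulty, endorsement is a coin flip.
print(rasch_probability(theta=1.0, b=1.0))
```

Equal steps in the raw proportions do not produce equal steps in logits, which is one way to see why raw ordinal scores should not be assumed to be interval data.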
The Rasch model is “the only model that provides the necessary objectivity for the construction of a scale that is separable from the distribution of the attributes in the persons it measures” (Bond & Fox, 2007, p. 7).
The conceptual understanding of the Rasch model is best described as a model created within item response theory. Item response theory, as explained by Rudner (2001), uses test scores and specific test item scores based on assumptions concerning the mathematical relationship between abilities or attitudes and item responses.
The Rasch model has the ability, through several
diagnostic procedures, to diagnose the tool's ability to accurately measure the
author's and respondent's intentions. The design of a rating scale has a
tremendous influence on the quality of responses provided by the respondents.
Diagnostic ability provides a powerful tool for designing, analyzing,
and revising attitude scales.