Friday, January 30, 2009

Week 4: Measurement

Both quantitative and qualitative research are empirical methodologies involving "systematic research of contemporary phenomenon to answer questions and solve problems" (Morgan 25).

Quantitative design is experimental, with “randomized subject selection, treatment and control groups” (Goubil-Gambrell 584). According to Goubil-Gambrell, quantitative work encompasses both experimental and quasi-experimental writing research.

Quantitative research does the following:
• Quantifies key aspects
• Manipulates variables (the manipulation is called treatment)
• Measures and statistically analyzes
• Establishes cause and effect relationships
A quantitative study of user experience with online manuals might, for example, measure how much experience readers have with such manuals and how quickly they can solve a problem.

Qualitative design involves case or ethnographic study with representative subjects in natural settings (Goubil-Gambrell 587). The greatest strength of this type of research is its in-depth depiction of subjects in an actual setting. Morgan states that “qualitative research enables researchers to investigate the process of the problem situation or describe features of the problem” and adds descriptive to the types of case studies (27).

According to Goubil-Gambrell, this type of research does the following:
• Describes both nature and process
• Observes a specific situation
• Identifies key variables
• Frames questions
A qualitative study of user experience with online manuals might, for example, describe how readers used them, who used them, how often they were used, and whether or not people liked using them. Morgan lists six sources of data for qualitative researchers: documentation, archival records, interviews and surveys, direct observation, participant observation, and physical artifacts (29).

There are two requirements of measurement: reliability and validity. According to Lauer & Asher, validity and reliability influence each other, but while measurement can be reliable without being valid, it must be reliable to be valid.

Reliability, one of the main requirements of all measurement instruments, is the ability of independent observers or measurements to agree. It is largely socially constructed, resting on a collaborative interpretation of data influenced by researchers’ tacit knowledge. It is generally reported as a decimal fraction. The three types of reliability are equivalency, stability, and internal consistency. The kind of data (nominal or interval) governs the kind of analysis used to determine reliability.

In public education, the emphasis on state and federal testing of students has given me some familiarity with the stability needed to judge whether test results will hold across time with some accuracy. What stability cannot judge, though, is whether students will give a flip and actually try with the same intensity from one three-month MAP testing period to the next.

Validity is a measurement's ability to assess whatever it is intended to assess, and it rests on the “soundness of interpretation of the measurement system” (L&A 141). Writing tests are valid if they have “congruence with major components of writing behavior” (L&A 140). Four kinds of validity can be determined: content, concurrent, predictive, and construct. While the question of reliability can be definitively answered, the question of validity “can never be fully solved” because writing theories continually grow, change, and evolve (L&A 141).

The terms probability and significance run throughout the Williams article on statistics and research, with its percentages and endlessly plotted samples; in short, the article deals with the likelihood that something will or will not happen with any statistically definable result. Williams points out, for example, that a distribution can be viewed as a probability distribution rather than a frequency distribution. In fact, it seems much more useful to take the extra step (and extra work) of treating a distribution as a probability, and Williams later encourages us, when dealing with the null hypothesis, to use probability as the basis for our decision. It is apparently a sounder model.
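Williams's point about viewing a distribution through probability rather than frequency can be sketched quickly. The Python snippet below is a hypothetical illustration with invented scores: the same distribution is expressed first as counts, then as proportions that sum to 1.

```python
# Hypothetical illustration: a frequency distribution versus the same
# distribution expressed as probabilities. The scores are invented.
from collections import Counter

scores = [3, 4, 4, 5, 3, 4, 2, 5, 4, 3]

freq = Counter(scores)  # frequency distribution: raw counts per score
n = len(scores)
prob = {score: count / n for score, count in freq.items()}  # probabilities

print(freq)  # counts per score
print(prob)  # same shape, but as proportions that sum to 1
```

The extra step is just dividing by the total, but it is what lets the distribution answer "how likely?" rather than "how many?"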

Significance refers to whether a researcher can call the difference or relationship he or she is studying “statistically significant” (Williams 61). By convention, a result is “significant at the p<.05 level” when the calculated probability reaches the level at which the null hypothesis is rejected; in that sense the two terms work together (Williams 61).
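One way to see how probability grounds the p<.05 convention is a permutation test, sketched below in Python. The scores and group labels are invented; the logic simply asks how often shuffled data, in which the null hypothesis holds by construction, produces a difference as large as the one observed.

```python
# Hypothetical illustration of "significant at the p < .05 level" via a
# permutation test. Both groups of scores are invented.
import random

group_a = [78, 82, 85, 88, 90, 84]  # e.g., a treatment group
group_b = [70, 72, 68, 75, 71, 69]  # e.g., a control group

observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)

# Under the null hypothesis the group labels are interchangeable, so we
# shuffle the pooled scores many times and count how often a difference
# at least as large as the observed one appears by chance.
pooled = group_a + group_b
half = len(group_a)
random.seed(0)  # fixed seed so the sketch is reproducible
trials = 10_000
count = 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[:half]) / half - sum(pooled[half:]) / half
    if abs(diff) >= abs(observed):
        count += 1

p_value = count / trials
print(p_value < 0.05)  # with these invented scores the null hypothesis is rejected
```

The p-value here is just a probability read off a distribution of shuffled outcomes, which is exactly the sense in which significance and probability work together.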

And that’s my book report.


  1. I thought this was a very thorough synopsis of this week's readings. I especially liked the detail you went into describing the elements of and the differences between reliability and validity. I thought your point about reliability being largely socially constructed was a good insight. I think this is certainly how reliability was presented to us in these readings, but I wonder how a researcher from a statistical background would see that statement. When looking up 'reliability' online, I found a different frame on which to view it. This article described ways to test the reliability of a test by comparing it to later repeat research (stability), other people's research (equivalence), and by splitting the results in half and comparing how the two halves were measured (internal consistency). I'm sure that most people in our class see these as socially constructed, but other researchers may not. I think it all depends on what is being researched (qualitative vs. quantitative.)

  2. Excellent division of the 3 types of reliability, Bryan. Your comments provide a really slick way of explaining these constructs. Good job!

    Also, thanks, Wendy for adding the example about Federal research and how students change over time. Examples like that really help make this stuff stick.