UNIVERSITY OF CALIFORNIA SANTA CRUZ
PHYSICS EDUCATION RESEARCH: SUMMATION and APPLICATION
A thesis submitted in partial satisfaction of the requirements for the degree of
MASTER OF SCIENCE
in
PHYSICS
by
Michael Eric Burnside
September 2002
The Thesis of Michael Eric Burnside is approved:
________________________ Professor Fred Kuttner, Chair
________________________ Professor Bruce Rosenblum
________________________ Professor Joshua Deutsch
_______________________________________ Frank Talamantes Vice Provost and Dean of Graduate Studies
Copyright © by Michael Eric Burnside 2002
Table of Contents
List of Figures
The following list of six figures is complete. The odd numbering is a deliberate attempt at parallelism with the original papers.
List of Tables
The following list of twenty-one tables is complete. The odd numbering is a deliberate attempt at parallelism with the original papers.
ABSTRACT
PHYSICS EDUCATION RESEARCH: SUMMATION and APPLICATION
Michael Eric Burnside
This work summarizes sixty-two PER-related articles and papers. Most of these articles and papers were published in either The American Journal of Physics or The Physics Teacher from 1992 through 2002. The FCI, MB and MPEX were given to various Cabrillo Community College physics classes, and the resulting data are analyzed. This work is presented in three parts: a short introductory overview paper, a medium-length data section, and a long, individualistic, book-review-style thesis.
Dedication
to my father, Don Burnside and my mother, Joyce Heigel. -- I love you, thank you.
Acknowledgements
I wish to acknowledge everyone and the kitchen sink: To Fred Kuttner -- it's been a while since I was so complimented. To Steve Orr -- it's been a while since I received such kindness. To my reading committee, thank you for your signatures.
To my typing committee, I needed you, thank you from the bottom of my heart. To my editing committee, thank you for looking at this with new eyes. To the UCSC graduate students of 2002, thank you for your attendance at my orals. To the UCSC undergrads, no graduate student ever had such wonderful guinea pigs. To the UCSC professors, I enjoyed my classes and am glad you admitted me. To Cabrillo Community College teachers and students, thank you very much for helping me. To my sister -- I enjoyed and needed your visit. And to the kitchen sink -- long may you work.
Part I : The Paper
CHAPTER 01 : Introduction to the Paper
Teaching is an honor, an obligation and a joy. You, Hestenes Oblivious, have both benefited from good teachers and been harmed by bad teachers. It is society's need and my hope that you choose to strive for excellence in your duty as a teacher. Physics Education Research (PER) can help ease and shorten the path between where you are and where you want to be, as an effective, beneficial instructor. PER has many components and details. For the teacher, the fundamental benefit of PER is access to the knowledge of fellow teachers and researchers. PER battles the isolation of the solitary teacher. If you have knowledge to advance the art of teaching, share it. In turn, do not ignore that which others have offered you. Their years of sweat and toil are yours for the reading of a paper.
I say art of teaching. Yet one of the fundamental purposes of PER is to turn this art into a science. McDermott argues that PER is both an empirical science and a fitting subject for research by physics faculty. Hammer states that PER is not yet a science, even though that is the goal. Hestenes informs us that while inspired teaching may be a nontransferable gift, good teaching is an acquirable skill. So, art or science, the distinction is material only to certain mindsets. Other mindsets enjoy the act of trying to improve the physics knowledge and ability of others.
To improve others requires many skills on the part of a teacher. The first skill is the ability to understand and do physics oneself. Trivial as the statement seems, it is not ignorable. A case in point is a co-worker's mother who teaches Spanish in a Georgia high school. This is the mother's first year; she did not speak, read or write Spanish at the beginning of the year. While I have not assessed her students' Spanish skills, I am incredulous that she was even hired. The second skill is a real understanding of who you're teaching.
While social skills and interactions will turn out to be important, even more critical is knowledge about your students' current beliefs on physical phenomena. Researchers state that students will not believe and use the physics you provide them until their previous conflicting beliefs are explicitly proven wrong. Thus, at absolute minimum, you must know the predominant beliefs held by your students, so that you know what to prove wrong. You are not filling a vacuum.
The third skill is mastery of both the big and small arts of connecting a known physical belief to a known physical person. One example of a big art is constructivism. This is an educational theory whose basic premise is that students have to construct their own knowledge. Teachers cannot give knowledge to students; teachers can only create an environment in which students have an improved opportunity to succeed. Entire books have been written on this theory.
One example of a small art is the definitions of "negatively charged" and "neutral" in the context of electromagnetism. Between five and twenty percent of students believe that these two definitions are synonymous. The students reason that positive means yes and negative means no; so, negatively charged means no charge. They also reason that neutral means not charged. So it's pretty obvious that something which has no charge is not charged. Ipso facto -- negatively charged materials are neutral. This little gem was in a Letter to the Editor of the American Journal of Physics (Am. J. Phys.). Its author points out that this gem was actually difficult to discover, as students can quote correct physics and still not be able to perform correct physics. This in turn raises the issues of Language, Student Interviews, and Interpretation; all of which I will leave to the body of the paper.
Here in the Introduction, I simply want to welcome you to a world that can supply useful knowledge and insights, even if it invites you to raise an eyebrow from time to time. PER is young and still quite messy. Chemistry was born out of Alchemy, Astronomy out of Astrology. Biology is only now coming into its own, with DNA. Education is on this path, with interesting work being done in brain research, made possible in large part by MRIs. Still, I side with Hammer: PER is not yet a science, but then neither are large chunks of the medical profession. If it's beneficial, use it.
CHAPTER 02 : Why Teach Physics?
PER focuses on the student and how he learns. Thus, there is a lot of research on finding recurring, prevalent student misconceptions and methods to replace these misconceptions with accepted physics beliefs. PER stretches rather farther afield, in search of a general theoretical framework in which to place the experimentally discovered PER facts. PER even briefly touches on why teach physics in the first place, and that is where we'll start our odyssey.
There are basically two views on why to teach physics. One is job/business oriented. The other is a bit more idealistic. Four authors speak to the reasons we should teach physics. McCullough argues that the underrepresentation of women in physics is unfortunate because to sustain our technological civilization, every one of our future workers must be prepared in science, engineering, and mathematics.01 Heuvelen states that the desired educational outcomes are drawn from two sources: Bloom's Taxonomy and three recent workplace studies. These studies are 1) Shaping the Future, by the National Science Foundation, 2) the ABET Engineering Criteria 2000, and 3) The American Institute of Physics Survey. Heuvelen's basic goal is to prepare physics students for a workplace in which 82% of B.S. physics graduates have final careers in industry and government, and for a workplace in which physics knowledge is the least used skill listed in the survey.02 Redish broadens this, arguing that society has a great need not only for a few technically trained people but for a large group of individuals who understand science.03
Goodstein takes a more idealistic approach. He argues that we need a revaluation in how we do our jobs. Goodstein states that we physicists understand in a very large measure, how the world works, and that to live in ignorance of [this] understanding should be intolerable. Goodstein believes that the undergraduate physics major is the liberal arts education of the 21st century.
He provides some history, noting that educating the masses, not just the elites, started in the early 1900s with American compulsory attendance laws that Europeans of the age found fantastic and ridiculous. Goodstein also speaks against our current focus on producing Ph.D.s, primarily because exponential growth in the Ph.D. job market is gone forever.04
Why we teach physics is more than an academic issue. Why we teach largely dictates whom, and how many, we teach. These in turn strongly influence how and what we teach. We should next address -- What is PER?
CHAPTER 03 : PER, What Is It?
Three answers, provided by three of the more prominent researchers in the field, follow.
McDermott states that PER is an empirical applied science and that it should be conducted by science faculty within science departments. PER is Physics Education Research, and for McDermott:
Research on the learning and teaching of physics is essential for cumulative improvement in physics instruction. Pursuing this goal through systematic research is efficient and greatly increases the likelihood that innovations will be effective beyond a particular instructor or institutional setting. The perspective taken is that teaching is a science as well as an art. Research conducted by physicists who are actively engaged in teaching can be the key to setting high (yet realistic) standards, to helping students meet expectations, and to assessing the extent to which real learning takes place.05
Hestenes joins McDermott in an oblique way and adds a hint that PER isn't always viewed positively:
PER is a credible discipline with a body of reliable empirical evidence, clarified research issues, and able researchers. It is a serious program that applies to our teaching the same scientific standards we use in physics research. Unfortunately most of our colleagues are oblivious and some who aren't are contemptuous.06
Hammer does not believe that PER is a science but does believe that it has developed compelling evidence to discredit traditional methods and convictions. Hammer states that:
While PER is not yet an applied science it does provide perspectives that expand, refine and support instructors' perceptions and judgments. PER helps expand the instructor's concentration to include not only physics content but also how the students interact with that context.07
So, rather than play "let's define science," let us agree that there is valuable material in this body of work, and go find it. Most teachers are like engineers. They'd like to know what the common student misconceptions are and what methods other teachers have used to replace these misconceptions with currently accepted physics belief. Later, perhaps, an interest in what tools were used to find these misconceptions in the first place, or to find that they had been successfully replaced by real physics, comes into focus. Swiftly following is an interest in determining just how valid these tools, upon which so much rests, are.
Prior to all this is epistemology. After all, it's so much simpler just to tell a stranger the physics truth and rely upon his memory to replicate the truth you've thoughtfully gifted him with. Perhaps a bit of homework for practice to bring him up to speed and a few labs to make it real, and we're done. Other than convincing ourselves that the fellow has an adequate memory, what's the problem?
In other words, why do we even care that the student believed anything prior to our telling him the truth? Much less, why take the time and energy to find out what he believed? Individualized instruction in a 200+ student introductory calculus-based physics sequence is hardly conceivable, much less implementable. So, before we even consider what the misconceptions are, or how we are supposed to address them in our teaching, let us try to address why we care about their existence.
CHAPTER 04 : Epistemology
At a yard sale this weekend, I saw a book on constructivism. I opened it up and noted that we are now post-epistemological. I closed the book. The owner was an elementary teacher getting her Master's degree in Education. I did not buy the book. What's the point of this story? -- That the definition of epistemology is author-dependent. It is, at bottom, all the non-physics, imported-from-other-departments, ideas and theories. Epistemology mostly originates from Education, Philosophy, and Psychology Departments and is included as a beginning paragraph in most PER published papers. Unfortunately these paragraphs manage to both repeat themselves and differ. The following short paragraphs are an attempt to hit the high points only.
We commence with the views of Reif, who among other distinctions is a Professor of both Physics and Psychology. In fairness to Reif, I would like to point out that Paper 44 in Chapter 30 of Part III gives a much better presentation of his views, as they are verbose and example-dependent. Reif's central instructional goal is to help students acquire a modest amount of basic knowledge which they can flexibly use. Flexible utility is paramount because science is the ability to predict or explain diverse phenomena using a small amount of basic knowledge and because the learned knowledge must retain its usefulness in a complex and rapidly changing world. The cognitive abilities required to ensure scientific knowledge can be flexibly used include: interpretation, description, organization, analysis, construction of solutions, and checking these solutions. Unfortunately, the most common method of teaching problem solving is that of example and practice. This process is flawed to the point that Reif labels it as unwise. He goes on and offers a heuristic strategy that is far more effective.08
Next up is Heuvelen. He presents Bloom's Taxonomy: Knowledge, Comprehension, Application, Analysis, Synthesis, and Evaluation.
Heuvelen asserts that humans are pattern-recognition animals who try to match their new experiences to previous events. He refers to studies in Linguistics that emphasize: the need for referents, the requirement for multiple exposures, the benefits of multiple representations, the helpfulness of interactive simulations, and the critical importance of starting early. Brain research has shown how detrimental aging is to the development of new synapses and how ingrained old patterns become. Active learning is important because studies show that we remember only 3% of what we hear. Effective methods of inquiry are based on student observation and modeling of real phenomena.02
Kalman et al. have many references to learning theory and philosophy. The authors present Posner's learning framework for conceptual change. Emphasis is placed on two points: First, students must know of problems with their personal scientific conceptions, usually via curriculum-induced conceptual conflict. Second, and of equal importance, the student must not compartmentalize his knowledge. People can hold contradictory beliefs. Replacement, not simple assimilation, is the teaching goal. Kalman et al. state that there are two methods of problem solving -- Template and Paradigms. The key difference is that students who compartmentalize knowledge and apply different templates to different knowledge subsets lack the ability to apply principles garnered from a problem to an apparently different problem. Furthermore, even if problem solving methods change, knowledge acquisition methods are likely to remain compartmentalized unless critical thinking skills are developed. The authors conclude with the following observations. Students often hold views different or alternative to those that they will be taught in their courses. The students will not easily relinquish their original viewpoints because their viewpoints explain observations and required effort to construct.
Conceptual change requires students to critically examine their view of the world. For change to occur students must make value judgments, rate ideas, and accept or reject material based on standards. Producing change thus requires Evaluation, the highest ability in Bloom's taxonomy.10
Bao and Redish, in constructing a model of student knowledge, appeal to neuroscience, cognitive science, and education research. The agreed upon core elements are: (1) memory is associative, (2) cognitive responses are productive, and (3) cognitive responses are context dependent (inclusive of the student's state of mind). As this is insufficient, the authors also advance several structures proposed by researchers: (A) patterns of associations (neural nets), (B) primitives / facets, (C) schemas, (D) mental models, and (E) physical models. The authors define these terms in their paper. Some of the definitions are rather involved.11
Galili and Hazan present a full page of theoretical background on the structure of knowledge. They present Mach, Bruner, Piaget, diSessa, and Minstrell. The first two of these argue that people are unable to grasp, remember, or manipulate a huge amount of complex content without knowledge of structure. Piaget is the founder of constructivist theory, which pursues picturing human cognition by its elements related in schemata. diSessa argues for the existence of stable cognitive constructs spontaneously created in the form of fundamental self-explanatory patterns (p-prims). Minstrell proposes facets-of-knowledge as the means by which students understand particular physical settings. Given the great versatility of naive conceptions, instruction should aim at the essence, the learner's schema.
The authors also advocate presenting unsuccessful attempts at physics conceptual development that nevertheless helped to attain present scientific knowledge; these would show students a realistic picture of the complex transformation of knowledge from old to new.12
Hammer argues, in a later paper, that "misconceptions" is a misnomer. Here he talks about epistemological beliefs, misconceptions, and inquiry practices:
Epistemological beliefs influence how students reason in a physics class. Students' beliefs about the course and the knowledge and reasoning it will entail impact student actions. Some students believe that understanding in physics means being familiar with a collection of facts and formulas, that the formalism of physics is only loosely associated with everyday experiences, and that learning physics means memorizing information supplied by the professor or textbook. Other students believe understanding physics means developing a sense of its underlying principles and coherence, that formalism does represent everyday experiences, and that learning physics is applying and modifying one's own understanding. Thus, in an extended student debate, the instructor often faces the dilemma of dealing with physics context misconceptions and appropriate but nascent epistemological beliefs. Jumping in to fix a misconception can all too often reinforce the idea that all truth comes from the instructor; to not jump in risks a further hardening of a false belief.
Misconceptions are strongly held, stable cognitive structures that differ from expert conceptions. Misconceptions fundamentally affect student understanding of science and must be overcome for students to achieve expert understanding. This is in contrast to the idea that students are simply ignorant. For the instructor to simply transfer information is ineffectual. Misconceptions must be overcome before expert opinion will be accepted by the student who finds his current misconception both reasonable and useful. This overcoming is based on a process of drawing out explicit statements of the misconception by the students, confuting the misconception with arguments and evidence, and then promoting new, more appropriate conceptions.
Inquiry Practices stands the traditional view on its head, arguing that social participation in the scientific community is a requirement to build individual knowledge and ability. By this view, students and physicists participate in socially constructed, situated practices; scientific knowledge and practices are collective constructs of the scientific community. Fundamentally, learning physics means becoming a member, and adopting the practices, of the community of physicists. These practices can be quirky and arbitrary as seen from the outside. In the class discussion, Harry responded to Amelia's concern that space had gas in it that would slow down a moving ball. His response, "we're talking about ideal space," was meant and received as a joke, with the participants laughing. Assuming ideal conditions is not natural or routine ... There was much discussion about whether assuming no friction on an earthbound surface also meant that gravity had to be turned off. And of course from a thermodynamic point of view, a physicist would have to agree. In Newtonian force problems this is dismissed under the rubric "ideal." Under Inquiry Practices, instructor intervention would be establishing certain social practices rather than primarily directed at individual knowledge and abilities.07
Epistemology is threatening to overwhelm my paper. I will refer you to Part III. In Chapter 27, Paper 30, Redish holds forth about the need for a general framework, noting that collecting data into a wizard's book of everything that happens is not science. He speaks at length about four principles: 1) Building Patterns, the Construction Principle; 2) Building on a Mental Model, the Assimilation Principle; 3) Changing an Existing Mental Model, the Accommodation Principle; and 4) the Individuality Principle.03 Hammer returns in Paper 34 of Part III. His essential point is that mental phenomena are attributed to the action of many agents acting in parallel, sometimes coherently, sometimes not.
This is contrasted to a misconception, which is a single nonconforming cognitive unit. He also raises the ideas of Anchoring Conceptions and Bridging Analogies.13 In Paper 35, Elby asserts that epistemological sophistication is valuable for students. His paper shows instructional practices and curricular elements explicitly intended to further epistemological development.14 In Paper 32, Redish et al. present the Maryland Physics Expectation (MPEX) survey and some results. The MPEX probes student expectations about the process of learning physics and the structure of physics knowledge. What students expect will happen in their physics course plays a critical role in how they respond to the course. Their expectations play a role in what the student pays attention to and what he chooses to ignore. It is a factor in the student's selection of activity by which he will construct his own knowledge base.15 And finally, the authors of Paper 43 remind us that: A lifetime of experiences pushing boxes and riding in cars is not dismissed for the sake of a memorized equation, even if we tell students explicitly to do so.16
The upcoming chapter is much more concrete than this one, but it still focuses on non-physics issues affecting the teaching of physics.
CHAPTER 05 : Concrete Small Issues
This chapter is a potpourri of small issues, some interesting, some merely important.
Students seek efficiency -- the achievement of a satisfactory grade with the least possible effort -- at a severe unnoticed penalty on how much they learn.15 If students are satisfied they will work; however, if they become discouraged, students will engage in intellectual damage control and minimize their amount of work.17 The result of instruction is to substantially lower the effort a student perceives as necessary for success in physics.15 Students play the game and distort their behavior to enhance their grades at the cost of achieving deep understanding of physics. This despite there actually being no correlation between the amount of distortion and grades.18
Inquiring skills are needed in part because students do not have enough experience with everyday phenomena to tie the concrete experience to the scientific explanation.19 Many students are not able to identify and evaluate different points of view. They merely repeat themselves when asked to explain or defend their statements.07 People have different attention spans and different abilities to discriminate between important and peripheral issues.22
Effective teaching, as measured by student learning (as distinct from enthusiasm), is not tightly linked to student evaluations of the teacher, the course, or their own learning.05 The correlation of student opinion to the real success of a course is dubious.16 Student evaluations are the end result of a chaotic system where small initial perturbations can lead to widely divergent results.20
Qualitative data results highlight the critical importance of socialization in the classroom. Field notes and videotape make it quite apparent that the same students, in the same room, working in the same groups, respond differently to different teachers.
Student-faculty interactions depend greatly on the personalities involved.21 Attitude, not intelligence or mathematical competence, is the prime cause of greater achievement.23 Student-student socialization is central to the success of students. No social barriers based on race or gender were found after careful scrutiny.21 While success in physics knows no gender or racial boundaries ... there are intrinsic differences in students that ... enable some to succeed with little effort and others to fail even after considerable effort.20
We need to start teaching physics to children. To do so has three benefits: 1) the layer of prejudice to penetrate is thinner, 2) social pressures have not yet taught children that physics is understandable only by geniuses, and 3):
The most important reason is that one finds radically new physical ideas springing from the flexible minds of theoretical physicists at the commencement of their professional careers -- in their twenties. This flexibility is a hallmark of a young mind and is increasingly difficult to retain as age advances. Yet the young mind must already know enough physics to appreciate what the outstanding problems are. Gaining this essential background early widens the window of having both knowledge and flexibility. As age will close the window, leaving knowledge without flexibility, all too soon.25
The key to starting early is the realization that as a child, the physics student had already mastered a very abstract language and his ability to acquire new languages decreased with age. In fact there is a strong drop off in language acquisition ability at about ten years of age.02 At an early age, the focus should be on forming intellectual resources such as "closer means stronger" (conceptual) or "I see it" (epistemological). This formation may of necessity come prior to alignment, with early science education mostly being "messing about."
Science cannot end there, but "Messing About" is a better beginning than "Remember the Magic Word."13
Storytelling is an effective method of teaching children, and one that dovetails nicely with educational theories in which children are labeled concrete thinkers. Mr. Tompkins is a children's book written by George Gamow, a founder of Big Bang cosmology. This 1965 book was an immediate best-seller, popular with both the public and the professional scientists.25
The priorities of any large physics department: Faculty research comes first, followed by Ph.D. students, then Masters students, upper-level undergraduate majors and courses, introductory courses for majors, introductory courses for other scientists and engineers, and finally lowest of the low and often entirely absent, physics for non-scientists. Non-students are entirely absent. This priority list must be stood on its head. A fundamental problem of our times is the scientific illiteracy of the general population. This illiteracy threatens the foundation of our industrialized and democratic society. Physics departments should make it a top priority that 50%-75% of all non-science undergraduates take a science-literacy physics course oriented toward scientific methodology and the connections between physics and society.26 One example of this step being taken is LiPreste's appreciation course in modern physics. For a class useless to degree requirements, twenty community college students showed up out of pure interest. The texts were Isaac Asimov's Atom and Brian Greene's The Elegant Universe.27
The Interactive Engagement teaching method known as HPS (Paper 17 in Part III) exposes ideas and subjective perspectives in science, which humanizes science education and makes science appealing to a wider variety of minds.12
While aimed at a specific audience, The Physics Teacher has a wide range of papers that non-physicists would find interesting. For example, the May 2001 issue has an article on Surf Physics with a picture of Mickey Muñoz actually hanging ten.
The issue also has an article on Marloye's Harp and the Thumb Piano, detailing 19th century musical instruments that rely on the longitudinal vibration of solid rods. It even has a cover photo taken here in Santa Cruz.
The upcoming chapter discusses the PER tools used for assessment of both students and curricula.
CHAPTER 06 : Assessment Tools
The assessment tool of choice for both students and curricula in the PER literature is the FCI (Force Concept Inventory). The FCI is followed in prevalence by the MB (Mechanics Baseline Test), the FMCE (Force and Motion Conceptual Evaluation), the CSEM (Conceptual Survey in Electricity and Magnetism), and the TCE (Thermal Concept Evaluation). Student interviews are used for assessment, primarily by McDermott and by authors during their initial development of the above tests. Taped classroom activities are rare but do appear in the literature, notably by Hammer. The MD (Mechanics Diagnostic), <g> the average normalized gain, and concentration analysis will be briefly mentioned in this chapter. For more depth on these subjects the reader is referred to Chapter 21 in Part III.
The FCI assesses the student's overall grasp of the Newtonian concept of force. The reason the FCI is not just another physics test is that the wrong choices are correlated to specific misconceptions. These misconceptions are important as they must be overcome and replaced by Newtonian thinking before the student is asked to continue in his physics education. The errors on the FCI are more informative than the correct choices. These "errors" are common sense misconceptions which are reasonable hypotheses grounded in everyday experience. The FCI can be used: (1) as a diagnostic tool, (2) for evaluating instruction, (3) as a placement exam for advanced college courses.23
The reform movement has used FCI data as compelling evidence that there are serious problems with physics instruction. The FCI is far from the only such evidence. There is a huge PER literature on student misconceptions which supports the same conclusion. Lillian McDermott has documented the huge gap between what teachers think they are teaching and what students are actually learning, by methods other than FCI use. FCI questions deliberately avoid the technical, precise, unambiguous language of physics.
Too often students respond to the form of the technical language rather than its meaning. For example, in a survey 80% of students could state Newton's 3rd Law even though only 15% fully understood it as measured by the FCI. Validation interviews confirm that Newtonian thinkers are able to resolve the consequent imprecision and ambiguities arising from the avoidance of the technical language. To the extent that students have not mastered the material in the FCI, they will systematically misinterpret what they hear and read in their physics courses; they will treat the technical language of physics as muddled jargon; and they will be forced to resort to rote methods of learning and problem solving.06
The FCI motivates change in curriculum because it is the tool used to judge curricula, not merely students.29 The FCI is seen by one author as contributing to a negative classroom climate by sending subtle cues that science is not a woman's field.01 In examining large populations, student choice among FCI distractors contains information as valuable as the grosser distinction between correct and incorrect that has been the focus of most research.11
Hake constructs an average normalized gain, <g>, using a combination of pre-instruction and post-instruction FCI scores. He uses <g> to compare physics instruction methods and finds that the present interactive engagement courses are, on average, more than twice as effective in building basic concepts as traditional courses.30
The MB is a universal, basic assessor of student understanding of mechanics concepts. There exists extensive data on post-instruction scores which allows for evaluation and comparison of instructional effectiveness. The main intent of the MB is to assess qualitative understanding, although it looks like a conventional quantitative test. Unlike the FCI, the distractors are not common sense alternatives although they do include typical student mistakes.
Problems that can be solved by plugging numbers into a formula were excluded. Formal training in mechanics is required, and so the MB is most often given as a post-test.31 The FMCE, CSEM, and TCE are all used less than the FCI and MB. The FMCE is a newer test intended to replace the decade-old (1992) FCI, partly in response to concerns over FCI test security.32 However, the FCI continues to predominate. The CSEM was intended to be the FCI of electromagnetism (the FCI addresses only the Newtonian concept of force). There are, however, rather substantial differences. First, the CSEM distractors are not explicitly matched to misconceptions, and second, the CSEM relies on domains (force, motion, energy) in addition to the one it is testing (electromagnetism).33 The TCE is almost the FCI of Thermodynamics. It matches misconception to problem number but not to a specific distractor choice (A, B, C, D or E). Oddly, the TCE paper doesn't specify the correct answers to the TCE questions. As a side note, the MD is an old test and is no longer used as it was supplanted by the FCI.23 Its enduring claim for attention is that the MB was the last PER published test that incorporated an explicit math component in defining physics competency.34 Concentration analysis I'll leave to Part III, Paper 08, saving only this quote: If a multiple-choice question is designed with ... naive mental models as distractors, then the distribution of student responses yields information on the students' state. The student with a strong naive belief will pick multiple wrong answers that are based on that belief. 
Students who simply lack knowledge will choose distractors reflecting no unifying mental model.11 The best way to confirm, correct, or find out what students really think is to conduct repeated, detailed, taped, and transcribed interviews with individual students.15 In our first look at student activities, it is very important to consider them one at a time and to interview them in depth, giving them substantial opportunity for thinking aloud and not giving them any guidance at all. Very valuable initial studies in PER are done with small sample sizes; only later, when seeking to determine the prevalence of a misconception, are large sample sizes and written tests appropriate.35 The greatest insight is through interviews where the students give their reasons for their choices on the [FCI]. Interviews are time consuming and not continuously necessary as the determined misconceptions are universal.23 Interviews also provide non-physics insights such as: it's most beneficial when inquiry sessions follow lecture,36 and cooperative grouping does have negatives for some women students (domineering partners, fears that their partners didn't respect them, and feelings that their partners understood far more than they).19 Papers illustrate incorrect reasoning with student quotes from interviews,37 or from video excerpts of real instruction.07
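As a concrete illustration of two quantities mentioned in this chapter, here is a minimal Python sketch; it is not taken from any of the cited papers. It assumes Hake's commonly stated definition of the average normalized gain, <g> = (post% - pre%) / (100 - pre%), computed from class-average percentage scores, and Bao and Redish's concentration factor, C = [sqrt(m) / (sqrt(m) - 1)] * [sqrt(n_1^2 + ... + n_m^2) / N - 1 / sqrt(m)], for one question with m choices answered by N students. The function names and all sample numbers are hypothetical.

```python
import math


def normalized_gain(pre_percent, post_percent):
    """Average normalized gain <g> from class-average scores given in percent.

    Assumed definition: the actual gain divided by the maximum possible gain.
    """
    if not 0.0 <= pre_percent < 100.0:
        raise ValueError("pre_percent must be in the range [0, 100)")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


def concentration_factor(counts):
    """Concentration factor C for a single multiple-choice question.

    counts[i] is the number of students who selected choice i.
    C is near 0 when responses are spread evenly over the choices
    and near 1 when every student selects the same choice.
    """
    m = len(counts)
    n_total = sum(counts)
    if m < 2 or n_total == 0:
        raise ValueError("need at least two choices and one response")
    root_m = math.sqrt(m)
    # Length of the response vector (n_1, ..., n_m)
    length = math.sqrt(sum(n * n for n in counts))
    return (root_m / (root_m - 1.0)) * (length / n_total - 1.0 / root_m)


if __name__ == "__main__":
    # Hypothetical class: 40% average before instruction, 70% after.
    print("normalized gain:", normalized_gain(40.0, 70.0))
    # Hypothetical response counts for one five-choice question.
    print("concentrated:", concentration_factor([20, 0, 0, 0, 0]))
    print("evenly spread:", concentration_factor([4, 4, 4, 4, 4]))
```

With these made-up numbers the class closes half of the gap between its pre-test average and a perfect score, so <g> is 0.5. A question on which all students choose the same answer, right or wrong, gives a concentration factor close to 1, while an even spread across the five choices gives a value close to 0.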
CHAPTER 07 : Misconceptions
Misconceptions come in two basic overlapping varieties, those that are fundamentally physics content in nature and those that are essentially language-use / word-definition in nature. In the first category, I will include the very few mathematical misconceptions spoken to in PER literature, although mathematics is usually ignored. In the second category, there also is a minor distinction involving NES (Native English Speakers) and ESL (English as a Second Language students) which I will touch on. Specific physics content misconceptions follow. While this list is extensive, it is by no means exhaustive; both Part III and the original works contain far more examples. A few specific papers with a large number of listed misconceptions are: Paper 01 with thirty distinct misconceptions about Newtonian force23, Paper 17 with forty-six facets-of-knowledge about vision and optics of which half are false beliefs12, and Paper 07 with thirty-five misconceptions about Thermal Physics.38 Their lists are not replicated here.
Language issues play an important role in student misconceptions. Students tend not to realize that we physicists have redefined their sloppy English with mathematical rigor. Several language misconceptions follow, ending with an NES versus ESL issue.
The more obvious misconceptions are in Part III. Paper 01 includes "heavier objects fall faster". Paper 17 includes "vision not requiring the delivery of light or anything from an object into an observer's eye." And Paper 07 includes "Heat and Temperature are the same thing." Next up is a chapter on I.E. methods.
CHAPTER 08 : Interactive Engagement Methods
I.E. methods involve: change in the curricula, change in the role of teachers, and change in the actions of students. Hake's paper, with its introduction of <g> and use of 6000 students, was instrumental in shaping the Interactive Engagement versus Traditional methods debate. In that paper, Hake defines Interactive Engagement methods as methods designed at least in part to promote conceptual understanding through interactive engagement of students in heads-on (always) and hands-on (usually) activities which yield immediate feedback through discussion with peers and / or instructors. Hake defines Traditional methods as those which primarily rely on passive-student lectures, recipe labs, and algorithmic-problem exams.30 Change can be positively motivated. At Colgate University, the motives for reform were: 1) to improve student understanding of the basic concepts of physics, 2) to exploit modern technology, 3) to bring modern physics into the introductory syllabus, and 4) to increase student engagement.17 Change can be negatively motivated. Hestenes believes change is required because the traditional methods are failures: ... there is no evidence that students who attend lectures learn more than those who don't. In fact, the complex cognitive skills required to understand physics cannot be developed by listening to lecturers any more than one can learn to play tennis by watching tennis matches. The tapes of Feynman's Lectures on Physics are the pinnacle of classroom performance. They were expressly prepared for first year physics students at Cal Tech. Feynman himself regarded them as a failure, with only a small fraction of the students really able to cope with the course.06
Change can even be motivated by Educational Theory and Workplace Studies: Design is an upper-level Bloom's Taxonomy educational objective, and one of the most frequent activities of physicists in the workplace. In response to this, in some labs at Ohio State University students design their own experiments to determine some property of a system.02
Why to change ties back into the earlier parts of this paper. The programs of change that others have implemented follow. Chapters 22 through 26 in Part III provide significantly more detail; the following is intended as an overview. At the curricula level, more modern physics (atomic, quantum, and relativity) needs to be taught. Modern physics is significant and interesting. Quantum mechanics and relativity provide the inter-connection of the various physics domains.17 An early familiarity with modern physics attracts young people to physics. The prospect of studying modern physics is the most influential reason that students choose to study physics at the university level.25 Electronics needs to be taught. Physicists are called upon to modify or combine electronic lab equipment from overlapping generations, to work in electromagnetically noisy environments, and to compensate for minimal or nonexistent technical support.50 B.S. graduates with electronics knowledge are valued by business; paid internships are available to physics students with analog electronics knowledge, and offering such courses can be part of a broad local effort to attract small high tech companies to the local area.51 To find room in the curricula to include modern physics and electronics, items must be cut. Items must also be cut to compensate for the more time intensive requirements of Interactive Engagement methods. Fifteen percent of the old traditional curricula at Harvard University is not covered.52 Twenty-five percent of the traditional content was cut at Dickinson College, although electronics and nonlinear dynamics were added.19 In his high school class, Elby cut momentum, oscillatory motion, electronic circuits, magnetism, and all of modern physics.14 Colgate University's reform effort cut the other way: To add new material, something must be cut and in a traditional course that something is Newtonian Mechanics. 
The reformers give several interesting reasons, dealt with in more detail in Part III Paper 29.17 There are review packages designed to supplement both traditional and reform instruction. MAOF is one such review package; its uniqueness is melding Mechanics and Electromagnetism via concepts common to both domains, such as conservative forces which are proportional to 1/r^2 (Newton's Gravitational Law and Coulomb's Force).40 A far more widespread set of review packages is the Tutorials in Introductory Physics put out by McDermott at the University of Washington. The Tutorials are a guided inquiry experience. A tutorial instructor guides students through the necessary reasoning by posing questions. The students work in groups of three or four. The Tutorials consist of pretest, worksheet, homework and post-test. They are not designed to transmit information or build standard problem solving. Rather they construct concepts, develop reasoning skills and relate the formalism of physics to the real world, thus developing a functional understanding in supplement to textbooks and lectures.05
The divisions within I.E. methods (curricula, teacher, and student) are not sharply delineated. What follows are teaching packages and names of reform attempts that thoroughly mix all three.
There are many more Interactive Engagement issues that need to be addressed. Not all of them are in the upcoming potpourri chapter but most are.
CHAPTER 09 : A Potpourri of Interactive Engagement Points
There are quite a few points within I.E. which should be highlighted. The usefulness of microscopic models to explain macroscopic phenomena is debated. Thacker et al. argue that models of microscopic processes should be introduced as an integral part of any E&M course.59 Loverude et al. point out that many students in thermodynamics confirm their incorrect macroscopic arguments with reference to an incorrect microscopic model.35 Cooperative learning is widely used in I.E. methods. Cooperative learning promotes teamwork, one of the highest priorities the real world needs from education. Further, Heuvelen notes that Johnson found a grade point higher achievement in cooperative learning classes over traditional classes.02 Properly structured group work can be a particular blessing for young women; although the wrong structure can easily negate these benefits.01 The enforcement of roles (recorder, critic, etc.) is subject to much debate, with some instructors ignoring this aspect of cooperative learning.22 Other instructors explicitly teach cooperative team roles and protocols, reinforcing them via grading schemes.21 Cooperative groups and computers are two attempts to find cost-effective methods of providing timely guidance and feedback to students.08 There is debate about the effectiveness of computers / technology to enhance learning. There are those who argue: "running a computer simulation is very different than doing a physical experiment,"60 "technology by itself cannot improve instruction,"23 "to have a long-lasting impact on science education, [computer use] needs to be based on a successful pedagogy and not on the latest compilers, hardware, or algorithms." 55 Worse, "computer users seemed to rely on a more authoritarian view, that the right answer was something from the computer rather than something they had constructed themselves," 60 and "... 
computer use sometimes avoids error; [computer use] does not always confront [error] thus the error may persist in non-computer environments."39 The other side is: "computer related tasks brought everyone back [to work], in this variant of peer instruction, students became involved as they had to tell the technology to do something for them."21 Computers are "particularly useful in answering 'what if' questions."02 In this day of powerful, inexpensive PCs and versatile software packages we should no longer restrict ourselves to only problems with closed form solutions, but should "perform Numerical Integration on both closed form (for comparison) and non-closed form (for realism) problems."61 "Today's software is platform-independent being based on virtual machines, meta languages and open Internet Standards, and thus not subject to obsolescence on [the historical eighteen month cycle]."55 Finally, computerized grading of homework is as good as conscientious, time-consuming hand grading by a graduate student.60 Homework is viewed somewhat askance in the reform movement. Hestenes: "Practice makes permanent and mindless plug-and-chug without concepts is counter productive, not perfect."06 Reif: "Individual tutoring by the instructor is much more effective than homework, which too often merely perpetuates bad habits (haphazard use of formula being one)."08 Steinberg et al.: "Homework often fails to build a conceptual framework. It is often reduced to finding formula with the right combination of symbols or finding a worked out similar problem."16 Elby has an interesting page detailing his homework philosophy. His philosophy hinges on grading effort, not correctness, and on handing out a partial solution set with the assignment. Elby argues that traditional students view homework as grade getting rather than learning. 
His unique homework methodology, combined with frequent in-class mini-quizzes, was explicitly designed to push students toward the realization that thinking through problems, rather than copying each other or a book, is the best way to learn physics.14 I.E. methods for the most part still use textbooks, homework, and labs. However, due to the time intensive nature of in-class activities, students are required to do more outside of class. For instance, Beichner et al. assigned students responsibility to read their textbooks outside of class in order to free up time for in-class Socratic dialogues.21 Johnson required students to attempt a subset of homework problems prior to each Supervised Practice meeting.22 Laws et al. require incomplete or incorrect labs to be completed or corrected during non-class, evening or weekend, open-lab hours.19 Not only are I.E. methods time intensive, they also often depend on low student-to-teacher ratios. Workshop labs have an instructor plus two undergraduate TAs per twenty-four students.19 Supervised Practice has two instructors drawn from graduate students, upper-class undergraduate majors, and volunteer post-docs per twenty-five students.22 SDI Labs require two Socratic dialogists, one of whom has had previous experience, for every twenty-four students.57 Many I.E. methods provide a room for students to immerse themselves in physics. The IMPEC group takes the prize with twenty-four hour access to their classroom.21 Indiana University has a "physics forum" open five to eight hours each day.30 Carnegie Mellon University allowed access to a drop-in center several hours each day.22 I.E. methods have many goals. Some are physics oriented, some are teaching oriented, and some are social. Among the many goals are: (1) to reduce the fragmentation of physics by its constituent domains, (2) to make physics instruction more concrete, and (3) to increase minority participation in physics. 
Students have a hard time organizing their knowledge around central characteristics. They often fail to distinguish between general concepts and their examples. This failure is exacerbated by courses which divide physics into separate fields such as mechanics, E&M, and optics. An inter-domain organization of knowledge has several advantages. Among others, it allows a student to solve unfamiliar problems in one area of physics with tools from another.40 One tool that is used in mechanics, electrical circuits and quantum field theory is harmonic oscillation. Harmonic oscillation's multiple uses well justify the initial time spent on a mass and a spring.03 Inter-domain organization has its own pitfalls, for example "work is treated differently in mechanics and thermodynamics and this inconsistency may make it more difficult for students to transform a concept initially learned in one context to the other."35 Inter-domain organization is practical. Oregon State University's new junior level classes are Static Vector Fields, Oscillations, One Dimensional Waves, Quantum Measurement and Spin, Central Forces, Energy and Entropy, Periodic Systems, Rigid Bodies, and Reference Frames.13 I.E. methods seek to lessen the abstraction of physics. This is in part compensating for normal life,50 in which less and less physical knowledge is required in everyday activities. For example, radio operators of fifty years ago had to know more electronics than radio operators of today, although the same is not true of radio designers. There are orders of magnitude more operators than designers: Illiterates versus Elites04 again. So in I.E. 
classes, students pitch baseballs, break pine boards with their bare hands, and ignite paper by compressing air.19 In tutorials, students use water tanks, dowel rods and sponges in a concrete method to understand the effects of a wave passing through a narrow slit.44 In working problems, vectors are color coded by type (displacement, velocity, acceleration, and force).64 The abstraction of physics is also reduced by various verbal strategies, such as classroom debate among student groups10 and old-fashioned recitations.17 I should point out that the classroom debate is followed by a whole class vote which combats compartmentalization -- the ability of the human mind to hold two contradictory beliefs.10 I.E. methods make a conscious effort to get and keep minority participation in physics. Retention of minorities is one of the three major goals of the IMPEC program and that program achieved excellent results, passing 63% of female and 100% of minority students.21 The reform class at the City College of New York had a lower drop out rate than did the traditional class.16 At Dickinson College over 40% of the calculus-based Workshop Physics students are women.19 At Grinnell College, 50% of the physics majors are women.65 These stand in contrast to 19% -- the percentage of all physics bachelor's degrees going to women.01 Women do have one advantage: they benefit from inquiry-based laboratory exercises, while men do not.36 Next up we'll address Teachers and TAs -- do they matter?
CHAPTER 10 : Teachers and TAs
PER is a bit schizoid when it comes to Teachers and TAs. Underlying this schizophrenia are two ideas. First, constructivism alters the role of teacher into one of a facilitator. Second, TAs are still students; students whose abilities both in physics and in teaching are suspect. A body of PER argues against the importance of the individual teacher as opposed to the curricula. Hestenes et al., in two separate papers, argue that "basic knowledge gain under conventional instruction is essentially independent of the professor."66 McDermott states that "student knowledge is practically instructor independent in equivalent physics courses." However, PER spends a lot of energy informing teachers on how to be effective, implying that poor teaching can be detrimental, even if curricula choice enforces an upper bound on the effectiveness of good teaching. Hestenes, in a third paper, argues that good teaching is an acquirable skill. "Technical knowledge about teaching and learning is as essential as subject content knowledge," this in part because "teachers with low FCI scores are unable to raise student scores above their own."06 While student evaluations and attitude are not measures of student learning, if one measures student opinion instead of student ability, teachers do matter. The IMPEC group found that "results depend strongly on who is teaching and how much previous experience this person has had with the course... it takes about three years for a professor to achieve highly satisfied students."17 "Students respond positively when many instructors use many different techniques in a fairly short time period," according to Holbrow et al.21 Aware of the large gap between small-scale studies and practical education delivery, Reif has written a textbook-workbook combination. Reif recommends distinguishing between wrong and nonsensical answers, providing a more severe penalty to the latter. 
He also recommends giving students frequent diagnostic tests.08 Hammer notes that it is important for teachers to have multiple perspectives on student knowledge. Multiple perspectives are important because they increase the conceptual resources of the teacher. What instructors perceive depends critically on their conceptual resources and strongly influences how they think to intervene in student learning. What instructors notice informs their decisions as to whether to slow down the presentation, which topics to cover or problems to assign, and how to advise particular students. Conceptual resources include the instructor's knowledge of physics. The five based in PER are: 1) misconceptions, 2) p-prims, 3) reasoning abilities, 4) epistemological beliefs, and 5) inquiry practices. As noted here and detailed at length at the end of this paper, there is a difference between what is available in an instructor's head (conceptual resources) and publicly articulated perspectives. Learning to teach should involve developing the skills of gathering information, including skills of moderating discussions and interviewing students regarding their understanding. Further, forums need to be created for conversation among currently isolated instructors.07
Ehrlich speaks against "lowering the bar" for physics majors and for lower standards in "conceptual physics courses which promote greater scientific literacy among the general population." He argues that failing a class can both teach a great deal and be a long term good, even if a short term pain. Ehrlich also believes you're doing a good job teaching if you:
Teachers operate under the false assumptions that students differentiate between critical and non-critical attributes in an example or that students identify strong resemblances between examples.59 One reason few students develop a functional understanding of physics is that professors view students as younger versions of themselves, whereas in reality professors were atypical students.05 Professors' instructional intuitions are as inadequate and incorrect as most students' physics intuitions and for the same reason -- extensive, unstudied, personal experience.07 Lecturing and teaching are such powerful learning experiences for the teachers that we may not want to give them up, even if other methods are better for our students.03 Worse, a well trained teacher is not enough. An individual teacher does not have the means or time to transform the teaching techniques of others. Long term district support, money, and follow-up are requirements for science reform success.67 One problem at the department level is the research-oriented hiring and tenure focus. New faculty are never selected primarily for their teaching skills and have to invest their time and energy into research performance just to keep their jobs.26 Your department is doing a good job if it:
PER does not focus on teachers; still, there are a few research comments. Fourteen out of eighteen Arizona high school teachers got better than 80% on the FCI.23 Twenty teachers benefited from going through the MAOF program as "learners."40 Colgate instructors all [six] attend each lecture.17 Finally, in a rare example of instructor-instructor learning, the importance of providing ample wait time for your questions is highlighted.21 We will now shift our focus to the Teaching Assistants (TAs). Many PER papers note that part of their methodology is the explicit formal training of TAs. An exception to this notes: "There is currently no explicit training of teaching assistants. As a result, there is great variation in their effectiveness."29 Some programs use undergraduates as teaching assistants. Hake reports that the four I.E. courses with more than 200 students all employed undergraduates to augment the instructional staff.30 Johnson notes that his program results in a slight increase in cost, mainly to pay undergraduate TAs.22 The explicit TA training is often weekly and could include faculty and technical personnel.48 Occasionally, TAs were required to attend lectures.52 TA training is important because graduate students are not experts in course material, pedagogy, or management of a collaborative learning environment. Training addresses instructor content knowledge, student difficulties with physics concepts, student difficulties with problem-solving processes, and student difficulties with peer interactions.22 McDermott prepares tutorial instructors in weekly seminars which are conducted "on the same material and in the same manner that the tutorial instructors are expected to teach." McDermott's tutorials are judged successful if the students' post-test matches or exceeds the tutorial instructor's pre-test;05 in one case success was 15%.44 Graduate students, independent of TA status, are used as guinea pigs in PER papers. 
In one study, only seven out of twenty-three students,37 in another study only two out of sixteen students,23 were able to successfully answer questions that the researchers viewed as basic.
CHAPTER 11 : Validation, Literature Problems, and other issues.
Validation practices vary widely. One study uses factor analysis and the KR20 Reliability Test.33 Another boldly states that formal procedures to establish validity and reliability are unnecessary, because of similarity to previously validated work.23 Some authors use the FCI to "norm" their studies.10 Some use the FCI despite its non-applicability (See Part III Paper 24).60 Validation is done by interviews.15 Validity is addressed by observing the results of Newtonian Thinkers (seven out of eight questions right) "on other tests, and on their written explanations to ... question 9)".46 Validation occasionally acknowledges the prevalent use of convenience sampling; one study mitigates this by student comparisons in GPA, gender, and major.36 Another study raises the validation issues of: 1) the varying amount of time courses spend on the tested subject; 2) teaching to the test; 3) test-question leakage due to the open-source nature of the FCI, MB, etc.; 4) the Hawthorne effect; and 5) the John Henry effect.30 The more complex problems with PER literature are elaborated in Part III. Here I merely seek to raise a few red flags. There are both subtle and blatant linguistic shifts. Hake asserts that "interactive engagement courses are, on average, more than twice as effective in building basic concepts as traditional courses." Yet his building block, the FCI, does not test basic concepts (plural); it assesses only a single concept, that of the "Newtonian concept of force."42 A small matter, until entire curricula are modified to enhance the Hake factor29 instead of modifying only the section on Newtonian force. Redish et al. obscure that the MPEX matches your students' responses to the answers nineteen college and university teachers, who were first time implementers of Workshop Physics, would "prefer their students to give." This obscuring is done by multiple re-definitions. 
The nineteen college and university teachers implementing Workshop Physics in their classroom after attending a workshop at Dickinson College become Group 5. Group 5 was asked to respond with the answer they would prefer their students to give, which becomes the "preferred response of Group 5." The preferred response becomes the "favorable response." The favorable response becomes the "expert response." So nineteen workshop first time implementers become "experts" in epistemology, and the "response" is idealized to the point of being neither that of the actual teachers nor that of any real students.15 The conclusions an author reaches do not always match those he has stated earlier in his paper due to linguistic slipperiness [Part III, Paper 28].59 Sometimes the language is not slippery, just thick, as for example in a study that tries, by means of elicited structural components of students' knowledge, to infer the influence of a historically oriented instruction in optics on the content conceptual knowledge of students in this science domain.12 Many studies judge results on "correct reasoning," a few require "correct answers"; some use both.37 One study shifts between the two depending on which is more favorable.44 Reif notes a case where 45% of his students gave correct answers, although 70% reasoned correctly; he attributes the difference to minor algebra mistakes.08 In other cases, notably Yes, No, Explain problems, the reverse occurs with more correct answers than correct reasons. 
There are authors who advocate a position despite there being "no overall improvement in gains."23 There are authors who speculate despite explicitly stating that their comments / beliefs are "impossible to verify" [See Part III Paper 21].10 There are studies based on one qualitative question.36 There are studies which provide questions and misconceptions but no match of distractor to specific survey item choice.38 There is at least one test published with questions which are known to the author to be defective and for which, in one case, no satisfactory replacement has been devised in over a decade.06 I.E. methods are not always implemented ideally. Sometimes, the issue is as basic as irregularities in light bulbs and inadequately charged batteries.36 Other times, the issues are more complex. In the case of implementing ILD, less time was spent on analogies to similar situations and on having the students discuss results than the ILD teacher notes suggest. Furthermore, students were allowed too much time to make predictions, resulting in fast students losing interest and focus. The small studio classroom with non-sloping floors made it difficult for all students to see the demos. And finally, the students were so collaboratively oriented that it was difficult to get individual initial predictions, resulting in some students never making a personal intellectual commitment. In implementing CGPS there were deviations from the five-step process strategy taught. Some of these deviations were the result of Studio Physics structures common to all classes, Experimental and Standard; for example, no context-rich problems were included on exams. Only half the class periods were given over to CGPS with the other half focusing on standard problems which would be tested. 
Three additional issues were 1) due to time constraints the Instructor did not model CGPS techniques as often as desired; 2) students were strongly resistant to cooperative group roles and this aspect soon died out; and 3) CGPS problem solving strategy is typically irrelevant to textbook-style homework, thus students resented having to use a procedure when it was easier to solve the problem without it.29
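The KR20 Reliability Test cited at the start of this chapter can be illustrated with a short Python sketch. This is only a sketch under stated assumptions, not the procedure of the cited study: it uses the standard Kuder-Richardson formula 20, KR20 = [k / (k - 1)] * [1 - (sum of p_j * q_j) / (variance of total scores)], for k right/wrong items, where p_j is the fraction of students answering item j correctly and q_j = 1 - p_j. The choice of the population variance (dividing by the number of students) is an assumption, since conventions differ. The function name and the score matrix are hypothetical.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomously scored items.

    responses is a list with one row per student; each row is a list of
    0/1 item scores (1 = correct). Assumes population variance of totals.
    """
    n_students = len(responses)
    k = len(responses[0])
    if n_students < 2 or k < 2:
        raise ValueError("need at least two students and two items")

    # Variance of each student's total score
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    # Sum over items of p_j * q_j
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n_students
        pq_sum += p * (1.0 - p)

    return (k / (k - 1)) * (1.0 - pq_sum / var_total)


if __name__ == "__main__":
    # Hypothetical scores: four students, three items.
    scores = [
        [1, 1, 1],
        [1, 1, 0],
        [1, 0, 0],
        [0, 0, 0],
    ]
    print("KR20:", kr20(scores))
```

For this made-up matrix the item terms sum to 0.625 and the total-score variance is 1.25, so the sketch returns 0.75. Values closer to 1 indicate that the items rank students more consistently.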
Knowledge retention / extension is not commonly addressed; when it is, the results are usually poor. The PEG at the University of Washington has to design and implement whole series of tutorials precisely because students who have completed one tutorial still do not recognize its relevance to new situations.44 This series development finally terminates when students extend existing knowledge into new areas.45 The only other positive retention report that I've found is Thornton and Sokoloff's finding that after six weeks of no additional dynamics instruction, there was an approximate 6% increase in students responding in a Newtonian way.46 Otherwise, things are a bit bleak. Thacker et al. note in the body of their paper an important and discouraging finding: students, after explicit instruction, do not use microscopic mechanisms when confronted by completely unfamiliar phenomena.59 Beichner et al. note that no significant difference between previous IMPEC students and their traditionally taught peers was found on standard exam performance in a following traditionally taught course.21 Finally, Marshall and Dorward inform us that no cascading effect was noted. The only differences between inquiry and non-inquiry groups were those dealt with directly in the inquiry exercises.36 The size of PER studies varies widely in both time frame and number of participants. Most studies last a semester or less. There are, however, exceptions. Peer Instruction has a ten year study,52 Colgate University reformers provide their experiences and lessons over an eight year period,17 and the MAOF teaching package was developed over a four year period.40 McDermott has been working on various PER issues for over twenty-five years.41 The number of participants tends to range from a class worth up to around fifteen classes worth, realizing that not all class sizes are equal. There is wide fluctuation. 
Hestenes has an FCI database of more than 20,000 students and 300 physics classes.06 The FCI was originally given to 1500 high school students and 500 university students.23 Hake made his average normalized gain, <g>, public with a 6542 student, 62 introductory-class study.30 Poulis et al. introduce Audience Paced Feedback (APF) in a 2600 student survey.53 Maloney et al. bring us the Conceptual Survey of Electricity and Magnetism (CSEM) in a 1500 student sample.33 McDermott weighs in with 1000 introductory physics students.05 Bao and Redish bring us Concentration Analysis with 778 students in fourteen classes. Loverude et al. perform a Thermodynamics study with 500 students.35 Interactive Lecture Demonstrations (ILD) is evaluated at the University of Oregon using 200 students.54 An E&M study used both ninety students at the University of Ohio and twenty-six students at the University of Michigan-Flint. Hammer's Ph.D. involved the study of six students over the course of a semester.15 Some papers only provide us with the number of classes, three in the case of Steinberg.60 Some studies start small, with seven graduate students and twenty advanced undergraduates, and end big, with five years, fourteen instructors, 800 students and various universities.37 Although most studies consist of calculus-based introductory physics students, by no means do all. Some studies even involve only high school students.12 These studies are reported in multiple forums; thirty-nine papers in fourteen different publications are noted in the back of PER Am. J. Phys. Suppl. 68(7).
The Studio Physics paper (#15 in Part III) is important because it is one of very few acknowledged and examined failures in PER. Its authors conclude that it is necessary to mentally engage students; small classes, cooperative groups and computer availability are good but, sadly, insufficient. Equally important are research-based questions and activities.29 I think this is enough for now except for a brief conclusion.
Part III deals with additional issues such as the hidden curriculum,24 knowledge structure,08 McDermott,05 and the whole pro-math,17 anti-math09 issue, which usually shows up as a qualitative bias in the literature, a bias that its authors believe compensates for a quantitative bias in Traditional instruction. Finally, we reach the conclusion.
CHAPTER 12 : Conclusion to the Paper
In conclusion, I hope this has whetted your appetite for more. Part III has significantly more information, after which I would recommend The American Journal of Physics, PER Supplements that came out in July 1999, 2000, and 2001 because of their singular focus. A subscription for The Physics Teacher for the long haul is also a good idea. Teaching is a profession and, as a professional, society expects you to keep up with advances in your field. It is, within limits, like being a doctor or a lawyer; the new matters. As a professional, I hope you benefit from the work of your fellow professionals, contribute your own insight and knowledge to the public record, and enjoy yourself.
References
1. L. McCullough, Women in Physics: A Review, The Physics Teacher Vol. 40, 86-114 (2002)
2. A. Heuvelen, Millikan Lecture 1999: The Workplace, Student Minds, and Physics Learning Systems, Am. J. Phys. 69(11), 1139-1146 (2001)
3. E. Redish, The Implications of Cognitive Studies for Teaching Physics, Am. J. Phys. 62(6), 796-803 (1994)
4. D. Goodstein, Now Boarding the Flight from Physics, David Goodstein's Acceptance Speech for the 1999 Oersted Medal presented by the American Association of Physics Teachers, 11 January 1999, Am. J. Phys. 67(3), 183-186 (1999)
5. L. McDermott, Oersted Medal Lecture 2001: Physics Education Research The Key to Student Learning, Am. J. Phys. 69(11), 1127-1137 (2001)
6. D. Hestenes, Who needs physics education research !?, Am. J. Phys. 66(6), 465-467 (1998)
7. D. Hammer, More than misconceptions: Multiple perspectives on student knowledge and reasoning, and an appropriate role for education research, Am. J. Phys. 64(10), 1316-1325 (1996)
8. F. Reif, Millikan Lecture 1994: Understanding and teaching important scientific thought processes, Am. J. Phys. 63(1), 17-32 (1995)
9. P. Lindenfeld, Format and content in introductory physics, Am. J. Phys. 70(1), 12-13 (2002)
10. C. Kalman, S. Morris, C. Cottin, R. Gordon, Promoting conceptual change using collaborative groups in quantitative gateway courses, PER Am. J. Phys. Suppl. 67(7), S45-S51 (1999)
11. L. Bao and E. Redish, Concentration analysis: A quantitative assessment of student states, PER Am. J. Phys. Suppl. 69(7), S45-S53 (2001)
12. I. Galili and A. Hazan, The Influence of an historically oriented course on students' content knowledge in optics evaluated by means of facets-schemes analysis, PER Am. J. Phys. Suppl. 68(7), S3-S15 (2000)
13. D. Hammer, Student resources for Learning, PER Am. J. Phys. Suppl. 68(7), S52-S59 (2000)
14. A. Elby, Helping physics students learn how to learn, PER Am. J. Phys. Suppl. 69(7), S54-S64 (2001)
15. E. Redish, J. Saul, and R. Steinberg, Student expectations in introductory physics, Am. J. Phys. 66(3), 212-224 (1998)
16. R. Steinberg and K. Donnelly, PER-Based Reform at a Multicultural Institution, The Physics Teacher Vol. 40, 108-114 (2002)
17. C. Holbrow, J. Amato, E. Galvez, and J. Lloyd, Modernizing introductory physics, Am. J. Phys. 63, 1078-1090 (1995)
18. A. Elby, Another reason that physics students learn by rote, PER Am. J. Phys. Suppl. 67(7), S52-S57 (1999)
19. P. Laws, P. Rosborough, and F. Poodry, Women's responses to an activity-based introductory physics program, PER Am. J. Phys. Suppl. 67(7), S32-S37 (1999)
20. R. Ehrlich, How do we know if we are doing a good job in physics teaching?, Am. J. Phys. 70(1), 24-29 (2002)
21. R. Beichner, L. Bernold, E. Burniston, P. Dail, R. Felder, J. Gastineau, M. Gjertsen, and J. Risley, Case study of the physics component of an integrated curriculum, PER Am. J. Phys. Suppl. 67(7), S16-S24 (1999)
22. M. Johnson, Facilitating high quality student practice in introductory physics, PER Am. J. Phys. Suppl. 69(7), S2-S11 (2001)
23. D. Hestenes, M. Wells, and G. Swackhamer, Force Concept Inventory, The Physics Teacher Vol. 30, 141-157 (1992)
24. (a) E. Redish, J. Saul and R. Steinberg, Student expectations in introductory physics, Am. J. Phys. 66(3), 212-224 (1998); (b) D. Hammer, More than misconceptions: Multiple perspectives on student knowledge and reasoning, and an appropriate role for education research, Am. J. Phys. 64(10), 1316-1325 (1996)
25. R. Stannard, Communicating physics through story, Physics Education, 30-34 (2001)
26. A. Hobson, Science Literacy and Departmental Priorities, Am. J. Phys. 67(3), 177 (1999)
27. M. LiPreste, A Comment on Teaching Modern Physics, The Physics Teacher Vol. 39, 262 (2001)
28. The Physics Teacher Vol. 39 (2001)
29. K. Cummings, J. Marx, R. Thornton, D. Kuhl, Evaluating innovations in studio physics, PER Am. J. Phys. Suppl. 67(7), S38-S44 (1999)
30. R. R. Hake, Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses, Am. J. Phys. 66(1), 64-74 (1998)
31. D. Hestenes and M. Wells, A Mechanics Baseline Test, The Physics Teacher Vol. 30, 159-166 (1992)
32. (a) R. Thornton and D. Sokoloff, Assessing student learning of Newton's laws: The Force and Motion Conceptual Evaluation and The Evaluation of Active Learning Laboratory and Lecture Curricula, Am. J. Phys. 66(4), 338-352 (1998); (b) R. Hake, Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses, Am. J. Phys. 66(1), 64-74 (1998)
33. D. Maloney, T. O'Kuma, C. Hieggelke, A. Heuvelen, Surveying students' conceptual knowledge of electricity and magnetism, PER Am. J. Phys. Suppl. 69(7), S12-S23 (2001)
34. I. Halloun and D. Hestenes, The initial knowledge state of college physics students, Am. J. Phys. 53(11), 1043-1055 (1985)
35. M. Loverude, C. Kautz, and P. Heron, Student understanding of the first law of thermodynamics: Relating work to the adiabatic compression of an ideal gas, Am. J. Phys. 70(2), 137-148 (2002)
36. J. Marshal and J. Dorward, Inquiring experiences as a lecture supplement for preservice elementary teachers & general education students, PER Am. J. Phys. Suppl. 68(7), S27-S37 (2000)
37. R. Scherr, P. Shaffer, and S. Vokos, Student understanding of time in special relativity: Simultaneity and reference frames, PER Am. J. Phys. Suppl. 69(7), S24-S35 (2001)
38. S. Yeo and M. Zadnik, Introductory Thermal Concept Evaluation: Assessing Students' Understanding, The Physics Teacher Vol. 39, 496-504 (2001)
39. L. McDermott, "Millikan Lecture 1990: What we teach and what is learned -- Closing the gap," Am. J. Phys. 59(4), 301-315 (1991)
40. E. Bagno, B. Eylon, and U. Ganiel, From fragmented knowledge to a knowledge structure: Linking the domains of mechanics and electromagnetism, PER Am. J. Phys. Suppl. 68(7), S16-S26 (2000)
41. L. Kirkpatrick, American Association of Physics Teachers 2001 Oersted Medalist: Lillian C. McDermott, Am. J. Phys. 69(11), 1126 (2001)
42. (a) D. Hestenes, M. Wells, and G. Swackhamer, Force Concept Inventory, The Physics Teacher Vol. 30, 141-157 (1992); (b) R. R. Hake, Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses, Am. J. Phys. 66(1), 64-74 (1998)
43. P. Colin and L. Viennot, Using two models in optics: students' difficulties and suggestions for teaching, PER Am. J. Phys. Suppl. 69(7), S36-S44 (2001)
44. K. Wosilait, P. Heron, P. Shaffer, L. McDermott, Addressing student difficulties in applying a wave model to the interference and diffraction of light, PER Am. J. Phys. Suppl. 67(7), S5-S15 (1999)
45. S. Vokos, P. Shaffer, B. Ambrose, L. McDermott, Student understanding of the wave nature of matter: Diffraction and interference of particles, PER Am. J. Phys. Suppl. 68(7), S42-S51 (2000)
46. R. Thornton and D. Sokoloff, Assessing student learning of Newton's laws: The Force and Motion Conceptual Evaluation and The Evaluation of Active Learning Laboratory and Lecture Curricula, Am. J. Phys. 66(4), 338-352 (1998)
47. R. Harrington, Discovering the reasoning behind the words: An example from electrostatics, PER Am. J. Phys. Suppl. 67(7), S58-S59 (1999)
48. (a) P. Lindenfeld, Format and content in introductory physics, Am. J. Phys. 70(1), 12-13 (2002); (b) M. Johnson, Facilitating high quality student practice in introductory physics, PER Am. J. Phys. Suppl. 69(7), S2-S11 (2001)
49. (a) D. Styer, The Word Force, Am. J. Phys. 69(6), 631-632 (2001); (b) E. Redish, The Implications of Cognitive Studies for Teaching Physics, Am. J. Phys. 62(6), 796-803 (1994)
50. D. Henry, Resource Letter: TE-1: Teaching electronics, Am. J. Phys. 70(1), 14-23 (2002)
51. T. Usher and P. Dixon, Physics goes practical, Am. J. Phys. 70(1), 30-36 (2002)
52. C. Crouch and E. Mazur, Peer Instruction: Ten years of experience and results, Am. J. Phys. 69(9), 970-977 (2001)
53. J. Poulis, C. Massen, E. Rubens, and M. Gilbert, Physics lecturing with audience paced feedback, Am. J. Phys. 66(5), 439-441 (1998)
54. D. Sokoloff and R. Thornton, Using Interactive Lecture Demonstrations to Create an Active Learning Environment, The Physics Teacher Vol. 35, 340-347 (1997)
55. W. Christian, Educational Software and the Sisyphus Effect, Computing in Science and Engineering May-June 1999, 13-15 (1999)
56. A. Heuvelen and D. Maloney, Playing Physics Jeopardy, Am. J. Phys. 67(3), 252-256 (1999)
57. R. Hake, Socratic Pedagogy in the Introductory Physics Laboratory, The Physics Teacher Vol. 30, 546-552 (1992)
58. G. Güémez, C. Fiolhais and M. Fiolhais, "Revisiting Black's Experiments on the Latent Heat of Water," The Physics Teacher Vol. 40, 26-31 (2002)
59. B. Thacker, U. Ganiel and D. Boys, Macroscopic phenomena and microscopic processes: Student understanding of transients in direct current electric circuits, PER Am. J. Phys. Suppl. 67(7), S25-S31 (1999)
60. R. Steinberg, Computers in teaching science: To simulate or not to simulate?, PER Am. J. Phys. Suppl. 68(7), S37-S41 (2000)
61. P. Assimakopoulos, A Computer-Aided Introductory Course in Electricity and Magnetism, Computing in Science and Engineering Nov/Dec 2000, 88-94 (2000)
62. S. Bonham, R. Beichner, and D. Deardorff, Online Homework: Does it Make a Difference?, The Physics Teacher 39, 293-296 (2001)
63. The American Physical Society, The Forum on Education, Spring / Summer 2000
64. (a) R. Hake, Socratic Pedagogy in the Introductory Physics Laboratory, The Physics Teacher Vol. 30, 546-552 (1992); (b) R. Hake, Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses, Am. J. Phys. 66(1), 64-74 (1998)
65. M. Schneider, Encouragement of Women Physics Majors at Grinnell College: A Case Study, The Physics Teacher 39, 280-282 (2001)
66. (a) D. Hestenes, M. Wells and G. Swackhamer, Force Concept Inventory, The Physics Teacher Vol. 30, 141-157 (1992); (b) I. Halloun and D. Hestenes, The initial knowledge state of college physics students, Am. J. Phys. 53(11), 1043-1055 (1985)
67. J. Bower, Scientists and Science Education Reform: Myths, Methods, and Madness, http://www.nas.edu/rige/backg2a.htm 10 pages
Part II : Data Analysis
Chapter 13 : Introduction to the Data Analysis
In Part II Data Analysis, I seek to determine the initial knowledge state of the students in five community college classes. I also seek a few common unifying mental models composed of a limited and patterned set of misconceptions. Finally, I make some comments on the inefficiencies of the FCI as a tool for seeking these proposed common unifying non-Newtonian mental models.
The five Cabrillo Community College physics classes were all assessed in the spring semester of 2002. Cabrillo Community College (CCC) Physics 2B (Group JM) and CCC Physics 10 (Group FK) were given the MPEX, and their data is presented in Chapter 15. Data is presented in two formats; one is in parallelism with the original paper, and the other is in a new format. CCC Physics 4C (Group CF) was given the MB and its data is presented in Chapter 16. This MB data is also used in a concentration analysis à la Bao and Redish. CCC Physics 4A (Group JM) was given the FCI and its data is presented in Chapter 17. While using this data in an attempt to find the sought mental models, I perform some test analysis of the FCI. CCC Physics 11 (Group PG) was given the FCI and its data is presented in Chapter 18. Chapter 14 is my attempt at obtaining self-consistency between FCI Tables I, II and V of Hestenes et al.
I sought the existence of a few common unifying mental models composed of a limited and patterned set of misconceptions. The literature in Part III raises the possibility that such models exist. For example, Bao and Redish in Paper 08 note three accepted PER facts. First, there are a small number of research-identified common mental models. Second, multiple-choice tests can be designed with these mental models as distractors -- one such test is the FCI. Third, a student with a strong naïve belief will pick multiple wrong answers that are based on this unifying mental model; the simply ignorant will choose distractors randomly.
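The third point, the contrast between answers concentrated on one distractor and answers scattered at random, is what the concentration factor of Bao and Redish is meant to quantify. The following is a minimal sketch, assuming the commonly quoted form C = [sqrt(m) / (sqrt(m) - 1)] * [sqrt(sum of n_i^2) / N - 1 / sqrt(m)] for a question with m choices, N respondents, and n_i students picking choice i. The function name and the sample counts are hypothetical and are not data from this thesis.

```python
from math import sqrt

def concentration_factor(counts):
    """Concentration factor C for one multiple-choice question.

    counts[i] is the number of students who picked choice i.
    C is 0 when responses are spread evenly over the choices and
    1 when every student picks the same choice.
    """
    m = len(counts)          # number of choices
    n_total = sum(counts)    # number of students who answered
    if m < 2 or n_total == 0:
        raise ValueError("need at least two choices and one response")
    length = sqrt(sum(n * n for n in counts))
    return (sqrt(m) / (sqrt(m) - 1)) * (length / n_total - 1 / sqrt(m))

# Hypothetical counts for a five-choice question (not thesis data).
evenly_spread = [6, 6, 6, 6, 6]    # random-looking responses
one_distractor = [0, 30, 0, 0, 0]  # everyone picks the same choice
print(concentration_factor(evenly_spread))
print(concentration_factor(one_distractor))
```

A value near 1 on a wrong answer suggests a shared model behind that distractor; a value near 0 suggests guessing.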
Leaving aside the muddying of mental model and misconception definitions, if a few unifying mental models composed of a limited and patterned set of misconceptions exist, then the teacher's efficiency is notably increased as the target of his energy becomes both small and well defined. Improvements in the FCI would help teachers in their attempts to correlate several different student misconceptions into a student's non-Newtonian mental model. MCSR items are valuable resources in a twenty-nine question test, and certain null distractors could be reassigned to other misconceptions to enhance misconception-to-misconception correlation. Data is presented in a new format that enhances analysis of individual student beliefs. The tables I used differ from those in this part; they were hand generated on graph paper. The symmetric, uniform cell sizes enhance the presented format's utility, as does being able to put the entire graph on one sheet of paper.
Chapter 14 : FCI Table Modifications and Data
D. Hestenes, M. Wells and G. Swackhamer, "Force Concept Inventory," The Physics Teacher Vol. 30, 141-157 (1992)
In this chapter, I present the Cabrillo Community College data in the manner used by the original FCI paper. I reprint AVH data from that paper for comparison purposes, but do not reprint data from the remaining five groups. The focus of this paper is the initial knowledge state of the student. This paper does not seek to compare the different instructional strategies, nor does it seek to develop a teacher competence ranking. For this last reason, Table IV is omitted from this paper. For the above reasons, post-instructional data is unnecessary, and could have been counterproductive to the development of this paper. Tables III & V reflect the lack of post-instructional data.
A reader who compares the structures of Tables I, II, and V between this paper and the original FCI paper will note some modifications. The minor change is the use of italics to note multiple appearances of a single inventory item in Tables I and II rather than the original paper's incomplete use of parentheses. The major change is the mutual consistency between Tables I and II on the one hand, and Table V on the other. I inserted eleven inventory items into, and removed two from, Table II. I inserted nine Table I and two Table II inventory items into Table V. I also removed one inventory item from Table V. My primary goal in these alterations was to attain consistency between all three tables. My secondary goal was to enhance the possibility of detecting the hoped-for internal connections in a student's alternative belief structure.
The Table II changes are: 16A inserted at line I1; 6D inserted at I2; 8E and 10C at I3; 27B and 27D at I4; 4C and 10D at I5; 15D at Ob; 1D at G3; and 16C at G5. In Table II, I also removed 12B from line AF1 and the question mark from R1. The Table V changes are: R1 inserted in the grid locations 29A and 29B; [0] inserted in grid locations 23D, 24E and 25B; [2] inserted in 7E; [4] inserted in 9D, 18B and 28C; [5F] inserted in 22D; and [5G] inserted in 18B.
In Table V, I also removed [5S] from grid location 22D. Unfortunately, formatting issues encourage me to present the explanation for Table V here instead of on the table itself. The far left column indicates the question number of the Force Concept Inventory diagnostic test. Columns A, B, C, D, E are the multiple choice alternatives (items) for each question using the codes in Tables I & II. The correct (Newtonian) answer, expressed in Table I code, is enclosed in square brackets. The common sense alternative choices are classified by the Table II code. In each grid, the percent frequency distribution of the students' answer (item) choices is given for the pretest (upper row) and the posttest (lower row). The groups from left to right are ASU PHY 105, Cabrillo Community College Phys 11, and Cabrillo Community College Phys 4A. All numbers are percentages; they may not add up to 100% per question per group because some students did not answer all twenty-nine questions. Post-instruction testing for both Cabrillo Community College groups was not performed.
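The grid entries described above are simple percent frequencies. As a minimal sketch of how one row of such a grid could be tallied, the following assumes each student's answer to a question is stored as a letter A to E, or None for a blank; the function name and the ten sample answers are hypothetical and are not thesis data.

```python
from collections import Counter

CHOICES = "ABCDE"

def percent_distribution(responses):
    """Percent of students choosing each alternative on one question.

    responses holds one entry per student: a letter A-E, or None when
    the student left the question blank. Percentages are taken over
    all students, so a row with blanks sums to less than 100.
    """
    n_students = len(responses)
    tally = Counter(r for r in responses if r is not None)
    return {c: round(100 * tally[c] / n_students) for c in CHOICES}

# Hypothetical pretest answers of ten students to one question.
answers = ["A", "B", "B", "C", None, "B", "E", "A", "B", "C"]
row = percent_distribution(answers)
print(row)
print(sum(row.values()))  # below 100 because one student left a blank
```

Dividing by the number of students rather than the number of answers is the design choice that makes a row fall short of 100% when questions are skipped, as noted in the text.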
Table I: FCI Modified, Newtonian Concepts
Concept: Inventory Item
0. Kinematics
   Velocity discriminated from position: 20E
   Acceleration discriminated from velocity: 21D
   Constant acceleration entails parabolic orbit: 23D, 24E
   Constant acceleration entails changing speed: 25B
   Vector addition of velocities: 7E
1. First Law
   With no force: 4B, 6B, 10B
   With no force and velocity direction constant: 26B
   With no force and speed constant: 8A, 27A
   With canceling forces: 18B, 28C
2. Second Law
   Impulsive force: 6B, 7E
   Constant force implies constant acceleration: 24E, 25B
3. Third Law
   For Impulsive forces: 2E, 11E
   For continuous forces: 13A, 14A
4. Superposition Principle
   Vector sum: 19B
   Canceling forces: 9D, 18B, 28C
5. Kinds of Force
   5S. Solid contact
      Passive: 9D, 12B, 12D
      Impulsive: 15C
      Friction opposes motion: 29C
   5F. Fluid contact
      Air resistance: 22D
      Buoyant (air pressure): 12D
   5G. Gravitation
      Gravitation: 5D, 9D, 12B, 12D, 17C, 18B, 22D
      Acceleration independent of weight: 1C, 3A
      Parabolic trajectory: 16B, 23
Table II: FCI Modified, A Taxonomy of Misconceptions Probed by the Inventory. Presence of the Misconception is suggested by Selection of the corresponding Inventory Item.
Misconception: Inventory Item
Kinematics
   K1. position-velocity undiscriminated: 20B, 20C, 20D
   K2. velocity-acceleration undiscriminated: 20A, 21B, 21C
   K3. nonvectorial velocity composition: 7C
Impetus
   I1. impetus supplied by "hit": 9B, 9C, 16A, 22B, 22C, 22E, 29D
   I2. loss/recovery of original impetus: 4D, 6C, 6D, 6E, 24A, 26A, 26D, 26E
   I3. impetus dissipation: 5A, 5B, 5C, 8C, 8E, 10C, 16C, 16D, 23E, 27C, 27E, 29B
   I4. gradual/delayed impetus building: 6D, 8B, 8D, 24D, 27B, 27D, 29E
   I5. circular impetus: 4A, 4C, 4D, 10A, 10D
Active Force
   AF1. only active agents exert force: 11B, 13D, 14D, 15A, 15B, 18D, 22A
   AF2. motion implies active force: 29A
   AF3. no motion implies no force: 12E
   AF4. velocity proportional to applied force: 25A, 28A
   AF5. acceleration implies increasing force: 17B
   AF6. force causes acceleration to terminal velocity: 17A, 25D
   AF7. active force wears out: 25C, 25E
Action/Reaction Pairs
   AR1. greater mass implies greater force: 2A, 2D, 11D, 13B, 14B
   AR2. most active agents produce greatest force: 11D, 13C, 14C
Concatenation of Influences
   CI1. largest force determines motion: 18A, 18E, 19A
   CI2. force compromise determines motion: 4C, 10D, 16A, 19C, 19D, 23C, 24C
   CI3. last force to act determines motion: 6A, 7B, 24B, 26C
Other Influences on Motion
   Cf. centrifugal force: 4C, 4D, 4E, 10C, 10B, 10E
   Ob. obstacles exert no force: 2C, 9A, 9B, 12A, 13E, 14E, 15D
Resistance
   R1. mass makes things stop: 23A, 23B, 29A, 29B
   R2. motion when force overcomes resistance: 28B, 28D
   R3. resistance opposes force/impetus: 28E
Gravity
   G1. air pressure-assisted gravity: 9A, 12C, 17E, 18E
   G2. gravity intrinsic to mass: 5E, 9E, 17D
   G3. heavier objects fall faster: 1A, 1D, 3B, 3D
   G4. gravity increases as objects fall: 5B, 17B
   G5. gravity acts after impetus wears down: 5B, 16C, 16D, 23E
Table III: FCI Inventory Scores
Group Mean Inventory Pretest Number of Students
AVH 34% (14%) 116
PG 35% (14%) 33
JM 59% (22%) 43
Percentages in parentheses are standard deviations, assuming a normal distribution, which only roughly approximates the data.
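The mean and standard deviation columns of Table III can be obtained from raw scores as in the following minimal sketch. It assumes each student's score is the number of correct answers out of the twenty-nine FCI questions, and it assumes the sample standard deviation (n - 1 in the denominator), which the text does not specify. The function name and the sample scores are hypothetical and are not thesis data.

```python
from statistics import mean, stdev

N_QUESTIONS = 29  # number of FCI questions stated in this chapter

def score_summary(raw_scores):
    """Return (mean percent, standard deviation in percent) for a class.

    raw_scores holds the number of correct answers for each student.
    stdev() is the sample standard deviation; this choice is an assumption.
    """
    percents = [100 * s / N_QUESTIONS for s in raw_scores]
    return mean(percents), stdev(percents)

# Hypothetical raw scores for a small class.
class_mean, class_sd = score_summary([8, 10, 12, 15, 20])
print(f"{class_mean:.0f}% ({class_sd:.0f}%)")
```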
Chapter 15 : MPEX Data on CCC Physics 2B and CCC Physics 10
15.1 : MPEX, Template and Expansion
E. Redish, J. Saul, and R. Steinberg, "Student expectations in introductory physics," Am. J. Phys. 66(3), 212-224 (1998)
In this chapter, I present Cabrillo Community College data in both the manner used in the original MPEX paper and in a new format that preserves question and student responses. I reprint Expert and TYC data from Redish et al. for comparison purposes but do not reprint data from the remaining three calibration groups nor the five additional institutions. The original MPEX paper is interested in the effect of instruction on student attitudes and so needs post-instructional data. This paper is interested in the initial knowledge state of students, and thus has no such need. I followed the original paper's procedure by collapsing the five-response Likert scale into two responses for analysis, although the analysis possible from ET1 and ET2 would be enhanced by retaining the actual Likert numbers, as this would allow analysis of belief strength. Also, my previous education class in rubric design and use strongly advised using an even-numbered scale (four or six choices instead of five) to prevent fence sitting.
Table IV and Fig. 2(b) are matched to the corresponding Redish et al. presentations. ET1 is a hierarchical table of student responses for the fifteen students of group JM. The least favorably responded-to question is at the top; the most favorably responded-to question is at the bottom. The student who most agrees with expert opinion is on the left; the student who least agrees with expert opinion is on the right. Blank squares are responses by the student which are in agreement with the expert opinion; actual non-responses are blackened squares. Letters A and D represent disagreement with expert opinion on that question. ET2 follows the same patterns as ET1 for the thirty-six students of group FK.
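The ET1 and ET2 layout described above can be sketched in code. This is a minimal illustration, not the hand procedure used for the graph-paper tables: it assumes answers are coded 1 to 5 with 4 and 5 meaning agree and 1 and 2 meaning disagree, treats a neutral 3 the same as a missing answer, and uses hypothetical item labels, student labels, and responses.

```python
def collapse(likert):
    """Collapse a 1-5 Likert answer to 'A' (agree), 'D' (disagree) or None."""
    if likert in (4, 5):
        return "A"
    if likert in (1, 2):
        return "D"
    return None  # neutral (3) or no answer

def build_table(responses, expert):
    """Return (item_order, student_order, cells) for an ET-style table.

    responses: {student: {item: Likert number or None}}
    expert:    {item: 'A' or 'D'}, the response experts favor
    cells[item][student] is '' for agreement with the expert, '#' for a
    neutral or missing answer, and 'A' or 'D' for active disagreement.
    """
    cells = {}
    for item, favorable in expert.items():
        cells[item] = {}
        for student, answers in responses.items():
            answer = collapse(answers.get(item))
            if answer is None:
                cells[item][student] = "#"
            elif answer == favorable:
                cells[item][student] = ""
            else:
                cells[item][student] = answer

    def item_agreement(item):
        return sum(1 for mark in cells[item].values() if mark == "")

    def student_agreement(student):
        return sum(1 for item in cells if cells[item][student] == "")

    # Least agreed-with item first (top row); most expert-like student first (left).
    item_order = sorted(expert, key=item_agreement)
    student_order = sorted(responses, key=student_agreement, reverse=True)
    return item_order, student_order, cells

# Hypothetical expert key and responses (not thesis data).
expert = {"Q1": "D", "Q2": "A", "Q3": "D"}
responses = {
    "S1": {"Q1": 1, "Q2": 5, "Q3": 2},
    "S2": {"Q1": 4, "Q2": 4, "Q3": 3},
    "S3": {"Q1": 5, "Q2": 2, "Q3": None},
}
item_order, student_order, cells = build_table(responses, expert)
for item in item_order:
    print(item, [cells[item][s] or "." for s in student_order])
```

Keeping the raw Likert numbers in `responses` and collapsing only at display time leaves room for the belief-strength analysis mentioned above.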
Table IV: MPEX Percentages of students giving favorable / unfavorable responses on overall and clusters of the MPEX survey at the beginning (pre) and end (post) of the first unit of university physics.
Group     Overall  Independence  Coherence  Concepts  Reality Link  Math Link  Effort
Expert    87/06    93/03         85/12      89/06     93/03         92/04      85/04
TYC pre   55/22    41/29         50/21      30/42     69/16         58/17      80/08
TYC post  49/26    42/32         48/29      35/41     58/17         58/18      65/21
JM pre    67/33    50/50         75/25      69/31     88/12         53/47      61/39
JM post
FK pre    52/34    40/52         44/37      39/40     76/16         46/32      70/17
FK post
Expert is defined in the original MPEX paper. TYC is a Community College reported in the original paper. JM is Cabrillo Community College (CCC) Phys. 2B and FK is CCC Phys. 10
15.2 : Commentary on JM Data in Parallelism with Redish et al. Using ET1
The Independence cluster in Redish et al. specifically notes survey items #1 (SI#1) and #14 (SI#14). The expert group disagreed with SI#1 at 100% and disagreed with SI#14 at 84%. The JM group disagreed with SI#1 at only 36% and also disagreed with SI#14 at 36%. As ET1 shows, SI#1 and SI#14 were the two survey items on which the expert and JM groups were in least agreement. ET1 contains a wealth of information. Some examples follow. Student #10 (S10) is in 100% disagreement with experts on the Independence cluster, as seen by highlighting SI#1, SI#14, SI#13, SI#27, SI#8, and SI#17. S6, S13 and S2 are in disagreement with the expert opinion 67% of the time, with an additional six students disagreeing 50% of the time. There is general correlation between Overall and Independence cluster student scores, with S4 only disagreeing once and S10 disagreeing all six times. SI#17 is the only easy item in this cluster, with the other five survey items all in the upper third of ET1. In individual percentage terms, S7 and S1 had forty percent of their disagreement with experts in this cluster (2 out of 5 each).
The Coherence cluster in Redish et al. specifically notes survey items SI#21 and SI#29. The expert group disagreed with SI#21 at 85%; group JM disagreed at 93%. SI#29 is course-specific; it is dependent on the availability of formula sheets or books for exams. In mirror image to the Independence cluster, four of the five survey items (SI#16, SI#29, SI#15, and SI#21) are in the lower half of ET1, showing good correlation between expert and JM groups. Only SI#12 is in the upper third, showing strong disagreement. There is less correlation between Overall and Coherence cluster scores than was the case for the Independence cluster, with S15 very much in disagreement with expert opinion (4 out of 5 not matching expert opinion), whereas S10 is in disagreement only 1 out of 5 times. Only two students (S15 and S14) disagree with expert opinion more than 50% of the time.
Other than S15, S9 is the only person with more than 20 percent of his individual disagreement with experts in this cluster (2 out of 8).
The Concepts cluster in Redish et al. specifically notes SI#4 and SI#19. TYC is highlighted as having a pre-instruction percentage of 16 in agreement with expert opinion on SI#4. JM group is at 80%. SI#19 is distinguished from SI#4 only by the words "most crucial". The Concepts survey items (SI#19, SI#27, SI#4, SI#32, and SI#26) are spread out. SI#19 is quite difficult at 06/09, and SI#26 is one of the easiest at 14/01. It's rather amazing that while no student agreed with expert opinion completely, eleven out of fifteen students agreed at the 80% level. Three students (S2, S6, and S10) strongly disagreed with expert opinion. Twenty-five percent of S6's disagreement with expert opinion occurred in this cluster (4 out of 16).
The Reality Link cluster in Redish et al. is agreed upon at the 93% level by experts. JM group comes close, agreeing at the 88% level; half of all disagreement with expert opinion came in SI#22. Even so, all survey items are in the lower half of ET1, with two of the four Reality Link items being amongst the easiest at 14/01. It is interesting to note that what disagreement exists is dispersed amongst the students.
The Math Link cluster in Redish et al. makes note of SI#2. Group UMN had a 20/48 response in percentage form. In percentage form, group JM had a 33/67 response to SI#2. Within one place, all survey items of the Math Link cluster are in the upper half of ET1, with both SI#2 and SI#6 among the most difficult at 05/10. Four students (S3, S13, S14, and S10) agree with experts only 20% of the time. S4 and S7 agree with experts only 50% and 60% of the time respectively. Math Link is fully 50% of all disagreements S4 has with expert opinion; for S7 it is 40%.
One student, S1, is in absolute agreement with expert opinion and three others (S9, S5, and S12) are in excellent (80%) agreement.
The Effort cluster in Redish et al. focuses primarily on the severe lowering of favorable response due to instruction. It does mention that experts are in agreement at the 85% level. JM group is in agreement only at the 61% level pre-instruction. The five survey items are quite spread out in student agreement, with SI#6 only at 05/10 and SI#3 at a strong 13/02. Six students (S3, S2, S13, S6, S14, and S10) agree with expert opinion only at the 50% level. Two students (S7 and S9) are in full agreement with expert opinion.
This method of arranging a table, highlighting certain rows of information (survey items for a specific cluster), and looking for correlations along a student's column helps find outliers which can otherwise be overlooked. S10 has complete (6/6) disagreement with expert opinion on the Independence cluster. S15 has a high (4/5) disagreement with expert opinion on the Coherence cluster. S6 has an equally high (4/5) disagreement on the Concepts cluster. These are examples of uniqueness which would be worth an instructor's time to address individually. Conversely, S1 is in full agreement with expert opinion in the Math Link cluster, and thus she is a candidate for being a peer tutor to someone like S3 who, while in severe disagreement (4/5) with expert opinion in the Math Link cluster, is in general agreement (23/11) overall.
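The row-highlighting method just described amounts to two counts per student: how many cluster items the student missed, and what share of the student's total disagreement falls in that cluster. The following is a minimal sketch of that bookkeeping. The function names, the 0.8 threshold (chosen to match the 4/5 cases called "high" above), and the sample data are all hypothetical.

```python
def cluster_profile(agrees, cluster):
    """Summarize one student's disagreement with experts on one cluster.

    agrees:  {item: True if the student matched expert opinion, else False}
    cluster: item labels belonging to the cluster
    Returns (misses_in_cluster, cluster_size, share_of_all_misses); the
    share is None for a student with no disagreements at all.
    """
    cluster = list(cluster)
    misses_in_cluster = sum(1 for item in cluster if not agrees[item])
    total_misses = sum(1 for matched in agrees.values() if not matched)
    share = misses_in_cluster / total_misses if total_misses else None
    return misses_in_cluster, len(cluster), share

def flag_outliers(students, cluster, threshold=0.8):
    """Return students whose miss rate within the cluster is at least threshold."""
    flagged = []
    for name, agrees in students.items():
        misses, size, _ = cluster_profile(agrees, cluster)
        if misses / size >= threshold:
            flagged.append(name)
    return flagged

# Hypothetical agreement records for two students on five items.
students = {
    "S_a": {"Q1": False, "Q2": False, "Q3": True, "Q4": True, "Q5": False},
    "S_b": {"Q1": True, "Q2": True, "Q3": True, "Q4": False, "Q5": True},
}
cluster = ["Q1", "Q2"]
print(cluster_profile(students["S_a"], cluster))
print(flag_outliers(students, cluster))
```

A student flagged here is a candidate for individual attention; a student with zero misses in the same cluster is a candidate peer tutor, in the sense suggested above.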
15.3 : Commentary on FK data in Parallelism with Redish et al. Using ET2
The Independence cluster in Redish et al. specifically raises SI#1 and SI#14. The expert group disagreed with SI#1 at 100% and disagreed with SI#14 at 84%. The FK group disagreed with SI#1 at only 8% and disagreed with SI#14 at 17%. As ET2 shows, SI#1 was the item with the least agreement between the expert group and the FK group. SI#14 was fifth out of thirty-four in non-congruence between the groups. ET2 contains a wealth of information. Some examples follow. S2 and S30 are in active disagreement with expert opinion; S10 is also not in agreement, but his actual opinions are vague. Seven out of thirty-six students fail to agree with expert opinion 83% of the time. Of these seven, S3 and S16 have one third of their total disagreement with expert opinion in this cluster. In fact, only ten of thirty-six students agree with expert opinion more than 50% of the time in this cluster, with S12 and S22 the only two in complete agreement. This is particularly interesting in S22's case, as that student was only in agreement with expert opinion nineteen out of a possible thirty-four times. In personal percentage terms, S24 and S17 were the most impacted, with approximately 40% of their total disagreement with expert opinion in this cluster. SI#17 is the only item in this cluster to be in the lower (agreeable) half of ET2. All other survey items (SI#1, SI#14, SI#27, SI#8, and SI#13) are in the upper (disagreeable) half. ET2's black squares are neutral or non-responses by survey takers. It's interesting to note that disagreeable SI#1 had no black squares, whereas agreeable SI#17 had two! The most black squares were six, for SI#27.
The Coherence cluster in Redish et al. specifically notes SI#21 and SI#29. The expert group disagreed with SI#21 at 85%. Group FK disagreed at only 47%. SI#29 is course-specific; it is dependent on the availability of formula sheets or books for exams.
Survey items are fairly spread out, ranging from SI#12 at six agreements to SI#15 at twenty-seven agreements out of a possible thirty-six. There were significant black box impacts on SI#12 at eleven and SI#16 at ten. No student completely agreed with expert opinion in this cluster, although S36, S25, S14, S15, S23, S9, and S10 all failed to agree only because of neutral responses or no decision being made (black boxes). S4 is in complete and active disagreement with expert opinion in this cluster. S3, S10, and S6 fail to agree with the experts on any choice, primarily by passive black boxing. In personal percentage terms, S12 had 33% of her personal disagreement with expert opinion in this cluster and both S11 and S8 had approximately 25% of the same.

The Concepts cluster of Redish et al. specifically notes SI#4 and SI#19. TYC is highlighted as having a pre-instruction percentage of 16 in agreement with expert opinion on SI#4; FK group is at 11%. SI#19 is distinguished from SI#4 only by the words "most crucial". Given the similarity in these two items, it's interesting to note that no student agreed with expert opinion on both items. No student was in complete concordance with expert opinion, nor was any student in complete active disagreement. S34 chose not to respond to any of the five items in this cluster, which is not unique for this student as he chose not to respond twenty-two times. Similarly, S22, S9, S10, and S6 chose to ignore the majority of items in this cluster. Student 22 may be worth interviewing, as she was in overall agreement with the expert opinion (19 agreements out of 34 possible) and black boxed eleven responses, four of which were in this cluster (36%). S31 and S5 stand out by disagreeing with expert opinion 80% of the time. This is particularly interesting in S5's case, as all of her active disagreements with expert opinion are in this cluster.
Eleven out of thirty-six students agree with expert opinion more than 50% of the time, with only two of these in agreement at the 80% mark. SI#32 and SI#26 were noticeably more likely to have student-expert concordance (25 out of 36), and SI#4 and SI#9 were much more likely to have student-expert discordance (~5 out of 36).

The Reality Link cluster in Redish et al. is agreed on at the 93% level by experts. FK group agrees at the 76% level. The four survey items for the Reality Link (SI#22, SI#18, SI#10, and SI#25) are all in the most agreeable (lowest) third of all survey items. Almost all active disagreement is by the right-most one third of students (17 out of 23), as is almost all black boxing (11 of 12). The only student who stands out from this pattern is S3, who agrees with the experts only 50% of the time in this cluster yet is in the left-hand half of ET2 with a 56% overall agreement with experts.

The Math Link cluster in Redish et al. specifically notes SI#2. Group UMN had a 20/48 response in percentage form. In percentage form, group FK had a 25/50 response to SI#2. Within two places, all the Math Link items are in the upper (disagreeable) half of ET2. The three students (S24, S11, and S17) who agree most often with expert opinion totally agree with expert opinion in this cluster. While there is no absolute active disagreement, four students (S34, S30, S6, and S21) fail to agree at all with expert opinion. S31, S1, S20, S15, and S19 agree with expert opinion only once in this cluster. In all, only fourteen out of thirty-six students agree with experts more than 50% of the time. Group FK was a conceptual physics class, so the comments in Redish et al. regarding high school physics courses in the Math Link cluster are relevant. An expert (H.S. teacher) level of 67% is more realistic than the expert (Group 5) level of 92% for comparison purposes.
The Effort cluster in the original paper focuses primarily on the severe lowering of favorable responses due to instruction. It does mention that expert agreement is at the 85% level. FK group is in agreement at the 70% level pre-instruction. The five survey items in the Effort cluster are all in the lower (agreeable) two thirds of all possible survey items. It's interesting to note that both S14 and S16 agree with expert opinion nineteen times but have very different individual responses in this cluster. S14 agrees with expert opinion only once, whereas S16 agrees with expert opinion completely. It may be worth the instructor's time to get these two students involved in a discussion on effort expectations in a physics course. Further, S33 is a useful peer tutor on this cluster because of her oddity: she is in less than 50% overall agreement with expert opinion and yet in complete 100% agreement with expert opinion in this cluster. It would be worth the instructor's consideration to use her as a peer tutor on this issue, particularly with her near neighbors in overall agreement, S20 and S2. Combating the negative 18% in Redish et al. requires effective focused effort, and similar co-workers may have insight on this subject unavailable to the instructor.

As mentioned in the opening paragraph of this chapter, more information on the strength of students' views, rather than their mere existence, could be obtained by retaining the actual Likert numbers rather than converting to a quasi-binary format of agree-disagree (black boxes). The allowance of neutrality [Likert response 3] is a flaw in the methodology. My educational classes for my secondary school math credential emphasized the need for rubrics to be even numbered, normally four, occasionally six, precisely to prevent the purely neutral response.
This is to make people commit; a six-point rubric (strongly agree, agree, slightly agree, slightly disagree, disagree, strongly disagree) would probably work best for the MPEX.

The previous information was representative of what the MPEX can provide for a physics instructor regarding the initial knowledge state of the student. Beyond the student, classes and specific categories of classes have initial knowledge composites. While it behooves an instructor to be aware of possible the MPEX can provide some comparison and contrast between, in this case, the second class in a calculus-based introductory sequence (group JM) and a stand-alone conceptual physics class (group FK).
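The conversion from Likert numbers to the quasi-binary format, and the effect of an even-numbered rubric, can be illustrated with a minimal sketch. The function names and the numeric coding (1 = strongly disagree up to 5 or 6 = strongly agree) are assumptions made for illustration; they are not taken from the MPEX papers.

```python
# Hypothetical helper names; the numeric coding is an assumed convention:
# five-point scale 1..5, six-point scale 1..6, low = disagree, high = agree.

def collapse_five_point(response, expert_agrees):
    """Collapse a five-point Likert response to favorable/unfavorable/neutral.

    A response of 3 (or a skipped item, passed as None) becomes "neutral",
    which corresponds to a black box in ET1 and ET2.
    """
    if response is None or response == 3:
        return "neutral"
    student_agrees = response >= 4
    return "favorable" if student_agrees == expert_agrees else "unfavorable"


def collapse_six_point(response, expert_agrees):
    """Collapse a six-point response; only a skipped item can be neutral."""
    if response is None:
        return "neutral"
    student_agrees = response >= 4  # 4, 5, 6 = slightly agree .. strongly agree
    return "favorable" if student_agrees == expert_agrees else "unfavorable"


# Example: experts disagree with the item, so student disagreement is favorable.
five_point_answers = [1, 2, 3, 3, 5]
codes = [collapse_five_point(r, expert_agrees=False) for r in five_point_answers]
print(codes)
```

With the six-point version, a student who would have answered 3 on the five-point scale is forced to choose between slightly agree and slightly disagree, so the only remaining black boxes are skipped items.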
15.4 : Comparison and Contrast Between Groups JM and FK
Beyond the individual stands the initial knowledge composite of a class. At the real risk of discriminating, an instructor would still do well to prepare for the reality of the class he is to face. Only the roughest of comparisons is possible from Table IV in part 1 of this chapter. The following are some example comparisons between group JM, a calculus-based second course in a four-course sequence, and group FK, a non-mathematical isolated course.

Independence survey items track each other, with SI#1, SI#14, SI#27, SI#8, and SI#17 in order from least in agreement with expert opinion to most. The exception is SI#13. Experts disagreed with the statement "My grade in this course is primarily determined by how familiar I am with the material. Insight or creativity has little to do with it." Only seven out of fifteen students in group JM disagreed with the statement and thus agreed with expert opinion at a 47% rate. Eighteen out of thirty-six students from group FK disagreed with this statement and thus agreed with expert opinion at the 50% rate. So what has happened is that SI#13 stayed in place in percentage terms and all five other survey items shifted in percentage around it. SI#1, the top item (least in agreement with expert opinion) for both groups JM and FK, shifted in percentage terms from 20% for JM to 8% for FK; SI#14 from 20% to 17%; SI#27 from 53% to 36%; SI#8 from 60% to 42%; and SI#17 from 87% to 86%. So even though in percentage terms SI#17 remained constant between groups, its relative position did not. It has four items below it, and seven sharing its ranking, in group JM, whereas in group FK it has one item below it and no shared ranking. Thus, while the table formats used in ET1 and ET2 are very useful to show data within a class, additional information such as percentage lines might help make inter-class comparisons more reliable. So, SI#27 and SI#8 are the main differentiators between these two groups, with both percentage and relative changes.
More non-math students (group FK) than calculus students (group JM) believe, in opposition to expert opinion, that "understanding physics basically means being able to recall something you've read or been shown." More non-math students (group FK) than calculus students (group JM) believe, in opposition to expert opinion, that "In this course, I do not expect to understand equations in an intuitive sense; they just have to be taken as givens."

Coherence survey items do not track well, as the only constant relative position is SI#12, which is at the top for both groups. SI#12, however, has a large percentage shift from 40% agreement with expert opinion for group JM (calculus) to 17% for group FK (non-math). Thus almost all of a non-math class (group FK) believes, in disagreement with expert opinion, that "knowledge in physics consists of many pieces of information each of which applies primarily to a specific situation," whereas a substantial portion of a calculus class does agree with expert opinion that the above quote is false. The shifts for the other survey items are, going from JM to FK by SI number: SI#16 from 73% to 53%, SI#29 from 80% to 31%, SI#15 from 87% to 75%, and SI#21 from 93% to 47%. The largest percentage changes between classes in this cluster are SI#29 and SI#21. FK students are two to three times as likely as JM students to believe that it will be "a significant problem in this course [to be] able to memorize all the information I need to know." Worse, the same ratio applies to their relative beliefs that: "If I came up with two different approaches to a problem and they gave different answers, I would not worry about it; I would just choose the answer that seemed most reasonable!" This last is a statement accepted in defiance of expert opinion by more than half of the non-math class, whereas only one person out of fifteen held this view in the calculus class.
Concept survey items do hold their relative position with the exception of survey item number four. SI#4 went from the middle of group JM to the top of group FK. Its percentage shifted from 80% to 11%. Experts and group JM strongly disagree with the following statement; group FK just as strongly agrees that: "problem solving in physics basically means matching problems with facts or equations and then substituting values to get a number." Percentage shifts from JM to FK for the remaining items are: SI#19 from 40% to 14%, SI#27 from 53% to 36%, SI#32 from 80% to 69%, and SI#26 from 93% to 69%. Other than SI#4, the most significant percentage shifts were in SI#19 and SI#26. Experts disagree with the upcoming statement, as do 40% of calculus-based students (group JM) and 14% of non-math students (group FK): "The most crucial thing in solving a physics problem is finding the right equation to use." Experts agree with the upcoming statement, as do 93% of group JM and 69% of group FK: "When I solve most exam or homework problems, I explicitly think about the concepts that underlie the problem."

Reality survey items do not hold their relative positions but are so close to one another that it doesn't matter. The percentage spread from top to bottom for group JM is 13%, for group FK 12%. Group JM averages 12% above group FK. The only survey item to merit individual attention is SI#18, which has both the biggest relative shift, last to second, and the largest percentage shift at 21%. Experts and group JM agree with the following statement, as do about three-fourths of group FK: "To understand physics, I sometimes think about my personal experiences and relate them to the topic being analyzed." Apparently, one in four non-math students does not either 1) relate, 2) think, or 3) have relevant personal experiences. I suspect the third.
Today in class roughly twenty students had no idea what the basic purpose of a pulley is, did not know how it actually works, and had never used one before.

Math survey items hold their relative position excepting SI#6. In relative position SI#6 goes from second most in disagreement with expert opinion for group JM to most in agreement with expert opinion for group FK. In percentage shift, it goes from 33% for JM to 61% for FK, a reversal of the normal trend in which JM agrees more with expert opinion than FK does. This is particularly noteworthy as JM is the math-sophisticated group and FK is not. Experts agree, as do most of group FK, with the following statement; group JM, however, disagrees strongly that: "I spend a lot of time figuring out and understanding at least some of the derivations or proofs given either in class or in the text." As several hypothetical reasons for this outcome come to mind, some individual student interviews from the majority opinion in group JM would be illuminating to me were I the instructor. The remaining survey item percentage shifts from JM to FK are: SI#2 from 33% to 25%, SI#8 from 60% to 42%, SI#20 from 67% to 47%, and SI#16 from 73% to 53%. SI#8, SI#20, and SI#16 all have a roughly 20% gap with, as normal, group JM in closer agreement with expert opinion. The smallest gap is for SI#2. Neither group has more than one out of three students agreeing with the expert opinion that the following statement should be disagreed with: "All I learn from a derivation or proof of a formula is that the formula is valid and that it is OK to use it in problems."

Effort survey items hold their relative position excepting survey item #6. SI#6 drops right in between SI#7 and SI#24, assuming SI#24 is on top for both groups, although SI#7 and SI#24 are equals for group JM. In mirror image to the Math cluster, the only survey item on which JM continues to agree more often than FK does with expert opinion is SI#24, going from 53% to 44%.
Experts, 53% of group JM, and 44% of group FK disagree with: "The results of an exam don't give me any useful guidance to improve my understanding of the course material. All the learning associated with an exam is in the studying I do before it takes place." For all other survey items in this cluster, the non-math novice students of group FK agreed more with the experts than did the students with a semester's worth of calculus-based physics under their belts, group JM. To keep the pattern, percentage shifts still go from JM to FK: SI#6 from 33% to 61%, SI#7 from 53% to 67%, SI#31 from 80% to 81%, and SI#3 from 87% to 97%. My choice of the JM to FK pattern made earlier is now counterintuitive; the more math knowledge and experience a student has, the less likely that student is to agree with expert opinion, which matches the information in Redish et al. SI#6 is by far the most extreme example of this phenomenon, and was discussed and incorporated into the Math cluster above.

The MPEX can provide insight and information to an instructor on the initial knowledge state of his students. Constructivists argue that students learn; teachers only facilitate the process. Thus, it is as important to know who is learning as it is to know what they should be studying.
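The percentage shifts quoted in this section for the Independence cluster can be tabulated and ranked with a few lines of code. This is only a sketch: the percentages are the ones quoted in the text, while the variable and function names are hypothetical.

```python
# Percent agreement with expert opinion (group JM, group FK), as quoted in
# the text for the Independence cluster.
independence = {
    "SI#1": (20, 8),
    "SI#14": (20, 17),
    "SI#27": (53, 36),
    "SI#8": (60, 42),
    "SI#13": (47, 50),
    "SI#17": (87, 86),
}


def shifts(cluster):
    """Return the FK-minus-JM shift, in percentage points, for each item."""
    return {item: fk - jm for item, (jm, fk) in cluster.items()}


def largest_shifts(cluster, n=2):
    """Return the n items with the largest absolute shift between groups."""
    by_item = shifts(cluster)
    return sorted(by_item, key=lambda item: abs(by_item[item]), reverse=True)[:n]


print(shifts(independence))
print(largest_shifts(independence))
```

Ranking by absolute shift picks out SI#8 and SI#27, the same two items named in the text as the main differentiators between the groups, while SI#13 and SI#17 barely move.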
Chapter 16 : MB Data and Concentration Analysis on CCC Physics 4C
16.1 : MB Paper Template and Expansion
D. Hestenes and M. Wells, "Mechanics Baseline Test," The Physics Teacher Vol. 30, 159-166 (1992)

In this chapter, I present the Cabrillo Community College data in three separate formats: first, in the manner of the original MB paper; second, in a Concentration Analysis patterned after Bao and Redish's paper; third, in a new format that preserves question and student responses. The original MB paper spends a large part of its effort correlating MB data to post-instruction FCI data. I do not perform this comparison as it does not advance my focus on the initial knowledge state of the student. While the MB is most often given as a post-instruction test, the authors allow that it can be used as a "pre-instruction placement exam" for "advanced university courses." Group CF is such a class; it is the Cabrillo Community College 4C class, the third class in a calculus-based introductory sequence for scientists and engineers. Only four questions are specifically addressed in Hestenes and Wells. Questions 4 and 5 are labeled as especially significant in that they reveal widespread deficiencies in the qualitative understanding of acceleration, and questions 20 and 22 are probes of the conservation of energy and momentum which "present difficulties even to advanced students." Table I in the original paper associates questions with specific topics. Unlike the FCI, the distracters for the MB are not "commonsense alternatives," although they include "typical student mistakes." The Table I presented here has two very minor corrections and uses italics instead of parentheses to show that a question involves more than one concept. The Table II presented here reprints AVH data and presents group CF's percentages.

Table I: MB Modified, Newtonian Concepts on the Mechanics Baseline. Each concept is involved in the corresponding question
Concept                        Question
A. Kinematics
  Linear Motion
    Constant acceleration      1, 2, 3
    Average acceleration       18, 23
    Average velocity           25
    Integrated displacement    24
  Curvilinear motion
    Tangential acceleration    4
    Normal acceleration        5, 8, 12
    a = v^2/r                  9, 12
B. General Principles
  First Law                    2
  Second Law                   3, 8, 9, 12, 18
  Dependence on mass           17, 21
  Third Law                    12, 13, 14
  Superposition Principle      5, 7, 13, 19
  Work-Energy                  20
  Energy conservation          10, 11
  Impulse-Momentum             16, 22
  Momentum conservation        15
C. Specific Forces
  Gravitational free-fall      6, 26
  Friction                     9
16.2 : Concentration Analysis on Group CF Data
L. Bao and E. Redish, "Concentration analysis: A qualitative assessment of student states," Am. J. Phys. Suppl. 69(7), S45-S53 (2001)

Concentration analysis measures the distribution of student responses to multiple-choice questions. This information provides insight into possible common incorrect models held by students and insight into whether a given question is effective in detecting student models. My main purpose in this section is to perform Bao and Redish's analysis so as to better understand their paper, which applies this analysis to the FCI. As discussed at length in Part III, I do not agree with some of Bao and Redish's paper. I feel that while this method may be useful for researchers confronted by a high volume of data, the average teacher would benefit more from the method presented in part 3 of this chapter. Still, being able to perform concentration analysis is a valuable skill for a person with a PER focus, and it does facilitate the ongoing dialogue in published work. Basically, we mathematically create a C, which, paired with the students' score S, can be used as a point on an S-C graph; alternatively, the S and the C can be binned to create a two-letter label that is then matched to some "implications of the patterns." Bao and Redish also advance a third variable called gamma, which is C without the offset due to S; I do not use gamma. As implied, S and C are not independent and the S-C graph has boundaries. The original paper also does a graphical shift analysis on S-C graphs showing both pre- and post-instructional data from tutorial and traditional classes. As my focus is on the initial knowledge state of the student, I do not perform graphical shift analysis. While it is not my goal to repeat Bao and Redish's paper here, there are a few more relevant highlights. C is a number from zero to one that shows how concentrated student responses are to a question, independent of correctness of choice.
Zero is random student response; each multiple-choice response (a b c d e) got picked by the same number of students. One is concentrated student response; all multiple-choice responses are unpicked except one, which was picked by everybody. In the equation for C, "m" is the number of different choices available. For the Mechanics Baseline, m = 5 for twenty-three questions and m = 4 for the remaining three questions. "N" is the number of students; in the case of group CF, this equals 42. There exists a small problem in the case of a non-response by a student; this is not addressed by Bao and Redish. The options are: increase m by one in all cases, as all students do have the implicit option of choosing not to respond to a question, or adjust N to match the number of students who responded to that specific question. In Solomonic fashion, I did both and averaged the results. Given the gross binning advanced in Table II of Bao and Redish, my choice affected only four of the twenty-six questions (Q22, Q02, Q03, and Q24). The boundaries of the S-C plots are no longer sharp; they vary slightly by question. The actual equation for C is equation 7 in Bao and Redish. The binning is in Table II of Bao and Redish or is marked on the axes of the S-C plot. A modified Table III is presented below that more accurately reflects the written commentary in Bao and Redish's paper. The score S is the fraction of students in the class who got that question correct, expressed in decimal form. Table IV presents the values for both S and C. It also provides the matching two-letter label. An S-C plot similar to Bao and Redish's Fig. 2(b), but using group CF's pre-instruction data, is presented. Finally, prior to its discussion in part 3 of this chapter, Table Alpha, the new data presentation format, is provided.
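The calculation described above can be sketched in a few lines of Python. The sketch assumes the commonly quoted form of the concentration factor, C = [sqrt(m) / (sqrt(m) - 1)] * [sqrt(sum of n_i^2) / N - 1 / sqrt(m)], where n_i is the number of students picking choice i; it should be checked against equation 7 of Bao and Redish before being relied on. The function names are hypothetical, the bin cut points (0.4 and 0.7 for S, 0.2 and 0.5 for C) are assumed values to be verified against their Table II, and the treatment of blanks is one reading of the two options described in the text.

```python
import math


def concentration_factor(counts):
    """Concentration factor for one question.

    counts[i] is the number of students who picked choice i; m = len(counts)
    and N = sum(counts). Returns 0 for an even spread and 1 when every
    student picks the same choice.
    """
    m = len(counts)
    n_total = sum(counts)
    if m < 2 or n_total == 0:
        raise ValueError("need at least two choices and one response")
    root_m = math.sqrt(m)
    length = math.sqrt(sum(n * n for n in counts))
    return (root_m / (root_m - 1.0)) * (length / n_total - 1.0 / root_m)


def concentration_with_blanks(counts, blanks):
    """Average of the two non-response treatments described in the text."""
    # Option 1: treat "no response" as one extra choice (m + 1), N = everyone.
    c_extra_choice = concentration_factor(list(counts) + [blanks])
    # Option 2: keep m, but let N be only the students who responded.
    c_responders = concentration_factor(counts)
    return 0.5 * (c_extra_choice + c_responders)


def bin_label(value, low_cut, high_cut):
    """Three-level coding: L below low_cut, M below high_cut, otherwise H."""
    if value < low_cut:
        return "L"
    if value < high_cut:
        return "M"
    return "H"


def sc_label(score, concentration):
    """Two-letter S-C label; cut points are assumed, not taken from the source."""
    return bin_label(score, 0.4, 0.7) + bin_label(concentration, 0.2, 0.5)


# Made-up example: 42 students, five choices, two blanks, choice A correct.
example_counts = [7, 3, 4, 5, 21]
example_blanks = 2
example_score = example_counts[0] / 42
example_c = concentration_with_blanks(example_counts, example_blanks)
print(round(example_score, 2), round(example_c, 2), sc_label(example_score, example_c))
```

The example counts are invented for illustration and are not group CF's data. Averaging the two treatments is why the boundaries of the S-C plot vary slightly from question to question: each question has its own number of blanks.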
Table III: CA Modified, Concentration Analysis - Implications of Patterns. Combining score (S) and concentration factor (C), we can code the student response on a single question with a response pattern when using the three-level coding system in Table II

Type        S  C  Implications of the Patterns
One model   H  H  one correct model ~student doing well
            M  H  [exists in Table IV]* ~student doing well
            L  H  common incorrect student model
Two model   H  M  (does not exist but bin cutoff dependent)
            M  M  two models, one of which is correct
            L  M  two models, both of which are wrong
Non model   H  L  (does not exist and is logically suspect)
            M  L  [exists in Table IV]*
            L  L  near random situation ~no models
*Table IV of Bao and Redish, but no commentary.
Fig 1
16.3 : Introduction and Discussion of Table Alpha
Table Alpha at the end of this part presents group CF's data in a nice visual layout. Highlighting a subset of questions can bring some fairly detailed information to light, as the following paragraphs make clear. The table can be worked both ways: information about students can be found by highlighting selected questions, and it is possible to study questions by highlighting subsets of students, such as those with minority status. Partitioning the table is also of value, particularly in assigning students to cooperative groups. Given the focus of this paper on the initial knowledge state of the student, we forgo question analysis here.

In the calculation sub-group (Q12, Q18, Q9, Q11, Q22, Q20, and Q21), thirty-seven out of forty-two students got less than 50%. While only two students got all the questions wrong, nine got all but one wrong. Conversely, no student got this sub-group completely correct and only one student got all right except one. The five best students (S41, S34, S31, S20, and S35) in this sub-group showed calculations on their test papers in seventeen out of thirty-five possible opportunities. Five of the worst students (S22, S17, S32, S16, and S15) in this sub-group showed calculations on their test papers in three out of thirty-five possible opportunities. The written calculations that existed for the better group showed eleven correct answers out of the seventeen calculations. The remaining eighteen possible opportunities showed thirteen correct answers despite no work being shown. For the worst group, all three calculations resulted in wrong answers. Of the remaining thirty-two possible opportunities, three were correct, a percentage result worse than random guessing. The problem with examining only test papers is that I can't be certain that no scratch paper was used, as the test was proctored by the class instructor and use of scratch paper was a subject we did not discuss. Question twelve was correctly answered twice.
I believe that in both cases this was the result of a lucky guess; no work was shown in either case and in both cases the companion question number eleven was wrong. Table I shows that question 12 addresses four different Newtonian concepts. For those few students that showed their work, it seems that the final nail in the coffin was forgetting that the centripetal force is added to the person's weight. For example, S41 wrote F = mv^2/r, substituted in F = 50 (20)/5, and circled "E", none of the above, as his answer. This shows an understanding of normal acceleration, a = v^2/r, and Newton's second law. It arguably shows a lack of understanding of the third law, but I suspect it's more a case of "I've found 'a' force so it must be 'the' force," and not bothering to look around for any other forces to perform vector summation with. Weight is written down on only one of the forty-two papers. On the other hand, choice "B" is very close to the weight alone with no centripetal force added to it. "B" was chosen by thirteen out of forty-two students, and the weight calculation is possible to do in one's head. Eight of these thirteen wrote nothing on their papers (half of all those who wrote nothing), and three more of these thirteen had written only a single formula with no numbers; for two of the three, the formula was F = mv^2/r. Q12 was atypical in that twenty-five students wrote something down on paper for an answer attempt. Proportionally, S25 was most impacted by this calculation sub-group; six of his eleven non-correct responses were in this sub-group. Of these, four were non-attempts at the four most difficult of this sub-group's questions. Conversely, S35 shines with only two of his thirteen failures in this sub-group. Still, the overall result of 34% passing is dismal. Each question can be so analyzed, but that is not the purpose of this paper. Focusing on the initial knowledge state of the student, it is fair to say the following.
First, the vast majority of students do not show calculations for those multiple-choice questions that are designed to reward such explicit effort. Without individual interviews it's impossible to say for sure, but given the literature comment about students seeking efficiency even at substantial cost to their physics understanding, I would venture that multiple-choice format tests encourage students to mentally do "just enough" to pick one of the given answers, after which the student moves on to the next question without completing his work, much less checking it. Second, students did significantly better on those calculation sub-group problems that both addressed only a single Newtonian concept and were not also part of the diagram or kinematics sub-groups (Q11, Q22, Q20, and Q21). Third, out of seventy-four blanks, forty-two are in the calculation group: 57% of all blanks in just 27% of all questions (seven out of 26). While it is a good test-taking strategy to skip the hard problems, one is supposed to come back to them at the end of the test and guess if one has to, particularly if some possible answers can be excluded. While eighteen of all blanks look as if they are the result of time pressure, the remaining fifty-six blanks indicate poor test-taking strategies on the part of fifteen students. Although student guessing means a bit more work for the researcher, students should know how to maximize their scores on multiple-choice tests. Such tests can have major impacts on their lives (think SAT, GRE, etc.) and explicitly teaching this skill should be part of physics' "hidden agenda".

Highlighting Q12, Q5, Q18, Q7, Q26, Q13, and Q19 brings the diagram sub-group into relief. Both the diagram and the calculation sub-groups have seven questions; one of the more stark differences between them is the comparative lack of blanks in the diagram sub-group: twenty, compared to the calculation group's forty-two.
Unfortunately, the overall success rate is even worse; the diagram sub-group was correctly answered only 31% of the time. The diagram sub-group consists of those questions for which force diagrams would facilitate the solution according to Hestenes and Wells' original paper. The fundamental problem is that students did not draw force diagrams on their test papers. The worst five students (all seven wrong) in this sub-group (S17, S38, S1, S37, and S7) drew two out of a possible thirty-five force diagrams. Five of the best students (two or three wrong), S41, S30, S5, S34, and S29, drew one out of a possible thirty-five force diagrams. To add insult to injury, all three diagrams were incorrect and resulted in incorrect answer choices. In a generally poor field, a couple of students manage to stand out. S7 had seven of his thirteen total failures in this sub-group. Conversely, S16 had only three of his fifteen total failures in this sub-group. While Q13 and Q15 managed a 50% pass rate, the three toughest questions (Q12, Q5, and Q18) are in this sub-group. Q5 has a very strong wrong distracter in its choice "E"; half the class chose it, three times as many as picked the correct answer, "A". Student interviews would be interesting. I wonder if students are aware that in Q5 they are in a region of circular motion. I hypothesize that they see a block going down a hill, then back up. They realize that there is a "change" in the direction of motion and quickly guess (wrongly) that it is analogous to a ball being thrown up into the air, and just say that the acceleration is momentarily zero, in a false similarity to the ball going through zero velocity at the top of its "arc". This student misguess is facilitated by the incorrect arc we draw in the ball problem: the ball actually goes straight up and down over the same path, but that's not how we draw the ball problem on chalkboards.
Further complicating the issue, only one of the seven students who got Q5 correct also got a majority of the remaining easier diagram sub-group problems correct; that, plus the lack of force diagrams, does nothing to build confidence that all seven know why their choice is correct. On the other hand, there are no blanks for Q5, indicating a general perception that the question was answerable. To finish off the thought, eight people would get this question right if the class engaged in random guessing. S20, S31, S34, and particularly S41 have a majority of their errors in this sub-group and yet did comparatively well on the test as a whole. Telling them that drawing force diagrams on multiple-choice tests is appropriate may well be all they need, but the majority of the class should be explicitly assessed on whether they can construct force diagrams, not merely whether they can recognize an appropriate venue for their use.

The large kinematics sub-group (Q12, Q5, Q18, Q9, Q25, Q24, Q8, Q23, Q1, Q2, Q3, and Q4) shows some improvement over the other sub-groups, coming in at 40% correct overall, but this is still not up to the test as a whole's average of 47%. A few students did well to OK, notably S34, S41, S30, S24, and S33. Student 34 in particular got only one wrong out of twelve questions, and that was Q12. Still, many students showed a complete lack of kinematics understanding, with twelve students out of forty-two getting 25% or less of these questions correct, a level close to random guessing (20%). S38 managed the amusing feat of getting them all wrong. In percentage terms, S9 outdid him: ten out of S9's total of fourteen wrong answers were in this sub-group, for 71%; S38 had only 57% of his personal wrong choices in this sub-group. Some of the easier questions in this sub-group are Q1 and Q2.
They refer to the same figure and, while having a nice pass rate of twenty-eight out of forty-two, still have seven students with the paired misconceptions of "A" for Q1 and "E" for Q2. This pairing is a graphical mirror image of the correct answer, and given one choice, the other is logically consistent. Student interviews might reveal this to be a graph reading error rather than a physics misconception. Q4 is another of the easier questions, and choice D might very well indicate students believe that gravity is the "only" force, forgetting the normal force and forgetting that acceleration is by definition in the direction of the change in velocity. Improvement on tests is as much a matter of making sure a student gets the easy ones as it is a matter of tackling the difficult issues. Still, the difficult issues should be tackled, and Q9 is by far the most ignored question, with fifteen out of forty-two students leaving it blank; the runner-up is Q18 with eight non-respondents. Those students who did respond to Q9 picked incorrect response "B" by a plurality. Of these ten "B" responders, only one showed any work on his test; that student (S15) actually got V = sqrt(mgr) but showed no numbers, thus indicating a math rather than a physics error. Unfortunately, the mesh of physics errors, math errors, and multiple-choice format enhancement of laziness is hard to untangle. A separate math diagnosis test, as used by Halloun and Hestenes in their "Instrument", would be beneficial to me as an instructor, as Part III details in depth. Other easy and interesting issues include the choice of "C" in Q23. I actually couldn't see how "C" was chosen until one paper wrote (5 m/s) / (6 sec) for me, and sure enough 0.83 pops out. The question now is why 5 m/s instead of 5 m/s - 1 m/s = 4 m/s, which is then divided by 6 sec to get the correct response. Is it conceptual difficulty with change in V = Vf - Vi, or is it a graph reading error?
Six students out of the ten wrote something on their test for this problem, some of it quite complex. I didn't understand five of them and would have enjoyed having a student explain some of these attempts to me; two in particular were very detailed. Q8 is amazing: we could get this question from seventeen correct to thirty-three correct by just having our students agree that net force is a vector in the SAME direction as the acceleration vector, something that should be quite possible for third semester calculus-based physics students. For our final comparatively easy question, Q24, the papers of those who chose "E" (fourteen students) are mostly blank; one shows (0.67)*6 = 4.02, another shows x = V0*t + (0.5)at^2, x = 1(6) + (0.5)(2/3)(6^2) = 18m, and two show marks on the graph which I can't follow; the remaining ten show no work. As a side note, ten of the fourteen who chose "E" on Q24 did get the correct acceleration on Q23. Student 8 should be applauded for remembering the distance formula and for showing his work, both worthy accomplishments given his peer group, and then reminded about the caveat of "constant acceleration" and how the graph distinctly shows that acceleration is not constant even though an average acceleration can be found. A different method of analyzing Table Alpha is to partition it. I partitioned Table Alpha into nine parts by dividing between S27&S10, Q2&Q10, S37&S12, and Q3&Q4. The overall passing percentages per partition follow: upper left 28%, upper middle 17%, and upper right 05%; middle left 73%, middle middle 47%, and middle right 20%; lower left 93%, lower middle 77%, and lower right 49%. The passing rate for the test as a whole was 47%, the same as the middle middle partition. Random guessing would result in roughly a 20% passing rate. With the singular exception of S34, all students would benefit from directed instruction by the teacher on the five most difficult questions (Q12, Q5, Q18, Q9, and Q25).
Students in the rightmost partition would also benefit from directed instruction on the middle group questions, perhaps in a mandatory discussion group environment also open to the other students on a voluntary basis. To offset the social stigma of mandatory participation, perhaps a ten to twenty percent "extra credit" could be applied to participants' grades. The bottom five questions do not need teacher directed instruction. The leftmost group of students can act as peer instructors on these topics given their 93% pass rate. Homework or class work should be given on these topics both to increase the middle and right groups' abilities in these subjects and to provide the left group with opportunities to be peer instructors. In-class cooperative grouping with one student from the left group, one student from the right group, and two students from the middle group would prove beneficial to all, as the lowest group of questions covers a surprisingly wide array of topics. If such formal cooperative grouping is not part of the class structure, explicitly using the left group as tutors on homework in exchange for that wonderful motivator, extra credit, would go a long way in helping all students. Explicit teacher effort to build an environment conducive to student mastery of Newtonian concepts is required, as the literature indicates that without this effort, students are not likely to change. In looking for questions that discriminate between groups, several stand out. Notably, Q6 isolates the combined left and middle groups from the right group. The passing rates by group, left to right, are roughly 100%-90%-20%. Q8 is the strongest of several discriminators between the left group and the combined middle and right groups. The passing rates by group, left to right, are roughly 100%-20%-20%.
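The nine-way partition of Table Alpha described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function name, the layout of the grid (questions as rows, students as columns, hardest and best first), and the toy data are all hypothetical and are not the thesis data.

```python
from typing import List, Sequence


def partition_pass_rates(correct: Sequence[Sequence[bool]],
                         row_cuts: Sequence[int],
                         col_cuts: Sequence[int]) -> List[List[float]]:
    """Percentage of correct responses inside each rectangular partition.

    correct[q][s] is True when the student in column s answered the
    question in row q correctly. row_cuts and col_cuts hold the indices
    at which a new band of questions or students begins.
    """
    row_edges = [0, *row_cuts, len(correct)]
    col_edges = [0, *col_cuts, len(correct[0])]
    rates = []
    for r0, r1 in zip(row_edges, row_edges[1:]):
        band = []
        for c0, c1 in zip(col_edges, col_edges[1:]):
            cells = [correct[q][s]
                     for q in range(r0, r1)
                     for s in range(c0, c1)]
            band.append(100.0 * sum(cells) / len(cells))
        rates.append(band)
    return rates


# Toy 4x4 grid (hypothetical), cut once on each axis into four partitions.
toy = [
    [True, False, False, False],
    [True, True, False, False],
    [True, True, True, False],
    [True, True, True, True],
]
toy_rates = partition_pass_rates(toy, row_cuts=[2], col_cuts=[2])
```

Two cuts per axis, rather than the single cut used in the toy example, would give the nine partitions used for Table Alpha.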
Chapter 17 : FCI Data on CCC Physics 4A
17.1 : Presentation of Group JZ's Raw FCI Data in Table A
Group JZ is Cabrillo Community College class 4A. It is the first in a sequence of four calculus-based introductory physics classes; it has forty-four students. In this part, I present the cleaned up raw FCI data for group JZ in Table A. The clean up required dealing with: student #23, questions not responded to, questions eliciting a double response, student initiated written comments, and completion time notations. Table A is not hierarchical and presents raw data in student and question number order, excluding student 23. Student #23 mislabeled his responses so that he ended at question 30, when the original FCI is only a 29 question test. I was unable to figure out how to realign his responses and so left him out of consideration. All questions not responded to are marked with pound signs ( # ) in Table A and are referred to as "black boxes". This includes student provided question marks, student numbered but not answered questions, and complete blanks with no student acknowledgment that they even saw or got to the question. There are twenty-six black boxes in Table A; this is two percent of all possible responses. These black boxes are distributed among five students. Of these five, four students failed to do the last page of the test, questions 26, 27, 28, and 29. Of these four students, two (S2, S20) showed lack of time as the cause of their incompleteness. The other two (S15, S33) showed an unawareness of the last page's existence. All questions eliciting a double response are marked in Table A by presenting their first response in italics. In order to keep the test taking instructions between groups JZ and PG the same, students were allowed to mark a second response if they were not confident of their first. The second response was clearly labeled as such, and fifteen students availed themselves of this opportunity (35%). They provided double responses a total of thirty-one times.
The test itself had a total of 29x43 opportunities for response; thus, double responses are two and a half percent of all opportunities. Two questions (Q2 and Q15) each had three students avail themselves of this opportunity; all other questions had fewer double responders. The double response usually included one correct response, with only four out of thirty-one double responses having both responses incorrect. The remaining twenty-seven split 15/12. Fifteen students who got their first response correct showed their uncertainty by picking incorrect second choices. Twelve students who got their first response wrong showed their uncertainty by then picking the correct choice a bit late. All questions eliciting student initiated written comments have an * in that response's table cell. There were twenty-two unsolicited statements on the forty-three examined tests. Three students (S40, S36, and S33) did most of the commentary at four comments each. Six questions (Q1, Q3, Q5, Q10, Q16, and Q19) elicited two comments each, for a majority of all comments. Most comments could be categorized as having problems with the concept of "ideal space"; whether these problems were real or legalistic, however, is open to interpretation. The following are a few examples. Question 10 elicited a couple of comments to the effect that rotation ("English" or "spinning") would impact the subsequent path of the ball. This could be viewed in two lights. First, there is reference to real world actual experience. Balls often spin at least to some degree, and with real world friction, this results in curved paths. This appeal to the real world is most evident in a comment in regard to question #29 by a student who got seventeen out of twenty-nine on the test (59%). He stated: "TRIED IT with my Pencil!", and then answered incorrectly with choice "A". This ties into the literature in two ways. First, epistemologically, conducting an experiment with concrete reference is commendable.
Second, Hammer emphasizes how difficult it is for most beginning students to view knowledge through the prism of ideal space. The problem for this student experimenter, most likely, is that if you stop applying a (small) force to a (high friction coefficient) pencil, it will seem to the human eyeball (a low resolution detection device) to "stop immediately" (choice "A"). The correct Newtonian response is "C", "immediately start slowing to a stop", as the problem does specify the existence of a frictional force with unknown magnitude. The second way to view Q10's comments is not as an appeal to real world activities, but rather as an appeal to legalism. There are legalists amongst our test takers, which is made most evident by the two written comments to question #19. One response is by a student who got twenty-one out of twenty-nine on the test (72%), and the other respondent got twenty-five out of twenty-nine (86%). "If man pulls harder than boy" was the response of the first. "Assuming they pull different amounts" was the response of the second. Both students correctly answered "B", thereby implicitly accepting the assumption they questioned, that in fact "a large man" does pull harder on the crate than "a boy" does. The students thus get both their politically correct disclaimers and credit for their politically non-correct answer. As a small aside, few students provided both double responses and written comments, with only six responders doing both. The amount of time required to take the test varied considerably, with the twenty-nine minute average based on a subset of students who provided the requested finishing time on their papers. Harvard University data in the original FCI paper lists an average of twenty-three minutes.
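The percentages quoted in this section follow directly from the counts given in the text, and the arithmetic can be checked with a short script. This is a minimal sketch; the helper name and variable names are my own illustrative choices, while the counts are the ones stated above.

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


students = 43                           # forty-four enrolled, student 23 excluded
questions = 29                          # length of the original FCI
opportunities = students * questions    # 29 x 43 response opportunities

black_box_pct = percent(26, opportunities)          # unanswered questions
double_response_pct = percent(31, opportunities)    # double responses
double_responder_pct = percent(15, students)        # students who doubled up

# The double responses split into first-correct, second-correct, both-wrong.
double_response_total = 15 + 12 + 4

print(f"black boxes:       {black_box_pct:.1f}%")
print(f"double responses:  {double_response_pct:.1f}%")
print(f"double responders: {double_responder_pct:.0f}%")
```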
17.2 : Table B, A New Format
Table B is the same data as Table A in a more analytically friendly format and is the foundation for the rest of this chapter. Table B is hierarchical with the best student on the left, the worst on the right. The easiest question is at the bottom, the hardest at the top. The letter responses have been converted to blanks for correct responses, and to misconception codes for incorrect responses. The numbers in parentheses are the number of blanks. Note that the question mark (?) is a code in modified Table II, and some response grids hold two or three misconception codes because some MCSR items have more than one associated misconception code. Before doing my own analysis, I'd like to tie in to three ideas presented by Hestenes et al. in their original FCI paper. The first of these is their comment that Q5, Q9, Q12, Q22, and Q18 "lend themselves to analysis by force diagrams." This is odd given the literature's stance that the FCI does not require physics formalism and is valuable as a pre-instruction assessment tool. No student in group JZ used force diagrams on any question. Thus, it is not surprising to find four of these questions in the upper (difficult) half of Table B. These five questions all deal with the Newtonian Concept 5G (Gravitational Force), but only Q5 does so uniquely. I1, I3, and CI1 look like strong distracters against 5G. Unfortunately, only I1 is used in multiple questions, and then only twice. So on the face of it, it is difficult to know how much the lack of force diagrams has to do with the results, but Chapter 3 of this Data Analysis highlights the lack of force diagram usage by more experienced students on the MB. There also, questions associated with the use of force diagrams were found to be among the more difficult. In part, this difficulty can be traced to rare usage of force diagrams, with incorrect diagrams predominating among those drawn. Hestenes et al. labeled Q19 and Q29 as "weak discriminators, so they could be dropped from the test."
It is possible to arrive at correct answers for both via non-Newtonian beliefs. Q19 is in the easiest quadrant of Table B (79%). Excluding the five blanks, Q29 is also quite easy with a 74% pass rate. The down grade in Q19's value is particularly regrettable because it is the only question that pits the CI1 and CI2 misconceptions against each other, with a clear win for CI2. This is particularly noteworthy in comparison to Q18 where CI1 predominates over AF1, G1, and the three concept Newtonian response. Q29 also would be more valuable were it not undermined by the authors' comments particularly if its response "E" were altered to match student misconception AF1. Currently response "E" is not picked by any student and is associated to misconception I4. Were it altered, Q29 would make for an interesting comparison with Q15 given the common Newtonian Concept 5S, and their current relative positions. Finally Hestenes et al. speak to the "persistence" of 3B and 3D test responses both of which match to student misconception G3 (heavier falls faster), and then compare Q3 to Q1. Group JZ data (Table B) confirms the vast disparity between Q3, and Q1 results if not the persistence. The cause of the disparity is not evident in the coding of the questions. Both questions have 5G alone as the correct Newtonian concept. Both have [G3] and [?] as the only two student misconceptions. So, by labeling alone these two questions are identical. Yet the results are not. After looking at the questions themselves, I believe Q3 requires additional knowledge about vector decomposition and the effect of orthogonal forces on motion that Q1 does not require. Thus, we are warned that questions which are labeled as exactly the same, may well not be. See pairs Q1& Q3, Q27& Q8, and Q14& Q13, pairs which show marked disparity in results, yet are labeled exactly the same.
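The hierarchical ordering behind Table B (best student on the left, hardest question at the top) amounts to two sorts. The following is a minimal sketch under assumptions: the data layout, function name, and toy responses are hypothetical and do not reproduce Table B.

```python
from typing import Dict, List, Tuple


def hierarchical_order(correct: Dict[str, Dict[str, bool]]
                       ) -> Tuple[List[str], List[str]]:
    """Order students best-first and questions hardest-first.

    correct[student][question] is True for a correct response.
    Ties keep their original relative order because sorted() is stable.
    """
    students = list(correct)
    questions = list(next(iter(correct.values())))
    student_score = {s: sum(correct[s].values()) for s in students}
    question_score = {q: sum(correct[s][q] for s in students)
                      for q in questions}
    students_best_first = sorted(students, key=lambda s: -student_score[s])
    questions_hardest_first = sorted(questions, key=lambda q: question_score[q])
    return students_best_first, questions_hardest_first


# Toy responses (hypothetical), three students and three questions.
toy = {
    "S1": {"Q1": True, "Q2": False, "Q3": False},
    "S2": {"Q1": True, "Q2": True, "Q3": True},
    "S3": {"Q1": True, "Q2": True, "Q3": False},
}
student_order, question_order = hierarchical_order(toy)
```

With the table laid out this way, the correct responses cluster toward the lower left and the misconception codes toward the upper right, which is what makes the later partitioning by eye possible.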
17.3 : Table C, Misconception to Question
Table C is the same data as Table B presented by question and misconception. The numbers are the count, out of forty-three students, who share that misconception on that question. I use Table C by highlighting a given Newtonian concept, and I use it by question and misconception as well, although not by student, as that data is lost in this format. A computer could make each number a vector of student identities and allow individual student Table Cs to be generated. Unfortunately, such effort to identify individual initial knowledge structures is unwarranted, as we shall see. Blanks in this table are unavailable options, and zeros are available but unselected options. Not all rows will add up to forty-three, even remembering the Newtonian thinkers, because a single MCSR item can be associated with more than one misconception. Some misconceptions were chosen often, such as I3 (forty-eight times), but had low percentages (16%) because of their high prevalence in the FCI (301 opportunities). Other misconceptions, such as G3, had high percentages (33%) which are diminished in value because of the absence of competing misconceptions. In the case of G3, it only competes with the Newtonian concept code, [5G], and the misconception code, [?]. In contrast, I3 competes with four Newtonian codes (5G, 1, 0, and 5S) and with nine misconception codes (I5, CI2, CF, I4, R1, G5, I1, AF2, and G4). Further complicating the issue, I3 is offered on seven questions and thus is a high candidate to have been picked by students engaged in random selection (guessing). G3 is offered only on two questions, but is associated with two responses on each question. In seeking student knowledge structures, Tables C and B work in tandem. From Table C, I might hypothesize that a student who believes G3 (heavier objects fall faster) at least has the possibility of also believing AR1 (greater mass implies greater force).
The two do not directly compete against each other, both have decent percentages, and they don't sound automatically self-conflicting. Going back to Table B, we find that the twenty-two students who picked G3 on Q3 picked AR1 only 28% of the time (25 out of 88) when given the opportunity. Given that 20% is random guessing, 28% is not a call to arms. Worse, of the twenty-two students who picked G3 on Q3, only four chose G3 when given the opportunity to do so again on Q1 (18%). To complete the circle, of the six students who chose G3 on Q1, only four also chose it on Q3 (67%). This example highlights the fundamental result that I was unable to determine an alternative belief structure of student misconceptions that is strongly held and in competition with the Newtonian. Adding a third misconception such as K2 (velocity-acceleration undiscriminated) to tie AF1 and G3 together reduces the results to unity. There are four students (S32, S42, S16, and S5) who chose AF1 in Q15, G3 in Q3, and K2 in Q21. These four had the opportunity to pick AF1 five more times each (Q18, Q13, Q22, Q11, and Q14) for a total of twenty opportunities; they picked it four times. These four students had the opportunity to pick G3 again on Q1; they did so zero times. They had the opportunity to pick K2 again on Q20 and did so twice. This is not the definition of an alternative belief structure, even though S6 would be interesting to interview. The problem is that if each individual has a unique structure made of common components, the teacher is reduced to addressing the most prevalent components piecemeal. Table C is particularly useful in test analysis. It shows, with some specific highlighting, which distracters work best against a specified Newtonian concept and, more importantly, which do not.
For example, student misconception code I3 (impetus dissipation) works well as a distracter in gravitational force problems, but fails to distract any students in kinematics problems. Thus, Q23 choice "E" would be a candidate for alteration to some other misconception code if the question itself could be so altered. This, however, is going far beyond my focus on the initial knowledge state of the student, so I will leave this for now and introduce Table D in the upcoming section.
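The idea of letting each Table C cell hold a set of student identities rather than a bare count, together with the repeat-selection check applied above to G3 on Q3 and Q1, can be sketched as follows. The function names and the toy picks are hypothetical; only the method mirrors the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Cell = Tuple[str, str]  # (question, misconception code)


def build_table_c(picks: Iterable[Tuple[str, str, str]]) -> Dict[Cell, Set[str]]:
    """Map each (question, misconception) cell to the students who chose it.

    picks yields (student, question, misconception) triples.
    """
    table: Dict[Cell, Set[str]] = defaultdict(set)
    for student, question, code in picks:
        table[(question, code)].add(student)
    return table


def repeat_rate(table: Dict[Cell, Set[str]], code: str,
                first_q: str, second_q: str) -> float:
    """Fraction of students choosing code on first_q who also chose it on second_q."""
    first = table.get((first_q, code), set())
    second = table.get((second_q, code), set())
    if not first:
        raise ValueError(f"nobody chose {code} on {first_q}")
    return len(first & second) / len(first)


# Toy picks (hypothetical students), showing that the rate is not symmetric.
toy_picks = [
    ("S1", "Q3", "G3"), ("S2", "Q3", "G3"),
    ("S3", "Q3", "G3"), ("S4", "Q3", "G3"),
    ("S1", "Q1", "G3"), ("S5", "Q1", "G3"),
]
toy_table = build_table_c(toy_picks)
q3_then_q1 = repeat_rate(toy_table, "G3", "Q3", "Q1")
q1_then_q3 = repeat_rate(toy_table, "G3", "Q1", "Q3")
```

The asymmetry in the toy result is the same kind of asymmetry as the 18% versus 67% figures reported for group JZ, since the two directions divide the same overlap by different group sizes.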
17.4 : Table D, Misconception to Large FCI Divisions
Table D is the result of misconception correlated to the larger divisions of the FCI. It is useful in conjunction with the previous tables, but primarily highlights the structure of the FCI itself rather than possible student states. One example is provided here, with the remainder relegated to the last part of this chapter. It's evident from Table D that misconception G5 is a valid distracter for the Newtonian concept Gravitational Force [5G]. G5, however, is useless as a distracter against Kinematics [0]. The previous table allows misconceptions to be matched against specific questions. Q23 is the only question of interest, and item 23E is our focus. In this item, combined misconceptions G5 and I3 are pitted against the combined Newtonian concepts of 5G and 0. [Do not confuse 5G and G5; misconceptions always start with a letter and Newtonian concepts with a number.] G5 and I3 lose completely (a null result). The original FCI paper's data also show that 23E is rarely to never chosen. MCSR items are precious resources. Another misconception should be allocated to item 23E. In the event, reading Q23 results in a recommendation to make 23E the mirror image of response 23C, but the following analysis indicates the desirability of finding some practical method of assigning misconception AF6 to item 23E. Q23 is the only question that combines Newtonian concepts 5G and 0. G5, CI2, AF6, I3, #, and ? are the misconceptions that overlap with both 5G and 0 at some point. G5 and I3 are null results for Kinematics [0], but do well against Gravitational Force [5G]. CI2 is the reverse: it does well against 0, but does very poorly against 5G. AF6 affects both equally on separate questions (leaving aside the issue of the 2nd Law [2]). As AF6 seems to affect both 5G and 0, using it as a distracter in a question that combines 5G and 0 makes more sense than an I3/G5 combo nobody picks. Going back to question 23 itself reveals that 23E is a very odd choice indeed.
Yet the original FCI claims both G5 and I3 as valid, useful distracters to kinematics, which they are not. One of the major claims of the FCI is that it assesses thirty specific misconceptions, not merely twenty-three divisions within the Newtonian force concept. The literature implies that students structure their misconceptions. To find these misconception structures, a test must be designed to pit misconceptions against each other, not just against valid Newtonian concepts. While null results are valuable during test development, and even should be reported, they are much less valuable in general test application. MCSR tests are not a substitute for individual interviews but rather seek the frequency of known misconceptions in a large population. The precious MCSR items should not be wasted on known statistical nulls. Given the noise from low responses, the incomplete competition between misconceptions, and the large disparity of misconception availability, finding a structure, if such exists, is beyond my abilities. I have instead found changes that I'd like to make in the FCI and some all but random connections more notable for their oddity than for their utility. Still, in an effort to show some of what I have gleaned from these tables, I will present some additional insights in the last part of this chapter.
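Two of the checks used in this kind of test analysis are simple enough to sketch: the selection rate of a misconception over all of its opportunities, and the flagging of MCSR items that nobody selected. The function names are my own, and the counts for the two unnamed items are hypothetical placeholders; the I3 figures and the null result for 23E are the ones stated in the text.

```python
from typing import Dict, List


def selection_rate(times_chosen: int, opportunities: int) -> float:
    """Percentage of opportunities on which a misconception was chosen."""
    return 100.0 * times_chosen / opportunities


def null_items(pick_counts: Dict[str, int]) -> List[str]:
    """Return the MCSR items that no student selected, in sorted order."""
    return sorted(item for item, count in pick_counts.items() if count == 0)


# I3: forty-eight selections in 301 opportunities, as quoted in section 17.3.
i3_rate = selection_rate(48, 301)

# item_A and item_B carry made-up counts for illustration only;
# the zero for 23E reflects the null result reported for group JZ.
toy_counts = {"item_A": 5, "item_B": 10, "23E": 0}
unused = null_items(toy_counts)
```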
17.5 : Additional Insights
In this Table B macro analysis, I drew lines at the junctions of Q13&Q5, S43&S17, Q23&Q17, and S26&S10. These roughly correspond to 25%, 50%, 25% divisions of axes and result in nine partitions. Q15 thru Q13 are labeled hard (H). Q17 thru Q10 are labeled easy (E). All other questions are labeled medium (M). S34 thru S43 are labeled Newtonian thinkers (N). S10 thru S32 are labeled non-Newtonian thinkers (non-N). All other students are labeled average(A). Numbers are correct percentages: NH = 61, AH = 31, non-NH = 06, NM = 91, AM = 58, non-NM = 27, NE = 98, AE = 80, and non-NE = 59. In Table B, highlighting empty boxes left and right yields the insight that questions: 05, 21, 07, 20, 17, 04, 14, 01, and 10 are guaranteed to be gotten correct by Newtonian thinkers. Questions: 22, 25, 11, 08, 29, 16, 23, 19, and 27 are almost so guaranteed. Newtonian thinkers in the original paper are students with 80% mastery or 23 out of 29 questions correct. I use the same standard. The original paper labels 60% or 17 out of 29 questions as a minimal mastery, my non-Newtonians are at the 40% level (12 out of 29). Highlighting the filled boxes right to left makes evident that non-Newtonian students will fail to get questions: 18, 3, 13, 5, and 22 correct. They will have a below random chance at questions: 15, 28, 24, 09, and 07. They will finally achieve a random chance at questions : 2, 26, and 11. Single questions able to discriminate between Newtonian and non-Newtonian students are question 5, 22, 11, and 7. If a student gets Q8 wrong, they will get Q3 wrong. Q8 is assessing Newtonian concept [1]. Q3 is assessing Newtonian concept [5G]. A choice of misconceptions I3 or I4 on Q8, guarantees a choice of misconception G3 on Q3. If a student gets Q20 wrong, he will get Q3 wrong. Q20 is assessing Newtonian concept [0]. A choice of K1 or K2 on Q20, also guarantees a choice of G3 on Q3. Interesting, but if a student gets Q8 wrong, he should also get Q1 wrong as it also is a [5G]. 
This is not the case. The same applies to Q20 and Q1. The relationship does not work in the other direction. Getting Q1 wrong guarantees nothing about either Q20 or Q8. These are artifacts of the non-correlation between Q1 and Q3 in the first place. Q1 and Q3 are not correlated despite having the exact same Newtonian concept and misconception choices. Even though Q1 and Q3 are labeled the same, they aren't, as the results show. I believe Q3 requires an understanding that forces in orthogonal directions are independent, which Q1 does not. Further, if I3/I4 correlates to G3 and K1/K2 correlates to G3, it is reasonable to look for correlation between I3/I4 and K1/K2. None exists, although this is muddied by the lack of any common competitor. Misconception AF1 is not self correlated across six questions. This is odd because AF1 was the most prevalent answer to the hardest question, Q15. This implies people prefer AF1 to both Newtonian concept [5S] and misconceptions [Ob, ?]. AF1 (rating 15%) is never pitted against 5S again and does badly against Newtonian concept [3] and misconceptions AR2 (29%), I2 (12%), and CI1 (28%). On the other hand, in the only other question in which 5S is alone (Q29), 5S does quite well against misconceptions I3 (16%), R1 (22%), and AF2 (14%). So why should AF1 kick 5S's butt (Q15) and AF2 fail to have a similar impact on 5S (Q29), particularly as they have essentially the same rating? Is it because of intrinsic differences between AF1 and AF2, or is it the presence of the R1 distracter in one question and not the other? Perhaps all 5S questions are not equal. Q15's 5S is solid contact impulsive, and Q29's 5S is solid contact friction opposing motion. Further diluting AF1 as a prime candidate for an alternative belief is the odd fact that a student who picked AF1 on Q15 [5S] is more likely, by twelve to three, to pick AR2 over AF1 on Q13 [3]. AR2 is the most consistent of distracters.
It is applied against 3rd Law problems only (Q13, Q11, Q14) and competes against the same misconceptions (AF1, AR1, and Ob). If you pick AR2 on Q14 (rating 79%), there is a six out of seven chance of picking it on Q13 (rating 40%). This strong relationship does not hold both ways. Students who pick AR2 on Q13 are more likely to get Q14 correct than to pick AR2 again, at a ratio of fourteen to six. So, like Q1&Q3, Q13&Q14 are labeled the same but have disparity in the results. Q13&Q14 are mentioned in the original FCI paper, but only to say that they "appear different to most students". First note: Ob is a worthless distracter for 3rd Law problems. Reassigning these MCSR items to other distracters would benefit internal analysis as long as awareness of Ob results was not lost. Second note: Q2 is the fourth 3rd Law problem and is paired with Q11 as "Impulsive Force". Q2 does not use AR2 as a misconception. The overwhelming choice of AR1 on Q2 does not indicate a stable misconception, as AR1 is pitted against AR2 on Q13 and loses four to twenty. Further, AR1 loses again, zero to seven, on Q14. The apparent balance in Q11 is an artifact of AR1 and AR2 sharing a common MCSR item. Thus AR2 reigns supreme as the alternative to the 3rd Law. This supremacy hides an unknown additional factor, as the rating difference between Q13 and Q14 brings to light. Third note: independent of distracter choice, if you get Q14 wrong, you have roughly an 85% chance of getting Q11, Q2, or Q13 wrong. If you get Q11 wrong, you have close to a 95% chance of blowing Q2 or Q13. If you get Q2 wrong, you have a 75% chance of blowing Q13. This is odd because Q2 and Q11 are impulse form, and any correlation to Q13 and Q14 (continuous force) should be the same. These correlations only work one way. For example, eight selectors of AR2 on Q13 are happy to get both Q11 and Q14 correct. I4 was a distracter for Newtonian concepts 0, 1, 2, and 5S. It suffered a null response against 5S (Q29).
Even leaving out Q29, there is no correlation between choosing I4 as a distracter in any two situations. Q8 and Q27 both dealt with Newton's 1st Law, and both shared the same sub-category of "no force" and "speed constant". In both Q8 and Q27 the only competing misconception was I3. Thus, like pairs Q1&Q3 and Q13&Q14, Q8 (ranking 65%) and Q27 (ranking 81%) are an identical pair. And again we have a ranking disparity, which in this case would be worse if you excluded the four students who never got to Q27 in the first place. Interestingly, excluding blanks, picking I4 on one of the pair guaranteed you the other was correct! Two questions labeled as the same, with very different results. The labels seem to be, at best, incomplete. Q27 is part of a series of questions on the same subject. Thus the previous questions may have helped guide the student toward the Newtonian choice, but without student interviews this is conjecture. Radically different results on identical questions should raise all kinds of red flags and be spoken to in the presenting paper, including acknowledgment of explicit but unlabeled language or math issues impacting choices in addition to the physics misconceptions. McDermott does not use the FCI, and I'd be interested in the reason. Finally, if a student picked I4 on Q24, he was most likely (83%) a Newtonian or Average student. Non-Newtonian students strongly preferred (75%) distracter CI2 over I4. In a probable fluke, if you chose I4 on Q24, you were guaranteed to pass both Q10 and Q19. Having found and exhausted the only pairs of identical questions, life becomes more complicated, but there are a few more items of interest. The non-self correlation of misconception CI2 across its six questions is hardly surprising given that these questions encompass five major divisions of force [0, 1, 2, 3, 4, and 5G]. What is surprising is that a single misconception could be viewed as a valid distracter for so wide a range of Newtonian concepts.
CI2 fails to distract in Q16, Q4, and Q1. Again we argue that MCSR items are actually valuable commodities and, outside of test development, should not be wasted on null responses. Misconceptions G4 or CI1 might prove more useful as distracters in Q16 than CI2 in light of Q5 and Q18. It's tantalizing in our search for a student's alternative belief structure to note that students who pick CI2 on Q24 also do not pick G1 on Q12, thus tying by a negative interference the Newtonian concepts 2/0 and 5S/5G. Unfortunately, this connection between CI2 and G1 does not hold for Q19 responses; however, Q19 does have an entirely different Newtonian concept [4]. Q22 and Q9 both have misconception I1 as a distracter and Newtonian concept 5G as part of the multiple Newtonian concepts each addresses. The competing distracters differ, but twenty students chose I1 in each case. However, only thirteen students chose I1 on both questions. Five Q9 responders chose the correct response and two picked AF1 over I1 on Q22. Three Q22 responders chose the correct response and four picked Ob/G1 over I1 on Q9. Worse, I1 is a complete null on Q16, which is a pure 5G question; it does have a different set of distracters. It would prove interesting to change Q9's G2 distracter (choice E) to a G5 distracter to see if there is an impact on I1 choice. In what are most likely flukes, the following questions correlate even though they share neither Newtonian Concepts nor distracters. If they appear in other class data, they would be quite valuable ties between misconceptions, the beginning of a system. If Q4 is wrong, Q5 is wrong. If Q14 is wrong, Q9 is wrong. Because of multiple misconceptions per MCSR item, the following is a bit vague, but if you picked an item associated with I2 on Q6, you had a 91% chance of picking an item associated with I3 on Q5. The initial knowledge state of the student must be determined by some method, and the FCI is PER's flagship.
I did not use the revised edition, but still some thoughts on the FCI might be worth considering. I doubt that G2 or AF3 should be included on the test, given not only their null responses here but the very low responses documented in the original FCI paper. I also doubt that the FCI as it stands is a valid assessment for either G2 or AF3 anyway. AF3 is assessed once, as item 12E; AF3 is "no motion implies no force." Question 12 is the question with two correct answers, and option E's "none of these?" is highly unlikely to be picked over 12A, which at least acknowledges the existence of a gravitational force. G2 is assessed three times as items 5E, 9E, and 17D. G2 is "gravity intrinsic to mass". 5E is "none of the above, the ball falls back down to earth simply because that's its natural action." 9E is "gravity does not exert force on the puck it falls because of the intrinsic tendency of the object to fall to its natural place". And 17B is "falls because of the intrinsic tendency of all objects to fall toward the earth." Leaving aside arguments as to what gravity is, I see these three responses as arguing against the existence of gravity, particularly in light of the competing items. For example, 5D is "a constant downward force of gravity only". So 5E is within a philosophic hair of rejecting the existence of gravity, no matter whether "intrinsic to mass" or not. I'm not surprised by the null results for G2, simply because "intrinsic" or not, everybody believes in the existence of gravity, even if I, for one, would never claim to understand it. For all I know, gravity is intrinsic to mass. I've never seen gravity without mass. Any two or more masses give you gravity, and spherical cows aside, I know of no truly isolated single mass with which no other masses interact. An interesting thought experiment, but I'm not a theorist.
On a final note, I don't believe misconception [Ob] really needs six opportunities to be chosen, particularly as those in competition with the 3rd Law are wasted. Perhaps an AF1 choice for Q2 instead of the current Ob could be worked out. AF1 involves "active agents" and there is some uncertainty about how rigid the authors' definition is, but this might work. Have one vehicle parked and the other one moving. Thus the moving vehicle could be the "active agent" and the parked one the non-active agent in setting up misconception AF1.
Chapter 18 : FCI Data on CCC Physics 11
18.1 : Presentation of Group PG's Raw FCI Data in Table M
Table M is the cleaned up raw data for the Cabrillo Community College Physics 11 class, which is labeled PG. Physics 11 is a one semester algebra based course for students with no prior physics experience; it is used as a Physics 4A preparatory class. Clean up deals with: double answers allowed by instructor directions, blanks both deliberate and inadvertent, unsolicited comments, solicited high school physics status, and time of completion. Fifteen students provided thirty-eight double answers; this was out of thirty-three students and 957 response opportunities. Out of these thirty-eight double answers, twelve had correct first answers and wrong second answers; eleven were the reverse, wrong answer first, correct answer second. Fifteen responses were wrong both times. S28 doubled up eight times, S24 five times, S33 and S13 four times each, five students twice each, and seven students doubled up on their answers once. Eighteen students did not provide double answers. Seven questions had no double answers, and eight questions had a single double response each. Twelve questions had two double responses, and two questions (Q4 and Q15) had three. There were twelve blanks. All questions had zero (18) or one (10) blank excepting Q28 which had two blanks. Only six students left blanks. S27 left the four questions on the last page blank with no indication that he had even seen the questions; further, he did have time to complete the test. S6 deliberately left three blanks but with no explanations (two of the problems involved graphs). The remaining four students left one blank each, except S32 who left two. There were very few students who both double answered and left blanks. S27 and S6 offered no doubled answers; S28, S24, S33, and S13, conversely, provided no blanks. There were eleven unsolicited comments by seven students. S18 and S20 provided three comments each; all other students provided one each. Q8 elicited three comments; Q27 elicited two.
All other questions had zero or one comment. S18 had two comments to back up his two answers to Q8. He picked 8A "assuming no friction due to air", and he picked 8C "assuming the presence of atmosphere with friction." S20 answered 8C and then commented "* Forgot the description of friction?" And finally, S29 picked 8A and said "* IF FORCE THAT KEPT VELOCITY CONSTANT INITIALLY STILL APPLIES AFTER `KICK', THEN I THINK THAT THERE WOULD BE A SLIGHT INCREASE IN SPEED IMMEDIATELY AFTER 'KICK', THEN A RETURN TO `CONSTANT VELOCITY?'" Seven students had a high school physics class (S2, S4, S9, S20, S21, S25, and S30); one student (S17) had a "conceptual physics college class five years ago". Group PG's average is 35%; the sub-group of the above eight students has an average of 30%. The sub-group of the seven students with high school physics experience only has an average of 33%. Student interviews might reveal why previous high school experience is of no statistical help at the community college level. It could be that unsuccessful high school physics students are trying to brush up on their skills by taking this class. Still, I would expect them to do better than S20 does with her Q8 comment: "forgot the description of friction". The high school group's poor showing may be the result of time passage: S21 informs us that in 1988 he got an "A"; it's now 2002. Or, high school physics classes may not all be equal: S25, with eight correct responses, "took High School Physics @ Harbor High w/ Mr. Grove [and] Got a B final grade." The average time of completion, using the subset of nine students who provided their completion times, is thirty minutes. On a final note, it is interesting to see S5's (twenty-two correct answers) response to the inquiry about previous high school physics experience: "I have had some physics experience thru personal interest."
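As a quick arithmetic check on the tallies reported in this section, the counts can be totaled in a few lines of Python. This is only a sketch: the variable names are mine, and the numbers are simply those quoted in the text above, not a re-reading of Table M.

```python
# Tallies quoted in section 18.1 (names are illustrative, not taken from Table M).
STUDENTS = 33
QUESTIONS = 29

# Double answers grouped by outcome: label -> number of double answers.
doubles_by_outcome = {
    "correct first, wrong second": 12,
    "wrong first, correct second": 11,
    "wrong both times": 15,
}

# Double answers grouped by question: doubles per question -> number of questions.
doubles_per_question = {0: 7, 1: 8, 2: 12, 3: 2}

# Blanks grouped by question: blanks per question -> number of questions.
blanks_per_question = {0: 18, 1: 10, 2: 1}


def weighted_total(counts):
    """Sum (value * occurrences) over a {value: occurrences} tally."""
    return sum(value * occurrences for value, occurrences in counts.items())


response_opportunities = STUDENTS * QUESTIONS
total_doubles_by_outcome = sum(doubles_by_outcome.values())
total_doubles_by_question = weighted_total(doubles_per_question)
total_blanks = weighted_total(blanks_per_question)

print(response_opportunities, total_doubles_by_outcome,
      total_doubles_by_question, total_blanks)
```

Both ways of counting the double answers (by outcome and by question) should give the same total, and the per-question tallies should account for all twenty-nine questions.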
18.2 : Table N, A New Format
Table N is the hierarchical presentation of Table M data with MCSR items replaced by misconception codes from modified Table II in chapter 1 of this Data Analysis section. The right hand side has both the Newtonian code for correct answers [blank boxes] and the misconception codes available for student choice. Questions range from Q18 and Q5, which had one correct response each, to Q10 with twenty-eight correct responses. The average correct response for the questions is twelve out of thirty-three students. Students range from S5 with twenty-two correct responses to S17 with three. The average number of correct responses per student is ten out of twenty-nine questions. In comparison, the median for the questions was only nine, not twelve, and the median for the students was nine, not ten. It's painfully obvious from Table N that only the easiest questions are understood, and then only by the better students. In looking at JZ's flukes (Chapter 4, part 5), if Q4 is wrong, Q5 is wrong, if only because exactly one student got Q5 correct. If Q14 is wrong, Q9 is not guaranteed to be wrong (S11, S7, S27, and S26). Finally, if I2 is picked on Q6, there is only an 80% chance at I3 on Q5, which is close to the 75% chance any Q5 responder has of selecting I3. So despite one technical match, I'm willing to forego considering them ties in a misconception belief system.
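The comparison made above, a conditional chance of choosing I3 on Q5 given I2 on Q6 set against the overall chance of choosing I3 on Q5, can be sketched as follows. The twenty student records are invented for illustration so that the rates come out at 80% and 75%; they are not the Table N data, and the function name is hypothetical.

```python
from fractions import Fraction


def conditional_and_base_rate(students, given_q, given_code, target_q, target_code):
    """Return (P(target | given), P(target)) over per-student code picks.

    Each student is a dict mapping a question label to the set of
    misconception codes tied to the item that student chose.
    """
    given = [s for s in students if given_code in s[given_q]]
    both = [s for s in given if target_code in s[target_q]]
    target = [s for s in students if target_code in s[target_q]]
    return Fraction(len(both), len(given)), Fraction(len(target), len(students))


# Hypothetical class of twenty students (NOT the Table N data).
students = []
for i in range(20):
    picked_i2_on_q6 = i < 10
    picked_i3_on_q5 = i < 8 or 10 <= i < 17
    students.append({
        "Q6": {"I2"} if picked_i2_on_q6 else {"other"},
        "Q5": {"I3"} if picked_i3_on_q5 else {"other"},
    })

conditional, base = conditional_and_base_rate(students, "Q6", "I2", "Q5", "I3")
print(float(conditional), float(base))
```

When the conditional rate sits this close to the base rate, picking I2 on Q6 tells us very little about Q5, which is the reason for not treating the pair as a tie between misconceptions.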
18.3 : Analysis
Let's look at where the students are in this class. They do better on pure 1st Law [1] and pure Gravitational Force [5G] problems than on any other major divisions. The pure Third Law [3] problems and Superposition [4] are the worst. Q19 stands out with its 82% correct. But as the literature makes clear, Q19 is such a poorly designed problem that it was dropped from the revised FCI because the correct answer is selectable for non-Newtonian reasons. So I discount it here. The pure 1st Laws (Q26, Q8, Q4, Q27, and Q10) show strong belief in I2 and I3. They also show belief in I4 and I5. Impetus (I1-I5) is described in the FCI paper. Impetus is an inanimate motive power that keeps things moving. It must be supplied, kind of like gasoline, and the moving object must store it and use it up the way a car does gasoline. Impetus might be more easily addressed via the concepts of energy and friction than by a head-on clash with Newton's First Law. Better is a relative term, and the pure 5G problems (Q5, Q3, Q17, Q16, and Q1) are scattered across the board. Q5 is tied for hardest and Q1 is third easiest. Given an Impetus option, the students take it, particularly I3, Impetus dissipation. After all, cars stop when they run out of gasoline; they don't coast forever. Telling students to ignore friction is meaningless if they can't define friction (S20) and all but meaningless to the rest, because students don't have frictionless real life experiences to provide a context for these idealized questions. Even television has the Starship Enterprise stop when its engines stop; in order to be moving anywhere its engines have to be on. To cap it all off, the engines need fuel. This is the perfect definition of impetus in space, which for our students is a frictionless environment, solar wind and microscopic meteorites notwithstanding.
Other than Impetus, G3 and to a lesser extent G4 and G5 are attractive distracters on the pure 5Gs, and G1 is picked often enough on the mixed 5G problems to warrant attention. G5 ties gravity to impetus. G4, gravity increases as objects fall, highlights the reverse problem of students being too theoretical and the problem too practical. Gravity does increase the closer an object gets to the earth's surface. The problem is both a matter of scale and a decision as to what is ignorable. Physicists are not mathematicians and often, if not always, ignore variations of one part in 10^13. Students and physicists don't ignore the same things. Physicists ignore friction, which students don't; ditto for changes in gravity. Explicitly explaining to students the advantages of absurd simplifications and the conditions under which physicists label quantities as zero may go a long way toward having students' observable performances match our expectations. G3, heavier objects fall faster, is slightly tricky in that heavy objects are often more dense and thus tend to have smaller surface areas; this, given real-world air resistance, does allow them to hit the floor first in non-rigorous kitchen experiments. Worse, human eyeballs and reaction times in clicking timing devices are not ignorable factors in observing or timing short distance drops. The concrete referent of an evacuated transparent tube in which a feather and coin can be dropped together is very helpful in changing this belief in G3. A discussion on sky diving may also help if it is not too theoretical and takes into account terminal velocity. G1, air pressure-assisted gravity, mixes the ideas of number density of air molecules, a directional result of gravity [more molecules per unit volume closer to earth], with the idea of pressure direction exerted by those molecules. The pressure of an air molecule is assumed to rely on its weight, much like the pressure a kid exerts on a trampoline.
Thus the more kids, the more pressure in the same direction as their weights. That the pressure due to the random-walk, high-speed motion of the air molecules overwhelms their mere weight is not a normal awareness, as air inside a classroom seems quiescent, not in high-speed motion. A discussion of how odors cross a room may help shift the student's awareness of his air-molecule surroundings. The classic crushed can experiment is also beneficial. All 3rd Law (Q2, Q11, Q13, and Q14) questions are pure and none is easy. Even Q14 is in the upper half of the plane. AR1 and AR2 are overwhelmingly believed. Part of the problem is the equation F = ma. This formula implies that forces are singular, not paired, and that changes in "m" do (instead of only can) change "F". F = ma strongly implies a single mass in absolute isolation can exert a force. The form of Newton's Law of Gravitation, F = G(M1*M2/R^2), at least implies the force is between two bodies and is the same magnitude. Multiplying M1*M2 is the same, after all, as multiplying M2*M1, even if M1 is not equal to M2. Thus in the Law of Gravity, the Third Law about paired forces between two objects being equal is more believable. With F = ma, Force on, Force by, and Force net are confusing and overshadowed by the math simplicity that if I change m, then F is going to change. This implicit act of keeping "a" the same is hidden by the label "a" itself not changing to, say, "a (prime)" when the value of m is altered. Further, the literature highlights that words such as Force, Energy, Power, etc., that physics uses with absolute mathematical precision are viewed as synonyms in everyday speech. And it is quite true that the parties (truck and car) in a 3rd Law problem very often do not have the same kinetic energy and thus, by sloppy everyday English, are seen as not having the same Force. Newton's 3rd Law is merely a mantra if these underlying math and language issues are ignored.
Then there is the whole "action/reaction" definition that implies a time delay between two separate events: I hit you, you hit me back. Not I hit you and simultaneously your nose is hitting my fist, which is why boxers wear gloves. This belief in two separate, time-delayed events adds to student difficulties with the 3rd Law. After all, if the events are separate and time delayed, what "memory" requires them to be equal? It's kind of like believing that dice are going to roll a seven merely because this particular pair hasn't rolled a seven in the last five attempts, so now it is "due". If we say the third law is "for every action there is an equal but opposite reaction", and if we say F = ma, then I'm very surprised anybody believes the 3rd Law, although some may remember the existence of such an oddity. Telling people physics English is not common English; telling people a single event has affected two objects simultaneously, and there is an effect we call force which has the same magnitude but opposite direction on each object; and telling them that in the math of F = ma, changing "m" is not guaranteed to affect "F" because we are not guaranteed to be able to hold "a" constant (instead of "F", perhaps "a" or perhaps both change when "m" does): all of this would help build a foundation on which Newton's 3rd Law is believable and useful. After all, in every math class this kid ever had, if not told to change "a" then it does not change. F = ma, let m = 2, what's F equal? Let m = 4, what's the answer? Trick question: what's the relationship between the two Fs? They're EQUAL! Yeah, right, last time I checked 2a did not equal 4a. The Superposition Principle actually has a pure question in Q19; as spoken to previously, it's discounted. The remaining three Superposition questions are among the most difficult questions. Leaving aside Impetus, R2 and CI1 are the big misconception winners, and they are the same thing if "resistance" is viewed as a weaker force.
In reality, CI1, largest force determines motion, and R2, motion when force overcomes resistance, are actually true! In a typical one dimensional chalkboard example, the largest force does determine motion, as there are only two directions possible for the forces and we never show multiple small forces overcoming a single large force. Further, certainly in our static friction examples, there is no motion until the force does overcome the resistance. So labeling these as misconceptions is a bit much; perhaps "misapplied" would be more palatable. I suspect other problems, notably the idea that you need a leftover force in order to move after balancing out any opposing forces: i.e., problems with Newton's 1st Law. Things move on their own with no outside or inside requirements. Once moving they just continue to move for NO reason. Effects need causes for most people. They have yet to forge the odd way of thinking that for Q18: constant velocity = no change in velocity = no acceleration = zero acceleration = zero force (net) = in one direction, force (up) equals force (down). This is a lot more a matter of definitions than superposition; students have to focus on the code words "constant velocity" to the exclusion of all else, and then run a string of translations. This leaves aside the recurring problem of acceleration / velocity discrimination and Force / Energy / Momentum discrimination. After all, to go anywhere one clearly needs "left over" momentum, and energy and velocity, just not force or acceleration. The two remaining major divisions are the Second Law and Solid Contact Force. The 2nd Law is never pure; three times it's linked to Kinematics and once to the First Law. Solid Contact Force is pure twice, linked to 5G alone once, and linked to the combo [5G] and [4] once. CI2 and CI3 are the misconceptions of choice in competition with Newton's 2nd Law [2]. CI2, force compromise determines motion, is true if "compromise" is rigorously defined vector addition.
So, in my mind, this becomes almost a math issue instead of a physics content misconception. Besides, it's easier to think of Q24 in terms of momentum anyway, with one orthogonal component constant and the other steadily increasing from zero to significantly faster than its original component, but you still need math, not physics, to separate 24C (CI2) from 24E (correct). Belief in CI3, last force to act determines motion, is quite scary. It implies that what you're doing now will not affect the future if there is any intervening action. This is not true in the real world any more than in the idealized, so interviews with these students would be a valuable learning experience for me! By the way, I picked 24D and do not believe in I4. I do believe in scale. Given both the speed at which most "rockets" orbit the earth and their low thrust capabilities after discarding their boosters, trajectory 24D still makes sense to me, particularly given the FCI authors' comments about the scale effects of gravity (one part in 10^13) being ignorable. And yes, this is the trap of having real world considerations intrude into idealized space, a point our students need highlighted as well. The point is that I chose a MCSR item for a reason other than the listed misconception of I4, "gradual / delayed impetus building". I buy into gradual and building. I don't buy into delayed or impetus. Scale considerations are only acknowledged in the comments to G3 misconceptions by the FCI authors. They are not acknowledged in the misconceptions themselves. Math and scaling causes for item selection are ignored to the detriment of the test's utility. Wrapping up, the two pure 5S, Solid Contact Force, problems do not share common misconception choices, which is unfortunate. Still, it looks as if AF1 is the idea to address, particularly as even the best students picked it. AF1, only active agents exert forces, means according to the FCI authors that "usually living things... by direct contact... cause motion".
If a student selects 15A or 15B, then according to Table V, he believes in AF1. The problem is that 15A and 15B do not involve a "living thing", they do not explicitly specify direct contact as do 15C and 15D, and causing motion and changing the direction of motion are not automatically the same thing. The above alone is a slim reed, but 15A "energy of the ball is conserved" and 15B "momentum of the ball is conserved" seem to have many other reasons to be picked than AF1, not least being that momentum of the system is conserved and is usually the way these bouncing ball problems are done. Further, given that the ball and floor don't break apart or stick to each other, conservation of energy is true. So in 15A students may forget vectors or that Energy is not a vector. In 15B, word games (important ones) between "system" and "ball" may be the root cause of error. Or the implied time delay in 15C may cause some to shy away from the Newtonian response "...stops ...and then...". Without student interviews, I'm leery of buying into AF1 as the root cause for 15A or 15B selection.
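Returning to the Third Law discussion earlier in this section, the point that F = ma does not force "F" to change when "m" changes can be made concrete with a truck and car example. The following is a minimal sketch with made-up masses and a made-up contact force; the function name is hypothetical. The paired forces are set equal and opposite, and it is the accelerations, not the forces, that differ.

```python
def collision_accelerations(force_on_car_newtons, car_mass_kg, truck_mass_kg):
    """Return (car acceleration, truck acceleration) in m/s^2 during contact.

    The Third Law is applied directly: the force on the truck is equal in
    magnitude and opposite in direction to the force on the car.
    """
    force_on_truck_newtons = -force_on_car_newtons
    car_acceleration = force_on_car_newtons / car_mass_kg
    truck_acceleration = force_on_truck_newtons / truck_mass_kg
    return car_acceleration, truck_acceleration


# Illustrative numbers only: a 1000 kg car and a 2000 kg truck.
car_mass = 1000.0
truck_mass = 2000.0
a_car, a_truck = collision_accelerations(6000.0, car_mass, truck_mass)

# Doubling the mass halves the acceleration; the force magnitude is unchanged.
print(a_car, a_truck)                         # 6.0 -3.0
print(car_mass * a_car, truck_mass * a_truck) # 6000.0 -6000.0
```

Written this way, "m" and "a" change together between the two vehicles while the magnitude of "F" stays fixed, which is the relationship the bare label "a" in F = ma tends to hide.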
Chapter 19 : Conclusion to the Data Analysis
I was unable to find a few common unifying mental models comprised of a limited and patterned set of misconceptions. I thus take refuge in Redish's Paper 30 in which he advances the Individuality Principle: "each individual constructs his or her own mental ecology, different students have different mental models for physical phenomena...." Analysis of individual students is very useful for teacher-student interactions. Were I teacher JM, knowing that S10 completely disagreed with expert opinion on the Independence Cluster and that S01 completely agreed with expert opinion on the Math Link Cluster would greatly enhance my ability to teach S10 physics and to use S01 as a peer instructor. Written work and comments, or their absence, on these multiple choice tests provide insight into the state of the class as a whole. Were I teacher CF, the general lack of force diagram use by my third-semester, calculus-based-physics students on the MB would be the catalyst for both explicit instruction on test taking techniques and a pop quiz on force diagrams. The pop quiz would be to test my assumption that the students know how to construct force diagrams but were lulled by the multiple choice format into expending minimal effort. After all, I could be wrong: a majority of students may have wanted to use force diagrams and been unable, or they may not have recognized the applicability of force diagrams when lacking explicit instructions. So, while I was unable to categorize students by a few, well-defined, non-Newtonian, mental models, the tests proved to be quite beneficial as pre-instruction reality checks.
Part III : The Thesis
Chapter 20 : Introduction to a Thesis
I have no desire to repeat my previous introductions. I needed to write Part III both to learn the material and to enjoy my learning. Having read Part I is as thorough an introduction as possible to the material. If you choose to dig into the meat of my effort, thank you. May you find some diamonds in your future mines (credit to D. Goodstein).
CHAPTER 21 : Assessment Tools
21.1 : Paper 01, Force Concept Inventory Following this introductory paragraph is a review of D. Hestenes et al.'s paper, "Force Concept Inventory", commonly referred to as the FCI. The FCI is the touchstone of PER literature; it is used in a large majority of papers which address Newtonian Mechanics. Hake's average normalized gain, <g>, uses pre- and post-instruction FCI scores in its derivation. <g> is then in turn used quite extensively to judge the effectiveness of various curricula. Specifically, it is the main criterion by which Interactive-engagement (I.E.) instruction is judged to be more effective than Traditional instruction. Thus a large portion of current PER results rely on the soundness of the FCI. There are criticisms of the FCI, notably that it is a qualitative, not a quantitative, test and that its use of normal everyday language, instead of physics formalism, leads to some ambiguity. There was a minor revision some years ago, with the most up-to-date version distinguishable by having thirty questions versus twenty-nine in the original. The change has had no noticeable effect, unlike the first shift from Mechanics Diagnostics (MD) to the FCI. The FCI is unique in PER in that its distractors [wrong multiple choice answers] have explicit meaning and are commonsense non-Newtonian widely held beliefs. In fact, analysis of wrong answers can tell an instructor a wealth of information about his students. It is this, rather than <g>, that makes this test most useful to an individual instructor.
D. Hestenes, M. Wells, and G. Swackhamer; "Force Concept Inventory," The Physics Teacher Vol. 30, 141-158 (1992)
The FCI was designed to improve on the Mechanics Diagnostic test (MD). The FCI has a systematic and complete profile of the various non-Newtonian misconceptions as they relate to force and Newton's Laws. The authors believe 80% on the FCI to be the "threshold" score for Newtonian thinkers. Furthermore, data suggests that an FCI score below 60% indicates the student's grasp of Newtonian concepts is insufficient for effective problem solving. Additionally, data suggests that student scores are unlikely to surpass their teacher's score. It is noted that "The FCI score should be viewed as an upper-bound on a student's Newtonian understanding" as interviews show that students sometimes choose Newtonian choices for non-Newtonian reasons. The FCI assesses the student's overall grasp of the Newtonian concept of force. It decomposes this concept into six components: Kinematics, First Law, Second Law, Third Law, Superposition Principle, and Kinds of Force. The FCI can be used as a diagnostic tool, for evaluating instruction, and as a placement exam for advanced college courses. The FCI is not an intelligence test, it is a probe of a belief system. It should not be used to place students in regular versus honors high school classes. Additionally, there is no correlation between scores and socioeconomic ranking of the schools used during test validation. The fundamental reason the FCI is not "just another physics test" is that the incorrect choices are correlated to specific misconceptions. These misconceptions are very important as they must be overcome and replaced by the non-commonsense Newtonian thinking before the student is asked to progress further in his physics education. Otherwise, the student is building his educational edifice on a foundation of sand. The FCI probes 28 distinct misconceptions in six major commonsense categories: Kinematics, Impetus, Active Force, Action/Reaction Pairs, Concatenation of Influences, and Other Influences on Motion. 
These 28 misconceptions are matched to corresponding incorrect answers in Table II of paper 1. A few examples of misconceptions are: velocity-acceleration undiscriminated, circular impetus, no motion implies no force, largest force determines motion, mass makes things stop, and heavier objects fall faster. These errors are commonsense misconceptions which are reasonable hypotheses grounded in everyday experience. Some of these errors were firmly held by Galileo and even Newton. That these misconceptions are, in fact, false is often NOT easy to prove. There is a section on Overcoming Misconceptions which stresses the "unitary concept of force", and that the instructor should anticipate the 28 misconceptions. The instructor should discuss specific misconceptions, focus student attention on crucial issues, and bring discussions to a "satisfying closure". The authors argue that effective instruction requires more than dedication and subject knowledge; it requires technical knowledge about how students think and learn. There is the additional note that the misconceptions most difficult to overcome are the impetus concept of motion and the dominance principle. The FCI was given to 1500 high school students and 500 university students during validation, primarily in the state of Arizona. This study indicates that attitude, not intelligence nor mathematical competence, is the prime cause of greater achievement. In the case of Arizona high school students, their attitude is primarily attributed to family influence. This pre-Hake paper argues that post-test results are essentially independent of pretest scores and reiterates an earlier paper's findings that for conventional instruction [Mechanics Diagnostic] post-test scores are instructor independent. The paper goes on and discusses the Wells method. 
This method is computer-based, laboratory oriented instruction with no lectures and is not compatible with large lecture class format used in college introductory physics classes. The Wells method did result in comparatively high post FCI results, but it had not yet been successfully used by other instructors at the published date of paper 1. At Harvard, the average time to take the FCI was 23 minutes. At Arizona State University 40 minutes was given for test taking. Most high schools gave a full 50 minute class period.
21.2 : Paper 02, Average Normalized Gain; <g> Following this introductory paragraph is a review of R.R. Hake's paper, "Interactive-engagement versus traditional methods...." The main tool used to show Interactive-engagement's (I.E.) superiority over Traditional Methods in physics instruction is the average normalized gain, sometimes referred to as the "Hake factor" and symbolized by <g>. Paper 2 is unusual in that its sample size is six thousand students, whereas most PER data is derived from a few hundred students. Those papers based on interview data sometimes focus on only ten or twenty students. <g> takes into account fluctuations in FCI pretest scores; these scores fluctuate significantly, contrary to what was asserted in the original FCI paper. Hake also uses a couple of additional tests to support his basic thesis that I.E. instruction is better than Traditional, but this paper's basic tool, <g>, is what future papers will most often use themselves. I should also point out that the fairly clear distinction in this paper between I.E. and Traditional becomes blurred in future work. For example, Interactive Lecture Demonstrations (ILD) "enhances" traditional lectures and some I.E. methods such as Studio Physics at Rensselaer are no "better" than Traditional methods. Both of the above judgments are based on the use of <g>.
R. Hake, "Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses," Am. J. Phys. 66(1), 64-74 (1998)
Each student gets a "g", which is the FCI post-instruction test score minus the FCI pre-instruction test score, all divided by the quantity 100% minus the FCI pre-instruction test score, where all scores are in percentage form. Each class gets a <g>, which is the same math with each score replaced by its class average. Each type of instruction (I.E. and Traditional) gets a <<g>>NP, which is the average of <g> over N courses of type P. The basic result is <<g>>14T = 0.23 ± 0.04 and <<g>>48IE = 0.48 ± 0.14. Thus, I.E. is a two sigma winner over traditional methods. The next big issue is how I.E. and Traditional are defined in this paper, but there are several smaller issues to be addressed first. The greater standard deviation for I.E. is attributed to "the variety of I.E. options and the varying effectiveness of implementation". There exists a wide range of course average pretest scores, 18% through 71%. Data gathering was biased in favor of high gaining classes. Due to the self reporting nature of data collection, low gains are usually neither published nor communicated. To increase statistical reliability, averages over courses are limited to those courses enrolling twenty or more students. The use of average normalized gain, <g>, instead of average post-test scores or average gain is defended in this paper. Interactive Engagement (I.E.) is defined as:
methods as those designed at least in part to promote conceptual understanding through interactive engagement of students in heads-on (always) and hands-on (usually) activities which yield immediate feedback through discussion with peers and instructors.
Traditional methods are defined as relying "primarily on passive-student lectures, recipe labs, and algorithmic-problem exams." For paper 2, 6548 students in 62 introductory physics courses were divided into 48 I.E. courses and 14 Traditional courses. The paper mentions many I.E. programs by name, such as: Overview Case Studies, Active Learning Problem Sets, Concept Tests, S.D.I. labs, and Minute Papers. While many of the I.E. courses have lower enrollments, four are highlighted as having enrollments of over 200 students. These four make use of collaborative peer instruction and employ undergraduate students to augment the instructional staff. At Indiana University where Hake teaches, I.E. activities include team teaching, a "Physics Forum" open 5-8 hrs/day, color coding of displacement, velocity, acceleration, and force vectors in all components of the course, and the use of grading acronyms. The key to <g> is the FCI, and Hake spends some time addressing FCI-specific issues. He contends that "most physicists would probably agree that a low score on the FCI / MD indicates a lack of understanding of the basic concepts of Mechanics". He points out the existence of pro and con arguments as to whether a high FCI score really is an indicator of attaining a unified force concept. He then notes that whether yea or nay, the literature labels the FCI as the "best test currently (1998) available." He addresses five systematic errors that are "sometimes involved" in FCI testing. First, question ambiguities and isolated false positives. The revised FCI (1995) was used, though with "little impact" on <g>. Interview data suggests this first issue is rare, and these errors would affect both I.E. and Traditional classes equally; thus they have little effect on the differences in their gains. Second, teaching to the test and test-question leakage.
Buried in his reference 48 is his belief that both the FCI and MB (Mechanics Baseline) will have "useful lives" only until 1999 and that new and better tests are both sorely needed and should be treated with the confidentiality of the MCAT. That being his belief for the future, in the paper itself, he argues qualitatively that as of 1998 this issue is not yet a problem. Third, courses spend varying amounts of time on mechanics; the assumption is that students would show higher Hake factors in courses devoting more time to mechanics. Hake does an institution comparison of course time spent on mechanics and finds that the gain difference between I.E. and Traditional courses is "not very sensitive" to this possible systematic error. Fourth, a failure of students to do their best work on the pretest can artificially raise <g>. A failure of students to do their best work on the post-test can artificially lower <g>. Hake asserts that students did in fact take both tests seriously in part due to responses on instructor surveys and in part because <g> showed minimal fluctuations whether or not explicit grade motivations were offered by the instructor. Fifth, there are effects which produce short-term benefits independent of any intrinsic worth of instructional method. Two of these are: The Hawthorne effect, where the research test group benefits simply by receiving special attention, and the John Henry effect, where the control group benefits by a competitive desire to out perform the test group. The Hawthorne effect is discounted for the I.E. groups by taking a subset of long standing courses and comparing its average to the total groups. The John Henry is ignored as it would only further increase <<g>>48IE - <<g>>14T. While noting that there was no "reliable quantitative estimate of the influence of systematic errors", the general uniformity of results suggests that the 2-sigma difference in average normalized gains between I.E. 
and Traditional courses is primarily a reflection of pedagogical effectiveness and/or implementation. There are four additional points to raise briefly. First, Hake also uses the Mechanics Baseline (MB) test. The MB is a more quantitative test than the FCI and is usually given as a post-test. The MB and FCI scores show an extremely strong positive correlation, with MB scores about 15% below FCI averages. The MB has its detractors. For some, it does not sufficiently probe advanced abilities required in "context rich", "experiment", "goal-less" or "out-of-lab" problems. For others, its problems are not sufficiently similar to those in Halliday-Resnick. Second, all of these tests are multiple choice, with random guessing yielding a score of 20%. However, it is quite possible for non-Newtonian thinkers to score below 20% on the FCI, due to "very powerful interview-generated distractors". Third, Hake concludes his paper with four sections that strongly advocate the use of I.E. methods. In summary, "The use of I.E. strategies can increase mechanics course effectiveness well beyond that obtained with traditional methods". Fourth and finally, in an epilogue, he speaks to his fear that history shows that the best of efforts may have little lasting impact. Hake's paper, by virtue of its 6000 student base and its introduction of <g> as a tool for instructional comparison, is one of the most referenced PER papers.
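Since <g> carries so much of the argument in paper 2, a small worked sketch may help. It assumes the usual reading of Hake's average normalized gain: the class-average gain divided by the maximum possible gain, <g> = (post% - pre%) / (100 - pre%). The function name and the class averages are hypothetical illustrations of my own, not Hake's data. The second call illustrates the fourth systematic error: a class that underperforms on the pretest shows a larger <g> for the same post-test average.

```python
def normalized_gain(pre_percent, post_percent):
    """Average normalized gain <g> from class-average pre/post percentages.

    Assumed definition: actual gain divided by the maximum possible gain.
    """
    if not 0 <= pre_percent < 100:
        raise ValueError("pretest average must be in [0, 100)")
    if not 0 <= post_percent <= 100:
        raise ValueError("post-test average must be in [0, 100]")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


# Hypothetical class: a true pretest average of 50% and a post-test average of 70%.
serious_pretest = normalized_gain(50, 70)

# Same class, same post-test, but the students did not try on the pretest (40%).
careless_pretest = normalized_gain(40, 70)

print(f"<g> with a serious pretest:  {serious_pretest:.2f}")
print(f"<g> with a careless pretest: {careless_pretest:.2f}")
```

With these invented numbers the careless pretest inflates <g> even though nothing about the instruction changed, which is why Hake needs to argue that both tests were taken seriously.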
21.3 : Paper 03, Mechanics Diagnostic
Following this introductory paragraph is a review of I. Halloun and D. Hestenes's paper, "The initial knowledge state of college physics students". D. Hestenes is an author of both the FCI paper and the Mechanics Diagnostic, which is the FCI's forefather. The most interesting thing about paper 3 is its inclusion of a Math Diagnostic test in its Competence Index. This is the last paper (1985) of which I am aware in which math is explicitly incorporated into a definition of physics competence. A large majority of PER is qualitative in nature and a large portion of it has an implicit anti-math stance. As an example, P. Lindenfeld in his "Format and content in introductory physics" letter published in Am. J. Phys. Jan. 2002, states "My own opinion is that we are spending too much time trying to improve the mathematical facility of our students. I am not sure that to do so is our primary responsibility." In fact, it seems to not be our secondary responsibility either. Lindenfeld equates teaching physics through mathematical procedures to teaching poetry "by parsing, analysis, and rhyme structure, with little energy left for thought and substance." He is, in my opinion, a representative voice for a large segment of PER. In the MD paper, math ability counts for half of the Competence Index, an acknowledgment of the language of physics which I wish were emulated by others. Part of math's devaluation is that while the MD is included as an appendix to their published paper, the math diagnostic test is not. Worse, there is a belittling comment that the reader could make up his own. As an aside, there is a Math Project at UCLA that provides validated multiple choice math diagnostic tests for free and provides free grading. Tests are available for all subdivisions of math including algebra and calculus. The following short review is not comprehensive since the MD has been superseded by the FCI. 
Nevertheless, certain aspects should be highlighted. The reader is cautioned that the MD and the MB are totally different tests.
I. Halloun and D. Hestenes, "The initial knowledge state of college physics students," Am. J. Phys. 53(11), 1043-1055 (1985)
Each student entering physics possesses a system of beliefs and intuitions about physical phenomena derived from his life. This is his common sense theory of the physical world. The student uses it to interpret what he sees and what he hears in his physics course. Conventional physics instruction fails to take this non-Newtonian common sense theory into account. Because common sense theories are both non-Newtonian and very stable [conventional instruction does little to change them], students systematically misinterpret material in introductory physics courses. This discrepancy between common sense theory and Newtonian physics is brought to light by "the Instrument". The Instrument assesses the student's Basic Knowledge State by two tests. The first is a physics diagnostic test which assesses the student's qualitative concepts of common physics. The second is a math diagnostic test to assess the student's mathematical skills. The Mechanics (physics) diagnostic test assesses the student's qualitative conceptions of motion and its causes; it also identifies common misconceptions. The early versions required written answers, and the most common misconceptions were selected as the alternative answers in the final multiple choice version. Particular items were chosen to highlight the major differences between common sense and Newtonian concepts. The basic kinematical items are position, distance, motion, time, velocity and acceleration. The basic dynamical items are force, resistance, vacuum, and gravity. The Instrument also includes a thirty-three question multiple choice mathematics diagnostic test which is not included in paper 3. The authors note that the math errors were not completely random and that error patterns indicated common misconceptions, but these were not analyzed. 
They note that MD and math pretests assess independent components of a student's initial knowledge state and that high mathematical competence is not sufficient for high performance in physics [there is no comment as to whether it is a necessary condition]. The Instrument was given to 1500 Arizona State University students and eighty high school students, distributed among six professors and one teacher. The very low pretest scores for high school students led the authors to state that "physics instruction in high school should have a different emphasis than it has in college... the low scores indicate that [high school] students are prone to misinterpreting almost everything they see and learn in a physics class." Furthermore, given that the styles of the four lecturers in university physics vary widely within the formula of conventional instruction, and that all four basically had the same knowledge gains, the authors conclude that "basic knowledge gain under conventional instruction is essentially independent of the professor." The Instrument is recommended for use as a placement exam, to evaluate instruction, and as a diagnostic test. Its validity and reliability were addressed by a variety of means including giving it to graduate students, interviewing undergraduate takers, and performing the Kuder-Richardson test of reliability. It was found that differences in academic background have a small effect on performance in introductory physics, with no effect found deriving from gender, age, academic major, and high school mathematics differences. The Instrument generates the Competency Index (C.I.), which is a practical measure of the student's knowledge state. The C.I. is the raw pretest scores of both diagnostic tests added together, for a maximum of sixty-nine. It was found that end-of-class grades correlated strongly to the pre-instruction C.I. scores. Students who had a C.I. < 30 had a 95% chance of getting a C or worse. The authors argue that such C.I. 
recipients should be "considered candidates for a pre physics preparatory course." The overall small gains and low pretest scores mean that many students continually misunderstand the material presented. In particular, a low score on the MD does not mean simply that basic concepts of Newtonian mechanics are missing; it means that alternative misconceptions about mechanics are firmly in place. "To ignore the initial common sense knowledge in physics instruction is akin to ignoring initial conditions in integrating differential equations."
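The C.I. arithmetic is simple enough to sketch. The thirty-three math items and the maximum of sixty-nine come from the review above; the thirty-six MD items are my inference from the difference of those two numbers, and the function names and sample scores are hypothetical.

```python
MATH_ITEMS = 33           # stated in the review
MAX_INDEX = 69            # stated in the review
MD_ITEMS = MAX_INDEX - MATH_ITEMS   # inferred: 36 mechanics diagnostic items
PREP_CUTOFF = 30          # C.I. < 30: 95% chance of a C or worse


def competence_index(md_raw, math_raw):
    """Sum of the two raw pretest scores, as the C.I. is described above."""
    if not 0 <= md_raw <= MD_ITEMS:
        raise ValueError("MD raw score out of range")
    if not 0 <= math_raw <= MATH_ITEMS:
        raise ValueError("math raw score out of range")
    return md_raw + math_raw


def is_prep_candidate(index):
    """True when the index falls below the cutoff quoted from paper 3."""
    return index < PREP_CUTOFF


# Hypothetical student: 12 of 36 on the MD and 15 of 33 on the math diagnostic.
ci = competence_index(12, 15)
print(ci, is_prep_candidate(ci))
```

Because the two raw scores are simply added, the math diagnostic contributes nearly half of the index (33 of 69 points), which is the weighting I praise in my introductory paragraph.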
21.4 : Paper 04, Mechanics Baseline
Following this introductory paragraph is a review of D. Hestenes and M. Wells, "A Mechanics Baseline Test." This comparatively short paper presents the FCI's stepbrother, the Mechanics Baseline Test (MB). The MB appears more quantitative than the FCI and uses more physics formalism. It is used almost exclusively as a post-test. In the literature, it is often used to show that a focus on concepts and less in-class problem solving are not detrimental to student results on a quantitative test. Unlike those of its brother, its multiple choice distractors are not value laden. As already mentioned, please note that the MB is not the same thing as the MD.
D. Hestenes and M. Wells, "A Mechanics Baseline Test," The Physics Teacher Vol. 30, 159-166 (1992)
The Mechanics Baseline Test is a universal, basic, mechanics concept assessor of student understanding. There exists an extensive baseline of post-instruction scores which allows for evaluating and comparing the effectiveness of instruction. The best use of the test is for post-instruction evaluation. The MB is a step above the FCI in assessing mechanics understanding. The FCI was designed for students without formal training in mechanics, to elicit their preconceptions. The MB emphasizes concepts which require that formal training. The MB and the FCI are complementary and together provide a "fairly complete profile" of Newtonian understanding. The main intent of the MB is to assess qualitative understanding although it looks like a conventional quantitative test. Unlike the FCI, the MB distractors are not "common sense alternatives" although they do include "typical student mistakes". Problems that can be solved by plugging into a formula were excluded. The MB is not easy. Students at all levels get low scores. This despite less than a third of the questions requiring algebraic manipulation or more than one step reasoning, and despite the exclusion of advanced topics such as angular momentum. Tables highlight the Newtonian Concepts on the MB, provide the correct responses, and provide the percentage of correct answers by various groups (mostly Arizona high schools). The MB addresses kinematics thoroughly with twelve questions. The MB addresses the conservation laws for energy and momentum in both the work-energy and the impulse-momentum forms. It also has subsets of questions addressing Calculation and Diagrams. Widespread deficiencies in the qualitative understanding of acceleration were found among both students and among university introductory physics instructors! The MB is graphically referenced to post-instruction FCI data with the authors asserting that a good score on the FCI is a necessary but not sufficient condition for a good score on the MB. 
Sixty percent on the FCI is a "conceptual threshold" needed for effective problem solving on the MB, and eighty percent on the FCI is a "mastery threshold" required to achieve more than eighty percent on the MB. This paper ends with a "Note added in proof" by Prof. Eric Mazur of Harvard University, who outlines a procedure of interacting with his lecture class that raised his FCI / MB class averages from 77% / 60% to 85% / 72%. While the end of Paper 4 is a good segue into ConcepTests and Peer Instruction, we aren't yet finished with the assessment tools by which most of the literature judges its success. The upcoming review will be our last Mechanics test. It will be followed by some tests focusing on other aspects of physics including E&M and Thermodynamics. I chose to place assessment tools at the beginning of Part III because what tool you use to determine the success or failure of your procedure is almost as critical as how you define success in the first place.
21.5 : Paper 05, Force and Motion Conceptual Evaluation
Following this introductory paragraph is a review of R. Thornton and D. Sokoloff's paper "Assessing student learning... the Force and Motion Conceptual Evaluation..." (FMCE). Paper 5, published in 1998, seems to be the successor of the 1992 FCI. Its use would address some, if by no means all, of Hake's reference 48 issues. Notably, if the useful life of the FCI and MB ended in 1999 then despite its nonsecure status the FMCE should be their replacement. This does not seem to be what's happened. The FCI and <g> are currently (2002) in use, and I have rarely seen usage of the FMCE. In particular, I have found no paper correlating FMCE scores to FCI scores, which would allow FCI-based papers to still be useful after such a transition. I suspect it's a feet and meters kind of thing; even if a conversion scale existed, an education community comfortable in its traditional ways feels no need to transition to the FMCE despite the prevalent use of FCI test questions in other forums. For example, the FCI question 10 was used as a concept test in a Cabrillo Community College physics class last semester. Further, an instructor at Cabrillo, who gave the FCI as a pretest this semester, promptly handed it back, after grading, so students could study it. This is what he does with all of his tests. So, while I do not believe there is a concerted effort to bias FCI data, the unconcerted result is concerning. The FCI is a ten year old public test with really neat questions that instructors find useful and interesting. The FMCE, on the other hand, is basically an unknown. It lacks explicit meaning for its distractors, and it lacks the equivalent of Hake's 6000 student survey which brought widespread respect to the FCI. In an effort to increase awareness, I present the following review.
R. Thornton and D. Sokoloff, "Assessing student learning of Newton's laws: The Force and Motion Conceptual Evaluation and the Evaluation of Active Learning Laboratory and Lecture Curricula," Am. J. Phys. 66(4), 338-352 (1998)
Paper 5 presents the FMCE. It also uses the results of a subset of four FMCE dynamical questions to justify an instructional method that combines the "Tools for Scientific Thinking (TST) Motion and Force", the "Real Time Physics (RTP) Mechanics", and "Interactive Lecture Demonstrations (ILD)". The first two (TST and RTP) are microcomputer-based laboratory (MBL) curricula. The total gain as measured by the FMCE for using all three methods was an incredible 75%! Unfortunately, combining these two goals results in a minimally useful FMCE. The authors imply that wrong answers can be used to evaluate student views, but they offer no explicit methodology to accomplish this end. The authors "are able to identify statistically most student views from the patterns of answers and because there are very few random answers." Unfortunately, they don't provide us with the statistical patterns, nor with a listing of the different student views. What they do provide is a "focus" on dynamics concepts, as probed by four sets of questions: the Force Sled, the Cart on a Ramp, the Coin Toss and the Force Graph. These sets are a substantial chunk of the 43-question test. The Force Sled (questions 1-7) and the Force Graph (questions 14-21) were given as a pre- and post-test to 240 students at the University of Oregon. The total improvement due to traditional instruction averaged seven percent. The Force Sled (FS) and Force Graph (FG) ask about similar motions in very different ways. The FS uses natural language as much as possible and explicitly describes the force. The FG uses graphical representations in an explicit coordinate system but does not explicitly describe the force. If students do much better answering the FG than the FS, it is possible that their English language skills are weak. Conversely, if students answer Question 15 (FG) incorrectly, it is likely that they are unable to read a graph. 
Question 5 (FS) is designed to identify students who are just beginning to consider Newton's First Law. Question 6 (FS) should be interpreted cautiously, as 40% of physics faculty chose the incorrect answer, F. Questions 1-4 and 7 are used to make a composite average labeled "natural language evaluation". Questions 14 and 16-21 make a composite average labeled "graphical evaluation". There are no more specifics. On a general note, the authors assert that there is a learning hierarchy formed first by kinematics and then by dynamics. An improved student understanding of kinematics also improves student learning of dynamics. The validity of the FMCE is addressed by observing the results of those students labeled Newtonian Thinkers (seven out of 8 questions right on the Force Graph) on other tests, and on their written explanations to the Cart on Ramp, question 9. It is noted that guessing correctly is very difficult because there are up to nine choices, with choices being derived from an open-ended questioning process during student interviews. Finally, paper 5 is one of the few papers to mention retention of knowledge. Retention, after six weeks in which there was no additional dynamics instruction, was superb with an increase in students answering in a Newtonian way. This increase (approx. 6%) is attributed to assimilation of concepts. Well that's it for mechanics tests. While a great deal of PER focuses on Introductory Physics, Mechanics, Newton's Laws, Force, and acceleration, recently there has been an expansion into a wider range of physics domains, and thus a need for assessment tools in those domains. The next couple of reviews introduce assessment tools for Electromagnetism and Thermodynamics.
21.6 : Paper 06, Conceptual Survey in Electricity and Magnetism
The following is a review of the paper by D. Maloney et al. in which the Conceptual Survey in Electricity and Magnetism (CSEM) is introduced and used. This recent paper (2001) is an attempt to provide PER with a tool for E&M which is analogous to the FCI for Mechanics - a standard which allows various research and different programs to be compared and contrasted quantitatively.
D. Maloney, T. O'Kuma, C. Hieggelke, A. Van Heuvelen, "Surveying students' conceptual knowledge of electricity and magnetism," PER Am. J. Phys. Suppl. 69(7), S12-S23 (2001)
The FCI has raised the consciousness of many physics teachers about the effectiveness of traditional education in educating students about basic kinematics and Newton's three laws. To assess students' knowledge of electricity and magnetism, the authors have developed the CSEM. The CSEM is a broad survey instrument for use in general physics courses. It can be used to assess a student's initial knowledge state as well as the effect of various curricula and instructional methods on improving that state. The CSEM has a number of significant differences in comparison to the FCI; not least is its reliance on other domains, such as force, motion and energy. The CSEM deliberately excludes DC circuits because the test needed to be shortened, and there are other instruments for assessing student understanding of DC circuits. The iterative four year development process started with questions gathered in a workshop and worked its way through an open-ended version to gather valid distractors. What began as two separate tests, one for electricity and one for magnetism, were combined. The final stages of revision were based on feedback from instructors who evaluated and/or administered the earlier versions. The quality of individual test items and the quality of the overall test are addressed; two measures of item quality are difficulty and discrimination. Difficulty is the fraction of students who get the item correct; for the CSEM, the difficulty ranges from 0.1 to 0.8. There are only seven items with a difficulty over 0.6, which is less than ideal. Discrimination assesses how well a test question differentiates between the students who score in the top 27% on the entire test (the upper group, U) and the students who score in the bottom 27% (the lower group, L). A value called the "Item discriminator" ($I_d$) is created where $(N_U - N_L) \div (N_U + N_L) = \tfrac{1}{2} I_d$. As I read it, the numerator counts the students in each group who answer the item correctly, while the denominator counts all the students in the two groups. CSEM item discriminators range from 0.10 up to 0.55. All but four questions out of thirty-two have $I_d$ values above the traditional lower limit of acceptability, 0.20. 
Difficulty and discrimination are correlated, with the full range of discrimination only available to those problems with a difficulty of 0.5. Two standard measures of overall quality are validity and reliability. Validity was assessed by forty-two community college instructors using a 5 point scale. A table provides the mean and standard deviation for each question's validity. Algebra-based courses and Calculus-based courses are presented separately. All means are above 4.0. Reliability is calculated by the KR20 test; the authors cite a reference and give a brief description. Essentially, the actual test is broken into two tests, each consisting of half the items; the correlation between performance on these two subtests is calculated. This is repeated for all possible half-item subtests. The KR20 post-test estimates for the CSEM are around 0.75. Reliabilities of 0.9 to 1.0 are rare. Reliabilities of 0.8 to 0.9 indicate that a test can be used for both individual and group evaluation. Reliabilities of 0.7 to 0.8 are common for well-made cognitive tests. Reliabilities of 0.6 to 0.7 indicate weak cognitive tests but are acceptable for personality tests. Reliabilities of 0.5 to 0.6 are common for well-made classroom tests. Further validation was done by running factor analyses (principal component) on the CSEM. Factor analysis looks for significantly correlated groups of test questions. Eleven factors were found, with the largest at 16%. The improvement of factor structure requires additional questions which would increase testing time, a sensitive issue with classroom instructors. The 16% is very small for a first factor, and while the eleven are mathematically identifiable, they are not considered meaningful by the authors. The CSEM is a valid, reliable instrument that probes both the limited existing set of student alternative concepts and aspects of E&M formalism. Having presented the instrument, the authors use it. 
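For readers who, like me, have never run these item statistics, the following is a minimal sketch of how difficulty, discrimination, and KR20 could be computed from a table of right/wrong answers. It is not taken from paper 6. The KR20 line uses the standard closed-form Kuder-Richardson formula, (k / (k - 1)) (1 - sum of p(1 - p) / variance of total scores), rather than the split-half description above; the use of the population variance, the function names, and the five-student data set are all my own assumptions.

```python
from statistics import pvariance


def item_difficulty(responses):
    """Fraction of students answering each item correctly.

    responses[s][i] is 1 if student s got item i right, else 0.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(r[i] for r in responses) / n_students for i in range(n_items)]


def item_discrimination(responses, fraction=0.27):
    """Upper-group minus lower-group correct counts, divided by the group size.

    The groups are the top and bottom `fraction` of students by total score.
    Dividing by one group's size equals dividing by half the combined size.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    size = max(1, round(fraction * len(ranked)))
    upper, lower = ranked[:size], ranked[-size:]
    n_items = len(responses[0])
    return [
        (sum(r[i] for r in upper) - sum(r[i] for r in lower)) / size
        for i in range(n_items)
    ]


def kr20(responses):
    """Kuder-Richardson formula 20 (population variance of totals assumed)."""
    k = len(responses[0])
    p = item_difficulty(responses)
    total_variance = pvariance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(pi * (1 - pi) for pi in p) / total_variance)


# Hypothetical answers: five students, four items, invented for illustration.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print("difficulty:    ", item_difficulty(responses))
print("discrimination:", item_discrimination(responses))
print("KR20:          ", round(kr20(responses), 2))
```

A real class would of course need far more than five students before the 27% groups or the reliability estimate mean anything.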
The result for algebra-based pre-instruction testing is 25% ± 8%; post-instruction testing is 44% ± 13%. The result for calculus-based pretesting is 31% ± 10%; post is 47% ± 16%. The sample size total is approximately 1500 students. There is a noticeable disparity between the results on the electricity questions (1-20) and on the magnetism questions (21-32). Scores were lower on magnetism questions by roughly ten percent depending on the bin. Question results are presented with student percentages for each of the multiple choice letter responses, as broken out by pre/post test and algebra/calculus based. The CSEM has eleven conceptual areas; the paper addresses seven of them. First, Conductors and Insulators: There are a substantial number of students who can not distinguish between conductors and insulators. Second, Coulomb's Law: students seem to believe larger charge magnitudes exert larger forces than smaller charge magnitudes. Third, Force and Field Superposition: students confuse magnetic field effects with electrical field effects. Fourth, Force, Field, Work and Electric Potential: students still associate constant velocity with a constant force and cannot deduce the direction of the electric field from a change in a potential. Fifth, Magnetic Force: getting students to see if an electric charge has a velocity with a component perpendicular to the magnetic field direction is very difficult in magnetic force problems. Sixth, Faraday's Law: students do not see a collapsing loop as changing magnetic flux or rotating loops as not changing that flux. Seventh, Newton's Third Law: over 50% of students fail to believe Newton's Third Law extends to electromagnetic situations. The CSEM provides an estimate of student learning for some important ideas in electricity and magnetism; it can provide guidance for research. Research needs to be done in determining the nature of students' alternative ideas about topics in electricity and magnetism. 
Pretest responses show recurrent patterns, some of which are highly resistant to change by traditional instruction. Interwoven in this unclarified area is language use and students' interpretations of language use. Fractional gains [<g>s using CSEM scores] range from 15% up to 60%, clustering around 30%. Additional research on instructor strategies is needed to determine the impact of particular techniques on student performance. As a concluding paragraph of my own, I would like to highlight an issue: Validation. This paper offers a far more in-depth presentation of assessment instrument validation than any other paper I have read in PER. While I have never performed a KR20 test, I can but contrast this with the validation presented by the authors of the FCI paper (paper 1):
Formal procedures to establish the validity and reliability of the FCI are UNNECESSARY [my highlight] because of its similarity to the Mechanics Diagnostic for which considerable care was taken to validate.
This casual attitude to validity is extensive. For example, the validity of the FMCE (paper 5) is addressed by observing the results of those students labeled "Newtonian Thinkers" (seven out of 8 questions right on the Force Graph) on "other tests" and on their "written explanations to the Cart on Ramp question 9." This nonexistent or casual validation leaves open legitimate skepticism and illegitimate skepticism that these tests actually are what they profess to be: the bedrock on which instructional change is built. Reducing skepticism is one important factor in achieving widespread use of PER results, and therefore validation should be done thoroughly and seriously. The final full physics-content test presented in this chapter is the Thermal Concept Evaluation (TCE). I say full test because many of the research papers include a question or five, written and used by the authors for their paper-specific purposes. A few of these questions are designed to be added to the FCI, but many of them are stand-alone little quizzes. I say physics-content because PER also extends beyond content into epistemology. Epistemology has its own set of tests, one of which, the Maryland Physics Expectations Survey (MPEX), I will review as paper 32.
21.7 : Paper 07, Thermal Concept Evaluation
S. Yeo and M. Zadnik's upcoming paper was published in "The Physics Teacher", a very enjoyable magazine. Most of this literature review is of papers from that source and from the American Journal of Physics (Am. J. Phys.). The Am. J. Phys. is a far more scholarly journal which, in addition to a monthly issue, puts out a yearly PER supplement. There are a large number of additional PER sources. For example, the back of the PER Am. J. Phys. Suppl. 68(7) lists 39 papers in thirteen publications for 1999 alone.
S. Yeo and M. Zadnik, "Introductory Thermal Concept Evaluation: Assessing Students' Understanding," The Physics Teacher Vol. 39, 496-504 (2001)
The authors present an instrument that is almost the FCI of Thermal Physics. I say almost because while naive alternative concepts are used as distractors [wrong answers for Multiple Choice Single Response (MCSR) test questions] and matched to specific questions in a table, the distractors are not matched to specific answer choices within the question. Thus, the reader is left to match misconception to answer, a step I would have preferred the authors to have done. Further, the correct answers are not noted, and I was uncertain of the correct responses to three questions. The answers I chose are in agreement with those chosen by my mentor, who also took this test. Having contrasted it to the FCI, the comparison is still good; the TCE is valuable to the Thermodynamics professor. The specific alternative conceptions (Heat and Temperature are the same thing, Skin or Touch can determine temperature, Heat rises, the bubbles in boiling water contain "air", "oxygen" or "nothing") will surprise few. However, the sheer number of thirty-five Alternative Thermal Physics Conceptions will provide some insight for everybody. Paper 7 also addresses Instrument Development, Testing the Instrument, and Test Validity. The TCE is provided as Appendix I to paper 7. It has been made available for use as a pre/post-test, for assessing alternative concepts at any point of instruction, and for planning instruction or remediation. Paper 7 is very similar to the FCI paper in its non-content specific parts, and the content specific alternative beliefs are presented, not developed. Yeo and Zadnik have provided a valuable tool for the Thermodynamics professor.
21.8 : Paper 08, Concentration Analysis
Tests can provide, and are most often used for, an overall score. However, individual questions or subsets of questions can provide useful information to instructors and students. A method to extract information internal to a test is presented by L. Bao and E. Redish in their paper, "Concentration analysis..." They devise an algorithm applicable to "any" MCSR test, and they demonstrate their technique on the FCI. The basic idea is one I find valuable, and in the Cabrillo Community College Data Analyses Section, I have applied their Concentration analyses to MB data. My basic problem with their paper lies in their application of this method to the FCI. As I understand their math, each MCSR item is assumed to be independent and unique; this is not true for the FCI. As an example: Bao and Redish label FCI question 15 as an "LL", which means a "near random situation." This is false, because both distractor "a" and distractor "b" represent the same misconception. This single misconception, according to the FCI paper (paper 1), is labeled "AR1: greater mass implies greater force." Thus, what at first glance looks like a distribution: a = 23, b = 38, c = 32, d = 0, e = 6 [data from AVH table V paper 1] is in reality: AR1 = 61, 5S = 32, Ob = 0, ? = 6. 5S represents the correct Newtonian response; the other symbols represent various specific misconceptions. What were three seemingly different and near equally distributed answers (a, b, c) are in fact two different, non-equally distributed answers (AR1 & 5S). Thus, question 15 is really an "MM": "Two popular models one of which is the correct answer." Having dispatched the idea that FCI choices are independent, I will now offer evidence that they are not always unique. FCI question #11 is labeled by Bao and Redish as "MM" which, as has already been stated, means "Two popular models one of which is the correct answer." 
The raw data, using AVH from paper 1 because Bao & Redish don't provide raw data, is a = 2, b = 4, c = 4, d = 21, e = 68. Unfortunately d is not unique; it can be chosen for two separate reasons: AR1 and/or AR2. AR1 is greater mass implies greater force; AR2 is most active agent produces greatest force. Without delving into the definition of "active agent", suffice it to say that it does not mean mass. Thus we do not have two models, we have three. Two of the three are entangled in the distractor "d" and thus inseparable by this question. Having argued against the application of concentration analysis to the FCI, I did use this analysis on Cabrillo MB data. I did this in part because I was interested in the results and in part to make sure I understood what the authors were proposing. Paper 8's quantitative analysis and graphical displays imply a greater certainty than the reader should accept, but it does provide avenues for further exploration and allows for manipulation of large amounts of data. The fundamental fact is that each choice on an MCSR test is mathematically assumed to be independent and unique. To the extent that this assumption is false, the end results are suspect.
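To make the question 15 argument concrete, here is a short sketch that computes a concentration factor for the tallies quoted above, first by answer letter and then with distractors "a" and "b" merged into the single AR1 model. The formula is the one I understand paper 8 to use, C = [sqrt(m) / (sqrt(m) - 1)] [sqrt(sum of n_i squared) / N - 1 / sqrt(m)], for m choices and N students. Treating the merged tally as a four-choice question (m = 4) is my own assumption, as is the function name.

```python
from math import sqrt


def concentration_factor(counts):
    """Concentration factor C in [0, 1] for one multiple-choice question.

    counts[i] is the number (or percentage) of students picking choice i.
    C = 0 for a perfectly even spread, C = 1 when everyone picks one choice.
    """
    m = len(counts)
    total = sum(counts)
    if m < 2 or total == 0:
        raise ValueError("need at least two choices and one response")
    length = sqrt(sum(n * n for n in counts)) / total
    return sqrt(m) / (sqrt(m) - 1) * (length - 1 / sqrt(m))


# FCI question 15, AVH data as quoted above (percentages).
q15_by_letter = [23, 38, 32, 0, 6]   # a, b, c, d, e
q15_by_model = [61, 32, 0, 6]        # AR1 (a + b), 5S, Ob, ?

c_letter = concentration_factor(q15_by_letter)
c_model = concentration_factor(q15_by_model)

print(f"C by answer letter: {c_letter:.3f}")
print(f"C by mental model:  {c_model:.3f}")
```

Under these assumptions the concentration roughly doubles, from about 0.20 to about 0.40, once the two AR1 distractors are counted as one model. That is the sense in which the "near random" label reflects the lettering of the answers rather than the students.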
L. Bao and E. Redish, "Concentration analysis: A quantitative assessment of student states," PER Am. J. Phys. Suppl. 69(7) S45-S53 (2001)
Qualitative research based on interviews and analysis of open-ended problem solving has documented different clusters of semi-consistent reasoning that students use in responding to physics problems. This knowledge has been used to create attractive distractors for multiple choice examinations such as the FCI. In examining large populations, student choice among wrong distractors contains information as valuable as the grosser distinction between correct and incorrect that has been the focus of most research. The authors have developed an algorithm that extracts and displays how students produce incorrect answers. Their method analyzes the concentration/diversity of student responses to particular multiple-choice questions. In constructing a model of student knowledge, the authors appeal to neuroscience, cognitive science, and educational research. The agreed upon core research elements are: (1) memory is associative; (2) cognitive responses are productive; and (3) cognitive responses are context dependent (inclusive of the student's state of mind). This alone is not enough of a base, so the authors also focus on several structures proposed by researchers: (a) patterns of associations (neural nets), (b) primitives/facets, (c) schema, (d) mental models, and (e) physical models. The authors define these terms. They desire to determine the effectiveness of a particular multiple choice question in triggering the small number of research-identified common naive schema or mental models. If a multiple-choice question is designed with these naive mental models as distractors, then the distribution of student responses yields information on the student's state. The student who has a strong naive belief will pick multiple wrong answers that are based on that unifying mental model. Students who simply lack knowledge will choose distractors randomly. The authors create a concentration factor (C). 
This factor measures the distribution of responses to a question on a scale of 0 to 1. C = 0 is an even distribution with all choices (A, B, C, D, E) selected by the same number of students. C = 1 is the other extreme with all students selecting the same choice. Thus each test question gets a C value from 0 to 1 that has no relationship to correctness or incorrectness. This C value merely reflects the distribution of choices. Paper 8 explains the math formula used to create C and verifies C's range. The concentration factor (C) is used to study several different aspects of student data. One study labels a question with two letters. The first (L = low, M = middle, H = high) characterizes the students' score; the second is derived by binning the concentration factor. For example, an LH question implies an incorrect prevalent model, i.e. a low score and a high concentration. A table provides the implications of the two-letter labels. Not all permutations are possible, as score and concentration are not independent. Pattern shifts from pre- to post-instruction reflect the impact of instruction. Rather than gross binning (low, medium, high), the score (S) and the concentration factor (C) of a question can be displayed as a point (S,C) on an S-C plot. S-C plots are shown; they have boundaries due to the constraint between S and C. The score (S) is the number of students who chose the correct choice out of all possible students (N); on the plots, S is normalized. A new variable G is defined as the new concentration factor of incorrect responses only. It shows more detail and is determined by the removal of the absolute offset created by the score. G is called the concentration deviation. C and G highlight different aspects of the data. An example is provided by analyzing the FCI pre- and post-test results from fourteen introductory calculus-based physics classes at the University of Maryland.
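The behavior of C and G described above can be illustrated with a short sketch. It assumes the concentration factor takes the form C = [sqrt(m)/(sqrt(m) - 1)] x [sqrt(sum of n_i^2)/N - 1/sqrt(m)], where m is the number of choices, n_i is the number of students picking choice i, and N is the total number of responses; the formula is quoted from memory of paper 8 rather than from the text above, so it should be checked against the original. G is computed here simply by applying the same expression to the incorrect choices only, which is one reading of "the concentration factor of incorrect responses only." The function names are illustrative.

```python
import math

def concentration_factor(counts):
    """Concentration factor C for one multiple-choice question.

    counts: number of students selecting each choice, e.g. [nA, nB, nC, nD, nE].
    Returns 0 for an even spread and 1 when every student picks the same choice.
    ASSUMPTION: the formula below is the form recalled from paper 8.
    """
    m = len(counts)
    total = sum(counts)
    if m < 2 or total == 0:
        raise ValueError("need at least two choices and at least one response")
    # Length of the response vector, normalized by the number of responses.
    length = math.sqrt(sum(n * n for n in counts)) / total
    root_m = math.sqrt(m)
    return root_m / (root_m - 1.0) * (length - 1.0 / root_m)

def concentration_deviation(counts, correct_index):
    """Concentration deviation G: the same measure over wrong choices only.

    ASSUMPTION: G is taken to be concentration_factor applied to the
    incorrect responses, following the verbal description in the text.
    """
    wrong = [n for i, n in enumerate(counts) if i != correct_index]
    return concentration_factor(wrong)

# Hypothetical response counts for a five-choice question (choice A correct).
even_spread = [20, 20, 20, 20, 20]     # students guessing at random
one_choice = [0, 100, 0, 0, 0]         # everyone picks distractor B
one_naive_model = [10, 60, 10, 10, 10] # low score, one popular distractor

print(round(concentration_factor(even_spread), 3))
print(round(concentration_factor(one_choice), 3))
print(round(concentration_deviation(one_naive_model, 0), 3))
```

With these made-up counts, the even spread sits at the bottom of the scale and the single-choice case at the top, matching the range described above; the third case shows how G can flag a dominant distractor among the wrong answers.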
The reference provides the scores (S), the concentration factor (C) and the two letter label (HH...LL) for each question on the FCI, based on pretest data from 778 students. LH and LM questions are analyzed with all of them addressing either the naive model that motion requires an unbalanced force, or the naive model that the larger or more active agent will produce the larger force. Two LL questions are addressed. It is argued that no naive model accounts for the low score, low concentration. Both questions deal in detailed physical processes that require the integration of various pieces of physics knowledge [see my introductory paragraph]. S-C plots are offered for Traditional versus Tutorial instruction. The average shift is larger for the tutorial classes, increasing in both S and C, showing more students holding the single correct model. Traditional classes shift, but only into a two model region where a significant number of students still hold incorrect beliefs. S-C plots are used on the FCI subset of Low Scoring Questions (2, 5, 9, 13, 15, 18, 22, 24, 28) and on the Force-Motion Questions (5, 9, 18, 22, 28) showing the same result. S-G plots are also presented and analyzed. In comparing S-G plots of Traditionally instructed classes to those of Tutorially instructed classes, Questions 9 and 22 stand out. For Question 9, Traditional instruction changed student distractor choice from b to c. This implies that while students learn to recognize the "normal force," they continue to believe that a force is needed in the direction of motion. There is no comment on Question 22. Concentration factoring can facilitate test development, instruction, and assessment. It can help confirm the presence and prevalence of erroneous models detected through research. It allows the detection of questions that do not have a relevant distractor, and can lead to improving multiple-choice tests. 
Information on how the majority of students get a question wrong cannot be analyzed using test scores alone; concentration factoring provides important clues for improving instruction.
Excluding the MPEX (paper 32), that ends our review of assessment tools. PER uses these tools primarily to establish the ascendancy of Interactive Engagement (I.E.) methods/curricula over those labeled Traditional. Interactive Engagement can be roughly divided into how physics-content should be taught and what physics-content should be taught. How outweighs what in the literature and is our next focus. Three points are in order before we advance. First, how and what are not mutually exclusive; fragments of one will be intermixed in a paper dominated by the other. Second, the work of L.C. McDermott's University of Washington Physics Education Group will be presented later in this paper, even though their final product, the tutorials, is very focused on how to teach physics-content. Third, epistemological issues, which are yet to be focused on, play an extensive role in a few of these papers. PER is more of a web than a logic tree. Thus, it's not always possible to present the material in an ordered linear format.
CHAPTER 22 : Interactive Engagement Methods that Retain the Lecture
The mainstay of traditional physics teaching methodology is, and certainly has been, the lecture. Due in large part to the miserable <g>s of lecture classes, the lecture is and will continue to be viewed as ineffective and probably detrimental to the student's learning of physics. Having said that, most physics students are still subjected to the lecture and are so subjected for more reasons than mere bureaucratic inertia. Thus we shall lead off our I.E. methods section with some papers that seek to retain the lecture but improve its <g>s (effectiveness). The upcoming paper is on an I.E. method that is often referred to by other papers, in which it enjoys a good reputation.
22.1 : Paper 09, Interactive Lecture Demonstrations
D. Sokoloff and R. Thornton, "Using Interactive Lecture Demonstrations to Create an Active Learning Environment," The Physics Teacher Vol. 35, 340-347 (1997)
Efforts to improve physics education while maintaining existing structures have resulted in Tools for Scientific Thinking Microcomputer-Based Interactive Lecture Demonstrations (ILDs). ILDs are an extension of the authors' Microcomputer Based Learning (MBL) curricula for introductory physics, Laboratory Tools for Scientific Thinking, and Real Time Physics. An ILD is an eight-step procedure that engages students actively via individual predictions, small-group discussion with nearest neighbors, and, after the MBL-measured demonstration, the completion of a results sheet. For the instructor, picking the appropriate moment to move on to the next step is important, as is having a definite agenda for the last two steps, which are "wrap-up results" and "extension of this concept into different physical situations." The ILD's uniqueness is the real-time data provided by the MBL tools. The ILDs must be presented in a manner that builds student confidence in the measurement devices. Flashy exciting demonstrations are eschewed as too complex to be effective learning experiences. A sequence of ILDs, say on Human Motion, takes about 40 minutes and makes use of the motion detector, force probe, Universal Laboratory Interface and Tools for Scientific Thinking Software. Evaluation of student learning was done by an in-house test and the FMCE. A summary of pre- and post-instruction results is examined for a subset of problems: Force Sled, Force Graph, Cart on Ramp, and Coin Toss. Evaluation of a University of Oregon, 200-student, non-calculus, introductory physics lecture showed a huge improvement over the 10% overall gain due to traditional lectures. Replacing 80 minutes of traditional lecture with ILDs resulted in more than a 50% overall gain. Evaluation at Tufts University confirmed the Oregon results. The actual Force Sled and Coin Toss questions are in the paper, as is the ILD prediction sheet.
The paper also has some demonstration descriptions from Newton's First and Second Law ILDs.
I want to raise three points. First, these authors also authored the FMCE; they just reversed precedence. Second, use of computers in I.E. classes is a controversial topic and will be dealt with in some detail in papers 22 through 25. Third, very little of PER is strictly technical in nature, but given the reliance of ILD on equipment, I thought the following paper might be of interest. D. MacIsaac and A. Hämäläinen, in "Physics and Technical Characteristics of Ultrasonic Sonar Systems," published in The Physics Teacher, Vol. 40, give a very detailed analysis of the SX-70 ultrasonic ranging system components, patented and manufactured by Polaroid Corp. This system is used in most introductory physics lab ultrasonic motion detectors. Their analysis includes history, beam patterns, blind spots, buzzing sounds, resolution, precision, accuracy, common classroom difficulties, and pedagogy. Any user of motion detectors would benefit from reading this information, particularly before doing an instructional demonstration in front of a couple hundred students.
22.2 : Paper 10, Audience Paced Feedback
J. Poulis, C. Massen, E. Rubens and M. Gilbert, "Physics lecturing with audience paced feedback," Am. J. Phys. 66(5), 439-441 (1998)
The traditional lecture format is flawed, and the use of computers and multimedia to improve upon it is constrained. Paper 10 provides a technique to improve lectures; it involves Audience Paced Feedback (APF). APF is the provision for each student in the lecture theater to have an electronic handset which allows each/all to answer simple binary questions from the lecturer. There are four question formats possible: (1) exploration, (2) verification, (3) interrogation, and (4) organization. APF is fundamentally different from the raise-your-hands-to-answer style because: (a) all students are answering, (b) the lecturer can ask multiple-choice questions, (c) the students' replies are anonymous, (d) the lecture format shifts closer to that of a seminar, and (e) a permanent record is possible. The students rated APF lectures at 6.7 and non-APF lectures at 5.1 on a one-to-nine Likert scale (nine being very strong positive). The pass rate of APF lectures was approximately 87%, while for the non-APF lectures it was approximately 58%. APF lectures also had a smaller standard deviation. The sample size for both APF and non-APF was approximately 2600 students. APF allows the lecturer to ensure that the majority of the student body has understood the material before moving on. APF also gives the students an active role in the lecture. During many questions, students are given time to discuss the problem, which brings an element of student-to-student teaching into the lecture. Finally, there is a small Hawthorne effect due to the unusual and positive environment.
The above paper has many twin brothers, the most equal being Eric Mazur's book, Peer Instruction: A User's Manual, ISBN 0-13-565441-6. All of these focus on achieving student feedback to the lecturer; how feedback is achieved and what kind of feedback occurs vary. What the lecturer is supposed to do with the feedback also varies.
Nevertheless, there is a strong push in lecture-enhancement PER to construct a student-to-lecturer feedback loop primarily to reduce the passivity of the student. There is also a strong push to incorporate student-to-student discussion/instruction in the lecture format, in part for the same reason. The upcoming paper is our last to advocate modifying the lecture itself. It will be followed by several that enhance the lecture through activities in the associated lab or discussion groups.
22.3 : Paper 11, Peer Instruction
C. Crouch and E. Mazur, "Peer Instruction: Ten years of experience and results," Am. J. Phys. 69(9), 970-977 (2001)
Actively engaged students learn more than passive receivers of knowledge. Cooperative activities are an excellent way to engage students. Paper 11 presents the results of ten years of Peer Instruction at Harvard University. The courses are a mix of algebra-based and calculus-based introductory physics courses for non-majors. Peer Instruction has been adapted to a wide range of contexts and instructor styles. Over the ten-year period, Peer Instruction has been refined. The three major changes were: (1) replacement of reading quizzes with warm-up exercises of the Just-in-Time-Teaching (JiTT) strategy [ISBN 0-13-085034-9], (2) use of a research-based mechanics text, and (3) use in discussion sections of Tutorials in Introductory Physics (McDermott, et al.) and of group problem-solving activities (Heller, et al.). Peer Instruction is a structured questioning process that involves every student in the class. It divides a class into a series of short presentations, each followed by a related conceptual question (ConcepTest). Students are given one or two minutes to formulate individual answers and report their answers to the instructor. If the percentage of correct students is between 30 and 70, which it normally is, then the students discuss their answers and the underlying reasoning with each other. After these two-to-four minute discussions, the instructor polls students for their answers, explains the answer, and moves to the next topic. If the percentage is less than 30, additional instructor-centered teaching is needed; if greater than 70, moving straight to the next subject is the most effective use of time. ConcepTest questions are part of midterms and finals. Time is an issue. In the algebra-based class, approximately 15% of the old traditional curriculum is not covered. The calculus-based class does cover all of the old curriculum. To free up class time, students are required to read prior to class.
This was enforced by a reading quiz at the beginning of class; it is now enforced by a three-question web-based assignment due before class. This pre-class information is used to focus the lecture on student-identified needs. Student knowledge is assessed by FCI, MB, traditional exams, and ConcepTest performance. Conceptual Mastery, as measured by <g>, has risen from 50% to 78% over a six-year span for the calculus-based classes and has averaged around 62% for the algebra-based classes over the last couple of years. This is in comparison to Hake's reported average for I.E. classes of 48%. Problem solving has been de-emphasized in lecture, with students learning these skills in discussion section and via homework assignments. Still, quantitative problem-solving skills, as measured by the MB, have risen from 72% to 79% for the calculus-based class and are around 66% for the algebra-based class. In a comparison of Peer Instruction versus Traditional back in 1990 and 1991, <g> jumped from 23% to 50% in one year, and MB scores increased from 66% to 72% in the same year. Also during those two years, common exam problems were given and a statistically significant increase in student quantitative problem-solving skills was found. [Note: paper four states the MB's "main intent" is to assess qualitative understanding even though it looks like a conventional quantitative test.] Three results stand out from analyzing student responses to all of the ConcepTests over an entire semester. First, 40% of answers were correct both before and after discussion, 32% shifted from incorrect to correct, 22% remained incorrect both before and after discussion, and 6% changed from correct to incorrect. Thus after discussion 72% of the answers were correct and 28% incorrect; prior to discussion the split was 46% correct to 54% incorrect. Second, evaluation of semester testing shows that students retain real understanding of the concepts.
Third, the strongest students are challenged by ConcepTests, with no student getting more than 80% correct prior to discussion. The last part of the paper is Implementation, with subsections: Reading Incentives, Cooperative Activities in Discussion Sections, Quantitative Problem Solving, Student Motivation, ConcepTest Selection, Time Management, Teaching Assistant Training, and Resources. Most sections are two or three paragraphs in length. Some highlights not already mentioned include: (1) approximately one-third to one-half of class time is spent on ConcepTests, (2) teaching assistants are required to attend lecture, (3) there is a web site, http://galileo.harvard.edu, which includes over 800 ConcepTests, and (4) student evaluations and attitudes are not measures of student learning.
Obviously, after ten years there is a lot here. Two points I wish to highlight at this time are the sensitivity PER has in regard to quantitative problem solving and to content-coverage. More than a few I.E. methods have been poorly received by traditionalists, in part because some I.E. students can talk-the-talk but not walk-the-walk when it comes down to getting a numerical answer to a real-life question, and in part because most I.E. students have "not been exposed" to certain subjects such as buoyancy due to the time-intensive nature of I.E. Interactive Engagement exchanges breadth for depth, arguing that mere "exposure" is worse than useless as it wastes valuable class time. Traditionalists argue that to not teach buoyancy and its ilk is to reduce introductory physics to mechanics. Leaving aside these deep and unresolved issues, the following papers propose that the lecture can be redeemed by activities outside of lecture, or at least propped up by them.
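The Peer Instruction polling rule and the before/after discussion percentages quoted for paper 11 can be checked with a small sketch. The 30% and 70% thresholds and the four answer-transition shares come from the summary above; the function names and the labels "reteach," "peer discussion," and "move on" are illustrative, not terms from the paper.

```python
def next_step(percent_correct, low=30.0, high=70.0):
    """Decide what follows a ConcepTest poll, per the thresholds in the summary.

    The returned labels are hypothetical shorthand for the three outcomes.
    """
    if percent_correct < low:
        return "reteach"          # additional instructor-centered teaching
    if percent_correct > high:
        return "move on"          # go straight to the next subject
    return "peer discussion"      # students discuss answers and reasoning

# Share of all ConcepTest answers (percent) by (before, after) discussion,
# as quoted from paper 11 in the text.
transitions = {
    ("correct", "correct"): 40,
    ("incorrect", "correct"): 32,
    ("incorrect", "incorrect"): 22,
    ("correct", "incorrect"): 6,
}

def percent_correct(shares, when):
    """Total percent of correct answers 'before' or 'after' discussion."""
    position = 0 if when == "before" else 1
    return sum(share for key, share in shares.items()
               if key[position] == "correct")

before = percent_correct(transitions, "before")
after = percent_correct(transitions, "after")
print(before, after, next_step(before))
```

The totals reproduce the 46% correct before discussion and 72% correct after, and the pre-discussion figure falls inside the 30-to-70 band in which peer discussion is called for.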
22.4 : Paper 12, Socratic Dialogue Inducing
Hake, who six years after paper 12 will father <g> for an apprehensive PER community, brings us an I.E. use for labs. Normally, lecture classes have attached to them labs and discussion groups. This trinity is usually what's meant by "Traditional lecture-based classes." Rather than mess with the large student group, the lecture, Hake focuses on a small student group, the lab. Much of I.E. is small teacher-student ratio oriented.
R. Hake, "Socratic Pedagogy in the Introductory Physics Laboratory," The Physics Teacher, Vol. 30, 546-552 (1992)
Socratic Dialogue Inducing (SDI) labs are simple Newtonian experiments designed to produce conflict between a student's common sense understanding and Newton's Laws. This conflict induces collaborative discussion among lab partners and/or a Socratic dialogue with an instructor. SDI is an active-engagement method that is much more successful in transforming a student's thinking from Aristotelian to Newtonian than the usual bombardment-of-passive-students method. SDI is useful in large enrollment settings (~100 students) and was inspired by the empirical work of Arnold Arons. This reference describes SDI labs and procedures, gives examples, and presents some conclusions. SDI labs emphasize hands-on experience in five manuals: #1 Newton's First and Third Laws, #2 Newton's Second Law, #3 Circular Motion and Frictional Forces, #4 Rotational Dynamics, and #5 Angular Momentum. These labs promote student mental construction of concepts through: (1) conceptual conflict, (2) extensive verbal, written, pictorial, diagrammatical, graphical and mathematical analysis of concrete Newtonian experiments, (3) repeated exposure to experiments at increasing levels of sophistication, (4) peer discussion, and (5) Socratic dialogue with instructors. SDI labs are: (a) adaptable to a wide range of student populations, (b) popular with students, (c) inexpensive in equipment costs, (d) easily modified, and (e) combinable with either other active-engagement methods or standard methods. Further, they allow instructors to discover learning problems and can provide valuable research data, if the dialogues and conversations are recorded and analyzed. This reference goes into detail about SDI Lab Procedures. Each lab session has 24 students (4 students at each of 6 tables) with 2 Socratic dialogists, one of whom has had previous experience. There are five primary ground rules, each given several paragraphs of development.
A short sketch of the rules follows ['you' refers to the students]: First, you must understand the material you work on rather than "cover" all the prescribed sections. Second, draw "snapshot sketches" with color-coded vectors. Third, justify your collaborative responses with a thoughtful explanation and/or sketches. Fourth, if you are confused, after serious effort, call in a Socratic dialogist. And fifth, handed-in lab manuals will be examined and deficient work must be corrected at the next lab period. Hake provides five lab questions and a Representative Socratic Dialogue out of SDI Lab Manual #1. He highlights three of his newer lab manuals, including the Water Bucket Swing, the Old Spinning-Wheel-in-the-Suitcase Trick, and the Cat Twist. SDI labs are effective in guiding students to construct a coherent conceptual understanding of Newtonian mechanics. This is due to: (1) the interactive engagement of students, (2) the Socratic method of instruction, (3) kinesthetic sensations that intensify cognitive conflict, (4) cooperative grouping, and (5) repeated exposure to the coherent Newtonian explanation in many different contexts. Hake concludes by noting that more research and development is needed, including moving some of the instructional load to computers.
Four points in regard to the above. First, the Socratic method, in brief, is a questioning process by which the dialogist "leads" and "prods" his student to insight and the correct answer without actually telling him either. To put it mildly, such a process can backfire, particularly if the student is at the cognitive level of desiring "the truth" from an authoritative source. Second, cognitive conflict is critical to several I.E. methods, most notably McDermott's Tutorials. Very basically, the idea is that students enter class with non-Newtonian beliefs that work for that student; many of these beliefs were created by the student from his life experiences.
In order to replace one of these preexisting beliefs with a Newtonian one, the preexisting belief must be proven to be in obvious conflict with simple observable fact. Only then is the student receptive to accepting and using a "better" belief, the Newtonian. This too has its foundation in cognitive studies, far outside of physics-content and thus dealt with in Chapter VII. Also from cognitive studies is our third point, constructivism. The fundamental idea of constructivism is that students construct their own concepts, and that, at best, the teacher can semi-prepare the work site and provide a few basic tools. Thus, for constructivists, teachers don't teach so much as facilitate the students' learning process.
The upcoming paper also advocates interactive engagement of students during the lab component of the lecture, lab, discussion-group triad. It is interested in the non-major and raises several issues such as the difference in instructional impact on men and women.
22.5 : Paper 13, Inquiry Experiences
J. Marshall and J. Dorward, "Inquiry experiences as a lecture supplement for preservice elementary teachers and general education students," PER Am. J. Phys. Suppl. 68(7), S27-S36 (2000)
The introduction of paper 13 is a literature review advocating interactive engagement classes. It notes that significant gains on the FCI are possible even with the comparatively small investment of using I.E. to supplement a Traditional class. Paper 13 argues that elementary school teachers should be taught in the way that they will teach. Further, it argues that elementary school students learn best via hands-on interactive methods. The literature review of twenty-seven references includes the important Resource Letter by L. McDermott and E. Redish, "Resource Letter PER-1: Physics education research," Am. J. Phys. 67(9), 755-767 (1999). Convenience samples are widely used in PER, with true random sampling rare. In paper 13's Preliminary Study, students were divided based on whether they had or had not also signed up for the separate lab course. Over the two-quarter Preliminary Study, the match of lab and non-lab to inquiry and non-inquiry was flipped. To further mitigate the effects of convenience sampling, students were compared by GPA, gender, and major. In the Comparison Study, the entire third-quarter class did inquiry activities and was compared to algebra-based and calculus-based classes. This comparison was done with a McDermott-published, light-bulb-brightness test question. A quantitative comparison question was not used; in a nod to Thaker et al., this was viewed as a shortcoming in the research. The inquiry exercises were adapted from Physics-by-Inquiry, Amusement Park Physics, and suggestions in Arons's book, A Guide to Introductory Physics Teaching. The exercises did not involve explicit pre- or post-testing, although the material covered was included in course exams. Students worked in self-selected cooperative groups of two to six people, and due to time constraints, several Physics-by-Inquiry activities were shortened. While the activities in this paper are not appropriate for young children, material developed by Elementary Science Study is appropriate.
There are constraints and limitations to the generality of results, most particularly from nonrandom (convenience) sampling and the small size of subgroups. Assessment was based on final exam grades, course grades, and an inquiry subset of midterm exam questions. Standardized tests such as the FCI were not used, although a few specific questions from published sources were used. One-hour focus group interviews with volunteer students were also conducted. The results of the Preliminary Study are that women benefit from inquiry-based laboratory exercises at a statistically significant level; men do not. Gender plays a more important role in determining inquiry benefit than does choice between elementary education and general education. Multivariate analysis of variance (MANOVA) and T-tests were used to arrive at this conclusion. The results of the Comparison Study are that 26% of inquiry and 9% of non-inquiry students ranked the light bulbs, in McDermott's question, correctly and gave the correct explanation. Most of the student comments match those found in McDermott's published paper. The authors do note that some real-life problems, such as irregularities in bulbs and inadequately charged batteries, can lead to misconceptions. They now have TAs check all bulbs and batteries prior to lab. The particular misconceptions noted were that bulbs used up some of the current, and that batteries were constant current sources. Interviews provided some insights. Inquiry sessions were most beneficial when they followed a lecture. The sessions helped narrow the preexisting distinction between teaching science and doing science. For a few students, the perceived lack of direction, in comparison to strongly prescriptive labs, led to frustration and withdrawal. Most students found inquiry "dynamic", "exciting", and "alive". No cascading effect was found; the only differences between inquiry and non-inquiry groups were in those questions dealt with directly in the inquiry exercises.
Thus, as many topics as possible need to be addressed via the inquiry method. Non-lab inquiry was performed during six one-hour lecture periods. Lab inquiry was performed during six two-hour lab sessions. Non-inquiry students were assigned extra homework, or if assigned to a non-inquiry lab, did normal prescriptive labs.
From one point of view, there is a lot less to this reference than meets the eye. The authors are doing MANOVA and T-tests on one class of students at one institution in their Preliminary Study, and comparing two classes on one question in their Comparison Study. From another point of view, paper 13 advances some important points. It's up front about convenience sampling, something to which most PER is oblivious. It uses interviews, although they were student-led. It acknowledges the reality of cooperative grouping in many circumstances, and stands in stark contrast to the rotating, random, four-person groups with internally rotating assignments such as recorder, critic, etc., that theory prescribes. The primary point of interest is the assertion that inquiry methods differentiate by gender, with women benefiting and men not.
22.6 : Paper 14, Supervised Practice
I'm going to label Supervised Practice as SP. The paper after this one is Studio Physics; let's label it StP. Which beats SP1 and SP2, or so I hope. On a more serious note, I can never trust my memory with "Supervised Practice"; I think of this paper as "White Board" for reasons you'll see soon enough.
M. Johnson, "Facilitating high quality student practice in introductory physics," PER Am. J. Phys. Suppl. 69(7), S2-S11 (2001)
Paper 14 investigates issues involved in facilitating high quality practice of the knowledge and skills that students are learning in introductory physics. A classroom peer-collaborative structure, Supervised Practice (SP), is described and critiqued. Experts and Novices approach problems differently. Experts often use complete, accurate diagrams and start from fundamental principles. Novices use the skills they have previously developed: algebra and calculator use. Once a Novice has an answer, he's done. For an Expert, the answer must yet be checked for reasonableness in both magnitude and units. A goal of education is to develop in the Novice the problem-solving (approaching) skills of the Expert. Human tutors and, in certain settings, computer use have proved themselves as viable methods in achieving this development. There are, however, cost and technical issues that encourage other approaches to this goal. Some effective, affordable, large-class structures exist that are able to provide timely guidance and feedback to groups of students while they practice expert skills for problem solving and concept interpretation. Small group problem solving is another approach that can be used to promote high quality practice among students. Timely feedback, including correctness of solutions, is critical to student learning. In supervised small group practice, a given student can swiftly get feedback from the teacher and/or his several group members. Additionally, a student also gains educationally by giving feedback. Finally, a teacher can communicate to several people who share a common problem at a common time. The supervising teacher can also focus on process, praising a well-labeled diagram or encouraging students to demonstrably check their answers for magnitude and units. There are challenges in implementing Supervised Practice.
These include: (1) peer-peer communication difficulties, (2) student-instructor communication difficulties, and (3) shifts in attention that hinder group coherence. Physics requires a special vocabulary and special diagramming techniques which a student must learn. Students face difficulties communicating in the language of diagrams and algebraic symbols. These difficulties erode the effectiveness of communicative attempts at providing useful guidance and feedback. The effectiveness of instructor feedback depends on both the level of understanding the instructor has about what the students have done, and the level of shared understanding the group of students have about what they have done. At the extreme, if an instructor does not understand what the students have done, he is reduced to working the problem for the group. If students don't share an understanding, the instructor must address each individual separately. SP directly fights against the common problem of students using tools they already know to quickly get homework answers; students thereby fail to develop the expert problem-solving skills so beneficial in real world contexts. People have different attention spans; they also have different abilities to discriminate between important and peripheral issues. Thus, it is probable that individual students will focus on different aspects of the solution and will not follow the details of the group's solution as it progresses for the 30 minutes needed to do the average homework problem. General strategies that address these three difficulties are creating a classroom environment that (a) facilitates effective communication between peers about the details of the group problem-solving process, (b) allows the instructor to clearly see what the students have done in the process of generating the group's solution, and (c) provides a semi-permanent record that allows students to see what has happened when they tune back in. 
The tool that allows easy achievement of this general strategy is the white board [about 18" x 24" is a good size]. As the single solution is written out, it is a common, easily seen representation of the group's work; this facilitates both inter-group communication and instructor feedback. Its semi-permanence accommodates variations in student attention spans. Supervised Practice integrates the interventions proposed above and was implemented at Carnegie Mellon University from 1993 to 1996. SP met twice a week for 50 minutes each, in groups of twenty-five students. The structure of the introductory physics course also included lectures (60 to 240 students) on Monday and Wednesday, and exams or quizzes on Friday. There was access to a drop-in center several hours each day. Reif's research-based text, Understanding Basic Mechanics, was used. SP practical points follow. (1) SP meetings are staffed by two instructors drawn from graduate students, upper-class undergraduate majors, and volunteer post-docs. (2) The instructional staff meets weekly. (3) There is a "group-record" on which attendance, preparation, and progress are recorded. (4) Students do not hand in completed problem solutions. (5) Instructors are encouraged, but not required, to assign students randomly to collaborative groups. (6) The enforcement of roles within groups was the subject of much debate and ultimately left to the instructor's discretion. (7) Only one pen is given out per group, with explicit instructions to pass the pen after each problem. (8) Students are required to attempt a subset of homework problems prior to each SP meeting. (9) The TA checks that each student has attempted the subset. (10) SP meetings start with a warm-up problem. The students work on it for the first five minutes, then the instructor presents a complete solution and explanation on the blackboard. 
(11) Instructors encourage students to talk about the solution process and to write all the details of the solution on the white board. (12) After completing each problem, each group is required to discuss its solution with an instructor, at a "check point". This check point facilitates student-instructor communication. (13) No solutions are collected or graded. The TA does note problem completion on each group record during the check point process. Paper 14 provides a two-page illustrated example of SP. The example focuses on student interactions during the diagramming of an Atwood's machine problem. The example came from a real classroom interaction which was recorded by an observer in 1995. It is interesting to read, for flavor and authenticity. The main conclusion drawn is that high-quality practice can be achieved in collaborative groups when students can communicate effectively with each other. The white board facilitates communication between students and between the student group and the instructor. It serves as a powerful memory device to facilitate the asking of questions between students, and it facilitates spontaneous review and summary between students. The white board is a powerful tool in the implementation of collaborative problem-solving practice. Other strengths of SP are: (1) problems are from a research-based course text; (2) closely related material is on the quizzes and exams; (3) there is close coordination between SP and the lecture, homework, and assessments; (4) students like it; (5) it's transportable to other institutions (University of Oregon); and (6) although several other course features contribute, FCI scores rise from 64% pre-instruction to 84% post-instruction, for a <g> of 0.55. 
Challenges are: (a) participants can be unfamiliar with, or discomforted by, their new roles in peer collaboration; (b) there is a slight cost increase, mostly to pay undergraduate TA's for their 12 hour-a-week assignments; and (c) instructor training is very important as graduate students are not necessarily experts in course material, pedagogy, and the management of a collaborative learning environment. This reference also highlights the Instructor Development at the University of Oregon. Part of this instructor development was a required one hour weekly staff meeting. Instructors did the problems beforehand and brought their solutions. Instructor physics-content knowledge, student difficulties with physics concepts, student difficulties with problem-solving processes, and student difficulties with peer interactions were also addressed at the staff meetings. The above paper was fairly normal, if more verbose than usual, on Instructor Training. As another example of TA training, paper 11 notes a requirement for TAs to attend lecture; I was required to do this as an Astronomy TA and actually enjoyed it. Not only does attending lectures help train the TA in content, it also allows the TA to know what his students have been exposed to in lecture. This knowledge helps in both content instruction and social interaction between the TA and the students. It also closes several possible communication gaps between the TA and the professor. Drop-in centers under various names are an often used but little commented on part of published I.E. methodologies. The above paper's acknowledgment of non-content goals is also common and dealt with at length in the upcoming epistemology section (papers 30 through 35). The most unusual aspect of this paper was its focus on problem solving skills.
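The <g> quoted for Supervised Practice can be checked against its pre and post FCI averages. The sketch below assumes <g> is the usual class-average normalized gain, (post - pre) / (100 - pre); the function name is my own, and paper 14 may have computed its figure differently (for instance by averaging individual student gains).

```python
def normalized_gain(pre_pct, post_pct):
    """Normalized gain from pre and post averages given in percent.

    Assumed definition: the fraction of the possible improvement
    (100 - pre) that was actually achieved (post - pre).
    """
    if pre_pct >= 100:
        raise ValueError("a pre-score of 100% leaves no room for gain")
    return (post_pct - pre_pct) / (100.0 - pre_pct)


# SP figures quoted above: FCI pre = 64%, post = 84%.
sp_gain = normalized_gain(64, 84)
print(f"{sp_gain:.3f}")  # 0.556
```

The result, 20/36, is about 0.56, which agrees with the reported 0.55 to within rounding.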
CHAPTER 23 : Interactive Engagement Methods that Replace the Lecture
23.1 : Paper 15, Studio Physics
The above six papers all retained the lecture to a large body of students as part of the course's overall methodology. The upcoming papers dispense with the large body of students and, largely, with the lecture. If teachers still talk a lot, students are talking more. Unfortunately, this alone is not enough to enhance our students' knowledge of physics, as our next paper painfully acknowledges. The upcoming paper is unique in my reading, and valuable in establishing the credibility of PER. Too often published PER papers base a mountain of conclusions on a molehill of data, as with paper 13, or arrive at refutable conclusions, as with paper 8. The upcoming paper is sadly honest, and, as such, a great boon. Examined failure is at least as great a teacher as success.
K. Cummings, J. Marx, R. Thornton, and D. Kuhl. "Evaluating innovation in studio physics," PER Am. J. Phys. Suppl. 67(7), S38-S44 (1999)
Introductory Studio Physics (StP) is Rensselaer's equivalent to the standard calculus-based two-semester physics course for engineers and scientists. The lecture and laboratories are integrated. Class size is 30-45 students. There is extensive use of computers. There is collaborative group work and a high level of faculty-student interaction. There has been no significant reduction in course content. Class meets twice a week for 110 minutes each. Studio physics uses traditional activities adapted to the studio environment and incorporates the use of computers. These activities do not directly address student misconceptions and employ neither cognitive conflict nor bridging techniques. There is currently no explicit training of TA's, and as a result there is great variation in their effectiveness. Unfortunately, the <g> is 0.22 for normal studio physics classes. This is at the same level as traditional classes despite the studio classroom appearing to be interactive. There is a standard approach to Studio Physics, but a certain flexibility allows for some diversity. In an effort to improve both <g> and FMCE scores, ILD and Cooperative Group Problem Solving (CGPS) were incorporated into five experimental StP classes. These were contrasted with seven standard StP classes. Paper 15 has a page describing ILD and CGPS. The authors also took great care in specifying how their implementation of ILD and CGPS varied from the standard ILD/CGPS formats. In the case of ILD implementation, less time was spent on "analogies to similar situations" and "having the students discuss results" than the ILD Teacher Notes suggest. Furthermore, students were allowed too much time to make predictions, resulting in fast students losing interest and focus. The small studio classroom with non-sloping floors made it difficult for all students to see the demonstrations. Finally, the students are so collaboratively oriented that it was difficult to get individual predictions. 
This resulted in some students never making a personal intellectual commitment. Personal commitment is a key epistemological point necessary in creating an environment in which the mind can change its belief patterns. In implementing CGPS there were deviations from the five-step-process strategy. Some of these deviations were the result of StP structures common to all classes, experimental and standard; for example, no context-rich problems were included on common exams. Furthermore, only half the class periods were given over to CGPS, with the other half focusing on standard problems which would be tested. Three additional issues addressed were: (1) due to time constraints, the instructor did not model CGPS techniques as often as desired; (2) students were so strongly resistant to cooperative group roles (critic, recorder, etc.) that this aspect soon died out; and (3) because the CGPS five-step problem solving strategy is typically irrelevant to textbook-style homework, students resented having to use it; this was particularly true when they could solve the problems quickly and correctly without it. The students enrolled without knowing whether the class would be standard or experimental. The tests were given back-to-back with twenty-five minutes allotted for the FCI and 35 minutes for the FMCE. Post-instruction testing was ten weeks later. The standard 110 minute classes began with 30 minutes of answering questions and working board problems. This was followed by roughly 15 minutes of lecture. In-class assignments filled out the remaining class time. Some instructors did allow students to leave early if the assignment was completed. The experimental classes were very similar to the standard classes with experimental activities replacing, not supplementing, part of the curricula. Four sections got all four ILD sequences; two sections got the entire CGPS package, with a one section overlap. 
The standard class results are <gFCI> = 0.18 ± 0.12 and <gFMCE> = 0.21 ± 0.05. The experimental ILD class results are <gFCI> = 0.35 ± 0.06 and <gFMCE> = 0.45 ± 0.03. The experimental CGPS class results are <gFCI> = 0.36 and <gFMCE> = 0.36. Information is provided per section, per person, and per group. Per-section information is provided in bar graph and table format. There is a minor labeling error in paper 15, with section 9 listed as either ILD or ILD/CGPS. ILD alone or CGPS alone works as well as the two combined, with ILD taking far less time. The authors do note that CGPS students "performed better" on the problem-solving section of the last course exam. There are scatter plots of individual FMCE pre-scores versus post-scores. Fifteen percent of the standard and three percent of the experimental students have a post-score up to 10% worse than their pre-score. However, there are impressive gains, with thirty-four percent of the standard and sixty-four percent of the experimental students having a post-score better than their pre-score by 20% or more. From the scatter plots, it is evident that the weaker students benefited from the experimental curricula. To determine the effect on the better students, the students were divided into thirds based on their pre-instruction test scores, with <g>s figured for all groups. The top third of the experimental group had the highest <g>s. The top third of the standard group had the lowest <gFCI>, if not quite the lowest <gFMCE>. The authors conclude that it is necessary to mentally engage students, and that small classes, cooperative groups, and computer availability are good but insufficient. They emphasize that it is equally important to use research-based questions and activities. Upcoming papers (30-35) also discuss cognitive conflict and bridging techniques. Paper 1 lists the average FCI taking time at Harvard as 23 minutes. 
I believe twenty-five minutes is not enough time to give the FCI, as little more than half of the students will have time to complete it. The listing of ILD implementation mistakes provides practical counterpoint to paper 9. I am curious as to whether any class switching by students after enrollment occurred and if so in what direction. Computers and convenience grouping rear their ugly heads but will be addressed elsewhere.
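The division of students into thirds by pre-instruction score, described above for paper 15, can be sketched in a few lines. This is only an illustration under my own assumptions: the scores are invented, the function names are hypothetical, and each group's <g> is taken from the group's average pre and post scores, which may not be how the authors did it.

```python
from statistics import mean


def gain(pre_pct, post_pct):
    """Normalized gain from pre and post scores given in percent."""
    return (post_pct - pre_pct) / (100.0 - pre_pct)


def gain_by_thirds(scores):
    """Return <g> for the bottom, middle, and top third of a class.

    scores: list of (pre_pct, post_pct) pairs, one per student.
    Students are ranked by pre-score; at least three are needed.
    """
    ranked = sorted(scores, key=lambda s: s[0])
    n = len(ranked)
    cuts = [0, n // 3, 2 * n // 3, n]
    result = []
    for lo, hi in zip(cuts, cuts[1:]):
        group = ranked[lo:hi]
        pre_avg = mean(pre for pre, _ in group)
        post_avg = mean(post for _, post in group)
        result.append(gain(pre_avg, post_avg))
    return result


# Invented scores, NOT data from paper 15.
made_up = [(20, 40), (30, 50), (40, 60), (50, 70), (60, 80), (70, 90)]
bottom, middle, top = gain_by_thirds(made_up)
print(f"{bottom:.2f} {middle:.2f} {top:.2f}")  # 0.27 0.36 0.57
```

In this invented class every student improves by the same 20 points, yet the top third shows the largest <g> because it had the least room left to gain; that is worth keeping in mind when comparing <g>s across thirds.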
23.2 : Paper 16, Integrated Math, Physics, Engineering, and Chemistry
R. Beichner, L. Bernold, E. Burniston, P. Dail, R. Felder, J. Gastineau, M. Gjertsen, and J. Risley, "Case study of the physics component of an integrated curriculum," PER Am. J. Phys. Suppl. 67(7), S16-S24 (1999)
The Integrated Math, Physics, Engineering and Chemistry (IMPEC) curriculum is a four-year experimental program whose goal is to minimize attrition and improve both student understanding and student attitude. Many different research-based approaches were used to modify the first semester physics course. These included activity-based pedagogies, collaborative learning, integration of curricula, content-rich problems, and the use of technology. In addition, close attention was paid to student-student and student-instructor interactions, with the goal of arranging classroom layout and usage to facilitate a student-centered learning environment. The instructional environment is described in two pages of detail with a couple of examples. All the classes met in the same room, which was open 24 hours a day. The students were assigned to three-person teams. Team roles (recorder, checker, coordinator) and protocols were explicitly taught and were reinforced via grading schemes. Women and minorities were paired within teams and a variety of seating arrangements were tested. Lecturing was minimized. Class time was spent doing hands-on activities adapted from existing curricula like Workshop Physics, Physics by Inquiry, ConcepTests, ALPS worksheets, and the simulation engine Interactive Physics. In spite of some distractions, continuous accessibility to computers with MBL interfaces and software added enormously to the classroom milieu. In addition to the computers, making students responsible for reading their textbooks outside of class freed instructors to move about the classroom and enter into Socratic dialogues. Quizzes on the text and end-of-chapter homework ensured that students took the reading responsibility seriously. Labs were conducted as short exercises interwoven with the discussions. Labs ranged from ten minutes to several hours in length and were often initiated by a student question. 
These planned activities often appeared spontaneous to the students, and the instructor found that an excellent student motivator was for students to never know what to expect in class. The strong group ties cultivated through IMPEC led students to quietly "check with their neighbor" before turning to the instructor for information. In the process of constructing their own understanding, students increasingly challenged each other and their instructors. For evaluation and assessment, IMPEC students were compared to a control group of students who had volunteered for IMPEC but had missed out in the semi-random draw. IMPEC membership was constrained to match the gender and minority status of the entire engineering freshman student body. There was both qualitative and quantitative assessment. The qualitative assessment included: (1) analysis of Listserv e-mail via Strauss and Corbin's Grounded Theory, (2) use of student work and direct questionnaires, and (3) more than two hundred hours of field notes and videotape of teacher-student and student-student activities. The quantitative assessment included the FCI, the Test of Understanding Graphs in Kinematics (TUGK), and calculation-oriented traditional class exams. Both experimental and control group students took a common final exam. Qualitative data highlights the critical importance of socialization in the classroom. The major student use of the Listserv was for socialization messages (38%). Encouragement and socialization made up 24% of faculty e-mails. Field notes and videotape make it quite apparent that the same students, in the same room, working in the same groups, responded differently to different teachers. Student-faculty interactions depend greatly on the personalities involved; although difficult, it is possible to improve these interactions. In an example of instructor-instructor learning, the importance of providing ample wait time for your questions is highlighted. 
One instructor answered his own questions too quickly, resulting in passive class behavior by an otherwise aggressive class. In experimenting with classroom layout, large circular tables without computer monitors were found to most facilitate group dynamics. No real determination of students per computer was made except to note three on two did not work. Students showed extremely high satisfaction on course evaluations, with all but one student rating IMPEC 5 out of 5 on a Likert scale. Quantitative results focus on passing, Likert confidence levels, final exam scores, TUGK scores, and FCI results. Passing was a grade of C or better in five courses: calculus I, calculus II, chemistry, physics, and engineering. IMPEC versus Traditional overall passing rates were 73% versus 51%. For female students, passing rates were 63% versus 44% and for minorities, 100% versus 20%. Likert confidence levels were up in all classes for IMPEC students and down for Traditional students. While positive attitudes are valued, engineers solve problems. On shared exam problems, IMPEC students averaged 80% with Traditional students averaging 68%. On the TUGK, IMPEC students scored (89 ± 2)% versus (48 ± 2)% for peers at other institutions. <g> for IMPEC in 1996 was 0.42 ± 0.06, and for IMPEC in 1997 it was 0.55 ± 0.05. It's noted that the <g> results are instructor independent. The same instructor teaching a traditional class in the same year had a class <g> = 0.21 ± 0.04. In a follow-up, no significant difference in standard exam performance was found between previous IMPEC students and their traditionally taught peers during a subsequent traditional E&M course. The authors believe their most important finding is the central role socialization played in the success of the IMPEC students. No social barriers based on race or gender were found after careful scrutiny. Technology has a central role in IMPEC. 
The assignment of computer-related tasks brought everyone back into focus; in this variant of peer instruction, students became involved because they had to interact with the technology. IMPEC class size was roughly thirty-five students. The authors are in the process of scaling IMPEC up to class sizes of 100 students. In two last notes, the authors assert that the focus must be kept on the phenomenon being studied rather than on an authority talking about the phenomenon, and that students respond positively when many instructors use many different techniques in a fairly short time period. This is still another paper that emphasizes reading outside of class. It's common for I.E. classes to offer depth in the classroom and to make an attempt at breadth outside of the classroom. "Socratic dialogue" brings up echoes of paper 12. The use of the FCI as a quantitative results measure is highly unusual, particularly given its focus on natural language and concepts. The MB is far more commonly labeled quantitative, even though paper 4 explicitly states that "the main intent [of the MB] ... is to assess qualitative understanding although it looks like a conventional quantitative test." Still, the excellent retention percentages for women and minorities and its good <g>s make paper 16 a valuable contribution to PER. Its highlighting of socialization as critical to student success is all but unique. Unfortunately, this reference's finding of no significant differences between ex-IMPEC students and their traditionally taught peers in subsequent courses is all too common. While some I.E. benefits could easily be course-specific, one hopes some benefits are generalizable and usable by students in future situations without a continual support structure provided by an instructor. Socialization would have seemed to fit the bill. It's sobering that even socialization fails to be a self-sustaining tool for enhancing future learning.
23.3 : Paper 17, History and Philosophy of Science
The upcoming paper is heavily laden with theory. It uses high school physics classes for its data, and, while HPS is a method that does work in a lecture format, its application in this paper is decidedly lecture-excluding.
I. Galili and A. Hazan, "The influence of an historically oriented course on students' content knowledge in optics evaluated by means of facets-schemes analysis," PER Am. J. Phys. Suppl. 68(7) S3-S15 (2000)
[This study] tries, by means of elicited structural components of students' knowledge to infer the influence of a historically oriented instruction in optics on the content conceptual knowledge of students in this science domain.
Paper 17 provides a full page of theoretical background on the structure of knowledge in learners, most of it historical and developed by non-physicists. The authors present Mach, Bruner, Piaget, di Sessa, and Minstrell. The first two argue that people are unable to grasp, remember, or manipulate a huge amount of complex content without knowledge of structure. Piaget is the founder of constructivist theory, which "pursues picturing human cognition by its elements related in schemata." Di Sessa, the father of p-prims, argues for the existence of stable cognitive constructs spontaneously created in the form of fundamental self-explanatory patterns. Minstrell advances facets-of-knowledge as the means by which students understand particular physical settings. Facets are more context specific and thus less fundamental than p-prims. Facets-of-knowledge can be grouped in clusters. The core which underpins the facets is more inclusive and less context dependent. Thus, the core constitutes a scheme-of-knowledge. Schemes relate concrete entities and evolve in the course of formal learning. Schema do not require mutual consistency. The impact of instruction can be judged by which schema are held by students and the prevalence of those schema. Given the great versatility of naive conceptions, instruction should aim at the essence, the learners' schema. Scientific treatises written at the dawn of science are good examples of claims and accounts about regularities observed in specific situations (facets) which were later represented by an inclusive proposition (scheme-of-knowledge). In the past, teaching physics by historical method had both proponents and opponents. Recently, the arguments in favor of HPS (history and philosophy of science) use in science instruction have been strengthened. 
First, presenting "unsuccessful" attempts at conceptual development that nevertheless helped to attain present scientific knowledge shows students a realistic picture of the complex transformation of knowledge from old to new. Second, the historical conceptual difficulties overcome by scientists in the past are similar to those faced by learners today. The arguments employed by the great minds of the past can be reapplied today, helping the learners of today. Third, HPS exposes competitive ideas and subjective perspectives in science which humanizes science education and makes science appealing to a wider variety of minds. Optics was the field chosen to test the HPS method of teaching science because: (1) current belief is highly anti-intuitive, (2) there is an impressive abundance of naive concepts, and (3) there is a rich 2500 year chronicle of optical conceptions replacing each other. The HPS course preserved the standard menu of regular curriculum topics. It interwove presentations on the historical growth of optics understanding with discussions about the nature and behavior of light. The authors chose the historical content to match known schema of students' alternative knowledge. One example from a small table matching "historical sciences" to "students' (mis)conceptions" is Al-Hazen's concept of vision matching to Image Projection Schema. The subjects were in four 10th grade classes. There was a control group of three 10th grade classes. The students were from three types of schools and had four hours of instruction a week. A specially prepared textbook was used. Both qualitative and quantitative assessment was performed. The facets-of-knowledge (F-of-K) and the schema-of-knowledge were compiled from student questionnaires. Content assessment was over the standard optical curriculum material only. Problem solving was addressed in class but was not reported. 
The frequency of a facet or schema and the difference between experimental and control group frequencies is addressed quantitatively. The Findings-and-Interpretations-of-Data section is subdivided into knowledge of vision, knowledge of the nature of light, and knowledge of optical imaging. Facets associated with scientific conceptions indicate learning and positive gain in knowledge but not complete acquisition. The data is presented in tables, backed-up with schematic reproductions and highlighted by student quotes. Under knowledge of vision, the nonscientific schema is "Spontaneous Vision Scheme" with its constituent five facets-of-(mis)knowledge. All five of these share the common misconception that vision is a natural phenomenon, lacking delivery of light (or anything) from the object into the observer's eye. As an example of facets-of-(mis)knowledge (F-of-(mis)K), #2 is: objects are observed when merely being located in the field of vision (and are not blocked). An example of a facet-of-(true)knowledge associated with the schema "Scientific Conceptions" is #4: vision is explained by the fact that light must leave the object and enter the observer's eye. #2 was the most common misconception out of five listed. #4 was the most common true conception out of six listed. Under knowledge of the nature of light, the schema are "Reified Light" and "Scientific Conception". The most prevalent (F-of-(mis)K) is #4: light is comprised of (many or an infinite number of) light rays which fill space. The most common facet-of-(true)knowledge is #4: light expands in the environment from objects with a decreasing intensity until it strikes opaque objects. Under knowledge of optical imaging the four schema are: Image Holistic Scheme, Image Projection Schemes, Scientific Conception (lens), and Scientific Conception (mirror). 
Image Holistic Scheme and Image Projection Scheme look similar and can only be distinguished through the mechanism of image transfer, a fact distilled from interviews. A F-of-(mis)K for Image Holistic Scheme is: Image is always formed and can be obtained on a screen (mirror); there it could be observed (afterwards). A F-of-(mis)K for Image Projection Scheme is: Explaining the image in a lens, students produce a diagram of a point-to-point connection of an object with its image by means of a single ray. While the F-of-K's are interesting in their own right, the paper compares the frequency with which each is held by the control [FAc] and experimental [FAe] groups. The F-of-(true)K most favorable to the experimental group and least favorable to the control group is: when the lens is removed, no image is produced, the FAc=0, the FAe=79. There are a total of forty-six F-of-Ks listed. Twenty-three of them are matched to nonscientific schema, and thus are really F-of-(mis)Ks. The data is illustrated by twenty-nine reproductions of student sketches. HPS subjects are shown to have both a far higher frequency of valid scientific knowledge and a far lower frequency of invalid alternative knowledge schemes. The benefits are attributed to the HPS curricula, highlighted by a couple of examples from the Atomist's theory of Eidola and Al-Hazen's medieval theory of image transfer. In refuting those historical models, their descendants provide us today with a rich, elaborate, interesting set of methods to refute in turn the Image Holistic Scheme and the Image Projection Scheme. Historical models are neither too obscure nor too complex for students to learn; they help make physics courses more effective and attractive to a wider population. This method does require two things. First, one must elicit the structure of student knowledge. Second, this knowledge must guide the selection of appropriate historical content. Having written the above, I feel the gem has been lost. 
Leaving aside the justifying theory and the assessment discussion, the basic message is this: the misconceptions of today's students match, exactly, historical science believed in its time by famous intelligent people (Aristotle). This historical science was refined into more correct yet still wrong not-so-old historical science by famous intelligent people (Al-Hazen). This refinement continues to the present day. The students' incorrect beliefs are not dumb, nor are the students alone in believing them. The only basic problem is that real life is very complex, intricate, and nonintuitive. HPS shows how Al-Hazen corrected some of Aristotle's misconceptions and was in his own turn corrected. Using the insights and failures of great men and the wonders of time compression, you and I can walk this historical road and see both where the scientific community is today in its beliefs and why so many good ideas were left behind for ideas that were yet better if not yet perfect. Before moving on, I'd like to bring to the reader's attention a book review by S. Mahajan, "A Gold Mine of Teaching Ideas," published in The Physics Teacher, Vol. 39, page 512 (2001). This reviews Time for Science Education, which combines physics with history, philosophy, sociology and education theory. The book's author, M. Matthews, uses a fascinating tale of French revolutionary politics and science to show the connection between π² and the acceleration due to gravity as measured in metric units.
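The numerical coincidence behind that last remark can be sketched with the small-angle pendulum formula. I am assuming here that the tale concerns the proposal to define the unit of length as the length of a seconds pendulum (one whose swing in each direction takes one second, so that its full period is T = 2 s); the derivation is mine and is not taken from the review.

```latex
% Small-angle period of a simple pendulum of length L:
T = 2\pi\sqrt{\frac{L}{g}}
% Solve for g and set T = 2 s (seconds pendulum):
g = \frac{4\pi^{2}L}{T^{2}} = \frac{4\pi^{2}L}{(2\,\mathrm{s})^{2}} = \pi^{2}\,\frac{L}{\mathrm{s}^{2}}
% If the unit of length were defined so that L = 1 m:
g = \pi^{2}\ \mathrm{m/s^{2}} \approx 9.87\ \mathrm{m/s^{2}}
```

The metre actually adopted was based on the Earth's meridian rather than the pendulum, but it is close enough to the seconds-pendulum length that g, at about 9.8 m/s², still comes out near π² in metric units.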
CHAPTER 24 : Interactive Engagement Methods Potpourri
24.1 : Paper 18, Communicating Physics Through Story
That ended my third chapter. We started with assessment tools, papers 1-8. The second chapter was I.E. methods that retain the lecture, papers 9-14; the third was I.E. methods that replace the lecture, papers 15-17. The upcoming four papers also focus on how to teach physics. This chapter is a bit of a potpourri, unified in that all four papers neither redeem nor replace the lecture, but rather ignore it. If anything, these papers flirt with what and why to teach as much as they do how. Still, there are very concrete, do-this recipes advanced in these works.
R. Stannard, "Communicating physics through story," Physics Education, 30-34 (2001)
Stannard's article details reasons why we should introduce modern physics to children and advocates the methodology of story telling. Paper 18 provides in its references a list of award-winning books written for children on the subject of modern physics, most of which were authored by R. Stannard. It's important to "get in quick"; society teaches all of us that anything associated with the name Einstein is accessible only to geniuses. To defeat this negative socialization, we need to hook children on modern physics before they learn that they aren't supposed to understand it. Some of the findings of relativity and quantum theory appear to defy common sense. Common sense was for Einstein that "layer of prejudice laid down in the mind before the age of eighteen." So the earlier we start teaching or exposing children to modern science, the thinner the layer of prejudice to penetrate; children's thinking is not yet "too set in its ways." An early familiarity with modern physics attracts young people to physics. The prospect of studying modern physics is the most influential reason why students choose to study university-level physics. The most important reason for getting in quick is that one finds radically new physics springing from the flexible minds of theoretical physicists at the commencement of their professional careers, in their twenties. This flexibility is a hallmark of a young mind and is increasingly difficult to retain as age advances. Yet the young mind must already know enough physics to appreciate what the outstanding problems of the day are. Gaining this essential background early, widens the window of having both knowledge and flexibility. Age will close this window, leaving knowledge without flexibility, all too soon. Story telling is the primary method of imparting physics knowledge to children. 
This method is in accordance with modern theories of the cognitive development of children (Piaget, Shayer, Wylam), which label children as concrete thinkers who think outwards from experience rather than as formal thinkers who extrapolate from a theoretical framework. Stories also have a rich historical validity, having shown over time that they interest and amuse listeners while providing a framework which allows accurate retrieval of knowledge. Story telling can carry serious messages and ideas, as shown by William Golding's Lord of the Flies and George Orwell's Animal Farm. Science can be conveyed by story telling. The classic example is Mr. Tompkins by George Gamow, which Stephen Hawking has declared to be a "great book." George Gamow was one of the founders of Big Bang cosmology. In Gamow's book, Mr. Tompkins finds himself in a fantastical world where Planck's constant is much larger and the speed of light far slower than is the case in our world. Relativistic effects are thus a common factor of day-to-day living, as are quantum uncertainties. This entertaining and intriguing book also allows the more serious-minded reader to explore these ideas in greater depth through interleaved formal lecture extracts. Mr. Tompkins, published in 1965, was an immediate bestseller, popular with both the public and professional scientists. In 1989, R. Stannard started the first of many books that bring modern science to children in both an accurate and a comprehensible manner. His latest book (2002) is Dr. Dyer's Academy, in which scientific misconceptions are dealt with in a manner similar to how morality is approached in C.S. Lewis' Screwtape Letters. As one of the motivators for instructing the young, it is noted that in a survey of 250 twelve-year-olds, 65% had no idea what a star was (i.e. a large distant sun). Teaching the young is not normally given energy and time in university physics departments.
Still, internships for undergraduates to "do real physics" are pushed by various institutions, particularly as a summer occupation. If teaching is the best way to learn, then some consideration should be given to undergraduate internships that employ the undergraduate as a teacher. The level of difficulty could vary from something as simple as reading Mr. Tompkins and other books to a second grade class one hour a week, to something as complex as serving as a high school physics teacher's TA one day each week. In any event, this wider socialization of physics through accurate but non-rigorous books is not limited to the young, and despite excellent reasons to start young, the old and late starters should have their opportunity as well. M. LiPreste, in "A Comment on Teaching Modern Physics," published in The Physics Teacher Vol. 39, pg. 262 (2001), reports his positive experience with an appreciation course in modern physics. As texts, he uses Isaac Asimov's Atom and Brian Greene's The Elegant Universe. For a night class that was useless for degree requirements, 20 students showed up out of pure interest.
24.2 : Paper 19, Linking the Domains of Mechanics and Electromagnetism
The basic idea in paper 19 has been implemented at several universities. For example, the nine classes of the junior year physics curricula at Oregon State University are: Static Vector Fields, Oscillations, One Dimensional Waves, Quantum Measurement and Spin, Central Forces, Energy and Entropy, Periodic Systems, Rigid Bodies, and Reference Frames. These stand in contrast to the traditional courses of the senior year curricula: Classical Mechanics, Quantum Mechanics, E&M, Statistical Mechanics, Optics, and Math Methods. This breaking and shuffling of old domains centralizes common principles that connect the old separate divisions into a continuous whole. The central principle of waves is used in the University of Oregon's Physics 351, the Physics of Waves, to connect the old divisions of mechanics, electricity, optics, and quantum. The upcoming paper connects Mechanics and E&M via several common principles.
E. Bagno, B. Eylon, U. Ganiel, "From fragmented knowledge to a knowledge structure: Linking the domains of mechanics and electromagnetism," PER Am. J. Phys. Suppl. 68(7), S16-S26 (2000)
Knowledge structure is the critical difference between expert and novice problem solving. Expert knowledge is organized around central principles; novice knowledge, around external characteristics. The expert's advantage is that his organization allows the same knowledge to be used in different domains and in unfamiliar situations. Students, in contrast, have a difficult time organizing their knowledge around central characteristics. They often fail to distinguish between the general concept and its examples, and may, for instance, define potential energy as a function of height. This failure to distinguish between general concepts and their examples is exacerbated by introductory courses which divide physics into separate domains: mechanics, electricity, magnetism, optics, etc. As an example, the principle of conservative force is usually fragmented. The gravitational force, elastic force, and electrostatic force are all taught at different times and are not explicitly linked by instruction, although they are all merely examples of conservative force. An inter-domain organization of knowledge has several advantages. It reduces the load on memory. It enables students to become accustomed to difficult concepts by elucidating them from various points of view. It enables the student to employ the methods used in one domain to solve unfamiliar problems in another. Paper 19 focuses on conservative forces (fields), conservative forces which are proportional to 1/r^2, various examples of these, and the conservation of mechanical energy. A general concept is characterized by its critical attributes. For example, quadrilateral has two critical attributes: "quad" meaning four, and "lateral" meaning sides. Each concept has subconcepts which are its examples: for example, "square" and "rectangle." The problem is that examples have additional attributes not critical to the definition, such as the ninety-degree angles in both the square and the rectangle.
Teachers often operate on two false assumptions. The first false assumption is that the strong resemblance between several examples is readily identifiable by learners. The second is that learners can easily differentiate between critical attributes and non-critical attributes. People place new information in a hierarchy. As teachers, we should see to it that central principles are at the top and examples are lower. As time passes, the lower levels are forgotten; the high-level important information is retained. The MAOF teaching package was developed over four years. It has vector fields and potential as its central concepts, linking mechanics to electromagnetism. Two schemata of knowledge structures are illustrated: one is hierarchical; the other has feedback loops. During development, it was found that students totally confused the conservation of mechanical energy with the conservation of energy. Further, the conditions necessary for the conservation of mechanical energy were not clear to a large number of students. These and other confusions are cleared up by problem solving in different contexts and by the construction of concept maps. MAOF is a flexible review process for use in existing courses on mechanics and electromagnetism. The process uses a package of student workbooks and instructor transparencies for guided class work. Pre- and post-homework sets bracket the class work, preparing and amplifying the reviewed concepts. Three general considerations come into play when constructing an instructional method. The first is to divide up the material into manageable units, avoiding divisions which are either too fine or too gross. The second is to decide which teaching-learning sequence to use in presenting general concepts. The third is to design a sequence of activities that guides the learner toward the desired knowledge structure. MAOF uses a bottom-top-bottom sequence to define general concepts.
For example, a discussion on conservative forces can start by dealing with the force of gravity near the Earth's surface. This will lead to a definition of the concept in general, and will be followed by other examples such as Coulomb's force. MAOF uses a five-step model to guide the learner toward the desired knowledge structure. The steps are: solve, reflect, conceptualize, apply, and link. Each step is given a paragraph description, with concept maps playing an important role in the last two steps. The first step was kept deliberately simple by use of standard, familiar, few-step problems. The four MAOF units (conservative and non-conservative forces, 1/r^2 forces, electromagnetic fields, and vector fields) are each addressed via the five-step process. Relationships, whether as counterexamples or as special cases, are emphasized. A diagnostic study examined the knowledge structure that students formed after a conventional physics course. This reference presents an example which probes the critical attribute of reference point in the concept of potential energy. The three questions of the probe reveal that approximately half of the students do not recognize contexts where the choice of reference point is arbitrary. An example is:
Statement 1: The point of reference for the calculation of electrostatic potential cannot be located on a positively charged object.
Correct Incorrect Explanation
... Student Response: Correct. It can not be located there because it must be at infinity.
In detailed interviews, eight students were asked to write down additional concepts related to the general concept of potential energy and to note relationships by using directional arrows. In the example response, the student did not relate any critical attribute of potential energy, i.e. conservative force, reference point, conservation of mechanical energy, and work. The student did conceive the general concept, potential energy, as subordinate to one of its examples, "Gravitational Energy," and focused on non-critical details such as "m" (mass) and "g" (9.8 m/s^2). Evaluation of MAOF used a test provided as paper 19's Appendix A. The test investigated attributes of the general concepts, attributes of examples, judgment of unfamiliar cases, and the conservation of mechanical energy. The two general concepts addressed are potential energy and 1/r^2 forces. In part one of the test, six critical conceptual attributes are tested. Four of these attributes showed statistically significant improvement from pretest to post-test. In part two, students distinguished between critical and non-critical example attributes at a statistically significant level for the four examples. In part three, only about 30% of the students on the pretest were able to ignore the many non-critical similarities and realize that Gauss' Law is in fact not applicable. This nearly doubled to 60% for the post-test. In part four, the use of energy instead of cumbersome kinematics went from 30% (pre) to 70% (post). The pretest was given to twenty teachers. There were striking similarities between student and teacher knowledge structures, although all teachers had a B.S. in physics. After going through the program as learners, the teachers found MAOF very useful in enhancing their own understanding of the material. Other programs which facilitate knowledge organization include the efforts of Van Heuvelen and those of Leonard et al. The expert versus novice labeling will be highlighted in the MPEX paper (paper 32).
It's unfortunate, but practical, that most reforms start as review. If reforms are both needed and effective, the obvious question is: why not teach that way initially? The talk of concept maps and knowledge structures is a bit more fleshed out in the paper, but the authors assume knowledge contained in their references. Still, their basic idea is sound; after all, we spend so much time on the simple harmonic oscillator in classical mechanics precisely to build a foundation model that will be applied again and again in many other fields. The simple harmonic oscillator is a "general concept" that physicists find quite useful. The admonition that the supposedly obvious distinction between critical and non-critical aspects of examples is in fact not obvious needs to be taken to heart. Asking several students to tell you what's important, as in these eight interviews, is the quickest way to learn that you live in a different world than they do. This, after all, makes a lot of sense. You've had years of instruction which they have not yet had. Hopefully, all that instruction did change you from what they are to what you are now. A lot of instruction should result in a wide gap; bridging this gap between yourself and your students is the art of teaching. Finally, paper 1 asserts that a student is unlikely to surpass his teacher; this one only implies it. I hope students can and do surpass me. Otherwise, how could the human race advance over time? Some student must surpass his teacher; hopefully many do. How to get your students beyond your personal limitations would be an interesting subject to study.
24.3 : Paper 20, Physics Jeopardy Problems
Well, we've jumped from reading children's stories to changing the instructional structure of a university physics department. Perhaps a middle ground would be of use. The upcoming paper offers a modest proposal, one easily implemented by an individual instructor.
A. Van Heuvelen and D. Maloney, "Playing Physics Jeopardy," Am. J. Phys. 67(3), 252-256 (1999)
Jeopardy Problems are problems in which the student works backwards from a given mathematical equation to a diagrammatical, graphical, pictorial, and/or word description of a physical process. There are also Diagrammatical and Graphical Jeopardy Problems, where students invent a word or picture description and a math description consistent with the given diagram or graph. Jeopardy Problems have many strengths, and they are easy to create. They can even be multiple choice. They prevent formula-centered, plug-and-chug problem solving. They promote multiple representations where equations, diagrams, and graphs all become the same short story about life. Jeopardy Problems highlight units, which are the key to determining whether we are dealing with pressures, densities, accelerations, or distances. Because of their probable novelty, students will need practice before these problems are used on tests. Jeopardy Problems come in multiple levels of difficulty, with this reference giving several examples from the easy to the quite difficult. While this paper does not offer examples at this level, the authors note that "Richard Feynman's last blackboard [was]: Given S matrix, find problem." Jeopardy Problems help students develop qualitative understanding and learn to use the symbolic language of physics. They accomplish this in part by preventing the means-ends analysis which many novice students use to do end-of-chapter problems. This reference offers eight examples of Jeopardy Problems with a discussion of possible answers. The direct impact of Jeopardy Problems on learning is difficult for the authors to assess, as they are used in conjunction with several other methods. The combined results are impressive, with MB scores of 78% and FCI scores of 86% reported.
24.4 : Paper 21, Promoting Conceptual Change
Let's delve into a paper that PER detractors would love. It is part of published PER and is representative of the far end of the PER spectrum. It is also the last paper in our Potpourri section.
C. Kalman, S. Morris, C. Cottin, and R. Gordon, "Promoting conceptual change using collaborative groups in quantitative gateway courses," PER Am. J. Phys. Suppl. 67(7), S45-S51 (1999)
Students often hold views different from, or alternative to, those which they will be taught in their courses. The students will not easily relinquish their original viewpoints because these viewpoints explain observations and took effort to construct. Conceptual change requires the students to critically examine their view of the world. For change to occur, students must make value judgments, rate ideas, and accept or reject material based on standards. Producing change thus requires evaluation, the highest ability in Bloom's taxonomy. Helping students initiate a growth process can easily span the entire course. Simplistically, there are two methods of problem solving: Templates and Paradigms. The key difference is that students who compartmentalize knowledge and apply different templates to different knowledge subsets lack the ability to apply principles garnered from one problem to an apparently different problem. Furthermore, even if problem solving methods change, knowledge acquisition methods are likely to remain compartmentalized unless critical thinking skills are developed. Finally, students not only have personal scientific concepts, but personal epistemological beliefs as well. Paper 21 provides many references to learning theory and philosophy. Posner's learning framework for conceptual change is presented. Emphasis is placed on two points. First, students must know of problems with their personal scientific conceptions, usually via curriculum-induced conceptual conflict. Second, the student must not compartmentalize his knowledge. People can hold contradictory beliefs. Replacement, not simple assimilation, is the teaching goal. Using a model based on proposals by Hewson, the authors attempt to produce change in four common personal scientific concepts.
These concepts are: bodies of different masses falling from rest through a non-viscous medium for a short time are found at later times to move at different speeds (concept 1), a fast-moving arrow stays in the air because of its great speed (concept 2), if a sandbag is dropped from an ascending balloon, immediately upon release, the initial velocity of the sandbag is zero (concept 3), and a ball, thrown in the air, is in equilibrium at the highest point in its motion (concept 4). Instruction must show both that the replacement concept is intelligible and that the personal concept is less plausible. The authors argue that the above is a reasonable strategy for younger students but is cumbersome. They argue that it is better to "get the students to critically analyze the two concepts and come to the realization that the personal scientific concept needs to be replaced." The basic procedure to achieve concept replacement is a collaborative group exercise. Three or four students are assigned to a group and to individual roles within the group (reporter, critic, etc.). Students are presented with a demonstration or qualitative problem. They discuss it for a fixed time period. They then report. The principle is that there are at least two ways of looking at a problem non-judgmentally. Two groups with different concepts report to the class. The spokespersons debate, and the rest of the students may ask questions. The opposing issues are clearly presented; then the class votes as to which concept resolves the demonstration or qualitative problem. Voting is essential to combat compartmentalization. The professor then resolves the conflict. The test used to determine concept replacement was the FCI + 3. The FCI was used to norm; the three additional questions were specific to this study and are attached to paper 21 as its Appendix A. The treatment group was more successful in making conceptual change than the control group.
Treatment group test sheets are provided as paper 21's Appendix B. For concept 1, the students had very high pretest scores, so no inference could be drawn. For concepts 2 and 4, the treated group outperformed the standard group in a statistically significant fashion. For concept 3, no statistically significant difference between the groups was noted.
I included this paper for several reasons. One, it was in the 1999 PER Am. J. Phys. Supplement, and I desired to review the entire supplement. Two, it provides some good theory summations. Three, the idea of conceptual debate by student spokespersons is intriguing. Four, paper 21 is an example of sweeping judgments based on two classes' worth of students on four questions, only two of which were statistically significant. And five, reading the paper will allow you to empathize, if not agree, with the viewpoints of some PER detractors. Were this reference alone, it could be dismissed as an aberration. Unfortunately, it is representative of its end of the spectrum. For example, consider D. Abbott et al.'s paper "Can one lab make a difference," published in PER Am. J. Phys. Suppl. 68(7) S60-S61 (2000). Abbott's published paper describes its authors' use of one McDermott tutorial on one lecture section's worth of students in one university. The students were assessed by an eight-question traditional quiz, which found there was a statistically significant difference on one question in favor of the tutorial group. The differences on the other seven questions were not statistically significant. The students were also assessed by six "Direct" questions on which the tutorial group outperformed the traditional group by 20%. The "Direct" questions are referenced to a Ph.D. thesis. All questions were multiple choice, but they aren't provided. At best, Abbott is simply reporting the results of a multiple choice test given immediately after instruction to a small group of students. Yet he provocatively claims this makes a difference. To whom?
The authors of paper 21 have even more prestigious company. The authors of paper 1 end their venerable FCI paper with severe contortions. They spend the last couple of pages singing the praises of the "Wells Method" even though they admit there was "no overall improvement in gains via the NSF physics education project conducted by Wells..." I am presenting paragraph G of paper 21, in its entirety, as an example of the contortions resulting when desire prevails over logic:
Question 30 was the only question that especially addressed concept 3; the idea that if a sandbag is dropped from an ascending balloon, immediately upon release the initial velocity of the sandbag is zero. The fact that there was no statistical difference between the two groups in their improvements on post-test scores may have occurred because the groups were still not used to working together, but it is impossible to verify this. A more interesting explanation is that this also had something to do with the way the question was framed. The key point is, as pointed out earlier, that students lack the ability to apply principles garnered from a problem to an apparently different problem.15,16 Students may not recognize that the problem of a brick falling off the edge of a descending construction elevator (question 30 in Appendix A) is identical to the problem of a sandbag released from an ascending balloon. The premise of this paper is that the students' development of critical thinking is essential. This is the only way that students will not simply accommodate the replacement concept by compartmentalization of their knowledge. After the first exercise, the students had not developed their critical thinking skills and the different appearance of question 30 caused them to utilize their personal scientific concept instead of the replacement concept. This would account for the result that no significant improvement of the treated group over the control group occurs for concept 3 whereas significant improvements were observed for concepts 2 and 4. To test this idea in September 1997, in a two-semester course on physics for nonscience students, Dr. Kalman tried the following experiment: After the students had read about inertia in the textbook, but only as applied to horizontal motion, Dr. Kalman presented the sandbag problem. By vote the entire class without exception concurred that the sandbag would fall immediately without rising. 
The correct result that the sandbag would initially continue with the same speed as the balloon was then fully explained in terms of inertia. The students expressed themselves as delighted with the correct answer. Dr. Kalman then presented an experiment from the "The Video Encyclopedia of Physics Demonstrations"23 in which a ball was fired vertically from a "car" moving horizontally at constant velocity. The video asks where the ball will land: in front of, behind or on top of the "car" and then pauses. Fully one half of the class considered that the ball would hit the ground ahead of or behind the "car".
The key words are "it is impossible to verify this," and that pretty much is my summary of the whole work. A science is verifiable, and thus, paper 21 illustrates just how far PER has yet to go in transforming the art of teaching into the science of teaching. The upcoming four papers focus on computer usage.
CHAPTER 25 : Interactive Engagement Methods Highlighting the Computer
25.1 : Paper 22, Physlets and Just-in-Time-Teaching
W. Christian, "Educational Software and the Sisyphus Effect," Computing in Science and Engineering, May-June 1999, 13-15 (1999)
Today's physics education software is fundamentally different from that of even 1991. Historically, the software was platform-dependent and thus obsolete within eighteen months as the supporting platform (Apple IIs, etc.) was eclipsed. Today's software is platform-independent, being based on virtual machines, meta-languages, and open Internet standards, and thus not subject to obsolescence on such a short time scale. Christian asserts that computers using commercial mass market technology such as Java and JavaScript qualify under Hake's definition and validation of Interactive Engagement (IE). The author distinguishes between media-enhanced and media-focused problems, and he introduces Physlets and Just-in-Time-Teaching. Physlets are multimedia-focused problems in which the text does not give numbers. To find, say, a minimum speed from an animation on the computer screen, the student must observe the motion, apply physics concepts, and, with use of a mouse, measure the parameters that the student deems important. Only then can he solve the problem. This requires the student to consider the problem qualitatively and prevents the "plug-and-chug" method of problem solving. To be truly effective, computer-assisted instruction must create a feedback loop between the instructor and the student. Just-in-Time-Teaching (JiTT) is one method of achieving this feedback. JiTT consists of short Web-based assignments which are due a few hours before class. The instructor builds an interactive lecture around the students' answers. Thus, students take part in a guided discussion that begins with their own preliminary understanding of the material. Christian notes that technological advances do not necessarily improve learning. Two examples of this are watching video and using database techniques in attempts to tailor individual curricula for learners.
He further asserts that virtual reality, 3-D modeling, and voice recognition are likely to have little impact without curricular development efforts. The author believes that for computerization to have a long-lasting impact on science education, it needs to be based on a successful pedagogy and not on the latest compilers, hardware, or algorithms. The FCI authors concur, stating in paper 1 that "technology by itself cannot improve instruction." In an interesting historical note, the author mentions that post-Sputnik curricular reform material such as the Berkeley Physics Series is still available precisely because it was preserved in books and not subject to the hardware obsolescence that erased much of the early computer-based curricular reforms.
The above review brought to mind an interesting letter to the editor by D. Edmonds published in Am. J. Phys. Vol. 69(6) entitled "Troy Ounces (or Tons) of Silver." The letter is also about something made with great effort and obsolete shortly thereafter; in this case, 146,000,000 troy ounces of Fort Knox silver were required to construct it. During WWII's Manhattan Project, one method of separating uranium-235 from uranium-238 used calutrons. These large machines used the silver as wiring in huge electromagnets, which were disassembled at the end of the war. "Fort Knox wanted its silver back," all 5,000 tons of it.
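The two silver figures quoted from the letter can be checked against each other with a little arithmetic. The following is a minimal Python sketch, assuming that "tons" means U.S. short tons (2,000 avoirdupois pounds); the function and constant names are illustrative only and do not come from Edmonds' letter.

```python
# Unit conversions (exact by definition).
TROY_OUNCE_IN_GRAMS = 31.1034768
SHORT_TON_IN_KG = 907.18474      # 2,000 avoirdupois pounds (assumed meaning of "ton")
METRIC_TON_IN_KG = 1000.0


def troy_ounces_to_kg(troy_ounces):
    """Convert a mass in troy ounces to kilograms."""
    return troy_ounces * TROY_OUNCE_IN_GRAMS / 1000.0


silver_troy_ounces = 146_000_000          # figure quoted in the text
silver_kg = troy_ounces_to_kg(silver_troy_ounces)

silver_short_tons = silver_kg / SHORT_TON_IN_KG
silver_metric_tons = silver_kg / METRIC_TON_IN_KG

print(f"{silver_short_tons:.0f} short tons, {silver_metric_tons:.0f} metric tons")
```

Under that assumption the 146,000,000 troy ounces come to a little over 5,000 short tons (about 4,500 metric tons), which is consistent with the rounded figure of 5,000 tons.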
25.2 : Paper 23, Numerical Integration
P. Assimakopoulos, "A Computer-Aided Introductory Course in Electricity and Magnetism," Computing in Science and Engineering, Nov/Dec 2000, 88-94 (2000)
Introductory courses in Electricity and Magnetism expose students to new and complex concepts, such as action at a distance, fields, potentials, and the superposition principle. In order for students to digest and assimilate these concepts, small groups of students must work out examples. These examples are typically those for which we have closed-form solutions. Today, with powerful, inexpensive PCs and versatile software packages, we should no longer restrict ourselves to this small set of examples. Holding to the pedagogical approach that students benefit by doing everything for themselves, as they thereby understand every step, the author advocates use of Microsoft Excel and Microsoft's Visual Basic for Applications to perform numerical integration on both closed-form (for comparison) and non-closed-form (for realism) problems. The author gives examples using both the trapezoidal rule and Simpson's rule. He offers examples in: (1) electrostatic fields and potentials, (2) magnetic induction, and (3) visualization of electric and magnetic fields. He provides a very nice 3-D Excel graph of the electrostatic potential for a quadrupole as determined by a numerical solution of Poisson's equation. The author speaks to several issues in his Assessment section. Therein I found one of the more poignant statements in PER literature:
Some of the happiest moments in my teaching career came from the satisfaction expressed by students at such simple accomplishments as verifying results obtained from numerical integration of functions by comparing them with closed-form solutions.
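The kind of check described in that quotation can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses Python rather than the Excel and Visual Basic for Applications that the author advocates, and the test problem (the field on the perpendicular bisector of a uniformly charged rod) and all function names are my own choices, not taken from paper 23. It integrates the field with both the trapezoidal rule and Simpson's rule and compares each result with the closed-form answer.

```python
import math

K = 8.9875517923e9  # Coulomb constant in N m^2 / C^2


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def simpson(f, a, b, n):
    """Composite Simpson's rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of subintervals")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def rod_field_closed_form(charge, length, distance):
    """Field magnitude a given distance from the rod's center, on its bisector."""
    return K * charge / (distance * math.sqrt(distance**2 + (length / 2) ** 2))


# Hypothetical numbers: a 1 nC rod, 1 m long, observed 0.5 m from its center.
charge, length, distance = 1e-9, 1.0, 0.5
lam = charge / length  # linear charge density


def integrand(x):
    """Perpendicular field contribution per unit length from the element at x."""
    return K * lam * distance / (x * x + distance * distance) ** 1.5


exact = rod_field_closed_form(charge, length, distance)
e_trap = trapezoid(integrand, -length / 2, length / 2, 100)
e_simp = simpson(integrand, -length / 2, length / 2, 100)

print(f"closed form : {exact:.6f} N/C")
print(f"trapezoidal : {e_trap:.6f} N/C (relative error {abs(e_trap - exact) / exact:.1e})")
print(f"Simpson     : {e_simp:.6f} N/C (relative error {abs(e_simp - exact) / exact:.1e})")
```

With the same number of subintervals, Simpson's rule should land noticeably closer to the closed-form value than the trapezoidal rule, which is the sort of agreement a student can verify for himself before moving on to problems that have no closed-form solution.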
25.3 : Paper 24, To Simulate or Not to Simulate?
Computers are used not only for calculation purposes, but also to out-and-out replace lab experiments. After all, if simulation is good enough for our National Nuclear Laboratories, surely it's good enough for us.
R. Steinberg, "Computers in teaching science: To simulate or not to simulate?," PER Am. J. Phys. Suppl. 68(7), S37-S41 (2000)
Paper 24 starts by giving some background references and by acknowledging that computer simulations often teach in fundamentally different ways than those methods scientists employ to discover knowledge. This reference compares three interactive learning classes: two use computers and one uses pencil and paper. The strategy and curricula of all three are based on McDermott's work at the University of Washington. The classes were introductory calculus-based physics at the University of Maryland. The "fraction of the possible gain" in the FCI scores (<g>) was 32% and 25% for the two computer-based air-resistance tutorial classes and 32% for the pencil-and-paper air-resistance tutorial class. Air resistance was covered in one lecture and one homework problem. The tutorial pretest showed that while 80% of the students gave qualitatively correct graphs of position versus time for motion without air resistance, fewer than 10% could do so with air resistance. On velocity versus time graphs for motion with air resistance, only 9% indicated terminal velocities. The lesson with and without computer use is described. The computer and the pencil-and-paper tutorials are essentially the same in content, coverage, and interactive engagement level. The significant difference is that the students use computers to verify their predictions in the first, and use more graphs and free body diagrams in the second. All students were diligent and engaged, but the second group had no means of "knowing" the answer and had to rely on consistency between their graphs and their diagrams. The first group seemed to rely on a more authoritarian viewpoint, that the right answer was something from the computer rather than something which the students had constructed themselves. A common midterm question is given along with the percent correct. Student performance was not significantly different across the classes, either on the question or on the test as a whole.
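The "fraction of the possible gain" quoted above is Hake's normalized gain: the class-average improvement divided by the largest improvement the class could have made, <g> = (post - pre) / (100 - pre), with both scores in percent. A minimal sketch follows; the function name and the sample scores are hypothetical and are not data from paper 24.

```python
def normalized_gain(pre_percent, post_percent):
    """Hake's normalized gain <g> from class-average pre/post scores in percent."""
    if not 0.0 <= pre_percent < 100.0:
        raise ValueError("pretest average must be at least 0 and below 100 percent")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


# Hypothetical class: 50% on the pretest, 66% on the post-test.
g = normalized_gain(50.0, 66.0)
print(f"<g> = {g:.2f}")
```

A class that starts at 50% and finishes at 66% has closed 16 of the 50 points available to it, so its normalized gain is 0.32. Because the measure is scaled by the room left for improvement, classes with different pretest averages can be compared on the same footing.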
The author sums up the research of others mostly pro-simulation. He does add the proviso that "running a computer simulation is very different than doing a physical experiment." Paper 24 is another example of nothing posing as a lot. In all ways, it's worse than paper 13. Basically, we have two versions of the same commercially available tutorial being compared. One version uses computer simulation, the other, graphs and diagrams. The FCI is used to assess air resistance learning, and the result is charitably a tie: 32% to 32%. First, sample size is three classes. Second, this is not a comparison of simulation to real life, there are no actual experiments performed. Third, the FCI is not a valid assessment tool on the subject of air resistance. The FCI has one question on air resistance, with several others explicitly stating: "Disregarding any effects of air-resistance" (question 5); "When responding to the following question, assume that any frictional forces due to air resistance are so small that they can be ignored." (question 18). Implicitly, questions 16, 24, 25, and 26 all assume no air resistance. Only question 22 includes "the force of air resistance" and then only to acknowledge that air resistance affects the flight of a golf ball; question 22 does not even probe how air resistance affects this flight! The authors of paper 13 at least recognized that the FCI was not a valid assessment tool for their experiment. Paper 24 is mostly pro-simulation commentary based on references. The best that can be said about Steinberg's actual experiment is that compared to pencil and paper graphs and diagrams, computer simulations don't hurt. And if you choose to focus on the 25%, you couldn't even say that. In fact, given the invalid assessment tool, no one can say anything, and yet this paper is published in the American Journal of Physics. That was an ending, and a good one, but I want to comment on two other things. 
The Studio Physics authors (paper 15) realized that the number of people per computer is an issue that should not be conspicuous by its absence, even though their only constructive comment is to not have a 3-person-to-2-computer ratio. Then there is the issue of a server crash. Here at UCSC, our upper-division lab's computer system crashed for the last few weeks of class in the winter quarter of 2002. This resulted in over half of the students having to take incompletes and make up the missing work in the spring quarter. Relying on computers is a double-edged sword, and when things go bad, they can go horrid. Our last paper focusing on computers is coming up. It is also the last paper focused specifically on how to teach physics content.
25.4 : Paper 25, Online Homework
S. Bonham, R. Beichner, and D. Deardorff, "Online Homework: Does it make a Difference?," The Physics Teacher, Vol. 39, 293-296 (2001)
The experimental fact is that good, conscientious, extensive, time-consuming, hand-graded comments written on graded homework by a graduate student have NO observable benefit or detriment to undergraduate learning as compared to computer grading. Computers do have the advantage in that they can support pedagogies other than the traditional back-of-the-textbook physics problems used in paper 25's comparison. Methods of student assessment ranged from the FMCE, through quiz averages, all the way to self-reported time spent on homework per week. Students respond positively to using the computer for homework. This reference supports the viewpoint that technology does not improve or harm student learning, but rather that pedagogy is the critical issue. Pedagogy will be addressed in Chapter VII.
Next up, a small chapter focusing on what to teach, which I view as how-driven. Interactive Engagement methods require more time than Traditional lectures, and so we face the classic depth versus breadth problem. As I.E. methods choose depth for us, they are also choosing breadth against us. So, the question is: what content to cut? Because everybody starts with a slightly different course content and a desire to be psychologically positive, the question is rephrased to be: what to keep? What to keep creeps into what to teach. Pandora's Box opens. The majority of papers in PER which address the subject of what to teach are actually advocating the inclusion of modern physics and electronics into physics curricula. After all, qualitatively these subjects are quite interesting and fun, unlike algebra (otherwise known as mechanics). Further, why stop teaching physics at the 19th century, when we're in the 21st? In all this, what to cut quietly disappears, with each I.E. instructor deciding on his own to cut buoyancy. Which is O.K., in a way, except nobody gets around to teaching buoyancy, and thus, one old traditionalist has a stick with which to beat I.E.
about the head and shoulders. My use of buoyancy here and earlier is an echo of the FCI (paper 1). Its authors end a paragraph in defense of problem 12 with: "Besides, some teachers might think physics students should know why things float!" The FCI authors, of course, definitely do not self-label as Traditionalists. Sadly, what to cut is a choice often made by an isolated teacher praying that the next course doesn't catch his students too much by surprise. Fundamentally, I.E. or not, our courses either do not have enough time (10-credit classes, anyone?) or have too much content. Nobody is willing to publicly advocate throwing out buoyancy, although in private it must be common. I say "must be" in part because of the aforementioned FCI (paper 1) paragraph, but more so because of the huge choice of 12B over 12D on the old FCI. In paper 1, 12B = 949 students, 12D = 0, combined wrong distractors = 176 on post-instruction testing. It's not often the right answer is a null. It's even more unusual to accept two answers as correct on a MCSR test question, as "SR" means "single response". The reason advanced to "allow" 12B as "acceptable" is simply that nobody was picking 12D! So, if you missed my buoyancy allusions earlier, you're now in the know. How does it feel? To capstone the issue, by accepting two answers, question 12 on the original FCI made nobody happy and opened the test up to criticism. The revised version comes down on the practical side by getting rid of option D entirely [revised FCI question 29], thereby implicitly acknowledging that, in fact, NO, students do not need to know why things float! So, what else do they not need to know? It's the unanswered question in PER. Strictly by inference, I get the idea that students definitely don't need electives. We already know they should read outside of class (paper 11) and do homework problems at 30 minutes each (paper 14). What we don't know is how much reading or how many problems.
The IMPEC room being open 24 hours a day (paper 16) brings it all together; one lives physics, one doesn't merely study it. ERGO, we cut no subject; let us only add subjects, with any time savings coming from how we teach, not from what. Which brings us full circle in absurdities, because we have in our I.E. methods picked the time-intensive method of teaching physics!
CHAPTER 26 : Interactive Engagement Curricula
26.1 : Paper 26, Physics Goes Practical
The next few papers advocate including electronics classes in the curricula and modern physics in the introductory physics syllabi. These are good ideas, but there is no advice on what to cut out of curricula or syllabi to make room/time for these additions. Sadly, it's not required for authors of good ideas to designate what current practice must be cut to make room on the smorgasbord-of-physics for the up-and-coming hot new idea.
T. Usher and P. Dixon, "Physics goes practical," Am. J. Phys. 70(1), 30-36 (2002)
The applied physics option at California State University San Bernardino is distinguished from the traditional B.S. primarily by three courses: Introductory Electronics, Data Acquisition and Control, and Advanced Electronics. These courses focus on the analog electronic domain because digital courses are available through other departments. In Introductory Electronics, pre-calculus freshmen are taken from Ohm's law to constructing lock-in amplifiers in one quarter. In part, this rapid progression is the result of using LabVIEW and borrowing concepts from both "Workshop Physics" and "Studio Physics". It is also due to the amount of time the students invest: 6 hours and 40 minutes a week for ten weeks, largely in a laboratory setting. The option strongly pushes paid internships for a number of reasons. One reason is that such a program establishes good contacts with industry, thereby providing important curricular feedback. Finally, this program is part of a broad local effort to attract small high-tech companies to the San Bernardino area.
26.2 : Paper 27, Resource Letter: TE-1 Teaching Electronics
Paper 27 also addresses electronics, albeit for a different reason; it is a Resource Letter with a good introduction. Any electronics instructor will find it useful.
D. Henry, "Resource Letter: TE-1: Teaching electronics," Am. J. Phys. 70(1), 14-23 (2002)
The letter lists 253 publicly available resources, including: web sites, lab manuals, articles, textbooks, reference books, videos, and sources of equipment and parts, both new and used. In the introduction to the list of resources, the author strongly advocates an electronics course for future scientists. The focus of this course is different from that of an electrical engineering course. While scientists will not be designing stereo preamplifiers or reinventing the digital computer, they will be called upon to modify or combine electronic lab equipment and instrumentation in both standard and creative ways. Physics students who go into experimental physics or engineering will encounter overlapping generations of electronics equipment. They will research and work in environments that are electromagnetically noisy. Worse, both local and distant technical support for most electronics equipment is becoming minimal or nonexistent. In a related manner, the students will have to dig into undocumented software written in a multitude of languages. Not only will a formal electronics course help them with these issues, but it will also compensate for a world in which normal life no longer requires even a modest understanding of how electronics equipment works. Such a course would also compensate for the less-is-more trend in introductory physics sequences, which frequently sacrifice coverage of AC circuits and sometimes even of discrete digital components.
26.3 : Paper 28, Macroscopic Phenomena and Microscopic Processes
For those who followed my earlier commentary, the last sentence in paper 27's review makes for a good feedback loop (ha ha, bad joke). The upcoming paper is our last electronics paper before moving on to modern physics.
B. Thacker, U. Ganiel, and D. Boys, "Macroscopic phenomena and microscopic processes: Student understanding of transients in direct current electronic circuits," PER Am. J. Phys. Suppl. 67(7), S25-S31 (1999)
Paper 28 starts by acknowledging that most PER has focused on Mechanics, but that in recent years, Physics Education Researchers have turned their attention towards E&M. The authors propose to test Eylon and Ganiel's assumption that knowledge of microscopic models would enable qualitative reasoning about macroscopic E&M phenomena. They compare a Traditional class to one using Chabay and Sherwood's new text, which emphasizes microscopic processes. A written questionnaire, very similar to the one used in a previous study of Israeli High School (Group HS) students, was given to 90 University of Ohio (Group A) students and 26 University of Michigan-Flint (Group B) students. Hour-long interviews were also conducted with 20 and 6 students respectively. Questionnaires and interviews were conducted during the last few weeks of the students' calculus-based introductory E&M course. Group A was taught in the traditional format of lectures, labs, and discussion groups. Group B differed only in that it was "based on a text that focused on qualitative reasoning, desktop experiments, well-constructed explanations and discussion with partners." Group HS was taught while models of microscopic processes were still being developed. The five assessment questions are given in paper 28's appendix. Questions 1-4 are identical to those in the Israeli study. Question 5 was added to test student understanding of grounding. The authors' hypothesis is that for students to have a solid understanding of transients in DC electric circuits, a model of microscopic processes is required. The spread between Group A (normal textbook) and Group B (reform textbook) ranges from 18% / 90% to 68% / 80% in favor of Group B, depending on the question. The correct written explanations of Group A versus Group B are also strikingly different, with B's being much longer and more specific. A's wrong answers illuminate four points. First, too heavy a reliance on memorization and math can be fatal.
An illustrative example is given. Second, students erroneously believe that the order of elements matters in a series circuit. Third, students think that charge somehow has to jump from one plate of a capacitor to the other in order for current to flow. Finally, fourth: 7% of Group A clearly stated that they did not understand the relationship between charge and current. Group HS did significantly better on questions 3 and 4 than either group A or B. The authors credit in-service courses taken by the Israeli teachers for this difference [for example: A = 7%, B = 41%, & HS = 64% for question explanation 4b]. Paper 28 offers percentages for both "correct answer" and "correct explanation". A common mistake for both groups A and B was that charges originated in the battery only; the roles of the conductor need to be given more attention. Neither group A nor group B had instruction where grounding was specifically discussed. The A/B percentage correct ranged from 83/100 down to 13/14 on the different parts of question 5. This was an important and a discouraging finding. It indicates that even after explicit instruction, students do not use microscopic mechanisms when confronted by completely unfamiliar phenomena. This reference provides several discouraging student quotes, with neither Group A nor Group B discussing electric forces or electric fields in their explanation to question 5c. There was enough consistency in A's wrong answers to questions 5a & 5b to believe that many students do not understand "net charge", and misunderstand "potential difference". The paper offers a sample interview of a Group A student. Many of A's interviews were similar in that students would "search for replies that would utilize phrases (and even equations) they had encountered." They did not use a mental model of the physical situation, and some even failed to see when they were contradicting themselves. B's interviews were quite different.
They would recognize inconsistencies and construct models easily. The authors conclude that models of microscopic processes should be introduced as an integral part of any E&M course. This is based on the assertion that "Group B students exhibited a superior understanding of the phenomena... including those which were less familiar to them, than Group A students." Let's start the commentary with five small items and then discuss the big problem. First, an emphasis: groups A and B differ only in their required textbook. Group A used a "traditional text" and Group B used a "text that emphasizes models of microscopic processes." Second, placing paper 28 in this chapter is not ideal, but this is the best spot I could find to put it. Third, the best part of this paper is its identification of several common student misconceptions in E&M. Fourth, there is no pre-instruction comparison between groups A & B; thus, it is an assumption that post-instruction differences were the result of instruction only. Fifth, and the last small item, paper 40 notes that in Thermodynamics, students use microscopic processes to justify both science and misconceptions alike. Now to the big problem: the body of paper 28 contradicts its conclusion. Question 5 is a four-part question; the last two parts ask questions on the never-taught subject of grounding. The authors had hoped students taught via the microscopic process method [Group B] would be able to transfer their skills into an unknown domain [grounding] and outperform the students who were not exposed to microscopic process considerations [Group A]. The results for the four parts of Question 5 are: 5a - 83/100, 5b - 26/52, 5c - 13/14, and 5d - 13/17; all numbers are percentage correct with Group A first and Group B last. Question 5 itself follows for context:
5. Consider what would happen if, after the switch S had been closed for a long time, the capacitor was removed from the circuit (without touching the capacitor leads). a) Would there be charge on either plate of the capacitor? What would be the net charge on the capacitor? Explain. b) Would there be a potential difference across the capacitor? Explain. c) If one plate of the capacitor were then connected to ground, would the charge on either plate change? Explain your answer. d) If one plate of the capacitor were then connected to ground, would the potential difference change? Explain your answer.
The authors themselves acknowledge in the body of paper 28 the unhappy repercussions of the miserable tie in part 5c [13/14]:
In total, the answers of the two groups of students were not very different on this question. This is an important (and somewhat discouraging) finding, since it indicates that even when instruction emphasizes microscopic mechanisms (as done for Group B), students do not use these mechanisms when confronted with phenomena that are completely unfamiliar to them: transfer is of limited extent.
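To put the "not very different" on a rough numerical footing, here is a minimal sketch of a pooled two-proportion z statistic for each part of question 5. It is my own illustration, not an analysis from paper 28. It assumes that all 90 Group A and 26 Group B questionnaire takers answered every part, and it treats the reported percentages as exact proportions. The function name is hypothetical, and the normal approximation is crude for a group of 26, especially near 0% or 100%.

```python
import math

# Questionnaire group sizes reported for paper 28 (assumed to apply to every part).
N_A, N_B = 90, 26

# Percent correct on question 5 as (Group A, Group B), taken from the summary above.
question5 = {"5a": (83, 100), "5b": (26, 52), "5c": (13, 14), "5d": (13, 17)}


def two_proportion_z(pct_a, n_a, pct_b, n_b):
    """Pooled two-proportion z statistic for (B minus A), normal approximation."""
    p_a, p_b = pct_a / 100.0, pct_b / 100.0
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    return (p_b - p_a) / standard_error


for part, (pct_a, pct_b) in question5.items():
    z = two_proportion_z(pct_a, N_A, pct_b, N_B)
    print(f"{part}: A = {pct_a}%, B = {pct_b}%, B - A = {pct_b - pct_a} points, z = {z:.2f}")
```

Under these assumptions, the statistics for the two grounding parts, 5c and 5d, come out well below 1, far short of the conventional threshold of about 2, which is consistent with reading them as a tie.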
The authors, however, conclude paper 28 quite differently:
|